Weapon hits in three parts: the stack that makes a swing feel like it landed
I worked on a small action game where the lead complaint from the first internal playtest was "swords don't feel like they connect." The animation had impact frames. The hit reaction on the enemy was correct. The SFX played at the right moment. The team had even paid for a premium melee SFX pack. And it still felt weightless.
I sat in the dev room for half a day debugging. The fix wasn't a different sword sound. It was that every hit in the game was a single sound, and a real hit on a real thing is never one sound. It's three things happening within roughly 100 milliseconds: the weapon contacting the target, the target's body reacting, and the player's hand absorbing the recoil. We were playing one of the three. The other two were missing and the brain knew.
This article covers how I build a hit stack for melee or close-range ranged weapons that consistently reads as "that landed."
What's actually happening in a real hit
If you record yourself hitting a side of meat with a wooden stick (a thing I have actually done for production), what gets captured is three overlapping events:
- Contact — the front of the strike. Sharp, short, broadband. The signature of the weapon material: wood thunks, metal rings, stone cracks. Lasts 20–50 ms before it decays into the next layer.
- Body response — the target compressing, deforming, ringing. Lasts 80–200 ms. Frequency content depends entirely on what got hit: flesh is low and dull (40–200 Hz energy bulge), wood is mid (200–800 Hz with a brief sustained tail), metal armor is mid-high with a long ring (300 Hz–4 kHz, can ring 500+ ms).
- Follow-through — the trailing sounds. Things falling, the weapon decelerating, debris, vocalization. 100–500 ms, depending. This is the part that sells "this had consequences."
Most game weapon SFX provide a single sample that bakes some of the contact and a bit of the body response together. Almost none include the follow-through as part of the SFX, and the follow-through is the layer that turns "I hit something" into "I hit a person who is now reacting."
The three layers, in detail
Layer 1 — Contact (the weapon's signature)
This is your sharpest, most attack-heavy sample. Think 30–80 ms total, with all the energy in the first 20 ms. The sound should identify the weapon, not the target. A wooden club has its own contact signature whether you hit a body or a wall. Bake the target reaction into Layer 2 instead, and your weapon SFX scales across enemy types without re-recording.
What to look for in source material:
- Sharp transient under 5 ms.
- Energy distribution that matches the weapon material: clubs concentrated 200–800 Hz, swords with a metallic bright tail above 3 kHz, blunt weapons low and broad.
- No reverb tail. You'll add space later.
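One way to screen candidates against the first criterion is to measure the attack time directly. This is a rough check, not a standard metric: the function name, the 10% onset threshold, and the definition of attack as onset-to-peak are all assumptions made for illustration.

```python
def attack_time_ms(samples, sample_rate=48000, onset_fraction=0.1):
    """Time from onset to peak, in milliseconds.

    Onset is the first sample whose magnitude reaches onset_fraction of the
    peak magnitude (an assumed definition, chosen for simplicity).
    """
    peak = max(abs(x) for x in samples)
    if peak == 0.0:
        raise ValueError("sample is silent")
    peak_index = next(i for i, x in enumerate(samples) if abs(x) == peak)
    onset_index = next(
        i for i, x in enumerate(samples) if abs(x) >= onset_fraction * peak
    )
    return (peak_index - onset_index) * 1000.0 / sample_rate


# Toy signal at 1 kHz sample rate, so one sample equals one millisecond.
click = [0.0, 0.0, 0.2, 0.6, 1.0, 0.5, 0.1]
print(attack_time_ms(click, sample_rate=1000))  # 2.0
```

Anything that comes back well over 5 ms is probably a candidate for Layer 2 rather than Layer 1.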
A mistake I see often: people use a "sword hit" sample that already has a flesh impact and a body fall baked in. Then when the same sword hits a stone wall, the flesh sound still plays and it's jarring. Source samples for Layer 1 should be the weapon contacting an abstract, dry surface (a wooden board, a sandbag, a foam block). The contact signature reads as the weapon. Everything else is for Layers 2 and 3.
Randomize each swing by ±2 semitones of pitch and ±2 dB of level. That's the same range I use for footsteps, for the same reason: wider feels broken, narrower feels mechanical.
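If your engine wants a playback-rate multiplier and a linear gain rather than semitones and decibels, the conversion is small. The sketch below assumes uniform randomization inside the ranges; the function names are hypothetical and not tied to any engine.

```python
import random


def semitones_to_rate(semitones: float) -> float:
    """Convert a pitch shift in semitones to a playback-rate multiplier."""
    return 2.0 ** (semitones / 12.0)


def db_to_gain(db: float) -> float:
    """Convert decibels to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def jitter_swing(rng: random.Random, pitch_range_st: float = 2.0,
                 level_range_db: float = 2.0):
    """Pick a playback rate and gain for one swing (uniform, an assumption)."""
    semitones = rng.uniform(-pitch_range_st, pitch_range_st)
    level_db = rng.uniform(-level_range_db, level_range_db)
    return semitones_to_rate(semitones), db_to_gain(level_db)


rng = random.Random(7)  # seeded only so the demo is repeatable
for _ in range(3):
    rate, gain = jitter_swing(rng)
    print(f"rate x{rate:.3f}, gain x{gain:.3f}")
```

For reference, ±2 semitones works out to rates between about 0.891 and 1.122, and ±2 dB to gains between about 0.794 and 1.259.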
Layer 2 — Body response (the target's voice)
This is where target-specific variation lives. You should have a Layer 2 sample family per target type:
- Soft enemies (cloth/leather/flesh): low-mid thump (60–300 Hz), 80–150 ms decay, can include a faint cloth rustle in the tail.
- Armored enemies (mail/plate): bright mid (400 Hz–3 kHz), 200–400 ms ring (kept to 250 ms or less for enemies hit in sustained combat), with a clean decay curve rather than a long resonant tail (long tails fight subsequent hits in fast combat).
- Wooden targets (training dummies, doors): mid-only thump (250–600 Hz), 150 ms decay, slight knock-resonance.
- Stone/metal walls: short pure-tone ring (varies by material, often 1–3 kHz center frequency), 100–250 ms.
Layer 2 starts within 10–20 ms of Layer 1. If you offset it further than 20 ms it reads as two separate events. If you trigger it at the exact same moment as Layer 1 they smear together and you lose the weapon signature.
Volume relative to Layer 1: −2 to −4 dB. Layer 1 is the announcement; Layer 2 is the consequence. Consequences should sit slightly behind announcements in the mix.
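The Layer 2 rules above fit in a small lookup plus a scheduling function. This is a sketch under assumptions: the dictionary keys, the `BodyResponse` type, the uniform pick inside the 10–20 ms window, and the fixed −3 dB midpoint are all illustrative choices, not something the numbers above dictate.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BodyResponse:
    band_hz: Tuple[int, int]   # rough energy band (for walls: center-frequency range)
    decay_ms: Tuple[int, int]  # shortest and longest decay or ring


# Numbers copied from the target list above; the keys are hypothetical names.
LAYER2_FAMILIES = {
    "soft": BodyResponse((60, 300), (80, 150)),
    "armored": BodyResponse((400, 3000), (200, 400)),
    "wooden": BodyResponse((250, 600), (150, 150)),
    "wall": BodyResponse((1000, 3000), (100, 250)),
}

LAYER2_OFFSET_MS = (10.0, 20.0)  # late enough to keep the weapon signature
LAYER2_GAIN_DB = -3.0            # assumption: midpoint of the -2 to -4 dB range


def schedule_layer2(target_type: str, rng: random.Random) -> dict:
    """Choose the Layer 2 family, start offset, and gain for one hit."""
    if target_type not in LAYER2_FAMILIES:
        raise ValueError(f"no Layer 2 family for target type {target_type!r}")
    return {
        "family": LAYER2_FAMILIES[target_type],
        "offset_ms": rng.uniform(*LAYER2_OFFSET_MS),
        "gain_db": LAYER2_GAIN_DB,
    }


print(schedule_layer2("armored", random.Random(3)))
```

In practice you would tune the gain per target type inside the −2 to −4 dB range rather than use one constant.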
A specific failure mode: Layer 2 for armored enemies is often too long because designers want "metal ring" to feel impressive. In a sustained combat scene with 4 swings per second, a 500 ms armor ring stacks into a wash of high-mid that fatigues fast and obscures positional cues. Cap your Layer 2 ring at 250 ms for combat-relevant enemies, and use the longer ring only for rare cinematic finishing hits.
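The arithmetic behind that cap is easy to check, and the cap itself can be enforced offline when the source sample rings too long. Both functions below are sketches: the names are hypothetical, and the 30 ms linear fade is an assumption chosen only to avoid a click at the cut.

```python
import math


def overlapping_tails(tail_ms: float, hits_per_second: float) -> int:
    """Most Layer 2 tails sounding at once during a steady string of hits."""
    interval_ms = 1000.0 / hits_per_second
    return math.ceil(tail_ms / interval_ms)


def cap_tail(samples, sample_rate=48000, cap_ms=250.0, fade_ms=30.0):
    """Truncate a sample at cap_ms, fading the last fade_ms linearly to zero."""
    cap = round(cap_ms * sample_rate / 1000)
    if len(samples) <= cap:
        return list(samples)
    fade = min(round(fade_ms * sample_rate / 1000), cap)
    out = list(samples[:cap])
    for i in range(fade):
        # Ramp runs from just under 1.0 down to exactly 0.0 on the last sample.
        out[cap - fade + i] *= 1.0 - (i + 1) / fade
    return out


print(overlapping_tails(500, 4))  # 2
print(overlapping_tails(250, 4))  # 1
```

At 4 swings per second a 500 ms ring means two armor rings are always sounding together; a 250 ms cap brings that down to one.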
Layer 3 — Follow-through (the world reacts)
This is the layer designers skip and then can't figure out why hits feel small.
For melee hits on a human-sized enemy, Layer 3 includes:
- Cloth/armor rustle as the body recoils, 100–200 ms after Layer 1.
- A subtle "air displacement" or "body shift" — a low-mid swooshy element that suggests mass moving.
- For lethal/heavy hits, a brief vocalization layer (grunt, gasp), 80–150 ms after Layer 1.
For hits on inanimate targets:
- Debris/dust sound as appropriate (small particles for stone, splinters for wood).
- Resonance in the surrounding structure if it's a wall or large object (a brief, low room rumble).
Layer 3 is the quietest of the three: −8 to −12 dB relative to Layer 1. The player shouldn't consciously hear it as a separate event. They should just feel that "more happened" than just the strike.
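In Wwise this can be a separate event, or a delayed Play action, triggered about 100 ms after the main hit event. In FMOD Studio it can be an additional instrument on its own track in the impact event, with a trigger delay (a multi instrument picks one playlist entry per trigger, so it varies a layer rather than stacking layers). In either system, randomize the offset by ±20 ms so the rhythm doesn't lock in.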
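If you are prototyping outside middleware, or want to sanity-check the timing before authoring it, the randomized offset is a one-liner. This is a minimal sketch, assuming a uniform distribution around the 100 ms base; the function name is hypothetical.

```python
import random


def layer3_delay_ms(rng: random.Random, base_ms: float = 100.0,
                    jitter_ms: float = 20.0) -> float:
    """Delay from the Layer 1 trigger to the Layer 3 trigger, with jitter."""
    return rng.uniform(base_ms - jitter_ms, base_ms + jitter_ms)


rng = random.Random(11)  # seeded only so the demo is repeatable
for hit_ms in (0.0, 300.0, 600.0):  # a three-hit combo at 0.3 s intervals
    start = hit_ms + layer3_delay_ms(rng)
    print(f"hit at {hit_ms:.0f} ms -> Layer 3 at {start:.1f} ms")
```

Draw a fresh delay for every hit. Reusing one value per combo brings back the locked rhythm the jitter is meant to break.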
Mixing the stack
Starting points:
- Layer 1 (contact): 0 dB reference
- Layer 2 (body): −2 to −4 dB, varied by target
- Layer 3 (follow-through): −8 to −12 dB
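To hear these starting points before touching middleware, you can sum the three layers offline. The sketch below is a naive mixer over plain lists of floats; the function names and the tuple layout are assumptions for illustration, and it does no limiting or clipping protection.

```python
def db_to_gain(db: float) -> float:
    """Convert decibels to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def mix_stack(layers, sample_rate=48000):
    """Sum layers into one buffer.

    layers: list of (samples, gain_db, offset_ms) tuples, where samples is a
    list of floats and offset_ms is the start time relative to Layer 1.
    """
    if not layers:
        return []
    placed = []
    for samples, gain_db, offset_ms in layers:
        start = round(offset_ms * sample_rate / 1000)
        placed.append((start, db_to_gain(gain_db), samples))
    length = max(start + len(samples) for start, _, samples in placed)
    out = [0.0] * length
    for start, gain, samples in placed:
        for i, x in enumerate(samples):
            out[start + i] += gain * x
    return out


# One-sample "layers" at 1 kHz sample rate, so one sample equals one millisecond.
stack = [
    ([1.0], 0.0, 0.0),      # Layer 1: reference level, at the hit
    ([1.0], -3.0, 15.0),    # Layer 2: 15 ms later
    ([1.0], -10.0, 100.0),  # Layer 3: 100 ms later
]
mixed = mix_stack(stack, sample_rate=1000)
print(round(mixed[0], 3), round(mixed[15], 3), round(mixed[100], 3))
```

In linear terms, −3 dB is about 0.708 of the Layer 1 amplitude and −10 dB is about 0.316, which is why Layer 3 is felt more than heard.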
Same EQ logic as footsteps: don't filter the transient on Layer 1. The brightness is the weapon. If Layer 2 fights the music, notch the music or roll off Layer 2 above 6 kHz — never compress Layer 1's transient flat.
One specific spectral move that helps: high-pass Layer 3 around 80 Hz. Almost all of Layer 3's "feel" is in the 150–800 Hz band; everything below 80 Hz is rumble that piles up in fast combat. The high-pass keeps your low end clean for whatever sub content the actual world has (footsteps Layer C, magic, explosions).
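If you want to bake the high-pass into the Layer 3 assets rather than apply it at runtime, a first-order filter is enough to show the idea. The slope is an assumption: nothing above specifies one, and a gentle 6 dB per octave filter like this may need to be steeper in practice.

```python
import math


def high_pass(samples, cutoff_hz=80.0, sample_rate=48000):
    """First-order (RC-style) high-pass filter over a list of floats."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        # Passes changes in the input, bleeds off anything steady or slow.
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def peak_after_settling(freq_hz, sample_rate=48000):
    """Peak output level for a unit sine at freq_hz, measured over the last 100 ms of a one-second test tone."""
    tone = [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(sample_rate)]
    tail = high_pass(tone, sample_rate=sample_rate)[-sample_rate // 10:]
    return max(abs(y) for y in tail)


for freq in (20, 80, 400):
    print(f"{freq} Hz -> peak {peak_after_settling(freq):.2f}")
```

The 150–800 Hz band passes nearly untouched, while content at 20 Hz drops to roughly a quarter of its original amplitude.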
Timing across multiple hits
The stack approach has a hidden benefit: it survives fast combat better than single-sample hits do. Three-hit combos at 0.3-second intervals are common in action games. With single-sample hits, the second and third hits often sound thinner than the first because the brain is still resolving the previous one (auditory masking). With a layered stack, even when Layer 2 of hit one is masking Layer 2 of hit two, Layer 1 of hit two is still clean and the transient lands.
For very fast combo strings (4+ hits per second on light weapons), I attenuate Layer 3 by 6 dB on the second hit of a combo and skip it entirely from the third hit on. This avoids the wash of "more happened" sounds piling up.
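That rule is small enough to express as a single function. The sketch below encodes one reading of it: the first hit plays Layer 3 in full, the second at −6 dB, and later hits skip it. The function name, the zero-based hit index, and the `None` convention for "skip" are assumptions.

```python
from typing import Optional


def layer3_trim_db(hit_index: int, hits_per_second: float,
                   fast_threshold: float = 4.0) -> Optional[float]:
    """Extra Layer 3 attenuation in dB for a hit, or None to skip Layer 3.

    hit_index is zero-based within the current combo.
    """
    if hits_per_second < fast_threshold or hit_index == 0:
        return 0.0   # slow combos and opening hits play Layer 3 in full
    if hit_index == 1:
        return -6.0  # second hit of a fast combo: duck Layer 3
    return None      # third hit onward: drop Layer 3 entirely


for index in range(4):
    print(index, layer3_trim_db(index, hits_per_second=5.0))
```

Add the returned value to the normal Layer 3 level (−8 to −12 dB relative to Layer 1) rather than replacing it.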
Common playtest fixes
"Hits feel light." Either Layer 3 is missing or Layer 1 is too compressed. If Layer 3 is missing, add it before you touch anything else. If Layer 1 is compressed, swap to a source with the transient preserved.
"The combat sounds muddy in busy fights." Layer 2 tails are too long. Cap them at 250 ms for combat enemies.
"Hits sound the same regardless of what I hit." Layer 2 is not varied per target type, or worse, Layer 1 is doing all the work and includes a baked target sound. Split Layer 1 (weapon-only) from Layer 2 (target-specific).
"Headshots / weakpoint hits don't feel distinct." Treat crits as an optional fourth layer that is processing rather than a new sample: a brief brightness boost (+3 dB EQ peak at 2 kHz with a 200 ms decay) applied over the standard stack. Or swap Layer 1 to a brighter alternate. Either way, modify the existing stack instead of adding new sounds.
"Ranged weapon hits feel less impactful than melee." Almost always because ranged hit events don't include Layer 3 — designers think "the projectile is the SFX, the hit is just a confirmation." Wrong. Ranged hits need the same three-layer treatment, just with a brighter, shorter Layer 1 (the projectile's contact, not the weapon's swing).
Source picking
If you're assembling this from a library, here's what to pull for each layer:
- Layer 1: short impact one-shots (under 100 ms) tagged with the weapon material. Look in short-tier SFX catalogs — anything organized by duration tier tends to have these isolated.
- Layer 2: medium-tier impacts and material-response sounds. Body falls, armor rings, cloth thumps. Trim the contact transient if it's there — you only want the response.
- Layer 3: cloth rustles, debris one-shots, brief vocalizations, soft body shifts. Medium-tier ambient/foley.
The freesoundlab catalog is structured this way: short tier for transient/contact material, medium for body responses, with separate categories for cloth, debris, and vocalizations. Whatever library you use, look for duration-based organization — it makes three-layer building tractable instead of a hunt.
Once the stack is working, the next thing to vary is per-weapon character: pitch and level each layer slightly differently for weapons in the same class. For example, if a longsword plays Layer 1 unmodified, a greatsword can use the same Layer 1 pitched down 2 semitones with Layer 3 raised 2 dB. That's polish on a working foundation. Get the foundation right first.
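Per-weapon variation like this is easiest to keep as data layered on top of the shared stack. The sketch below is illustrative only: the `WeaponVariant` type, the dictionary, and the −10 dB base level for Layer 3 (the midpoint of the −8 to −12 dB range) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeaponVariant:
    layer1_pitch_st: float = 0.0  # semitone shift applied to the shared Layer 1
    layer3_trim_db: float = 0.0   # level change applied to the shared Layer 3


VARIANTS = {
    "longsword": WeaponVariant(),
    "greatsword": WeaponVariant(layer1_pitch_st=-2.0, layer3_trim_db=2.0),
}


def layer_settings(weapon: str, base_layer3_db: float = -10.0) -> dict:
    """Resolve a weapon's playback rate for Layer 1 and level for Layer 3."""
    variant = VARIANTS[weapon]
    return {
        "layer1_rate": 2.0 ** (variant.layer1_pitch_st / 12.0),
        "layer3_db": base_layer3_db + variant.layer3_trim_db,
    }


for name in VARIANTS:
    print(name, layer_settings(name))
```

Keeping the variants as offsets means a change to the base stack carries through to every weapon in the class.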