Why your UI sounds feel exhausting after ten minutes (and three things to try)

technique · 7 min read · ui, ux-audio, mixing, fatigue

Two months into the alpha of a turn-based strategy game I worked on, a tester filed a comment that took us a while to act on: "the menu sounds are nice but they make me want to mute the game after a while." We'd shipped a clean, crisp UI palette — one click sound, one hover tick, one confirm chime. Properly mixed. No clipping. And after twenty minutes of inventory management, players hated it.

It wasn't the volume. It wasn't the brightness. It was that every click was identical, every confirm was identical, and they were sitting right in the middle of where the music was working hardest. The ear was adapting to the sounds, then re-encountering them as a small irritant every few seconds. By minute ten, the brain had categorized them as "noise to ignore," and the cognitive cost of ignoring them was real.

I see this in a lot of indie games. The audio is technically clean and aesthetically considered, and it's still tiring. Here's what's going on and three changes that consistently help.

What's actually happening to the listener

There are three mechanisms working against you when UI sounds become fatiguing:

Auditory adaptation. The ear physiologically reduces response to a stimulus that's been repeated unchanged. After about 8–12 repetitions of an identical sound at a similar volume, the listener stops registering it consciously. But the subconscious recognition is still firing, which feels like low-grade irritation rather than salience. Imagine someone tapping their finger at exactly the same interval forever — the same effect, just on a smaller scale.

Frequency masking against music. Most UI sounds land between 2 and 6 kHz, which is exactly where music vocals, lead instruments, and high-mid percussion live. When the UI click sits in the same band as the music, the listener's auditory system has to do extra work to separate them. Doable for ten minutes. Tiring over a long session of inventory management or talent-tree planning.

Lack of timing variation. A real-world click — a pen, a switch, a button — has microscopic timing variation in its components. The contact, the spring release, the housing resonance — these don't fire at exactly the same offset every time. A digital UI sound is bit-for-bit identical on every trigger. The brain reads identical timing as artificial, and the cumulative effect of "this is not a real thing" is fatiguing.

Fix one: pitch jitter, narrower than you think

The instinct is to randomize pitch widely so each click sounds different. This is wrong. Wide pitch randomization (±50 cents or more) makes the UI feel sloppy, like the sounds were assembled by accident. Listeners read it as a bug.

What actually works: ±10 to 15 cents. A semitone is 100 cents, so that's roughly a tenth to a seventh of a semitone. Imperceptible as a pitch change to almost everyone, but enough to break the "bit-identical trigger" signature that the brain reads as artificial.

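In Wwise this is the random pitch property of the sound container. In FMOD it's the pitch randomization on the instrument. In Unity's built-in audio you can set AudioSource.pitch to a small random value before each PlayOneShot; that property is a playback-speed multiplier rather than a cents value, so ±15 cents works out to a range of roughly 0.991 to 1.009. Set it once and don't touch it.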

Combine this with a very small timing offset — ±5 to 10 ms on the trigger — and the fatigue floor drops noticeably. Half the games I've A/B tested this on stop getting "the sounds get tiring" feedback within the first patch after this change. The other half had a deeper problem in the music mix.
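If your engine wants a playback-rate multiplier rather than cents, the conversion is a one-liner. Here is a minimal sketch of the jitter described above, not tied to any engine. The function names and default values are my own for illustration, and I'm assuming the timing jitter is applied as a small delay, since a sound can't start before its trigger.

```python
import random


def cents_to_ratio(cents: float) -> float:
    """Convert a pitch offset in cents to a playback-rate multiplier."""
    # There are 1200 cents in an octave, and one octave doubles the rate.
    return 2.0 ** (cents / 1200.0)


def jittered_trigger(rng: random.Random,
                     max_cents: float = 12.0,
                     max_delay_ms: float = 10.0) -> tuple[float, float]:
    """Return (pitch_ratio, delay_ms) for a single UI trigger.

    max_cents and max_delay_ms are illustrative defaults, not engine settings.
    """
    cents = rng.uniform(-max_cents, max_cents)
    # Assumption: timing jitter is modelled as a delay in [0, max_delay_ms],
    # which gives the same spread as +/- half that value around a fixed latency.
    delay_ms = rng.uniform(0.0, max_delay_ms)
    return cents_to_ratio(cents), delay_ms


rng = random.Random(7)  # seeded only so the example is repeatable
for _ in range(3):
    ratio, delay_ms = jittered_trigger(rng)
    print(f"pitch x{ratio:.4f}, delay {delay_ms:.1f} ms")
```

You would feed the ratio to whatever pitch or playback-speed parameter your engine exposes and schedule the sound after the returned delay.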

Fix two: 2–3 alternates, not 8

A common piece of advice is "have lots of variations." This is also wrong, or at least over-prescribed for UI specifically. For UI, two or three slight variations of the same sound — not different sounds, just the same sound recorded or rendered three times — is the sweet spot.

More than three and you start running into a different problem: the listener notices that "there are different click sounds" and the system feels gimmicky. Two or three feels natural. The ear hears variation without being able to enumerate it.

How to make these alternates if you don't have a sound designer on the team: take your single click, duplicate it, and apply imperceptibly different processing to each copy. Examples:

  • Alternate 1: original, −0.5 dB
  • Alternate 2: original, attack a few milliseconds slower on the envelope (not so slow that it softens the transient), −0.2 dB
  • Alternate 3: original, lowpass filter at 9 kHz instead of unfiltered

Each alternate is "the same click" but the spectral fingerprint is different enough that the adaptation mechanism doesn't lock onto a single signature. Cycle through them in order or random-no-repeat. Don't pure-random — pure random will sometimes play Alternate 1 three times in a row, defeating the point.
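If your middleware doesn't offer a random-no-repeat mode, it's easy to roll by hand. A rough sketch, assuming you just need to choose which alternate to play next; the class name and the asset names are hypothetical placeholders.

```python
import random


class NoRepeatPicker:
    """Pick items at random, never returning the same one twice in a row."""

    def __init__(self, items, rng=None):
        self._items = list(items)
        if len(self._items) < 2:
            raise ValueError("need at least two alternates to avoid repeats")
        self._rng = rng if rng is not None else random.Random()
        self._last_index = None

    def next(self):
        # Choose among every index except the one played last time.
        candidates = [i for i in range(len(self._items)) if i != self._last_index]
        self._last_index = self._rng.choice(candidates)
        return self._items[self._last_index]


# Hypothetical asset names standing in for your three alternates.
picker = NoRepeatPicker(["click_a", "click_b", "click_c"], random.Random(42))
sequence = [picker.next() for _ in range(8)]
print(sequence)
```

With only three alternates, excluding the last-played one leaves two candidates per trigger, which is enough to keep the order unpredictable without ever repeating back to back.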

Fix three: cut a hole for the music

This is the fix that gets resisted because it sounds like it makes the UI thinner when you preview it solo, and on its own at full volume it does. But in context — with music playing — it makes the whole mix breathe and the UI clearer.

Look at the frequency content of your background music. For most game music, the energy is densest between 200 Hz and 5 kHz, with a particular spike wherever the lead instrument lives (often 1–3 kHz for synths, 2–5 kHz for guitar leads, 800 Hz–2 kHz for vocals or strong synth pads).

Now look at your UI sound's frequency content in a spectrum analyzer. If they overlap in the dense music region, the UI is fighting for attention. The fix is to cut a small notch — 3–6 dB at the music's lead frequency, with a Q around 2 — out of the UI sound.

This makes the UI noticeably quieter when soloed. In context, it makes the UI cleaner to hear because it's not being masked anymore. You get the perceived clarity back without raising the level, which means you don't have to fight the music for headroom.
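To see what a cut like that does on paper, here is a sketch using the peaking-EQ biquad formulas from the widely circulated Audio EQ Cookbook. That filter type is my assumption; any parametric EQ band in your DAW does the same job. The 2 kHz center, −4 dB gain, and 48 kHz sample rate are example values, not recommendations for your mix.

```python
import cmath
import math


def peaking_eq_coeffs(f0_hz, gain_db, q, sample_rate_hz):
    """Biquad coefficients (b, a) for a peaking EQ band, normalized so a[0] == 1."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    b = [1.0 + alpha * amp, -2.0 * cos_w0, 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * cos_w0, 1.0 - alpha / amp]
    a0 = a[0]
    return [x / a0 for x in b], [x / a0 for x in a]


def gain_db_at(freq_hz, b, a, sample_rate_hz):
    """Magnitude response of the biquad at freq_hz, in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate_hz)
    z2 = z1 * z1
    h = (b[0] + b[1] * z1 + b[2] * z2) / (a[0] + a[1] * z1 + a[2] * z2)
    return 20.0 * math.log10(abs(h))


# Example values only: a 4 dB cut at 2 kHz with Q = 2, at 48 kHz.
b, a = peaking_eq_coeffs(f0_hz=2000.0, gain_db=-4.0, q=2.0, sample_rate_hz=48000.0)
for f in (100.0, 1000.0, 2000.0, 4000.0, 10000.0):
    print(f"{f:7.0f} Hz: {gain_db_at(f, b, a, 48000.0):+.2f} dB")
```

The cut reaches its full depth only at the center frequency and fades back toward 0 dB on either side, which is why the UI sound keeps most of its character outside the music's lead band.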

If your music changes between menus (loading screen vs combat result), you can either pick the most common lead frequency and notch for that, or notch differently per state if you have audio middleware that supports it.

A note on hover sounds specifically

Hover sounds are a separate problem. Clicks fire on user action: the player chose to make them happen. Hover fires passively as the cursor moves across elements. Whatever fatigue a click causes, hover causes more of it, because hover repeats faster and is more clearly outside player control.

Practical hover guidelines I keep coming back to:

  • Volume 6 to 10 dB below click. Hover is informational, click is committal. The volume relationship should mirror that.
  • Duration under 60 ms total including tail. Long hover sounds pile up when the cursor moves across a menu list quickly.
  • Lowpass at 6 kHz. Hover should sit underneath the conversation between music and click, not on top of it; cutting the very top makes it feel less needling.
  • No alternates. For hover, identical is fine because the listener never focuses on it long enough for adaptation to bite. Adding variation here gives diminishing returns and makes the menu sound busy.

If hover is the source of fatigue specifically — players complain about menu navigation more than menu actions — the answer is almost always "cut the hover by 4 dB" before anything else.
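If your engine takes linear gain instead of dB, or you want to sanity-check a hover asset against the guidelines above, something like this sketch works. The HoverSettings fields and the check_hover helper are hypothetical names of mine, not part of any engine or middleware.

```python
from dataclasses import dataclass


def db_to_gain(db: float) -> float:
    """Convert a level change in dB to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


@dataclass
class HoverSettings:
    level_below_click_db: float  # how far hover sits under click, as a positive number
    duration_ms: float           # total length including tail
    lowpass_hz: float            # lowpass cutoff applied to the hover sound


def check_hover(settings: HoverSettings) -> list[str]:
    """Return a list of guideline violations (empty if the settings pass)."""
    problems = []
    if not 6.0 <= settings.level_below_click_db <= 10.0:
        problems.append("level should be 6 to 10 dB below click")
    if settings.duration_ms >= 60.0:
        problems.append("duration should be under 60 ms including tail")
    if settings.lowpass_hz > 6000.0:
        problems.append("lowpass should be at 6 kHz or lower")
    return problems


hover = HoverSettings(level_below_click_db=8.0, duration_ms=45.0, lowpass_hz=6000.0)
print(check_hover(hover))
print(f"hover gain relative to click: {db_to_gain(-hover.level_below_click_db):.3f}")
print(f"a further 4 dB cut multiplies amplitude by {db_to_gain(-4.0):.3f}")
```

The no-alternates guideline isn't checked here because it's a property of how the sound is triggered, not of the asset itself.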

A few things that don't help

For completeness, these come up a lot in pitch meetings and they don't move the needle:

Lowering UI volume across the board. Players turn up game volume to hear the music, so the UI ends up at the same effective level. This is a volume balance problem against music, not a UI level problem.

Adding reverb to UI. It makes UI feel like SFX, which is wrong. UI should feel close and immediate. The illusion of being in the same space as the menu is broken if menu sounds have a tail. Keep UI dry or use the shortest possible plate, −25 dB or quieter, only on confirm chimes if at all.

Replacing the sound entirely with something "more pleasant." Pleasantness is the wrong target. Click sounds need to be informationally crisp and not fatiguing. Pleasantness in UI comes from the absence of irritation, not the presence of beauty. A click that sounds beautiful in isolation will still fatigue if it has no variation and overlaps with the music band.

Spatializing UI to feel "more immersive." Don't. UI is not a thing in the world. It's a layer on top. Spatializing it confuses the player about whether the sound is diegetic.

How to check if it's working

After making changes, do this: open a menu-heavy section of your game and run it for 20 minutes. Not in headphones; use whatever speakers a typical player would use. Phone speakers if you're shipping mobile. Laptop speakers if you're shipping desktop and assuming the worst case. If you get past the 15-minute mark without wanting to mute the game, the changes worked. If you still want to mute at 10, the problem is deeper than the UI palette: probably the music sits on top of the UI band and needs to be remixed.

The best UI audio in games is the kind you stop noticing but never stop registering. That's a thin line. The three fixes above (pitch jitter of ±10 to 15 cents, two or three alternates, a notch at the music's lead band) get you most of the way there with no new sound design work, just better use of what you already have.

Where to pull source from

If you're starting from scratch rather than fixing what you have, the catalog work I do for freesoundlab puts UI sounds in the short tier specifically. Two practical reasons: short SFX with clean transients are what you want for UI (the long ones have ambient tails you'd just have to trim), and a duration-based catalog makes it easier to pick three samples that already share a sonic family and use them as your three alternates.

Whatever source you use, the discipline is the same: pick two or three closely related sounds, apply ±10 to 15 cents of jitter, notch for your music's lead frequency, and listen on the worst speaker your players will use. Most UI fatigue problems do not survive that protocol.