F · AUDIO QUALITY & BALANCE

Voice-to-music balance

How far the music bed should sit under the speech.

By Thomas Linck, founder · Updated June 2026

Voice-to-music balance is how far the music bed sits below the speech in your mix. For spoken content, common guidance is to keep the music 15–20 dB under the voice while anyone is talking, ducking the bed under each line. Get it wrong and the viewer rewinds to catch the words.

WHY IT MATTERS

Platforms normalize your overall loudness, but they never touch the balance inside your mix — if the music buries the voice in your export, it stays buried after upload. The tell is unmistakable: viewers rewind to catch words. Keep the bed 15–20 dB under the speech, duck it while anyone talks, then test on a single phone speaker, where the balance collapses first.
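
To make the 15–20 dB figure concrete, here is a minimal sketch of the standard decibel-to-amplitude conversion. The function names are illustrative, not from any particular editor; the formula itself (amplitude ratio = 10^(dB/20)) is the usual one for signal levels.

```python
import math

def db_to_gain(db: float) -> float:
    """Convert a level change in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)

def gain_to_db(gain: float) -> float:
    """Convert a linear amplitude multiplier back to decibels."""
    return 20 * math.log10(gain)

# Music sitting 15 dB and 20 dB under the voice, expressed as amplitude ratios.
for gap_db in (15, 20):
    print(f"{gap_db} dB under -> music amplitude x{db_to_gain(-gap_db):.3f}")
# 15 dB under -> music amplitude x0.178
# 20 dB under -> music amplitude x0.100
```

In other words, a bed that sits 20 dB under the voice has about a tenth of the voice's amplitude, and one at 15 dB under has a little under a fifth.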

TARGET · STANDARD
Music under voice · 15–20 dB below · while anyone is talking
Ducking · drop the bed under speech · sidechain or keyframes
The test · single phone speaker · every word, no effort

How CutScore measures it

CutScore measures how far the music bed sits under your speech across the file and flags the stretches where the music is winning, with timestamps and the exact decibel cut to make, so you fix the masking before a viewer has to rewind.

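
As a rough illustration of this kind of check, and not CutScore's actual method, the sketch below compares windowed RMS levels of two stems and reports where the gap falls short. It assumes you have separate voice and music stems as lists of float samples with full scale at 1.0. The function names, the 0.5-second window, the 15 dB target and the -50 dBFS speech floor are all assumptions chosen for the example.

```python
import math

def rms_db(samples):
    """RMS level of a block of samples in dBFS (full scale = 1.0)."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square)

def flag_masked_windows(voice, music, sample_rate, window_s=0.5,
                        target_gap_db=15.0, speech_floor_db=-50.0):
    """Return (start_seconds, gap_db, cut_db) for windows where the music is too loud."""
    size = int(sample_rate * window_s)
    flagged = []
    for start in range(0, min(len(voice), len(music)) - size + 1, size):
        voice_db = rms_db(voice[start:start + size])
        music_db = rms_db(music[start:start + size])
        if voice_db < speech_floor_db:
            continue  # nobody is talking in this window, so the bed can sit anywhere
        gap = voice_db - music_db
        if gap < target_gap_db:
            # cut_db is how much to lower the music to reach the target gap
            flagged.append((start / sample_rate, gap, target_gap_db - gap))
    return flagged
```

Each flagged tuple gives a timestamp, the measured gap, and the cut needed to reach the target. A real tool working from a single mixed file would first need some way to separate or estimate the two sources, which this sketch does not attempt.
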
QUESTIONS

Frequently asked.

Keep the music roughly 15–20 dB below the voice whenever someone is talking. The real test is a phone speaker: if you catch every word without effort while still hearing the music, the balance is right.
Ducking automatically lowers the music whenever the voice is present and lifts it back in the gaps. Most editors do it with a sidechain compressor or keyframes, and it is the difference between a clean mix and a constant fight between the two.
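
The ducking behaviour described above can be sketched as a simple envelope follower driving a gain on the music. This is a simplified stand-in for a sidechain compressor, not how any specific editor implements it. The function name and the threshold, attack and release values are assumptions for illustration, and the stems are assumed to be equal-rate lists of float samples.

```python
import math

def duck_music(voice, music, sample_rate, duck_db=-15.0,
               threshold=0.02, attack_s=0.05, release_s=0.4):
    """Lower `music` by `duck_db` while `voice` is active; return the ducked music."""
    ducked_gain = 10 ** (duck_db / 20)
    # Per-sample smoothing coefficients: fraction of the remaining distance
    # the gain moves toward its target on each sample.
    attack = 1 - math.exp(-1 / (attack_s * sample_rate))
    release = 1 - math.exp(-1 / (release_s * sample_rate))
    envelope = 0.0
    gain = 1.0
    out = []
    for v, m in zip(voice, music):
        # Envelope rises instantly with the voice and decays slowly, so short
        # gaps between words do not let the bed jump back up.
        envelope = max(abs(v), envelope * (1 - release))
        target = ducked_gain if envelope > threshold else 1.0
        coeff = attack if target < gain else release
        gain += coeff * (target - gain)
        out.append(m * gain)
    return out
```

A fast attack gets the bed out of the way before the first word lands, and a slower release brings it back gently in the gaps. Keyframing the music volume by hand achieves the same shape with more control and more effort.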