F · AUDIO QUALITY & BALANCE

Voice-to-music balance

How far the music bed should sit under the speech.

By Thomas Linck, founder · Updated June 2026

Voice-to-music balance is how far the music bed sits below the speech in your mix. For spoken content, common guidance is to keep the music 15–20 dB under the voice while anyone is talking, ducking the bed under each line. Get it wrong and the viewer rewinds to catch the words.

WHY IT MATTERS

Platforms normalize your overall loudness, but they never touch the balance inside your mix. If the music buries the voice in your export, it stays buried after upload. The tell is unmistakable: viewers rewind to catch words. Keep the bed 15–20 dB under the speech, duck it while anyone talks, then test on a single phone speaker, where the balance collapses first.

TARGET · STANDARD

Music under voice	15–20 dB below	while anyone is talking
Ducking	drop the bed under speech	sidechain or keyframes
The test	single phone speaker	every word, no effort
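
To make the target concrete, here is a minimal Python sketch of the decibel arithmetic. It assumes you can measure the voice and the music separately (for example, from stems) as float samples between -1.0 and 1.0. The function names and the 15 dB default are illustrative assumptions, not part of any particular tool.

```python
import math


def rms_dbfs(samples):
    """RMS level of float samples (-1.0..1.0) in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    # 10*log10 of the mean square equals 20*log10 of the RMS amplitude.
    return 10.0 * math.log10(mean_square)


def db_to_gain(db):
    """Convert a decibel change into a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def cut_needed_db(voice_db, music_db, target_gap_db=15.0):
    """How many dB to pull the music down so it sits target_gap_db under the voice.

    target_gap_db=15.0 is the low end of the 15-20 dB guidance (an assumed default).
    Returns 0.0 when the music is already far enough under the voice.
    """
    gap = voice_db - music_db
    return max(0.0, target_gap_db - gap)
```

For example, a voice at -16 dBFS over music at -24 dBFS is only 8 dB apart, so `cut_needed_db(-16.0, -24.0)` returns 7.0. `db_to_gain(-7.0)` is roughly 0.45, meaning the music comes down to a little under half its amplitude.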

How CutScore measures it

CutScore measures how far the music bed sits under your speech across the file and flags the stretches where the music is winning, with timestamps and the exact decibel cut to make, so you fix the masking before a viewer has to rewind.
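
The idea of flagging stretches by timestamp can be sketched in a few lines of Python. This is not CutScore's actual method, which is not described here. It is a simplified illustration that assumes you have the voice and music as separate, time-aligned stems of float samples. The function names, the 0.5 second window, and the -40 dBFS speech floor are all hypothetical choices.

```python
import math


def level_db(block):
    """RMS level of a block of float samples in dBFS, floored at -120 dB."""
    if not block:
        return -120.0
    mean_square = sum(s * s for s in block) / len(block)
    return 10.0 * math.log10(max(mean_square, 1e-12))


def flag_masked_windows(voice, music, sample_rate, window_s=0.5,
                        target_gap_db=15.0, speech_floor_db=-40.0):
    """Return (start_seconds, gap_db, cut_db) for windows where the music is too loud.

    A window is checked only when the voice is above speech_floor_db,
    i.e. when someone is (probably) talking. All thresholds are assumptions.
    """
    size = int(window_s * sample_rate)
    usable = min(len(voice), len(music))
    flagged = []
    for start in range(0, usable - size + 1, size):
        voice_db = level_db(voice[start:start + size])
        if voice_db < speech_floor_db:
            continue  # nobody is talking, so there is no balance to check
        music_db = level_db(music[start:start + size])
        gap = voice_db - music_db
        if gap < target_gap_db:
            cut = target_gap_db - gap  # dB to remove from the music here
            flagged.append((start / sample_rate, round(gap, 1), round(cut, 1)))
    return flagged
```

Each tuple gives the start time of a problem window, how far the music actually sits under the voice there, and how many decibels to cut from the music to reach the target. A real mix rarely arrives as clean stems, so treat this as a way to reason about the measurement rather than a finished analyzer.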


QUESTIONS

Frequently asked.

How much quieter should music be than voice?

Roughly 15–20 dB below the voice whenever someone is talking. The real test is a phone speaker: if you catch every word without effort while still hearing the music, the balance is right.

What is ducking?

Ducking automatically lowers the music whenever the voice is present and lifts it back in the gaps. Most editors do it with a sidechain compressor or keyframes, and it is the difference between a clean mix and a constant fight between voice and music.
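
As a rough illustration of what a ducker does under the hood, here is a minimal Python sketch. It is a simplified fixed-depth ducker, not a full sidechain compressor with a ratio and knee. It assumes time-aligned voice and music stems as float samples, and the threshold, 12 dB depth, and attack and release times are illustrative values rather than recommendations.

```python
import math


def duck_music(voice, music, sample_rate, threshold_db=-40.0,
               duck_db=-12.0, attack_s=0.05, release_s=0.4):
    """Lower `music` by duck_db while `voice` is above threshold_db.

    The gain change is smoothed so the music fades down (attack) and
    back up (release) instead of jumping. All defaults are assumptions.
    """
    threshold = 10.0 ** (threshold_db / 20.0)
    ducked_gain = 10.0 ** (duck_db / 20.0)
    # One-pole smoothing coefficients derived from the time constants.
    attack = math.exp(-1.0 / (attack_s * sample_rate))
    release = math.exp(-1.0 / (release_s * sample_rate))
    env_decay = math.exp(-1.0 / (0.05 * sample_rate))  # ~50 ms envelope decay
    envelope = 0.0
    gain = 1.0
    out = []
    for v, m in zip(voice, music):
        # Peak envelope follower: rises instantly, decays gradually,
        # so brief gaps between syllables do not release the duck.
        envelope = max(abs(v), envelope * env_decay)
        target = ducked_gain if envelope > threshold else 1.0
        coeff = attack if target < gain else release
        gain = coeff * gain + (1.0 - coeff) * target
        out.append(m * gain)
    return out
```

The voice acts as the sidechain signal: it is never modified, it only controls the gain applied to the music. A short attack gets the bed out of the way before the first word lands, and a longer release brings it back gently in the gaps.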