How do I make my audio sound professional?
Professional audio is not a plugin you buy. It is a short chain you run in order: a clean source, a tamed room, the voice on top, then loudness on target. Here is the chain, link by link.
By Thomas, founder of CutScore · Updated June 2026
Here is the part nobody wants to hear. People will forgive a soft shot, a slightly green skin tone, even a wobbly handheld pan. They will not forgive bad audio. Sound is the sense your viewers cannot consciously fault but instantly distrust. When the voice is thin, distant, or buried under a stock-music bed, the brain quietly decides "amateur" before a single word has fully landed. I have shipped videos exactly like that and watched the retention graph fall off a cliff in the first ten seconds.
The trap is your monitoring. You edit on nice headphones in a quiet room, late at night, leaning in. Of course it sounds fine. Then someone watches on a phone speaker on a bus, and your carefully placed music is now a wall and your voice is a rumour behind it. The room you recorded in adds its own tax: bare walls bounce sound back into the mic, and that reflected energy is what we hear as "hollow" or "cheap."
None of this requires a studio to fix. Professional-sounding audio is mostly decisions, not money. The decisions just have to happen in a specific order, because each one builds on the last. Get the source clean and the rest is easy. Skip the source and no plugin will save you. So let us go link by link.
The five links that make audio sound professional.
Run them top to bottom. Each link has a target you can actually hit, and skipping any one of them is something a listener will notice, even if they cannot name it.
| Link in the chain | Target to hit | What it costs you if you skip it |
|---|---|---|
| Source distance | mic ≈ 1 ft, off-axis | A distant mic records more room than voice, and that reads as hollow and cheap. |
| Room reflections | soft, not bare | Bare walls bounce sound back into the mic and add an echoey "stairwell" tone. |
| Noise floor | low, steady | Hiss, hum and fan noise read as amateur before you have said a word. |
| High-pass filter | cut below ~80 Hz | Low rumble eats headroom and muddies the voice without you noticing. |
| Voice vs music | voice clearly on top | Music burying speech is the single most common amateur tell there is. |
| Loudness | ≈ −14 LUFS | Too quiet and your video feels timid next to everything else in the feed. |
| True peak | ≤ −1 dBTP | Hot peaks crackle and distort once the platform re-encodes your file. |
Checking loudness, peaks and the voice-music balance on every video by ear is slow and unreliable. CutScore measures all of it in one pass and hands back the exact gain changes.
Capture clean, then fix in the edit.
1. Get close, and tame the room first
Distance is the loudest enemy of professional audio, and it costs nothing to fix. Roughly speaking, doubling the gap between you and the mic halves the direct sound pressure (a drop of about 6 dB) while the room's reflections stay about as loud, so the room takes over, which is exactly what "hollow" means. Get the capsule about a foot from your mouth, slightly off to the side so plosives do not thump it. Then soften the space: a rug, a sofa, a wardrobe of clothes, anything absorbent kills the reflections that make a bare room sound like a bathroom. A budget mic up close in a soft room beats an expensive one across a tiled kitchen, every single time. This is the cheapest professional upgrade you will ever make.
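If you want a number for the distance penalty, the rule of thumb above can be sketched in a few lines. This assumes an idealised point source in a free field, which no bedroom really is, so treat the result as a rough guide rather than a measurement. The function name `direct_level_change_db` is my own, made up for illustration.

```python
import math


def direct_level_change_db(old_distance: float, new_distance: float) -> float:
    """Change in direct-sound level, in dB, when the mic moves.

    Assumes the idealised inverse-distance law (free field, point source).
    Real rooms deviate from this, so read the result as an estimate.
    """
    if old_distance <= 0 or new_distance <= 0:
        raise ValueError("distances must be positive")
    return -20.0 * math.log10(new_distance / old_distance)


# Moving the mic from 1 ft to 2 ft, then from 1 ft to 4 ft.
print(round(direct_level_change_db(1.0, 2.0), 1))  # -6.0
print(round(direct_level_change_db(1.0, 4.0), 1))  # -12.0
```

The units cancel, so feet or centimetres both work as long as you use the same one for both distances.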
2. Clean the floor and cut the rumble
Once the take is in, the first edit move is subtraction, not addition. Apply a light noise-reduction pass to knock down steady hiss, hum and fan noise, but go gently: push it too hard and the voice starts to sound underwater and robotic. Next, set a high-pass filter to roll off everything below about 80 Hz, where the rumble of traffic, air conditioning and desk thumps lives. You will not miss those frequencies in a voice, and removing them frees up headroom so the words can sit louder and clearer. If your audio still sounds muffled rather than noisy, the cause is usually the room and the mic distance, and there is a whole separate fix for muffled audio.
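Your editor's built-in high-pass filter is all you need here, but if you are curious what that filter does under the hood, here is a minimal sketch in plain Python. It assumes a standard second-order design (the widely used "Audio EQ Cookbook" biquad form) at 80 Hz and 48 kHz; your editor may use a different slope. The helper names (`highpass`, `sine`, `rms`) are illustrative, not from any particular tool.

```python
import math


def highpass_coefficients(cutoff_hz: float, sample_rate: float, q: float = 0.7071):
    """Second-order high-pass biquad coefficients, normalised so a[0] == 1."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 + cos_w0) / (2.0 * a0), -(1.0 + cos_w0) / a0, (1.0 + cos_w0) / (2.0 * a0)]
    a = [1.0, (-2.0 * cos_w0) / a0, (1.0 - alpha) / a0]
    return b, a


def highpass(samples, cutoff_hz: float = 80.0, sample_rate: float = 48000.0):
    """Filter a list of float samples, attenuating content below the cutoff."""
    b, a = highpass_coefficients(cutoff_hz, sample_rate)
    x1 = x2 = y1 = y2 = 0.0  # previous inputs and outputs
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def sine(freq_hz: float, seconds: float = 1.0, sample_rate: float = 48000.0):
    """Test tone used only to demonstrate the filter."""
    n = int(seconds * sample_rate)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


rumble = sine(30.0)    # stands in for traffic or desk thumps
voice = sine(1000.0)   # stands in for the speech range
print("30 Hz kept:", round(rms(highpass(rumble)) / rms(rumble), 2))
print("1 kHz kept:", round(rms(highpass(voice)) / rms(voice), 2))
```

The 30 Hz tone should come out strongly attenuated while the 1 kHz tone passes almost untouched, which is the whole point: the rumble goes, the voice stays.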
3. Put the voice on top of the music
This is the one that fools people into thinking they need better gear. They do not. They need to pull the music down. The most common amateur tell in all of online video is a stock-music bed sitting at the same level as the voice, or louder. In my own early videos I set the music by how it felt while editing on headphones, and it was always five or six decibels too loud on a phone. The rule that works: lower the music until you can hear every word clearly on the worst speakers you own, then drop it one more decibel for safety. If the music wants to swell, duck it under speech with a simple sidechain or a few manual keyframes. Here is the full breakdown on why music ends up louder than your voice.
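For the curious, sidechain ducking boils down to this: watch the voice level, and whenever it crosses a threshold, ease the music gain down, then ease it back up when the voice stops. Here is a rough sketch of that logic. The threshold, duck depth and smoothing rates are placeholder assumptions of mine, not recommended settings, and the per-frame voice levels are assumed to come from whatever meter you already have.

```python
def duck_music_gain_db(voice_levels_db, threshold_db=-40.0, duck_db=-12.0,
                       attack=0.5, release=0.1):
    """Return one music gain value (in dB) per frame of voice level.

    voice_levels_db: voice level per frame, in dB (assumed input).
    threshold_db:    above this, the voice counts as "speaking".
    duck_db:         how far to pull the music down while speaking.
    attack/release:  fraction of the remaining distance covered per frame
                     when ducking down / recovering (0..1).
    All default values are illustrative assumptions.
    """
    gains = []
    current = 0.0  # start with the music at its unducked level
    for level in voice_levels_db:
        target = duck_db if level > threshold_db else 0.0
        rate = attack if target < current else release
        current += rate * (target - current)
        gains.append(current)
    return gains


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to the multiplier you apply to samples."""
    return 10.0 ** (gain_db / 20.0)


# Silence, three frames of speech, then silence again.
levels = [-60.0, -20.0, -20.0, -20.0, -60.0, -60.0]
for level, gain in zip(levels, duck_music_gain_db(levels)):
    print(f"voice {level:6.1f} dB -> music gain {gain:6.2f} dB")
```

The fast attack and slow release are a deliberate choice: the music should get out of the way quickly when you start talking and creep back gently so the listener does not hear it pumping.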
4. Set the loudness, then guard the peaks
Now make the whole mix the right size. Normalise the full export toward −14 LUFS, the level YouTube normalises playback to, so your video does not feel timid beside the next one in the feed. Short-form feeds tend to run a touch hotter in practice, but −14 LUFS is a safe professional anchor across platforms. Then guard the ceiling: keep your true peak at or below −1 dBTP. That headroom matters because platforms re-encode your file, and a peak that touched zero on your timeline can crackle and distort after compression. Loud and clean is the goal, not loud and crunchy. If you are unsure where your file sits, check your audio levels before uploading.
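The arithmetic behind this step is simple enough to sketch. Given a measured integrated loudness and true peak (from any meter you trust), a plain gain change moves both by the same number of decibels, so you can see in advance whether hitting −14 LUFS would push the peak past −1 dBTP. The function name and the example readings below are hypothetical, chosen only to illustrate the calculation.

```python
TARGET_LUFS = -14.0          # loudness anchor from the text
MAX_TRUE_PEAK_DBTP = -1.0    # true-peak ceiling from the text


def normalisation_plan(measured_lufs: float, measured_true_peak_dbtp: float,
                       target_lufs: float = TARGET_LUFS,
                       ceiling_dbtp: float = MAX_TRUE_PEAK_DBTP) -> dict:
    """Work out the static gain to reach the target and whether peaks survive it.

    A static gain shifts loudness and true peak by the same amount in dB.
    """
    gain_db = target_lufs - measured_lufs
    peak_after = measured_true_peak_dbtp + gain_db
    overshoot = max(0.0, peak_after - ceiling_dbtp)
    return {
        "gain_db": gain_db,
        "peak_after_dbtp": peak_after,
        "overshoot_db": overshoot,
        # If True, plain gain is not enough: tame the peaks with a limiter
        # or settle for a little less gain.
        "needs_limiter": overshoot > 0.0,
    }


# Hypothetical readings: a quiet mix with fairly hot peaks.
print(normalisation_plan(-19.0, -4.0))
# {'gain_db': 5.0, 'peak_after_dbtp': 1.0, 'overshoot_db': 2.0, 'needs_limiter': True}

# Hypothetical readings: slightly quiet, plenty of headroom.
print(normalisation_plan(-16.0, -6.0))
# {'gain_db': 2.0, 'peak_after_dbtp': -4.0, 'overshoot_db': 0.0, 'needs_limiter': False}
```

In the first case, raising the mix by 5 dB would leave the peak 2 dB over the ceiling, which is exactly the "loud and crunchy" outcome to avoid.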
5. The contrarian last step: listen on the worst device
Most "professional audio" advice ends at the mix. The professionals I trust do the opposite of obsessing over their nice monitors: they finish by checking on the worst speaker they can find. A phone held at arm's length, a single laptop speaker, cheap earbuds in a noisy room. That is where most of your audience actually is, and it is the only honest test of whether the voice still cuts through. If your words survive the bad speaker, they will sound great everywhere. If they vanish, your mix is too quiet, your music is too loud, or your voice never had enough top end to begin with. Fix it for the worst case and the best case takes care of itself.
Here is a real CutScore report for an everyday video: loudness, true peak and the voice-music balance, all measured, with timestamps and the exact gain changes to make.
If you only fix three things.
Most of the jump from "homemade" to "this person knows what they are doing" comes from these three audio moves. Fix them first.
By ear, by meter, or in one pass.
By ear, on the worst speaker
Free, and better than nothing. The catch is the one we opened with: your monitoring flatters you. It works best when you finish on a phone speaker or cheap earbuds, not your nice headphones. Use the chain above so you are testing against targets, not vibes.
With a loudness meter
Accurate and honest. An EBU R128 loudness meter and a true-peak readout tell you exactly where your mix sits. The cost is time and knowledge: you have to know the targets, open the meter, and read it correctly for every single export. Great if you enjoy this. Most people do not.
With a coach in one pass
Hand the file (or a link) to CutScore. It measures loudness, true peak, noise and the voice-music balance against the right standard for your genre, then gives you a 0 to 100 score with timestamped evidence and the fixes. No meters to read. See a sample report.
Make every video sound like you meant it.
CutScore measures loudness, true peak, noise and the voice-music balance, then tells you exactly what to fix, with the evidence to back it up. Join the waitlist for early access.