How do I reduce filler words in my videos?
Um, like, you know, so, basically. They creep into every take, and they are the fastest way to sound unsure of a point you are actually right about. Here is how to cut them at the script, in the delivery, and in the edit.
By Thomas, founder of CutScore · Updated June 2026
Filler words are not a character flaw. They are a stalling tactic your brain runs without asking you. When you are talking and the next thought has not arrived yet, your mouth refuses to sit in silence, so it plays a little holding music: "um," "so," "like," "you know." It buys you half a second to think. The problem is that the viewer hears the half second, not the thinking.
I have shipped videos where I counted my own ums afterward and physically winced. The strange part is you almost never hear them while you are talking. You are concentrating on the idea, so the fillers go in under the radar, the same way you stop hearing a fridge hum. Then you watch the export and there are nine "likes" in one sentence and you wonder who let you on camera.
So the goal is not to feel more confident and hope the ums vanish. The goal is to remove the reason your brain reaches for them, and then clean up whatever slips through. That splits neatly into three jobs: the writing, the speaking, and the cutting. Most of the work happens before you ever press record.
Where filler words come from, and how to kill each source.
Almost every filler traces back to one of three causes. Fix them in this order, because each one makes the next easier, and the first one does the heaviest lifting.
Scrubbing the timeline and guessing where the fillers hide is slow, and you miss half of them. CutScore transcribes the video, counts your fillers per minute, and gives you a timestamp for every cluster.
What counts as too many filler words?
There is no official rule, but there is a useful range. The point is not to hit zero. It is to land few enough that nobody is counting while they watch.
| Fillers per minute | How it reads | What to do |
|---|---|---|
| 0 / min | Suspiciously perfect. Often means a heavily scripted read. | Relax. A few small fillers make you sound human, not rehearsed. |
| 1–3 / min | Confident and natural. Most viewers never consciously notice. | Nothing. This is the target. Ship it. |
| 4–7 / min | Fine in conversation, a little loose for a polished video. | Cut the obvious clusters in the edit and move on. |
| 8–11 / min | Starts to feel hesitant. The viewer notices the pattern. | Tighten the script first, then jump-cut the rest. |
| 12+ / min | The fillers become the thing people hear, not your point. | Reshoot with an outline. The edit alone will not save it. |
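If you already have a transcript and the video length, the bands above can be roughed out in a few lines of Python. This is a minimal sketch, not how CutScore does it: the filler list is just the examples from this article, the function names are made up for illustration, and the cut-offs between bands (0.5, 3.5, 7.5, 11.5) are my assumption for handling fractional rates, since the table only lists whole numbers. It also counts every "like" and "so", including the legitimate ones, so treat the number as an upper bound.

```python
import re

# Assumed filler list, taken from the examples in this article.
FILLERS = ["you know", "um", "like", "so", "basically"]

# Word boundaries stop "so" from matching inside words such as "also".
FILLER_RE = re.compile(
    r"\b(" + "|".join(re.escape(f) for f in FILLERS) + r")\b",
    re.IGNORECASE,
)


def fillers_per_minute(transcript: str, duration_seconds: float) -> float:
    """Count filler matches in the transcript and divide by minutes."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    count = len(FILLER_RE.findall(transcript))
    return count / (duration_seconds / 60.0)


def band(rate: float) -> str:
    """Map a rate to a row of the table; the cut-offs are assumptions."""
    if rate < 0.5:
        return "suspiciously perfect"
    if rate < 3.5:
        return "target"
    if rate < 7.5:
        return "a little loose"
    if rate < 11.5:
        return "hesitant"
    return "reshoot"


if __name__ == "__main__":
    sample = "Um, so this is, like, the thing, you know."
    rate = fillers_per_minute(sample, duration_seconds=30)
    print(rate, band(rate))
```

Because the regex cannot tell a filler "like" from "I like this", a quick read of the matches is still worth it before you trust the band.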
Four habits that quietly cut your fillers.
1. Outline the beats, do not script the words
A full word-for-word script kills fillers but introduces a worse problem: you sound like you are reading, because you are. Bullets are the sweet spot. Write the beat ("explain the loudness number," "show the before and after"), not the sentence. Your brain fills in the words naturally, and because it always knows what comes next, it stops reaching for "um" to stall. This is the single biggest lever, and it costs you ten minutes before you record.
2. Get comfortable being slightly slower
Fillers love speed. When you rush, your mouth outruns your thoughts and the gap fills with "like." Drop to a steady pace, somewhere around 150 words a minute for a talking-head video, and the gap becomes a pause instead. Slower also gives the viewer time to follow you, which helps your on-camera delivery on every other axis too. You are not droning. You are leaving space.
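To check your own pace against that rough 150 figure, divide the word count of a transcript by the length of the take. A small sketch, assuming a plain-text transcript and a duration in seconds; the 20-word tolerance and the function names are my own placeholders, not a standard.

```python
def words_per_minute(transcript: str, duration_seconds: float) -> float:
    """Whitespace-separated word count divided by minutes."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(transcript.split()) / (duration_seconds / 60.0)


def pace_note(wpm: float, target: float = 150.0, tolerance: float = 20.0) -> str:
    """Compare a pace to the target; the tolerance is an assumed value."""
    if wpm > target + tolerance:
        return "slow down"
    if wpm < target - tolerance:
        return "you have room to speed up"
    return "on pace"


if __name__ == "__main__":
    take = " ".join(["word"] * 300)  # stand-in for a 300-word transcript
    wpm = words_per_minute(take, duration_seconds=120)
    print(wpm, pace_note(wpm))
```

Run it per section rather than per video if you can, because one rushed minute hides easily inside a calm average.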
3. Do a throwaway take first
Your first take of any section is a rehearsal whether you planned it or not. Record it, throw it away, and record again. The second pass almost always has fewer fillers because you have already found the words once. I treat the first take as me talking to myself to figure out what I actually mean. Then take two is the keeper. It feels wasteful, but it saves you an hour in the edit.
4. Edit from the transcript, not the waveform
Hunting for ums by scrubbing audio is slow and you will miss half of them. Pull a transcript instead. Fillers are easy to spot as text, and you can see the clusters at a glance: the four "you knows" in one paragraph that your ear glossed over. Cut to those, leave the odd small one, and use a jump cut on a pause so the join disappears. This is also where a tool that already counts and timestamps your fillers saves the most time.
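If your transcript comes with word-level timestamps, spotting clusters can be sketched like this. Everything here is an assumption for illustration: the input is a hypothetical list of `(start_seconds, word)` pairs, a "cluster" means three or more fillers with no more than three seconds between neighbours, and the filler list is the one from this article. It is not a description of how CutScore finds clusters.

```python
from typing import List, Tuple

# Assumed single-word fillers; "you know" is handled as a two-word pair below.
SINGLE_FILLERS = {"um", "like", "so", "basically"}


def _norm(word: str) -> str:
    """Lowercase a word and strip surrounding punctuation."""
    return word.strip(".,!?;:\"'").lower()


def filler_times(words: List[Tuple[float, str]]) -> List[float]:
    """Return the start time of every filler in a timestamped word list."""
    times = []
    for i, (start, word) in enumerate(words):
        token = _norm(word)
        if token in SINGLE_FILLERS:
            times.append(start)
        elif token == "you" and i + 1 < len(words) and _norm(words[i + 1][1]) == "know":
            times.append(start)
    return times


def find_clusters(
    times: List[float], max_gap: float = 3.0, min_count: int = 3
) -> List[Tuple[float, float, int]]:
    """Group fillers that sit within max_gap seconds of the previous one.

    Returns (first_time, last_time, count) for groups of at least min_count.
    """
    clusters = []
    group: List[float] = []
    for t in times:
        if group and t - group[-1] > max_gap:
            if len(group) >= min_count:
                clusters.append((group[0], group[-1], len(group)))
            group = []
        group.append(t)
    if len(group) >= min_count:
        clusters.append((group[0], group[-1], len(group)))
    return clusters


if __name__ == "__main__":
    words = [
        (0.0, "So"), (0.4, "like,"), (0.9, "you"), (1.1, "know,"),
        (1.5, "like"), (2.0, "this"), (20.0, "works"), (30.0, "um"),
        (31.0, "fine"),
    ]
    print(find_clusters(filler_times(words)))
```

The lone "um" at 30 seconds is left out on purpose. Isolated fillers are the ones worth leaving in, so the sketch only reports the stacked ones.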
Here is a real CutScore coaching report for an everyday talking-head video: filler words per minute, pacing, loudness and the rest, scored, with timestamps and the exact fixes.
Should you cut every single um?
No, and trying to is its own mistake. A video with literally zero fillers can feel uncanny, like a customer service hold message. A small "um" before a hard point, or a soft "so" that starts a sentence, is how real people talk, and it makes you sound like one. The thing viewers actually dislike is not the existence of fillers. It is the clusters: the stacked ums, the false starts, the "like, like, you know, like" that turns one idea into a traffic jam.
So cut the lumps and leave the texture. In my experience, a video that drops from twelve fillers a minute to two or three sounds night-and-day better, and a video that goes from two to zero sounds slightly worse, not better. Chase smooth, not sterile. The same logic runs through how I think about the rest of the craft: the goal is a video that does not trip the viewer, not one that has been polished until it has no fingerprints left. That is most of what we analyze.
Find out how many fillers you actually say.
CutScore counts your filler words per minute, timestamps every cluster, and scores your delivery alongside the rest of the craft. Join the waitlist for early access.