ON-CAMERA DELIVERY BLOG / 8 MIN READ

How do I reduce filler words in my videos?

Um, like, you know, so, basically. They creep into every take, and they are the fastest way to sound unsure of a point you are actually right about. Here is how to cut them at the script, in the delivery, and in the edit.

<4 fillers/min reads clean
3 places to fix them
1 jump cut hides the rest
0–100 craft score

By Thomas, founder of CutScore · Updated June 2026

DELIVERY CHECK · talking_head.mp4
Image: A presenter speaking into a microphone on camera, the moment where filler words like um and like quietly creep into a take and make a confident point sound unsure.
CRAFT SCORE
FIXES ADVISED
delivery, counted not guessed
Filler words high · 11 / min · 00:42
Cluster of ums · 4 in 8s · 01:57
Pace is fine · 155 wpm
The 30-second answer: To reduce filler words in your videos, fix them in three places. One, the script: write a bullet outline so your brain has somewhere to go and stops buying time with "um." Two, the delivery: slow down and let yourself pause in silence instead of filling the gap. Three, the edit: a clean jump cut removes the leftover fillers your script and delivery missed. Aim for under four fillers per minute, which reads as confident without sounding robotic. If counting them by ear sounds tedious, that is the exact thing CutScore measures for you.
WHY THE UMS HAPPEN

Filler words are not a character flaw. They are a stalling tactic your brain runs without asking you. When you are talking and the next thought has not arrived yet, your mouth refuses to sit in silence, so it plays a little holding music: "um," "so," "like," "you know." It buys you half a second to think. The problem is that the viewer hears the half second, not the thinking.

I have shipped videos where I counted my own ums afterward and physically winced. The strange part is you almost never hear them while you are talking. You are concentrating on the idea, so the fillers go in under the radar, the same way you stop hearing a fridge hum. Then you watch the export and there are nine "likes" in one sentence and you wonder who let you on camera.

So the goal is not to feel more confident and hope the ums vanish. The goal is to remove the reason your brain reaches for them, and then clean up whatever slips through. That splits neatly into three jobs: the writing, the speaking, and the cutting. Most of the work happens before you ever press record.

THE THREE FIXES

Where filler words come from, and how to kill each source.

Almost every filler traces back to one of three causes. Fix them in this order, because each one makes the next easier, and the first one does the heaviest lifting.

1
BEFORE RECORDING · SCRIPT
Give your brain somewhere to go
Most fillers are the sound of you searching for the next point with the camera rolling. A short bullet outline removes the search. You do not need a word-for-word script, which often sounds stiff anyway. You need to know what the next beat is, so the gap between sentences gets filled by a glance at your notes instead of an "um."
How: Write five to seven bullets, one per beat, and keep them where your eyeline lands. Talk to the bullet, not to the silence.
2
ON CAMERA · DELIVERY
Replace the filler with a pause
Silence feels like a mistake when you are the one talking. It is not. A one second pause that you hear as a yawning void reads, to the viewer, as someone who knows what they are saying. The trick is to let yourself stop. Close the sentence, breathe, then start the next one. The pause is also free editing: you can cut on it cleanly later.
How: When you feel an "um" coming, shut your mouth instead. Slightly slower delivery (around 150 words a minute) gives you room to do it. See how fast to talk.
3
IN THE EDIT · EDITING
Cut the survivors with jump cuts
Some fillers always get through, and that is what the timeline is for. Open your transcript, find the clusters (the stacked ums, the false starts where you restart a whole sentence), and remove them with a clean cut. You are not aiming for zero. You are removing the lumps that make the speech feel like it is wading through mud.
How: Cut on a pause or a natural breath so the edit is invisible. A jump cut on a held frame hides the join.
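If your transcription tool exports word-level timings, the "cut on a pause" advice can be roughed out in a few lines of Python. This is a sketch under assumptions, not how any particular editor or CutScore does it: the `(word, start, end)` tuple shape, the function name `pause_cut_points`, and the half-second `min_gap` threshold are all hypothetical choices for illustration.

```python
from typing import List, Tuple

# Assumed input shape: (word, start_seconds, end_seconds), in spoken order.
Word = Tuple[str, float, float]

def pause_cut_points(words: List[Word], min_gap: float = 0.5) -> List[float]:
    """Return the midpoint of every silence at least `min_gap` seconds long.

    The 0.5 s default is an assumed threshold, not a standard; tune it by ear.
    """
    cuts = []
    for (_, _, prev_end), (_, next_start, _) in zip(words, words[1:]):
        gap = next_start - prev_end
        if gap >= min_gap:
            # Cutting in the middle of the silence keeps both words intact.
            cuts.append((prev_end + next_start) / 2)
    return cuts

# Hypothetical timings: a one-second pause sits between "here" and "um".
example = [("so", 0.0, 0.25), ("here", 0.25, 0.5), ("um", 1.5, 1.75), ("is", 1.75, 2.0)]
candidate_cuts = pause_cut_points(example)
```

The returned times are only candidates. You still decide which pauses sit next to a filler worth removing.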
STOP COUNTING UMS BY EAR

Scrubbing the timeline guessing where the fillers hide is slow and you miss half of them. CutScore transcribes the video, counts your fillers per minute, and drops you a timestamp on every cluster.

Join the waitlist
HOW MANY IS TOO MANY

What counts as too many filler words?

There is no official rule, but there is a useful range. The point is not to hit zero, it is to land few enough that nobody is counting while they watch.

| Fillers per minute | How it reads | What to do |
| --- | --- | --- |
| 0 / min | Suspiciously perfect. Often means a heavily scripted read. | Relax. A few small fillers make you sound human, not rehearsed. |
| 1–3 / min | Confident and natural. Most viewers never consciously notice. | Nothing. This is the target. Ship it. |
| 4–7 / min | Fine in conversation, a little loose for a polished video. | Cut the obvious clusters in the edit and move on. |
| 8–11 / min | Starts to feel hesitant. The viewer notices the pattern. | Tighten the script first, then jump-cut the rest. |
| 12+ / min | The fillers become the thing people hear, not your point. | Reshoot with an outline. The edit alone will not save it. |
One number, not a vibe: Filler words per minute is one of the few things about your delivery you can actually measure instead of just feel. Count your fillers, divide by the minutes of speech, and you have a baseline you can beat next video. See filler words per minute for how the metric works.
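The arithmetic is simple enough to script yourself. Here is a minimal Python sketch, assuming you have a plain-text transcript and know how many seconds of speech it covers. The filler list is just the words this article names, the function names are made up for illustration, and the band cut-offs follow the table above, with fractional rates rounded down into the lower row (my assumption). It is a naive word match, so it will also count a legitimate "like" or "so", which means the number is an upper bound rather than a verdict.

```python
import re

# Assumed filler list: the words named in this article. Edit it to match your own habits.
FILLERS = ("um", "like", "you know", "so", "basically")
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b",
    flags=re.IGNORECASE,
)

def count_fillers(transcript: str) -> int:
    """Count whole-word matches; also counts non-filler uses of 'like' and 'so'."""
    return len(_PATTERN.findall(transcript))

def fillers_per_minute(transcript: str, speech_seconds: float) -> float:
    """Fillers divided by minutes of speech."""
    if speech_seconds <= 0:
        raise ValueError("speech_seconds must be positive")
    return count_fillers(transcript) * 60 / speech_seconds

def band(rate: float) -> str:
    """Map a rate onto the table's rows (cut-offs between rows are an assumption)."""
    if rate < 1:
        return "suspiciously perfect"
    if rate < 4:
        return "confident and natural"
    if rate < 8:
        return "a little loose"
    if rate < 12:
        return "starts to feel hesitant"
    return "fillers are what people hear"

rate = fillers_per_minute("Um, so I think, like, you know, this is basically it.", 60)
label = band(rate)
```

Run it on the same transcript before and after an edit and you have the baseline-versus-result comparison in two numbers.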
THE PRACTICAL PLAYBOOK

Four habits that quietly cut your fillers.

1. Outline the beats, do not script the words

A full word-for-word script kills fillers but introduces a worse problem: you sound like you are reading, because you are. Bullets are the sweet spot. Write the beat ("explain the loudness number," "show the before and after"), not the sentence. Your brain fills in the words naturally, and because it always knows what comes next, it stops reaching for "um" to stall. This is the single biggest lever, and it costs you ten minutes before you record.

2. Get comfortable being slightly slower

Fillers love speed. When you rush, your mouth outruns your thoughts and the gap fills with "like." Drop to a steady pace, somewhere around 150 words a minute for talking-head, and the gap becomes a pause instead. Slower also gives the viewer time to follow you, which helps your on-camera delivery on every other axis too. You are not droning. You are leaving space.
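To check your own pace against that rough 150 figure, the sum is just words divided by minutes. A small Python sketch follows, assuming a plain-text transcript and a known speech duration; both function names are hypothetical, and splitting on whitespace is a crude word count that is good enough for a ballpark.

```python
def words_per_minute(transcript: str, speech_seconds: float) -> float:
    """Whitespace-separated words divided by minutes of speech."""
    if speech_seconds <= 0:
        raise ValueError("speech_seconds must be positive")
    return len(transcript.split()) * 60 / speech_seconds

def seconds_at_pace(word_count: int, target_wpm: float = 150) -> float:
    """How long a script of `word_count` words runs at a target pace."""
    if target_wpm <= 0:
        raise ValueError("target_wpm must be positive")
    return word_count * 60 / target_wpm

# A 300-word outline spoken at 150 words a minute runs for two minutes.
runtime = seconds_at_pace(300)
```

If your measured pace sits well above the target, that is usually the cheaper thing to fix before you start hunting individual fillers.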

Image: A podcast studio with a microphone and headphones set up for a single speaker, the kind of relaxed recording setup where slowing down and pausing replaces the reflex to say um.
A pause you hear as a void reads as confidence to the viewer. Photo: Jakub Żerdzicki / Pexels.

3. Do a throwaway take first

Your first take of any section is a rehearsal whether you planned it or not. Record it, throw it away, and record again. The second pass always has fewer fillers because you have already found the words once. I treat the first take as me talking to myself to figure out what I actually mean. Then take two is the keeper. It feels wasteful and it saves you an hour in the edit.

4. Edit from the transcript, not the waveform

Hunting for ums by scrubbing audio is slow and you will miss half of them. Pull a transcript instead. Fillers are easy to spot as text, and you can see the clusters at a glance, the four "you knows" in one paragraph that your ear glossed over. Cut to those, leave the odd small one, and use a jump cut on a pause so the join disappears. This is also where a tool that already counts and timestamps your fillers saves the most time.
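Spotting clusters can also be scripted if you have a timestamp for each filler. The sketch below is one simple way to do it in Python, not a description of how CutScore works internally. It assumes a list of filler times in seconds, and the 8-second window and 4-filler minimum are borrowed from the "4 in 8s" example earlier in this article rather than from any standard.

```python
from typing import List, Tuple

def find_clusters(
    times: List[float], window: float = 8.0, min_count: int = 4
) -> List[Tuple[float, float, int]]:
    """Return (first_time, last_time, count) for each group of at least
    `min_count` fillers that starts and ends within `window` seconds."""
    times = sorted(times)
    clusters = []
    i = 0
    while i < len(times):
        j = i
        # Extend the group while the next filler is within `window` of the first one.
        while j + 1 < len(times) and times[j + 1] - times[i] <= window:
            j += 1
        count = j - i + 1
        if count >= min_count:
            clusters.append((times[i], times[j], count))
            i = j + 1  # move past the group so it is reported once
        else:
            i += 1
    return clusters

def as_timestamp(seconds: float) -> str:
    """Format seconds as mm:ss for jumping to the spot on a timeline."""
    whole = int(seconds)
    return f"{whole // 60:02d}:{whole % 60:02d}"

# Hypothetical filler times: one stray um, then four packed into seven seconds.
found = find_clusters([12.0, 117.0, 119.5, 121.0, 124.0, 200.0])
```

With the sample times above, the only cluster starts at `as_timestamp(117.0)`, which is `01:57`. The isolated fillers are ignored, which matches the advice to cut the lumps and leave the odd small one.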

RATHER SEE IT THAN READ IT?

Here is a real CutScore coaching report for an everyday talking-head video: filler words per minute, pacing, loudness and the rest, scored, with timestamps and the exact fixes.

See a sample report
THE PART NOBODY ADMITS

Should you cut every single um?

No, and trying to is its own mistake. A video with literally zero fillers can feel uncanny, like a customer service hold message. A small "um" before a hard point, or a soft "so" that starts a sentence, is how real people talk, and it makes you sound like one. The thing viewers actually dislike is not the existence of fillers. It is the clusters: the stacked ums, the false starts, the "like, like, you know, like" that turns one idea into a traffic jam.

So cut the lumps and leave the texture. In my experience, a video that drops from twelve fillers a minute to two or three sounds night-and-day better, and a video that goes from two to zero sounds slightly worse, not better. Chase smooth, not sterile. The same logic runs through how I think about the rest of the craft: the goal is a video that does not trip the viewer, not one that has been polished until it has no fingerprints left. That is most of what we analyze.

How CutScore counts your fillers for you: CutScore is an AI video quality coach for pre-publish QC. It transcribes your video, counts ums, likes, you-knows and false starts, and reports filler words per minute with a timestamp on every cluster, so you cut straight to the problem instead of scrubbing the timeline. Delivery sits alongside loudness, pacing, the hook and the rest, all rolled into one 0 to 100 score with the evidence and the fixes, before anyone else sees the video. It judges the craft of the video itself, not your tags or thumbnail, so it sits next to a growth tool rather than competing with one. More on the method and the standards.
QUESTIONS

Frequently asked.

How do I reduce filler words in my videos?
Fix it in three places. First, the script: a bullet outline gives your brain somewhere to go, so it stops buying time with um. Second, the delivery: slow down, and let yourself pause in silence instead of filling the gap. Third, the edit: a clean jump cut removes the leftover fillers your script and delivery missed. Most people who do all three drop from a dozen fillers a minute to two or three.

What counts as too many filler words?
As a rough guide, under four fillers a minute reads as confident and natural, and most viewers never consciously notice them. Around eight a minute starts to feel hesitant. A dozen or more a minute and the fillers become the thing people hear instead of your point. The goal is not zero, which sounds robotic. The goal is few enough that nobody counts.

Should you cut every single um?
No. A few fillers make you sound human, and cutting every single one can make the pacing feel unnatural and jittery. Cut the clusters: the long ums, the stacked you-knows, the false starts where you restart a whole sentence. Leave the odd small one. The aim is a video that flows, not a surgically sterile read.

Can CutScore count my filler words for me?
Yes. CutScore transcribes your video, counts your fillers, and reports filler words per minute with the timestamps where the clusters sit, so you can cut to them directly. It scores delivery alongside your loudness, pacing and the rest, and hands back concrete fixes before you publish, so you are not scrubbing the timeline guessing where the ums hide.
EARLY ACCESS

Find out how many fillers you actually say.

CutScore counts your filler words per minute, timestamps every cluster, and scores your delivery alongside the rest of the craft. Join the waitlist for early access.
