How do I improve my on-camera delivery?
Stiff, flat, too many ums: the things that make people click away from a talking head are almost all fixable, and most of them are habits, not talent. Here is what actually moves delivery, with targets you can hit before you publish.
By Thomas, founder of CutScore · Updated June 2026
Delivery is the one part of video that nobody can fix for you in post. You can pay an editor to grade your colour and balance your audio. You cannot pay anyone to give you eye contact you never gave the lens. So the thing that most separates a video that holds attention from one that loses it is the part most creators avoid looking at: how you actually came across when you pressed record.
And it is uncomfortable, because you are watching yourself. I have shipped talking-head videos that I thought were warm and confident, then rewatched them six months later and physically winced. Flat voice, dead eyes, a verbal tic I had no idea I had. The performance that felt fine in the room read as half-asleep on screen. That gap between how it feels and how it lands is the whole problem.
Here is the reassuring part. Almost none of this is about charisma you were born with. Delivery is a stack of habits, and habits have targets. Filler words have a count. Pace has a number. Energy has a baseline you can push. None of it requires a personality transplant. It requires watching yourself back, honestly, and changing one thing at a time. Let me give you the five things that move the needle.
The five things that fix on-camera delivery.
The table breaks the five fixes into eight habits, ranked by how much each one changes the viewer's experience per minute of effort. Fix them top to bottom, not all at once.
| Habit | Target to hit | What it costs you if you skip it |
|---|---|---|
| Filler words | < 3 / min | A dozen "ums" a minute quietly tells people you are not sure of yourself. |
| Speaking pace | ≈ 140–160 wpm | Too slow and the scroll wins; a flat, even pace reads as boring. |
| Energy level | one notch up | The camera flattens everyone, so "natural" reads as flat on screen. |
| Eye contact | to the lens | Looking at your own preview reads as shifty, like you are avoiding the viewer. |
| Pitch variation | rise and fall | A monotone flattens even good writing into a lecture nobody finishes. |
| Pauses | on purpose | Filling every silence with sound makes you sound nervous, not natural. |
| Body language | open, still-ish | Crossed arms and constant sway pull focus off what you are saying. |
| First 3 seconds | one reason to stay | Most of your drop-off happens right here, before you have warmed up. |
Counting your own fillers and judging your own energy is brutal. CutScore measures pace, filler rate and where your energy dips, with timestamps, so you can fix the take instead of agonising over it.
Five fixes, one at a time.
1. Cut the filler words first
This is the fastest visible win, so start here. Record a normal take, then play it back and count every "um," "like," "you know" and "so." Most people are shocked by the number. The fix is not willpower, it is the pause: when you feel an um coming, close your mouth and let the silence sit for half a second instead. Silence reads as confident on camera. The verbal tic reads as nervous. You can track this directly as filler words per minute, and a good target for talking-head video is under three a minute. A separate piece on how to stop saying um and like goes deeper on the drills.
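If you would rather not count by hand, the filler-rate arithmetic can be sketched in a few lines. This is a rough illustration only, assuming you already have a text transcript of the take and its length in seconds. The function name and the filler list are hypothetical, and this is not a description of how CutScore measures it. It also counts every "like" and "so," including the legitimate ones, so treat the result as an upper bound.

```python
import re

# Hypothetical filler list, taken from the words named above; swap in your own tics.
FILLERS = ("um", "like", "you know", "so")


def fillers_per_minute(transcript: str, duration_seconds: float) -> float:
    """Return the number of filler words per minute for one take."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    text = transcript.lower()
    count = 0
    for filler in FILLERS:
        # \b stops "so" matching inside "some" and "like" inside "likely".
        pattern = r"\b" + re.escape(filler) + r"\b"
        count += len(re.findall(pattern, text))
    return count * 60 / duration_seconds


# Example: a 30-second take, checked against the under-three-a-minute target.
take = "Um, so today I, like, want to, you know, talk about pace."
rate = fillers_per_minute(take, duration_seconds=30)
print(f"{rate:.1f} fillers/min (target: under 3)")
```

Run it on the same script before and after a week of practising the pause, and compare the two numbers rather than judging either one on its own.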
2. Get your pace right, then vary it
Most engaging talking-head delivery runs around 140 to 160 words per minute, a touch faster than ordinary conversation. But the number is the easy part. The thing that actually kills delivery is a pace that never changes, the same metronome rhythm from the first line to the last. Speed up on the connective tissue, the bits nobody needs to remember. Then slow right down on the one sentence you want to land, and let a pause hang after it. If you are not sure whether you drone, "How fast you should talk in a video" has the detail.
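Both halves of this, the overall number and whether it ever changes, can be checked with simple arithmetic. The sketch below assumes you have a transcript split into timed segments, for example from caption timestamps. The function names and the segment format are hypothetical, chosen for illustration, and words are counted naively by splitting on whitespace.

```python
def words_per_minute(text: str, duration_seconds: float) -> float:
    """Return speaking pace in words per minute for one stretch of speech."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(text.split()) * 60 / duration_seconds


def pace_by_segment(segments: list[tuple[str, float, float]]) -> list[float]:
    """Return the pace of each (text, start_seconds, end_seconds) segment."""
    return [words_per_minute(text, end - start) for text, start, end in segments]


def pace_spread(segments: list[tuple[str, float, float]]) -> float:
    """Return the gap between the fastest and slowest segment, in wpm.

    A spread near zero suggests the metronome rhythm described above.
    """
    paces = pace_by_segment(segments)
    return max(paces) - min(paces)


# Example: quick connective tissue, then one line delivered slowly.
segments = [
    ("so that is the setup and here is why it matters", 0.0, 3.5),
    ("silence reads as confident", 3.5, 6.5),
]
for pace in pace_by_segment(segments):
    print(f"{pace:.0f} wpm")
print(f"spread: {pace_spread(segments):.0f} wpm")
```

The overall figure tells you whether you are near the 140 to 160 range. The spread tells you whether you ever leave it on purpose.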
3. Push your energy one notch past comfortable
The camera flattens everyone, by roughly a notch. The level that feels natural and warm to you, sitting in your room, arrives on screen as muted and a bit tired. So the move that feels like over-acting in the moment usually reads as completely normal to the viewer. Sit up. Smile before the first word, even if you cut it. Let your voice rise and fall instead of holding one flat pitch, because monotone is what people actually mean when they say a video is "boring." If you suspect that is your problem, "Why you sound boring on camera" breaks it down.
4. Talk to the lens, not to yourself
This one is small and it changes everything. If your camera shows a live preview, you will instinctively watch yourself instead of the lens, and on screen that reads as shifty eyes that never quite meet the viewer. Cover the preview, or stick a tiny arrow above the lens, and talk to the glass as if it were one person you like. Eye contact is most of what we read as confidence. Pair it with open body language, hands visible, shoulders down, and stop the nervous sway. Stillness is not stiffness. Stillness is authority.
5. Script and rehearse the first three seconds
Your delivery is judged hardest at the exact moment you are least warmed up: the opening. Most drop-off happens in the first three seconds, so do not waste them clearing your throat or saying "hey guys, so today." Write that opening line out, say it out loud ten times, and deliver it with your best energy of the whole video. Everything after it has a little forgiveness baked in. The first three seconds have none, so over-prepare them. "What the first three seconds should be" covers what to actually say.
Here is a real CutScore coaching report for an everyday talking-head video: pace, filler rate, energy dips and the hook, scored, with timestamps and the exact fixes.
If you only fix three things.
Most of the jump from "stiff and forgettable" to "this person knows how to hold a camera" comes from these three. Fix them first.
Practice, a real critic, or one pass.
Reps and the rewatch
Free, and it works, but slowly. Record, watch yourself back honestly, pick one habit, fix it, repeat. The catch is the same one we opened with: judging your own delivery is hard, and most people quietly stop watching the cringe-worthy parts, which are the parts that need the work.
A blunt human critic
Honest feedback from someone who will not spare your feelings is gold. The cost is finding that person, and then getting their feedback on every video. Friends say "it was great." You want the one who says "you said 'like' eleven times and you trailed off at the end." Those people are rare.
A coach in one pass
Hand the file (or a link) to CutScore. It measures your pace, your filler rate, where your energy sags and how your hook lands, then gives you a 0 to 100 score with timestamped evidence and the fixes. No friend to bribe, no cringe to push through alone. See a sample report.
Stop guessing how you came across.
CutScore measures your pace, your filler words and where your energy dips, then tells you exactly what to fix, with timestamps. Join the waitlist for early access.