ON-SCREEN TEXT BLOG / 8 MIN READ

Do I need captions on my videos?

Short answer: for almost anything on a feed, yes. Here is who genuinely needs captions, why most viewers see them before they hear a word, and how to make yours actually readable instead of just present.

Mute: how most feeds autoplay
3 things make text readable
Safe zone or it gets covered
0–100 craft score

By Thomas, founder of CutScore · Updated June 2026

CAPTION CHECK · reel_v3.mp4
A creator filming a piece to camera on a phone with a ring light, the kind of talking-head clip that lives or dies on the feed by whether its on-screen captions are readable with the sound off.
CRAFT SCORE
FIXES ADVISED
are your captions readable on mute?
Caption size small · hard to read on a phone · 00:04
Low text contrast · add a solid backing · 00:12
Inside the safe zone · not under the UI
The 30-second answer

Yes, you almost certainly need captions on your videos. Most people scrolling a feed see your clip before they hear it, because platforms autoplay on mute, so on-screen text is often the only way your words land. Captions also serve viewers who are deaf or hard of hearing, watching in a second language, or stuck somewhere they cannot turn the sound on. The only real exceptions are videos with no speech, like pure music or ambient pieces. So the real question is not whether you need captions, it is whether yours are actually readable: big enough, high enough contrast, and inside the safe zone where the platform interface will not cover them.
THE PART PEOPLE GET WRONG

Most creators treat captions as an accessibility checkbox they add at the end if they have time. That framing is why so many videos ship with bad ones. Captions are not a courtesy bolted on after the edit. On a modern feed they are part of the video, and for a big chunk of your audience they are the only part that gets through, because the audio never plays.

Picture how people actually watch. Someone is on a train, or in an open-plan office, or lying next to a sleeping partner at 1am. The platform autoplays your clip with the sound off. They have not decided to mute you. The phone arrived muted, the way most of them do. In those first couple of seconds, your beautiful voiceover does not exist. The only thing carrying your point is whatever text is on the screen.

I have shipped videos with no captions and watched them sink, then re-uploaded the same edit with clean captions and watched it actually hold people. Same footage, same audio, same jokes. The difference was that the second version made sense with the sound off. Captions are not decoration. They are the silent version of your video, and the silent version is the one most people meet first.

WHO ACTUALLY NEEDS THEM

Who needs captions, and the rare cases that do not.

Almost every talking video benefits. A short list of formats can skip them. Here is how to tell which side of the line you are on.

Video type | Captions? | Why
Talking-head / vlog | yes, always | Your words are the product, and muted viewers need to read them to stay.
Tutorial / how-to | yes | Steps, names and numbers are easy to mishear and easy to miss without text.
Short-form (Reels, Shorts, TikTok) | yes, critical | These autoplay on mute and live or die in the first second, on text alone.
Product demo | yes | Feature names and prices have to be legible, not just spoken once and gone.
Interview / podcast clip | yes | Two voices and crosstalk are hard to follow on mute without on-screen text.
Music-only / ambient | optional | No speech to caption. Lyrics or a title card can still help, but they are not required.
Closed playback you control | depends | A kiosk or a known room with sound on changes the math. A feed never does.
Accessibility is the floor, not the bonus

Captions are how deaf and hard-of-hearing viewers follow your video at all. Treat that as the baseline reason to add them, and the muted-feed reason as the second one stacked on top. Both point at the same answer.
DON'T EYEBALL IT

Present is not the same as readable. CutScore checks whether your captions are the right size, high enough contrast, and inside the safe zone, then tells you where they fail.

Join the waitlist
MAKE THEM READABLE

Having captions is step one. Making them readable is the real job.

A caption that is technically on the screen but too small, too faint, or hiding behind a play button is doing nothing. Three things decide whether yours work.

1. Size: readable on a phone at arm's length
QUICK · TEXT
Most of your audience is on a small screen, often held away from their face. If you have to lean in to read your own captions on your laptop, they are far too small on a phone. Bigger is almost always better. A few large words per line beats a thin paragraph nobody can parse while scrolling.
How: Hold your phone at arm's length and play the clip. If you squint, size up. More in the guide to the right caption font size.
2. Contrast: white text disappears on a bright background
2-MIN FIX · TEXT
Plain white captions vanish the moment your shot has a window, a wall, or anything pale behind them. Give the text a solid backing (a dark box or a soft shadow) or a heavy outline. Then it survives any background your video throws at it, including the busy ones.
How: Add a semi-opaque box or a thick stroke. Test it against your brightest frame. See the guide to checking text contrast.
3. Placement: keep it inside the safe zone
PLACEMENT · TEXT
Every platform stacks its own buttons, captions and profile bits over the bottom and side of your video. If your text lives there, the interface eats it. Pull captions toward the middle third, away from the edges, so nothing important sits where a like button or the platform's own subtitles will land.
How: Keep text off the bottom and side margins. See the guides on where to place text and the Shorts safe zones.
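If you would rather put numbers on the contrast and placement checks than eyeball them, here is a minimal Python sketch. It is not how CutScore scores anything. The contrast part borrows the WCAG 2.x contrast-ratio formula as one reasonable yardstick, and the safe-zone margins (`margin_x`, `margin_top`, `margin_bottom`) are placeholder fractions I made up for illustration, since every platform covers a different part of the frame. The `Box` type and the function names are hypothetical too.

```python
from dataclasses import dataclass


def _linearize(channel: int) -> float:
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """Relative luminance of an sRGB colour, from 0.0 (black) to 1.0 (white)."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(text_rgb: tuple[int, int, int],
                   background_rgb: tuple[int, int, int]) -> float:
    """WCAG contrast ratio: 1.0 for identical colours, 21.0 for black on white."""
    l1 = relative_luminance(text_rgb)
    l2 = relative_luminance(background_rgb)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


@dataclass(frozen=True)
class Box:
    """A rectangle in pixels, origin at the top-left of the frame."""
    left: float
    top: float
    right: float
    bottom: float


def inside_safe_zone(caption: Box, frame_w: int, frame_h: int,
                     margin_x: float = 0.10,       # assumed, not a platform spec
                     margin_top: float = 0.10,     # assumed, not a platform spec
                     margin_bottom: float = 0.25   # assumed, not a platform spec
                     ) -> bool:
    """True if the caption box sits fully inside the assumed safe area."""
    safe = Box(
        left=frame_w * margin_x,
        top=frame_h * margin_top,
        right=frame_w * (1 - margin_x),
        bottom=frame_h * (1 - margin_bottom),
    )
    return (caption.left >= safe.left and caption.right <= safe.right
            and caption.top >= safe.top and caption.bottom <= safe.bottom)


if __name__ == "__main__":
    white, pale_wall, dark_box = (255, 255, 255), (230, 230, 225), (20, 20, 20)
    print("white on pale wall:", round(contrast_ratio(white, pale_wall), 2))
    print("white on dark box:", round(contrast_ratio(white, dark_box), 2))
    mid_frame = Box(200, 900, 880, 1050)
    print("mid-frame caption safe:", inside_safe_zone(mid_frame, 1080, 1920))
```

Sample the background colour from your brightest frame, as suggested above. White text on a pale wall scores close to 1, while the same text on a dark backing box scores far higher, which is the whole argument for the box.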
A video playing on a tablet propped on a desk, showing how on-screen captions have to stay large and high-contrast to read on a small handheld screen across the room.
The test that matters: can you read it on a handheld screen, across the room, with the sound off? Photo: Pixabay / Pexels.
CAPTIONS VS SUBTITLES VS ON-SCREEN TEXT

Captions, subtitles, burned-in text: which do you need?

Captions (the spoken words, often burned in)

For short-form, the captions people mean are usually the spoken words burned right into the picture, styled and animated. They are baked into the frame, so they always show, on every player, on mute, with no setting to toggle. This is the format that carries Reels, Shorts and TikTok. The downside is they are permanent: get the timing or the spelling wrong and it ships with the video. Worth getting right the first time.

Subtitles (a separate track the viewer turns on)

On longer YouTube videos you can upload a subtitle file the viewer toggles on or off, and translate it into other languages. This is great for reach and for people who want a clean frame. The catch is that it is off by default for most viewers, so it does nothing for the muted scroller who never opens the menu. On long-form, do both: a subtitle track for accessibility and a strong hook on screen for the first few seconds.
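For the separate-track route, a subtitle file is just timed text. Here is a small sketch that writes cues in SubRip (.srt) layout. I am using .srt only as an example of a widely used format, so check which formats your platform accepts. The `Cue` type, the function names and the sample lines are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp layout SubRip uses."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Render cues as numbered SubRip blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        timing = f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}"
        blocks.append(f"{index}\n{timing}\n{cue.text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    cues = [
        Cue(0.0, 1.5, "Do you need captions?"),
        Cue(1.5, 3.0, "Short answer: yes."),
    ]
    print(to_srt(cues))
```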

On-screen text (titles, labels, callouts)

This is not captions, but it lives in the same family and obeys the same rules. A title card, a name lower-third, a "step 2" label: all of it has to be readable, high-contrast and inside the safe zone, exactly like your captions. If you have on-screen text, it is part of what we analyze under the same on-screen text checks. Same targets, same failure modes.

RATHER SEE IT THAN READ IT?

Here is a real CutScore coaching report for an everyday clip: caption size, contrast and safe-zone checks scored, with timestamps and the exact fixes.

See a sample report
DON'T JUST AUTO-GENERATE AND WALK AWAY

Auto captions are a draft, not the final cut.

What auto captions get wrong

Auto-generated captions are a brilliant head start and a terrible final answer. They mangle names, brand terms and jargon. They drop or invent punctuation, which changes meaning. They mishear numbers, which is fatal in a tutorial or a price. And they often time the text a beat behind the speech, so the caption arrives after the word it belongs to. None of that is the tool's fault. It just means the draft needs a human read before it ships.

The five-minute cleanup that fixes most of it

Read your captions through once, start to finish, the way a viewer would. Fix the spellings and the wrong numbers. Break long lines so each caption is a few words, not a wall. Nudge the timing so text appears on the word, not after it. That is usually five minutes, and it is the gap between captions that help and captions that quietly say "I didn't check." If you are sweating the rest of your pre-upload checklist, this one belongs on it.
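Two of those cleanup steps are mechanical enough to script: breaking long lines into short cards, and nudging the timing. This is a rough sketch under my own assumptions. The four-word limit and the quarter-second offset are placeholders, not recommendations, and cues are represented as plain `(start, end, text)` tuples purely for illustration. Fixing spellings and numbers still needs a human read.

```python
def split_caption(text: str, max_words: int = 4) -> list[str]:
    """Break one long caption into cards of at most max_words words each."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def nudge_timing(cues: list[tuple[float, float, str]],
                 offset_s: float) -> list[tuple[float, float, str]]:
    """Shift every cue by offset_s seconds; a negative value shows text earlier.

    Times are clamped at zero so nothing starts before the video does.
    """
    return [(max(0.0, start + offset_s), max(0.0, end + offset_s), text)
            for start, end, text in cues]


if __name__ == "__main__":
    for card in split_caption("Fix the spellings and the wrong numbers before you ship"):
        print(card)
    # Auto captions that lag the speech: pull them a quarter second earlier.
    print(nudge_timing([(1.0, 2.0, "Fix the spellings")], offset_s=-0.25))
```

Splitting on a fixed word count is blunt. It will happily cut a phrase in half, so treat the output as a second draft and move the breaks to where the speaker actually pauses.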

Make the captions match the spoken pace

Captions are a reading task layered on a listening one, so they have to keep up without overwhelming. If you talk fast, your captions flash by too quickly to read; if you cram a sentence into one card, it is a wall. Shorter lines, held a little longer, read far easier. This is the same discipline as watching filler words and pace in the delivery itself: less on screen at once, but each piece legible.
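One way to catch captions that flash by is to measure how many words each card asks the viewer to read per second. The sketch below flags cards above a threshold. The `Cue` type and the default of 3 words per second are assumptions for illustration only, so tune the limit against your own clips.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def words_per_second(cue: Cue) -> float:
    """Reading load of one caption card: word count divided by time on screen."""
    words = len(cue.text.split())
    duration = cue.end - cue.start
    if duration <= 0:
        # A card with no screen time is unreadable if it contains any words.
        return float("inf") if words else 0.0
    return words / duration


def too_fast(cues: list[Cue], max_wps: float = 3.0) -> list[Cue]:
    """Return the cards that demand more than max_wps words per second."""
    return [cue for cue in cues if words_per_second(cue) > max_wps]


if __name__ == "__main__":
    cues = [
        Cue(0.0, 1.0, "this caption has far too many words"),
        Cue(1.0, 3.0, "short line here"),
    ]
    for cue in too_fast(cues):
        print(f"{cue.start:.2f}s: shorten or hold longer -> {cue.text!r}")
```

A flagged card has two fixes, matching the advice above: split it into shorter lines, or hold it on screen longer.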

How CutScore checks your captions

CutScore is an AI video quality coach for pre-publish QC. For on-screen text it checks the things that decide whether a caption actually works: is it large enough to read on a phone, is the contrast high enough to survive a busy background, and does it sit inside the safe zone instead of under the platform interface. You get one score from 0 to 100, the timestamped evidence behind it, and a prioritised list of fixes. It judges the craft of the video, not your tags, thumbnails or rankings, so it sits next to a growth tool rather than competing with one. More on the method and the standards.
QUESTIONS

Frequently asked.

Do I need captions on my videos?
For anything posted to a feed, yes. A large share of people watch with the sound off, especially at the start of a clip when the platform autoplays it on mute. Captions also help anyone who is deaf or hard of hearing, anyone watching in a second language, and anyone in a noisy or quiet room. The exceptions are rare: pure music or ambient pieces with no speech, and some closed environments where you fully control the playback.
Are auto-generated captions good enough?
They are a starting point, not a finished job. Auto captions get names, jargon, numbers and punctuation wrong, and they often time the text badly so it lags the speech. Always read through them once, fix the errors, and tidy the timing. Five minutes of cleanup is the difference between captions that help and captions that quietly make you look careless.
What makes captions readable?
Three things: size, contrast and placement. The text needs to be big enough to read on a phone at arm's length, it needs a solid backing or outline so it survives a busy background, and it has to sit inside the safe zone so the platform interface does not cover it. Short lines beat long ones. Aim for a few words a line, not a paragraph.
Do captions help my videos perform better?
They help the video get watched, which is the part you control. Muted viewers who can read your point are far more likely to stay than muted viewers staring at lips moving in silence. CutScore checks whether your captions are present, readable and inside the safe zone, but it judges the craft of the video, not your tags, thumbnails or rankings.
EARLY ACCESS

Make sure your captions actually read.

CutScore checks whether your captions are big enough, high enough contrast, and inside the safe zone, and tells you exactly what to fix before you publish. Join the waitlist for early access.
