Text overlaid on 360 video raises some unique considerations that we should be aware of when creating 360 video.

Text On Video Is Still Text On Video

The first step in understanding the design considerations for overlay text in 360 video is to understand them for regular video content. Nothing I am going to say here will help if we can't produce readable, aesthetically enjoyable, and thematically fitting titles or overlay text.

An establishing shot of a city might carry the text "New York, New York" to place us, or text might show how much time has passed, what time it is, and other details that aid the viewer. These are labels. They need to be treated differently from a title, which is more general. I will talk about titles as text that, while on top of the scene, is not necessarily linked, visually, to the image on screen.

When making overlay titles or labels, the text still needs enough contrast with the background to be readable. The title should be large enough to read, yet not so large that it causes head strain. We can't quite fill the visual frame: the viewer will start moving the frame (their head), perhaps distorting the text and hurting the viewer's experience, so we need some padding. The text should still feel appropriate to the content: typewriter text during a spy film, a command line prompt during a hacking sequence, and so on.
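One way to put a number on "enough contrast" is to borrow the contrast ratio used in the WCAG web accessibility guidelines. This is a minimal sketch, not a rule for video: the formula is the WCAG one, but applying it to film titles is my assumption, and the background color would have to be sampled from the footage behind the text (which changes from frame to frame).

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel to linear light, per the WCAG definition."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (r, g, b) tuple with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(text_rgb, background_rgb):
    """Contrast ratio between two colors; ranges from 1 (identical) to 21."""
    l1 = relative_luminance(text_rgb)
    l2 = relative_luminance(background_rgb)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)
```

For web text, WCAG asks for a ratio of at least 4.5 for normal-sized text. Treat that as a starting point for titles, and check the busiest, brightest frame the text will sit over rather than an average.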

Let's focus on the design considerations that are unique to 360 video.

Overlay Text In 360 Video

Viewers are extremely used to a HUD, or Heads Up Display. From video games to car dashboards, the idea of having text on our screen with moving content behind it is accepted.

Screenshot from Battlefield: Bad Company 2 multiplayer gameplay footage.

In the above, the map (bottom left), ammo indicator (bottom right), feed (top left), various positional indicators, the "CHARGE SET AT BRAVO" text, and the crosshairs are all part of the Heads Up Display. They are present regardless of the player's position, viewing angle, or lighting conditions.

360 video players currently do not let us add a HUD layer to our films (get on that, someone). All 360 video creators can deliver is an equirectangular projected video file, so we have to choose somewhere in the scene to put the text. As viewers move their heads around the scene, the text moves with it. The text is "attached" to the scene and does not appear overlaid. It feels very diegetic.
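To make "choosing somewhere to put the text" concrete, here is a minimal sketch of how a viewing direction maps to a pixel in an equirectangular frame. The function name and the angle convention (yaw 0 is the center of the frame, positive yaw is to the viewer's right, positive pitch is up) are assumptions for illustration; editing tools and players may use different conventions.

```python
def equirect_pixel(yaw_deg, pitch_deg, width, height):
    """Map a viewing direction to (x, y) pixel coordinates in an equirectangular frame.

    Assumed convention: yaw 0 is the horizontal center of the frame ("forward"),
    positive yaw turns to the viewer's right, pitch 0 is the horizon, positive
    pitch looks up.
    """
    # Wrap yaw into [-180, 180) so that any angle lands inside the frame.
    yaw = (yaw_deg + 180.0) % 360.0 - 180.0
    # Clamp pitch to the poles.
    pitch = max(-90.0, min(90.0, pitch_deg))
    x = (yaw / 360.0 + 0.5) * width
    y = (0.5 - pitch / 180.0) * height
    return x, y
```

For a 3840x1920 frame, "forward on the horizon" is the center of the image, and a title placed 90 degrees to the right sits three quarters of the way across. Wherever the text lands, it stays at that direction in the scene no matter where the viewer looks.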

Let's say our text appears above a couch. Viewers are highly likely to think that this couch is the thing being mentioned in the text, as opposed to the text talking about the video as a whole or a titular character about to enter. They'll have to think about it and solve a little (easy) puzzle, but that's not what we - the storytellers - want the viewer thinking about. We want the text to be instantly recognized for what it is, what it refers to, and how to interpret it, without the viewer having to think about it.

Consider this New York Times Daily 360 video, Sandboarding in Peru. Most of the titles here are Labels, telling us our geographic location, who is speaking, or who we are looking at.

Sandboarding In Peru - New York Times - Daily 360

The only time this doesn't happen is when somebody off-camera is speaking. This label is placed in the sky, away from anything visual, and the text explicitly explains: "voice of ...". It stays visually consistent, and the explanation "unlinks" the words from their place in the frame.

Sandboarding In Peru - New York Times - Daily 360.

Titles May Not Be Seen

There is a decision every designer needs to make: Risk the viewer not seeing a title, or risk distracting and overloading the viewer by making the titles too large or repeating them too often.

This is similar to the film editing decision of how long to keep titles on the screen, except that decision deals with not boring audiences who can't stop looking at the title. In 360 video, a viewer can read a title, then move on to ignore it! This is a good thing, for we can hold our titles long enough for even the slowest readers without insulting faster readers.

The key is placement. Moving titles higher, lower, or to the side of the action - as opposed to on top of it - lets them be noticed and read by those looking at the POI they refer to, who can then move their gaze back to what matters. Putting the titles on top of the action forces viewers to keep reading them. Humans, whether we like it or not, read text - and too much text in a frame is annoying, as we will keep reading it. This may sound weird, but think back to times you've been annoyed by text on a screen, or have curiously read and re-read it. The same applies to 360 video.

Further, when labeling characters, don't be afraid to fade away those labels after they are likely to have been read. Just because viewers can look elsewhere doesn't mean they will, and text does have an anchoring effect to viewer gaze. It's a good idea to fade away clutter when it's not needed anymore.

Create Or Don't Create Information Centers

One special note is to be careful when fading text out and in at the same location. After seeing text fade into new text in the same spot, viewers recognize it as a hot-spot where new information will appear. Thus, they will be more reluctant to discover more in your frame. It might make sense to keep your information, like captions, in the same spot for viewers to reference, but be careful not to set the expectation that something new and unknown will appear there. Viewers should realize "this is where captions are" and not "this is where important things I don't know about are". If the wrong expectation is set, viewers will be reluctant to look away from the text for too long, worried they will miss something - and will be likely to actually miss something filmed.

In all likelihood, we are trying to enable our viewers to feel like they have discovered interesting moments in the film. Too many titles will only prevent these discovery moments from happening.

Text commands a lot of visual weight and attention, and filmmakers must respect the effect that this text has on viewers.

Text Is Attached To The Scene

Let's consider use cases where we want this artifact of 360 video.

Labels are great for place names and locations. Take advantage of this fact. New York Times Daily 360 and Discovery VR Atlas routinely use labels to positive effect.

We can also label anything we do want to be labeled in our scene. Let's say our film is "The Adventures of JoJo", a curious monkey. We could start with a cold open of the monkey stealing something, getting in trouble, and running off - freeze the frame at a particularly hectic moment, and show our title card right over the face of JoJo herself. Now we know what the video is about, we have set expectations for what the film will contain, we know who the main character is, and we know a bit about this character's behavior.

For a less extreme example, if we have text that mentions a character (or setting), then it's probably perfectly fine to layer the title above or on top of the titular character. Depending on the size of the title, we may be able to put it above, below, or to the side of the character. We can give the viewer a chance to look away from the title and onto the character when they are done reading. It can function as both title and label.

From Stunning 360° Paramotor Flight Above Iguazu Falls w/ Rafael Goberna.

Perhaps not this literally. I think this overlay was chosen not just to introduce the rider without using an additional shot, but also to hide ugly stitching, perspective distortion, or just distracting low-quality source video of the man's face.

Position In Frame As Metaphor

We can use a title's position in a frame to establish dramaturgical angles. The opening title will convey "forward" strongly, and establish our default angles of view. We can introduce evil characters/environments to the viewer's right, and their "good" counterparts on the left. They move clockwise and counterclockwise, thematically and literally, and clash in the middle. We can take advantage of the angle of the viewer's head to help them understand how to interpret a character and understand the story.

When the good secret agent infiltrates the enemy compound, and we position them on the right of the frame, it will be uncomfortable, unfamiliar, and tense. Tension is what we would want in that scenario! I'll write more about using the 360 frame in this way, but suffice it to say we should pay attention to where, around the frame, we place our titles and labels, and how this placement affects the viewer's understanding of the film.

Alright, so now we have an idea of how overlay titles, and their inherent contextual link to the content they appear within and around, may affect a viewer. Let's explore ways to lessen this perhaps unintended effect.

Methods for Less Diegetic Titles

So how do we conceptually unlink text from the particular area of the frame it appears in? Firstly, this "problem" is never really going to go away without a HUD layer. The text will appear like it is inside of the content. We can take subtle steps to let the viewer not notice this, but as 360 filmmakers, we have to embrace this as one of the quirks of the medium and take advantage where we can.

The important thing is to just be aware of the options available to us.

Don't Overlay The Text On Video

The simplest option is to just put the title over black, or some solid color. However, "Don't put titles over video in 360" is the type of beginner's rule that is begging to be broken and explored creatively.

If this is too boring, we could also, technically, create scenes designed for titles. Films from the 1940s, for example, used to paint titles on fancy book covers and then film that. This will allow us to establish mood and atmosphere before the film begins, but may just appear campy or reminiscent.

Elephant Boy title card, 1937.

Repeat The Text

An easy method is to simply repeat the text around the frame. The text will be recognizable as a more general title and not linked to any specific element in the frame. Equally spaced text will not imply a visual connection with the overlaid film, as the placement of the text was determined by a simple process of repeating it with an even spacing, and not because of the content underneath it. The viewer can recognize the rules that called for its placement.

Lastly, this method is already in use, and I would argue that many viewers probably recognize it.

The New York Times Daily 360 usually repeats such text twice, finding a compromise between findability and annoying/distracting presence. Frame from Rocking Out Backstage With An Opera Star.
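The even spacing described above is simple arithmetic on the equirectangular frame. A minimal sketch, assuming the same convention as a typical equirectangular layout (yaw 0 at the horizontal center of the frame); the function name and parameters are hypothetical:

```python
def repeated_title_positions(copies, frame_width, first_yaw_deg=0.0):
    """Return (yaw_degrees, x_pixel) for each evenly spaced copy of a title.

    Assumes yaw 0 maps to the horizontal center of the equirectangular frame
    and yaw is reported in the range [-180, 180).
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    step = 360.0 / copies
    positions = []
    for i in range(copies):
        # Wrap each yaw into [-180, 180) so it always falls inside the frame.
        yaw = (first_yaw_deg + i * step + 180.0) % 360.0 - 180.0
        x = (yaw / 360.0 + 0.5) * frame_width
        positions.append((yaw, x))
    return positions
```

With two copies in a 3840 pixel wide frame, the first is centered at x = 1920 and the second at x = 0, directly behind the viewer. That second copy straddles the left and right edges of the frame, so it has to be drawn wrapped across the seam.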

Moving Camera

From Growing A Virtual Wonder by Great Green Wall.

If the camera is panning or moving, the text is not, which is an easy way to visually unlink them. We probably want to put our title text "forward" and have the viewer look in the direction of motion, which is less nauseating than other directions. We don't need to move the camera quickly at all, and such panning shots can be very effective as establishing shots, like the classic rom-com opener of a helicopter shot over a city.

The opening titles to You've Got Mail, uh, technically qualify as text over moving video.

Consistent Plate or Border

What easier way to separate text from the background than to put a plate behind it? It could be a fancy flourish, a simple border, or just a rectangle. All we have to do is encapsulate the text in some visual way, and let the text exist inside of its visual capsule consistently. We are telling the viewer that anything inside of this plate/border is important, but not necessarily related to the content behind it.

Snapshot of 360 playback, from Guy Turland's 360 VR Lobster Roll | Tastemade Hors d'oeuVRes.

Subtle Movement/Animation

Similar to moving the camera, we can move the text! We can have it quickly pan in, pause, and pan out, crawl across or around the scene, or use any other transition effect we can find inside our editing software. (Note: for aesthetic reasons, I don't recommend any of these approaches.)

What I do recommend is being subtle about the text animation. Perhaps the text fades in and the underline slides across, and maybe the underline lingers for just a moment after the text fades out. Perhaps the text is just very slowly growing or shrinking, or its apparent depth in the frame is shifting (for this to work, vary how large the text is before doing the equirectangular projection conversion).
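The link between text size and apparent depth can be sketched with a little trigonometry: the further away a title is meant to feel, the smaller the angle it covers. The function below is a rough illustration under stated assumptions, not a recipe from any particular tool. It assumes a monoscopic equirectangular frame that covers 180 degrees vertically, text near the horizon, and hypothetical parameter names.

```python
import math


def text_height_px(text_height_m, distance_m, frame_height_px):
    """Approximate pixel height for text of a given size at a given apparent distance.

    Assumes the frame spans 180 degrees vertically and the text sits near the
    horizon, where equirectangular distortion is smallest.
    """
    # Angle the text covers, as seen from the camera position.
    angle_deg = math.degrees(2.0 * math.atan(text_height_m / (2.0 * distance_m)))
    pixels_per_degree = frame_height_px / 180.0
    return angle_deg * pixels_per_degree
```

Animating `distance_m` slowly over the life of the title gives the gentle grow or shrink described above. In monoscopic video, size is only one depth cue, so the effect is a suggestion of depth rather than true depth.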

From DiscoveryVR Atlas: Italy. DiscoveryVR Atlas isn't using animation to dissociate the title from the footage beneath it, but the animation is effective in getting attention and adding visual interest.

Drop Shadow

A drop shadow is a classic method, visually disconnecting the text and video by raising the text above the video content. If applied with video editing software, it makes viewers aware of the (claustrophobic) sphere they are inside of.

One would want to apply the drop shadow before the text is warped to be equirectangular. It's an odd looking effect that, in my opinion, feels trite. I don't really recommend it.

Fade/Blur the video

This is a halfway compromise between showing the text on a black card and showing it over the video. We allow some information (and audience expectation) about the video to come, while still making the text the focus.

Make the Text Appear Closer

We likely have the ability to make the text appear like it is closer to the frame than most of the video content. We also can change the title so it appears much further away.

See this post for more information on how to do this.


A similar method is to do this "blur" through content composition: a simple and uninteresting frame, with the text, that changes into a shot with more visual interest. Or the other way around.

Film the inside of a train cabin, with text by the door. The text fades out, the door opens, and our character enters.

Film the train in a tunnel with titles that fade away as the train leaves the tunnel, welcoming viewers to the new land and the start of a new story. Or film a character walking towards the camera, too small to really be seen, and show the text until the character gets close enough to take over visual interest.

We can do this composition shift the other way. Say, as closing credits, our character walks away into the sunset. After they have walked far enough away that the emotional punch has landed, or just that they are too small to see, we can fade in the ending title without having to worry that our audience thinks the title is a label, thanks not to the visual design of the title card, but to context.

A simple and effective method may be just to film the titles in an empty and uninteresting room, then have our characters enter afterward.

Fading Between Titles and Video

Let's consider another aspect of overlay text: where it is in the frame. Through text, we are very actively controlling where the viewer is looking, at least as long as we don't show the text for so long that they get bored. How we fade between the title and the content will have different effects on the viewer. The type of fade helps set the viewer's expectation about where they can look around the scene.

Dip To Black

A dip to black, then back up to the video says "Hey, it's okay to look around the frame". We have separated the video content from the title with a simple black screen, and that's all that's necessary to separate the contextual significance.

Short Crossfade

A short crossfade indicates that the content will be in the general direction the text was in. "Hey, relax, content is over here".

Longer Fade/Text linger

By spending more time juxtaposing the title with the video content, we are making a visual association between the title and this part of the frame. What I am keeping on screen here means something. We could mean "this is our main character", "you should look over here", or some other artistic juxtaposition. In an establishing shot that sets the tone of the video, for example, we can focus the viewer on one element of the shot, and that element could set our tone.

Essentially we are using some of the artistic offerings that overlay text can provide via juxtaposition for a limited period of time.
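The difference between these fades comes down to how the title and video opacities overlap in time. A minimal sketch with linear ramps, using hypothetical function names; real editing software offers these as built-in transitions with their own easing curves:

```python
def _progress(t, duration):
    """Fraction of the transition completed at time t, clamped to [0, 1]."""
    return min(max(t / duration, 0.0), 1.0)


def crossfade(t, duration):
    """Return (title_opacity, video_opacity): both are visible mid-transition."""
    p = _progress(t, duration)
    return 1.0 - p, p


def dip_to_black(t, duration):
    """Return (title_opacity, video_opacity): the title is gone before the video appears."""
    p = _progress(t, duration)
    if p < 0.5:
        return 1.0 - 2.0 * p, 0.0
    return 0.0, 2.0 * p - 1.0
```

In a crossfade, the title and the video share the screen for the whole transition, so the viewer sees the text sitting on a specific part of the scene. In a dip to black they never overlap, which is what breaks the association. A longer fade or lingering text is the same crossfade with a larger `duration`.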


The important takeaway is that no design decision should be taken trivially, as everything the viewer experiences has an effect on their, well, experience. It isn't difficult to take a simple title card and execute it well, but it does require thought and consideration. Title text and overlay text should be designed with intent and awareness. Hopefully, this post helped bring to light some of the issues worth considering when designing overlay and title text.