Simple Title Cards
A simple title card uses a regular title from your video editing software: text placed - unwarped - on the flat equirectangular frame. Further, a simple title is not overlaid on video.
A simple title:
- Is not "pre-warped"; it is unwarped, on the flat image of an equirectangular projection.
- Is not a video overlay, diegetic, or animated.
This has the advantage of being very easy, with a lot of resources and pre-existing knowledge for creating the titles.
Title cards are incredibly useful as a tool for reset shots - that is, even if all we have is a black screen with a title, we can be reasonably sure that the viewer is looking at the title when we begin our film (once they find it!).
This 'reset shot' is one good reason not to do a title as an overlay. Another reason is that current video players don't contain a "Heads Up Display" layer (like score or ammo in a video game). That is, content is 'glued' to the video, and feels - to some degree - diegetic. In fact, drawing attention to how the text is warped around the sphere can make some videos look even worse, as linear warping becomes more apparent. Further, when doing titles as overlays, one must be very intentional about the content behind them in order to predict/manipulate the viewer's gaze.
I will write about challenges and guidelines for more complicated titles later.
There are 2 major considerations when making a simple title:
- The warping of the text due to the equirectangular projection of the text onto a sphere.
- The user's ability to find the title.
Let's take each one in turn.
First, how text is warped due to equirectangular projection. Let's look at how objects at different places around a frame become warped when viewed on a sphere. We can do this mathematically, but empirically is easier to understand. Before proceeding, it will be useful to understand what field of view is.
Let's look at the following grid, before it's wrapped around a sphere.1
Alright, now let's look at it as a viewer might see it from within a sphere, looking straight ahead.
Alright, not so bad! Looking good! Let's try tilting our head upwards.
Ouch! Okay, so what does this tell us? The less the grid looks like a grid, the more distorted (and unreadable) the image is. This tells us that the closer the text is to the top or bottom of the frame, the more distorted it will be. It will be entirely unreadable at the poles. Anything in the top or bottom row of pixels collapses to a single point, the pole. Anything close to that is, well, very close to the poles, and highly distorted.
If you need help wrapping your head around this projection, think of a standard flat map of the earth. The familiar Mercator projection is a close cousin, and the equirectangular projection (also called plate carrée) is itself used for world maps. On such maps, anything near the poles (which, frankly, sailors didn't really need to worry too much about due to all of that frozen water in the way) is extremely distorted, but elsewhere the map more or less does an okay job.2
The closer content is to the vertical middle of the frame, the less distorted it will be. So, as a Rule Of Thumb, you should place your text within 30 degrees of horizontal, up or down. This is the middle third of the equirectangular frame. Don't go above or below that without reason.
This isn't an arbitrary number. 30 degrees off-center is a 60 degree vertical field of view. This is a reasonable minimum field of view for many platforms, and is what YouTube uses for monoscopic 360 video playback by default.3
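To put a rough number on that distortion: in an equirectangular frame every row of pixels is stretched around a full 360 degree circle, but those circles get smaller toward the poles, so flat text ends up squeezed horizontally by about the cosine of its angle above or below the horizon. Here is a minimal Python sketch of that approximation. The function name is my own, and it only describes the horizontal squeeze at a single row, not the way straight lines start to curve.

```python
import math


def horizontal_squeeze(degrees_from_horizon: float) -> float:
    """Approximate fraction of its flat width that content keeps on the sphere.

    1.0 means no squeeze (on the horizon); 0.0 means the row has collapsed
    to a single point (at a pole).
    """
    return math.cos(math.radians(degrees_from_horizon))


if __name__ == "__main__":
    for angle in (0, 15, 30, 60, 85):
        print(f"{angle:>2} degrees off the horizon: "
              f"keeps {horizontal_squeeze(angle):.2f} of its width")
```

At 30 degrees the text keeps about 87% of its width, which is part of why the middle third is a comfortable place to work. At 60 degrees it keeps only half.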
Generally, we don't want the user to turn their head in order to read the title of our film. We don't want this because we want to know where around the sphere the viewer will be looking when we start our video. That is, it's good to keep tabs on where the user will be looking.
Much more importantly, it is really annoying to read too-long titles. We read long lines with our eyes moving, not with our head. Reading text while our head is moving is unnatural and difficult.
Therefore, we don't want our title to be so wide that it extends past the edges of the viewer's frame. With YouTube monoscopic web playback having a 105 degree horizontal field of view, and many video game engines defaulting to 90, we will use a 90 degree field of view as our safe minimum. This is 1/4 of the width of the equirectangular frame.
That, however, includes potentially going right up to the edges, which never makes for ideal text readability. We want a bit of padding to stay a safe distance from the edges. (Remember, this may be our viewer's entire field of view. When is text ever that big?) Further, text will be more distorted at those points. As the viewer turns their head, any text at the center of their frame will be less distorted; this change in distortion is a rather distracting effect (in my opinion). All of this is to say, we probably want to keep a wide margin (literally!) away from the edges of a 90 degree (25%) section of our frame, for viewing a title without moving our heads.
Thus, because of distortion and the field of view, keep your content within the middle 1/3 of the frame's height, as close to the vertical center as possible, and narrower than 1/4 of the total width. This is a rule of thumb for maximum title size. When it comes to handling distortion, smaller is better. Take note that the title images below, shown in their 'flat' form, follow these guidelines. They do appear rather small. Follow the links to the source videos, and see how they look when viewed naturally. It all works out!
Consider the title from this video; it's far larger than I recommend.
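If you would rather work in pixels than fractions, the rule of thumb translates directly. The following is a small Python sketch, not a tool I ship: the function name and the example 3840x1920 frame size are assumptions for illustration, and the defaults are just the 90 by 60 degree window discussed above.

```python
def safe_title_box(frame_width: int, frame_height: int,
                   h_fov_deg: float = 90.0, v_fov_deg: float = 60.0):
    """Return (left, top, right, bottom) pixel bounds of the centered safe area.

    The full equirectangular frame covers 360 degrees horizontally and
    180 degrees vertically, so each field of view is a simple fraction of it.
    """
    box_width = frame_width * h_fov_deg / 360.0
    box_height = frame_height * v_fov_deg / 180.0
    left = (frame_width - box_width) / 2.0
    top = (frame_height - box_height) / 2.0
    return (round(left), round(top),
            round(left + box_width), round(top + box_height))


if __name__ == "__main__":
    # Hypothetical 4K equirectangular frame.
    print(safe_title_box(3840, 1920))  # (1440, 640, 2400, 1280)
```

Treat the returned box as the outer limit. The title itself should sit comfortably inside it, with margin to spare.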
And sure enough, here's how it looks in YouTube. Unless you intentionally want the user to have to look around in order to read a title4, this is not ideal.
Can we even find the text?
These guidelines make for pretty small titles. Looking at the unprojected video files, it's rather surprising how small they are. We have to keep our total resolution in mind when producing text! Not only that, but what if the viewer gets lost? Or looks the wrong way?
YouTube, Facebook, and most other platforms/viewers all, by default, begin a 360 video with the viewer facing the center of the 'flat' unprojected frame, as it appears in our video editors. Thus, that is where we should put our title. This way the title is in front of our viewer when the clip starts.
Further, a viewer is likely able to 'find' this starting position more easily than they may find other positions (especially if they are sitting with their feet on the ground. More on this [coming soon]). It's a good place for our ending credits, or any content that isn't visually linked elsewhere/doesn't have a reason to be somewhere else.
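To check how far a title sits from that starting gaze, it helps to convert a position in the flat frame into viewing angles. Below is a minimal Python sketch of the usual equirectangular mapping. The function name is my own, and the sign convention (positive yaw to the viewer's right, positive pitch upward) is an assumption; players may define it differently.

```python
def pixel_to_yaw_pitch(x: float, y: float,
                       frame_width: int, frame_height: int):
    """Convert a flat-frame pixel position to (yaw, pitch) in degrees.

    (0, 0) degrees is the center of the frame, where most players start.
    Yaw runs from -180 to 180 across the width; pitch runs from 90 at the
    top edge to -90 at the bottom edge.
    """
    yaw = (x / frame_width - 0.5) * 360.0
    pitch = (0.5 - y / frame_height) * 180.0
    return yaw, pitch


if __name__ == "__main__":
    # Hypothetical 3840x1920 frame.
    print(pixel_to_yaw_pitch(1920, 960, 3840, 1920))  # center: (0.0, 0.0)
    print(pixel_to_yaw_pitch(2880, 480, 3840, 1920))  # (90.0, 45.0)
```

A title whose center comes out at a yaw of 0 and a pitch near 0 will be directly in front of the viewer when playback begins. A yaw of 90 means they have to turn a quarter of the way around to face it.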
If the title does not begin in the center of a frame:
Then we will not see this title in the center of the frame when the video begins:
Note: In the Avicii music video that these frames are from, this is intentional. The text is offset in order to force the user to move their head to read it. Avicii is assuming users are unfamiliar with 360 video content. They are introducing them to the medium of 360 videos, and subtly letting the user know that they will be needing to actually turn their head in order to enjoy the video.5
Sadly, we can't rely on this starting orientation. Because of the nature of head-tracking devices, we are never quite sure about our starting orientation. Hopefully, all is well. However, I may start a video before putting my phone inside of my Google Cardboard, and in the process of bringing that to my face (or lifting it off of, say, a side table and turning it to my head), I may no longer be looking "forward" when I begin watching a video. If we care about knowing the viewer's gaze when a video begins, we will want the viewer to be able to find the text.
What does that mean? Basically, the background plate of the title card should lead the viewer's gaze towards the title. There are a variety of ways to accomplish this. The easiest is probably a simple gradient.
This frame is taken from the ending titles of this film by Seeker Daily, which showcases a variety of title and overlay methods.
The gradient doesn't need to be a simple gradient. Vignetted graphics, images, or other such 'center weighted' visual content can work. Take the DiscoveryVR title (which is also animated). It's simple, engaging, effective, and pretty!
If you want to get fancier, it's very easy to add simple shapes, designs (or even animations, which will be discussed later), or - in this case - an (animated) lens flare, to catch the viewer's eye and point them towards the title.
While both the Inner Rift and the DiscoveryVR titles are animated, they would 'work' if they weren't.
Another option is a nifty mirrored text effect. This is a creative way to indicate "Hey, what you want to look at is over there" without being obtrusive or annoying.
It's also reasonable to place the title inside of a 'frame'. Viewers, be it from comic books or merely recognizing traditional video, tend to recognize a frame as something to center their gaze on.
Repeating the title
If it doesn't matter where the viewer is looking when the film begins, then a great approach is to just place the title at multiple points around the viewer. This way, they can't miss it.6 The most common method is to repeat the title 3 times. Why 3? It's sparse enough that the viewer won't need to look at more than one title at a time, but not so spread out that the viewer is able to look at nothing.
It is, of course, possible to repeat a title more than 3 times:
In my opinion, 4 times is too often. With a 90 degree field of view, there is only a very narrow set of angles where just one title is visible at a time. Considering most platforms are, or aim to be, wider than that, it may even be impossible to view only one title on the screen. I find it distracting to view 2 halves of a title on the wrong ends of the frame. Unless your title is very small, or you want the title to appear many times, I don't recommend repeating the title 4 or more times.
These titles are small enough that it works, although 3 would also be acceptable:
On the other side of things, repeating a title 2 times is also an option.
From this music video's ending credits.
This gives a lot of empty, black space that a viewer may get lost looking at. I don't recommend repeating the title 2 times without something visual also taking up the space in between the title cards. That is, employ some of the same tricks listed above for a single title.
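The trade-off between 2, 3, and 4 repetitions can be sanity-checked with a little arithmetic. This rough Python sketch assumes evenly spaced copies, a title 40 degrees wide, and a 90 degree horizontal field of view. Those numbers and the function names are my own illustrative assumptions, so substitute your own. It counts how many copies are at least partly on screen for a given gaze direction, then samples gaze directions all the way around.

```python
def visible_titles(yaw_deg: float, copies: int,
                   title_width_deg: float, h_fov_deg: float = 90.0) -> int:
    """Count title copies at least partly inside the view at this yaw."""
    spacing = 360.0 / copies
    # A copy is on screen if its center is closer to the gaze than half
    # the field of view plus half the title's own width.
    reach = (h_fov_deg + title_width_deg) / 2.0
    count = 0
    for k in range(copies):
        # Shortest angular distance between the gaze and this copy's center.
        distance = abs((yaw_deg - k * spacing + 180.0) % 360.0 - 180.0)
        if distance < reach:
            count += 1
    return count


def share_of_yaws(copies: int, title_width_deg: float,
                  h_fov_deg: float = 90.0, wanted: int = 1) -> float:
    """Fraction of sampled gaze directions showing exactly `wanted` copies."""
    yaws = [i * 0.5 for i in range(720)]  # every half degree around
    hits = sum(1 for yaw in yaws
               if visible_titles(yaw, copies, title_width_deg, h_fov_deg) == wanted)
    return hits / len(yaws)


if __name__ == "__main__":
    for copies in (2, 3, 4):
        one = share_of_yaws(copies, 40.0, wanted=1)
        none = share_of_yaws(copies, 40.0, wanted=0)
        print(f"{copies} copies: exactly one visible {one:.0%}, none visible {none:.0%}")
```

Under these assumptions, 3 copies leave no direction without a title while showing exactly one copy most of the time, 4 copies show two at once noticeably more often, and 2 copies leave a sizeable stretch with nothing on screen.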
For ending title cards, all of the same considerations apply. However, we must keep in mind where the viewer is likely looking at the end of the previous shot, as opposed to merely placing the content in the center, where the viewer is likely to begin a film. More will be written on this blog about methods for predicting gaze, but consider this post on points of interest, or this post which discusses gaze tracking from shot to shot.
On Animated Titles and Overlays
The Google Jump and the VR Playhouse titles are excellent examples of animated titles. These animations catch the viewer's eye and draw it to a point. Titles overlaid on top of a moving camera also draw the eye, as viewers tend to look 'forwards' while moving. Methods for overlaid and animated titles will be discussed later.
For now, suffice it to say that the guidelines regarding title size and placement still apply, regardless of any flourishes.
Stay less than 1/3 of the height and 1/4 of the width of your equirectangular video in the editor; and give your viewer a way to find your title when the video starts.
I also have written about pre-warping title cards to appear undistorted.
These are from a 90 degree horizontal and 60 degree vertical slice of the field. The FOV used depends on the playback engine - some HMDs have wider FOVs than others. 90x60 is a reasonably small FOV, which is another way of saying we are playing it safe, and producing for the smallest FOV that a user is likely to have. ↩
I am by no means a cartographer, so "does an okay job" is about as strong a statement as any I can stand behind here. ↩
This could be done as a way to impress the viewer: "Wow! Content all around me!" It could also be a way to force the viewer to move their head, as a way to introduce 360 video to them (see following footnote). Perhaps Hello! VR's title is simple enough that they just didn't care, which is fair. I know what this title says even without being able to see all the words. There are exceptions to these guidelines, but only break them with reason. ↩
Thanks to awareness of the medium, and better UI/UX design from platforms like YouTube alerting users to the type of content, such tactics have become unnecessary for creators. I officially don't recommend them, but must admit that this is partly out of hope. If you want to pull a Corridor Digital and really guide the viewer, it probably won't hurt. ↩
Never say never. A viewer may still be stuck staring at the floor or ceiling. This offset is much less likely to happen between picking up a Google Cardboard and putting it on one's head, or from a viewer's natural head motion. Also, any spherical video/image content tends to have very highly distorted poles that identify themselves - through graphical ugliness - as "not what you should be looking at", spurring the viewer to look elsewhere. We don't need to worry about it as much. It's fair to assume a viewer is at least looking outwards somewhat level with the horizontal plane. ↩