I felt it prudent to back up and talk about just what we are trying to do when crafting 360 stories. What are these odd films going to look like?
For 360 video, we may not know where the viewer is looking, but we do know when content appears. Filmmakers still have control over the tempo of the story.1 In fact, this absolute frame-by-frame tempo control gives rise to many of the challenges of a medium which forfeits gaze control to the viewer - how can you ensure anything happens "correctly"?
What are the options?
One approach is to take control yourself: build interactive systems. Consider Google's Spotlight Stories. Content like this directs where a viewer is looking - forcing or ignoring the gaze by having content appear only there, or waiting until the user is looking in the right direction to trigger the next scene, and so on. This was my initial impulse when first considering this medium. There is probably going to be some great content made across the whole interactive spectrum from video games to stories, particularly in the VR space. This is an awesome and exciting approach. Yet it's challenging to implement, unsupported by common platforms like YouTube, and it has ambiguously defined expectations of interaction that may confuse viewers who go too far "off track". It's also out of the scope of my research.
Another approach is to give up control of anything happening "correctly" by creating bulletproof content, content that works more-or-less wherever a reasonable viewer decides to look. Lots of content right now utilizes slow, un-jarring fades, busy scenes with lots to discover, anything important conveyed via audio, and lots of visual atmosphere. This isn't a bad approach, but it's creatively limiting. I'm considering this "look anywhere anytime, explore the frame always" content a genre of 360 films, not a show of its full capabilities.
What should we do?
I believe the best approach is a happy compromise between these two approaches. While working within the current platforms and technology, we lose feedback from the audience. In robotics terms, we use an "open loop" system, using educated guesses - and occasional "reset points" - to make a judgment on how to proceed.
Sometimes we will need to control (or "guide") the viewer. Even without interactivity ("closed loop"), there are many visual and auditory "tricks" we can use to put engaging content in front of the viewer's gaze, and keep it there.
Some methods for controlling/guiding the viewer have been discussed already when talking about points of interest, and in other posts.
Other times we will need to give up control and let the viewer explore. Just how that exploration happens will need to be considered - are they scanning the horizon? Searching for something specific? Confused and wondering what they are even supposed to be looking at in the first place?
It might be a misnomer to say 'give up control', as we are choosing to give them freedom to look anywhere, and we must understand when/why this is happening. We must give up control with intent.
What about a balance that strikes somewhere in the middle? Jessica Brillhart's work is a great example of finding a place on this scale. Guiding the viewer, but never letting them get totally lost - or, if they do - always giving them something to see. Her films never take total control (after the title sequence), but never really give it up either.
If we are to draw a spectrum between controlling the audience's gaze/view and giving the audience total freedom to search around, I feel that Jessica's videos are far too free. In order to never let the viewer really get lost, each cut is quite long, and the films have a slow tempo (in my opinion, they are kind of visually boring). These films play it safe, so to speak. They stay towards the 'bulletproof' side of things.
If we speed that tempo up or attempt to control the viewer's gaze more, we may reach a point where we have "lost" the audience. They didn't follow a cut, and are now staring at who-knows-what, confused and looking for the POI. They might not have followed from quick cut to cut to cut. When this happens, it's time to reset.
Resetting the Gaze
"Playing it safe" for an entire film (slow, long takes, fades between cuts, visually saturated shots, etc.) will be creatively limiting, and will give rise (or already has) to a certain genre of slow, voice-over-driven, radio-story-like film. Let's try to break away from this.
Let's consider a reset point. This is a part of the film where the viewer's gaze goes from an unknown point to a known point (well, as well known as we are comfortable with). The intro titles are a perfect example of this. If you only put text in one place on the screen, and everything else is black2, then that's where the viewer is looking.
In a 3-minute film, let's say we have a reset shot every 45 seconds or so. These could be something fancy with lots of movement that directs the eye: a shot where the camera moves3, an empty shot with only one (visually stimulating) POI, or others. If we do this, that gives us a handful of [up to] 45-second "bursts" of controlled filming. We follow our POI from shot to shot, cut on movement, and even allow for a secondary "hero's journey" for the curious. And we know that if we take a risky cut and lose the audience, they will only be lost for a short period of time, since we will be able to reset them.
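To make that arithmetic concrete, here is a minimal sketch in Python of laying out reset points for the 3-minute, every-45-seconds example. The function name `reset_schedule` is just for illustration, and it assumes two things the example above doesn't strictly promise: that resets are evenly spaced, and that the opening titles act as the reset at 0:00. A real film would place resets wherever the story needs them.

```python
def reset_schedule(film_seconds, interval_seconds):
    """Lay out evenly spaced reset points (an illustrative assumption).

    Returns the times (in seconds) of each mid-film reset shot, plus the
    (start, end) "bursts" of controlled filming between them. The opening
    titles are assumed to serve as the reset at time 0.
    """
    # Mid-film resets: every `interval_seconds`, stopping before the film ends.
    reset_times = list(range(interval_seconds, film_seconds, interval_seconds))
    # Bursts run from one reset (or the start) to the next (or the end).
    boundaries = [0] + reset_times + [film_seconds]
    bursts = list(zip(boundaries, boundaries[1:]))
    return reset_times, bursts


resets, bursts = reset_schedule(film_seconds=180, interval_seconds=45)
print(resets)  # [45, 90, 135]
print(bursts)  # [(0, 45), (45, 90), (90, 135), (135, 180)]
```

Under these assumptions the viewer can be "lost" for at most 45 seconds before the next reset, which is the whole point of scheduling them.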
Not every shot needs to be a 'reset' or bulletproof shot. We don't have to enter every scene and give the user time to get oriented each time. That would make for very boring and slow films.
We also can, knowing a reset point is coming up, totally concede to the viewer. Give 'em a full horizon landscape with nothing in it. Or give them a totally saturated and hectic shot with things happening all over the place. Let the audience explore and discover details. We will give up on guessing where they will be looking - but after that, we can implement a reset, recover, and keep our film moving.
So what do 360 films look like?
360 films are going to alternate between giving the viewer full control and directing them (force-feeding them information), with everything in between. We need the directed sections to tell concise stories where details happen on scene, and we need undirected sections to allow the viewer to enjoy the medium - to explore and discover things.
The challenge is going to be learning how to transition between these states. Resetting is just one dialectic description. Truthfully, filmmakers will have to be very aware of how well they can guess the viewer's gaze at any point, and ensure the content in the story is appropriately matched.
How Do We Guess The Gaze?
There's more to be said about this, of course, but let's consider some cases.
Blocking, for example, can be utilized. Consider a radially symmetrical scene of actors on roller blades circling the camera, holding hands. Such scenes will probably cue a reasonable audience member that it's okay to look around. Further, moving actors around the scene invites the audience to either track someone specific, or view all that pass their view.
Distance equates to size in the frame, and size translates to importance. If everything is the same size, then everything is equally important. With this shot, we can be sure that we have "lost" the gaze.
Now, let's take that ring of people, and unwind and spin them around the camera, like during the "whiplash" part of a roller derby night. Now all of the actors sit at the same 'angle' from the camera, and each one moved to it from their own position. No matter where the viewer was looking or which actor they followed, they are now directed to just one spot.
Somebody yells "look out" and perhaps the viewer looks ahead to another character, one stumbling - trying to escape the whiplash! (They don't). Whichever way the audience looks - and we aren't sure - the audience will catch the collision between the lone stumbling character and the oncoming whiplash line.
If you have never been to a derby night, then I apologize for how obscure this example is. Let's consider another.
We could go from undirected - a room with a variety of interesting elements: a chair, a sofa, a whoopee cushion, a TV, etc. - to a reset point, where a character enters through a squeaky door, walks across the room, and sits in a chair with the whoopee cushion.
We take advantage of knowing that the viewer probably saw and tracked the actor when they entered and followed them to the seat, perhaps anticipating a fart noise, perhaps surprised by it. The actor reaches under themselves and grabs the whoopee cushion, frowning. If the audience hadn't seen it until then, they see it now. Guessing where the audience is looking, we can cut to a flashback of the actor in elementary school.
The gaze lands first on some mischievous kids putting a whoopee cushion down in a chair (visually in the same spot as the previous chair). The mischievous actors look up and quickly dash away, hinting at something coming. The viewer searches the frame to find a childhood version of the boy and watches him sit on the whoopee cushion. If the viewer didn't search but stayed gazing at the whoopee cushion, then that's fine! The two points of interest will coalesce in an inevitable hilarity.
Now we see and hear children laughing all around us. We hold on that to let this child's traumatizing embarrassment sink in. We fade to black and fade back to the room. A few moments to recollect ourselves in the space, and then another character enters through the squeaky door. The viewer could be looking at either actor or switching back and forth. We hear a greeting (the viewer may not have seen one or perhaps both of the actors yet). We don't know where the viewer's gaze is until this new actor sits next to the first, and now we have reset ourselves to at least a general direction of 'towards the actors'.
And so on.
The "language" of 360 filmmaking is going to be about how to control the viewer's gaze. In other words, how to cue the viewer that it's okay to look around, or let them know that hey - this is important. A lot of this language already exists in an infant state, as viewers have been watching TV and more all their lives. 360 filmmaking is about understanding the gaze of an audience member, and controlling it - or not - the best we can. But whatever we do, we should do it with intent. We shouldn't be afraid of giving up control, or of keeping it all the time. Great films will use all the tools available to them.
For Transition, Motion Matters
Many 360 films to this day have shied away from movement in all its forms. Actors rarely move, cameras almost never move, and so on. Movement can quickly confuse or disorient an audience, or even make them sick. Yet movement matters! Motion is one of the strongest tools for directing attention. It provides us with the means to move gaze around a room, direct or point an audience towards information, and capture (or let go of) the viewer's gaze.
What content/messages will work best when the audience can look anywhere? When should filmmakers direct the viewer's gaze? And how, really, can we reset the scene? The more technical side of this craft requires more investigation.
Audio can be used as a safety net: content we know the audience will experience. But when we can guess where the audience is looking, how can we use audio (say, from off camera) for more creative storytelling purposes? Getting hardware, platforms, and tools to more easily deal with spatial audio is an important next step.
Further inquiries will explore possibilities in a medium where the viewer also has control over the tempo. Comics, basically; sequential image storytelling. ↩
Well, generally one will use animations or other visual effects to draw the eye in, so the viewer isn't accidentally greeted with a blank screen. The Google Jump logo is a great example. It's 3D and literally goes from "all around you" to "just one point" in space. It's a canonical example of a gaze reset. ↩
It's my intuition that viewers will tend to look "forward" when presented with a moving frame. More testing is needed. Consider the titles in this film - there's no way you missed the text. Even if you began in the "wrong" direction, the movement of the camera directed your gaze forwards, to the title text. ↩