“Listen to My Story”: The Problem of Storytelling in Virtual Reality.

Aaron Suduiko

10 years ago

The August 17, 2015 issue of TIME featured a cover story detailing the current state of virtual reality, along with its projected future trajectories. Author Joel Stein throws the term “storytelling” around a decent amount in the article, tracking the current efforts of virtual reality (hereafter ‘VR’) pioneers to develop a methodology for conveying narrative through the medium of VR. Stein’s own prose reflects the seeming contradiction in how VR best ought to go about telling stories: at one moment he observes that, “unlike movies, virtual reality can make you feel dumb or successful by reacting to you”; a moment later, he points out that, despite VR sharing this interactive element with video games, “the storytelling rules of video games don’t work” when it comes to VR on account of danger feeling much more emotionally “real” in VR than in modern video games.[1]

The stakes here are non-trivial in terms of where digital narrative goes next. The title of Stein’s article from which I quote is “Why Virtual Reality is About to Change the World”; last spring at PAX East, a panel of user interface designers from various game development studios expressed the standard thought that user interface development is progressing with the goal of ultimately achieving the “total immersion” of virtual reality.[2] What, then, are we to make of the tension that Stein highlights between VR and video games? Is it simply that the technology is new and we do not yet know how to use it effectively – that, as Gil Baron of Visionary VR told Stein, “[it’s] like you went back in time and gave a caveman a video camera”? Or could the tension perhaps be something deeper – that there is a difference in kind that precludes VR from serving as the “next evolution” of modern video game narrative?

My own view is that not only is there certainly a tension in VR between being interactive and having different “rules” than video games, but it is also a tension far more fundamental than articles such as Stein’s purport. Namely, the project of VR seems essentially at odds with our ordinary conception of narrative, whereas video games refine and enhance that same conception. I will offer a defense of this claim by showing that our agency in the actual world, which is what VR aims to emulate, determines a course of events that we experience prior to making any meaning out of that course of events, whereas narrative consists of chains of events composed in order to effect some didactic end – where ‘didactic end’ roughly means ‘a particular meaning or message’. I will then review motivations independent of this for the user interface dynamics of video games being valued on their own terms, rather than being seen as imperfect precursors to “total VR immersion.” Lastly, I will review some of the ways in which I believe VR will be useful, barring the misconception of it serving as an evolution in video game narrative.

I. VR, qua evolved video games, requires full-immersion agency.

I take it to be the case that one of the primary goals of virtual reality – not one that has been realized yet, but which VR developers aim to ultimately realize – is to emulate, within a virtual world, one’s experience in real life of ‘full-immersion agency’. Roughly, this refers simply to the feeling we have of really being able to make choices and affect the real world in which we exist; I will articulate the finer points of the term shortly. First, for the sake of clarity, I should point out that I certainly do not take this to be the only goal of VR: as I will consider in the final part of this paper, there are many other promising ends for which it can be used. Yet it seems that this goal is necessary if we wish to assume the intuitive and popular view that VR is the next evolution of video games. I will not pursue an extended proof here as to why this is the case; suffice it to say that the primary motivation behind this view of things (I take it) is the idea that video games are handicapped by the artificial distance separating player from avatar – the medium would be in some sense more consistent with its goal of dynamically engaging a virtual world if the player were fully subsumed by that virtual world.

‘Full immersion’ in ‘full-immersion agency’ refers to a phenomenology sufficiently similar to that of the player’s real-life phenomenology that the player feels as if she really is an entity within that world. I will not attempt to map necessary and sufficient conditions for this being the case, but we can point to a number of defining characteristics of the concept. For example, the player would have to experience the world from a first-personal perspective, such that it really seems as if they were seeing the world from a perspective internal to that world; this would most likely need to involve all five of the player’s senses. On the other hand, we would presumably want it to not be the case that full-immersion is so immersive that the virtual world is impossible to qualitatively distinguish from reality – such a situation would open the analysis to a whole host of complications that are beyond the scope of the goal currently in consideration.

‘Agency’ in ‘full-immersion agency’ refers to the capacity for the player to act upon the virtual world in such a way as to influence the causal chain of events in the virtual world. This is in many ways an extension of the sort of agency that players have in video games through the proxy of their avatar – the difference being that in this case, there is no proxy. Rather, the player perceives herself as directly being able to exert causal influence on the world. This factor, when conjoined with full immersion, makes the requirements for agency in VR somewhat more robust than the concept of agency in video games: whereas the controls through which a player controls an avatar are indirect and unintuitive (there is no intuitive reason why pressing a button labeled ‘A’ would result in a character jumping, for example), control in VR must be absolutely intuitive: the actions we take in a virtual world, in order to meet the full immersion requirement, must be effected by the same means as those same actions in real life. If we say that all we need to do in real life to jump is to tense our leg muscles, crouch, and propel our legs off the ground, then that mechanism must track with the experience of how the player makes herself jump in a VR world.

A succinct way of capturing the essence of full-immersion agency is to say that virtual reality, fully realized, ought to allow the player to experience her actions upon the virtual world in a way that tracks with her capacity to act upon the real world. If she sees an unlocked virtual door, then she ought to be able to open that door in a way that experientially resembles her opening a door in real life; if she chooses to sit down and do nothing, then any NPCs in the area ought to react to her as analogous people would in a real-life situation of her sitting down and doing nothing; if there is a virtual wall, then she should be able, mutatis mutandis, to at least in principle discover what is on the other side. It is from this requirement of realistic action that problems for VR narrative arise.

II. Total freedom of choice is at odds with didactic chains of events.

A different way to describe the above thesis is that VR, like real life, contains functional representational content. When we perceive objects in real life, there is a tacit assumption that we could, at least in principle, interact with those objects: we can approach things, touch things, use them for particular ends, and so forth. Even when considering space, far away from our usual locale of earth, we know it is at least possible for us to be there (say, as an astronaut) and interact in some way with what we find there (I bracket fringe cases here, such as the capacity to interact with dark matter, because nothing crucial in my argument depends on how such cases turn out). Based on our capacity to interact with the various elements of our environment, we are able to bring about a variety of disparate events that are contingent on how we choose to interact with our environment – an ability which allows us in principle to freely choose events to an enormously – indeed, perhaps incalculably – high degree.[3]

The problem with this exceedingly high degree of freedom to, in principle, choose to experience various events, is that it is directly at odds with a traditional goal of narrative: didacticism. When we consider what it means for something to be a ‘narrative’ in the literary sense, a full account usually involves some didactic element; that is to say, we assume that the author of a narrative designed it in such a way as to convey a certain message or elicit a certain response from its appreciator. This didactic element of narrative can be conveyed through a variety of literary elements: word choice, subject matter, and, particularly relevant for the argument at hand, the chain of events constituting the narrative. Which events in the world of a story the author chooses for his particular narrative, the order in which he arranges these events, and so on, are all in part constitutive of the overall meaning of the story. When we think of examples, this is almost trivial: the Odyssey, for instance, would not be the same sort of triumphant revenge story without the chain of events culminating in Odysseus killing his wife’s suitors. A chain of events, in short, is one of the basic building blocks with which an author conveys a narrative’s meaning to the reader.

I have spoken at length in my various analyses of story-based video games about the ways in which they uniquely allow a player to influence chains of events, leading to different narrative outcomes; while this is certainly a crucial feature of video games that goes beyond the fixed chain of events featured in a film or novel, note that even choice-based games are typically fairly linear – that is to say that, while the story may “branch” in different places based on the choices of a player (e.g., Dishonored), this typically just means that a video game features a few possible chains of events based on player choice. A writer can just as easily be didactic with a branching series of events as with a single series of events, providing she knows her craft well – there will certainly be novel considerations such as what didactic content manifests from the interrelations between the various branches of the storyline, but this added nuance does not in any way make the didactic narrative process impossible (again, games such as Dishonored are testament to this). However, games with fairly linear storylines are only one type of game: many others privilege player choice over a traditional storyline, up to the point where some games offer huge, interactive worlds, with numerous choices for the player to make but without any overarching narrative (e.g., a traditional game of Minecraft, in comparison to a story-based mod thereof). At the extreme, these “sandbox games,” in which a player can do virtually anything but has no strict overarching narrative to follow, are an extremely scaled-down example of the problem faced by VR: more choice means more potential chains of events, which makes it more difficult, up to the point of impossibility, to design a didactic narrative.

We can draw a comparison here between the high degree of freedom in VR and the difficulty of modeling complex physical systems. For example, in her book The Dappled World, philosopher of science Nancy Cartwright discusses a case first made popular by Otto Neurath: the question of where a thousand dollar bill, swept up by the wind, will land. “Mechanics,” Cartwright says, “provides no model for this situation. We have only a partial model, which describes the thousand dollar bill as an unsupported object in the vicinity of the earth, and thereby introduces the force exerted on it due to gravity […] [There is also] in principle (in God’s complicated theory?) a model for mechanics for the action of the wind, albeit probably a very complicated one that we may never succeed in constructing.”[4] Cartwright’s point here is that the number of variables required to accurately construct a mechanical model of a flimsy bill carried by the wind is so large as to appear virtually incalculable – the complexity of the situation cannot be effectively described by classical mechanics. The point in the case of VR is that a similar breadth of complexity arises if we introduce the number of variable branching events necessary to model a world where a player can act as freely as they act in real life. An author of video games may be able to craft a didactic, branching narrative with three or four player-choice-contingent outcomes, but crafting a coherent and didactic set of chains of events for some large number n required by full-immersion agency seems, for the purposes of aesthetic narrative, practicably impossible.

Now, it would be unfair to say that the events of real life cannot in some sense be didactic. People construct narratives out of their real-life experiences all the time; the crucial distinction is that this type of didacticism is only possible after one has already experienced the events in question. Most of us, I take it, do not suppose that all the events we experience in life were pre-designed in order to articulate some particular meaning – rather, we retroactively make meaning out of whatever events we have experienced in life.[5] Such a dynamic as this may well be theoretically possible in VR: imagine something like a complex world with a particular physics, designed to respond to a player in patterned ways. The problem of not being able to design events didactically would remain, yet one could still ascribe meaning to the overall chain of events after the fact – imagine, by way of analogy, something like a tabletop game of Dungeons and Dragons with a very flexible, lenient, versatile dungeon master; or, if you prefer, imagine playing some sandbox game like Minecraft for several hours, and thereafter trying to construct an overall narrative of the events that took place. You might look back on this series of events and formulate some kind of meaning based on it, yet there seems to be no sense in which the series of events was designed for that meaning, prior to you engaging with the game. To reiterate, this is a process and type of engagement fundamentally distinct from the didactically architected narrative we expect from novels, films, and story-based video games.

This is the fundamental friction between VR and traditional narrative that I doubt can be even theoretically surmounted: a realistic degree of agency on the part of the player is directly at odds with a chain (or chains) of events designed for some didactic end. Any attempt to use VR to improve upon video games’ model of narrative will have to find some way to solve this problem.

III. Video games can do things that VR cannot.

Beyond the problem of didacticism, another motivation for not conceiving of VR as a step “beyond” video games is that video games, in their current forms, can achieve unique aesthetic effects that do not seem possible in VR. I have examined such effects before, and in this section I will therefore largely recapitulate my earlier work on this topic.

At a panel I attended on user interface/experience (‘UI’/‘UX’) design at PAX East last spring, the panelists remarked that “the sign of an effective, sleek UI is that no player actually comments on or notices the UI. When the future of UI was discussed, motions were made toward the promises of virtual reality to eventually develop games in which UI is ultimately seamless.”[6] I questioned whether it ought to be the case that UI/UX categorically aim towards seamless immersion of the player in the world of the game:

“With so many options available [for UI design], it seems naive to claim that the ultimate goal of UI is to be as unnoticeable as possible. In my own work, I have aimed at articulating how the different relationships between player, avatar, and game world can establish unique aesthetic effects (e.g., the embedded narratology of “Assassin’s Creed,” or the player-dependent metaphysics of “Legend of Zelda: Majora’s Mask”); the most immediate facilitator of these interactions, by virtue of being the conduit between player and avatar, is the UI. So I think it follows that UI ought to explore as many permutations of aesthetic principles as possible, rather than mere design permutations, such that we can explore the broadest boundaries of what sort of stories video games as a medium are capable of telling. Perhaps a counterpoint to immersive UI could be intentionally alienating UI that make the player feel like an utter stranger in spite of controlling the avatar within the game; such a model could be the foundation for an aesthetic of estrangement that, by virtue of being interactive, could be much more successful as a video game than as art in another medium.

“What’s more, my intuition is that it’s an artifact of the current state of UI design that we see a conceptual difference between physical space and narrative space in a video game, as the Fagerholt/Lorentzon model [of UI design] suggests. As we develop a more comprehensive theory of video game aesthetics, I think it will become increasingly clear that physical game space and what’s called “narrative” are two different ways of seeing the same aesthetics. Already, the lines between [various types of UI] are blurry at best: we may say that directional markers pointing the player towards a goal are “merely spatial”; yet if we extend the concept of game narrative to include the player as a fundamental, as I have argued that we must, then is this not also a narrative element? And this is the crucial point: for once we accept the player as a part of the game’s narrative, and the totality of the game as its world, then it seems as though all UI, while still aesthetically differentiable, is intrinsically diegetic.”

A few months after I theoretically rejected the dogma that UI ought to trend toward full player immersion, Batman: Arkham Knight was released, providing a vivid example of the precise point I was trying to make. I quote the case at length from my review of the game (this, of course, constitutes a spoiler for those who have yet to play through the game):

“At one point in the game, Alfred tells Batman that Lucius Fox has not been responding to communications for a while. The player can then choose to go to Wayne Tower, where Lucius has been stationed during the events of the game, to check on his status. Batman enters the elevator up to the top of the Tower, where Lucius presumably is, and is seen in the elevator dressed as Bruce Wayne – ostensibly because Lucius’ staff, who do not know Batman’s secret identity, are still in the building. The player directs Wayne into Lucius’ office, only to find it empty. Searching the office, there is one prompt available to the player: to use the retinal scanner on Lucius’ computer. Wayne sits down in the chair and does this, only to have the computer reject his retinal scan. At this point, Lucius enters the room, approaches the desk, and asks Wayne if anything is wrong and whether there is anything Lucius can do for him. The UI prompt for the player is to press a button to again “Use Retinal Scanner.” However, when the player presses the button, rather than merely looking into the computer’s scanner again, Wayne grabs Lucius, slams his head against the table, presses his eye up to the scanner, and then begins transferring funds out of Wayne Enterprises’ bank accounts. At this point, the screen is revealed to be security camera footage that the real Batman is watching in the elevator up to Lucius’ office: although the player presumably did not realize it at the time, he was previously playing as Hush, who had surgically engineered his face to look like Bruce Wayne’s in order to break into the Tower.

“The “what-have-I-done” horror of the player upon “using the retinal scanner” is a direct result of UI not being transparent: although the player expects his agency to be extended through the avatar in one way (that is, merely putting one’s eye up to the retinal scanner), his agency ends up effecting something vastly different than what was expected (that is, brutalizing Lucius). This also makes vivid the completeness of Hush’s transfiguration into Wayne: in the game, the source of Batman’s agency is the player, who directs how he ought to act; the player also knows that Batman and Bruce Wayne are identical. Hush was so successful that he tricked the actual source of Batman’s agency into mistaking him for Bruce Wayne, indirectly making Batman responsible for Hush’s attack on Lucius. This makes the standard guilt of Batman for the actions of evildoers grounded in a very strong theoretical way with respect to game mechanics: in this case, Batman’s dual identity, an explicit theme throughout the game, ends up hurting those around him because an enemy is able to convince the player, the agent who most wants and is able to make Batman a hero within the universe of the game, to unwittingly help Hush in his wicked machinations. This grounds the guilt of Batman for the evil that happens in Gotham in a way that only video games could ground it: not only does that evil happen in spite of him, but, in cases like this, it actually comes about because of him.”[7]

The point here is that many of the special aesthetic features of video games come about from the very fact that the player controls an entity in the game’s universe that is not identical to herself – something that cuts against the grain of full immersion. Some video game narratives actually only make sense because the player is able to act upon the game’s universe while remaining separable from it: for example, in the case of Xenoblade Chronicles, I have argued that the only way to make sense of the protagonist (Shulk) overcoming a god (Zanza) that has knowledge of the universe’s total causal structure is to attribute Shulk’s agency to the player, who is not bound within the programmed universe of the game, and can thereby perturb the evolution of its causal chains in ways that the god cannot anticipate:

“Xenoblade does something remarkable on the level of second-order narrative: it shows how video games can be used in aesthetically powerful ways to create a universe with a complete metaphysics, and then perturb those metaphysics with an external agent. A universe of Leibniz’s metaphysics [such as Xenoblade’s] leaves all being subordinate to god [Zanza], which reflects the structure of games as a program, the path of which is determined prior to the player ever finding it; yet the design of the universe as something that can be externally observed allows the player to disturb the universe’s determined structure, and tell a story whose narrative arc is only valid by virtue of the player’s interference. This feature, then, reflects the value of the player acting upon the program of a game to bring its narrative from the realm of possible paths into the reality of a single path from start to finish.”[8]

So we cannot conceive of VR, even when it is refined in the coming years, as an evolution of and improvement upon video game narratives. Such a conception could only reasonably rest on the goal of the player being fully immersed in the narrative, because the other special aesthetic ends of video game narrative fall out of the separateness of the player and avatar – something that is lost in the case of VR. And, as I showed in Parts I and II, this goal of full immersion, when combined with player agency, makes our most fundamental notion of narrative implausible in VR. When we look to the future of VR, it cannot be in the form of the “ultimate video game.”

IV. VR could be the next evolution of film.

As I mentioned at the outset of this article, the above considerations are not intended to suggest that VR is an industry with no future or meaningful place in society – such a position would be misguided and naïve. I have only been concerned with blocking the intuition that VR can advance the storytelling of video games in a way that many people find intuitively plausible. In this last section, I wish to close by pointing to one of the many other areas in which I think that VR holds tremendous promise: further developing the notion of film.

In his August article on the VR industry, Stein nods repeatedly to the apparent potential for VR to allow people to experience events with a greater degree of intimacy than in other media. He describes the work of Xavier Palomer Ripoll, who designs VR simulations that “allow therapists to use immersion therapy with clients who have anxiety disorders, letting them virtually sit on a plane or ride in an elevator, for example.”[9] Jaunt has developed an app that can give its users “a good sense of what it’s like to be backstage at a Paul McCartney concert.” Felix Lajeunesse and Paul Raphael are “documenting nomadic tribes around the world so you can sit in a Mongolian yurt while a family cooks.” The element of experience that VR has the potential to provide can make people feel as if they are “really in,” say, the events of a movie, or a nomadic tribe’s home.

Such an enhanced degree of intimacy and immersion, without the complications of agency, has tremendous potential. Not only will people be able to experience film-like narratives more vividly, but they will also be able to experience places in a nearer-to-life way that might not otherwise be available to them. Dreamporte, for example, is a non-profit organization that focuses on using VR to bring underprivileged youth educational experiences that would otherwise be inaccessible to them. VR has the potential to hugely decrease the barrier of access to world travel (virtually experience sitting in a café in Paris), to classrooms (sit in a virtual classroom and listen to lectures), and so on. Particularly as the technological quality increases and cost decreases, VR will have an opportunity to very much change the lives of everyday people.

I call this an evolution of film because, as I argued above, extended agency in VR would render narrative virtually impossible. I therefore see film, with a fixed narrative or series of events, as a better model upon which VR can improve. VR can turn the passive experiences we observe in a film into felt experiences with which we can, in some limited capacity, engage; and, as Stein rightly says, this is enough to change much about the state of the world.

Yet with VR’s potential we must also acknowledge its limitations, and that, no matter how much we may wish for it to do, there will be some things it cannot do. Tidus famously opens Final Fantasy X with the injunction, “Listen to my story.” Video games toe a fine line between the authority of authors and the authority of players; they manage (some more effectively than others) to architect didactic plotlines while also allowing the player to explore and sometimes determine the plot of her own accord. That VR could improve upon this may seem intuitive – but I believe, on review, that this is one domain best left to video games.

References

Cartwright, Nancy. The Dappled World: A Study of the Boundaries of Science. Cambridge University Press, 1999. Print.

Fagerholt, Erik and Lorentzon, Magnus. Beyond the HUD: User Interfaces for Increased Player Immersion in FPS Games. Chalmers University of Technology, 2009. Web. 15 October 2015.

Monolith Soft. Xenoblade Chronicles. 10 June 2010.

Nintendo. Legend of Zelda: Majora’s Mask. 27 April 2000.

Rocksteady Studios. Batman: Arkham Knight. 23 June 2015.

Square Enix. Final Fantasy X. 19 June 2001.

Stein, Joel. “Why Virtual Reality is About to Change the World.” TIME. 6 August 2015. TIME. Web. 15 October 2015.

Suduiko, Aaron. Various. With a Terrible Fate. Web. 2015.

Ubisoft. Assassin’s Creed. 13 November 2007.

[1] From “Why Virtual Reality is About to Change the World.”

[2] This panel featured Vicki Ebberts [UX, Undead Labs], Alexandria Neonakis [UI/UX Designer, Naughty Dog], and Kate Welch [UI/UX, Freelance]. See “From the Floor of PAX East, Part II: The Aesthetics of User Interfaces.”

[3] What I am claiming here does not depend on a metaphysical claim that we have free will; all that is required for the argument is that we have the experience of having free will, which may just as well end up being epiphenomenal or otherwise superficial with respect to metaphysics.

[4] Cartwright 27, italics mine. Cartwright uses this argument in the context of objecting to those who take a fundamentalist stance towards physical laws; the details of her overall dialectic are less important than the thought experiment itself.

[5] One might disagree with me by taking the stance that, in life, “everything happens for a reason” in the sense that events are in some way pre-designed to serve some purpose. While I would deny such teleology for unrelated reasons, note that such a view actually does not speak against my case: for if one believes that events are pre-ordained for a certain end, then it is very difficult to also commit to any sense of free will in one’s life, even as an epiphenomenon; one is therefore committing to a Weltanschauung that very much resembles a traditional narrative without branching, choice-dependent elements – and there is obviously no problem with designing narratives such as these didactically. This view of real life therefore does nothing to mitigate the difficulty of representing freedom to choose in VR – it merely denies that any such freedom exists in the real world.

[6] This quote and the following come from my March 24, 2015 article, “From the Floor of PAX East, Part II: The Aesthetics of User Interfaces.”

[7] For my full review of Arkham Knight, from which this is excerpted, see “What is it like to be a Batman? Reviewing Arkham Knight.”

[8] Excerpted from a longer analysis of the game, “Finding your Monad: Xenoblade and Leibniz.”

[9] This and subsequent quotes in this paragraph come from “Why Virtual Reality is About to Change the World.”