
Video, Computer-Generated Environments

and the Future of the Internet

By Ian Lamont

(For graduate credit)

HUMA E-105: Survey of Publishing, from Text to Hypertext

Harvard University Extension School

January 16, 2008


Almost since the first stuttering video clips appeared on the World Wide Web,

observers have predicted that video will come to dominate the Internet. Mitchell

Stephens, writing in the mid-1990s, foresaw the rise of sophisticated video production

and narrative techniques derived in part from the “merger” of computers and video.1 He

also believed the Web would play an important role for video, primarily as an on-demand

distribution platform that would allow viewers to be finally freed from television

schedules.2 Another commentator, writing more recently about the future of the Internet,

proclaimed video as “king,” thanks in large part to the popularity of amateur videos and

fan websites, and the rush of advertising dollars to online video content.3 Google,

Microsoft, Apple, Cisco, Verizon, and many other technology companies apparently

agree with these sentiments, spending billions of dollars on fiber-optic networks, massive

data centers, and robust hardware and software platforms to deliver video over the

Internet. While their technologies and business models are often in direct competition,

there seems to be widespread consensus that the Internet will evolve into some sort of

universal cable channel that showcases all kinds of video — from brief amateur video

clips to Hollywood films — to potentially everyone with broadband Internet access,

whenever and nearly wherever they choose. In such an environment, goes the reasoning,

text, audio, still images, and everything else will play secondary roles.

1. Mitchell Stephens, The Rise of the Image, the Fall of the Word (New York: Oxford University Press, 1998), 164.
2. Stephens, 171.
3. Bambi Francisco, “Net Sense: The Future of the Internet,” MarketWatch. Available from http://www.marketwatch.com/news/story/net-sense-future-internet-video/story.aspx?guid=%7B6115530A-15F3-4FDC-B7F6-55FB493D356E%7D.

I would like to offer an alternative to this video-centric vision outlined by

Stephens and others. While video is a compelling medium that may one day rival text-

based websites in popularity, it will not dominate the Internet for long. I will argue that

another type of content — one that shares video’s visual appeal, yet currently falls into

the “everything else” category — will eventually overshadow video. That content will

consist of sophisticated computer-generated environments, delivered in a variety of

formats and serving many different types of customer needs, including entertainment,

news, and community. These formats will use advanced computer graphics to deliver

photorealistic, three-dimensional representations of real and imagined spaces to a vast,

online audience, and allow audience members to interact with these environments and

each other in ways that are not possible with video.

Video — which I define as television, film, home movies, and any other moving

images derived from the movements of lit subjects and scenery in front of a camera lens

— was the dominant visual mass medium of the 20th century. It has had a profound

impact on society and world history, as evidenced by the power of moving images to

educate, propagate, agitate, inform and entertain. Stephens called video humankind’s

“third major revolution,” after writing and print.4

Indeed, many of the major events and societal trends of the last century were

shaped by this mass medium. Charlie Chaplin, Al Jolson, and Lillian Gish can be

considered among the first international superstars, beloved by tens of millions across

social classes and national borders, thanks to their leading roles in Hollywood films of

the 1910s and 1920s. Stardom was not unknown before film, but pre-

4. Stephens, 11.

mass media musicians, actors, orators and authors were restricted to live performances

and personal appearances, which limited their popularity. Film made it possible for actors

to simultaneously reach millions of people in cities and towns across America, and for

performances to be watched over and over again. The impact on the public was

tremendous.

Politicians similarly expanded their audiences and platforms using the power of

moving pictures. The rise of Adolf Hitler and the Nazi party in Europe in the 1930s was

partially due to the influence of Leni Riefenstahl’s Triumph of the Will and other

propaganda films that promoted core Nazi beliefs while casting Jews and other groups in

harshly negative terms.5 John F. Kennedy’s political rise has been linked to an uneven

televised presidential debate with Richard Nixon in 1960,6 and his death in a Dallas

motorcade — captured on an 8 mm film camera by a bystander named Abraham

Zapruder — sparked a national sense of mourning. Nearly three decades later, another

amateur video, showing a group of police officers beating a black taxi driver named

Rodney King on a Los Angeles street, eventually led (after the officers were acquitted at

trial) to several days of deadly rioting in Los Angeles and unrest in other U.S. cities.

Besides changing the course of history, video has come to govern our daily lives,

and serves as an important means of understanding our world. While film was just a

fringe entertainment in 1900, it became a regular part of public life within a few decades.

5. Elliot Aronson and Anthony Pratkanis, Age of Propaganda: The Everyday Use and Abuse of Persuasion (New York: Henry Holt, 2001), 323.
6. “1960: Kennedy-Nixon Debates.” Electronic Government Project, Eagleton Digital Archive of American Politics, Rutgers University. Available from http://www.eagleton.rutgers.edu/e-gov/e-politicalarchive-JFK-Nixon.htm.

By 1921, one source estimated that annual U.S. box office receipts totaled $850 million,

and the film industry was supporting hundreds of thousands of jobs.7 Television

also made rapid inroads, expanding from just seven thousand sets nationally by the end of

World War II to ten million receivers in 1950.8 Comedies, dramas, rebroadcast films, and

other entertainment formats were not the only popular types of television programming.

Generations of children have been raised on a regular diet of educational programs and

cartoons, and television news became one of the public's primary sources of information, rivaling the

popularity of newspapers and magazines. As recently as December 2005, a survey of

American consumers found that 59% got news the previous day from local television and

47% from national television, compared to 44% from radio, 38% from a local newspaper,

23% from the Internet, and 12% from a national newspaper.9

Clearly, video continues to have a strong hold over audiences. Its ability to show

events, tell stories, and faithfully reproduce the words and actions of living beings gives it

an advantage over text-based formats such as printed periodicals, books and blogs.

Stephens also noted video's ability to take viewers “elsewhere,” thanks to the way moving

images dominate the input to our eyes and ears:

7. “Revolutionary Talking Movies: Widespread Changes That Are Predicted If New Invention Is a Success — Elimination of Numerous Stars.” The New York Times, September 10, 1922. Available from http://query.nytimes.com/gst/abstract.html?res=9F07E5DE1F3AE433A25753C1A96F9C946395D6CF.
8. Stephens, 46.
9. John B. Horrigan, “Online News: For many home broadband users, the Internet is a primary news source.” Pew Internet and American Life Project, March 22, 2006.

We misunderstand moving images when we think of them merely
as a form of communication, a type of entertainment, a means of
information or an art form. Perhaps books, newspapers or radio can
squeeze under such headings. Moving images with sound, because they
occupy both of our major senses, cannot. They are more than that. They
are a place we go.10

Stephens outlined a bright future for video in his 1998 book, The Rise of the

Image, the Fall of the Word. According to his thesis, television and film throughout the

20th century were generally unoriginal.11 He argued that video needed to be reinvented in a

way that would enhance its strengths and eventually make it the pre-eminent medium for

telling stories and conveying information, even complex information that has

traditionally been the realm of print discourse.12 “... Once we move beyond simply

aiming cameras at stage plays, conversations, or sporting events and perfect original uses

of moving images, video can help us gain new slants on the world, new ways of seeing,”

he said. “It can capture more of the tumult and confusions of contemporary life than tend

to fit in lines of type.”13

The “new video” outlined in Rise of the Image, Fall of the Word incorporated

some of the techniques developed by avant-garde filmmakers and directors working with

music videos and television commercials, as well as new conventions and technologies

envisioned by Stephens. Juxtapositions, fast cutting, densely packed imagery, new

symmetries, an “excess of perspectives,” musical structure, new symbols and forms of

10. Stephens, 124-125.
11. Stephens, 91.
12. Stephens, 179.
13. Stephens, 18.

representation, surrealism, and computer graphics would characterize new video.14 The

tastes and preferences of audiences, he continued, would evolve to embrace new video,

while spoken languages and the printed word would “increasingly be a less precise, less

subtle language — one designed for use with images.”15

In his new video paradigm, Stephens described the importance of computing

technologies. Graphics would play central design and production roles. Computer-

generated imagery would be used for transitions, charts, and creative expression that

would allow directors to express their artistic visions and sense of fantasy, while

emphasizing the juxtapositions that he felt were so crucial to new video.16 After

production was completed, computers would serve as a more effective distribution

medium than movie theaters, terrestrial television, and cable. The “network computer,”

whether in the form of larger computers situated in people's living spaces or portable

wireless devices, would serve as the primary conduit for anywhere, anytime video:

… As more and more video is produced, with the mass marketing


of digital video editing, and as more and more video is stored in databases
and accessed on Web-like networks, it seems inevitable that the screens of
those network computers will be filled much of the time with moving
images. … A whole range of full-screen video — tracked down in
archives, discovered through hypertext (hypervideo?) forwarded by
friends, crafted by artists, assigned by professors.17

14. Stephens, 182-199.
15. Stephens, 209.
16. Stephens, 196-197.
17. Stephens, 172.

Stephens accurately predicted current technologies such as viral video, YouTube,

and the video iPod. The character of video has been slower to evolve in the direction he

predicted, but it may some day come close to matching his vision.

However, Stephens and many of the other boosters who have predicted the

eventual dominance of video on the Web have failed to adequately address the inherent

conflict between the two mediums. In the video world, the director and others behind the

camera lens tell linear stories. The audience watches the screen passively. Stephens

readily accepted this drawback, noting “we have absolutely no influence whatsoever, free

or otherwise, on anything that transpires. Movies and television shows proceed entirely

without us.”18

The Internet, in contrast, is optimized for interactivity. It is a massive, distributed

computer network that was originally envisioned in the late 1960s as a robust

communications and file-transfer tool linking geographically dispersed computers and

networks.19 In the 1970s and 1980s, Internet traffic consisted of data transfers, text

messages, and relatively simple games. The audience was mostly limited to a small

population of computer-savvy users who had a connection through work, school, or one

of the early commercial service providers. By early 1993, there were just three to four

million Internet users worldwide, and only several tens of thousands of network nodes.20

18. Stephens, 126-127.
19. Gina Smith, “Unsung innovators: Robert Kahn, the ‘stepfather’ of the Internet.” Computerworld, December 3, 2007. Available from http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9046801.

That would shortly change. In 1994, the World Wide Web — a set of protocols

and technologies that sit on top of the Internet — leapt out of research labs and college

campuses, sparking a communications and media revolution. Instead of typing in text

commands to access information or send messages, people were presented with a screen

full of information and options. A Web page might consist of text, photographs, and

music. Almost all pages were linked with at least one other page. Pages could also access

software applications, databases, and other computing resources. Users could navigate

these interconnected pages with a Web browser and a mouse. This greatly simplified

access to the Internet, and opened up Internet-based content to mainstream audiences.

Content quickly moved beyond static pages containing text, links and

photographs. On some sites, pages contained forms that allowed people to input text and

software commands. This enabled developers to create front-end interfaces to back-end

databases and other networked resources, which in turn gave users access to online

discussion forums, search engines, online shopping, and a variety of registration-based

services. In other words, the audience was not just limited to looking at content. They

could react to it, respond to it, alter it, and discuss it or, for that matter, anything else on

their minds.
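To make the form-to-database pattern described above concrete, the following minimal sketch uses only Python's standard library; the product names, prices, and the /search path are hypothetical stand-ins for whatever back-end resource a real site would query.

    # A tiny Web form acting as a front end to a back-end data store.
    # The "catalog" dictionary is a hypothetical stand-in for a real database.
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.parse import urlparse, parse_qs

    catalog = {"widget": "$5.00", "gadget": "$12.50"}  # invented inventory data

    FORM = ('<form action="/search" method="get">'
            '<input name="q" placeholder="product name">'
            '<input type="submit" value="Search"></form>')

    class SearchHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            url = urlparse(self.path)
            if url.path == "/search":
                query = parse_qs(url.query).get("q", [""])[0].lower()
                price = catalog.get(query)
                body = f"{query}: {price}" if price else f"No result for '{query}'"
            else:
                body = FORM  # any other request just returns the input form
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(body.encode())

    if __name__ == "__main__":
        HTTPServer(("localhost", 8000), SearchHandler).serve_forever()

Submitting the form sends the typed text back to the server, and the reply the user sees is assembled on the fly from stored data rather than served from a fixed page.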

The impact of the Web on the Internet, mass media, business, and society has

been enormous. As of late 2007, nearly half of all Americans had high-speed Internet

connections at home. Many were using the Internet for social interactions, publishing

blogs or personal web pages, or looking for information in order to make important life

20. Bruce Sterling, “Short History of the Internet.” The Magazine of Fantasy and Science Fiction, February 1993. Available from http://w3.aces.uiuc.edu/AIM/scale/nethistory.html.

decisions, such as making a major investment, getting career training, choosing a school,

or helping someone with a health-related decision.21 Among young people, the Internet

has emerged as a central tool for socialization and interaction with friends. According to

survey data released by the Pew Internet and American Life Project, 55% of Americans

aged 12-17 have created a profile on Facebook, MySpace, or another social networking

website. Almost all of the teens that use these sites say they do so to keep in touch with

friends (including those whom they often see in person) and to make plans with them.

The conversations extend to other Web services. Twenty-eight percent of teens say they

blog, and many of them — especially those who already have social networking profiles

— like to leave comments on others’ blogs. Even posting a video or digital photograph

“often starts a virtual conversation” through the commenting features of such services.22

This trend points to the fact that the Web is more than just an on-demand distribution

channel for video. The Internet lets audiences and organizations link, categorize,

comment on, rate, map, tag, buy, sell, market, and edit video in ways that very few

people imagined just five years ago.

Moreover, the Internet has eroded the control of the television and film industries

and traditional “gatekeepers” who work for them — writers, reporters, editors, publicists,

publishers, etc. The decline of industry power and control goes beyond the ability of

viewers to visit online forums to criticize a television news network’s supposed political

bias, read film blogs written by unpaid amateur film critics, or apply descriptive

21. John B. Horrigan, “Broadband: What’s All the Fuss About?” Pew Internet & American Life Project, October 18, 2007.
22. Amanda Lenhart, Mary Madden, Alexandra Rankin Macgill, and Aaron Smith, “Teens and Social Media.” Pew Internet & American Life Project, December 19, 2007.

del.icio.us tags to the official websites of Hollywood stars. Members of the public have

been transformed into an army of self-propelled video producers with an international

audience, thanks to easy-to-use software tools, cheap consumer electronics, and the

widespread availability of broadband Internet connections. The video they produce tends

to consist of home movies, simple dramas and humor, and amateur recordings of live

events. However, some of the content is entertaining enough to attract large numbers of

viewers. On a recent evening, the dozen featured videos on the front page of YouTube

had between 135,530 and 2,237,096 views apiece.23 Other content, while amateurish, is

compelling to small numbers of people. Examples are the home-made music videos and

live-action plot recreations based on the movie Cars that are readily available on

YouTube. These clips do not meet Hollywood production values, clearly violate

copyright law, and conflict with the marketing and public relations campaigns maintained

by Disney/Pixar, yet are a minor hit with a small group of fans who are starved for Cars-

related video content. The audience for an average amateur clip on YouTube might

consist of just a few dozen people — a manifestation of the so-called “Long Tail” niche

consumption pattern that characterizes Internet content.24 The strength of the Long Tail

becomes apparent when one considers the millions of clips that are available on YouTube

or other video-hosting sites. Most clips have a few dozen or few hundred views, but the

aggregate audience is actually quite large, numbering in the millions or tens of millions

23. 135,530 views for “Drunk History vol. 1 - Featuring Michael Cera” and 2,237,096 views for “The Original Human TETRIS Performance by Guillaume Reymond.” YouTube.com. Data gathered at 10:20 pm on January 7, 2008.
24. Chris Anderson, “Long Tail 101.” The Long Tail: A Public Diary of Themes Around a Book. Available from http://www.longtail.com/the_long_tail/faq/index.html.

of people. Video created by the masses is now competing with the professional video

produced by the entertainment industry.

In the news industry, amateur video footage provides a different sort of

competition. Members of the public not only happen to witness news; they often gain

access to people and places that broadcast news professionals cannot or will not reach.

They are able to capture vivid, on-scene accounts of major and minor events. The

Zapruder and Rodney King home movies were early examples of this movement. At the

time, recording devices were relatively expensive and there was no way to distribute the

footage to a wide audience, except through traditional media outlets such as television news. Now,

cheap webcams, video cameras and mobile phones with built-in cameras make it possible

for practically anyone to record news events. The Internet lets them distribute the footage

to a huge audience, and lets them bypass traditional gatekeepers, their professional

editing requirements, and ethical codes. The footage they shoot is raw and real. It can be

brutally honest and compelling, but also provocative and biased. The December 2004

Indian Ocean tsunami was a watershed moment in this respect. For the first time, global

awareness of a major news event was shaped in large part by footage shot by amateurs

and distributed via the Internet. The footage was disturbing, but captured the scope of the

destruction far more effectively than broadcast news outlets, which had no reporters on

scene when the waves first struck the beaches.

Despite the rise of amateur video and the new modes of distribution and

discussion, the Internet and computer technologies have not been able to change the

fundamental character of video. Whether someone watches video on a television screen,

or plays it on YouTube, video is a linear, passive experience, designed to be watched

from beginning to end without alterations or input from the audience. For Web video,

interactivity is limited to tangential content — the text links in the navigation column, the

comment field below the Flash video player, the icon-based ratings systems, and the off-

site commentary on blogs and discussion boards. The video itself has none of these

features. Objects on the video screen are not linked. An audience member cannot easily

reshoot it to make it more to his or her liking. What the viewer sees depends upon

whatever lit subject or scenery passed in front of the lens, and whatever creative choices

the people controlling the camera and editing the footage decided to apply. This has

always been the fundamental character of video. In this sense, a two-minute clip of an

Independence Day parade on YouTube is not much different from Fred Ott's Sneeze, an

1893 kinetoscope film produced by Thomas Edison showing one of his employees

sneezing.25

The failure of video, or new video, to move beyond a static, linear storytelling

device does not mean Web video is doomed. It has a healthy future, as experimentation

with formats continues and more members of the public learn to use cameras, editing

software, and Internet publishing tools. In addition, video is the best tool to accomplish

certain tasks, or tell certain stories — such as documenting nature, showing news events,

and recording living people. Video will also benefit from sophisticated applications that

use metadata, descriptive pieces of information assigned to individual pieces of content

by humans or software programs. For instance, a video clip stored on my computer may

25. Mary Hanlon, “Movie Audiences, Movie Myth: Early Cinema as Invention, Entertainment, Instruction.” Early Films, The Silent Western: Early Movie Myths of the American West. Available from http://xroads.virginia.edu/~hyper/hns/westfilm/movie.htm.

have metadata that identifies the make and model of camera used to shoot it, the date it

was created, and the dimensions of the frame in pixels. I may further "tag" it with simple

descriptive labels that help me categorize it: "home," "kids," and "Fido." If I post it on the

Web, friends and family members may add their own tags: “funny,” “cute,” “Golden

Retriever." This data may help other people's Web searches and online activities — for

instance, someone searching for Golden Retriever videos may find mine, and then

republish it in a blog post about cute Golden Retrievers. It is through the power of

metadata and tagging that a YouTube clip of a new electronic gadget can get hundreds of

thousands of views in just a few days.
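As a rough sketch of how such metadata and tags might be stored and searched, the following Python fragment uses invented field names and values; it illustrates the idea rather than any particular service's data model.

    # Hypothetical metadata record for a home video clip: technical fields
    # written by the camera plus descriptive tags added by people.
    clip = {
        "camera": "Acme DV-100",            # invented make and model
        "created": "2007-07-04",
        "frame_size": (640, 480),           # frame dimensions in pixels
        "tags": {"home", "kids", "Fido"},   # my own labels
    }
    clip["tags"] |= {"funny", "cute", "Golden Retriever"}  # labels added by viewers

    def find_clips(library, wanted):
        """Return every clip whose tags include all of the requested labels."""
        return [c for c in library if wanted <= c["tags"]]

    # A search for Golden Retriever footage now surfaces this clip.
    print(find_clips([clip], {"Golden Retriever"}))

The more labels a clip accumulates, the more searches it can match, which is how an otherwise obscure piece of footage can suddenly reach a large audience.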

Two emerging metadata applications that will help audiences more precisely find

and use video content are geotagging and autotagging. Geotags are geospatial data in a

file or software program that identify the location of some object, such as a building or

person. Some cameras with built-in Global Positioning System (GPS) receivers can

automatically geotag images, which can later be associated with addresses on a map or

searched more effectively ("find all photos taken in the 02166 ZIP code"). Autotagging

is the automatic application of descriptions to a piece of content, without human review.

Penn State researchers Jia Li and James Z. Wang have developed software that can be

trained to automatically recognize the objects in images and apply metadata to them.26

Once this technology is applied to moving images, it will be possible to more effectively

organize video content and design online applications that let people view and use video

in a profoundly different manner than we use it now. This goes beyond simply entering a

term in a search engine and finding the most closely matching videos; it will enable video

26. Jia Li and James Z. Wang, Automatic Linguistic Indexing of Pictures - Real Time. Available from http://www.alipr.com/.

to be more precisely integrated into the other software applications that will operate in

our homes, classrooms, and places of work. Imagine generating a personalized report for

a family trip to the zoo that displays recent amateur video of new exhibits and live

camera feeds of the traffic situations along two potential routes. Metadata will make this

possible.
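A crude sketch of how geotags, autotags, and timestamps could be combined to assemble such a report appears below; the coordinates, tags, and dates are invented, and a real system would use proper geographic distance rather than a simple latitude/longitude box.

    from datetime import date

    # Hypothetical geotagged, autotagged clips.
    clips = [
        {"title": "New panda exhibit", "lat": 42.34, "lon": -71.09,
         "autotags": {"zoo", "panda"}, "created": date(2008, 1, 5)},
        {"title": "Highway traffic cam", "lat": 42.30, "lon": -71.12,
         "autotags": {"road", "cars"}, "created": date(2008, 1, 7)},
    ]

    def nearby_recent(items, lat, lon, box, since, wanted):
        """Clips inside a lat/lon box, newer than `since`, carrying a wanted tag."""
        return [c for c in items
                if abs(c["lat"] - lat) <= box
                and abs(c["lon"] - lon) <= box
                and c["created"] >= since
                and wanted & c["autotags"]]

    # "Recent amateur video of new exhibits" near the zoo's coordinates.
    print(nearby_recent(clips, 42.33, -71.10, 0.05, date(2008, 1, 1), {"zoo"}))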

Nevertheless, I believe a family of graphics technologies will eventually

overshadow video and realize the true interactive potential of moving images accessed

via the Internet. The technologies employ three-dimensional computer-generated

environments. These environments are not science fiction, or obscure laboratory

experiments — they are already widely used in certain industries, as well as in the home

and over the Internet. In addition, they rival video for clarity and visual beauty, allow

creative options not possible with video, can be customized according to audience

preferences and situational factors, and can enable social interaction, cooperation, and

competition. In the coming years, new formats and tools will be made available to

audiences and content creators, further accelerating the adoption of computer-generated

environments and ensuring their dominance over other Internet media formats.

What are computer-generated environments? “Virtual reality” is an alternate term

that many people know, but I am reluctant to use it here. It carries with it several

misleading connotations, and does not necessarily include some of the formats that I

believe will play an important role in the future. The concept dates to the 1950s and

1960s, when Ivan Sutherland envisioned computer technologies being used to “render

sensations that would seem real to their recipients.”27 Programmer Jaron Lanier coined

the term “virtual reality” in the 1980s, but admits that virtual reality is a “somewhat broad

idea” with no fixed definition.28 Thanks to a wave of media hype and imaginative

Hollywood fictionalizations in the early 1990s, many members of the public associate

virtual reality with special goggles and wired gloves that allow people to manipulate

virtual objects in a digitally rendered, three-dimensional space. However, this excludes

other 3D environments that do not require gloves and headgear. Edward Castronova

additionally noted that “virtual” and “reality” are themselves loaded terms when used to

describe simulated environments that are driving “real” experiences based on artificial

sensory input. In his book, he avoided this semantic minefield, instead using the term

“synthetic worlds” to describe persistent, interactive 3D spaces simultaneously accessed

by large numbers of people.29

I prefer “computer-generated environments.” It is more inclusive than “synthetic

world” or “virtual reality.” It encompasses any graphics technology that displays

computer-generated 3D representations of real, simulated, or imagined environments, and

allows users to control motion, perspective, and elements within these environments. It

also avoids the semantic issues that Castronova described. However, computer-generated

environments do not include certain types of computer-generated imagery (CGI) and 3D

effects, such as static 3D images (e.g., a 3D model of a car engine displayed as a still

27. Edward Castronova, Synthetic Worlds: The Business and Culture of Online Games (Chicago: The University of Chicago Press, 2005), 287.
28. Janice J. Heiss, “The Future of Virtual Reality: Part Two of a Conversation with Jaron Lanier.” Articles and Tips, The Sun Developer Network. Available from http://java.sun.com/features/2003/02/lanier_qa2.html.
29. Castronova, 287-294.

image) or linear narratives made with 3D animation, such as Shrek, M&Ms television

advertisements, and some children’s television programs.

There are many examples of computer-generated environments in the workplace.

The U.S. military has been one of the most active users of such tools. Tank crews in the

early 1980s learned how to target their cannons using a simple 3D simulator based on the

popular Battlezone arcade game. Pilots have used flight simulators for decades, and the

Army offers a free, 3D video game called “America’s Army” over the Internet as an

interactive recruiting advertisement and simple training tool.30

In the business world, computer-generated environments are widely used in

architecture and industrial design, as well as in several science-related fields. Since the

1980s, the drafting program AutoCAD has been used to design buildings, vehicle parts,

and other products, with current versions supporting “3D walkthroughs” and various

methods of viewing objects from multiple perspectives.31 General Motors used AutoCAD

and several other construction-oriented 3D modeling applications to build a

2.4-million-square-foot plant in Lansing Delta Township, Michigan. The software tools helped GM

complete construction 5% to 8% under budget and 25% ahead of schedule, by letting

architects, builders, and plant managers plan the layout of the facility and all of the

equipment and infrastructure before the foundation was poured.32

30. Harold Kennedy, “Computer Games Liven Up Military Recruiting, Training.” National Defense, November 2002. Available from http://www.nationaldefensemagazine.org/issues/2002/Nov/Computer_Games.htm.
31. Shaan Hurley, Unofficial AutoCAD History Pages. Available from http://myfeedback.autodesk.com/history/area51.htm.

Outside of the workplace, video games are the oldest and most popular type of

computer-generated environment used by the public. A significant portion of the

population has grown up with them, and among younger people — the so-called “digital

natives” who have never known life without personal computers, broadband Internet

connections, and 3D games — game play is pervasive. As John Palfrey notes, not all

young people can be considered digital natives, but those who are “born digital” are

more likely to engage with such technologies in the manner of digital natives.33

Battlezone, mentioned earlier, let players control a tank in a simple 3D

environment consisting of green polyhedrons, a distant mountain range, and a never-

ending assault of enemy tanks. Other 3D games from the early 1980s let players wander

through dungeons or castles, killing monsters and gathering treasure. The graphics of

these games, while not sophisticated, introduced millions of people to computer-

generated environments and the concept of doing things — manipulating objects,

completing missions, and sometimes cooperating with others — in simulated, three-

dimensional spaces. In the mid-1990s, players were exposed to more sophisticated

graphics and networked play, either over local-area networks or the Internet. Another

important development during this period was the rise of “modding,” which let players of

3D game titles such as Doom modify characters and missions to suit personal

preferences, or make gameplay more interesting. Game studios or talented

32. Robert Mitchell, “Field Report: GM builds on 3-D model.” Computerworld, September 11, 2006. Available from http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=112739.
33. John Palfrey, “Born Digital.” John Palfrey from the Berkman Center at the Harvard Law School, October 28, 2007. Available from http://blogs.law.harvard.edu/palfrey/2007/10/28/born-digital/.

player/programmers developed the modding software, which could be downloaded from

official game websites or fan sites.

Now, an estimated 38% of U.S. adults and 81% of children ages four to 17 play video

games,34 ranging from 3D games based on sports (Madden NFL ’08) and futuristic

combat (Halo 3) to simulations of everyday life (The Sims). While these games are popular

as single-player pastimes or entertainment for small groups of people, an interesting new

game format has begun to attract large numbers of players. Massively multiplayer online

role-playing games (MMORPG) allow thousands of geographically dispersed people to

simultaneously play in a persistent, shared, online world, usually built around a medieval

setting with lots of group campaigns and missions. These environments allow a high

degree of independence, creativity, and customization. In Battlezone, the player was a

standard tank, it was always night, and the starting level and location were always the

same. Now, a World of Warcraft player can choose his or her sex, race, class, continent,

gaming server, default language, guild, and numerous other variables — explained in

great detail in a 208-page guide.35 There are more than nine million active World of

Warcraft subscriptions.36

Second Life, a socially oriented virtual world accessed through the Internet, gives

“residents” even more extensive options to shape their own characters and in-world

34. Alexander Wolfe, “Who's The Child Now, Or Wii (Why) Most Adults Don't Play Video Games.” Wolfe’s Den, Information Week, December 2, 2007.
35. World of Warcraft Game Manual. Blizzard Entertainment, 2004.
36. “World Of Warcraft Surpasses 9 Million Subscribers Worldwide.” Press Release. Blizzard Entertainment, July 24, 2007. Available from http://www.blizzard.com/press/070724.shtml.

experiences. Using simple 3D building tools, they can model buildings, clothing,

vehicles, furniture, landscapes, plants, animals, and other objects. If it is nighttime in

their part of Second Life, and they cannot see, they can “force sun” to make it daylight —

even if others around them still see the same nighttime features. They can also customize

the appearance of their “avatars,” or personal 3D characters. Changing one’s face to have

a big nose, red eyes, and a mullet involves clicking through a few menus and adjusting

sliders that control nose size, eye color, and rear hair length. A resident can even change

his or her head to that of a cat, dog, or other animal.37

Both World of Warcraft and Second Life encourage socialization and cooperation

through shared missions or shared interests, whether it is conquering a monster-filled

cave system in World of Warcraft and splitting the treasures within, or building shops and

other virtual facilities for a Brazilian community in Second Life. In most game-oriented

virtual worlds, it is impossible to reach certain areas of the gaming world and achieve

high point levels without cooperating with other players and developing teams that most

effectively draw upon the various skills of different types of players. In Second Life,

ambitious building projects require groups of avatars, and enjoyment is often derived

from interaction with friends and strangers. As with the text-based Internet, interaction in

virtual worlds is not required, but it makes for richer and more rewarding social

experiences. In addition, the mechanics of socialization in these worlds parallel the tools

used in the text-based Internet, such as buddy lists and simple text messages. For

someone who has already been exposed to 3D games, instant messaging, and social

37. These avatars are referred to as “furries.”

networking, it is not difficult to make the leap to using an avatar, communicating in

group chats, and joining a guild in an MMORPG.

The 3D graphics for World of Warcraft look cartoonish, and Second Life’s

graphics look even more primitive — avatars move stiffly, textures look blurred, and

walls and other features often do not render at peak times or in locations where lots of

avatars congregate. These issues will gradually disappear as the technical infrastructure

of such services improves, and more advanced 3D hardware and software enters the

marketplace. Moore’s Law, a hypothesis put forth by Intel co-founder Gordon Moore in

1965, holds that the number of transistors on a chip will double roughly every two years.38 It

was originally envisioned for predicting the increase in the power of central processing

units (CPUs), but can be applied to advances in the abilities of graphics processing units

(GPUs) produced by specialized manufacturers such as nVidia. Every few years a new

generation of CPUs and GPUs is released to market, increasing the processing power of

desktop computers, gaming consoles, and portable devices. These advances allow game

designers to strive for the Holy Grail of the gaming industry — achieving advanced 3D

effects that approach photorealism:

The goal for many developers was now to create an experience


identical to reality: rippling waters, flowing hair, shifting wind, dynamic
moving lights, reflections on moving objects, facial lip syncing, varied
character animation and emotions, and real physics and collisions.39
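To make the doubling schedule concrete: if transistor budgets double every two years, a chip that starts with N transistors holds roughly N x 2^(t/2) of them after t years, so a graphics processor with 100 million transistors today would be expected to reach about 400 million in four years and 800 million in six. (The starting figure is purely illustrative, not drawn from any particular product; the point is that an eightfold budget for geometry, lighting, and texture work arrives within just a few hardware generations.)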

38. Gordon E. Moore, “Cramming More Components Onto Integrated Circuits.” Electronics, Volume 38, Number 8, April 19, 1965.
39. John Hight and Jeannie Novak, Game Development Essentials: Game Project Management (Clifton Park, NY: Thomson Delmar Learning, 2008), 17.

The drive to photorealism in computer-generated environments potentially

involves sampling real-life objects. This is already done for 3D textures — instead of

painstakingly recreating the rough ochre color of a brick, a designer can take digital

photographs of the six sides of a brick and map them onto a 3D mesh in a software

application. There are also technologies for capturing real-life actions, such as human

movements, and applying them to models in 3D animation or computer-generated

environments. Microsoft is now developing software called Photosynth that pastes

geotagged photographs of a building or object onto a 3D model associated with the same

geospatial coordinates. An application called Fotowoosh turns 2D pictures into simulated

3D images. Such applications open up the possibility of computer-generated

environments or game worlds that mirror real-world places and people.40
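A minimal sketch of the image-based texturing idea, with invented file names: each face of a simple box mesh is bound to a photograph of the corresponding side of the real brick, so the renderer samples pixels from the photos rather than from a hand-painted surface.

    # Hypothetical photo-sampled brick: six photographs mapped to six faces.
    brick = {
        "size_m": (0.20, 0.10, 0.06),   # width, depth, height in metres
        "face_textures": {
            "top": "brick_top.jpg",     "bottom": "brick_bottom.jpg",
            "front": "brick_front.jpg", "back": "brick_back.jpg",
            "left": "brick_left.jpg",   "right": "brick_right.jpg",
        },
    }

    def texture_for(face):
        """Return the photograph a renderer would sample when shading this face."""
        return brick["face_textures"][face]

    print(texture_for("front"))   # -> brick_front.jpg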

Another gaming technology that should be considered in any discussion about the

development of computer-generated environments is the narrow application of artificial

intelligence used to drive the behavior of monsters, enemies, and non-player characters

(NPC) that populate video games. For years, game AI has been based on programmed

logic — e.g., if an avatar in World of Warcraft opens a certain dungeon door, a troll will

launch an attack. In recent years, developers have been experimenting with more

complex game AI that actually “learns” from environmental variables, or is trained by

observing the behavior of human players. Jeff Orkin, a game developer and researcher at

the MIT Media Lab, has developed an online 3D game called The Restaurant Game that

teaches a game AI how to interact with human players, by recording the interactions of

40. Ian Lamont, “Transforming 2D photos into 3D models.” The Digital Media Machine, Computerworld, April 24, 2007. Available from http://blogs.computerworld.com/node/5418.

thousands of real human volunteers playing the game online. Orkin says that this

technology can potentially be applied to virtual worlds, as a way to make the actions of

NPCs more realistic to human players or residents.41
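The following toy sketch is not Orkin's actual system, but it illustrates the general approach of learning from recorded play: tally which action human players most often took in each situation, then have the non-player character imitate the most common response. The situations and actions are invented.

    from collections import Counter, defaultdict

    # Stand-in for logged human play sessions: (situation, action) pairs.
    logs = [
        ("customer_seated", "bring_menu"),
        ("customer_seated", "bring_menu"),
        ("customer_seated", "pour_water"),
        ("order_placed", "relay_to_kitchen"),
        ("order_placed", "relay_to_kitchen"),
    ]

    observed = defaultdict(Counter)
    for situation, action in logs:
        observed[situation][action] += 1   # count how often each response follows

    def npc_respond(situation):
        """Imitate the response human players chose most often in this situation."""
        counts = observed.get(situation)
        return counts.most_common(1)[0][0] if counts else "stand_idle"

    print(npc_respond("customer_seated"))   # -> bring_menu
    print(npc_respond("fire_alarm"))        # -> stand_idle (never seen in the logs)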

Besides gaming and virtual worlds, another popular application of computer-

generated environments involves simulations of buildings and representations of real-

world locations. It is now possible for potential homeowners to “tour” a 3D simulation of

a condominium development. Millions of vehicles in the United States have small

computers that capture location data from GPS satellites, and display a live, three-

dimensional representation of their locations and nearby streets. Google Earth, a software

program that uses geospatial information, satellite images, and 3D graphics, lets users

simulate flying over or through cities and geographical features. Google Earth users can

also geotag two-dimensional photographs, and map them on a corresponding Internet-

accessible 3D map. As of mid-2006, an estimated 72 million Americans had taken

“virtual tours” of another location online, with more than five million taking such tours

on a typical day.42

The computer-generated environments described above, and the functionality

available to users within them, are impossible to recreate with standard video

technologies. And why should they be? Computer-generated environments and video are

oriented toward different applications. However, this may soon change, as the digital

41. “The Restaurant Game: New forms of Artificial Intelligence for Immersive Education.” Jeff Orkin, MIT Media Lab. Presented at Immersive Education Day, Harvard Interactive Media Group/Harvard Graduate School of Education, December 8, 2007.
42. Xingpu Yuan and Mary Madden, “Virtual Space is the Place.” Pew Internet & American Life Project, November 27, 2006.

natives begin to enter maturity, 3D graphics achieve photorealism, and new Internet-

based software tools open up an expanded universe of online experiences that overlap

with those currently provided by video. Audiences and content creators will discover that

computer-generated environments can not only duplicate many types of video

programming, but also can provide customization, interactivity, and even social options

that amplify the ability of moving images to entertain and inform.

In recent years, there have been a number of experiments that indicate the

direction in which computer-generated environments are heading, and how they will

compete with and eventually displace video. Machinima — short for “machine

animation” or “machine cinema” — is one example. It involves the use of 3D animation

tools to make dramas, music videos, and other entertainment-oriented content.

Professional CGI and 3D animation tools have been used by Hollywood studios for

decades, but machinima is largely a grassroots phenomenon that relies upon inexpensive

technologies to create and distribute content. The content creators are individuals or small

teams, the tools are free or cheap games and game-modding engines, and the distribution

platform is usually the Internet.

One example is Red vs. Blue. Starting in 2003, and ending 100 episodes later in

2007, a small team of writers and programmers used the Halo game engine to create a

comedy series depicting the hapless antics of two opposing squads of soldiers.43 The

humor was juvenile, the voice actors were amateurs, and the 3D graphics were simple,

but the series became a cult hit on the Internet and was eventually distributed via DVD.

43. “Red vs. Blue: A Machinima Series Based on Halo.” Available from http://rvb.roosterteeth.com/home.php.

Another machinima, The French Democracy, was created in 2005 by Alex Chan, a

French industrial designer who had no previous experience with video production. He

wanted to explain the causes of the urban riots that tore through France that summer, and

he used The Movies — a $70 PC game — to create a drama that described the conditions

and factors that he believed were responsible for the riots. The quality of the animation

was primitive, and Chan had to rely on subtitles and music instead of voice actors for

audio, but the message was powerful. The 13-minute clip was downloaded by tens

of thousands of viewers44 and generated a great deal of mainstream press attention.

Machinima has barely made an impact on public awareness, but that will

eventually change as the quality of the graphics in machinima productions approaches

photorealism, high-quality speech synthesizers are developed, software tools

improve, and amateur writers/content creators become more skilled at scripting and

programming.

Further, while current machinima are like video in that they tell linear narratives,

the descendants of this technology will allow customization and interaction. For instance,

a machinima might let viewers preselect the appearance of the avatar stars, the sounds of

their voices, the location of the dramas, and other plot elements. So, I may opt to watch a

soap opera machinima in the default mode — a standard plot involving a love triangle

between two men and a woman in Los Angeles. However, another viewer may want to

see a love triangle with two women and a man in a small town in the Rockies, change the

name of the lead male character to “Walter,” set the appearance of both of the women to

44. The French Democracy had 31,102 views in QuickTime format, and 14,451 views in Windows Movie format as of January 8, 2007, on the Machinima.com website.

blondes, and restrict close-up shots to less than 3% of the total running time. A third viewer

in Japan may transfer the story to Tokyo, and have all of the characters speaking in

Japanese. Such options will become possible with more advanced development tools and

user interfaces.
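None of the tools described above exist yet in this form, but a sketch of the kind of preference file a future machinima player might accept makes the idea concrete; every field name below is hypothetical.

    # Hypothetical viewer preferences for a customizable machinima episode.
    preferences = {
        "setting": "small town in the Rockies",
        "cast": {
            "lead_male": {"name": "Walter"},
            "lead_females": [{"hair": "blonde"}, {"hair": "blonde"}],
        },
        "dialogue_language": "ja",        # render all speech in Japanese
        "max_closeup_fraction": 0.03,     # close-ups under 3% of running time
    }

    def render_episode(script, prefs):
        """Stub: a future engine would restage the same script under these preferences."""
        return (f"Rendering '{script}' set in {prefs['setting']}, "
                f"dialogue in '{prefs['dialogue_language']}'")

    print(render_episode("love_triangle_ep01", preferences))

The same script, staged three different ways for three different viewers, is what would separate this kind of content from a fixed piece of video.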

Another possibility for future machinima is to let viewers bring their own avatars

into the story. Most 3D games already have a background story and a plot that players are

supposed to follow. With the exception of some MMORPGs, it is seldom possible to

deviate from the pre-arranged mission, let alone play a part in a love triangle. Flexible

game AIs and plot templates could make interactive machinima a reality. Conceivably,

groups of friends could join each other in a drama, helping to support the plot in some

way — for instance, distracting or disabling a character who seeks to harm the

protagonist. Or, a historical drama or documentary depicting the start of the American

Revolution could also serve as a virtual classroom that lets elementary school students explore the

Boston of the 1770s. The application could also let students interact with the period

avatars, whose reactions would be partially driven by advanced game AIs.

In terms of news and documentaries, computer-generated environments cannot

replace compelling video footage of live events, natural phenomena, and recordings of

personal moments. However, computer-generated environments may be used for realistic

simulations when video is not available. For instance, they may let viewers see a two-car

accident from multiple angles — including the points of view from each of the drivers’

seats starting five seconds before the collision — based on the geospatial data gathered

from the police report and other sources. Or, they could let students in an astronomy class

see a simulated asteroid impact on the moon in visible light or infrared light, from 50,

100, or 1,000 kilometers away. Author and inventor Ray Kurzweil predicts computer-

generated environments may one day be overlaid upon our real-world views, through

eyewear that displays text, icons and other information corresponding to objects in our

field of view:

… If you look at someone, little pop-ups will appear in your field


of view, reminding you of who that is, giving you information about them,
reminding you that it’s their birthday next Tuesday. If you look at
buildings, it will give you information, it will help you walk around. If it
hears you stumbling over some information that you can’t quite think of, it
will just pop up without you having to ask.

The “augmented reality” described by Kurzweil would reduce our dependence

upon information delivered through computer monitors and small liquid crystal displays

on mobile phones. It would also rely upon geotags, facial recognition software, speech

recognition technologies, and a brain-machine interface that lets people input information

or commands into these systems without speaking or pressing keys.

Computer-generated environments could also replace live human anchors and

newsrooms. Most anchors simply read from scripts that either describe the footage that is

being shown on the screen or introduce segments by reporters in the field. This method of

presenting news is expensive and inflexible. Anchors command high salaries, can only work

at pre-arranged times throughout the day, and can get sick. Some viewers do not like the

appearance of a certain anchor. Avatars and software can remedy these shortcomings, and

allow a viewer to customize the appearance of his or her anchor, the type of news the

anchor narrates, and the time the newscast starts and finishes. Developers at

Northwestern University’s Intelligent Information Laboratory have created a prototype

application called News At Seven that features an avatar anchor reading news from eight

different categories — Business, Entertainment, Health, Politics, Science, Tech, U.S., and

World news. News At Seven is delivered over the Web, and is automated. Scripts, still

images, and video are pulled from other online news sources.45 It can be launched at any

time of the day or night. Until October of 2007, News At Seven used avatars from the

Half-Life 2 game engine, but the high processing requirements associated with generating

a talking 3D avatar on demand forced the designers to switch to simple 2D avatars for

the limited beta launch of the application.46

In the future, similar news applications could allow 3D avatars to be customized

to mimic real news anchors (Walter Cronkite, Katie Couric, Jack Williams), other real

people (someone’s father, a favorite teacher, a politician), characters based on a set of

self-selected attributes, or one’s own avatar. The avatars might be seated in a simulated

newsroom, or could be moved to a computer-generated environment that mirrors the real-

life location where the news they are describing took place. The environment

might be based upon geotags and other metadata that were generated by the original

reports and video footage. The news itself can also be fine-tuned, based on specific

categories, locations, times, and keywords chosen by the viewer. I may choose to have

the first half of my newscast consist of developments relating to the New York Stock

Exchange in the previous 24 hours. For the second half, I may restrict my anchor to

45. Kate Goodloe, “Broadcast News Goes Human-Free.” The Wall Street Journal, January 6, 2007. Available from http://online.wsj.com/public/article/SB116803755568668612-7IG7wBl1Wpezld0friGmB0x1ONM_20070113.html.
46. “News at Seven Beta Launch!” News At Seven Blog, October 29, 2007. Available from http://newsatseven.com/blog/?m=200710.

reading reports that mention “China” or “Beijing” in the lede and have accompanying

video footage sourced from any clip taken in Beijing or Shanghai within the past six

hours. Detailed metadata would be crucial to creating such a report.
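A rough sketch of the kind of metadata query such a newscast would run is shown below; the story records, field names, and filtering rules are invented for illustration and do not describe News At Seven or any existing system.

    from datetime import datetime, timedelta

    # Hypothetical story metadata pulled from online news sources.
    stories = [
        {"category": "Business", "keywords": {"NYSE"}, "footage_city": "New York",
         "published": datetime(2008, 1, 15, 9, 30)},
        {"category": "World", "keywords": {"China", "Olympics"}, "footage_city": "Beijing",
         "published": datetime(2008, 1, 15, 16, 0)},
    ]

    def build_segment(items, now, category=None, keywords=None,
                      max_age_hours=24, footage_cities=None):
        """Select stories matching the viewer's category, keyword, recency, and footage rules."""
        cutoff = now - timedelta(hours=max_age_hours)
        picked = []
        for s in items:
            if category and s["category"] != category:
                continue
            if keywords and not (keywords & s["keywords"]):
                continue
            if footage_cities and s["footage_city"] not in footage_cities:
                continue
            if s["published"] >= cutoff:
                picked.append(s)
        return picked

    now = datetime(2008, 1, 15, 19, 0)
    first_half = build_segment(stories, now, category="Business", keywords={"NYSE"})
    second_half = build_segment(stories, now, keywords={"China", "Beijing"},
                                max_age_hours=6, footage_cities={"Beijing", "Shanghai"})
    print(len(first_half), len(second_half))   # -> 1 1

The avatar anchor would then read whichever stories survive the filters, in the order and at the time the viewer chooses.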

Approximately 10 years ago, Stephens prophesied a mass media environment that

would be increasingly dominated by video. Noting the failure of CD-ROMs and other

early interactive video technologies such as Time Warner’s Qube,47 he foresaw computers

playing largely supportive roles, such as adding graphical flavor and creating distribution

channels for new video. Even in the current Web 2.0 age, characterized by text-based

media such as social networking websites and blogs, many observers still believe video

will eventually triumph, thanks to its solid broadcasting track record, strong advertising

revenue, and the popularity of online video. Other experts acknowledge that computer-

generated environments will be important, but many are unsure what such formats will

look like or how people will use them.48

While predicting the future is difficult, it is possible to identify trends based on

quantitative research and an understanding of recent developments in computer software,

hardware, and networking technologies. I believe many of the predictions outlined above,

far from being the realm of science fiction, provide valid insights into the future of mass

media. Computer-generated environments and other Internet technologies will not only

change the ways in which we interact with each other, they will change the way in which

we see our world.

47. Stephens, 169, 174.
48. Janna Quitney Anderson and Lee Rainie, “The Future of the Internet II.” Pew Internet & American Life Project, September 24, 2006.

