The Data-Oriented Design Process for Game Development
Jessica D. Bayliss, Rochester Institute of Technology & Unity Technologies
Data-oriented design (DOD) grew when game developers needed to use modern hardware architectures for performant games, and existing software processes did not meet their needs. The DOD process reduces software to a basic goal of computer architecture: to input, transform, and output data. To properly explain DOD, we first define DOD and compare it to similar processes. A history of how DOD evolved is introduced, and core concepts in the DOD process are further discussed through several relevant examples. The Unity Technologies Data-Oriented Tech Stack (DOTS) is brought up as a canonical use of DOD, and the conclusion mentions the use of DOD outside of game development as well as its future.

Digital Object Identifier 10.1109/MC.2022.3155108
Date of current version: 6 May 2022
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/deed.ast
Published by the IEEE Computer Society, May 2022

When discussing software processes, it is important to consider that we are likely biased when solving problems using a computer. Current research suggests that we overlook subtractive changes in problem-solving in comparison with additive changes.1 For example, when given a Lego block bridge that has a one-block difference between the left and right sides, most people under a time constraint will choose to add a block to one side rather than remove a block from the other side. It makes sense that this bias would also impact how we make decisions in developing software. “Feature creep” is a known potential issue, and most software is developed within time constraints. This bias can lead to bloated and slow software, incompatible with soft real-time systems, such as games, which are required to consistently run at 30, 60, or even 100 (such as for virtual reality applications) frames/s.
NEXT-GENERATION GAME TECHNOLOGY
The emphasis on data as a design driver allows DOD to reduce unnecessary complexity and emphasizes that transforming data well means that one must understand characteristics of the data as well as the whole supply chain of development (for example, hardware and compilers) that implements data transformations. Historically, viewing data as core to the software development process is not a new concept. For example, data flow programming was conceived in the 1960s and concentrates on the flow of data through software algorithms, primarily for parallel computation.2,3 The concept of data flow is related to DOD, but, in data flow, the emphasis is not truly on knowledge of data but on the flow of data from one algorithm to another.

DOD is an imperative design process due to its emphasis on program state changes; however, it is different from similar-sounding design processes, such as data-driven design, which is also related to data flow but allows the input data to control the state of the program, sometimes even at a computer architecture level.4 In software design, data-driven design is commonly used in games to increase flexibility. As an example, the data for a game level may contain information about special effects and door state changes that are read in and executed by a data-driven game program.

DOD does not emphasize data control or data flow in a program, only that the program be defined in terms of data input, transformations, and output. The core that ties all of the patterns in DOD together is that programs only input, transform, and output data. All elements and patterns involved with DOD may be understood through this focus.

DOD asks detailed questions about the data and uses answers to design software. Examples include asking about

›› type
›› distribution
›› count
›› storage
›› accuracy.

In modeling, knowledge of the data can change the way programs are created. As an example, it becomes possible to consider which program system to create next from how often that system is run and how much data it transforms. Within a game context, if a character spends most of his or her time walking around the game world, one could foresee that walking is very important and should be a high priority in development.

Deep knowledge of data and the problem being solved leads to concrete solutions but does not preclude flexibility in software. Flexibility is part of the design process and should be planned out and carefully thought through rather than just “thrown in” as part of a “generic” solution. Solving for a problem that needs a flexible solution is a different concrete problem than solving for nonflexible cases.

In the search to understand data transformations (especially when they work poorly), the whole supply chain for software development, including both hardware and tools, is considered. As an example, hardware is considered primarily because it provides the physical means to transform data. A DOD proponent does not seek to know all hardware specifics but to specifically understand how the details of the hardware impact the constraints that the software needs to meet. For this reason, the most common hardware considerations in DOD are cache performance and multicore processing. Both of these hardware elements greatly impact the performance of the input, transformation, and output of data.

The core of DOD is not about optimization or making fast programs through hardware consideration; it is about organizing programs around a deep knowledge of data and its transformation. A slow program can be based around modeling data and ignoring the supply chain view that DOD proponents use. If performance is not important for the application, then the knowledge of data can still be used to create software. However, it cannot be overstated that engineering software well requires understanding transformations. Hardware knowledge is required primarily because models of software and compilers are abstractions, and those abstractions are leaky and inconsistent.5

THE HISTORY OF DOD
The movement toward data orientation in game development occurred
game controller can turn the PS4 on/off, but people not using the game controller may have to look up an image to see where the power on/off switch is located, as it is hidden under a decorative panel on the front of the PS4.

Given the complexity of most software problems that need much more than a single bit of data, it is important not to allow extra complexity into the solution, as that extra complexity commonly creates more complicated solutions that need to be fully tested/debugged. Additional levels of complexity also equal additional requirements for testing and validation.

One bit that represents the on/off behavior of the light is necessary to turn on the light. Subtracting all of the extra data available yields an interface to the light bit that could look something like Figure 2(a). This particular solution is very similar to the solutions that DOD proponents seek in that it well represents the data and transformation of those data.

FIGURE 2. A switch (a) that minimizes the data necessary for turning on a light and (b) with slightly more data that allows a user to reason about which direction of the light switch is the on position.

There are some extra considerations involved with that solution outside of the initial questions asked, though. For instance, the wall plate acts as a safety measure and covers the wires on the back side of the light switch so that people cannot accidentally touch them. It is normal for extra considerations to come up in the design and implementation of a solution, but each consideration should be carefully thought about before being added to the solution.

In Figure 2(a), the solution assumes the user knows that up is the on position and down is the off position, as the state change is not labeled. While this is true in some countries, in others, the opposite is true.

This hinders the usability of the switch, as the light switch on the left does not contain all necessary data for a user to know how to turn it on and off. A final solution to the light problem may look something like Figure 2(b), where ON is displayed, and the user is given all of the information necessary to turn on the light.

The Autofarmers simulation problem
The Autofarmers simulation is shown in Figure 3 and consists of farmers breaking rocks, tilling soil, planting crops, and selling those crops to a store for money. The full simulation is too complex for presentation, but it is useful to show how to model software from a data perspective.

FIGURE 3. The Autofarmers simulation, where robotic farmers break rocks, till soil, plant procedurally generated crops, and take those crops to the market.

How does one begin to view this in a DOD way, and how does using DOD alter the software development of the program? Ignoring the 3D models in the program (which each have their own data), one potential set of main simulation data includes

›› position (x, y)
›› speed
›› direction
›› target
›› state
›› scale.

These are the data necessary to perform the simulation part of the program. Rather than considering each “thing” in the game as its own object with separate activities, writing out data information allows operations across the game to be batched where possible and functionality used by multiple sets of data.

As an example, everything in the game has a position and can be placed at the same time. Only plants and farmers move. (Plants are carried to the store by farmers.) This allows for the same movement functionality to be used on farmers and plants. DOD looks at common data as well as operations and allows for the batching of those data with those operations.

In viewing the simulation in a DOD manner, planning would also look at which transformations are made the most often. As an example, farmers in the simulation spend most of their time walking to different places. Hence, a DOD-based solution would seek to understand the exact data necessary
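As a rough illustration of the batching idea in the Autofarmers discussion, the following Python sketch keeps the movement-related simulation data as parallel lists and applies one movement transformation to every mover at once. The names `Movers` and `move_all`, the split of direction into `dir_x` and `dir_y`, and the time step `dt` are assumptions made for this sketch; they are not taken from the actual Autofarmers implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Movers:
    """Hypothetical structure of arrays: one list per field, one index per mover."""
    x: List[float] = field(default_factory=list)
    y: List[float] = field(default_factory=list)
    dir_x: List[float] = field(default_factory=list)  # direction, x component
    dir_y: List[float] = field(default_factory=list)  # direction, y component
    speed: List[float] = field(default_factory=list)

    def add(self, x: float, y: float, dir_x: float, dir_y: float, speed: float) -> int:
        """Append one mover (a farmer or a carried plant) and return its index."""
        self.x.append(x)
        self.y.append(y)
        self.dir_x.append(dir_x)
        self.dir_y.append(dir_y)
        self.speed.append(speed)
        return len(self.x) - 1


def move_all(movers: Movers, dt: float) -> None:
    """Batched transformation: advance every position by direction * speed * dt."""
    for i in range(len(movers.x)):
        step = movers.speed[i] * dt
        movers.x[i] += movers.dir_x[i] * step
        movers.y[i] += movers.dir_y[i] * step
```

Under these assumptions, farmers and carried plants would be appended to the same `Movers` batch, so the movement code is written once and runs over all moving things in a single pass, while rocks and soil, which only need a position, never enter the loop.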
away from the processor also turns into extra power usage, as moving data costs more in terms of power than processing those data.

Structurally, a common pattern for organizing data for good cache usage is to employ Structures of Arrays (SoA) to organize arrays of homogeneous data so that they can easily be read and used in programs. A common phrase in DOD says that “where there is one, there are many,” as processing data in a batch can have multiple benefits, including good cache usage. This differs from some traditional approaches, where all data based around a single concept are organized in a class based around that concept, and instances of those classes are put into arrays and methods called on each individual instance. Figure 4 shows how the organization differs in SoA when compared to Arrays of Structures (AoS).

The image transformation problem and cache usage. Organizing data for good cache usage appears to have already been done for the potential solution in the image problem. However, in the C# language (the language chosen for the solution), data are in a row-major format, meaning that elements are laid out next to each other in rows, rather than column-major format (for example, Fortran), where columns are next to each other in memory. Taking this information into account means that, to properly lay out the image data for transformation, the two for loops in the piece of code should be swapped (image height should come first, with the inner loop on image width) for better cache utilization.

Better cache utilization is only needed for the solution because the requirement specifies that the data count is enough that cache utilization will make a difference in performance. It is possible that DOD can be used to accomplish goals other than performance, such as memory or power utilization considerations. For this application, if the image was a 16 × 16 image, then it would not matter which way the transformation was done because there are only 256 total pixels to transform, and they can fit within the cache on modern processors.

Multiprocessor usage. Multiprocessor performance has become more important as games have evolved. The DOD view of programs as data input, transformation, and output is helpful when designing for parallel processing. The need for synchronization is a large bottleneck to performance for games. Since race conditions only happen when data can be changed or written to, knowledge of the read and write status for all data and when transformations happen in a program helps to avoid race conditions. The SoA format can be used to help batch jobs since parallelization needs a certain number of elements to process before parallel processing becomes advantageous.

The image transformation problem and multiprocessor usage. For the image transformation problem, using multiple processors likely will not help to solve the problem, as it is only a single small image that is being transformed, and the overhead for setting up parallel processing can be more than the transformation of a single image. In the case of a problem that required transforming a set of images, parallel processing could be very useful and save significant time in processing.
from the experiments done for measuring data.

THE UNITY DOTS
Unity’s DOTS is a large-scale industry example of DOD and consists of an experimental set of packages in Unity that were introduced after they were announced in 2017.4 It is currently the most publicly available example of DOD in an industry product since the entity component system (ECS) source code is viewable online. The core components of DOTS are

›› the burst compiler
›› an ECS
›› a job system
›› testing and debugging support tools.

Burst compiler
The burst compiler is an excellent example of understanding and using the whole software development supply chain to create better solutions. It is an LLVM-based compiler technology that optimizes C# code for Unity’s job system. It exists to allow for better overall game performance when using DOTS.

ECS
Unity’s ECS implementation works closely with the job system. Entities are [...] and commonly uses them as filters for jobs.

Job system
The job system exists to make data parallelism in C# easier. Jobs accept blittable (simple data types, such as float and int) structures, transform the data in those structures, and output results. The job system contains several different constructs that range from job-based parallel for loops that capture outside variables with a lambda expression to structures that have their own execute function.

Job inputs can be tagged as read only to allow for better knowledge regarding how the data are being input, transformed, and output. Jobs expect to obtain data laid out in an SoA manner, and options exist to determine various worker thread settings.

Testing and debugging support tools
One important part of viewing data early and often is to have support for adequately accessing the data. While profiler support of the job system in Unity is not one of the main selling features of DOTS, it supports DOD development efforts deeply. The profiler specifically profiles per frame and shows overall job system utilization as

DOD concepts have been presented within the context of the game development field. DOD does not just exist within the game development field, and there are indications that it has relevance for other fields in software development. As an example, in 2014, CppCon: The C++ Conference invited DOD proponent Mike Acton to give a keynote speech to the larger C++ community,10 and common DOD design patterns, such as using SoA rather than AoS, can aid any application that processes a lot of data programmatically.

In terms of philosophy, some of the concepts of DOD are at odds with object-oriented design (OOD), although it depends on which definition of OOD is used, and it is highly dependent on the problem being solved. A discussion of the many OOD definitions is outside the scope of this article, but, since DOD requires that the data be considered first and foremost for software development, simulated objects representing the problem space are unlikely to be used. The philosophy of DOD does not state that objects representing the problem space cannot be used if they happen to well represent the data for input, transformations, and output. DOD promotes solving concrete problems as opposed to generic ones, which