Hasso Plattner
A Course in
In-Memory Data Management
The Inner Mechanics
of In-Memory Databases
8 Data Layout in Main Memory
As the name random access memory suggests, the memory can be accessed randomly and one would expect constant access costs. In order to test this assumption, we run a simple benchmark accessing a constant number of addresses with an increasing stride, i.e., distance, between the accessed addresses.
We implemented this benchmark by iterating through an array, chasing a pointer. The array is filled with structs. Structs are data structures that allow the creation of user-defined aggregate data types, grouping multiple individual variables together. Each struct consists of a pointer and an additional data attribute realizing the padding in memory, resulting in memory accesses with the desired stride when following the pointer-chained list.
struct element {
    struct element *pointer;
    size_t padding[PADDING];
};
In case of a sequential array, the pointer of element i points to element
i + 1 and the pointer of the last element references the first element so that
the loop through all array elements is closed. In case of a random array,
the pointer of each element points to a random element of the array while
ensuring that every element is referenced exactly once. Figure 8.1 outlines
the created sequential and random arrays.
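The construction of the two chained arrays can be sketched as follows. This is an illustrative re-statement, not the book's actual harness: the PADDING value, the function names, and the Fisher-Yates shuffle used to generate the permutation are assumptions of this sketch.

```c
#include <stdlib.h>

#define PADDING 7 /* illustrative; the benchmark varies this value */

struct element {
    struct element *pointer;
    size_t padding[PADDING];
};

/* Sequential array: element i points to element i + 1,
 * and the last element points back to the first. */
void build_sequential(struct element *arr, size_t n) {
    for (size_t i = 0; i < n; i++)
        arr[i].pointer = &arr[(i + 1) % n];
}

/* Random array: chain the elements along a random permutation,
 * forming a single cycle so every element is referenced exactly once. */
void build_random(struct element *arr, size_t n) {
    size_t *perm = malloc(n * sizeof(size_t));
    for (size_t i = 0; i < n; i++)
        perm[i] = i;
    for (size_t i = n - 1; i > 0; i--) { /* Fisher-Yates shuffle */
        size_t j = (size_t)rand() % (i + 1);
        size_t tmp = perm[i];
        perm[i] = perm[j];
        perm[j] = tmp;
    }
    for (size_t i = 0; i < n; i++)
        arr[perm[i]].pointer = &arr[perm[(i + 1) % n]];
    free(perm);
}
```

Because both variants form one closed cycle, following the pointer n times from any element returns to that element, regardless of the layout.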
If the assumption holds and random memory access costs are constant, then the size of the padding in the array and the array layout (sequential or random) should make no difference when iterating over the array. Figure 8.2 shows the result for iterating through a list with 4,096 elements, while following the pointers inside the elements and increasing the padding between the elements. As we can clearly see, the access costs are not constant and increase with an increasing stride. We also see multiple points of discontinuity in the curves, e.g. the access times increase heavily up to a stride of 64 bytes and continue increasing with a smaller slope.
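The traversal that produces these measurements can be sketched as a dependent pointer chase. This is an assumed restatement of the technique, redefining the struct for self-containedness; the book's measurement code may use hardware counters instead.

```c
#include <stddef.h>

#define PADDING 7 /* illustrative stride; the benchmark varies this */

struct element {
    struct element *pointer;
    size_t padding[PADDING];
};

/* Chase the pointer chain for a fixed number of accesses. Each load
 * depends on the result of the previous one, so the accesses cannot
 * overlap and the run time directly reflects the memory access latency. */
struct element *chase(struct element *start, size_t accesses) {
    struct element *p = start;
    for (size_t i = 0; i < accesses; i++)
        p = p->pointer;
    return p; /* returned so the compiler cannot drop the loop */
}
```

A caller would time the chase, e.g. with clock() from <time.h> around chase(head, 1 << 24), and divide the elapsed time by the number of accesses to obtain the cost per access.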
8.1 Cache Effects on Application Performance
[Figure: cache misses per element over stride in bytes (2^0 to 2^18), for sequential and random access; curves for L1, L2, L3, and TLB misses; cache line size and page size marked on the horizontal axis.]
Fig. 8.3: Cache Misses for Cache Accesses with Increasing Stride
Fig. 8.4: Cycles and Cache Misses for Cache Accesses with Increasing Working Sets
As mentioned above, there are use cases where a row-based table layout can be more efficient. Nevertheless, many advantages favor the use of a columnar layout in an enterprise scenario.
Row Data Layout
□ Data is stored tuple-wise
□ Leverage co-location of attributes for a single tuple
□ Low cost for reconstruction, but higher cost for sequential scan of a single attribute
Columnar Data Layout
□ Data is stored attribute-wise
□ Leverage sequential scan-speed in main memory
□ Tuple reconstruction is expensive
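The trade-off between the two layouts can be illustrated with a small, hypothetical three-attribute table; the attribute names and table size are made up for this sketch.

```c
#define N 4 /* number of tuples in this toy table */

/* Row layout: all attributes of one tuple are adjacent in memory,
 * so reconstructing a complete tuple is cheap. */
struct row { int id; int price; int qty; };

/* Columnar layout: all values of one attribute are adjacent in memory,
 * so scanning a single attribute is a sequential sweep. */
struct columns { int id[N]; int price[N]; int qty[N]; };

/* Scanning one attribute: contiguous reads in the columnar layout,
 * strided reads (stride = sizeof(struct row)) in the row layout. */
long sum_price_row(const struct row *t) {
    long s = 0;
    for (int i = 0; i < N; i++)
        s += t[i].price;
    return s;
}

long sum_price_col(const struct columns *t) {
    long s = 0;
    for (int i = 0; i < N; i++)
        s += t->price[i];
    return s;
}
```

Both functions compute the same sum; the difference lies in the memory access pattern, which is exactly the stride effect measured by the benchmark above.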