Professional Documents
Culture Documents
Alexander Korotkov
Oriole DB Inc.
2021
1
According to db-engines.com
Alexander Korotkov Solving PostgreSQL wicked problems 4 / 40
PostgreSQL – strong trend2
2
https://db-engines.com/en/ranking_trend/system/PostgreSQL
Alexander Korotkov Solving PostgreSQL wicked problems 5 / 40
PostgreSQL – most loved RDBMS3
3
According to Stackoverflow 2020 survey
Alexander Korotkov Solving PostgreSQL wicked problems 6 / 40
The dark side of PostgreSQL
https://eng.uber.com/postgres-to-mysql-migration/
Alexander Korotkov Solving PostgreSQL wicked problems 8 / 40
Cri cism of PostgreSQL (2/2)
https://medium.com/@rbranson/10-things-i-hate-about-postgresql-20dbab8c2791
Memory Disk
1' 1
2' 3'
2 3
5' 6' 4 5 6 7
Buffer
1 2 3 4 5 6 7
mapping
▶ Each page access requires lookup into buffer mapping data structure.
Memory Disk
1' 1
2' 3'
2 3
5' 6' 4 5 6 7
Buffer
1 2 3 4 5 6 7
mapping
▶ Each page access requires lookup into buffer mapping data structure.
▶ Each B-tree key lookup takes mul ple buffer mapping lookups.
Memory Disk
1' 1
2' 3'
2 3
5' 6' 4 5 6 7
Buffer
1 2 3 4 5 6 7
mapping
▶ Each page access requires lookup into buffer mapping data structure.
▶ Each B-tree key lookup takes mul ple buffer mapping lookups.
▶ Accessing cached data doesn’t scale on modern hardware.
Alexander Korotkov Solving PostgreSQL wicked problems 13 / 40
Solu on: Dual pointers
2 3
5 7
Disk
1 2 3 4 5 6 7
2 3
5 7
Disk
1 2 3 4 5 6 7
2 3
5 7
Disk
1 2 3 4 5 6 7
page
row1 row4
page
row1 row4
page
row1 row4
Index #1 Index #2
Heap
WAL
Index #1 Index #2
Heap
WAL
Index #1 Index #2
Heap
WAL
WAL
▶ Very compact.
WAL
▶ Very compact.
▶ Apply can be parallelized.
WAL
▶ Very compact.
▶ Apply can be parallelized.
▶ Suitable for mul master (row-level conflicts, not block-level).
WAL
▶ Very compact.
▶ Apply can be parallelized.
▶ Suitable for mul master (row-level conflicts, not block-level).
▶ Recovery needs structurally consistent checkpoints.
Alexander Korotkov Solving PostgreSQL wicked problems 18 / 40
Row-level WAL based mul master
Raft replication
2 3
5 7
Disk
1 2 3 4 5 6 7
2 3
5 7*
Disk
1 2 3 4 5 6 7
1*
2 3*
5 7*
Disk
1 2 3 4 5 6 7 7* 3* 1*
1*
2 3*
5 7*
Disk
2 4 5 6 7* 3* 1*
PostgreSQL
4
https://gist.github.com/akorotkov/f5e98ba5805c42ee18bf945b30cc3d67
Alexander Korotkov Solving PostgreSQL wicked problems 27 / 40
OrioleDB benchmark: read-only scalability
Read-only scalability test PostgreSQL vs OrioleDB
1 minute of pgbench script reading 9 random values of 100M
PostgreSQL
OrioleDB
800000
600000
TPS
400000
200000
0
0 50 100 150 200 250
# Clients
400000
300000
TPS
200000
100000
0
0 50 100 150 200 250
# Clients
120000
100000
80000
TPS
60000
40000
20000
pgsql-read-write
orioledb-read-write
0 orioledb-read-write-block-device
0 250 500 750 1000 1250 1500 1750 2000
# Clients
150000
125000
TPS
100000
75000
50000
25000
0
0 250 500 750 1000 1250 1500 1750 2000
# Clients
TPS
400000
300000
200000
100000
0
400 600 800 1000 1200 1400 1600
seconds
CPU usage
100
PostgreSQL
OrioleDB
80
60
Usage, %
40
20
0
400 600 800 1000 1200 1400 1600
seconds
TPS
400000
300000
200000
100000
0
400 600 800 1000 1200 1400 1600
seconds
IO load
35000 PostgreSQL
OrioleDB
30000
25000
20000
IOPS
15000
10000
5000
0
0 250 500 750 1000 1250 1500 1750
seconds
Space used
80 PostgreSQL
70 OrioleDB
60
50
40
GB
30
20
10
0
400 600 800 1000 1200 1400 1600
seconds
OrioleDB: no bloat!
150
125
100
IOPS
75
50
25
0
0 500 1000 1500 2000 2500 3000 3500
seconds
OrioleDB: 9X less read IOPS!
Alexander Korotkov Solving PostgreSQL wicked problems 35 / 40
OrioleDB benchmark: taxi workload (2/3): write
Disk write
PostgreSQL
350 OrioleDB
300
250
200
IOPS
150
100
50
0
0 500 1000 1500 2000 2500 3000 3500
seconds
OrioleDB: 4.5X less write IOPS!
Alexander Korotkov Solving PostgreSQL wicked problems 36 / 40
OrioleDB benchmark: taxi workload (3/3): space
Space used
40 PostgreSQL
OrioleDB
35
30
25
GB
20
15
10
0
0 500 1000 1500 2000 2500 3000 3500
seconds
OrioleDB: 8X less space usage!
Alexander Korotkov Solving PostgreSQL wicked problems 37 / 40
OrioleDB = Solu on of wicked PostgreSQL
problems + extraordinary performance