Order Notation in Practice

Roger Orr
OR/2 Limited
What does complexity measurement mean?
– ACCU 2014 –

What is Order Notation?

This notation is a way of describing how the number of operations performed by an algorithm varies by the size of the problem as the size increases

You've probably heard of order notation before – if you have studied computer science then the next section is likely to be revision

Why do we care?

Almost no-one* is actually interested in the complexity of an algorithm

What we normally care about is the performance of a function

The complexity measure of an algorithm will affect the performance of a function implementing it, but it is by no means the only factor

(*Present audience possibly excepted)

Ways to measure performance

There are a number of different ways to measure the performance of a function

Typical measures include:

Wall clock time

CPU clock cycles

Memory use

I/O (disk, network, etc)

Power consumption

Number of <> brackets used

Complexity measurement

Complexity measurement is (normally) used to approximate the number of operations performed

This is then used as a proxy for CPU clock cycles

It ignores 'details' such as memory access costs that have become increasingly important over time

It often is a measure of one operation

Introduction to Order Notation

A classification of algorithms by how they respond to changes in size.

Uses a big O (also called Landau's symbol, after the number theoretician Edmund Landau who invented the notation)

We write f(x) = O(g(x)) to mean:
There exists a constant C and a value N such that |f(x)| < C|g(x)| ∀ x > N

Example of Order Notation

If f(x) = 2x^2 + 3x + 4

Then f(x) = O(x^2)

If h(x) = x^2 + 345678x + 456789

Then h(x) = O(x^2)

Note that, in these two cases, the values of C and N are likely to be different:

For f we can use (C, N) = (3, 4)
For h we can use (C, N) = (2, 345679) or (4000, 87)
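
The (C, N) pairs above can be spot-checked numerically. The sketch below is illustrative only: the helper name holds_beyond and the scan limit are assumptions, and a finite scan over integer x is evidence rather than a proof.

```cpp
#include <cassert>
#include <cstdint>

// The two example functions from the slide, plus g(x) = x^2.
std::int64_t f(std::int64_t x) { return 2 * x * x + 3 * x + 4; }
std::int64_t h(std::int64_t x) { return x * x + 345678 * x + 456789; }
std::int64_t g(std::int64_t x) { return x * x; }

// Hypothetical helper: checks fn(x) < C * g(x) for every integer x in (N, limit].
// All values are positive for x > 0, so the absolute-value bars can be dropped.
template <typename Fn>
bool holds_beyond(Fn fn, std::int64_t C, std::int64_t N, std::int64_t limit) {
    assert(limit > N);
    for (std::int64_t x = N + 1; x <= limit; ++x) {
        if (!(fn(x) < C * g(x))) {
            return false;
        }
    }
    return true;
}
```

For instance, holds_beyond(f, 3, 4, 100000) succeeds, while lowering N to 3 fails because f(4) equals 3 * g(4) exactly.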

Example of Order Notation

Note that f and h are both O(x^2) although they're different functions.

For the purposes of order classification, it doesn't matter what the multiplier C is nor how big the value N is.

Note too that formally O is a "<=" relationship. So j(x) = 16 is also O(x^2)

If f(x) = O(g(x)) and g(x) = O(f(x)) then we can write f(x) = Θ(g(x))

Some common orders

Here are some common orders, with the slower growing functions first:

O(1) – constant

O(log(x)) – logarithmic

O(x) – linear

O(x^2) – quadratic

O(x^n) – polynomial

O(e^x) – exponential

Order arithmetic

When two functions are combined the order of the resulting function can (usually) be inferred

When adding functions, you simply take the biggest order

eg. O(1) + O(n) = O(n)

When multiplying functions, you multiply the orders

eg. O(n) * O(n) = O(n^2)

Order arithmetic for programs

For a function making a sequence of function calls the order of the function is the same as the highest order of the called functions

    void f(int n) {
        g(n); // O(n.log(n))
        h(n); // O(n)
    }

In this example f() = O(n.log(n))

Order arithmetic for programs

For a function using a loop the order is the product of the order of the loop count and the loop body

    void f(int n) {
        int count = g(n); // count is O(log(n))
        for (int i = 0; i != count; ++i) {
            h(n); // O(n)
        }
    }

In this example too f() = O(n.log(n))
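
To make the product rule concrete, here is a minimal sketch that counts unit steps instead of timing anything. The functions loop_count and linear_work are hypothetical stand-ins for g(n) and h(n); the slides do not define them.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for g(n): number of halvings until n reaches 0,
// i.e. floor(log2(n)) + 1 for n >= 1, so the loop count is O(log(n)).
std::size_t loop_count(std::size_t n) {
    std::size_t count = 0;
    while (n > 0) {
        n /= 2;
        ++count;
    }
    return count;
}

// Hypothetical stand-in for h(n): performs exactly n unit steps, so it is O(n).
std::size_t linear_work(std::size_t n) {
    std::size_t steps = 0;
    for (std::size_t i = 0; i != n; ++i) {
        ++steps;
    }
    return steps;
}

// Same shape as f() on the slide: an O(log(n)) loop around an O(n) body.
std::size_t total_steps(std::size_t n) {
    std::size_t total = 0;
    const std::size_t count = loop_count(n);
    for (std::size_t i = 0; i != count; ++i) {
        total += linear_work(n);
    }
    return total;
}
```

The step count is exactly loop_count(n) * n, which is the O(n.log(n)) product the slide describes.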

Order for standard algorithms

Many standard algorithms have a well-understood order. One of the best known non-trivial examples is probably quicksort which "everyone knows" is O(n.log(n)).

Except when it isn't, of course!

On average it is O(n.log(n))

The worst case is O(n^2)

Also, this is the computational cost, not the memory cost
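
The worst case is easy to provoke with a naive implementation. The sketch below is not any library's quicksort: it is a deliberately simple first-element-pivot version (the name quicksort_count is an assumption) that returns the number of comparisons made.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Naive quicksort on v[lo, hi) using the first element as the pivot.
// Returns the number of element comparisons performed.
std::size_t quicksort_count(std::vector<int>& v, std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) {
        return 0;
    }
    const int pivot = v[lo];
    std::size_t store = lo + 1;  // v[lo + 1, store) holds elements < pivot
    std::size_t comparisons = 0;
    for (std::size_t i = lo + 1; i < hi; ++i) {
        ++comparisons;
        if (v[i] < pivot) {
            std::swap(v[i], v[store]);
            ++store;
        }
    }
    std::swap(v[lo], v[store - 1]);  // move the pivot into its final position
    return comparisons
         + quicksort_count(v, lo, store - 1)
         + quicksort_count(v, store, hi);
}
```

On already-sorted input every partition is maximally lopsided, so n items cost n(n-1)/2 comparisons: 499,500 for n = 1000.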

Order for standard algorithms

The C++ standard mandates the complexity of many algorithms.

For example, std::sort:
"Complexity: O(N log(N)) comparisons."

and std::stable_sort:
"Complexity: It does at most N log^2(N) comparisons; if enough extra memory is available, it is N log(N)."

and std::list::sort:
"Complexity: Approximately N log(N) comparisons"

Order for standard operations

The C++ standard also mandates the complexity of many operations.

For example, container::size:
"Complexity: constant."

and std::list::push_back:
"Complexity: Insertion of a single element into a list takes constant time and exactly one call to a constructor of T."

Order for standard algorithms

.Net lists complexity for some algorithms.

For example, List<T>.Sort:
"On average, this method is an O(n log n) operation, where n is Count; in the worst case it is an O(n ^ 2) operation."

Java does the same

For example, Arrays.sort:
"This implementation is a stable, adaptive, iterative mergesort that requires far fewer than n lg(n) comparisons when the input array is partially sorted, while offering the performance of a traditional mergesort when the input array is randomly ordered..."

Order for standard operations

However, neither Java nor .Net seems to provide much detail for the cost of other operations with containers

This makes it harder to reason about the performance impact of the choice of container and the methods used.

Let's try some experiments

So that's the theory; what happens when we try some of these out in an actual program on real hardware?

YMMV (different clock speeds, amount of memory, speed of memory access and cache sizes)

strlen()

Should be simple enough: O(n) where n is the number of bytes in the string.

    int strlen(char *s) /* source: K&R */
    {
        int n;
        for(n = 0; *s != '\0'; s++)
        {
            n++;
        }
        return n;
    }

Anyone looked inside strlen recently?

strlen() – more than you wanted to know

    strlen:
            mov     rax,rcx                 ; rax -> string
            neg     rcx
            test    rax,7                   ; test if string is aligned on 64 bits
            je      main_loop
            xchg    ax,ax
    str_misaligned:
            mov     dl,byte ptr [rax]       ; read 1 byte
            inc     rax
            test    dl,dl
            je      byte_7
            test    al,7
            jne     str_misaligned          ; loop until aligned
    main_loop:
            mov     r8,7efefefefefefeffh
            mov     r11,8101010101010100h
            mov     rdx,qword ptr [rax]     ; read 8 bytes
            mov     r9,r8
            add     rax,8
            add     r9,rdx
            not     rdx
            xor     rdx,r9
            and     rdx,r11
            je      main_loop
            mov     rdx,qword ptr [rax-8]   ; found zero byte in the loop
            test    dl,dl
            je      byte_0                  ; is it byte 0?
            test    dh,dh
            je      byte_1                  ; is it byte 1?
            shr     rdx,10h
            ...
    byte_1:
            lea     rax,[rcx+rax-7]
            ret
    byte_0:
            lea     rax,[rcx+rax-8]
            ret

strlen()

Naively we compare time for:

    timer.start();
    strlen(data1);
    timer.stop();

The call appears to take no time at all ....

Gotcha: strlen() use can be optimised away if the return value is not used.

It's important to check you're measuring
what you think you're measuring!
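
One common way to keep the call alive is to store the result somewhere the compiler must treat as observable. This is a sketch, not the timer used in the talk: time_strlen and sink are hypothetical names, and std::chrono::steady_clock stands in for whatever timer was actually used.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>

// A write to a volatile object is an observable side effect,
// so the store (and hence the computed length) cannot be discarded.
volatile std::size_t sink;

// Hypothetical helper: times one strlen() call and keeps its result.
std::chrono::nanoseconds time_strlen(const char* s) {
    const auto start = std::chrono::steady_clock::now();
    sink = std::strlen(s);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}
```

This only guarantees the result is used; a compiler may still work out the length of a constant string ahead of time, and a single call is usually too short to time reliably.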

strlen()

Set up a couple of strings:

    char const data1[] = "1";
    char const data2[] = "12345...67890...";

Compare time for v1 = strlen(data1) against v2 = strlen(data2)

Gotcha: strlen() of a constant string can be evaluated at compile time: O(1)

It's important to check you're measuring
what you think you're measuring!
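
The compile-time effect can be imitated explicitly. In the sketch below, const_strlen is a hypothetical constexpr re-implementation (real compilers fold the library strlen() as an optimisation instead), and make_runtime_string is one assumed way of producing data whose length is only known at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical constexpr strlen: for a literal argument the compiler
// can compute the answer during compilation, leaving no loop at run time.
constexpr std::size_t const_strlen(const char* s) {
    return *s == '\0' ? 0 : 1 + const_strlen(s + 1);
}
static_assert(const_strlen("1234567890") == 10, "evaluated at compile time");

// Building the string at run time means its length is not a compile-time constant.
std::string make_runtime_string(std::size_t n) {
    return std::string(n, 'x');
}

// The strlen() call here has to walk the bytes.
std::size_t runtime_strlen(const std::string& s) {
    return std::strlen(s.c_str());
}
```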

strlen() - O(n)

Linear and consistent

strlen() - O(dear)

Discontinuous (and no longer as consistent)

strlen() - zoom in


strlen() - small n

This machine has 64K L1 + 512K L2 cache per core

strlen()

O(n) to a very good approximation for n between cache size and available memory

Small discontinuity around cache size

O(n) when swapping, but the factor 'C' is much bigger (250 – 300 times bigger here)

string::find()

Let's swap over from using strlen() to using string::find('\0')

Exactly the same sort of operation but with a very slightly more generic algorithm

We expect this will behave just like strlen()
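
The slides do not show how the string for this test was built. One possible setup, assuming the terminator is stored as part of the std::string contents, is sketched below; make_terminated and length_by_find are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds n 'x' characters followed by an explicit '\0' that is part of
// the string's contents, so size() == n + 1.
std::string make_terminated(std::size_t n) {
    std::string s(n, 'x');
    s.push_back('\0');
    return s;
}

// Linear scan for the first '\0'; the equivalent of strlen() on the same bytes.
std::size_t length_by_find(const std::string& s) {
    return s.find('\0');
}
```

Without the embedded terminator, find('\0') scans the whole string and returns std::string::npos.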

string::find()

Sorting

Let's start with a (deterministic) bogo sort

    template <typename T>
    void bogosort(T begin, T end)
    {
        do
        {
            std::next_permutation(begin, end);
        } while (!std::is_sorted(begin, end));
    }

NSFW

O(n × n!) comparisons

Sorting

Timings
10,000 items – 1.13ms
20,000 items – 2.32ms
30,000 items – 3.55ms
40,000 items – 4.72ms

O(n) – but ... how?

I cheated and set the initial state carefully

Be very careful about best and worst cases!
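
The talk does not say which initial state was used. One state that gives linear behaviour is descending order: it is the lexicographically last permutation, so a single std::next_permutation call wraps round to ascending order. The sketch below (bogosort_steps and best_case_input are hypothetical names) counts the permutations tried.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Same loop as the bogosort above, but returns how many permutations were tried.
template <typename It>
std::size_t bogosort_steps(It begin, It end) {
    std::size_t steps = 0;
    do {
        std::next_permutation(begin, end);
        ++steps;
    } while (!std::is_sorted(begin, end));
    return steps;
}

// Descending input n, n-1, ..., 1: the last permutation in lexicographic order.
std::vector<int> best_case_input(int n) {
    std::vector<int> v(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        v[static_cast<std::size_t>(i)] = n - i;
    }
    return v;
}
```

By contrast, input that is already sorted is the worst case here: the loop walks through all n! permutations before arriving back at the start.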

Sorting

Timings (randomised collection)

I got bored after 1 items

It looks like we hit a 'wall' at 13/14

Sorting

Timings (randomised collection)

Same graph after ! items

Note: the 'wall' effect depends on scale

Sorting

std::sort

the best known in C++

qsort

the equivalent for C

bubble_sort

easy to explain and demonstrate

stable_sort

retain order of equivalent items

partial_sort

sort 'm' items from 'n'
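
As a reminder of what partial_sort does, here is a minimal sketch using std::partial_sort; the wrapper name smallest_m is an assumption for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the m smallest values of data in ascending order.
// Only the first m positions are fully ordered; the remainder is left
// in an unspecified order and then discarded.
std::vector<int> smallest_m(std::vector<int> data, std::size_t m) {
    assert(m <= data.size());
    std::partial_sort(data.begin(),
                      data.begin() + static_cast<std::ptrdiff_t>(m),
                      data.end());
    data.resize(m);
    return data;
}
```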

Sorting

I must mention AlgoRythmics – illustrating sort algorithms with Hungarian folk dance

https://www.youtube.com/watch?v=ywWBy6J5gz8

Helps to give some idea of how the algorithm works

Also shows the importance of the multiplier C in the formula

Sorting

"I'd like to go back in time and kill the inventor of bubblesort" - Andrei Alexandrescu

Sorting

Granted

Sorting

std::sort is faster than qsort

don't tell the C programmers

You do pay (a little) for stability

partial_sort is a "dark horse" - do you really need the full set sorted?

That was with randomised input

A lot of real data is not randomly sorted

Sorting

bubble_sort's revenge

List or vector?

The complexity of std::sort is the same as std::list::sort – so what's the difference?

Must copy the whole object in a vector

Can just swap the pointers in a list

List or vector?

So at this data size list is over twice as slow as vector to sort but uses just over half as many comparisons

Perhaps measure sort complexity in other terms than just the number of comparisons

However note that the items sorted in this example are quite small (wraps an int)
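
Comparison counts like these can be gathered with a counting comparator. The sketch below is not the talk's benchmark: SortCounts and count_sort_comparisons are hypothetical names, the data is plain int, and the exact counts depend on the standard library implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <random>
#include <vector>

struct SortCounts {
    std::size_t vector_comparisons;
    std::size_t list_comparisons;
};

// Sorts the same n pseudo-random ints in a vector and in a list,
// counting how often each sort calls its comparator.
SortCounts count_sort_comparisons(std::size_t n, unsigned seed) {
    std::mt19937 rng(seed);
    std::vector<int> v(n);
    for (int& x : v) {
        x = static_cast<int>(rng() % 1000000);
    }
    std::list<int> l(v.begin(), v.end());

    std::size_t vector_count = 0;
    std::size_t list_count = 0;
    std::sort(v.begin(), v.end(),
              [&vector_count](int a, int b) { ++vector_count; return a < b; });
    l.sort([&list_count](int a, int b) { ++list_count; return a < b; });

    assert(std::is_sorted(v.begin(), v.end()));
    assert(std::is_sorted(l.begin(), l.end()));
    return {vector_count, list_count};
}
```

Counting comparisons says nothing about elapsed time, which is the point of the slide: the two numbers have to be measured separately.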

List or vector?

The performance will depend on the size of the object being copied

With a bigger object footprint

Same number of comparisons

Same number of pointer swaps (list)

More bytes copied (vector)

Repeat the test with a bigger data structure (we won't display the # of comparisons)

List or vector?

This is what we expect: the performance depends very heavily on the size of the object being copied

So, in this test on this hardware, the break-even point comes at somewhere around 100 bytes for the object footprint

This is bigger than I was expecting

For comparison here is the effect on sorting the list when we change the object footprint

List or vector?

This is less expected: it is about 2 – 3 times slower to sort a list of 1Kb objects than a list of int objects.

The only difference is the memory access pattern: objects are further apart and so cache use is less efficient.

But once you're further apart than a cache line (64 bytes) why does more size still make a difference?

Back to basics

Allocate a range of memory and access it sequentially with 'n' steps of size 'm'.

There is an overall trend, of sorts, with some anomalies

The specifics will vary depending on the hardware you're running on and will depend on both the size and associativity of the various caches
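
The access pattern described above can be sketched as follows. This is an assumed reconstruction, not the talk's test program: strided_sum is a hypothetical name, and the caller is expected to time the call for different n and m.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reads n bytes from buffer, each m bytes after the previous one,
// and returns their sum so the reads cannot be optimised away.
unsigned long long strided_sum(const std::vector<unsigned char>& buffer,
                               std::size_t n, std::size_t m) {
    assert(n == 0 || (n - 1) * m < buffer.size());
    unsigned long long sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += buffer[i * m];
    }
    return sum;
}
```

With m = 1 the reads are contiguous; once m reaches the cache line size every read touches a different line, and larger strides start to interact with cache associativity.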

Back to basics

While the specifics vary, the principle of locality is important

If it is multiplicative with the algorithmic complexity it can change the complexity measure of the overall function

Cost of inserting

Suppose we need to insert data into a collection and the performance is an issue

What might be the effect of using:

std::list

std::vector

std::deque

std::set

std::multiset

Cost of inserting

std::list "constant time insert and erase operations anywhere within the sequence"

std::vector "linear in distance to end of vector"

std::deque "linear in distance to nearer end"

std::set & std::multiset "logarithmic"

We also need the time to find the insert point
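
The split between finding the insert point and performing the insert looks roughly like this. It is a sketch under assumptions: the overloaded name insert_sorted is hypothetical and the element type is int; the talk's own test code is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <set>
#include <vector>

// Vector: O(log n) binary search for the position, then an O(n) shuffle
// of the following elements to make room.
void insert_sorted(std::vector<int>& v, int value) {
    v.insert(std::lower_bound(v.begin(), v.end(), value), value);
}

// List: the insert itself is O(1), but finding the position is a linear walk.
void insert_sorted(std::list<int>& l, int value) {
    auto position = std::find_if(l.begin(), l.end(),
                                 [value](int x) { return x >= value; });
    l.insert(position, value);
}

// Set: find and insert are a single logarithmic operation
// (duplicate values are ignored by std::set).
void insert_sorted(std::set<int>& s, int value) {
    s.insert(value);
}
```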

Cost of inserting

Randomly inserting 10,000 items:

std::list ~600ms

very slow – cost of finding the insertion point in the list

std::vector ~37ms

Much faster than list even though we're copying each time we insert

std::deque ~310ms

Surprisingly poor – spilling between buckets

std::set ~2.6ms our winner!

Cost of inserting

May be worth using a helper collection if the target collection is costly to create

Use std::set as the helper and construct std::list on completion ~4ms

Use a std::map of iterators into the list so list built in right order ~4.8ms

The helper collection will increase the overall memory use of the program
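
The first helper approach might look like the sketch below. The function name build_sorted_list is hypothetical, and std::multiset is used instead of the std::set named above purely so that duplicate values survive; with unique values the two behave the same.

```cpp
#include <cassert>
#include <list>
#include <set>
#include <vector>

// Collects the values in an ordered helper (logarithmic insert each),
// then builds the list in one linear pass once all values are known.
std::list<int> build_sorted_list(const std::vector<int>& values) {
    std::multiset<int> helper(values.begin(), values.end());
    return std::list<int>(helper.begin(), helper.end());
}
```

Both collections exist at the same time just before the function returns, which is the extra memory use mentioned above.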

Cost of sorted inserting

Inserting 10,000 sorted items:

std::list ~0.88ms

Fast insertion (at known insert point)

std::vector ~0.85ms (end) / 60ms (start)

Much faster when appending

std::deque ~3ms

Roughly equal cost at either end; a bit slower than a vector

std::set ~2ms (between vector and deque)

Cost of inserting

What about order notation effects?

If we use 10x as many items:

std::list ~600s (1000x)

std::vector ~3.7s (100x)

std::deque ~33s (100x)

std::set ~66ms (33x)

The find cost for list dwarfs the insert cost, which is often a hidden complexity

Cost of inserting

Can we beat std::set ?

Try naïve std::unordered_set() - very slightly slower at 10K (~2.8ms vs ~2.6ms) but better at 100K (~46ms vs ~66ms)

However, in this particular case we have additional knowledge about our value set and so can use a trivial hash function

Now std::unordered_set() takes ~2.3ms (10K) and ~38ms (100K)
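
The slides do not show the hash function used. A plausible 'trivial' hash for integer keys is the identity function, sketched below with the hypothetical names IdentityHash and FastIntSet.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Hypothetical 'trivial' hash: the value is its own hash.
// Reasonable only when the keys are already well spread, as assumed here.
struct IdentityHash {
    std::size_t operator()(int value) const noexcept {
        return static_cast<std::size_t>(value);
    }
};

using FastIntSet = std::unordered_set<int, IdentityHash>;
```

Whether this helps depends on the library: some standard library implementations already hash integers this way, in which case the default is no slower.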

Conclusion

The algorithm we choose is obviously important for the overall performance of the operation (measured as elapsed time)

As data sizes increase we eventually hit the limits of the machine; the best algorithms are those that involve least swapping

For smaller data sizes the characteristics of the cache will have some effect on the performance

Conclusion

While complexity measure is a good tool we must bear in mind:

What are N (the relevant size) and C (the multiplier)?

Have we identified the function with the dominant complexity?

Can we re-define the problem to reduce the cost?

Making it faster

We've seen a few examples already of making things faster.

Compile-time evaluation of strlen() turns O(n) into O(1)

Can you pre-process (or cache) key values?

Swapping setup cost or memory use for runtime cost

Making it faster

Don't calculate what you don't need

We saw that, if you only need the top 'n', partial_sort is typically much faster than a full sort

If you know something about the characteristics of the data then a more specific algorithm might perform better

strlen() vs find()

Sorting nearly sorted data

'Trivial' hash function

Making it faster

Pick the best algorithm to work with memory hardware

Prefer sequential access to memory

Smaller is better

Splitting compute-intensive data items from the rest can help – at a slight cost in the complexity of the program logic and in memory use
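
One way to read the last point is a hot/cold split, sketched below. The Records type and its members are hypothetical, chosen only to show the layout: the frequently scanned keys sit contiguously, and the bulky payload lives in a parallel container at the same index.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record store: keys are 'hot' (scanned often),
// payload is 'cold' (looked up only when a record is actually needed).
struct Records {
    std::vector<int> keys;             // hot: dense and scanned sequentially
    std::vector<std::string> payload;  // cold: payload[i] belongs to keys[i]

    void add(int key, std::string data) {
        keys.push_back(key);
        payload.push_back(std::move(data));
    }
};

// The hot loop touches only the dense key array.
long long sum_keys(const Records& records) {
    long long total = 0;
    for (int key : records.keys) {
        total += key;
    }
    return total;
}
```

The cost is the one the slide names: two containers to keep in step instead of one.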

Some other references

Scott Meyers at ACCU "CPU caches":
http://www.aristeia.com/TalkNotes/ACCU2011_CPUCaches.pdf

Ulrich Drepper "What Every Programmer Should Know About Memory":
http://people.redhat.com/drepper/cpumemory.pdf

Herb Sutter's experiments with containers:
http://www.gotw.ca/gotw/054.htm

and looking at memory use:
http://www.gotw.ca/publications/mill14.htm

Bjarne Stroustrup's vector vs list test:
http://bulldozer00.com/2012/02/09/vectors-and-lists/ (esp slides 43-47)

Baptiste Wicht's list vs vector benchmarks:
http://www.baptiste-wicht.com/2012/12/cpp-benchmark-vector-list-deque/