Benchmarking Tools (LMBench)
Agenda -
● LMBench introduction
● bw_mem
● bw_pipe
● lat_mem_rd
● lat_ctx
● lat_syscall
● stream
What is LMBench?
● The first three options are the same for most of the benchmarks.
○ [-P <parallelism>]
○ [-W <warmup>]
○ [-N <repetitions>]
○ fcp
○ bzero
○ bcopy
● MEMORY UTILIZATION
○ Can move up to six times the requested memory per process.
○ Two processes - sender and receiver.
○ One read from the source and one write to the destination.
○ The write usually results in a cache line read and then a write back of the cache line at some later point.
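The read-plus-write pattern described above can be illustrated with a small copy loop. This is only a sketch in Python, not lmbench's bw_mem code: the function name bcopy_bandwidth, the buffer contents, and the choice of reporting the fastest repetition in units of 10^6 bytes per second are all assumptions made for illustration, and interpreter overhead means the figure will not match lmbench's.

```python
import time


def bcopy_bandwidth(size_bytes, repetitions=5):
    """Copy size_bytes between two buffers and return (MB/s, destination).

    Hypothetical helper: each copy reads the source once and writes the
    destination once, which is the traffic pattern the slide describes.
    """
    src = bytearray(b"\x01") * size_bytes   # source buffer, read once per copy
    dst = bytearray(size_bytes)             # destination buffer, written once per copy
    best = float("inf")
    for _ in range(repetitions):
        start = time.perf_counter()
        dst[:] = src                        # in-place copy: one read, one write
        elapsed = time.perf_counter() - start
        best = min(best, max(elapsed, 1e-9))  # guard against a zero reading
    # Bandwidth counts only the bytes requested, not the extra cache-line traffic.
    return size_bytes / best / 1e6, dst


if __name__ == "__main__":
    mb_per_s, _ = bcopy_bandwidth(8 * 1024 * 1024)
    print(f"copy bandwidth: {mb_per_s:.0f} MB/s")
```

The reported number counts only the requested bytes; the cache line read triggered by each write is extra traffic that does not show up in it.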
lat_mem_rd
● Memory read latency benchmark.
● Usage -
● This means that the context switch time includes the time for the cache misses on larger processes.
● The output specifies the process size and the non-context-switching overhead of the test.
● The overhead and the context switch times are reported in microseconds.
lat_syscall
● Times simple entry into the operating system
● Usage -
● After the first three options, it takes one of null, read, write, stat, fstat, or open to select the operation to time.
● The last option is the path to a file.
lat_syscall (contd…)
● Explanation of the different options -
○ null - measures how long it takes to do getppid().
○ read - measures how long it takes to read one byte from /dev/zero.
○ write - times how long it takes to write one byte to /dev/null.
○ stat - measures how long it takes to stat() a file whose inode is already cached.
○ fstat - measures how long it takes to fstat() an open file whose inode is already cached.
○ open - measures how long it takes to open() and then close() a file.
● Output format is -
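Separately from lmbench's own output, the six operations listed above can be approximated with the sketch below. It is not lat_syscall itself: the helper names time_op and measure_syscalls, the iteration count, and the use of a temporary file are assumptions for illustration, and it needs a Unix-like system that provides /dev/zero and /dev/null. Python call overhead is included in every figure, so the numbers are useful only for comparing the operations with each other.

```python
import os
import tempfile
import time


def time_op(op, iterations=10000):
    """Return the average time of op() in microseconds over iterations calls."""
    start = time.perf_counter()
    for _ in range(iterations):
        op()
    return (time.perf_counter() - start) / iterations * 1e6


def measure_syscalls(path, iterations=10000):
    """Time the six operations described in the slide against an existing file."""
    zero_fd = os.open("/dev/zero", os.O_RDONLY)
    null_fd = os.open("/dev/null", os.O_WRONLY)
    file_fd = os.open(path, os.O_RDONLY)
    try:
        return {
            "null": time_op(os.getppid, iterations),
            "read": time_op(lambda: os.read(zero_fd, 1), iterations),
            "write": time_op(lambda: os.write(null_fd, b"x"), iterations),
            "stat": time_op(lambda: os.stat(path), iterations),
            "fstat": time_op(lambda: os.fstat(file_fd), iterations),
            # open is timed together with the matching close, as in the slide.
            "open": time_op(
                lambda: os.close(os.open(path, os.O_RDONLY)), iterations
            ),
        }
    finally:
        for fd in (zero_fd, null_fd, file_fd):
            os.close(fd)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        for name, usec in measure_syscalls(tmp.name).items():
            print(f"{name}: {usec:.3f} microseconds")
```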
stream
● Synthetic benchmark
● It measures the performance of the benchmark repeatedly and reports the median result.
● benchmp creates sub-processes which run the benchmark in parallel.
● This allows lmbench to measure the system's ability to scale as the number of client processes increases.
Explaining arguments of benchmp
● Each sub-process calls initialize with iterations set to 0 before starting the benchmarking cycle.
● It then calls initialize, benchmark, and cleanup, with iterations set to the number of iterations in the timing loop, several times in order to collect repetitions results.
● The calls to benchmark are surrounded by start and stop calls, which time how long it takes to do the benchmarked operation iterations times.
● After all the benchmark results have been collected, cleanup is called with iterations set to 0 to clean up any resources which may have been allocated by initialize or benchmark.
● cookie is a void pointer to a hunk of memory that can be used to store any parameters or state needed by the benchmark.
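The calling sequence described above can be sketched as a small single-process harness. This is a simplified illustration in Python, not lmbench's C implementation: the parallel sub-processes are left out, the iteration count is fixed by the caller rather than calibrated, a dict stands in for the void pointer cookie, and the parameter order is an assumption.

```python
import statistics
import time


def benchmp(initialize, benchmark, cleanup, cookie, iterations=1000, repetitions=5):
    """Run the benchmark cycle and return the median time per iteration in seconds."""
    initialize(0, cookie)                  # one-time setup, iterations == 0
    results = []
    for _ in range(repetitions):
        initialize(iterations, cookie)     # per-repetition setup
        start = time.perf_counter()        # plays the role of start
        benchmark(iterations, cookie)      # does the operation iterations times
        elapsed = time.perf_counter() - start  # plays the role of stop
        cleanup(iterations, cookie)        # per-repetition teardown
        results.append(elapsed / iterations)
    cleanup(0, cookie)                     # final teardown, iterations == 0
    return statistics.median(results)


# Hypothetical benchmark: summing a list that is stored in the cookie.
def init_sum(iterations, cookie):
    if iterations == 0:
        cookie["data"] = list(range(100))


def bench_sum(iterations, cookie):
    for _ in range(iterations):
        sum(cookie["data"])


def cleanup_sum(iterations, cookie):
    if iterations == 0:
        cookie.clear()


if __name__ == "__main__":
    per_call = benchmp(init_sum, bench_sum, cleanup_sum, {})
    print(f"median: {per_call * 1e6:.3f} microseconds per call")
```

Checking iterations == 0 inside initialize and cleanup is how the callbacks tell one-time setup and teardown apart from the per-repetition calls.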