
CPU Schedulers Compared

By Graysky, 15-Oct-2011, revision 3
http://repo-ck.com/bench/benchmark.pdf

Abstract and Introduction


We all know that Con Kolivas' Brain Fuck Scheduler (BFS) was designed to provide superior desktop interactivity and responsiveness to machines running it (1). However, it was not explicitly designed to provide superior performance. The purpose of this experiment is to evaluate the Completely Fair Scheduler (CFS) in the vanilla Linux kernel and the BFS in the corresponding kernel patched with the ck1 patchset on different machines, to see if differences exist and to what degree they scale using performance-based metrics, even though these end-points were never within the scope of the primary design goals of the BFS.

Kernel packages used:
Linux-3.0.6-2 (official Arch package)
Linux-ck-3.0.6-2 (unofficial ck-generic package from http://repo-ck.com) (2)
    Package uses bfs v0.406, which is contained in the 3.0.0-ck1 patchset

Phoronix also did some benchmarking using non-latency-based endpoints, about which Con subsequently blogged (3).

(1) http://ck.kolivas.org/patches/bfs/sched-BFS.txt
(2) Additional information available at: https://wiki.archlinux.org/index.php/linux-ck
(3) http://ck-hack.blogspot.com/2011/08/phoronix-revisits-bfs.html

Benchmark Details
The collective benchmark is composed of two tasks:
1. Compilation benchmark using gcc to make bzImage for a preconfigured linux v3.0.6 kernel.
2. x264 video encoding benchmark using HandBrakeCLI to encode a 2 min long, 720p clip (no audio).

Each machine running the benchmark:
Runs Arch Linux x86_64 (4)
Runs a minimal set of daemons including: syslog-ng, network, and netfs.
Runs a BASH script that generates the two aforementioned benchmark datasets, called from /etc/rc.local (see appendix for code).

The aforementioned script:
Runs the two individual benchmarks ten (10) times in total to get a decent number of observations for a statistical comparison. In all cases, the first run is omitted, leaving n=9 for each CPU/benchmark.

(4) Except for the Athlon XP machine which runs Arch i686 due to this CPU lacking 64-bit extensions.
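
The "omit the first run" step described above can be sketched as a small post-processing helper. This is only an illustration, not part of the original analysis: the function name mean_without_first_run is hypothetical, and it assumes the make.txt column layout written by the appendix script (hostname,makeflags,kernel,time(sec),run #,run date) and kernel names that contain no commas.

```shell
#!/bin/bash
# Hypothetical helper (not from the original deck): reads make.txt-style CSV
# on stdin and prints "kernel,n,mean_seconds" for each kernel name,
# skipping the header line and every row whose run # is 1.
mean_without_first_run() {
  LC_ALL=C awk -F',' '
    NR == 1 { next }              # skip the CSV header
    $5 == 1 { next }              # omit the first run of each series
    { sum[$3] += $4; n[$3]++ }    # accumulate time per kernel name
    END {
      for (k in sum) printf "%s,%d,%.3f\n", k, n[k], sum[k] / n[k]
    }
  ' | sort                        # awk array order is unspecified, so sort
}
```

It could be called as mean_without_first_run < /media/data/bench/make.txt once the benchmark has finished; with ten runs per kernel, each output row should report n=9.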

Seven CPUs/Machines Compared


CPUs/Machines (the number after "=" is the number of cores/threads in use):
1. AMD Athlon XP 3200+ (single core) = 1
2. Intel E5200 (dual core) = 2
3. Intel Atom 330 (hyperthreaded dual core) = 4
4. Intel Atom 330 (hyperthreaded dual core w/ HT disabled) = 2
5. Intel X3360 (quad core) = 4
6. Dual Intel E5620 (2x hyperthreaded quad core CPUs on a single board w/ HT disabled) = 8
7. Dual Intel E5620 (2x hyperthreaded quad core CPUs on a single board) = 16

[Figure legend: icons in the original slide mark each physical core, hyperthreaded core, and disabled hyperthreaded core.]

The Data

Make Benchmark Distribution Per CPU


Each panel plots compile time (seconds) vs. kernel (either ck-generic or generic).
Panel subtitles describe the CPU; the # in parentheses is the # of threads used in the make step unless otherwise noted.
(2) means two physical cores.
(2+2) means two physical cores and two hyper-threaded cores.
Lower is better!

[Figure: one panel is annotated "make -j2 was used" and another "make -j1 was used".]

Make Benchmark Distribution Per CPU


Each panel plots compile time (seconds) vs. kernel (either ck-generic or generic).
Panel subtitles describe the CPU; the # in parentheses is the # of threads used in the make step.
(2) means two physical cores.
(2+2) means two physical cores and two hyper-threaded cores.
Lower is better!

[Figure: two panels are annotated "Hyper threading".]

x264 Video Benchmark Distribution Per CPU


Each panel plots encoded frames per second vs. kernel (either ck-generic or generic).
Each panel is a different PC, and the number in parentheses is the number of cores used by HandBrakeCLI.
(2) means two physical cores.
(2+2) means two physical cores and two hyper-threaded cores.
Higher is better!

x264 Video Benchmark Distribution Per CPU


Each panel plots encoded frames per second vs. kernel (either ck-generic or generic).
Each panel is a different PC, and the number in parentheses is the number of cores used by HandBrakeCLI.
(2) means two physical cores.
(2+2) means two physical cores and two hyper-threaded cores.
Higher is better!

[Figure: two panels are annotated "Hyper threading".]

Statistical Relevance of Results

How to Read a Box Plot

Box plots are graphical tools to visualize key statistical measures, such as median, mean, and quartiles. The individual box plot is a visual aid to examining key statistical properties of a variable. The diagram below shows how the shape of a box plot encodes these properties. The range of the vertical scale is from the minimum to the maximum value of the selected column, or to the highest or lowest of the displayed reference points.

Text and images taken from: http://web-player-server.liveintent.iponweb.net/SpotfireWeb/Help/dxpwebclient/box_what_is_a_box_plot.htm
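
As a rough illustration of the measures a box plot summarizes, the following sketch computes the minimum, quartiles, median, mean, and maximum from a list of numbers, one per line. It is an assumption-laden example: the function name box_stats is hypothetical, and the quartiles use linear interpolation between the closest ranks, which may differ from the quartile definition used by the plotting tool behind the original figures.

```shell
#!/bin/bash
# Hypothetical helper: reads one number per line on stdin and prints the
# summary statistics that a box plot encodes.
box_stats() {
  LC_ALL=C sort -n | LC_ALL=C awk '
    # Quantile p (0..1) by linear interpolation over the sorted values v[1..NR].
    function q(p,   h, lo, hi) {
      h = (NR - 1) * p + 1
      lo = int(h)
      hi = (lo < NR) ? lo + 1 : lo
      return v[lo] + (h - lo) * (v[hi] - v[lo])
    }
    { v[NR] = $1; sum += $1 }     # input is already sorted ascending
    END {
      if (NR == 0) exit 1         # nothing to summarize
      printf "min=%.3f q1=%.3f median=%.3f mean=%.3f q3=%.3f max=%.3f\n",
             v[1], q(0.25), q(0.5), sum / NR, q(0.75), v[NR]
    }
  '
}
```

Feeding it the nine retained compile times (or fps values) for one kernel on one machine gives the numbers behind a single box.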

How to Read a Box Plot


The drawing of comparison circles is a way to display whether the mean values for various categories (boxes in the box plot) are significantly different from each other or not. Each circle is drawn with its center at the mean value of the box to which it corresponds. If the circles for different groups do not overlap, the means of the two groups are generally significantly different. If the circles have a large overlap, the means are not significantly different.

In the example above, the sum of sales is shown for a number of different fruits and vegetables. The box for Pears has been marked, which is also indicated in the corresponding comparison circle. The marked comparison circle is shown with a darker border and a transparent fill. The square in the relation indicator under the boxes indicates the marked box, and the lines of the relation indicator go to any boxes that are not significantly different from the marked one.
Text and images taken from: http://web-player-server.liveintent.iponweb.net/SpotfireWeb/Help/dxpwebclient/box_what_is_a_box_plot.htm
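
For readers who want a numeric counterpart to the comparison circles, one option is a two-sample test on the means. The sketch below computes Welch's t statistic and its approximate degrees of freedom for two samples. This is a swapped-in technique chosen for illustration; it is not the method the comparison circles use, and it is not taken from the original analysis. The function name welch_t and its calling convention (two whitespace-separated lists) are hypothetical, and each sample is assumed to hold at least two values.

```shell
#!/bin/bash
# Hypothetical helper: welch_t "a1 a2 ..." "b1 b2 ..."
# Prints Welch's t statistic and the Welch-Satterthwaite degrees of freedom.
welch_t() {
  printf '%s\n%s\n' "$1" "$2" | LC_ALL=C awk '
    {
      # Each input line is one sample; fields are the observations.
      s = 0
      for (i = 1; i <= NF; i++) s += $i
      m[NR] = s / NF                            # sample mean
      ss = 0
      for (i = 1; i <= NF; i++) ss += ($i - m[NR]) ^ 2
      var[NR] = ss / (NF - 1)                   # unbiased sample variance
      n[NR] = NF
    }
    END {
      a = var[1] / n[1]
      b = var[2] / n[2]
      t = (m[1] - m[2]) / sqrt(a + b)
      df = (a + b) ^ 2 / (a ^ 2 / (n[1] - 1) + b ^ 2 / (n[2] - 1))
      printf "t=%.3f df=%.1f\n", t, df
    }
  '
}
```

The resulting |t| would then be compared against a critical value from a t table for the reported degrees of freedom; the sketch deliberately stops short of computing a p-value.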

Statistical Significance of Benchmarks


Each box plot displays the distribution of the nine measurements (n=9) vs. kernel (either ck-generic or generic).
Panel subtitles describe the CPU; the # in parentheses is the # of threads used by the benchmark.
(2) means two physical cores.
(2+2) means two physical cores and two hyper-threaded cores.
Make benchmark: lower is better!
x264 benchmark: higher is better!


Appendix

Other Software Used


gcc: gcc-4.6.1-4 (official Arch package).
handbrake: handbrake-cli-svn-4283 (package from AUR http://aur.archlinux.org/packages.php?ID=29320).
x264: x264-20111001-1 (official Arch package).

Contact the Author


Contact graysky with questions, corrections, or rants: graysky AT archlinux DOT us

Benchmark BASH Script


#!/bin/bash
test_path="/media/data/bench"    # path containing test clip and source code
ramdisk="/tmp/bench"             # where to do the test - select a ramdisk to min hdd usage
x264_file="2m-720p.mpg"          # x264 clip
here="$test_path/linux-3.0"      # location of kernel source
limit="10"                       # number of times to run
MAKEFLAGS="4"                    # number of make flags to use for make (x264 is automatic)

# Bail out early if a required tool is missing.
[[ -z $(which bc) ]] && echo "Install bc to allow calculations" && exit
[[ -z $(which HandBrakeCLI) ]] && echo "Install HandBrakeCLI to allow calculations" && exit

echo -n "Name of this kernel: "
read -e NAME
if [ -z "$NAME" ]; then
    NAME=$(dmesg | grep BFS)
fi

# Create the work area and the result files (with CSV headers) if needed.
[[ ! -d $ramdisk ]] && mkdir $ramdisk
[[ ! -f $test_path/make.txt ]] && echo "hostname,makeflags,kernel,time(sec),run #,run date" > $test_path/make.txt
[[ ! -f $test_path/xdata.txt ]] && echo "hostname,threads,kernel,time(sec),fps,run #,run date" > $test_path/xdata.txt
# The source is copied to $ramdisk/linux-3.0, so test for that directory.
[[ ! -d $ramdisk/linux-3.0 ]] && cp -r $here $ramdisk
[[ ! -f $ramdisk/$x264_file ]] && cp $test_path/$x264_file $ramdisk

# Compilation benchmark: time "make bzImage" $limit times.
kernel() {
    cd "$ramdisk/linux-3.0"
    x=0
    while [ "$x" -lt "$limit" ]; do
        x=$(( $x + 1 ))
        make -j$MAKEFLAGS clean
        RUNDATE=$(date "+%F %T")
        start=$(date +%s.%N)
        make -j$MAKEFLAGS bzImage
        end=$(date +%s.%N)
        diff=$(echo "scale=6; $end - $start" | bc)
        echo "$HOSTNAME,$MAKEFLAGS,$NAME,$diff,$x,$RUNDATE" >> $test_path/make.txt
    done
}

# x264 benchmark: time the HandBrakeCLI encode $limit times and record fps.
video() {
    cd $ramdisk
    work="--input $x264_file"
    extras="--verbose --no-dvdnav --audio none --crop 0:0:0:0"
    preset="--preset=Normal"
    x=0
    while [ "$x" -lt "$limit" ]; do
        x=$(( $x + 1 ))
        RUNDATE=$(date "+%F %T")
        start=$(date +%s.%N)
        HandBrakeCLI $work --output test-$x.mp4 ${extras} ${preset} 2>&1 | tee "test-$x".log
        end=$(date +%s.%N)
        diff=$(echo "scale=6; $end - $start" | bc)
        fps=$(grep "job is" "test-$x".log | sed -n 's/.*is //p' | sed 's/ fps//')
        CPUS=$(grep CPU "test-$x".log | sed 's/ CPU.*$//')
        echo "$HOSTNAME,$CPUS,$NAME,$diff,$fps,$x,$RUNDATE" >> "$test_path"/xdata.txt
        rm -f test*.{mp4,log}
    done
}

kernel
video