I Am, Therefore I Think

"Congress shall make no law abridging the freedom of speech, or of the press"

February 2, 2011 / gus3

Finding the Fastest Filesystem, 2011 Edition

Introduction
In my previous report about journaling filesystem benchmarking using dbench, I observed that a
properly-tuned system using XFS, with the deadline I/O scheduler, beat both Linux's ext3 and IBM's
JFS. A lot has changed in the three years since I posted that report, so it's time to do a new round of
tests. Many bug fixes, improved kernel lock management, and two new filesystems (btrfs and ext4)
bring some new configurations to test.
Once again, I'll provide raw numbers, but the emphasis of this report lies in the relative performance of
the filesystems under various loads and configurations. To this end, I have normalized the charted data,
and eliminated the raw numbers on the Y-axes. Those who wish to run similar tests on their own
systems can download a tarball containing the testing scripts; I'll provide the link to the tarball at the
end of this report.

System configuration

The test system is my desktop at home, an AMD Athlon 64 X2 dual-core 4800+ with 4 gigs of RAM
and two SATA interfaces. The drives use separate IRQs, with the journal on sda using IRQ 20, and the
primary filesystem on sdb using IRQ 21. The kernel under test is Linux 2.6.38-rc2, which now has
built-in IRQ balancing between CPUs/cores. The installed distribution is Slackware64-current. During the
tests, the system was in runlevel 1 (single-user), and I didn't touch the keyboard anytime except during
un-measured warm-ups.

The motherboard chipset supposedly supports Native Command Queuing, but the Linux kernel disables
it due to hardware bugs. Even with this limitation, hdparm -Tt reports about 950 MB/s for cached reads
on both drives, and 65 MB/s buffered disk read for sda, 76 MB/s for sdb. That raw throughput serves my
usual desktop purposes well.

Filesystem options

I made a big improvement over hand-written notes, by formalizing and scripting the filesystem
initialization and mounting options. I also broadened the list of tested filesystems, adding ext2, ext4,
ReiserFS, and btrfs. All filesystems were mounted with at least noatime,nodiratime in the mount
options; this is becoming standard practice for many Unix and Linux sites, where system
administrators question the value of writing a new access time whenever a file is merely read.

A quick perusal of Documentation/filesystems/ in the kernel source tree turned up a treasure trove of
mount options, even for the experimental btrfs. One unsafe option I added, where possible, was to
disable write barriers. Buffered writes can be the bane of journal integrity, so write barriers attempt to
force the drive to write to the permanent storage sooner rather than later, at the cost of limiting the I/O
elevator's benefits. I opted for bandwidth in my short tests, for btrfs and ext4.

btrfs

This filesystem format isn't yet finalized, so it is completely unsuitable for storage of critical data. Still,
it has been getting a lot of press coverage and online comment, with a big boost from Ted Ts'o, who
called it the future of Linux filesystems. Strictly speaking, btrfs isn't a filesystem with a journal. It's a
log-structured filesystem, in which the journal is the filesystem. Btrfs supports RAID striping of data,
metadata, or both, so I opted to enable RAID0 to distribute the I/O load:

mkfs.btrfs -d raid0 -m raid0 ${LOGDEV} ${PRIMARY}

(EDIT: I previously used RAID1, mirroring, instead of RAID0 striping. I have adjusted the results
below.)

The btrfs mount options added nobarrier,space_cache for performance.

ext2
I added ext2, to provide a reference point based on highly stable code. It provided one of the early
surprises in the tests.

mke2fs ${PRIMARY}

The default features enabled in /etc/mke2fs.conf were
sparse_super,filetype,resize_inode,dir_index,ext_attr, with no mount options beyond
noatime,nodiratime.

ext3
mke2fs -O journal_dev ${LOGDEV}
mke2fs -J device=${LOGDEV} ${PRIMARY}

The only addition to the base ext2 features is the journal. The mount options added for this test were
data=writeback,nobh,commit=30.

ext4
The other new Linux filesystem is ext4, which adds several new features over ext2/3. The most notable
feature replaces block maps with extents, which require less on-disk space for tracking the same
amount of file data. The ext4 journal also has stronger integrity checking than ext3 uses. (Another
feature, not used in this test, is the ability to omit the journal from an ext4 filesystem. Combined with
the efficiency of extents, this makes ext4 a strong candidate for flash storage, using fewer writes for
the same amount of file data.)

mke2fs -O journal_dev ${LOGDEV}
mke2fs -E lazy_itable_init=0 -O extents -J device=${LOGDEV} ${PRIMARY}

The features from /etc/mke2fs.conf were
has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize, but the uninit_bg feature was
overridden by specifying -E lazy_itable_init=0 to mke2fs. This reduces extra background work during
the dbench run.

JFS

Just as was the case three years ago, JFS still has no mkfs or mount options useful for testing
throughput. WYSIAYG (What You See Is ALL You Get).

mkfs.jfs -q -j ${LOGDEV} ${PRIMARY}

ReiserFS
I caught a lot of guff three years ago, for omitting ReiserFS from my testing. This time around, I decided
that, if btrfs is good enough to test, even though it's still in beta, then I should be fair to the ReiserFS
community and include it as well. Specifying -f twice skips the request for confirmation, useful for
scripting.

mkreiserfs -f -f -j ${LOGDEV} ${PRIMARY}

Unfortunately, there is no file explaining ReiserFS options in Documentation/filesystems/, and the best
advice in mount(8) uses weasel-words: "This [option] may provide performance improvements in some
situations." Without an explanation of what situations would benefit from the various options, I saw no
point in testing them. Hence, the only non-default option in my ReiserFS testing is the external journal.

XFS

This was the hands-down winner in my previous testing. Designed with multi-threading and aggressive
memory management, XFS can sustain heavy workloads of many different operations. It has many
tunable options for both mkfs.xfs and mount, so the scripted options are the most complicated:

mkfs.xfs -f -l logdev=${LOGDEV},size=256m,lazy-count=1 \
-d agcount=16 ${PRIMARY}

One shortcoming of XFS is its lack of a pointer to an external journal device. As far as I can tell, it is
the only journaled filesystem on Linux to have only a flag specifying whether the journal is internal or
external. If the journal is external, then the mount command must include a valid logdev= option, or
the mount will fail.
I also expanded the mounted journal buffers, with logbufs=8,logbsize=262144. On my computer,
memory management is faster than disk I/O.
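
The mount options named in the sections above can be gathered into one helper for scripting. This is a minimal sketch, not the script from the tarball: the function name mount_opts is hypothetical, and the nobarrier spelling for ext4 is an assumption, since the text says only that barriers were disabled for ext4.

```shell
#!/bin/sh
# Sketch: print the mount options this article lists for each filesystem.
# mount_opts is a hypothetical helper, not part of the author's tarball.
# Usage: mount_opts FSTYPE [LOGDEV]
mount_opts() {
    fstype=$1
    logdev=${2:-}
    base="noatime,nodiratime"    # applied to every filesystem under test
    case "$fstype" in
        btrfs) echo "${base},nobarrier,space_cache" ;;
        ext3)  echo "${base},data=writeback,nobh,commit=30" ;;
        # Assumption: the article only says barriers were off for ext4.
        ext4)  echo "${base},nobarrier" ;;
        # XFS must be told where its external journal lives.
        xfs)   echo "${base},logdev=${logdev},logbufs=8,logbsize=262144" ;;
        # ext2, JFS, ReiserFS: no options are listed beyond the base set.
        *)     echo "${base}" ;;
    esac
}
```

A call might look like mount -t xfs -o "$(mount_opts xfs "${LOGDEV}")" "${PRIMARY}" /mnt/test, where /mnt/test is a placeholder mount point.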

Testing the elevators

The original testing was intended to show the effects of disk I/O elevators and CPU speed on the
various filesystems, using medium and heavy I/O load conditions. Since I ran the original tests, the
anticipatory I/O elevator has been dropped from the Linux kernel, leaving only noop, deadline, and
cfq. This round of testing still shows significant differences between them.
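
For readers who have not switched elevators before, the kernel exposes the choice per disk in /sys/block/DEVICE/queue/scheduler, with the active elevator shown in square brackets. The sketch below is an illustration only; the function names are hypothetical and are not taken from the testing scripts.

```shell
#!/bin/sh
# Sketch: pick the active elevator out of a sysfs scheduler line.
# The kernel marks the elevator in use with square brackets,
# for example "noop deadline [cfq]".
active_elevator() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Switch the elevator for one disk (needs root). The device name is
# whichever block device is under test, such as sdb.
set_elevator() {
    dev=$1
    elevator=$2
    echo "$elevator" > "/sys/block/${dev}/queue/scheduler"
}
```

Typical use would be set_elevator sdb deadline, followed by active_elevator "$(cat /sys/block/sdb/queue/scheduler)" to confirm the change took effect.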

With a 5-thread dbench load, I was surprised to see that ext2 was the consistent winner. Its lack of a
journal makes for less overall disk I/O per operation, at the cost of a longer time to check the
filesystem after an improper shutdown. XFS came in a close second, at roughly 97% of the performance
of ext2.
The rest of the filesystems aren't nearly as competitive. Even with their best elevators, JFS, ReiserFS,
and btrfs have less than half the performance of ext2 or XFS.

When the load increases to 20 threads, XFS is once again the clear winner. Ext4 benefits in the overall
ranking, thanks to extent-based allocation management, a trait it shares with XFS. Ext2 falls to third
place, probably due to the increased burden of managing block-based allocations. Ext3 again comes in
fourth, with block-based allocations and added journal I/O. The clear loser is once again JFS, coming in
at only 40% under heavy load. (More on this later.)

Normalizing the throughput by a filesystem's best elevator shows which filesystems benefit from which
elevators. Oddly, under a 5-process load, the only filesystem to benefit from cfq is JFS on a fast CPU.
As seen above, that isn't enough to make it a strong contender against XFS or any of the native Linux
ext{2,3,4} filesystems.
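
The normalization itself is simple to reproduce: divide each result by the best result recorded for the same filesystem, so that every filesystem's best elevator scores 1.00. The sketch below assumes a hypothetical three-column input (filesystem, elevator, throughput); the layout of the real result files in the tarball may differ.

```shell
#!/bin/sh
# Sketch: scale each result by the best result for the same filesystem.
# Input lines (hypothetical layout): FILESYSTEM ELEVATOR THROUGHPUT
# Output keeps the input order and replaces THROUGHPUT with a 0..1 ratio.
normalize_by_best() {
    awk '
        {
            name[NR] = $1; elev[NR] = $2; rate[NR] = $3 + 0
            # Remember the highest throughput seen for this filesystem.
            if (!($1 in best) || rate[NR] > best[$1]) best[$1] = rate[NR]
        }
        END {
            for (i = 1; i <= NR; i++) {
                b = best[name[i]]
                ratio = (b > 0) ? rate[i] / b : 0
                printf "%s %s %.2f\n", name[i], elev[i], ratio
            }
        }
    '
}
```

Feeding it made-up lines such as "xfs noop 80" and "xfs deadline 100" yields 0.80 and 1.00 for those two rows.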

Here is where the game has changed. The cfq elevator clashed badly with XFS three years ago; it is
now mostly on par with deadline and noop. The XFS developers have put a lot of work into cleaning
up the internals, improving the integration of XFS with the Linux frameworks for VFS and disk I/O. They
still have work to do, as explained in Documentation/filesystems/xfs-delayed-logging-design.txt.
At its best, ReiserFS had only about 1/3 the throughput of the best filesystem, in any tested
configuration. Some mount options could probably improve the throughput, but without clear guidance, I
wasn't going to test every combination to find the best.

Bandwidth saturation
I decided to run a separate series of tests, to see what process loads would saturate the various
filesystems, and how they would scale after passing those saturation points. Using their best elevators,
I tested the throughput of each filesystem under loads from 1 to 10 processes.
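
A sweep like this can be scripted in a few lines. The following is a rough sketch rather than the script from the tarball: the function names, the test directory, and the run time are placeholders, and the parsing assumes dbench ends its report with a line beginning "Throughput" followed by the MB/sec figure. Check dbench --help on your system before relying on the -D and -t flags.

```shell
#!/bin/sh
# Sketch: pull the MB/sec figure out of dbench's final summary line,
# which begins with the word "Throughput".
parse_throughput() {
    awk '$1 == "Throughput" { print $2 }'
}

# Run dbench with 1 to 10 client processes against a mounted test
# filesystem and print "PROCESSES MB/sec" pairs.
# Both arguments are placeholders; the article does not state the run time.
saturation_sweep() {
    testdir=$1
    seconds=$2
    n=1
    while [ "$n" -le 10 ]; do
        rate=$(dbench -D "$testdir" -t "$seconds" "$n" | parse_throughput)
        echo "$n $rate"
        n=$((n + 1))
    done
}
```

For example, saturation_sweep /mnt/test 60 would print ten lines, one per process count, ready for charting.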

The two worst performers were JFS and ext2. JFS peaked at 3 processes, then dropped off badly,
ending up at 33% of its best performance at 10 processes. Ext2 didn't suffer as badly, peaking at 5
processes, then falling only to 75% of its peak. Ext3, ext4, XFS, and ReiserFS didn't suffer
significantly under saturated load, staying mostly within a horizontal trend.
If I had to make a guess why JFS scales so poorly, I can only suppose that, following IBM's
philosophy, it's better to be correct than to be fast.

A special surprise
Btrfs was something of a mystery, hitting a performance valley at 4 processes, then climbing steadily
upward nearly to the end of the test. Given that its raw number under 20 processes was better than its
raw number under 10 processes, I decided to extend its test all the way to 50 processes, hoping to find
its saturation point.

Btrfs managed to scale somewhat smoothly, all the way from 4 to 30 processes. Beyond that, its
performance began to exhibit some noise, while still keeping an upward trend. This is a very impressive
development for a dual-core system. (For the math geeks, the trend line from 4 to 50 is f(x) = 0.34 x^{0.29},
with coefficient of determination R^2 = 0.99.)
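
To get a feel for how gently that curve climbs, the trend line can be evaluated at a few process counts. This sketch assumes the fit is the power law f(x) = 0.34 * x^0.29, with x the number of dbench processes and f(x) the normalized throughput; the function name trend is hypothetical.

```shell
#!/bin/sh
# Sketch: evaluate the fitted trend line f(x) = 0.34 * x^0.29,
# where x is the number of dbench processes.
# Assumption: the trend line quoted in the text is this power law.
trend() {
    awk -v x="$1" 'BEGIN { printf "%.2f\n", 0.34 * x ^ 0.29 }'
}
```

Running trend 4 and trend 50 gives the fitted values at the two ends of the range. With an exponent this far below 1, each added process buys less than the one before it, though the curve never turns downward.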

Conclusion
The Linux filesystem landscape has changed a lot in the past three years, with two new contenders
and lots of clean-up in the code. The conclusions of three years ago may not hold today.
Similarly, what's true on my system may not be the case on yours. If you wish to examine your own
system's behavior, this tarball (CAPTCHA and 30-second wait required) contains the scripts I used for
this article, as well as two PDFs with the raw and normalized results of my own testing.

Basico / Feb 3 2011 5:20 am

You don't give a final recommendation. Which is the best or faster system for the average
user?

gus3 / Feb 3 2011 6:53 am

For the average user, the best one is the one that came with the system
and gets the job done.
For someone who needs continuous, steady performance, it depends on
the hardware configuration. That's why I provide the testing scripts. The
best filesystem might be JFS or btrfs, somewhere else.

Greg Zeng / Feb 8 2011 1:33 am

"Average user" as you mention in the article is an
enterprise system admin. In common usage, average
users are notebook & netbook owners, often with the
latest SATA system (just released) which sys admins do
not yet use; it's too fast & has compatibility problems.
This explains why you do not consider FAT (12, 16, 32),
OS X, nor any of the many versions of NTFS.
You also ignore the fact that national, educational & non-government
agencies do not use Linux as file systems.
The federal Australian government has just reinforced
(forced) the use of Microsoft software onto this nation;
no open software, etc allowed, except after tedious,
expensive legal procedures are attempted by any
government agency.
Retired (medical) IT Consultant, Australian Capital
Territory

gus3 / Feb 8 2011 1:50 pm

I don't mention "average user" in the article. However, a
slightly above-average user can use the scripts I
provide, to test another system's behavior, especially
w.r.t. disk I/O elevator. That one tunable is not nearly as
difficult to adjust as a backup/re-format/restore, and it
can have a significant impact on disk performance. XFS
used to suffer under the CFQ elevator; switching to
deadline was "like getting a new laptop" as one user
put it. (My previous article discusses this.)
My understanding of Macintosh OS X is that it uses
UFS. This wouldn't be impossible to test under Linux,
but it isn't really used as a primary filesystem in the
Linux world. NTFS isn't well-supported at all; the NTFS
write support in the Linux kernel config comes with a
stronger-than-usual "we hope this works right, but tough
cookies if it messes up your data" warning.

FUlano / Feb 3 2011 10:04 am

dbench is not exactly a great filesystem benchmark (any serious fs comparison should
use several benchmarks)

gus3 / Feb 3 2011 1:37 pm

It won't be difficult to substitute a different benchmark into the testing
script. I use dbench for two reasons:
(1) dbench shows clear differences between filesystems, more than any
others I tried. Writing large files on my system will saturate at around 70
M/s, for any filesystem under test.
(2) dbench reports data volume per fixed time, rather than reporting time
for fixed data volume. Other benchmarks perform a fixed number of
operations, and report the time required; I prefer dbench's opposite
approach.

mike / Feb 4 2011 12:08 pm

it would have been fun to include Fat32 and NTFS, maybe HFS+ or whatever OSX uses
now too.
just to see how they level up. :)

Greg Zeng / Feb 8 2011 1:23 am

Check Wikipedia; it's changed a lot since I instigated updates. There are
many versions of NTFS: NTFS-3G (Ms-Win & Mac), & M-$ NTFS (several
versions). With both FAT32 & all versions of NTFS, the published opinions
are that perhaps a cluster size of 4mb is faster.
Most PC users have notebooks & laptops. But his article only is about
system administrators, on enterprise servers. The author wrongly labelled
the posting.

Rafael / Feb 4 2011 12:26 pm

Thanks for sharing these tests. I will try it here, on my scenarios/hardware.
Rafael from Suporte Linux Team

John / Feb 4 2011 2:55 pm

You state, "Btrfs supports RAID striping of data, metadata, or both, so I opted to enable
RAID1 to distribute the I/O load:" but isn't RAID-1 mirroring? RAID-0 is striping AFAIK.
I'd guess that this could have a significant negative effect on your Btrfs performance
results?

gus3 / Feb 4 2011 3:14 pm

Oh dear, you are exactly right. I'll point out my blunder, and re-run the
btrfs tests tonight.

John / Feb 4 2011 4:04 pm

To be fair though, I think all the other filesystems can be striped (raid-0) by using mdraid.
Perhaps there is a valid difference if Btrfs does this natively?

gus3 / Feb 4 2011 4:31 pm

Yes, btrfs does volume management natively, when initializing the
filesystem, or later with btrfs device add/remove.

Jack Ripoff / Feb 7 2011 9:13 am

I'd like to see HammerFS in the comparison as well, too bad it's not been ported to Linux
yet.
