Linux Software RAID5 Performance

Gianluca Cecchi tekka99 at
Tue Apr 2 13:29:55 PST 2002

Which option did you use for the ext3 journaling mode? It makes a difference
whether you use "writeback" or the default "ordered" (see the excerpt from the
ext3 "Changes" file below).
Which I/O benchmark did you use?
Gianluca Cecchi

New mount options:

    "mount -o journal=update"
        Mounts a filesystem with a Version 1 journal, upgrading the
        journal dynamically to Version 2.

    "mount -o data=journal"
        Journals all data and metadata, so data is written twice. This
        is the mode which all prior versions of ext3 used.

    "mount -o data=ordered"
        Only journals metadata changes, but data updates are flushed to
        disk before any transactions commit. Data writes are not atomic
        but this mode still guarantees that after a crash, files will
        never contain stale data blocks from old files.

    "mount -o data=writeback"
        Only journals metadata changes, and data updates are entirely
        left to the normal "sync" process. After a crash, files
        may contain stale data blocks from old files: this mode is
        exactly equivalent to running ext2 with a very fast fsck on reboot.

Ordered and Writeback data modes require a Version 2 journal: if you do
not update the journal format then only the Journaled data mode will be
available.

The default data mode is Journaled for a V1 journal, and Ordered for V2.
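To make the defaulting rule above concrete, here is a minimal Python sketch that works out which data mode a given mount option string implies. The function name and the journal_version parameter are my own illustration, not part of ext3 or mount; the mount output alone does not tell you the journal version, so the sketch assumes V2 unless told otherwise.

```python
def ext3_data_mode(options, journal_version=2):
    """Return the ext3 data mode implied by a comma-separated option string.

    journal_version is a hypothetical parameter for this sketch: the caller
    has to know it from elsewhere, since "mount" does not report it.
    """
    for opt in options.split(","):
        if opt.startswith("data="):
            # An explicit data=... option always wins.
            return opt[len("data="):]
    # No explicit option: per the Changes excerpt, the default is
    # Journaled for a V1 journal and Ordered for a V2 journal.
    return "journal" if journal_version == 1 else "ordered"


if __name__ == "__main__":
    for opts in ("rw", "rw,data=writeback", "rw,data=journal"):
        print(opts, "->", ext3_data_mode(opts))
```

With a plain "(rw)" option string, as in the mount output quoted below, this would report the ordered mode, provided the journal really is V2.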

----- Original Message -----
From: "Michael Prinkey" <mikeprinkey at>
To: <beowulf at>
Sent: Sunday, March 31, 2002 9:33 PM
Subject: Linux Software RAID5 Performance

> Some time ago, a thread discussed the relative performance and stability
> merits of different RAID solutions.  At that time, I gave some results for
> 640-GB arrays that I had built using EIDE drives and Software RAID5.  I
> recently constructed and installed a 1.0-TB array and had some performance
> numbers to share for it as well.  They are interesting for two reasons:
> First, the filesystem in use is ext3, rather than ext2.  Second, the read
> performance is significantly better (almost 2x) than that of the 640-GB
> units.
> The system uses 11 120-GB Maxtor 5400-RPM drives, two Promise Ultra66
> controllers, a P4 1.6-GHz CPU, an Intel 850 motherboard, and 512 MB ECC
> RDRAM.  Drives are configured in RAID5 (9 data, 1 parity, 1 hot spare).
> Four drives are on each Promise controller.  Three are on the on-board
> controller (UDMA100).  A small boot drive is also on the on-board
> controller.  I had intended to use Ultra100 TX2 controllers, but the
> EIDE driver updates with TX2 support are not making it into the latest
> kernels (I'm using 2.4.18), so I opted for the older, slower controllers
> rather than patching.  So, I am both cautious and lazy.  8)
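As a quick sanity check on the layout described above, the usable size of a RAID5 set is the number of active members minus one, times the member size. The Python sketch below is only my own arithmetic on the numbers in the quoted message (11 drives, 1 hot spare, 120 GB nominal, and the 1080546624 blocks shown in /proc/mdstat further down, which md reports in 1 KiB units); the function name is made up for illustration.

```python
def raid5_usable(total_drives, hot_spares, drive_size):
    """Usable RAID5 capacity, in the same unit as drive_size."""
    active = total_drives - hot_spares
    if active < 3:
        raise ValueError("RAID5 needs at least three active members")
    # One member's worth of space goes to parity across the set.
    return (active - 1) * drive_size


# Nominal figure from the hardware description: 11 drives, 1 spare, 120 GB each.
nominal_gb = raid5_usable(11, 1, 120)

# Cross-check against the md0 size quoted from /proc/mdstat (1 KiB blocks).
MDSTAT_BLOCKS = 1080546624
DATA_MEMBERS = 9
per_member_blocks = MDSTAT_BLOCKS // DATA_MEMBERS
array_tib = MDSTAT_BLOCKS * 1024 / 2**40

if __name__ == "__main__":
    print("nominal usable capacity: %d GB" % nominal_gb)
    print("per-member size: %d KiB" % per_member_blocks)
    print("array size: %.3f TiB" % array_tib)
```

Both views agree with the "1.0-TB" description: nine data members' worth of space out of ten active drives.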
> Again, performance (see below) is remarkably good, especially considering
> all of the strikes against this configuration:  EIDE instead of SCSI,
> instead of 100/133, 5400-RPM instead of 7200-RPM, and master/slave drives
> on each port instead of a single drive per port.  With some hdparm tuning (-c
> -u 1), the read performance went from 83 MB/sec to 93 MB/sec.  Write
> performance remained essentially unchanged by tuning at 26 MB/sec.  For
> comparison, the 640-GB arrays gave read performance of about 56 MB/sec,
> write performance of 28.5 MB/sec.
> Had I more time, I would have tested ext2 vs ext3 to ascertain how much
> the change affected performance.  Likewise, I was considering the use of a
> array as the ext3 journal device to perhaps improve write performance.
> Any thoughts?
> Regards,
> Mike Prinkey
> Aeolus Research, Inc.
> ----------------------
> [root at tera /root]# df; mount; cat /proc/mdstat; cat bonnie10.log
> Filesystem           1k-blocks      Used Available Use% Mounted on
> /dev/hda6             38764268   2601128  34193976   8% /
> /dev/hda1               101089      4965     90905   6% /boot
> /dev/md0             1063591944  58195936 1005396008   6% /raid
> raid640:/raid/home   630296592 284066148 346230444  46% /mnt/tmp
> /dev/hda6 on / type ext2 (rw)
> none on /proc type proc (rw)
> /dev/hda1 on /boot type ext2 (rw)
> none on /dev/pts type devpts (rw,gid=5,mode=620)
> /dev/md0 on /raid type ext3 (rw)
> automount(pid580) on /misc type autofs (rw,fd=5,pgrp=580,minproto=2,maxproto=3)
> raid640:/raid/home on /mnt/tmp type nfs (rw,addr=
> Personalities : [raid5]
> read_ahead 1024 sectors
> md0 : active raid5 hdl1[10] hdk1[9] hdj1[8] hdi1[7] hdh1[6] hdg1[5]
> hde1[3] hdd1[2] hdc1[1] hdb1[0]
>       1080546624 blocks level 5, 32k chunk, algorithm 2 [10/10]
> unused devices: <none>
> Bonnie 1.2: File '/raid/Bonnie.1027', size: 1048576000, volumes: 10
> Writing with putc()...         done:  14810 kB/s  88.9 %CPU
> Rewriting...                   done:  22288 kB/s  13.4 %CPU
> Writing intelligently...       done:  26438 kB/s  21.7 %CPU
> Reading with getc()...         done:  17112 kB/s  97.9 %CPU
> Reading intelligently...       done:  93332 kB/s  32.2 %CPU
> Seek numbers calculated on first volume only
> Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
>               ---Sequential Output (nosync)--- ---Sequential Input-- --Rnd Seek-
>               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --04k (03)-
> Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU %CPU
> raid05 10*1000 14810 88.9 26438 21.7 22288 13.4 17112 97.9 93332 32.2   2.1
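For anyone comparing these figures with the 640-GB arrays, here is a small Python sketch of the arithmetic. It takes the block read and write rates from the Bonnie output above and the older array's numbers (56 and 28.5 MB/sec) from the message text. Dividing kB/s by 1000 is an assumption on my part, chosen because it matches the rounding used in the message (93332 kB/s quoted as 93 MB/sec).

```python
# Block-mode results from the Bonnie run above, in kB/s.
NEW_ARRAY_KB_PER_S = {"block read": 93332, "block write": 26438}

# Figures quoted in the message for the older 640-GB arrays, in MB/s.
OLD_ARRAY_MB_PER_S = {"block read": 56.0, "block write": 28.5}


def to_mb_per_s(kb_per_s):
    """Convert kB/s to MB/s (assumes 1 MB = 1000 kB, as the message rounds)."""
    return kb_per_s / 1000.0


def ratio_vs_old(test_name):
    """Throughput of the new array relative to the old one for one test."""
    return to_mb_per_s(NEW_ARRAY_KB_PER_S[test_name]) / OLD_ARRAY_MB_PER_S[test_name]


read_ratio = ratio_vs_old("block read")
write_ratio = ratio_vs_old("block write")

if __name__ == "__main__":
    print("block read:  %.1f MB/s, %.2fx the 640-GB array"
          % (to_mb_per_s(NEW_ARRAY_KB_PER_S["block read"]), read_ratio))
    print("block write: %.1f MB/s, %.2fx the 640-GB array"
          % (to_mb_per_s(NEW_ARRAY_KB_PER_S["block write"]), write_ratio))
```

On these numbers the read rate is roughly 1.7 times that of the older arrays, while the write rate is slightly lower.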