[Beowulf] since we are talking about file systems ...

C. Ahmet MERCAN ahmet.mercan at gmail.com
Tue Jan 17 11:05:34 PST 2006



Joe Landman wrote:
> I created a simple perl code to create lots of small files in a
> pre-existing directory named "dir" below the current directory.  This
> code runs like this
> 
>     dualcore:/local/files # ./files.pl 50000
>     Creating N=50000 files
>     Creating N files took 20 wallclock secs ( 0.57 usr + 13.99 sys =
> 14.56 CPU) seconds
> 
> then looking at the files
> 
>     dualcore:/local/files # time ls dir | wc -l
>     50002
>     
>     real    0m0.131s
>     user    0m0.094s
>     sys     0m0.040s
> 
> also doesn't take much time.  Then again, this may be due to caching,
> the md0 raid device the filesystem is on, or any number of other things.
> 
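For reference, a minimal sketch of a generator along those lines (not Joe's
actual files.pl, which is on his download page, just something close in spirit
using the standard Benchmark module; the dir/fN.dat naming is assumed):

    #!/usr/bin/perl
    # Sketch: create N small files under ./dir and report timing in the
    # format that Benchmark's timestr() prints.
    use strict;
    use warnings;
    use Benchmark;

    my $n = shift || 50000;
    print "Creating N=$n files\n";

    my $t0 = Benchmark->new;
    for my $i (1 .. $n) {
        open my $fh, '>', "dir/f$i.dat" or die "open dir/f$i.dat: $!";
        print $fh "file number $i\n";   # a few bytes each; the real files were ~21 bytes
        close $fh;
    }
    my $t1 = Benchmark->new;
    print "Creating N files took ", timestr(timediff($t1, $t0)), " seconds\n";
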
> What's interesting about this is the amount of wasted space more than
> anything.
> 
> Each file is on the order of 21 bytes or less.  50000 of them should be
> about 1 MB.  Right?
> 
> No.

No, because even a one-byte file consumes a whole filesystem block on disk.
Block size * 50000 is the right estimate of the space actually taken (almost
all of it wasted): with a typical 4 KB block size that is 50000 * 4096 bytes,
roughly 195 MB, which matches the ~197-198 MB that du reports below.
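
The overhead is easy to confirm with Perl's stat(): field 7 is the apparent
size and field 12 the number of 512-byte blocks actually allocated. A quick
sketch (f10011.dat is just the example name from the listing below):

    #!/usr/bin/perl
    # Sketch: compare a file's apparent size with the space actually allocated.
    use strict;
    use warnings;

    my $file = shift || 'dir/f10011.dat';
    my @st   = stat($file) or die "stat $file: $!";
    my ($size, $blocks) = @st[7, 12];

    printf "%s: %d bytes of data, %d bytes allocated (%d x 512-byte blocks)\n",
           $file, $size, $blocks * 512, $blocks;

On ext3 with the default 4 KB blocks, that would report 4096 bytes allocated
for a 21-byte file.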

> 
>     dualcore:/local/files/dir # ls -alF f10011.dat
>     -rw-r--r--  1 root root 21 Jan 17 13:10 f10011.dat
>     dualcore:/local/files/dir # du -h .
>     198M    .
> 
> ext3 isn't any better, giving about 197M.
> 
> Reiser theoretically takes care of stuff like this, though it has enough
> other issues that we won't use it (again).
> 
> Note:  for laughs, I ran the same code (one character modification to
> work under windows) under windows using the late model ActiveState port
> of Perl.  Running on an NTFS file system, fast local disk, 1 GB ram,
> windows XP latest patch updates.  Interesting results.
> 
>     C:\test>perl files.pl 50000
>     Creating N=50000 files
>     Creating N files took 187 wallclock secs ( 6.28 usr + 15.84
> sys         = 22.13 CPU) seconds
> 
> Yes, we had a virus scanner running (who doesn't) under windows.  Ok,
> turn off the virus scanner (McAfee).  This is a *dangerous* way to run
> windows as all of us know.
> 
>     C:\test>perl files.pl 50000
>     Creating N=50000 files
>     Creating N files took 27 wallclock secs ( 2.98 usr +  7.95 sys =    
> 10.94 CPU) seconds
> 
> Ok, that's better, but still not where we need to be.  More importantly, we
> had to turn off the only protection we have against viruses and malware on
> this platform in order to achieve these results.  You can be fast or you can
> be safe on this platform.  You get to pick exactly one of these two
> options.
> 
> Ok, now try CIFS (running on a *fast* SAMBA server).  Had to add a "\\"
> to the sprintf to replace the "\" that worked in windows, which replaced
> the "/" which worked in linux.
> 
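A portable way to sidestep the separator juggling is to build the path with
the core File::Spec module instead of hard-coding "/" or "\\" in the sprintf.
A sketch, again assuming the dir/fN.dat naming:

    #!/usr/bin/perl
    # Sketch: let File::Spec pick the right path separator for the platform.
    use strict;
    use warnings;
    use File::Spec;

    my $i    = 1;
    my $path = File::Spec->catfile('dir', "f$i.dat");  # dir/f1.dat on Linux, dir\f1.dat on Windows
    open my $fh, '>', $path or die "open $path: $!";
    print $fh "file number $i\n";
    close $fh;
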
> Ummm... going on 10 minutes now, and it still hasn't returned.  Looks
> like it is creating 50-80 files per second.  Running a quick ls in that
> directory from the file server itself shows about 36k files out of 50k.
> 
> So I wanted to see if this was a SAMBA server problem.  Ran
> 
>     time smbclient -U landman //crunch-r/big
> 
> from another linux machine.  Logged in.  cd'ed to dir.  typed ls.
> exited.  Even with the interaction time in there (password entry, etc),
> this took *only* 10 seconds wall clock.  Doesn't sound like the SAMBA
> server is the issue.  The machine (PC with windows XP) was not swapping,
> not running anything else, virus checker is off...
> 
>     J:\>perl c:\test\files.pl 50000
>     Creating N=50000 files
>     Creating N files took 2297 wallclock secs ( 5.05 usr + 24.41 sys =
> 29.45 CPU) seconds
> 
> The files.pl code is on my download page
> http://downloads.scalableinformatics.com
>     


