[Beowulf] since we are talking about file systems ...
Robert G. Brown
rgb at phy.duke.edu
Sun Jan 22 10:23:32 PST 2006
On Sun, 22 Jan 2006, PS wrote:
> Indexing is the key; observe how Google accesses millions of files in split
> seconds; this could easily be achieved in a PC file system.
I think that you mean the right thing, but you're saying it in a very
confusing way.
1) Google doesn't access millions of files in a split second; AFAIK it
accesses relatively few files that are hashes (on its "index server")
that lead to URLs in a split second WITHOUT actually traversing millions
of alternatives (as you say, indexing is the key:-). File access
latency on a physical disk makes the former all but impossible without
highly specialized kernel hacks/hooks, ramdisks, caches, disk arrays,
and so on. Even bandwidth would be a limitation: if one assumes block
I/O with a minimum block size of 4K, then 4 KB x 1M files -> 4
gigabytes/second (note BYTES, not bits), which exceeds the bandwidth of
pretty much any physical medium except maybe memory.
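To make the contrast concrete, here's a quick userspace sketch (mine, not
anything Google actually runs; the directory /var/spool/docs and the query
term are made up for illustration) of the difference between scanning data
files at query time and consulting a hash index built ahead of time:

    import os

    # Slow path, shown only for contrast: open every file under a tree
    # and look for the term.  Disk latency makes this hopeless at
    # millions of files.
    def scan_files(root, term):
        hits = []
        for dirpath, _, names in os.walk(root):
            for name in names:
                path = os.path.join(dirpath, name)
                with open(path, errors="ignore") as f:
                    if term in f.read():
                        hits.append(path)
        return hits

    # Fast path: build a hash table once (the "index"), then a query is
    # a single in-memory lookup that touches no data files at all.
    def build_index(root):
        index = {}
        for dirpath, _, names in os.walk(root):
            for name in names:
                path = os.path.join(dirpath, name)
                with open(path, errors="ignore") as f:
                    for word in f.read().split():
                        index.setdefault(word, []).append(path)
        return index

    index = build_index("/var/spool/docs")   # hypothetical corpus location
    print(index.get("beowulf", []))          # no directory traversal here

The query itself does no block I/O at all, which is why neither the latency
nor the bandwidth numbers above ever come into play at lookup time.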
2) It cannot "easily" be achieved in a PC file system, if by that you
mean building an actual filesystem (at the kernel level) that supports
this sort of access. There is a lot more to a scalable, robust,
journaling filesystem than directory lookup capabilities. A lot of
Google's speed comes from being able to use substantial parallelism on a
distributed server environment with lots of data replication and
redundancy, something that is impossible for a PC filesystem facing
latency and bandwidth bottlenecks at a number of points along the
dataflow path to what is typically a single physical disk on a single
(e.g. PCI-whatever) channel.
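To give a feel for the kind of parallelism I mean, here is a toy sketch
(the shard hostnames are invented and the remote call is faked with a local
function, so this is just the shape of the thing, not anybody's real
architecture) of fanning one query out to several index servers at once:

    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical shard servers; on a real cluster each would hold a
    # slice of the index behind its own disks and RAM.
    SHARDS = ["shard0.example.org", "shard1.example.org", "shard2.example.org"]

    def query_shard(host, term):
        # Stand-in for an RPC to a remote index server.
        return [(host, "http://example.org/%s/%s" % (host, term))]

    def fan_out(term):
        # Query all shards in parallel and merge the partial results, so
        # total latency is roughly that of the slowest shard rather than
        # the sum over shards.
        with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
            parts = pool.map(lambda h: query_shard(h, term), SHARDS)
        return [hit for part in parts for hit in part]

    print(fan_out("beowulf"))

A PC with one disk hanging off one bus has nothing comparable to spread
that work across.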
I think that what you mean (correctly) is that this is something
"most" users/programmers would be better off doing in userspace on top
of any general-purpose, known reliable/robust/efficient PC filesystem,
using hashes customized to the application. When I first read your
reply, though, I read it very differently: as saying that it would be
easy to build a Linux filesystem that actually permits millions of
files per second to be accessed, and that this is what Google does
operationally.
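Concretely, something as simple as the following toy sketch, a persistent
userspace hash kept with Python's standard dbm module (the key scheme and
data layout are made up), is usually all an application needs on top of an
ordinary, well-tested filesystem:

    import dbm

    # A userspace hash: application keys map to the path (or offset)
    # where the data actually lives, so a lookup never has to walk a
    # huge directory.
    with dbm.open("app_index", "c") as db:
        db["user:12345"] = "/data/00/12/record-12345.dat"  # hypothetical layout

    with dbm.open("app_index", "r") as db:
        print("record lives at", db["user:12345"].decode())

The filesystem only has to store a modest number of ordinary files
reliably; the application-specific indexing lives entirely in userspace.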
rgb
>
>
> Paul
>
>
> Joe Landman wrote:
>
>> Methinks I lost lots of folks with my points ...
>>
>> Major thesis is that on well designed hardware/software/filesystems, 50000
>> files is not a problem for accesses (though from a management point of view
>> it is a nightmare). For poorly designed/implemented file systems it is a
>> nightmare.
>>
>> Way back when in the glory days of SGI, I seem to remember xfs being tested
>> with millions of files per directory (though don't hold me to that old
>> memory). Call this hearsay at this moment.
>>
>> A well designed and implemented file system shouldn't bog you down as you
>> scale out in size, even if you arguably shouldn't scale that far. It's
>> sort of like your car: if you go beyond 70 MPH somewhere in the US that
>> supports such speeds, your transmission shouldn't just drop out because
>> you hit 71 MPH.
>>
>> Graceful degradation is a good thing.
>>
>> Joe
>>
>>
>
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu