[Beowulf] Computation on the head node
Joe Landman
landman at scalableinformatics.com
Sun May 18 11:12:31 PDT 2008
Perry E. Metzger wrote:
> Joe Landman <landman at scalableinformatics.com> writes:
>> [...]
>> We approach it from a different view. Start out with a very fast RAID
>> and attach good networking to it. Here is a RAID6 across 12 disks.
>> [...]
>
> I have to emphasize, yet again, that the correct approach is not to
> assume that you need a fast file system (or that you can make do with
> a slow one) until you've actually benchmarked the application that
> your cluster is intended to serve. One can have no idea without
Perry, you are "preaching to the choir" in general. These are points we
have made for quite some time on this group and elsewhere. The only
app that matters is your app. Not spec-*, not NAS parallel B/C/... , ...
This said, we (collectively) also have some experience dealing with I/O
issues on clusters. This is, as it turns out, a hard problem. It is a
problem exacerbated by fast networks that can sink and source large data
sets. Moreover, the problem is non-trivial in a number of cases: as you
scale up, you discover new and exciting points of serialization that
have not been exposed before.
This is not just true of I/O but also of CPUs, of memory systems, etc.
Eight processor cores sharing a single memory system can have an impact
on codes. Some algorithms map well onto these systems, some do not.
> checking. A very fast RAID array may be in order -- or it may be
> completely unnecessary. One can't know without understanding one's
> application intimately, and that requires testing.
Of course. But there are quite a few people/groups on this list with
decades of HPC experience who might have an inkling whether a USB or
similar connected drive "is a good idea" for an app, even prior to
running it. Benchmarking is important, but it is important that the
benchmark represent real runs. Experience can provide a rough guide
when no benchmark data is available. With clusters, you run into the
very real problem of I/O resource contention, quite quickly. Putting
lower end I/O devices in there rarely makes sense. Sure, you can
benchmark it, and you should if possible. But it is also not a bad idea
to listen to people who have been working on this stuff for a while.
They might have a clue about these things.
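
For anyone who wants a quick feel for the contention point before a
real benchmark of the real app is possible, here is a rough sketch. It
is not a substitute for running your own code. The function name, file
names and sizes are all made up for illustration, and threads on one
machine only stand in for many nodes hitting one storage target. It
times N concurrent writers (each fsync'ing its file) and reports the
aggregate rate, so you can see whether the total scales or flattens as
writers are added.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK = b"\0" * (1 << 20)  # 1 MiB buffer reused by every writer


def _write_file(path, mib):
    """Write `mib` MiB to `path` and force it to stable storage."""
    with open(path, "wb") as f:
        for _ in range(mib):
            f.write(CHUNK)
        f.flush()
        os.fsync(f.fileno())  # otherwise we mostly time the page cache


def aggregate_write_mib_per_s(directory, writers, mib_per_writer):
    """Aggregate MiB/s for `writers` concurrent writers in `directory`.

    Hypothetical helper for illustration only. File writes release the
    GIL, so plain threads are enough to generate concurrent I/O.
    """
    paths = [
        os.path.join(directory, f"bench_{os.getpid()}_{i}.dat")
        for i in range(writers)
    ]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=writers) as pool:
        # list() forces completion and re-raises any worker exception
        list(pool.map(lambda p: _write_file(p, mib_per_writer), paths))
    elapsed = time.perf_counter() - start
    for p in paths:  # clean up outside the timed region
        os.remove(p)
    return writers * mib_per_writer / elapsed


if __name__ == "__main__":
    # Point this at the file system you care about, not /tmp, and use
    # sizes comparable to what the application really writes.
    with tempfile.TemporaryDirectory() as d:
        for n in (1, 2, 4):
            rate = aggregate_write_mib_per_s(d, n, 4)
            print(f"{n} writers: {rate:.1f} MiB/s aggregate")
```

If the aggregate number stops growing (or drops) as writers are added,
that is the kind of serialization point discussed above. The sizes here
are far too small to mean anything on real hardware.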
> Picking hardware without knowing the app in detail is like deciding
> that the right vehicle for you is a formula 1 race car before
> discovering your application is hauling 50 passengers across town
> through heavy traffic, or picking a bus before discovering your
> application is breaking a land speed record.
... and we see this happen all the time. Not picking particular models
as much as picking particular brands, or systems ill-suited for the
tasks at hand, or ill-configured for the tasks, ...
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 866 888 3112
cell : +1 734 612 4615