[Beowulf] Purdue Supercomputer

Mark Hahn hahn at mcmaster.ca
Sat May 3 23:41:08 PDT 2008

> Does anyone know what is the detailed plan for building that thing with 200
> people in just 1 day?

I'm guessing it's mainly just the monkey work.  I've heard that Dell always
delivers each server in a separate box, so the most annoying part of 
building a Dell cluster is unboxing, racking and managing the detritus.

812 servers and 200 people is only about 4 servers per person for the day,
which seems quite generous.
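
As a rough check of that arithmetic, here is a minimal Python sketch. The
server and head counts come from the thread; the 8-hour working day is purely
an assumption for illustration, and the function names are hypothetical.

```python
SERVERS = 812  # from the thread
PEOPLE = 200   # from the thread


def servers_per_person(servers: int, people: int) -> float:
    """Average number of servers each person has to handle."""
    return servers / people


def minutes_per_server(servers: int, people: int, hours: float) -> float:
    """Time budget per server if the work is spread evenly over `hours`."""
    return hours * 60 / servers_per_person(servers, people)


if __name__ == "__main__":
    print(f"{servers_per_person(SERVERS, PEOPLE):.2f} servers per person")
    # ASSUMPTION: an 8-hour day; the thread does not say how long "a day" is.
    print(f"{minutes_per_server(SERVERS, PEOPLE, 8):.0f} minutes per server")
```

Under that assumed 8-hour day, each person would have roughly two hours per
server, which is why the staffing looks generous.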

> I am very curious to understand what things can be done in parallel, what
> things are serialized from the point of view of installation, testing and
> evaluation/assessment.

from the article, it sounds like the 200 people will mainly be unboxing,
perhaps applying the rail kit, transporting to the machineroom.  it would 
make more sense to just have them rack directly, with one other person 
stationed at the back of each rack handling cabling.  I'm guessing that 
the cluster uses a leaf-switch-in-rack design, and that the rack arrangement
and cabling are done ahead of time, separately.  I can't imagine why, as soon
as a server is in the rack and cabled, it couldn't PXE-boot a test config.
if you fill the rack in some well-defined order, you can easily enough keep
track of the mapping between physical location and network node.
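
One way to picture that bookkeeping is the sketch below. It assumes each rack
is filled from slot 1 upward and that each server PXE-boots as soon as it is
cabled, so the order in which new MAC addresses show up behind a rack's leaf
switch matches the fill order. The `Location` type, the `map_rack` function
and the `rXXnYY` hostname scheme are all hypothetical, not anything described
in the article.

```python
from typing import Dict, List, NamedTuple


class Location(NamedTuple):
    rack: int
    slot: int
    hostname: str


def map_rack(rack_id: int, macs_in_fill_order: List[str]) -> Dict[str, Location]:
    """Assign slot numbers and hostnames to MACs in the order they appeared.

    ASSUMPTION: the first MAC seen in this rack is the server in slot 1,
    the second is slot 2, and so on (a well-defined fill order).
    """
    mapping: Dict[str, Location] = {}
    for slot, mac in enumerate(macs_in_fill_order, start=1):
        hostname = f"r{rack_id:02d}n{slot:02d}"  # hypothetical naming scheme
        mapping[mac.lower()] = Location(rack_id, slot, hostname)
    return mapping


if __name__ == "__main__":
    seen = ["00:11:22:33:44:01", "00:11:22:33:44:02", "00:11:22:33:44:03"]
    for mac, loc in map_rack(7, seen).items():
        print(mac, loc.hostname, f"rack {loc.rack} slot {loc.slot}")
```

If a server is racked out of order, the mapping is wrong from that slot
onward, so the fill order really does have to be well-defined.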

> Even monitoring the progress, identifying critical tasks, balancing the
> workforce, having several B,C plans in case plan A fails.

you make it sound hard and uncertain.  it's not.  doing it in one day with
200 people is basically just a stunt...

> What is a day here: a business day or 24 hours?

if everyone knew what they were doing, I can't imagine why it would take more
than a few hours to build, even counting the elevator ride.  (presumably ~50
per elevator trip.  but more importantly, the elevator partitions the
workforce.)  if the event also includes setting up the racks, cabling the 
interconnect, and infrastructure servers, etc, it would be more impressive.
