[Beowulf] Announcing nettee 0.1.4

Felix Rauch Valenti felix.rauch.valenti at gmail.com
Tue May 3 21:26:08 PDT 2005


On 5/4/05, David Mathog <mathog at mendel.bio.caltech.edu> wrote:
> > Can you elaborate on the features that you have added to Dolly?
> 
> Hmm, well, I sent some changes for dolly to Felix and then
> the most recent version of that was mutated into nettee.

Since I'm now working in a slightly different area, I don't have much
time anymore to maintain Dolly. I appreciate David's efforts to keep
the idea of Dolly alive, especially since Dolly's pioneering efforts
are credited ;-)

> Both nettee and dolly put a really heavy load on the switch
> and are of course completely the wrong solution for a shared
> network (if anybody still has those).

Dolly and nettee are actually good tests to find out whether your
switch can deliver the performance it's supposed to deliver. We once
got a free upgrade to a much more expensive switch after we proved to
the manufacturer that the performance listed in the data sheet was
greatly exaggerated.

> I've only got 20 nodes.  I have no idea how this daisychain works
> when the chain is 100 or more nodes long.

We regularly used Dolly on a 128-node cluster at ETH Zurich. The main
problem was to find all the faulty nodes before cloning, but the
throughput was the same with 16 nodes and with 128 nodes (once we got
the decent switch). The performance scaled perfectly, at least for
large files, where the few seconds of startup overhead (for the ssh
commands that start Dolly on all nodes) didn't matter.
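
As a rough illustration of why chain length barely matters for large
files, here is a toy model of a pipelined daisy chain. It is only a
sketch under stated assumptions, not a description of Dolly's actual
internals: the function name, the fixed chunk size, the link rate and
the startup time are all hypothetical, and it assumes the switch gives
every link its full rate at the same time.

```python
def chain_clone_time(image_bytes, link_bytes_per_s, receivers,
                     chunk_bytes=1_000_000, startup_s=0.0):
    """Estimate seconds until the last node in the chain holds the image.

    Toy model (assumption): each node forwards a chunk to its successor
    as soon as it has received it, and all links run at full rate.
    """
    if receivers < 1:
        raise ValueError("need at least one receiving node")
    # Time to stream the whole image over a single link.
    stream_s = image_bytes / link_bytes_per_s
    # Each additional hop delays the stream by one chunk transfer.
    fill_s = (receivers - 1) * chunk_bytes / link_bytes_per_s
    return startup_s + stream_s + fill_s


if __name__ == "__main__":
    # Hypothetical numbers: 2 GB image, 10 MB/s per link, 5 s ssh startup.
    for n in (16, 128):
        t = chain_clone_time(2_000_000_000, 10_000_000, n, startup_s=5.0)
        print(f"{n:4d} nodes: about {t:.1f} s")
```

In this model the only cost of a longer chain is the pipeline fill
time, one chunk transfer per extra hop, which is small next to the
time needed to stream a large image. For small files the fixed
startup and fill terms dominate instead.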

- Felix



