diskless clients? beowulf-newbie seeks advice

Fri Jun 22 16:01:40 PDT 2001

Hi,

On Friday 22 June 2001 18:56, Brian LaMere wrote:
> couple of follow-up questions....
>
> ********************
> I personally try to advocate diskless clients whenever I get a chance.
> There are several reasons for this:
> 1.	Administration is much easier if you don't have local data or OS or
> anything that won't be fixed by a simple power cycle.  If a problem
> persists through that, you know it's hardware (I guess it could be BIOS
> setup, but I'd consider that hardware).

I agree. OTOH, the cluster setup is more complicated, IMHO.

> ********************
> this is easy enough to automate though...a single shell script that rcp's
> the changed files to all the nodes (for configuration files).  Then I was
> thinking of nfs exporting any applications that are needed...simply install
> them into /usr2, and share out /usr2...wouldn't that handle that?
>
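
A rough sketch of the "push changed files to every node" idea, in Python rather than shell. The node names and file list are hypothetical, and the script only builds and prints the rcp command lines instead of running them, so treat it as an illustration and not a finished tool:

```python
from typing import List


def build_push_commands(nodes: List[str], files: List[str],
                        copy_cmd: str = "rcp") -> List[List[str]]:
    """Return one copy command (as an argv list) per node/file pair.

    Each file is copied to the same absolute path on the remote node,
    e.g. ['rcp', '/etc/hosts', 'node01:/etc/hosts'].
    """
    return [[copy_cmd, path, f"{node}:{path}"]
            for node in nodes
            for path in files]


# Hypothetical node names and changed files, for illustration only.
nodes = [f"node{i:02d}" for i in range(1, 4)]
commands = build_push_commands(nodes, ["/etc/hosts", "/etc/fstab"])

for argv in commands:
    # Dry run: print the command. To execute it for real, one could use
    # subprocess.run(argv, check=True) instead.
    print(" ".join(argv))
```

Passing copy_cmd="scp" would build the same command lines for scp, which takes the same source and host:path arguments.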
> ********************
> 2.	You do save money on hard disks, but have to spend a little more on
> NICs.  What this does mean though, is that you can put a more advanced
> storage system on the server (or network attached storage).  Putting
> SCSI Raid or such on the server will increase performance and / or
> reliability (I think Gig E is faster than most IDE drives and I know
> Myrinet is, so you should improve total system performance).
> ********************
>

I partially agree. What are your thoughts on memory management? Swapping over
the network would greatly decrease performance, so in my opinion you are
limited to computations that don't consume much memory.
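
To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It assumes the ideal case (no protocol overhead, server link shared evenly among the active nodes), and the local disk figure is an assumption for comparison only, not a measurement:

```python
def per_node_mb_per_s(link_mbit: float, active_nodes: int) -> float:
    """Ideal per-node share of a server link in MB/s.

    Converts megabits to megabytes (divide by 8) and splits the result
    evenly among the nodes using the link at the same time.
    """
    return link_mbit / 8 / active_nodes


# Assumed sustained throughput of one local IDE disk (illustrative only).
ASSUMED_LOCAL_DISK_MB_PER_S = 20.0

for link_mbit in (100, 1000):
    for active_nodes in (1, 16):
        share = per_node_mb_per_s(link_mbit, active_nodes)
        verdict = "faster" if share > ASSUMED_LOCAL_DISK_MB_PER_S else "slower"
        print(f"{link_mbit:4d} Mbit/s link, {active_nodes:2d} nodes: "
              f"{share:7.2f} MB/s per node ({verdict} than the assumed disk)")
```

Under these assumptions a single node on Gig E gets more than the assumed local disk, but as soon as many nodes page or read at once the per-node share of the server link drops well below it.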

> well, we do use 15k rpm scsi drives, but yeah....ya want to do as few reads
> as possible even then.  Just seems with a potential of there being 100 of
> these nodes, I very well may have to boot one every few days...and that
> would be lots of traffic.  At that point, is it advisable to have a worldly
> node set aside for doing the boot-up?  Hell, I could have everything
> bootstrap off their 100bt ports, then actually work off a gig-ethernet
> port...hmmm...
>
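
For the boot traffic question, a rough estimate may help. The sketch below assumes a hypothetical 32 MB boot image (kernel plus root filesystem) and an ideal link with no protocol overhead, so the real numbers would be worse, but it shows the difference between rebooting one node and booting the whole cluster at once:

```python
def boot_transfer_seconds(image_mb: float, nodes: int, link_mbit: float) -> float:
    """Ideal time in seconds to send one boot image to each of `nodes`
    clients through a single server link of `link_mbit` Mbit/s."""
    total_megabits = image_mb * 8 * nodes
    return total_megabits / link_mbit


IMAGE_MB = 32  # hypothetical boot image size, for illustration only

for nodes in (1, 100):
    for link_mbit in (100, 1000):
        secs = boot_transfer_seconds(IMAGE_MB, nodes, link_mbit)
        print(f"{nodes:3d} node(s) over {link_mbit:4d} Mbit/s: {secs:7.2f} s")
```

With these assumed figures, a single node rebooting every few days is a few seconds of traffic on a 100 Mbit link; only a simultaneous boot of all 100 nodes would keep the boot server busy for minutes.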
> **************************
> 3.	In theory, you shouldn't be writing to the local disks in a program
> anyway.  This will slow your computations waaaay down.  Same thing for
> swapping to disk.
> 4.	You only have one copy of data, so not only do you save in total
> storage costs (assuming cost/byte is fixed), but you also only have one
> copy of the data so you don't have to worry about inconsistencies
> between nodes (there are other ways to get the hard drives consistent,
> but they are a bit of a pain).  Also, as a sys-admin it is nice to only
> have one place to look if there are problems.
> *************************
> the data would obviously be shared out, yeah.  With a possibility of a
> multi-terabyte database that the cluster is querying in a year, there's no
> way I'd put a terabyte on each node..hehe.  I mean, IBM can say all they
> want about low-end drives being 400gigs in a year or so, but...  point
> being, whether they are diskless or not they'll be pulling in all their
> data locally.  It's just the OS that would be local, and perhaps any
> applications I'd want on each node.
>
> **************************
> 5.	You can call it intuition if you like (although it's based on facts),
> but I really think this is the way clusters are going.  This is the way
> that big systems like the Cray T3E work.  It's a lot simpler for
> programmers (It took me a while to explain how to write to local disks
> vs. server disks).  It's also just a lot more elegant, which may sound
> like a cop out, but most good solutions are elegant.

IMHO Crays are a different beast. Each micro has access to the main memory via
a high-speed bus. That's not my situation (100 Mbps) and probably not the
situation of most Beowulfs.

> *************************
> Elegance is certainly not a cop-out in my book.  The more elegant something
> is, the better it works...this is almost always true.
>
I agree. Making the compute nodes diskless focuses them as purely
computational nodes, not just some boxen tied to a network.

> **********************************
> Having said all that, it is important to note that diskless nodes are
> not for everyone.  In fact our cluster is not diskless, and we aren't
> looking at getting diskless nodes any time soon (give me a few years).
> Right now it doesn't meet our needs since we need the local disk and we
> have our disked cluster working fine.
>
> Hope this helps a little.
> Jared
> *************************************

Just my $0.02

Cheers

Pedro

>
> Any tips are helpful.  I'm just sittin here trying to decide which would be
> better for -our- particular application.  When is it better for there to be
> disked-clients?  Is diskless pretty much something I should obviously do
> considering the fact that the cluster will be querying a huge database
> anyway?
>
> Brian LaMere
> Diversa
>
> Brian LaMere wrote:
> > why does every guide around talk about diskless clients?  I mean...disks
> > are stinkin cheap nowadays...
> >
> > I have ~$150,000 to make a test cluster (with WAY more if the test
> > cluster shows worth) but the boss-man wants to go with nodes which aren't
> > exactly "commodity" in my book.  dual p3-1000 with 1.25Gb ram, 15krpm
> > 18Gb drives. The things cost $8k+ each...tried to explain that 148 $1k
> > machines would way out perform 16 $8k machines, but...oh well.  These
> > boxes take up 1u, which seems to be their main selling point (HP's
> > lp1000r).  Fortunately, these boxes are down to $6.5k now in cost
> > (dropped a bit since we bought them a couple months back), but still...
> >
> > on to my point.  Getting PVM to see everyone as one happy little family
> > was easy enough.  Got the network guys to isolate the little guys, so
> > that only the worldly node could see them, since I wasn't happy with
> > opening up everything and simply putting a little all:all in hosts.deny,
> > and having that be all the security I had.  But every guide that I've
> > found has been all about diskless nodes for a beowulf.  And this isn't
> > really a beowulf with just pvm (and soon lam-mpi and mpich), right?  I
> > personally thought that the network nfs/tftp traffic would be horrible
> > if they were all diskless clients...
> >
> > so the real question:  I can put gig-e cards in the boxes instead of hard
> > drives...right now they just have 2 100bt enet connections.  I'm only
> > using one of the enet ports at the moment, too.  Would I be better with
> > no disks, and gig-e instead?  Some of the concerns I have here: though
> > we're only starting with a hundred gigs or such of data, we'll be at
> > multi-terabyte within a year.  To be throwing around data that large,
> > while nfs'ing the OS filesystems (on the clients) just seems like a lot
> > for the boxes to do.  Am I looking at it wrong?  Also, for cost reasons
> > we may be doing our data storage on something as tacky as network
> > attached storage; we were looking at some NetApp boxes, but went with
> > some EMC boxes instead.  Note I'm not talking about a symmetrix box or
> > something (I already have one of those housing my oracle data), but
> > instead an EMC product called an "ip4700."  Not all that impressed with
> > it.
> >
> > Just a little genetics research firm, needing some serious horsepower to
> > start running big hammer and blast jobs.  The data we have now is just
> > the bare minimum we need to get by, but if we had things like a working
> > beowulf the scientists upstairs would start making, since they'd be able
> > to use it, much more data.  They hired me on as the unix guy here knowing
> > I don't know squat about beowulfs, but that I really want to learn :)
> > Got "how to build a beowulf" <grin> and I've read the manuals for pvm,
> > mpich, lam-mpi, etc, and several other beowulf how-to guides.  All are
> > about diskless.  Is diskless better?  Is it just better because it's
> > cheaper?  Are there other reasons it's better?  Would having
> > gig-ethernet in the boxes instead of hard drives be far better
> > performance-wise?
> >
> > Brian LaMere
> > Diversa