[Beowulf] best archetecture / tradeoffs
Joe Landman landman at scalableinformatics.com
Sat Aug 27 16:22:08 PDT 2005
Mark Hahn wrote:

> swap across the network is asking for trouble.

Yes it is. Especially if you are swapping 4k pages :(

> you should evaluate whether
> you actually need swap at all. I advocate having a disk in nodes to handle
> swap, actually, even though I'd rather *boot* as if diskless.

Swap is not something you should normally touch during a run, unless
your runs have grown larger than ram. More in a second.

>>>the nfs drive or is it just back into memory? What is the best ( fastest
>>>) way to handle swap on diskless nodes that might sometimes be
>>>processing jobs using more than the physical RAM?
>
> you need to seriously rethink such jobs, since actually *using* swap is
> pretty much a non-fatal error condition these days.

I disagree with this classification (specifically the label you
applied). Using swap means IMO that you need to buy more ram for your
machines. There is no excuse to skimp on ram, as it is generally
inexpensive (up to a point, just try to buy reasonably priced 4GB sticks
of DDR single/dual ranked memory).

You could argue memory leak, but every so often I have a customer call
me up to tell me how slow a machine got when they overcommitted memory
as they ran a huge job (10x their old jobs). At 100 MB/s bandwidth
versus 3000 MB/s bandwidth, and a latency that is 4 orders of magnitude
higher, swap is definitely not the place to go if you can avoid it. But
turning it off completely could create some other rather exciting
problems.

>>conditions. Networked remote disk even more so, if you manage to work
>>this out.
>
> actually, swap over the network *could* make excellent sense, since,
> for instance, gigabit transfers a page about 200x faster than a disk
> can seek.
> (I'm assuming that the "swap server" has a lot of ram ;)

The disk seek time is on the order of 8 ms while the bandwidth is on the
order of 60+ MB/s per disk, while the gigabit has a "seek time" about
the same (if you are swapping to a local or remote file system or disk,
you still need to pay the seek time unless you are running in
asynchronous mode), and the bandwidth is on the order of 20-90 MB/s.
Plus you get to pay some additional bonus latencies. Add to this that it
is very easy to tweak local swap across 2 disks to get > 100 MB/s swap
transfers at the same latency as a single disk.

I usually classify this in the "local disk is almost always fastest"
rule (which some folks disagree with, but never indicate data to the
contrary).

The take home messages are

  a) avoid swap if possible
  b) if you cannot, swap at the fastest possible speed (e.g. locally).

Now if we could get us some nice 4MB size pages ....

>>>Also, is it really true you need a separate copy of the root nfs drive
>>>for every node? I don't see why this is. I have it working with just one
>
> certainly not! in fact, it's basically stupid to do that. my diskless
> clusters do not have any per-node shares, though doing so would simplify
> certain things (/var mainly).

You might want to clarify this a bit, because this is an important
point. That is, for the N machines you install, some directories are
going to be identical across similar ABI machines (/bin, /sbin, /lib,
/usr/, ...) while there may be minimal variations in others (/etc, /var,
...).

[...]

>>system just wrote. So rolling your own single-exported-root cluster can
>>work, or can appear to work, or can work for a while and then
>>spectacularly fail, depending on just what you run on the nodes and how
>>they are configured.
>
> sorry Robert, but this is FUD. a cluster of diskless nodes each mounting
> a single shared root filesystem (readonly) is really quite nice, robust, etc.
http://onesis.org

>>There are, however, ways around most of the problems, and there are at
>>this point "canned" diskless cluster installs out there where you just
>>install a few packages, run some utilities, and poof it builds you a
>>chroot vnfs that exports to a cluster while managing identity issues for
>
> canned is great if it does exactly what you want and you don't care to
> know what it's doing. but the existence of canned systems does NOT mean
> that it's hard!

The vast majority of the canned systems adhere to particular
philosophical tenets. Some insist upon a RedHat-like OS, so anything
not supported by this is simply unsupported (such as SATA, Firewire,
XFS, .... (long list of good technology) ). Some insist upon other
things which range from neat ideas to highly questionable ones. Some
require significant kernel/glibc changes which render them slightly
incompatible at the binary level.

From the commercial view of this, most end users just want a simple to
maintain machine (they view a cluster as a single machine for the most
part) that runs, with no surprises, and just works. I am not aware of
any of the canned systems that do this while also meeting the criteria
that they require in terms of flexibility of distribution choice (some
people have distribution constraints based upon their purchased software
support requirements), breadth of hardware support, support for a wide
array of infrastructure elements...

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615
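
Editorial sketch (not part of the original post): the swap figures traded in the message above can be sanity-checked with a simple model in which each swapped page costs a fixed latency plus its transfer time. The numbers below are the rough ones quoted in the thread (8 ms seek, 60 MB/s per disk). The 125 MB/s gigabit figure is an assumption (raw line rate with no protocol overhead), and the function names are made up for illustration.

```python
# Back-of-envelope model: time to service one swapped page is
# (fixed latency) + (page size / bandwidth). The figures are the rough
# ones quoted in the message, not measurements.

KIB = 1024
MIB = 1024 * KIB
MB = 1_000_000  # decimal megabyte, as in "60 MB/s"


def page_time(page_bytes, latency_s, bandwidth_bytes_per_s):
    """Seconds to move one page: latency plus transfer time."""
    return latency_s + page_bytes / bandwidth_bytes_per_s


def effective_mb_per_s(page_bytes, latency_s, bandwidth_bytes_per_s):
    """Sustained throughput in MB/s when every page pays the latency."""
    seconds = page_time(page_bytes, latency_s, bandwidth_bytes_per_s)
    return page_bytes / seconds / MB


DISK_SEEK_S = 8e-3        # "on the order of 8 ms"
DISK_BW = 60 * MB         # "60+ MB/s per disk"
GIGE_WIRE_BW = 125 * MB   # assumption: raw gigabit line rate, no overhead

# Mark's point: wire time for one 4 KiB page versus one disk seek.
wire_time_4k = 4 * KIB / GIGE_WIRE_BW
seek_vs_wire = DISK_SEEK_S / wire_time_4k

# Joe's point: with a seek per page, 4 KiB pages waste the disk's
# bandwidth, while 4 MiB pages would amortize the seek.
disk_4k = effective_mb_per_s(4 * KIB, DISK_SEEK_S, DISK_BW)
disk_4m = effective_mb_per_s(4 * MIB, DISK_SEEK_S, DISK_BW)

if __name__ == "__main__":
    print(f"seek / wire time for a 4 KiB page: {seek_vs_wire:.0f}x")
    print(f"disk, 4 KiB pages: {disk_4k:.2f} MB/s effective")
    print(f"disk, 4 MiB pages: {disk_4m:.1f} MB/s effective")
```

Under these assumptions the wire time for a 4 KiB page is a couple of hundred times shorter than a seek, which matches the "about 200x" claim only if the swap server answers from RAM. A disk paying a seek per 4 KiB page delivers well under 1 MB/s, while a 4 MiB page would recover most of the 60 MB/s, which is presumably why larger pages are on the wish list.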
