[Beowulf] Xgrid and Mosix (fwd from john@rudd.cc)
Douglas Eadline, Cluster World Magazine deadline at clusterworld.com
Sat Jan 1 07:40:44 PST 2005
These two pages are useful when considering Mosix

What does not migrate:
http://howto.x-tend.be/openMosixWiki/index.php/don't

what does migrate:
http://howto.x-tend.be/openMosixWiki/index.php/work%20smoothly

ClusterWorld just ran a feature on OpenMosix
http://www.clusterworld.com/issues/dec-04-preview.shtml

Doug

On Thu, 30 Dec 2004, Eugen Leitl wrote:

> ----- Forwarded message from John Rudd <john at rudd.cc> -----
>
> From: John Rudd <john at rudd.cc>
> Date: Wed, 29 Dec 2004 19:37:02 -0800
> To: xgrid-users at lists.apple.com
> Subject: Xgrid and Mosix
> X-Mailer: Apple Mail (2.619)
>
>
> I see in the archives that someone asked about OpenMosix back in
> September (
> http://lists.apple.com/archives/xgrid-users/2004/Sep/msg00023.html ),
> but I didn't see any responses. So I thought I'd ask too, but with a
> little more detail.
>
> The thing that I find interesting about the Mosix style distributed
> computing environment is that applications do NOT need to be re-written
> around them. Mosix abstracts the distributed computing cluster away
> from the program and developer in the same way that threads abstract
> multi-processing away from the program and developer. Under Mosix, any
> program, without having to be written around any special library,
> without having to be relinked or recompiled, can be moved off to
> another processing node if there are nodes that are significantly less
> busy than yours. And, AFAIK, any multi-threaded application can make
> use of multiple nodes (with threads being spawned on any host that is
> less loaded than the current node). Imagine taking a completely
> mundane but multi-threaded application (I'll assume Photoshop is
> multi-threaded and use that as an example). Suddenly, without having
> to get Adobe to support Xgrid, you can use Xgrid to speed up your
> Photoshop rendering.
>
> It seems to me that a similar set of features could be added to Xgrid.
> The threading and processing spawning code within the kernel could be
> extended by Xgrid to check for lightly loaded Agents, and move the new
> process or thread to that Agent. Only the IO routines would need to
> exist on the Client (and even then, maybe not: if every node has
> similar filesystem image, then only the UI (for user bound
> applications) or primary network interface code (for network
> daemons/servers) needs to run on the original Client system). From
> what I recall, the mach microkernel already makes some infrastructure
> for this type of thing available, it just needs to be utilized, and
> done deep enough in the kernel that an application doesn't need to know
> about it.
>
>
> Though, that does bring up one consideration: I have a friend who did a
> lot of distributed computing work when he was working for Flying
> Crocodile (a web hosting company that specialized in porn sites, where
> his distributed computing code had to support multiple-millions of hits
> per second). His experience there gave him a concern about Mosix style
> distributed computing. One of the advantages of something like Beowulf
> is that the coder often needs to control what things need to be kept
> low latency (must use threads for SMP on the local processor) and what
> things can have high latency (can use parallel code on the network),
> and the programming interface type of distributed computing gives them
> that flexibility.
>
> The idea that I suggested was something like nice/renice in unix, where
> you could specify certain parallelism parameters to a process before
> you run it, or after it is already running. For example, instead of
> "process priority", you might specify a sort of "process affinity" or
> "thread affinity". For process affinity, a low number (which means
> high affinity, just like priority and nice numbers) means "when this
> process creates a child, it must be kept close to the same CPU as the
> one that spawned it". Thread affinity would be the same, but for
> threads. A default of zero means "everything must run locally". A
> high number means "I can tolerate more latency" (so, "latency
> tolerance" would be the opposite of "affinity"). (it occurs to me
> after I wrote all of this that it might be easier for the end user to
> think in terms of "latency tolerance" instead of "process affinity",
> high latency = high number, instead of the opportunity for confusion
> that affinity has since the numbers go in the opposite direction ... I
> hope all of that made sense)
>
> A process with a low process affinity (high number) and a high thread
> affinity (low number) means that it can spawn new
> tasks/processes/applications anywhere in the network, but any threads
> for it (or its sub-processes) must exist on the same node as its main
> thread. Or, if you want all of the applications to be running on your
> workstation/Client, but run their threads all over the network, then
> you set a high process affinity (low number), and a low thread affinity
> (high number).
>
> I would have the xgrid command line tool have such a facility (I don't
> know if it does already or not, I haven't really done much with xgrid)
> similar to both the "nice" and "renice" commands. I would also add a
> preference pane that allows the user to set a default process affinity,
> a default thread affinity, and a list of applications and default
> affinities for each of those applications (so that they can be
> exceptions to the default, without the user having to set it via
> command line every time). Last, I would add a tool, possibly attached
> to the Xgrid tachometer, which would allow me to adjust an affinity
> after a program was running.
>
> The only thing up in the air is the ability to move a running thread
> from one node to another while it's running (well, during a context
> switch, really). I know a friend of mine at Ga Tech was doing PhD
> research on that (portable threads) 10ish years ago, but I don't know
> if it got anywhere. But, that would allow someone to lower the number
> of an application's affinity while it's running, thus recalling the
> threads or processes from a remote Agent to the local Client (the
> scenario being I have a laptop that is an Xgrid Client, and I start
> running applications that spread out across the network ... then I get
> up to leave, so I lower the affinity numbers of everything so that the
> tasks and threads come back to my laptop, running slower now that they
> have fewer nodes to run upon, but still running (or sleeping, as the
> case might be)).
>
>
> So ... all of that leads up to: does anyone know if Xgrid is working on
> this type of Application-Transparent Distributed Computing that Mosix,
> OpenMosix, and I think OpenSSI have? I think it would be a natural
> extension to Xgrid: Apple is trying to make this as "it just works" as
> possible, so it seems that it should not only be easy for the sysadmin
> to set up the distributed computing cluster, but easy/transparent for
> the developer, too (in the same way that threads made Multi-Processing
> easier and more abstract for the developer, this type of distributed
> computing makes threads not just a multi-processing model, but a
> distributed computing model). Ultimately, it even makes distributed
> computing easy for the user: they don't need to learn how to re-code a
> program (or coerce a vendor into making a distributed version of their
> application), any multi-threaded application will use multiple nodes,
> and even single-threaded non-distributed applications can be run on
> remote nodes. That seems like a powerful "it just works" capability to
> me.
>
> (the main drawback of Mosix, OpenMosix, and OpenSSI from my perspective
> is that they're Linux only, specifically developed for the Linux kernel
> ... but I'd really love to see something like them available for Mac OS
> X)
>
> Thoughts?
>
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Xgrid-users mailing list (Xgrid-users at lists.apple.com)
> Help/Unsubscribe/Update your Subscription:
> http://lists.apple.com/mailman/options/xgrid-users/eugen%40leitl.org
>
> This email sent to eugen at leitl.org
>
> ----- End forwarded message -----
>

--
----------------------------------------------------------------
Editor-in-chief ClusterWorld Magazine
Desk: 610.865.6061 Fax: 610.865.6618 www.clusterworld.com
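
The nice-style "latency tolerance" number proposed in the forwarded message could be sketched roughly as below. This is only an illustration of the idea, not anything Xgrid, Mosix, or OpenMosix provides: the `Node` type, the `pick_node` function, the `ms_per_step` conversion from a tolerance number to a latency budget, and all load and latency figures are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    load: float        # hypothetical load measure: 0.0 idle, 1.0 saturated
    latency_ms: float  # hypothetical latency as seen from the parent's node


def pick_node(parent, others, tolerance, ms_per_step=1.0):
    """Choose where a new process or thread should run.

    tolerance follows the message's convention: 0 means "everything must
    run locally", larger numbers mean "I can tolerate more latency".
    ms_per_step (an assumption) turns that number into a latency budget.
    """
    if tolerance <= 0:
        return parent
    budget_ms = tolerance * ms_per_step
    # The parent's own node is always allowed; remote nodes only if they
    # fit inside the latency budget.
    candidates = [parent] + [n for n in others if n.latency_ms <= budget_ms]
    # Least-loaded node wins; ties go to the lower-latency node.
    return min(candidates, key=lambda n: (n.load, n.latency_ms))


laptop = Node("laptop", load=0.9, latency_ms=0.0)
agents = [Node("agent-a", load=0.2, latency_ms=0.5),
          Node("agent-b", load=0.1, latency_ms=20.0)]

# "Low process affinity (high number), high thread affinity (low number)":
# new processes may go anywhere, threads stay with their main thread.
process_tolerance, thread_tolerance = 50, 0
print(pick_node(laptop, agents, process_tolerance).name)  # agent-b
print(pick_node(laptop, agents, thread_tolerance).name)   # laptop
```

Keeping separate numbers for processes and threads mirrors the two knobs in the message. Lowering a number at run time, as with renice, would amount to re-running the placement with a smaller budget, which presupposes the thread migration the message leaves open.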
