[Beowulf] Joe Blaylock's notes on running a MacOS cluster, Nov. 2007

Tue Nov 20 14:35:23 PST 2007

It's nice to see some Apple-centric notes here. Excellent writeup.

I've been involved in building and supporting Apple clusters running  
Sun Grid Engine for years now, dating back to the time when the first  
Xserves were released and we got some notoriety for using Apple 2nd  
generation iPods as bootable firewire drives for auto-imaging our  
cluster nodes.

Our Apple cluster experience is more focused on batch-style compute  
farming rather than "true HPC" but I can toss some comments into the  
mix here - I've commented down below on some of the points that were  
raised in the write-up.

It's sad to hear from people attending SuperComputing that Apple did  
not have a booth. The consensus reported back to me was that "Apple  
has nothing to show in the HPC space ..." and that sort of goes along  
with what we've been seeing with Apple turning back from enterprise  
sales and focusing almost exclusively on the consumer market. Sad to  
hear -- we've been waiting on refreshed Xserves for way too long now.  
At this point I'd guess that the product may be dead or discontinued.

On Nov 20, 2007, at 2:19 PM, Kragen Javier Sitaker wrote:

> This is not strictly about Beowulfs, but it is probably of interest to
> their users.
>
> My friend Joe's team from Indiana University just fielded a MacOS
> cluster for the Supercomputing '07 Cluster Challenge.  His experiences
> weren't that great; I encouraged him to jot something quick down so that
> other people could benefit from his hard-won lessons.
>
> There's more information about the challenge at
> http://sc07.supercomp.org/?pg=clusterchlng.html&pp=conference.html.
>
> ----- Forwarded message from Kragen Javier Sitaker <kragen at pobox.com> -----
>
> From: Kragen Javier Sitaker <kragen at pobox.com>
> To: kragen-fw
> Subject: Joe Blaylock's notes on running a MacOS cluster, Nov. 2007
>
>  Disordered thoughts on using MacOS X for HPC.
>
> By Joe Blaylock, 2007-11.
>
>
>    Recollections:
>
>    * we were the first people to ever try that particular combination:
>      Tiger on Xeons with Intel's ICC 10 compiler suite and MKL linear
>      algebra libraries. Blazing new territory is never easy.
>    * We didn't use XGrid or Apple's cluster management stuff, only
>      Server Admin and ARD.
>    * Pov-Ray was easy; OpenMPI was easy; using Myrinet over 10Gig
>      Ethernet was easy
>    * GAMESS was more challenging, but we got it working somewhat. We
>      still don't know how to run jobs of type ccsd(t), which require
>      System V shared memory.
>    * We never got POP to work.
>    * Apparently, ICC 10 has some bugs. There were several times when we
>      were trying to use it to build, IIRC, GAMESS or POP, and it would
>      give illegal instruction errors during compile. Or it would build
>      a binary that we would run, and then it would do something
>      horrible, like hang the machine (probably a bug interaction
>      between icc and MacOSX).
>    * OpenDirectory doesn't seem ready for prime time. It's pretty easy
>      to set up, but it's unreliable and mysterious. In MacOS X, there
>      seems to be a fundamental disconnect between things in the CLI
>      world and things in the GUI world. Setting something up in one
>      place won't necessarily be reflected in the other place. I'm sure
>      that this is all trivial, if you're a serious Darwin user. But
>      none of us were. So for example, you set up your NFS exports in
>      the Server Admin tool, rather than by editing /etc/exports. The
>      Admin tool won't put anything into /etc/exports. So if you're on
>      the command line, how do you check what you're exporting? With the
>      complexity of LDAP, this becomes a real problem. You set up
>      accounts on your head node, and say to export that information.
>      But perhaps you create an account, but can't log into it on a
>      node. If you're ssh'd in from the outside, where do you check to
>      see (from the command-line) what the authentication system is
>      doing? Our local Mac guru couldn't tell us. And then you'd create
>      another account, and the first one would start working again. WTF?
>    * This may be the most frustrating thing about working with OS X
>      Server. The CLI is the redheaded stepchild, and lots of HPC is
>      mucking about on the command-line. You can use VNC to connect to
>      ARD (but only if a user is logged in on the desktop and running
>      ARD!), but it's slow, and only provides desktop control, not
>      cluster management. ARD can then be run on the desktop, to provide
>      desktop control of the nodes in the cluster, and some cluster
>      management: run unix command everywhere, shut nodes down, etc.
>      There were a handful of tasks which seemed important, but which I
>      couldn't figure out how to do on the command-line at all. The most
>      heinous of these is adding and removing users to/from LDAP.

Prior to Apple releasing the Xserve, there were many things that  
required a GUI to accomplish. Right around the time they released the  
Xserves, though, the OS improved to the point where you could do just  
about everything via the command line and/or a serial console.

ServerAdmin even has a CLI variant that can do everything that the  
GUI can. I think 'serveradmin' along with 'networksetup' were the two  
main CLI tools we used over and over again.

OpenDirectory is more of a pain - there is a CLI tool but you also end  
up working with standard openldap commands and binaries.
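
On the "where do you check from the command line" question raised  
above, one option with the standard OpenLDAP tools is to query the  
directory for the account directly. A minimal sketch in Python that  
only builds the ldapsearch command line; the server name, search base  
and uid are hypothetical placeholders, and an anonymous simple bind is  
assumed to be allowed:

```python
import shlex

def ldapsearch_argv(server, base_dn, uid):
    """Build an OpenLDAP ldapsearch command that looks up one account by uid."""
    return [
        "ldapsearch",
        "-x",                        # simple bind instead of SASL
        "-H", "ldap://%s" % server,  # URI of the directory server
        "-b", base_dn,               # search base
        "(uid=%s)" % uid,            # filter: match the account's short name
        "uid", "uidNumber", "homeDirectory",  # attributes to return
    ]

# Hypothetical values; substitute your own head node and search base.
argv = ldapsearch_argv("head.example.com", "dc=head,dc=example,dc=com", "joe")
print(" ".join(shlex.quote(a) for a in argv))
```

Running the printed command from a compute node shows whether that  
node can see the account in the directory at all, which at least  
separates "account missing" from "login failing for some other reason".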

The best reference we found (in addition to lots of blood, sweat and  
tears) was a PDF that Apple publishes called "Mac OS X Server Command  
Line Administration" which can be found here:  http://www.apple.com/server/pdfs/Command_Line.pdf

All things considered we've found that working on headless Apple boxes  
using a serial console is certainly possible; it's a bit slower than  
Linux due to the need to look up cryptic CLI commands in the PDF, but  
it works for about 99% of the things we ever really needed to do. I  
think one GUI-only thing that bit us once was the fact that during  
initial OS install, if you want to software RAID your disks, you can  
only do this via a GUI prompt.

>
>    * Most of the time, I found it more convenient to use a 'for' loop
>      that would ssh to nodes to run some command for me.

We use passwordless SSH and 'dsh' or 'pdsh' utilities on just about  
every Apple system we work on. There are many times when the "do  
something on N nodes" automated process is necessary. heh
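
For anyone without dsh or pdsh handy, the "do something on N nodes"  
loop can be sketched in a few lines of Python. This is only an  
illustration, not how dsh or pdsh work internally: it runs the nodes  
one after another, the function names are made up for this example,  
and it assumes passwordless SSH keys are already in place.

```python
import subprocess

def ssh_argv(node, command):
    """Build the argv for running `command` on `node` over ssh."""
    # BatchMode=yes makes ssh fail rather than prompt for a password,
    # which is what you want when looping over many nodes unattended.
    return ["ssh", "-o", "BatchMode=yes", node, command]

def run_everywhere(nodes, command, runner=subprocess.run):
    """Run `command` on each node in turn; return {node: (exit_code, stdout)}."""
    results = {}
    for node in nodes:
        proc = runner(ssh_argv(node, command), capture_output=True, text=True)
        results[node] = (proc.returncode, proc.stdout.strip())
    return results
```

Something like run_everywhere(["node01", "node02"], "uptime") (node  
names hypothetical) returns each node's exit code and output, so a  
node that is down or misconfigured shows up as a non-zero exit code  
instead of stopping the loop.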

>
>    * MacOS X lacks a way to do cpu frequency scaling. This killed us in
>      the competition. We couldn't scale cores to save on our power
>      budget, we could only leave them idle.
>    * Being a Linux dude, I found having to have license keys for my
>      operating systems, and (separately) my administration and
>      management tools, to be odious in the extreme. Having to
>      separately license ICC and IFORT and MKL just added frustration
>      and annoyance.

This is not really a "secret" but Apple does not advertise it and  
it took some internal contacts to discover ...

Apple is capable of producing what is called a "Watermarked Serial  
Number" that will license all servers in a cluster or a subnet. You  
still have to enter a serial number in the OS but having the single  
Watermark serial number allows you to use the same values which makes  
it scriptable or something you can bake into a netboot'ed image.

Watermarked serial numbers cannot be requested by mere mortals. You  
have to request this from your sales rep and apparently there is some  
internal system that the sales rep can use to generate the watermarked  
serial number. This is the first thing we tell our Apple-using friends  
and colleagues as it is a really nice thing to have.

>
>    * We didn't make detailed performance comparisons between stuff
>      built with the intel suite and things built with, e.g., the GNU
>      suite and GotoBLAS. We were too busy just trying to get everything
>      to work. I'm sure that Intel produces better code under normal
>      circumstances, but we had lots of cases where version 10 couldn't
>      even produce viable binaries. So, make of that what you will.
>
>
>    What I would recommend (if you were going to use MacOS X):
>
>    * Learn Darwin, in detail. Figure out the CLI way to do everything,
>      and do it. In fact, forget Mac OS X; just use Darwin. Learn the
>      system's error codes, figure out how to manipulate fat binaries
>      (and how to strip them to make skinny ones), be able to manipulate
>      users, debug the executing binaries, etc. Consider looking into
>      the Apple disk imaging widget so you can boot the nodes diskless.
>
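
On the fat binary point: the stock tool for this is 'lipo'. A minimal  
sketch in Python that only builds the two command lines involved,  
inspecting a binary and thinning it to one architecture. The file  
paths are hypothetical, and the architecture name is an assumption  
you would replace with whatever 'lipo -info' reports for your build:

```python
def lipo_info_argv(path):
    """Command that lists the architectures contained in a binary."""
    return ["lipo", "-info", path]

def lipo_thin_argv(path, arch, out_path):
    """Command that extracts a single architecture into a thin binary."""
    return ["lipo", path, "-thin", arch, "-output", out_path]

# Hypothetical paths and architecture, for illustration only.
print(" ".join(lipo_info_argv("./gamess.x")))
print(" ".join(lipo_thin_argv("./gamess.x", "x86_64", "./gamess.thin.x")))
```

The thin copy is written to a separate output file, so the original  
fat binary stays around in case a node needs a different slice.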
>
>    What I would do differently (whether I stick with MacOS X or not):
>
>    * diskless clients
>    * Flash drive for head node
>    * no GPUs
>    * Get Serial Console set up and available, even if you don't use it
>      routinely
>    * CPU Frequency Scaling!!
>    * many more, smaller cores. we had 36 at 3GHz. this was crazy. We
>      were way power hungry.
>    * Go to Intel 45nm dies.
>
>

Regards,
Chris