[Beowulf] Beowulf SysAdmin Job Description

Gerry Creager gerry.creager at tamu.edu
Fri May 1 04:54:04 PDT 2009

Prentice Bisbal wrote:
> Steve Herborn wrote:
>> Right now "I" think a lot of what you say is from the perspective of "while
>> true at the time".  
>> As HPCC goes mainstream and all the bugs, wrinkles, and bumps in the road
>> are straightened away we'll begin to see the Scientists who developed (and
>> were forced to maintain) them move on to more esoteric conundrums that face
>> them and their research.  I have no idea what the next great break-through
>> in computing will be, but I highly doubt from a processing perspective it
>> will be the IT guys who've been forced into standardization, availability
>> and cost reduction.
>> Eventually even HPCC will get homogenized and integrated into an
>> organization's overall greater IT structure as these systems are used more &
>> more for business purposes.  At that time a plain-old Unix SysAdmin just
>> might do, be they self-taught or degreed.
> Definitely possible.

Yeah.  Shortly after Windows becomes the clustering solution of choice 
for HPC.  Or Linux wins the desktop.

Sorry for the sarcasm, but I don't think HPC administration is as 
similar to mainstream IT as it appears.  That doesn't mean the suits 
won't try to integrate it into their structure, but they'll soon 
determine that a plain ol' administrator hasn't learned about 
mpich[1||2], openmpi, mvapich, fortran of any flavor, myrinet, 
infiniband, gluster, lustre, etc.

The prototypical Unix IT administrator is not likely to be able to 
wander into someone's molecular dynamics code today, and another's 
weather code tomorrow, and help find why they're both crashing.

There's definitely an apprenticeship required to become an HPC admin and 
support person.

>> People end up working with computers as a primary job for any number of
>> reasons, but I do not consider that to be necessarily working in IT.
>> The one thing that does seem to be apparent is not an awful lot of people
>> have HPCC SysAdmin as a job description.  I suspect most have some other
>> type of job description and ended up doing HPCC work out of necessity to
>> fulfill their primary job role.
>> Or, on the other hand I'm completely out to lunch.  :)
> No, not out to lunch. Those last two paragraphs are pretty much what I
> said. A lot of scientists start out doing research, and end up cluster
> admins, or it at least becomes a big part of the job.

I refer to that as "sacrificing a graduate student to the cluster 
ghods".  We take someone who was promising enough to get into the 
program on the merit of their capabilities, and then, based on (usually) 
a heretofore unappreciated ability to log in, get a terminal prompt, and 
execute 'ls', they spend the next 5-10 years learning the care and 
feeding of their research group's cluster.  They're eventually awarded a 
terminal degree, and often have earned it, but they've worked harder 
than their fellow grad students because they had to accomplish their 
research AND manage the cluster... AND support their fellows' and boss's 
needs when things broke in code.  Or there are the several unfortunate 
ones I've seen, who do become skilled administrators and HPC users, but 
never really 
learn their science, and are paroled with the degree anyway at some 
point (usually 9 years here: a student starts losing courses at 10 years 
in a program and is usually removed from it at that point thanks to our 
State Legislature).

> Just recently there were a lot of questions asked (or was that on the
> SGE mailing list?) by a grad student who ended up responsible for the
> department's cluster.

Here.  SGE.  Rocks.  Lots of places where this comes up.  I get calls at 
least weekly on our campus from some grad student who needs help.  I've 
rescued departmental clusters where no one research group was in charge 
(departmental resource) and the overworked IT admin for the department 
was at his wits' end.  So far, I've not encountered a departmental 
resource cluster administered by a shanghai'd grad student, but that 
could be because our University tends to foist that off on a post-doc or 
the Windows support guy.

Gerry Creager -- gerry.creager at tamu.edu
Texas Mesonet -- AATLT, Texas A&M University	
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843
