[Beowulf] [OT] HPC and University IT - forum/mailing list?

Tue Aug 15 17:50:16 PDT 2006

Mike Davis wrote:
> I'm not 100% sure about that Mark. I care about big-A administration. I
> care about showing departments what resources are actually available. I
> care about what is the most efficient use of limited University
> resources. When I meet with researchers they often say that they had no
> idea that there were 500+ processors dedicated to research here.

One of the toughest parts of the job is marketing what you have to your
intended audience.  Many of them don't know much about what you have to
offer, and all of them will need it couched in terms that they are
comfortable with.

Then there is the big "P".  Politics are often difficult to wade
through.  Lots of interesting things crop up when you try to embed what
is effectively a service organization within a research institution,
specifically the fight^H^H^H^H^Hrequests for budget.  Service
organizations are fundamentally expenses and cost centers, not "profit"
centers or "revenue" centers.  NSF/NIH/... don't like funding things
that won't produce a body of work that the program managers can point
to.  Funding that kind of service is the university's job.

The university doesn't want to spend much money on things that won't
directly generate grant/contract income.  Space is another problem and
the F&M considerations for a large install can be ... considerable.
ORNL and others build new buildings to house their new acquisitions, and
have new taps to huge power sources.  Not everyone can do this.

> I know that other people have the same issues. Another is the funding
> model issue. Which is best overhead, direct, or central budget? Or how
> about knowing what resources we each provide our users. Does a given
> organization focus on hardware support, software support or both?

Unfortunately one needs to support a wide range of users.  Some of our
customers are scary smart and teach us a thing or three.  Others need more
... interactive ... support from us.  The same is true with universities.
Some researchers you can leave alone and they will be great.  Others
will need lots of support.  Support costs money.  Where that money comes
from is the question.

In the "old" days, Cray used to supply warm breathing bodies as part of
the price of the machine.  I haven't seen too much interest in this in
recent times, though it is a good model of support.  It is just not
inexpensive.  Good support never is.

In order to get more support for little a, and I would argue for
little/big S (end user support/machine support), you can't just be a
cost center.  You need to be a profit center.  This forces a particular
range of business models upon you (chargeback and others) but with some
creative thinking, you might be able to turn this into a real revenue
generator for the university (not big revenue, but self-sustaining).
Let me know if you want to talk about this offline.  It is more of a
business model than anything else.

> Those are some of the Big-A issues. Here is one that is both Big-A and
> small-A.
> 
> Running one of the new Sun x4100's with both dualcore processors at 100%
> uses <270 watts (as determined by kill-a-watt. That is Big-A because it
> means that we can be more efficient in our use of AC and power. It is
> small-a for the same reasons. For example spinning up a v20 uses 250
> watts for both processors at full power. I can't discuss some of my
> application specific performance due to license constraints, but I can
> say that I like the 4100 in general for Computational Physics and
> Chemistry.

At the end of the day it is always a cost-benefit analysis.  If you have
infinite budget with no constraints, you can always optimize for your
most favored issue: performance, uptime, network bandwidth, latency, ...

I try to look at these as a minmax problem.  Minimize the maximum pain.
If you are 10% off in speed, but 2/3 the price, are you ahead of the
game?  If you are 50% lower in power (4 x single core versus 2 x dual
core), and 30% lower in per node cost, but some codes are slower, is it
worth it?
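
For the first question, one rough way to put a number on it is relative
throughput per dollar.  This is only a sketch: the function name is made
up for illustration, and it assumes "10% off in speed" means 0.90 of the
baseline speed and that price is the only cost that matters (no power,
cooling, or support folded in).

```python
def throughput_per_dollar(relative_speed, relative_price):
    """Relative work done per unit of money, versus a baseline of 1.0.

    Both arguments are fractions of the baseline system (hypothetical
    inputs, not measurements).
    """
    if relative_price <= 0:
        raise ValueError("relative_price must be positive")
    return relative_speed / relative_price


# First case from the text: 10% slower at 2/3 the price.
cheaper_option = throughput_per_dollar(0.90, 2.0 / 3.0)
print(f"{cheaper_option:.2f}x baseline throughput per dollar")
```

On those assumptions the cheaper box comes out at about 1.35x the
baseline per dollar.  The second question can't be reduced the same way
without knowing how much slower "some codes" are.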

This is where the politics and big-A stuff come into play, and the
minmax algorithm.  You want to minimize the maximum pain your users will
suffer, constrained by costs of course.  Which means taking into
account all the power, cooling, space, general environment, ....
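
As a minimal sketch of that idea: score each candidate configuration by
how much pain it causes each group, throw out anything over budget, and
take the option whose worst score is smallest.  Everything here is
hypothetical (the option names, the groups, the 0-10 pain scores, the
costs); it only illustrates the selection rule, not any real purchase.

```python
def pick_minimax(pain_by_option, cost_by_option, budget):
    """Return the affordable option whose maximum pain score is lowest."""
    affordable = {
        name: pains
        for name, pains in pain_by_option.items()
        if cost_by_option[name] <= budget
    }
    if not affordable:
        raise ValueError("no option fits within the budget")
    # Minimize the maximum pain across all groups.
    return min(affordable, key=lambda name: max(affordable[name].values()))


# Hypothetical pain scores (0 = none, 10 = unbearable) per group.
pain = {
    "option_a": {"chemistry": 2, "physics": 3, "facilities": 7},
    "option_b": {"chemistry": 5, "physics": 4, "facilities": 3},
}
# Hypothetical total costs in arbitrary units.
cost = {"option_a": 100, "option_b": 70}

print(pick_minimax(pain, cost, budget=100))
```

With these made-up numbers option_b wins: nobody is thrilled, but its
worst score is 5 against option_a's 7.  The hard part in practice is
agreeing on the scores, which is where the politics comes back in.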

> 
> Another that is both is what submission systems we are using and Why?

Most users don't care.  I have seen universities opt for a $/node
license for one scheduler when another one able to do the same job was
available gratis.  I can't say I understand all the decisions.  Then
again, I don't have all the information, and the big-A team needs to make
business decisions with technology, price, and other factors as their
inputs.  It may have made perfect sense in a larger picture.

While some technological decisions are clear cut, business decisions are
often less so and, worse for pure technologists, often take other
considerations into account that lead to decisions opposed to the best
technological choices.

> Same questions, that affect both administration and Administration.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615