[Beowulf] SC13 wrapup, please post your own

Peter St. John peter.st.john at gmail.com
Tue Nov 26 06:48:42 PST 2013

The *annus mirabilis* "exceeds expectations", that's beautiful :-)

On Tue, Nov 26, 2013 at 9:15 AM, Lux, Jim (337C)
<james.p.lux at jpl.nasa.gov> wrote:

> On 11/25/13 4:11 PM, "Adam DeConinck" <ajdecon at ajdecon.org> wrote:
> >
> >> 4. I went to a BoF on ROI on HPC investment. All the presentations in
> >> the BoF frustrated me. Not because they were poorly done, but because
> >> they tried to measure the value of a cluster by number of papers
> >> published that used that HPC resource. I think that's a crappy, crappy
> >> metric, but haven't been able to come up with a better one myself yet.
> >> I was very vocal with my comments and criticisms of the presentations, so
> >> if any of the presenters are reading this now, I apologize for
> >> hi-jacking your BoF. Getting good ROI on a cluster is close to my heart,
> >> but is also difficult to quantify and measure. I hope I can be part of
> >> the discussion next year.
> >>
> >
> >Do you have any thoughts you can share on what alternative metrics
> >might look like, even if you can't think of one that's clearly better?
> >
> >I have no horse in this race as I've been doing industry HPC for the
> >past few years, but I'm curious what good metrics for ROI on an academic
> >or lab cluster might be. Total number of papers? Number of
> >citations after an N-year time window? [shrug]
> >
> >ROI measurement can sometimes be difficult even in an industrial or
> >commercial setting, especially if the HPC resource is used for R&D or
> >"engineering support" as opposed to something that feeds directly into
> >the product.
> >
> >Cheers,
> >Adam
>
> Definitely a challenge.
> Maybe we have webcams that look at all the users and we calculate
> percentage of time smiling while interacting with the cluster?
> ROI for "technology development" is a tough thing to calculate.  ROI, by
> its nature, is "money returned for money spent", and the return is
> somewhat intangible.
> All of these metrics require having a baseline so you can do a
> before/after comparison.  And realistically, there needs to be a fairly
> long averaging time on the metric.  Here's the annual paper output of a
> noted physicist.
> 1901  1
> 1902  2
> 1903  1
> 1904  1
> 1905 25
> 1906  6
> 1907  8
> 1908  4
> 1909  5
> 1910  6
> 1911  8
> How would you evaluate the ROI of feeding him?  Started kind of slow, had
> a really good year, and likely received an "exceeds expectations" annual
> review. But that set a new bar, and now his supervisor is going to be
> hammering him: Dude, your output is slacking off, I think we need to put
> you on a performance improvement plan, and this year, you're going to be
> "does not meet" in your review.
> The other problem is that paper publishing (and the schedule thereof) is
> influenced by things other than availability of computational resources,
> so you need a very large sample so those influences average out.  For
> instance, lack of funds or permission to travel to a conference, combined
> with the recent fad of "you must present in person" will have an effect.
> The sequester and/or furlough will almost certainly manifest itself in any
> sort of time series counting publications.
> The other thing is that there is a long gestation period for some work.
> You might not have something "publishable", especially with the bias
> against publishing null or negative results. That doesn't mean that the
> HPC work wasn't useful, if it found a bunch of "ways not to go".
> There might also be an issue with the availability of workforce to grind
> out the papers.  At least at JPL, relatively few people work on a single
> job or task.  A more typical scenario is having 2 or 3 projects you work on
> simultaneously, along with half a dozen things you support. There is a
> tendency to spend one's time on the latest thing to go wrong, and in a "do
> more with less" environment, there's not a lot of down time in which to
> catch up.
> In an environment where short term results are more important (or at
> least have more "gain" in the control loop), it's tough to push "getting
> published" higher up the priority list, since the personal ill effects of
> not publishing may be years down the road, compared to immediate ill
> effects of "the project will be cancelled if we don't make the deliverable
> this month".
> At JPL, it is easy to tell in which organizations the "papers published"
> metric is important in annual ranking and review: they're the ones with
> lots of papers.  That's not to say that other organizations don't do lots
> of publishable work, but if your annual review depends on something
> *other* than the metric, you're not going to spend your time writing papers.
> Export controls also rear their ugly head.  A lot of interesting problems
> that can be attacked by HPC are the practical ones. But in a number of
> industries, once you move beyond pure research and theory (TRL 3), you get
> into an area where it is either competition sensitive or export
> controlled.  It is true that you could write a paper that is suitably
> expurgated and sanitized, but you still have to go through the export
> control/public release review process. And that's time consuming too.
> The whole competition sensitive/proprietary rights/export controls aspect
> might be why some of you have commented on the gulf between what's
> presented on the show floor and what's presented in the talks.
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
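
A rough sketch of the averaging-time point made in the quoted message, using the yearly counts from the table there. The 5-year window and the function names (`trailing_mean`, `year_over_year`) are assumptions chosen for illustration, not anything proposed in the thread:

```python
# Yearly paper counts copied from the table in the quoted message.
papers = {
    1901: 1, 1902: 2, 1903: 1, 1904: 1, 1905: 25, 1906: 6,
    1907: 8, 1908: 4, 1909: 5, 1910: 6, 1911: 8,
}


def year_over_year(counts):
    """Change in paper count relative to the previous year in the table."""
    years = sorted(counts)
    return {cur: counts[cur] - counts[prev] for prev, cur in zip(years, years[1:])}


def trailing_mean(counts, window):
    """Mean papers per year over each trailing span of `window` table rows.

    Assumes the years in `counts` are consecutive, as they are in the table.
    """
    years = sorted(counts)
    means = {}
    for i in range(window - 1, len(years)):
        span = years[i - window + 1 : i + 1]
        means[years[i]] = sum(counts[y] for y in span) / window
    return means


if __name__ == "__main__":
    yoy = year_over_year(papers)
    smooth = trailing_mean(papers, window=5)  # window length is an assumption
    for year in sorted(papers):
        delta = f"{yoy[year]:+d}" if year in yoy else "n/a"
        mean = f"{smooth[year]:.1f}" if year in smooth else "n/a"
        print(f"{year}  papers={papers[year]:>2}  change={delta:>3}  5yr_mean={mean}")
```

With these numbers the year-over-year change swings from +24 in 1905 to -19 in 1906, while the 5-year trailing mean stays between roughly 6 and 10 papers per year. That is the "exceeds expectations" followed by "does not meet" effect in miniature: a short evaluation window mostly measures noise.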