[Beowulf] Re: energy costs and poor grad students

Wed Jul 2 06:44:20 PDT 2008

Hi Mark

Mark Kosmowski wrote:
> I'm in the US.  I'm almost, but not quite ready for production runs -
> still learning the software / computational theory.  I'm the first
> person in the research group (physical chemistry) to try to learn
> plane wave methods of solid state calculation as opposed to isolated
> atom-centered approximations and periodic atom centered calculations.

Heh... my research group in grad school went through that transition in 
the mid 90s.  Went from an LCAO-type simulation to CP-like methods.  We
needed a T3E to run those (then).

Love to compare notes and see which code you are using someday. 
On-list/off-list is fine.

> It is turning out that the package I have spent the most time learning
> is perhaps not the best one for what we are doing.  For a variety of
> reasons, many of which are more off-topic than tac nukes and energy
> efficient washing machines ;) , I'm doing my studies part-time while
> working full-time in industry.

More power to ya!  I did mine that way too ... the writing was the 
hardest part.  Just don't lose focus, or stop believing you can do it. 
When the light starts getting visible at the end of the process, it is 
quite satisfying.

I have other words to describe this, but they require a beer lever to 
get them out of me ...

> I think I have come to a compromise that can keep me in business.
> Until I have a better understanding of the software and am ready for
> production runs, I'll stick to a small system that can be run on one
> node and leave the other two powered down.  I've also applied for an
> adjunct instructor position at a local college for some extra cash and
> good experience.  When I'm ready for production runs I can either just
> bite the bullet and pay the electricity bill or seek computer time
> elsewhere.

Give us a shout when you want to try the time on a shared resource. 
Some folks here may be able to make good suggestions.  RGB is a physics 
guy at Duke, doing lots of simulations, and might know of resources. 
Others here might as well.

Joe

> 
> Thanks for the encouragement,
> 
> Mark E. Kosmowski
> 
> On 7/1/08, ariel sabiguero yawelak <asabigue at fing.edu.uy> wrote:
>> Well Mark, don't give up!
>> I am not sure what your application domain is, but if you require 24x7
>> computation, then you should not be hosting that at home.
>> On the other hand, if you are not doing real computation and you just have a
>> testbed at home, maybe for debugging your parallel applications or something
>> similar, you might be interested in a virtualized solution. Several years
>> ago, I used to "debug" some neural networks at home, but training sessions
>> (up to two weeks of training) happened at the university.
>> I would suggest doing something like that.
>> You can always scale-down your problem in several phases and save the
>> complete data-set / problem for THE RUN.
>>
>> You are not being a heretic there, but suffering energy costs ;-)
>> In more places than you may believe, useful computing nodes are being
>> replaced just because of energy costs. In some application domains you
>> can even lose computational power if you move from 4 nodes into a single
>> quad-core (i.e. memory bandwidth problems). I know it is very nice to be
>> able to do everything at home.. but maybe before dropping your studies or
>> working overtime to pay the electricity bill, you might want to consider
>> collapsing your physical deployment into a single virtualized
>> cluster (or just dispatching several threads/processes in a single system).
>> If you collapse into a single system you have only one mainboard, one HDD, one
>> power source, one processor (physically speaking), .... and you can achieve
>> almost the performance of 4 systems in one, consuming the power of.... well
>> maybe even less than a single one. I don't want to go into discussions about
>> performance gain/loss due to the variation of the hardware architecture.
>> Invest some bucks (if you haven't done that yet) in a good power source.
>> Efficiency of unbranded OEM power sources is really pathetic, maybe 45-50%
>> efficiency, while a good power source might be 75-80% efficient. Use the
>> energy for computing, not for heating your house.
>> What I mean is that you could consider just collapsing a complete "small"
>> cluster into a single system. If your application is CPU-bound and not I/O
>> bound, VMware Server could be an option, as it is free software
>> (unfortunately not open, even though some patches can be done on the
>> drivers). I think it is not possible to publish benchmarking data about
>> VMware, but I can tell you that in long timescales, the performance you get
>> in the host OS is similar to that of the guest OS. There are a lot of
>> problems related to jitter, from crazy clocks to delays, but if your
>> application is not sensitive to that, then you are Ok.
>> Maybe this is not a solution, but you can provide more information regarding
>> your problem before quitting...
>>
>> my 2 cents....
>>
>> ariel
>>
>> Mark Kosmowski wrote:
>>
>>> At some point a cost-benefit analysis needs to be performed.  If
>>> my cluster at peak usage only uses 4 GB RAM per CPU (I live in
>>> single-core land still and do not yet differentiate between CPU and
>>> core) and my nodes all have 16 GB per CPU then I am wasting RAM
>>> resources and would be better off buying new machines and physically
>>> transferring the RAM to and from them or running more jobs each
>>> distributed across fewer CPUs.  Or saving on my electricity bill and
>>> powering down some nodes.
>>>
>>> As heretical as this last sounds, I'm tempted to throw in the towel on
>>> my PhD studies because I can no longer afford the power to run my
>>> three node cluster at home.  Energy costs may end up being the straw
>>> that breaks this camel's back.
>>>
>>> Mark E. Kosmowski
>>>
>>>
>>>
>>>> From: "Jon Aquilina" <eagles051387 at gmail.com>
>>>>
>>>>
>>>
>>>
>>>> not sure if this applies to all kinds of scenarios that clusters are used in
>>>> but isn't the more ram you have the better?
>>>>
>>>> On 6/30/08, Vincent Diepeveen <diep at xs4all.nl> wrote:
>>>>
>>>>
>>>>> Toon,
>>>>>
>>>>> Can you drop a line on how important RAM is for weather forecasting in
>>>>> latest type of calculations you're performing?
>>>>>
>>>>> Thanks,
>>>>> Vincent
>>>>>
>>>>>
>>>>> On Jun 30, 2008, at 8:20 PM, Toon Moene wrote:
>>>>>
>>>>> Jim Lux wrote:
>>>>>
>>>>>
>>>>>>> Yep.  And for good reason.  Even a big DoD job is still tiny in Nvidia's
>>>>>>> scale of operations. We face this all the time with NASA work.
>>>>>>> Semiconductor manufacturers have no real reason to produce special purpose
>>>>>>> or customized versions of their products for space use, because they can
>>>>>>> sell all they can make to the consumer market. More than once, I've had a
>>>>>>> phone call along the lines of this:
>>>>>>> "Jim: I'm interested in your new ABC321 part."
>>>>>>> "Rep: Great. I'll just send the NDA over and we can talk about it."
>>>>>>> "Jim: Great, you have my email and my fax # is..."
>>>>>>> "Rep: By the way, what sort of volume are you going to be using?"
>>>>>>> "Jim: Oh, 10-12.."
>>>>>>> "Rep: thousand per week, excellent..."
>>>>>>> "Jim: No, a dozen pieces, total, lifetime buy, or at best maybe every
>>>>>>> year."
>>>>>>> "Rep: Oh...<dial tone>"
>>>>>>> {Well, to be fair, it's not that bad, they don't hang up on you..
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> For about a year, it's been clear to me that weather forecasting (i.e.,
>>>>>> running a more or less sophisticated atmospheric model to provide weather
>>>>>> predictions) is going to be "mainstream" in the sense that every business
>>>>>> that needs such forecasts for its operations can simply run them in-house.
>>>>>> Case in point:  I bought a $1100 HP box (the obvious target group being
>>>>>> teenage downloaders) which performs the HIRLAM limited area model *on the
>>>>>> grid that we used until October 2006* in December last year.
>>>>>>
>>>>>> It's about twice as slow as our then-operational 50-CPU Sun Fire 15K.
>>>>>> I wonder what effect this will have on CPU developments ...
>>>>>>
>>>>>> --
>>>>>> Toon Moene - e-mail: toon at moene.indiv.nluug.nl - phone: +31 346 214290
>>>>>> Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
>>>>>> At home: http://moene.indiv.nluug.nl/~toon/
>>>>>> Progress of GNU Fortran: http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html
>>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> Beowulf mailing list, Beowulf at beowulf.org
>>>>> To change your subscription (digest mode or unsubscribe) visit
>>>>> http://www.beowulf.org/mailman/listinfo/beowulf
>>>>>
>>>>>
>>>>>
>>>> --
>>>> Jonathan Aquilina
>>>>
>>>>

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
        http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615