Gaussian (was: SOFTWARE PRICING FOR CLUSTERS)
Robert G. Brown rgb at phy.duke.edu
Thu Jul 25 16:02:35 PDT 2002
On Thu, 25 Jul 2002, Herbert Fruchtl wrote:

> Dean Johnson wrote:
> > I await the open source version of Gaussian. If it'll only cost like
> > $100k to develop... ;-)
>
> It's called NWChem. And there are others that provide a subset of the
> functionality (Gamess-US, for example). Gaussian itself is fairly
> cheap for academics, even with source code.
>
> You are right, of course, that the man-years invested into such
> programs would add up to tens of millions of dollars rather than 100k.

Although my old daddy the economist taught me at a very early age the difference between opportunity cost labor and real expenses. He set me on his knee and said "Son, when I pay other folks to paint our house, that's a real expense. But when I make you paint our house for the cost of the food I'm paying you anyway, that's opportunity cost labor. Now stop your bitchin' and get out there and paint!"

NWChem, PVM, MPI, Unix in general and Gnu/Linux in particular contain BILLIONS of dollars worth of labor, but most of it is opportunity cost labor (paid for by e.g. grant agencies who were funding the research that relied on the tools anyway, for example, or written by graduate students paid to do annoying tasks like actually implement a lovely theory in code). This is in contrast to most commercial code, which is almost always paid for with real money.

The interesting question is where you draw the line. If I write a scientific application in three months of labor, using the GSL, GCC, sundry open source toolkits from e.g. netlib, running on a linux box (derived from Unix, which once upon a time cost some real money to develop), while being "paid" by the physics department here to teach and being partly subsidized by a grant to do research derived from the application, how much does it cost? Fair answers could range from "nothing" (opportunity cost) to more money than was spent putting man on the moon.

As was very wisely observed by Gerry, the answer ultimately comes down to who's paying for it and good old fashioned cost-benefit analysis. If you write a proposal that involves doing quantum chemistry research, you have to make something of a bet. You fund salaries, you fund hardware, you fund various kinds of support for the computations. You generally have a pretty accurate idea of how much the granting agency is likely to fund as an upper bound for your particular scientific project.

In some cases it will come down to things like:

Hmmm, use a commercial package of one sort or another with terrible cost scaling across the cluster (e.g. doubling the cost per node including the software) and live with half the nodes (doubling the time to complete the project, publish, become famous, get tenure) OR half the graduate students (doubling the amount of work for the remaining hands to do it, which can ALSO double the time in addition to requiring a lot of YOUR time instead of THEIR time) or get the maximum number of nodes and students one can afford and invest some of the students' OC (opportunity cost) time in developing the application from open source tools the best that they can?

This is by no means an imaginary scenario; it is played out every time a group builds a beowulf at all. The group is free to install a Windows cluster of various flavors, or a Solaris cluster, or an Irix cluster. In all of these cases, the cost per flop will double to triple BEFORE getting around to choosing an application layer, but the group gets -- maybe -- some "benefits" in terms of reduced personal time investment from the shrink wrap and commercial support process.
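
To put rough numbers on the trade-off described above, here is a minimal sketch in Python. All of the figures (the budget, the hardware price per node, the per-node license fee) are hypothetical and chosen only so that the license doubles the cost per node, as in the scenario above. The function names are likewise made up for illustration, and the run-time estimate assumes ideal linear speedup, which real parallel codes rarely achieve.

```python
def nodes_affordable(budget, hardware_per_node, software_per_node=0.0):
    """Whole nodes a fixed budget buys when every node costs
    hardware plus an (optional) per-node software license."""
    return int(budget // (hardware_per_node + software_per_node))


def relative_runtime(nodes, baseline_nodes):
    """Time to finish relative to the baseline cluster, ASSUMING ideal
    linear speedup (an optimistic simplification)."""
    return baseline_nodes / nodes


# Hypothetical figures, not taken from any real price list.
BUDGET = 100_000.0            # fixed hardware + software budget
HARDWARE_PER_NODE = 2_000.0   # cost of one node
LICENSE_PER_NODE = 2_000.0    # per-node license that doubles the node cost

open_nodes = nodes_affordable(BUDGET, HARDWARE_PER_NODE)
commercial_nodes = nodes_affordable(BUDGET, HARDWARE_PER_NODE, LICENSE_PER_NODE)
slowdown = relative_runtime(commercial_nodes, open_nodes)

print(f"open source stack: {open_nodes} nodes")
print(f"per-node licensed: {commercial_nodes} nodes")
print(f"relative time to finish with licenses: {slowdown:.1f}x")
```

With these made-up numbers the licensed cluster has half the nodes and, under the linear-speedup assumption, takes twice as long. The sketch deliberately leaves out the value of the students' time, which is exactly the part that is hard to price.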

Or they can use open source Linux or perhaps FreeBSD, and live with investing more OC time to make it all work (well, not so much any more, but five or six years ago the bulk of the list traffic was devoted to DIRECTING the OC time that was very definitely required :-)

Many of us are (still) here on this list because of the choice we made then and continue to make now. For us and our funding scenarios, we can get money for hardware leveraged in part by our clearly demonstrated competence in getting it all working efficiently on our problems WITHOUT another $100K worth of software investment by the granting agency. Or we have a fixed budget and the ivory tower ideal of subsidized OC time -- we teach, have a small grant to pay for hardware, and have the time to develop the code ourselves but not the money to pay others for the code, or at least not very much.

The strength of the open source "movement" is founded upon two things: opportunity cost labor and cost scaling. Linux development does cost some real money; even if 80% of the labor is OC, there is some that isn't. And as noted, there are a LOT of FTE hours in Gnu/Linux and the associated toolset! I personally view it as one of the greatest accomplishments of mankind, quite literally dwarfing the pyramids and the development of nuclear bombs in terms of sheer intellectual investment. I wasn't kidding about it comparing to the moon program -- in terms of sustained investment in time, computer systems from hardware through firmware and up to software are arguably mankind's greatest and most time-intensive achievement -- more time invested, and "expensive" time at that, than nearly anything else we've touched.

But look at the benefits and how they scale! I could NEVER write linux from scratch. Neither could Linus Torvalds.
A cast of thousands, nay, tens of thousands, has written Linux -- maybe hundreds of thousands if you consider the energy that went into Multics, Unix, and all the other early OS's and software suites that eventually found its firmly open source realization in the Gnu/Linux/FreeBSD code base. Still, at this point that energy runs millions of computers and enables them to run billions to trillions of chores for hundreds of millions of people. We run it (and other open source software) because it is so CHEAP by the time scaling is taken into account. So what if one once had to sometimes contribute a bit of extra time to management.

At this point, a computer store I know of runs a linux server to install the preconfigured windows systems they sell because it a) works and b) scales so well! At this point, NOTHING can touch linux for ease of installation and management -- the only factor that limits the number of systems our linux sysadmins can handle is the rate of hardware failure! It takes less time for us to get a bare-nekked box, stick it on the network, and kickstart it into our standard desktop configuration than it would to install a pre-installed (but not configured) WinXX system!

Is it worth it (in the long run) to contribute bug fixes, new GPL tools, and all the rest of the OC time hard core linux users generally invest? Absolutely. If I contribute the code I write (that I need to write anyway) and somebody else contributes THEIR code, we both get to use each other's code. Pretty soon that builds up to a LOT of code, mostly developed with OC labor, cleaned up and packaged with a bit of real money, and managed with a mix of real money and OC labor.

That's why I said -- tough sell for this particular list. We KNOW about the cost scaling of clusters and open source software because that is the foundation of what we do.
Sure, we might buy software if we must, but most of us will knee-jerk reject $50K packages out of hand even if it buys a university wide site license unless its CBA is OVERWHELMINGLY favorable and NO open source tool can even begin to do the task. That is from experience, of course, each of us with our own.

I'd regale the list with stories of the evolution of Duke's mail delivery system (which began with large bodies on campus using such gems as cc-mail, to give away the punchline) to an open source, open standard basis, but it would be boringly familiar to most of us. In the short run, a horrendously expensive institutional site (or cluster) license for a proprietary solution may LOOK scalable and cheap, but historically, in the long run open source open standard solutions end up being MUCH more scalable and MUCH cheaper, even allowing for the OC user/admin participation in fixing its occasional problems. With open source software, at least one CAN try to fix a problem instead of suffering through very expensive downtime...

   rgb

--
Robert G. Brown    http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525  email:rgb at phy.duke.edu
More information about the Beowulf mailing list
