[Beowulf] First cluster in 20 years - questions about today

Jonathan Douglas Engwall engwalljonathanthereal at gmail.com
Mon Feb 3 16:14:00 PST 2020

At least get a 4U, 4-CPU rack machine, and don't even think of any GPU older than 2016. Better yet, look toward a Dell T440. They get sold off when the warranty starts to run out, practically new, at prices that will shock you if you have paid for a laptop or phone recently.
Industrial cast-offs are the way to go. You can try whatever you like that way, because you are running solid equipment.
You can't protect yourself when a developer drops support for this or that if everything you use is older, an edge case, or your own handiwork, professional though it may be.

Jonathan Engwall

On February 1, 2020, at 9:21 PM, Mark Kosmowski <mark.kosmowski at gmail.com> wrote:

I've been out of computation for about 20 years, since my master's degree. I'm getting into the game again as a private individual. When I was active, Opteron had just launched. I was an early adopter of amd64 because I needed the RAM (maybe more accurately, I needed to thoroughly thrash my swap drives). I never needed any cluster management software with my 3-node, dual-socket, single-core little baby Beowulf. My planned domain is computational chemistry, and I'm hoping to get to a point where I can do ab initio catalyst surface reaction modeling of small molecules (not biomolecules).

I'm planning to add a few nodes, and the cluster will end up being fairly heterogeneous. My initial plan is to add two or three multi-socket, multi-core nodes as well as a 48-port gigabit switch. How should I assess whether to have one big heterogeneous cluster vs. two smaller quasi-homogeneous clusters?

Will it be worthwhile to learn a cluster management package? If so, suggestions?

Should I consider Solaris or illumos?  I do plan on using ZFS, especially for the data node, but I want as much redundancy as I can get, since I'm going to be using used hardware.  Will the fancy Solaris cluster tools be useful?

Also, once I get running, and while I'm getting current with theory and software, may I inquire here about taking on a small, low-priority academic project to make sure the cluster side is working well?

Thank you all for still being here!
