[Beowulf] 512 nodes Myrinet cluster Challanges
Mark Hahn hahn at physics.mcmaster.ca
Fri Apr 28 05:04:53 PDT 2006
> Does anyone know what types of problems/challenges for big clusters?

cooling, power, manageability, reliability, delivering IO, space.

> we are considering having a 512 node cluster that will be using
> Myrinet as its main interconnect, and would like to do our homework

how confident are you at addressing especially the physical issues above? cooling and power happen to be prominent in my awareness right now because of a 768-node cluster I'm working on. but even ~200 node clusters need to have some careful thought applied to manageability (cleaning up dead jobs, making sure the scheduler doesn't let jobs hang around consuming myrinet ports, for instance.)

reliability is a fairly cut and dried issue, IMO - either you make the right hardware decisions at purchase, or not.

> The cluster is meant to run an inhouse fluid simulation application
> that is I/O intensive, and requires large memory models.

what parallel-cluster filesystem are you planning to run? how many fileservers? (or is the IO intensity manageable using per-node disks?)
