(Stress-)Testing of nodes in a beowulf cluster
Steven Timm timm at fnal.gov
Mon Mar 12 06:05:40 PST 2001
Our cluster in the Fermilab PC Farms is not a true Beowulf, but we do run
an extensive 30-day stress test. The test consists of running SETI@home
continuously for 30 days on both CPUs; in addition, every hour on the hour
we use "bonnie" to write a 1 GB test file to each disk and "nettest" to
push 400 MB over the network at the same time. "Streams" could be added to
this as well.

Utilities are also starting to appear that can read the event logs in the
BIOS, which record any memory faults or power supply faults.

In our experience, power supplies are the most likely thing to go bad in
the first 30 days, and sometimes you get a bad batch of memory too. The
stress test above makes the machine draw close to the highest current it
will ever draw, so a power supply that is going to die will do so quickly.

------------------------------------------------------------------
Steven C. Timm (630) 840-8525 timm at fnal.gov http://home.fnal.gov/~timm/
Fermilab Computing Division/Operating Systems Support
Scientific Computing Support Group--Computing Farms Operations

On Mon, 12 Mar 2001 Kian_Chang_Low at vdgc.com.sg wrote:

> Hi,
>
> I have been playing with beowulf cluster for quite a while and have put
> together a small cluster as a test to show that it can be done.
>
> Now I am faced with a question about the reliability of the nodes (slave
> or/and master). Is there any tests (or stress-tests) that we can run to
> check the reliability of the following,
> 1) CPU
> 2) memory
> 3) network interface card
> 4) disk
> 5) motherboard
> 6) any other?!
>
> I heard of using memtest to test the memory. But what about tests for the
> other components?
>
> I thought it will be great if there is a suite of tests that the node has
> to undergo before being added to the cluster. Rather than trying to
> determine the cause of failure after putting the cluster together, we at
> least know that a node is downright faulty from the beginning.
>
> Thanks,
> Kian Chang.
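As a rough illustration of the "every hour on the hour" disk step in the reply above, here is a minimal Python sketch. It is not the Fermilab procedure and does not call "bonnie" or "nettest"; the write-and-verify routine is a simple stand-in, and all names in it (seconds_until_next_hour, write_and_verify, run_burn_in, the burnin.tmp file name) are hypothetical. Only the 30-day duration and the 1 GB file size come from the message.

```python
import hashlib
import os
import time
from datetime import datetime, timedelta


def seconds_until_next_hour(now: datetime) -> float:
    """Return the number of seconds from `now` to the next top of the hour."""
    next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return (next_hour - now).total_seconds()


def write_and_verify(path: str, total_bytes: int, chunk_bytes: int = 1 << 20) -> bool:
    """Write random data to `path`, read it back, and compare SHA-256 digests.

    Stand-in for a real disk benchmark. The read-back may be served from the
    page cache, so this mainly exercises the write path.
    """
    written = hashlib.sha256()
    remaining = total_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            chunk = os.urandom(min(chunk_bytes, remaining))
            f.write(chunk)
            written.update(chunk)
            remaining -= len(chunk)
        f.flush()
        os.fsync(f.fileno())  # push the data out to the device

    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_bytes), b""):
            read_back.update(chunk)
    return written.digest() == read_back.digest()


def run_burn_in(test_dirs, days: int = 30, file_bytes: int = 1 << 30) -> None:
    """Once per hour, on the hour, write a test file to each directory."""
    deadline = datetime.now() + timedelta(days=days)
    while datetime.now() < deadline:
        time.sleep(seconds_until_next_hour(datetime.now()))
        for d in test_dirs:
            path = os.path.join(d, "burnin.tmp")  # hypothetical file name
            try:
                ok = write_and_verify(path, file_bytes)
            finally:
                if os.path.exists(path):
                    os.remove(path)
            print(datetime.now().isoformat(), d, "OK" if ok else "MISMATCH")
```

Calling run_burn_in with one directory per physical disk (for example a hypothetical ["/mnt/disk1", "/mnt/disk2"]) would cover the disk part only; the CPU load and the network transfer described in the message would have to run alongside it.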
