Cluster and RAID 5 Array bottleneck.( I believe)
Leonardo Magallon leo.magallon at grantgeo.com
Thu Mar 15 09:07:43 PST 2001
- Previous message: Mysterious kernel hangs
- Next message: Cluster and RAID 5 Array bottleneck.( I believe)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi all,

We finally finished upgrading our Beowulf from 48 to 108 processors and also added a 523 GB RAID-5 system to provide a mount point for all of our "drones". We went with standard metal shelves that cost about $40 installed.

Our setup has one machine, from which we launch jobs, with the RAID array attached via an Adaptec 39160 card (160 MB/s transfer rate). We export /home and /array (the disk array mount point) from this computer to all the other machines. They then use /home to execute the app and /array to read and write to the array over NFS. The machine with the array talks over a SysKonnect gig-e card connected directly to a port on a switch, which then interconnects to the others. The "drones" are connected to the switches via Intel EtherExpress cards running Fast Ethernet.

Our problem is that this setup is apparently not performing well, and we seem to have a bottleneck either at the array or at the network level. At the network level, I have changed the buffer sizes NFS uses to pass blocks of data in this way:

    echo 262144 > /proc/sys/net/core/rmem_default
    echo 262144 > /proc/sys/net/core/rmem_max
    /etc/rc.d/init.d/nfs restart
    echo 65536 > /proc/sys/net/core/rmem_default
    echo 65536 > /proc/sys/net/core/rmem_max

Our mounts are also set to use 8192 as the read and write block size.

When we start our job here, the switch passes no more than 31mb/s at any moment. A colleague of mine says the problem is at the network level. I think it is at the array level, because the lights on the array stay steadily on and the switch is not even at 25% utilization. Attaching a console to the array is mainly for setting up drives, not for monitoring.

My colleague also copied 175 megabytes over NFS from one computer to another, and the transfer took close to 45 seconds.

Any comments or suggestions welcome,
Leo.
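The figures in the post can be turned into rough rates, and the array can be timed separately from the network. The Python sketch below is not part of the original message. It assumes decimal megabytes (the post does not say which unit is meant), and the function names `throughput_mb_per_s`, `to_mbit_per_s`, and `time_sequential_write` are hypothetical helpers introduced here for illustration.

```python
import os
import tempfile
import time

MB = 1_000_000  # assumption: decimal megabytes; the post does not specify


def throughput_mb_per_s(num_bytes, seconds):
    """Average transfer rate in megabytes per second."""
    return num_bytes / MB / seconds


def to_mbit_per_s(mb_per_s):
    """Convert megabytes per second to megabits per second."""
    return mb_per_s * 8


def time_sequential_write(directory, size_mb=64, block_kb=8):
    """Write about size_mb of zeros to a temp file in `directory`; return MB/s.

    block_kb defaults to 8 to mirror the 8192-byte NFS block size in the post.
    fsync is called so the timing includes flushing data out of the page cache.
    """
    block = b"\0" * (block_kb * 1024)
    blocks = (size_mb * MB) // len(block)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the scratch file
    return throughput_mb_per_s(blocks * len(block), elapsed)


if __name__ == "__main__":
    # The 175 MB copy that took close to 45 seconds.
    copy_rate = throughput_mb_per_s(175 * MB, 45)
    print(f"NFS copy: {copy_rate:.1f} MB/s = {to_mbit_per_s(copy_rate):.1f} Mbit/s")
    # Nominal Fast Ethernet line rate (100 Mbit/s) for comparison.
    print(f"Fast Ethernet line rate: {100 / 8:.1f} MB/s")
```

Under these assumptions the copy works out to roughly 3.9 MB/s, well below the 12.5 MB/s nominal rate of a single Fast Ethernet link. One way to separate the two suspects would be to call `time_sequential_write("/array")` once on the machine that owns the array and once on a drone over NFS: if the local figure is also low, the array is the more likely bottleneck; if only the NFS figure is low, the network or NFS settings are.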
More information about the Beowulf mailing list
