Hi HPC experts,<br><br> I am seeking your advice on resolving a storage (NAS) server's repeated hanging problem.<br><br> We have a 23-node Rocks-5.1 HPC cluster. The Sun storage, with a capacity of 12 TB, is connected to a management server, a Sun Fire X4150 running RHEL 5.3, and this server is connected to a Gigabit switch that provides the cluster's private network. The home directories on the cluster are NFS-mounted from the storage partitions on all nodes, including the master.<br>
<br> This server hangs repeatedly. As an initial troubleshooting step, we installed Ganglia to check network utilization, but it looks normal. We are not sure how to troubleshoot further and resolve the problem. Can anybody help us resolve this issue?<br>
<br>Thanks,<br>Sangamesh<br>