Hello all,<br><br>Thanks for your suggestions.<br>Unfortunately, we lost access to the cluster because of the delay,<br>but I now have useful information for debugging next time.<br><br>Thanks,<br>Sangamesh<br><div class="gmail_quote">
On Thu, Jan 14, 2010 at 10:38 AM, Skylar Thompson <span dir="ltr">&lt;<a href="mailto:skylar@cs.earlham.edu">skylar@cs.earlham.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><div></div><div class="h5">Sangamesh B wrote:<br>
> Hi HPC experts,<br>
><br>
> I seek your advice/suggestions to resolve a storage (NAS) server's<br>
> repeated hanging problem.<br>
><br>
> We have a 23-node Rocks-5.1 HPC cluster. The 12 TB Sun storage is<br>
> attached to a management server, a Sun Fire X4150 running RHEL 5.3,<br>
> and this server is connected to the Gigabit switch that provides the<br>
> cluster's private network. The home directories on the cluster are<br>
> NFS-mounted from the storage partitions on all nodes, including the<br>
> master.<br>
><br>
> This server hangs repeatedly. As initial troubleshooting, we<br>
> installed Ganglia to check network utilization, but it looks normal.<br>
> We don't know how to troubleshoot further and resolve the problem. Can<br>
> anybody help us resolve this issue?<br>
</div></div>Is there anything amiss according to the service processor?<br>
<font color="#888888"><br>
--<br>
-- Skylar Thompson (<a href="mailto:skylar@cs.earlham.edu">skylar@cs.earlham.edu</a>)<br>
-- <a href="http://www.cs.earlham.edu/%7Eskylar/" target="_blank">http://www.cs.earlham.edu/~skylar/</a><br>
<br>
<br>
</font></blockquote></div><br>