[Beowulf] TCP connect error: ECONNREFUSED.
John Hearns
hearnsj at googlemail.com
Tue Mar 31 00:11:52 PDT 2009
2009/3/30 Jörg Saßmannshausen <jorg.sassmannshausen at strath.ac.uk>:
> Dear all,
>
> I am having this rather annoying problem with the parallel execution of one
> of the programs (GAMESS US version) on our cluster. The error message is:
>
Guys and girls,
I am not putting Jörg in the spotlight here; I hope he understands this.
As a general point - I have seen on several mailing lists folks asking
for specific help with a system. That is of course what community
mailing lists are for, and we all can learn from the answers.
However, please, please contact the company who installed the cluster.
They will be happy to provide support for it. I can state
categorically that when working for two leading cluster companies we
always went above and beyond what was strictly required for support,
and would delve into issues like this at a very low level. That is why
you buy a prebuilt and tested cluster with support rather than a pile
of cardboard boxes.
Cluster vendors also keep records of the configuration of systems - so
if you are landed with an 'orphaned' system, or one you are not
familiar with, again, just phone the vendor.
As a personal point, there is nothing worse than being told three
years down the line that a certain system never worked properly if
that comes out of the blue and the end users never reported it or
asked for help.
While I'm on a rant, the staff working for cluster vendors are very
knowledgeable - there is a definite revolving door between academia,
industry and HPC integrators. You might find the guy who visits you to
debug some TCP/IP problem today is sitting beside you next week.