p4_error: net_recv read: probable EOF on socket: 1

Dr. David F. Robinson drobinson at aletheon.com
Sun Jan 27 14:42:44 PST 2002


I am receiving the following error while running my MPI-enabled code:

p4_error: net_recv read:  probable EOF on socket: 1

This error occurs after running the code for several hours using all
processors in my cluster.  I have seen several similar postings on the
web; however, I have not seen any posted solutions.  My configuration
is as follows:

MPICH 1.2.1, compiled with the Portland Group compilers
Scyld 27cz-8 (Red Hat Linux 6.2)
Linux 2.2.19

I have tried to update my eepro100 drivers by downloading and compiling
the netdrivers.tgz file from the Scyld ftp site.  They compiled and
installed fine using 'make' and 'make install'; however, the driver on
the slave nodes has not been updated.  When I reboot the master node and
run dmesg, the master reports the latest driver, but the slave nodes
still boot with the old one.  How do I get the boot image for the slaves
to use the updated modules?  Are my problems caused by the old eepro100
drivers?

Any help is greatly appreciated.

Thanks,
David

--------------------------------------------
David F. Robinson, Ph.D.
Aletheon Technologies
224 Rolling Hills Rd.; Suite 9A
Mooresville, NC 28117
(704) 799-6944 Ext. 11
(704) 799-7974 Fax
www.aletheon.com
drobinson at aletheon.com