[Beowulf] mpich2 complain about nodes that i dont use

Mark Hahn hahn at physics.mcmaster.ca
Fri Sep 30 18:47:46 PDT 2005

> I am using mpich2 on linux cluster, I kept having errors like the following
> rank 14 in job 2  cn128_57798   caused collective abort of all ranks
>   exit status of rank 14: killed by signal 9

Signal 9 is SIGKILL (not SIGSEGV, SIGABRT, etc.), and I'd be a bit surprised
if this happened other than by someone killing the process.
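
To make the signal-number point concrete, here is a minimal Python sketch (not part of the original exchange, and not how mpich2 itself reports status) that starts a throwaway child process, kills it with SIGKILL, and decodes the resulting exit status. The helper names describe_exit and run_and_kill are made up for illustration, and it assumes a POSIX system such as Linux, where SIGKILL is numbered 9.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description.

    On POSIX, subprocess reports death-by-signal as a negative number:
    -N means the child was terminated by signal N.
    """
    if returncode < 0:
        sig = signal.Signals(-returncode)
        return f"killed by signal {sig.value} ({sig.name})"
    return f"exited with status {returncode}"


def run_and_kill():
    """Start a sleeping child, send it SIGKILL, and return its return code."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.send_signal(signal.SIGKILL)  # cannot be caught or ignored
    child.wait()
    return child.returncode


if __name__ == "__main__":
    print(describe_exit(run_and_kill()))
```

On Linux this should report signal 9 by name as SIGKILL, whereas a crash from a bad pointer would show up as SIGSEGV (11) instead.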

