PBS Scheduler
Ivan Oleynik
oleynik at chuma.cas.usf.edu
Fri Sep 27 04:27:30 PDT 2002
Hi,
I have a problem with the PBS scheduler: every time I run an I/O-intensive
series of jobs, it goes down. As a result, the whole PBS queue, including
other jobs, becomes suspended.
I could not find any useful information in the sched_logs and server_logs
files, apart from uninformative messages such as:
0001;PBS_Server;Svr;PBS_Server;Connection refused (111) in contact_sched,Could not contact Scheduler
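For readers hitting the same message: on Linux, errno 111 is ECONNREFUSED, which usually means that nothing was listening on the scheduler's port when the server tried to connect. A minimal sketch of a check for this follows. The helper name probe_tcp_port is hypothetical, and the port to probe is an assumption that depends on the local PBS configuration.

```python
import errno
import socket


def probe_tcp_port(host, port, timeout=2.0):
    """Try a TCP connect and return "open", "refused", or "error: ..."."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns 0 on success, or an errno value on failure
        code = sock.connect_ex((host, port))
    except OSError as exc:
        # e.g. a timeout or a host name that cannot be resolved
        return "error: %s" % exc
    finally:
        sock.close()
    if code == 0:
        return "open"
    if code == errno.ECONNREFUSED:
        # the same condition the server log reports as (111) on Linux
        return "refused"
    return "error: %s" % errno.errorcode.get(code, code)
```

Calling probe_tcp_port("localhost", 15004) on the server node, where 15004 is assumed to be the scheduler port (check the local setup), and getting "refused" would suggest that the scheduler process has exited rather than merely stalled.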
For this particular test I ran a batch of MPICH jobs, each requesting just 1
processor, and the number of submitted jobs was 6 times the number of
available nodes. Each job does intensive I/O via NFS running over Myrinet
(writing files of ~300 MB each).
When I run jobs with less intensive I/O, everything seems to be all right.
I would very much appreciate it if someone could give a hint as to what
could be causing this strange behaviour.
Thanks,
Ivan Oleynik
------------------------------------------------------------------------
Ivan I. Oleynik E-mail : oleynik at chuma.cas.usf.edu
Department of Physics
University of South Florida
4202 East Fowler Avenue Tel : (813) 974-8186
Tampa, Florida 33620-5700 Fax : (813) 974-5813
------------------------------------------------------------------------