PBS Scheduler

Ivan Oleynik oleynik at chuma.cas.usf.edu
Fri Sep 27 04:27:30 PDT 2002


Hi,

I have a problem with the PBS scheduler: every time I run an I/O-intensive
series of jobs, it goes down. As a result, the whole PBS queue, including
other jobs, becomes suspended.

I could not find any useful information in the sched_logs and server_logs
files, apart from uninformative messages such as:

0001;PBS_Server;Svr;PBS_Server;Connection refused (111) in contact_sched,Could not contact Scheduler
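
Error 111 in the message above is the Linux ECONNREFUSED code, which usually
means nothing is listening on the scheduler's TCP port. A minimal sketch for
telling a refused connection apart from an unreachable host follows. The
host "localhost" and port 15004 (the conventional pbs_sched default) are
assumptions, and probe_port is a hypothetical helper, not part of PBS.

```python
import errno
import socket


def probe_port(host, port, timeout=2.0):
    """Classify a TCP port as 'open', 'refused', or 'unreachable'."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns an errno value instead of raising on failure.
        err = sock.connect_ex((host, port))
    except OSError:
        # Raised, for example, when the host name cannot be resolved.
        return "unreachable"
    finally:
        sock.close()
    if err == 0:
        return "open"
    if err == errno.ECONNREFUSED:
        # Same condition the server log reports as "Connection refused".
        return "refused"
    return "unreachable"


if __name__ == "__main__":
    # Assumed values: adjust the host and port to match the local setup.
    print(probe_port("localhost", 15004))
```

If this reports "refused" on the server host while jobs are running, the
scheduler process has most likely exited or stopped listening.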

For this particular test I ran a batch of MPICH jobs, each requesting just 1
processor, and the number of submitted jobs was 6 times the number of
available nodes. Each job does intensive I/O via NFS running over Myrinet
(writing files of ~300 MB each).

When I run jobs with less intensive I/O, everything seems to work fine.

I would very much appreciate it if someone could give me a hint as to what
might be causing this strange behaviour.

Thanks,

Ivan Oleynik

------------------------------------------------------------------------
Ivan I. Oleynik                       E-mail : oleynik at chuma.cas.usf.edu
Department of Physics
University of South Florida
4202 East Fowler Avenue                  Tel : (813) 974-8186
Tampa, Florida 33620-5700                Fax : (813) 974-5813
------------------------------------------------------------------------




