PBS Scheduler


Ivan Oleynik oleynik at chuma.cas.usf.edu
Fri Sep 27 04:27:30 PDT 2002


Hi,

I have a problem with the PBS scheduler: every time I run an IO-intensive
series of jobs, it goes down. As a result, the whole PBS queue, including the
other jobs, becomes suspended.

I could not find any useful information in the sched_logs and server_logs
files, apart from uninformative messages such as:

0001;PBS_Server;Svr;PBS_Server;Connection refused (111) in contact_sched,Could not contact Scheduler
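
For anyone trying to narrow down when the scheduler drops out, here is a minimal sketch of how messages like the one above might be pulled out of a server log. It assumes only what the quoted line shows: fields separated by ';' with the message text last. The function name scheduler_contact_failures and the sample list are hypothetical and not part of PBS.

```python
from typing import Iterable, List

# Text taken from the log message quoted in the post.
MARKER = "Could not contact Scheduler"


def scheduler_contact_failures(lines: Iterable[str]) -> List[str]:
    """Return the message field of every line reporting a failed scheduler contact."""
    failures = []
    for line in lines:
        line = line.rstrip("\n")
        if MARKER in line:
            # Assumption: fields are separated by ';' and the message is the last field.
            failures.append(line.split(";")[-1])
    return failures


# Sample input: the single line quoted in the post.
sample = [
    "0001;PBS_Server;Svr;PBS_Server;Connection refused (111) in contact_sched,"
    "Could not contact Scheduler",
]
failures = scheduler_contact_failures(sample)
print(len(failures), "failed scheduler contact(s)")
```

In practice the function could be given an open file object for the server log instead of the sample list. Comparing the matching lines against job start times may show whether the failures coincide with the IO-heavy jobs.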

For this particular test I ran a batch of MPICH jobs, each requesting just 1
processor, and the number of submitted jobs was 6 times the number of
available nodes. Each job does intensive IO via NFS running over Myrinet
(writing files of ~300 Mb each).

When I run jobs with less intensive IO, everything seems to be all right.

I would very much appreciate it if someone could give me a hint as to what
could be the reason for this strange behaviour.

Thanks,

Ivan Oleynik

------------------------------------------------------------------------
Ivan I. Oleynik                       E-mail : oleynik at chuma.cas.usf.edu
Department of Physics
University of South Florida
4202 East Fowler Avenue                  Tel : (813) 974-8186
Tampa, Florida 33620-5700                Fax : (813) 974-5813
------------------------------------------------------------------------




