[Beowulf] picking out a job scheduler
Nathan Moore
ntmoore at gmail.com
Tue Jan 2 20:55:10 PST 2007
Torque was really easy to install, but it seems like my /etc/hosts
file must be screwed up, as I can't get the cluster nodes to
respond. Specifically, within a cluster of 3 machines, each having
an /etc/hosts file of:
127.0.0.1 localhost.localdomain localhost
199.17.152.17 runner
199.17.152.135 muscovey
199.17.152.13 pekin
(( other workstations follow ))
Now, when I have pbs_server running on runner, and the pbs_mom
daemons running on muscovey, pekin, and runner, I get the following
status message:
[root@runner torque-2.1.6]# pbsnodes -a
pekin
     state = down
     np = 1
     ntype = cluster

muscovey
     state = down
     np = 1
     ntype = cluster

runner
     state = down
     np = 1
     ntype = cluster
I realize this is a pretty low-level question, but what the heck is
wrong with my /etc/hosts file?
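
One way to test the /etc/hosts suspicion directly is to compare what each node name actually resolves to against the table above. The following is only a minimal sketch: the helper names parse_hosts and check_resolution are made up for illustration, and the HOSTS string simply repeats the entries quoted earlier. It would need to be run on each of the three machines.

```python
import socket

def parse_hosts(text):
    """Map each hostname or alias in /etc/hosts-style text to its address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        for name in names:
            table.setdefault(name, address)  # keep the first entry for a name
    return table

def check_resolution(table, names, resolve=socket.gethostbyname):
    """Return {name: (expected, actual)} for names that resolve differently."""
    mismatches = {}
    for name in names:
        expected = table.get(name)
        try:
            actual = resolve(name)
        except OSError:  # covers socket.gaierror for unknown hosts
            actual = None
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches

# Entries copied from the /etc/hosts listing in this message.
HOSTS = """\
127.0.0.1 localhost.localdomain localhost
199.17.152.17 runner
199.17.152.135 muscovey
199.17.152.13 pekin
"""

if __name__ == "__main__":
    table = parse_hosts(HOSTS)
    # An empty dict means every name resolved to the address listed above.
    print(check_resolution(table, ["runner", "muscovey", "pekin"]))
```

If a node's own name comes back as 127.0.0.1, or a name fails to resolve at all, that would point at name resolution; if everything matches, the problem is more likely elsewhere in the Torque configuration.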
regards,
NT
P.S. The troubleshooting message given by Torque is:
[root@runner torque-2.1.6]# momctl -d 3
Host: runner/runner Version: 2.1.6
WARNING: server not specified (set $pbsserver)
PID: 30531
HomeDirectory: /var/spool/torque/mom_priv
MOM active: 2518 seconds
Server Update Interval: 45 seconds
LOGLEVEL: 0 (use SIGUSR1/SIGUSR2 to adjust)
Communication Model: RPP
TCP Timeout: 20 seconds
NOTE: no prolog configured
Alarm Time: 0 of 10 seconds
Trusted Client List: 199.17.152.17,127.0.0.1
Configured to use /usr/bin/scp -rpB
NOTE: no local jobs detected
diagnostics complete
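
The WARNING line in the momctl output above suggests that pbs_mom on runner has not been told which host runs pbs_server. As a sketch of one possible fix, and assuming the MOM reads its settings from a file named config under the HomeDirectory shown above (that is, /var/spool/torque/mom_priv/config) and that runner is the server host, each node running pbs_mom would carry a one-line entry like the following. Both assumptions should be checked against the Torque documentation for this version.

```
$pbsserver runner
```

pbs_mom would most likely need to be restarted on each node before the change takes effect.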
- - - - - - - - - - - - - - - - - - - - - - -
Nathan Moore
Physics
Winona State University
nmoore at winona.edu
AIM:nmoorewsu
- - - - - - - - - - - - - - - - - - - - - - -
On Jan 2, 2007, at 7:23 PM, Chris Samuel wrote:
On Wednesday 03 January 2007 08:06, Chris Dagdigian wrote:
> Both should be fine although if you are considering *PBS you should
> look at both Torque (a fork of OpenPBS I think)
That's correct, it (and ANU-PBS, another fork) seem to be the de facto
queuing systems in the state and national HPC centers down here.
Torque is just *so* much better than OpenPBS used to be (not that it was
particularly hard).
cheers,
Chris
--
Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
Victorian Partnership for Advanced Computing http://www.vpac.org/
Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf