[Beowulf] TORQUE issues
Lance S. Jacobsen
lance at gohypersonic.com
Sat Apr 12 19:52:19 PDT 2008
Hi,
I recently put together a small cluster of Xeons running CentOS 5.1
x86_64. This cluster is my first real experience with Linux
administration. It took some learning to install NIS, NFS, etc., but
the machines now seem to be working well, so I am working on the next
step: installing a queue scheduler. I decided on TORQUE 2.3.0 since
it's free and I don't know any better. I have installed it and am
having trouble getting it to detect my nodes.
I think the problem is that I gave the nodes names starting with numbers
in my /etc/hosts file (1of12, 2of12, ..., 12of12) instead of something
like node01, node02, ...
After the installation, TORQUE did not create a file called 'nodes',
which it told me I needed, so after searching the web I found the
command to create it:
# qmgr -c "create node 2of12"
When I do this it gives me the following reply:
qmgr: syntax error - checklist failed
create node 2of12
/\
If I instead name the node with a letter in front (n2of12), it seems to
work and generates the nodes file.
If I then run "pbsnodes -a", it tells me:
n2of12
state = down
np = 1
ntype = cluster
That seems fine: it should be down, since there is no n2of12 in my hosts
file. If I then rename the node in the nodes file back to 2of12 and
run the following to stop and restart the server:
# qterm
# pbs_server
I get the following reply:
PBS_Server: pbsd_init(setup_nodes), token "2of12" doesn't start with
alpha on line 1.
PBS_Server: PBS_Server, pbsd_init failed
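For checking a whole nodes file in one pass, here is a minimal Python sketch that flags every entry whose name does not begin with an ASCII letter. It is based only on the wording of the error above ("doesn't start with alpha"); the actual check inside pbsd_init may differ. The function name and the sample text are made up for illustration, and the sketch assumes the node name is the first whitespace-separated field on each line.

```python
import string


def non_alpha_node_names(nodes_text):
    """Return (line_number, name) pairs whose name does not start with a letter.

    Assumption: the node name is the first whitespace-separated field on
    each non-blank line of the nodes file.
    """
    flagged = []
    for lineno, line in enumerate(nodes_text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name = fields[0]
        if name[0] not in string.ascii_letters:
            flagged.append((lineno, name))
    return flagged


if __name__ == "__main__":
    # Hypothetical nodes-file contents using the names from this post.
    sample = "2of12 np=1\nn2of12 np=1\n"
    for lineno, name in non_alpha_node_names(sample):
        print(f"line {lineno}: {name!r} does not start with a letter")
```

To run it against a real file, read the file's text and pass it in; the location of the nodes file depends on the installation.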
Now I am reluctant to change all of my node names (IP aliases), since
everything else about my cluster is finally working well, so I have
been trying to find out why pbsd_init will not accept host names that
start with numbers. I would also hate to change them if that is not the
problem.
Does anyone know if I might be able to edit the setup files associated
with pbsd_init to get this to work (or any other ways to do this)?
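One possible way to avoid renaming anything, sketched here as an untested idea: /etc/hosts allows several names per address, so a letter-prefixed alias (such as the n2of12 form tried above) could be added next to each existing name. Whether TORQUE then resolves the alias the way it needs to is an assumption that this post does not confirm. The function name, the "n" prefix default, and the sample addresses below are hypothetical.

```python
import string


def add_letter_aliases(hosts_text, prefix="n"):
    """Append a letter-prefixed alias for each host name that starts with a non-letter.

    Lines are expected in the usual hosts format: address, then one or
    more names, optionally followed by a '#' comment.
    """
    out = []
    for line in hosts_text.splitlines():
        body, sep, comment = line.partition("#")
        fields = body.split()
        if len(fields) < 2:
            out.append(line)  # blank, comment-only, or malformed line: keep as-is
            continue
        names = fields[1:]
        extra = [
            prefix + name
            for name in names
            if name[0] not in string.ascii_letters and prefix + name not in names
        ]
        if not extra:
            out.append(line)
            continue
        new_line = " ".join(fields + extra)
        if sep:
            new_line += " #" + comment  # preserve any trailing comment
        out.append(new_line)
    return "\n".join(out)


if __name__ == "__main__":
    # Hypothetical addresses; only the host names come from this post.
    sample = "192.168.0.1 1of12\n192.168.0.2 2of12\n127.0.0.1 localhost"
    print(add_letter_aliases(sample))
```

The sketch only prints the modified text rather than writing the file, so the result can be reviewed before anything on the cluster is changed. Running it twice on its own output adds nothing new, since existing aliases are detected.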
Thanks,
Lance
--
Lance S. Jacobsen, Ph.D.
President
GoHypersonic Incorporated
714 E. Monument Ave., Suite 201
Dayton, OH 45402-1382
Tel: 937-531-6678
Fax: 937-531-6679