[Beowulf] TORQUE issues

Lance S. Jacobsen lance at gohypersonic.com
Sat Apr 12 19:52:19 PDT 2008


Hi,

I recently put together a small cluster of Xeons running CentOS 5.1 
x86_64.  This cluster is my first real experience with Linux 
administration. It took some learning to install NIS, NFS, etc., but 
the machines now seem to be working well, so I am working on the next 
step: installing a queue scheduler. I decided on TORQUE 2.3.0 since 
it's free and I don't know any better. I have installed it and am 
having trouble getting it to detect my nodes.

I think the problem is that I named them starting with numbers in my 
/etc/hosts file (1of12, 2of12, ... 12of12) instead of something like 
node01, node02, ...

After the installation, TORQUE had not created the 'nodes' file that 
it told me I needed, so after searching the web I found the command to 
create it:

# qmgr -c "create node 2of12"

When I do this it gives me the following reply:

qmgr: syntax error - checklist failed
create node 2of12
                   /\

If I instead name my node with a letter in front (n2of12), it seems to 
work and generates the nodes file.
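
For reference, here is a rough sketch of how the same command could be 
produced for all twelve nodes with a letter prefix. It only prints the 
qmgr commands and does not run them. The "n" prefix and the function 
name print_create_cmds are just my own choices for illustration, not 
anything TORQUE requires.

```shell
# Print (do not run) one qmgr "create node" command per node.
# Assumption: nodes are renamed with an "n" prefix, e.g. n2of12.
print_create_cmds() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'qmgr -c "create node n%sof%s"\n' "$i" "$count"
        i=$((i + 1))
    done
}

print_create_cmds 12
```

The printed lines can be reviewed first and then pasted into a root 
shell by hand.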

If I then run the "pbsnodes -a" command, it tells me:

n2of12

state = down
np =1
ntype = cluster

This seems fine: the node should be down, since there is no n2of12 in 
my hosts file.

If I then rename the node in the nodes file back to 2of12 and type the 
following to stop and restart the server:

# qterm
# pbs_server

I get the following reply:

PBS_Server: pbsd_init(setup_nodes), token "2of12" doesn't start with 
alpha on line 1.

PBS_Server: PBS_Server, pbsd_init failed

I am reluctant to change all of my node names (IP aliases), since 
everything else about my cluster is finally working well, so I have 
been trying to find out why pbsd_init will not accept host names that 
start with numbers. I would also hate to change them if this turns out 
not to be the problem.
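
To make the rule in the error message concrete, here is a small sketch 
that flags names not beginning with a letter. It is only my reading of 
"doesn't start with alpha", not the actual check inside pbsd_init, and 
the function names are made up for this example.

```shell
# Return 0 if the name begins with an ASCII letter, 1 otherwise.
# Assumption: "alpha" in the error message means A-Z or a-z.
starts_with_alpha() {
    case "$1" in
        [A-Za-z]*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Print every given name that would fail the check above.
list_bad_names() {
    for name in "$@"; do
        starts_with_alpha "$name" || printf '%s\n' "$name"
    done
}

list_bad_names 1of12 2of12 n2of12 node01
```

With the names from this post, only 1of12 and 2of12 are listed, which 
matches what qmgr and pbs_server rejected.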

Does anyone know if I might be able to edit the setup files associated 
with pbsd_init to get this to work (or any other ways to do this)?

Thanks,

Lance

-- 
Lance S. Jacobsen, Ph.D.
President
GoHypersonic Incorporated
714 E. Monument Ave., Suite 201
Dayton, OH 45402-1382
Tel: 937-531-6678
Fax: 937-531-6679


