Maui scheduler error
Gabriel J. Weinstock
gabriel.weinstock at dnamerican.com
Tue Apr 2 13:00:03 PST 2002
Hi,
We are having trouble getting the Maui scheduler to work. We have no
problem starting the server/scheduler and drone programs. (For testing, we
are not starting the drone on every node in the cluster; is this a problem?)
The setup is 'mauictl start' on the head node, followed by 'nodectl start'
on two compute nodes. 'showq' works correctly. The log files show all three
nodes running normally right up until a user submits a job, at which
point the server node writes the following to its log file and exits:
- log file -
04/02 15:21:25 (Sched.java:299) iteration 36
04/02 15:21:25 (Wiki.java:392) Wiki loop event
04/02 15:21:25 (BackfillMod.java:147) backfill scheduling
04/02 15:21:25 (ReservationsMod.java:105) handling reservations
04/02 15:21:25 (JobChecker.java:220) checkpointing...
04/02 15:21:25 (Sched.java:311) scheduling interval took 0.016 seconds
04/02 15:21:29 (BasicWorker.java:430) mauisubmit
04/02 15:21:29 (MauiSubmit.java:96) mauisubmit
04/02 15:21:29 (MauiSubmit.java:128) LRM cmdfile
04/02 15:21:29 (CMD.java:280) Removing envvar HOSTNAME
04/02 15:21:29 (CMD.java:280) Removing envvar MACHTYPE
04/02 15:21:29 (CMD.java:280) Removing envvar HOSTTYPE
04/02 15:21:29 (CMD.java:280) Removing envvar OSTYPE
04/02 15:21:29 (CMD.java:280) Removing envvar _
04/02 15:21:29 (MauiMySQL.java:268) Changing romeda's job account to no-account
04/02 15:21:29 (MauiSubmit.java:199) checking job on RM=Node
04/02 15:21:29 (BasicPolicy.java:111) pre debiting bank for 7200 slotsecs for job=romeda:1017778889:0
04/02 15:21:29 (MauiXMLHandlerImpl.java:284) FATAL: org.xml.sax.SAXParseException: Illegal XML character: �.
04/02 15:21:29 (BasicWorker.java:244) Ignoring SAX freak-out: Illegal XML character: �.
04/02 15:21:30 (Sched.java:326) ----------------------------------------------------
04/02 15:21:30 (Sched.java:299) iteration 37
04/02 15:21:30 (Wiki.java:392) Wiki loop event
04/02 15:21:30 (BackfillMod.java:147) backfill scheduling
04/02 15:21:30 (BackfillMod.java:164) contemplating job romeda:1017778889:0
04/02 15:21:30 (Sched.java:330) java.lang.ArrayIndexOutOfBoundsException
java.lang.ArrayIndexOutOfBoundsException
at unm.maui.rm.SimpleMatcher.getNodeAvailSlotIDs(SimpleMatcher.java:563)
at unm.maui.rm.SimpleMatcher.getNodesSlots(SimpleMatcher.java:377)
at unm.maui.rm.SimpleMatcher.getNodesSlots(SimpleMatcher.java:256)
at unm.maui.rm.SimpleMatcher.findNodesSlots(SimpleMatcher.java:79)
at unm.maui.sched.BackfillMod.makeReservation(BackfillMod.java:240)
at unm.maui.sched.BackfillMod.event(BackfillMod.java:169)
at unm.maui.sched.Sched.fireLoop(Sched.java:922)
at unm.maui.sched.Sched.run(Sched.java:306)
at java.lang.Thread.run(Thread.java:484)
04/02 15:21:30 (Sched.java:347) checkpointing scheduler.
04/02 15:21:30 (Wiki.java:385) shutting down RM=Node
04/02 15:21:30 (Sched.java:359) scheduler finished
- end -
If I try to restart the server daemon after this crash, it immediately exits
again with the same error as in iteration 37 (ArrayIndexOutOfBoundsException).
The only way to restart the daemon is to recreate the MySQL database (wiping
whatever was in it). Here is my .cmd file, which I run with 'mauisubmit
maui_job.cmd':
- maui_job.cmd -
IWD == "/tmp"
WCLimit == 3600
Account == "WWGD190053X"
Tasks == 2
Nodes == 2
TaskPerNode == 1
Arch == x86
OS == Linux
JobType == "mpi.ch_gm"
Exec == "/export/mauisched-1.2/bin/runmpi_gm"
Args == "/export/home/romeda/cpi"
Output == "/tmp/$(MAUI_JOB_USER)2x3gm$(MAUI_JOB_ID).out"
Error == "/tmp/$(MAUI_JOB_USER)2x3gm$(MAUI_JOB_ID).err"
Log == "/tmp/$(MAUI_JOB_USER)2x3gm$(MAUI_JOB_ID).log"
Input == "/dev/null"
- end -
Is the XML error related to the ArrayIndexOutOfBoundsException? We compiled
with the Sun JDK 1.3.1-02 and JavaCC 2.1. We could not find any information
about this error on the web. Any help would be greatly appreciated.
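In case it helps narrow down the SAXParseException, here is a minimal sketch
of how a job file could be checked for characters that XML 1.0 does not
allow. It assumes the illegal character originates in maui_job.cmd, which the
log does not confirm (it could equally come from the environment or from a
node's reply). The function names are hypothetical and not part of Maui.

```python
def is_legal_xml_char(ch: str) -> bool:
    """Return True if ch is in the XML 1.0 'Char' production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_illegal_xml_chars(text: str):
    """Return (offset, code point) pairs for characters XML 1.0 rejects."""
    return [(i, ord(ch)) for i, ch in enumerate(text)
            if not is_legal_xml_char(ch)]


def scan_file(path: str):
    """Scan a file byte by byte (hypothetical helper).

    Decoding as latin-1 maps each byte to the code point of the same
    value, so decoding never fails and offsets equal byte offsets.
    Only control bytes are reported; bytes 0x80-0xFF pass this check.
    """
    with open(path, "rb") as f:
        return find_illegal_xml_chars(f.read().decode("latin-1"))
```

Calling scan_file("maui_job.cmd") would return an empty list for a clean
file, or the byte offsets and values of any stray control characters.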
Thanks,
Gabe