[Beowulf]
William Burke
wburke999 at msn.com
Sat Mar 26 18:31:22 PST 2005
Reuti,
>> I'd suggest moving over to the SGE users list at:
>> http://gridengine.sunsource.net/servlets/ProjectMailingListList
I have, but I do not see my name yet. How long does the verification process take?
>> Although there is a special Myrinet directory, you can also try to use
>> the files in the mpi directory instead.
The mpi directory's mpich.template doesn't use mpirun.ch_gm, so how does it
know which version of mpirun to use? If I use the mpi directory, what changes
do I have to make?
>> Can you please give more details of your queue and PE setup (qconf -sq/sp
>> output)
SEE BELOW
>> Do you have an admin account for SGE? I'd prefer not to do anything in
>> SGE as root.
Yes, it's grid... SEE BELOW
>> Not really an issue, but you have to make a small change to
>> mpirun.ch_gm.pl so that all processes stay in the same process group and
>> get correctly killed in case of a job abort:
I have to double check that in:
http://gridengine.sunsource.net/howto/mpich-integration.html
Here is the new problem I am having with the PE:
My jobs won't run. When I run my script, it goes into pending mode for about
10 seconds (status qw), SGE submits it to N hosts (status t), the jobs hang
in status t, then quickly exit. When I investigated the
Jobscript_name.{pe|po}JobID output, it stated that SGE can't make links in
the /WEMS/grid/tmp/549.1.Production.q/ directory.
It looks like the startmpi.sh script links files into $TMPDIR, and from my
understanding the value of $TMPDIR is derived from the tmpdir parameter in
the queue's configuration. I have set this attribute to '/WEMS/grid/tmp/',
but according to the error log qsub_wrf.sh.pe549 it is
'/WEMS/grid/tmp/549.1.Production.q/'. Possibly the source of the problem is
here, so what created the '549.1.Production.q' suffix? (Judging from the
other runs in CHECK 6, it looks like <job ID>.<task ID>.<queue name>.)
I then checked the permissions of /WEMS/grid/tmp:
[wems at wems grid]$ ls -ltr /WEMS/grid | grep tmp
drwxrwxrwx 2 root root 4096 Mar 26 17:34 tmp
As a sanity check, within startmpi.sh I echo out an ls -ltr of $TMPDIR:
drwxr-xr-x 2 65534 65534 4096 Mar 26 2005 549.1.Production.q
As expected, there is no UID/GID 65534 in my /etc/passwd. Furthermore, only
UID/GID 65534 has write permission, so if it (N1GE) is the only one reading
and writing this directory, what else could be preventing writes into it? I
thought maybe there was a lock file in /WEMS/grid/tmp, so I checked:
[wems at wems tmp]$ ls -al /WEMS/grid/tmp
total 8
drwxrwxrwx 2 root root 4096 Mar 26 17:34 .
drwxr-xr-x 22 grid grid 4096 Mar 26 04:20 ..
To no avail, so I am out of ideas. Is this a known issue when using Myrinet,
MPICH, and tight integration, or am I overlooking something? I am using the
sge_mpirun script instead of the mpirun script. Have you seen a problem
like this before?
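For what it is worth, here is a minimal serial test script I plan to try next
(just a sketch; the file name tmpdir_test.sh is arbitrary) to see whether
writing into the job-specific $TMPDIR fails for every job or only when the PE
start script runs:

#!/bin/sh
#$ -S /bin/sh
#$ -q Production.q
# show the scratch directory SGE created for this job and try to write into it
echo "TMPDIR is: $TMPDIR"
ls -ld "$TMPDIR"
touch "$TMPDIR/write_test" && echo "write OK" || echo "write FAILED"

If even this plain qsub job (no -pe) gets 'Permission denied', the problem is
in the tmpdir handling itself and not in startmpi.sh.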
I also suspect that the editor may be reading the mpich PE configuration's
start_proc_args argument incorrectly, since it wraps the string of arguments
onto the next line (CHECK 5); according to the
/WEMS/wems/data/WRF/wni001a/log/050811200.wrf.pbs file (CHECK 7), the job
reports "The mpirun command "\" does not exist".
SEE CHECK 5, CHECK 8, then CHECK 7 BELOW.
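If the backslash continuations really are the problem, I assume the workaround
is to keep the whole start_proc_args value on a single line when editing the
PE (sketch only, using the same paths as CHECK 5):

start_proc_args    /WEMS/grid/mpi/myrinet/startmpi.sh -catch_rsh /WEMS/grid/wems-hosts2 /WEMS/pkgs/mpich-gm-1.2.6.14a/bin/mpirun.ch_gm

and then confirm with 'qconf -sp mpich' that the value comes back unwrapped.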
Oh yeah, this may be a silly question, but where does SGE get $pe_hostfile
and $TMPDIR from, and what is the process by which it acquires these
variables? I would like some clarification.
Thanks,
William
Things that I checked
CHECK 0.5
[root at wems wrfprd]# cat qsub_wrf.sh
#!/bin/sh
#$ -S /bin/ksh
#$ -pe mpich 32
#$ -l h_rt=10800
#$ -q Production.q
#
#. /WEMS/wems/external/WRF/wrfsi/etc/setup-mpi.sh
cd /WEMS/wems/data/WRF/wni001a/wrfprd
echo 'This is the job ID '$JOB_ID > /WEMS/wems/data/WRF/wni001a/log/050811200.wrf.pbs
echo 'This is the pe_hostfile '$PE_HOSTFILE >> /WEMS/wems/data/WRF/wni001a/log/050811200.wrf.ps
echo 'This is the tmpdir '$TMPDIR >> /WEMS/wems/data/WRF/wni001a/log/050811200.wrf.ps
/WEMS/grid/mpi/myrinet/sge_mpirun /WEMS/wems/external/WRF/wrfsi/../run/wrf.exe >> /WEMS/wems/data/WRF/wni001a/log/050811200.wrf.pbs 2>&1
CHECK 1
[wems at wems wems]$ qsub -pe mpich 32 -P test -q Production.q /WEMS/wems/data/WRF/wni001a/wrfprd/qsub_wrf.sh
CHECK 2
[wems at wems grid]$ cat qsub_wrf.sh.pe549
ln: creating symbolic link `/WEMS/grid/tmp/549.1.Production.q/mpirun.sge' to `/WEMS/pkgs/mpich-gm-1.2.6.14a/bin/mpirun.ch_gm': Permission denied
/WEMS/grid/mpi/myrinet/startmpi.sh[142]: cannot create /WEMS/grid/tmp/549.1.Production.q/machines: Permission denied
cat: /WEMS/grid/tmp/549.1.Production.q/machines: No such file or directory
ln: creating symbolic link `/WEMS/grid/tmp/549.1.Production.q/rsh' to `/WEMS/grid/mpi/rsh': Permission denied
CHECK 3
[wems at wems grid]$ cat qsub_wrf.sh.po549
-catch_rsh /WEMS/grid/wems-hosts2
/WEMS/pkgs/mpich-gm-1.2.6.14a/bin/mpirun.ch_gm
this is the value of mpirun /WEMS/pkgs/mpich-gm-1.2.6.14a/bin/mpirun.ch_gm
I am doing a ls -ltr on $TMPDIR
total 4
drwxr-xr-x 2 65534 65534 4096 Mar 26 2005 549.1.Production.q
Machine file is /WEMS/grid/tmp/549.1.Production.q/machines
CHECK 4
[wems at wems grid]$ cat Queue-config
qname Production.q
hostlist @Parallel
seq_no 0
load_thresholds np_load_avg=1.75
suspend_thresholds NONE
nsuspend 1
suspend_interval 00:05:00
priority 0
min_cpu_interval 00:05:00
processors 2
qtype BATCH
ckpt_list NONE
pe_list mpich
rerun FALSE
slots 2
tmpdir /WEMS/grid/tmp
shell /bin/ksh
prolog NONE
epilog NONE
shell_start_mode posix_compliant
starter_method NONE
suspend_method NONE
resume_method NONE
terminate_method NONE
notify 00:00:60
owner_list NONE
user_lists Test_A
xuser_lists NONE
subordinate_list NONE
complex_values NONE
projects test
xprojects NONE
calendar NONE
initial_state default
s_rt INFINITY
h_rt INFINITY
s_cpu INFINITY
h_cpu INFINITY
s_fsize INFINITY
h_fsize INFINITY
s_data INFINITY
h_data INFINITY
s_stack INFINITY
h_stack INFINITY
s_core INFINITY
h_core INFINITY
s_rss INFINITY
h_rss INFINITY
s_vmem INFINITY
h_vmem INFINITY
CHECK 5
[wems at wems grid]$ cat mpich-PE-config
pe_name mpich
slots 78
user_lists Test_A
xuser_lists NONE
start_proc_args /WEMS/grid/mpi/myrinet/startmpi.sh -catch_rsh \
/WEMS/grid/wems-hosts2 \
/WEMS/pkgs/mpich-gm-1.2.6.14a/bin/mpirun.ch_gm
stop_proc_args /WEMS/grid/mpi/myrinet/stopmpi.sh
allocation_rule $fill_up
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
CHECK 6
[wems at wems wems]# cat /WEMS/wems/data/WRF/wni001a/log/050811200.wrf.ps
This is the pe_hostfile
/WEMS/grid/default/spool/wems18/active_jobs/388.1/pe_hostfile
This is the tmpdir /WEMS/grid/tmp/388.1.Production.q
This is the pe_hostfile
/WEMS/grid/default/spool/wems07/active_jobs/389.1/pe_hostfile
This is the tmpdir /WEMS/grid/tmp//389.1.Production.q
This is the pe_hostfile
/WEMS/grid/default/spool/wems24/active_jobs/390.1/pe_hostfile
This is the tmpdir /WEMS/grid/tmp/398.1.Production.q
This is the pe_hostfile
/WEMS/grid/default/spool/wems22/active_jobs/549.1/pe_hostfile
This is the tmpdir /WEMS/grid/tmp/549.1.Production.q
This is the pe_hostfile
This is the tmpdir
CHECK 7
[wems at wems wems]$ cat /WEMS/wems/data/WRF/wni001a/log/050811200.wrf.pbs
This is the job ID 549
The mpirun command "\" does not exist
There must be a problem with the mpich parallel environment
CHECK 8
[root at wems wrfprd]# cat qsub_wrf.sh
#!/bin/sh
#$ -S /bin/ksh
#$ -pe mpich 32
#$ -l h_rt=10800
#$ -q Production.q
#
#. /WEMS/wems/external/WRF/wrfsi/etc/setup-mpi.sh
cd /WEMS/wems/data/WRF/wni001a/wrfprd
echo 'This is the job ID '$JOB_ID > /WEMS/wems/data/WRF/wni001a/log/050811200.wrf.pbs
echo 'This is the pe_hostfile '$PE_HOSTFILE >> /WEMS/wems/data/WRF/wni001a/log/050811200.wrf.ps
echo 'This is the tmpdir '$TMPDIR >> /WEMS/wems/data/WRF/wni001a/log/050811200.wrf.ps
/WEMS/grid/mpi/myrinet/sge_mpirun /WEMS/wems/external/WRF/wrfsi/../run/wrf.exe >> /WEMS/wems/data/WRF/wni001a/log/050811200.wrf.pbs 2>&1
exit
-----Original Message-----
From: Reuti [mailto:reuti at staff.uni-marburg.de]
Sent: Wednesday, March 23, 2005 6:26 PM
To: William Burke
Cc: beowulf at beowulf.org
Subject: Re: [Beowulf]
Hi,
I'd suggest moving over to the SGE users list at:
http://gridengine.sunsource.net/servlets/ProjectMailingListList
But anyway, let's sort the things out:
Quoting William Burke <wburke999 at msn.com>:
> I can't get the PE to work on a 50-node class II Beowulf. It has a front-end
> Sunfire v40 (qmaster host) and 49 Sunfire v20s (execution hosts) running
> Linux, configured to communicate data over Myrinet using MPICH-GM version
> 1.2.6.14a.
Although there is a special Myrinet directory, you can also try to use the
files in the mpi directory instead.
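From memory (so treat this only as a rough sketch, with the path prefix
adjusted to wherever your mpi directory lives), the generic template wires up
the PE start/stop scripts like this, letting startmpi.sh build the machines
file from the hostfile SGE passes in:

start_proc_args    /WEMS/grid/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args     /WEMS/grid/mpi/stopmpi.sh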
> These are the requirements the N1GE environment has to handle:
>
> 1. Serial-type jobs for pre-processing the data - average runtime 15
> minutes.
> 2. Output is pipelined into parallel processing jobs - runtime range of
> 1-6 hours.
> 3. Post-processing serial jobs that run concurrently.
>
> I have set up a Parallel Environment called mpich-gm and a straightforward
> FIFO scheduling scheme for testing. When I submit parallel jobs, they hang
> in limbo in a 'qw' state pending submission. I am not sure why the scheduler
> does not see the jobs that I submit.
>
>
>
> I used the Myrinet mpich template located in the $SGE_ROOT/<sge_cell>/mpi/myrinet
> directory to configure the PE (parallel environment), plus I copied the
> sge_mpirun script to the $SGE_ROOT/<sge_cell>/bin directory. I configured
> a Production.q queue that runs only parallel jobs. As a last sanity check I
> ran a trace on the scheduler, submitted a simple parallel job, and these are
> the results that I got from the logs:
Can you please give more details of your queue and PE setup (qconf -sq/sp
output).
> JOB RUN Window
>
> [wems at wems examples]$ qsub -now y -pe mpich-gm 1-4 -b y hello++
>
> Your job 277 ("hello++") has been submitted.
>
> Waiting for immediate job to be scheduled.
>
>
>
> Your qsub request could not be scheduled, try again later.
>
> [wems at wems examples]$ qsub -pe mpich-gm 1-4 -b y hello++
>
> Your job 278 ("hello++") has been submitted.
>
> [wems at wems examples]$ qsub -pe mpich-gm 1-4 -b y hello++
>
> Your job 279 ("hello++") has been submitted.
You can't start a parallel job this way, as no mpirun is used. When you used
the script you mentioned, did you get the same behavior (and did it use
mpirun -np $NSLOTS ...)?
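Just as an illustration (untested, and the script and executable names are
only placeholders; the machines file path is my assumption of the usual tight
integration setup), a job script for such a PE would look roughly like:

#!/bin/sh
#$ -S /bin/sh
#$ -pe mpich-gm 1-4
# NSLOTS and the machines file under $TMPDIR are prepared by SGE and startmpi.sh
mpirun -np $NSLOTS -machinefile $TMPDIR/machines ./hello++

so that the MPI startup really uses the slots and hosts the scheduler granted.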
> This is the 2nd window SCHEDULER LOG
>
> [root at wems bin]# qconf -tsm
>
> [root at wems bin]# qconf -tsm
>
> [root at wems bin]# cat /WEMS/grid/default/common/schedd_runlog
>
> Wed Mar 23 06:08:55 2005|-------------START-SCHEDULER-RUN-------------
>
> Wed Mar 23 06:08:55 2005|queue instance "all.q at wems10.grid.wni.com"
dropped
> because it is temporarily not available
>
> Wed Mar 23 06:08:55 2005|queue instance "Production.q at wems10.grid.wni.com"
> dropped because it is temporarily not available
>
> Wed Mar 23 06:08:55 2005|queues dropped because they are temporarily not
> available: all.q at wems10.grid.wni.com Production.q at wems10.grid.wni.com
>
> Wed Mar 23 06:08:55 2005|no pending jobs to perform scheduling on
>
> Wed Mar 23 06:08:55 2005|--------------STOP-SCHEDULER-RUN-------------
>
> Wed Mar 23 06:11:37 2005|-------------START-SCHEDULER-RUN-------------
>
> Wed Mar 23 06:11:37 2005|queue instance "all.q at wems10.grid.wni.com"
dropped
> because it is temporarily not available
>
> Wed Mar 23 06:11:37 2005|queue instance "Production.q at wems10.grid.wni.com"
> dropped because it is temporarily not available
>
> Wed Mar 23 06:11:37 2005|queues dropped because they are temporarily not
> available: all.q at wems10.grid.wni.com Production.q at wems10.grid.wni.com
>
> Wed Mar 23 06:11:37 2005|no pending jobs to perform scheduling on
>
> Wed Mar 23 06:11:37 2005|--------------STOP-SCHEDULER-RUN-------------
>
> [root at wems bin]# qstat
>
> job-ID prior name user state submit/start at queue
> slots ja-task-ID
>
>
----------------------------------------------------------------------------
> -------------------------------------
>
> 279 0.55500 hello++ wems qw 03/23/2005 06:11:43
> 1
>
> [root at wems bin]#
Do you have an admin account for SGE? I'd prefer not to do anything in SGE
as root.
> BTW that node wems10.grid.wni.com has connectivity issues and I have not
> removed it from the cluster queue.
>
>
>
> What causes this type of problem in N1GE, returning "no pending jobs to
> perform scheduling on" in the schedd_runlog even though there are available
> slots ready to take jobs?
>
> I had no problem submitting serial jobs; only the parallel jobs behaved this
> way. Are there N1GE/Myrinet issues that I am not aware of? FYI, the same
> binary (hello++) runs with no problems from the command line.
If you just start hello++, it will not run in parallel I think.
Not really an issue, but you have to make a small change to mpirun.ch_gm.pl
so that all processes stay in the same process group and get correctly killed
in case of a job abort:
http://gridengine.sunsource.net/howto/mpich-integration.html
> Since I generally run scripts from qsub instead of binaries, I created a
> script to run the mpich executable, but that yielded the same result.
>
>
>
> I have an additional question regarding setting a queue.conf parameter
> called "subordinate_list". How is it read from the result of qconf -mq
> <queue_name>?
>
> Example
>
> i.e., subordinate_list low_pri.q=5,small.q.
The queue "low_pri.q" will be suspended, when 5 or more slots of
"<queue_name>"
are filled. The "small.q" will be suspened, if all slots of "<queue_name>"
are
filled.
Cheers - Reuti