[Beowulf] Re: after update sgeexecd not starting correctly on reboot
Reuti
reuti at staff.uni-marburg.de
Wed Nov 26 04:15:38 PST 2008
Hi David,
On 26 Nov 2008, at 01:08, David Mathog wrote:
>> I think maybe the NFS mounting is different, so that the remote_fs
>> prerequisite isn't really satisfied, even though the associated script
>> has run. The sgeexecd script does include a test:
>>
>> while [ ! -d "$SGE_ROOT" -a $count -le 120 ]; do
>> count=`expr $count + 1`
>> sleep 1
>> done
>
> This seems to have been it. Changing "$SGE_ROOT" to "$SGE_ROOT/bin"
> let SGE come up OK in a couple of consecutive reboots. Not definitive
> proof that this was the issue, but at least it seems like progress.
> Apparently it was getting to this part of the SGE init script before
> $SGE_ROOT was actually mounted. The -d test always passed, NFS
> mounted or not, and of course the SGE startup failed since none of
> that code from the remote system was reachable. Just for kicks I added
> an echo line within the loop, so that if it sticks there it will show
> up on the console.
May I ask you to file an issue about this at
http://gridengine.sunsource.net/?
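
For the issue report, the change could be sketched roughly as below. This
is only an illustration, not the actual sgeexecd code: wait_for_dir is a
hypothetical helper name, and the 120-second limit is simply taken from the
loop quoted above. It assumes that the mount point directory exists locally
even before NFS is mounted, which would explain why a plain -d "$SGE_ROOT"
test passes either way, while a subdirectory such as bin only appears once
the remote filesystem is really there.

```shell
#!/bin/sh
# Sketch only: wait until a directory that lives on the NFS share shows up.
# wait_for_dir is a hypothetical helper, not part of the shipped sgeexecd.
#   $1 = directory to wait for (e.g. "$SGE_ROOT/bin")
#   $2 = maximum number of seconds to wait (defaults to 120)
wait_for_dir() {
    dir=$1
    max=${2:-120}
    count=0
    while [ ! -d "$dir" ] && [ "$count" -lt "$max" ]; do
        # Progress message on stderr so a stuck boot is visible.
        echo "waiting for $dir ($count/$max)" >&2
        count=$((count + 1))
        sleep 1
    done
    # Return success only if the directory finally exists.
    [ -d "$dir" ]
}
```

In an init script this might be called as
wait_for_dir "$SGE_ROOT/bin" 120 || exit 1
so that the script gives up with an error instead of starting without the
share.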
-- Reuti