[Beowulf] kdump / kexec to obtain crash dumps from randomly crashing nodes.
Rahul Nabar
rpnabar at gmail.com
Thu Oct 9 12:19:20 PDT 2008
Hi Paolo,
The funny thing is that the console stays blank. All of these systems
are connected to a KVM, and after the crash the KVM shows the system as
actually disconnected.
That is what makes this so hard to debug: there is no screen output at all.
-Rahul
On Thu, Oct 9, 2008 at 2:07 PM, Paolo Supino <paolo.supino at gmail.com> wrote:
> Hi Rahul
>
> Did you try redirecting the console to a serial port? If a system crashes
> and all console messages (including the kernel's) are sent to the serial
> console, that console will keep displaying the messages it received until
> the system is power cycled ...
>
> --
> ttyl
> Paolo
>
>
>
> Rahul Nabar wrote:
>> On my CentOS system I installed kexec/kdump to investigate the cause of
>> some random system crashes by getting access to a crash dump. I installed
>> the RPM for kexec and then made the change to grub.conf that reserves the
>> additional memory for the new kernel.
>>
>> I also configured kdump.conf. I started the kexec service and then tried
>> to simulate a crash by echoing c to /proc/sysrq-trigger.
>>
>> The system does crash and, after a while, reboots itself. But I see no
>> vmcore when it comes back up; /var/crash is empty. This was when I tried
>> to write to the local drive.
>>
>> I also tried writing over NFS, but still no success.
>>
>> Any idea what could be missing in my steps? Or any other debug
>> suggestions? Any other kdump users on Beowulf?
>>
>
>
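The steps quoted above (reserving memory in grub.conf, loading the crash
kernel, and redirecting the console to a serial port) can be sanity-checked
before triggering a test crash. What follows is only a minimal sketch, not
something taken from this thread. The function names are hypothetical, the
sample command line in the usage note is illustrative, and the sysfs file
/sys/kernel/kexec_crash_loaded is assumed to exist; older kernels may not
provide it, in which case the check reports "unknown".

```python
import re
from pathlib import Path


def crashkernel_setting(cmdline):
    """Return the value of crashkernel= on a kernel command line, or None."""
    match = re.search(r"(?:^|\s)crashkernel=(\S+)", cmdline)
    return match.group(1) if match else None


def serial_consoles(cmdline):
    """Return every console= value that names a serial port (ttyS*)."""
    return re.findall(r"(?:^|\s)console=(ttyS\S*)", cmdline)


def crash_kernel_loaded(path="/sys/kernel/kexec_crash_loaded"):
    """Return True or False from sysfs, or None if the file is missing.

    Assumption: the running kernel exposes this file; if not, we cannot tell.
    """
    p = Path(path)
    if not p.exists():
        return None
    return p.read_text().strip() == "1"


def report(cmdline, loaded):
    """Build a short human-readable summary of the three checks."""
    lines = []
    reservation = crashkernel_setting(cmdline)
    lines.append(f"crashkernel reservation: {reservation or 'MISSING'}")
    ports = serial_consoles(cmdline)
    lines.append(f"serial console(s): {', '.join(ports) or 'none'}")
    state = {True: "yes", False: "no", None: "unknown"}[loaded]
    lines.append(f"crash kernel loaded: {state}")
    return lines


if __name__ == "__main__":
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        for line in report(proc_cmdline.read_text(), crash_kernel_loaded()):
            print(line)
```

If the reservation shows as MISSING, the grub.conf change did not take effect
for the running kernel; if the crash kernel is not loaded, there is nothing
for the system to boot into after the panic, which would be consistent with
an empty /var/crash. For example, report("ro crashkernel=128M@16M", False)
flags a reserved region with no crash kernel loaded and no serial console.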