[Beowulf] kdump / kexec to obtain crash dumps from randomly crashing nodes.
paolo.supino at gmail.com
Thu Oct 9 12:07:40 PDT 2008
Did you try redirecting the console to a serial port? If the system crashes,
all console messages (including the kernel's) are sent to the serial
console, which keeps displaying whatever it received until the
system is power cycled ...
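
As a rough sketch of what I mean (the port ttyS0 and the 115200 baud rate
are assumptions, adjust them to your hardware; list_consoles is just a
hypothetical helper name), you append the console parameters to the kernel
line in grub.conf and then confirm after a reboot that the running kernel
really received them:

```shell
#!/bin/sh
# Parameters to append to the kernel line in grub.conf (assumed port/speed):
#   console=tty0 console=ttyS0,115200n8
# Kernel messages go to every console listed; the last one also becomes
# /dev/console.

# list_consoles CMDLINE: print each console= parameter on its own line.
# Returns non-zero if the command line contains none.
list_consoles() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep '^console='
}

# Check the command line of the currently running kernel.
list_consoles "$(cat /proc/cmdline 2>/dev/null)" \
    || echo "no console= parameter on this boot"
```

With a second machine or a terminal server attached to that port, the panic
messages stay on screen (or in a log) even after the node reboots.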
Rahul Nabar wrote:
> On my CentOS system I installed kexec/kdump to investigate the cause of
> some random system-crashes by getting access to a crash-dump. I installed
> the rpm for kexec and then made the change to grub.conf that reserves the
> additional memory for the new kernel.
> I also configured kdump.conf. I started the kexec service and then tried to
> simulate a crash by echoing c to sysrq-trigger.
> The system does crash and after a while reboots itself, but I see no
> vmcore when it comes back up; /var/crash is empty. This is when I tried to
> write to the local drive.
> I also tried writing over NFS, but still no success.
> Any idea what could be missing in my steps? Or any other debug
> suggestions? Any other kdump users on Beowulf?
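
Before triggering the next test crash it may also be worth checking the
usual prerequisites. The following is only a sketch under assumptions:
has_crashkernel is a hypothetical helper, and /sys/kernel/kexec_crash_loaded
may not exist on older kernels, so the script skips it when it is absent.

```shell
#!/bin/sh
# has_crashkernel CMDLINE: succeed if the kernel command line carries a
# crashkernel= parameter, i.e. memory was reserved for the capture kernel.
has_crashkernel() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# The grub.conf change only takes effect after a reboot, so inspect the
# command line of the kernel that is actually running.
if has_crashkernel "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "crashkernel= is present on the running kernel"
else
    echo "crashkernel= is missing: reboot after editing grub.conf"
fi

# A value of 1 means a capture kernel is loaded and ready.
if [ -r /sys/kernel/kexec_crash_loaded ]; then
    echo "kexec_crash_loaded: $(cat /sys/kernel/kexec_crash_loaded)"
fi

# To trigger the test crash afterwards (as root; this panics the node):
#   echo 1 > /proc/sys/kernel/sysrq
#   echo c > /proc/sysrq-trigger
```

If the first check fails, the capture kernel has no reserved memory to boot
into, and the node will simply reboot without writing a vmcore.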