[Beowulf] kdump / kexec to obtain crash dumps from randomly crashing nodes.
rpnabar at gmail.com
Thu Oct 9 12:19:20 PDT 2008
The funny thing is that the console remains blank. We have all these
systems connected to a KVM, and the KVM shows the system as actually
disconnected after the crash.
That is what makes it so hard to debug. No screen output at all.
On Thu, Oct 9, 2008 at 2:07 PM, Paolo Supino <paolo.supino at gmail.com> wrote:
> Hi Rahul
> Did you try redirecting the console to a serial port? If a system crashes,
> all console messages (including the kernel's) are sent to the serial
> console, which keeps displaying the messages it received until the
> system is power cycled ...
> Rahul Nabar wrote:
>> On my CentOS system I installed kexec/kdump to investigate the cause of
>> some random system-crashes by getting access to a crash-dump. I installed
>> the rpm for kexec and then made the change to grub.conf that reserves the
>> additional memory for the new kernel.
>> I also configured kdump.conf, started the kexec service, and then tried to
>> simulate a crash by echoing c to /proc/sysrq-trigger.
>> The system does crash and then after a while reboots itself. But I see no
>> vmcore when it comes back up. /var/crash is empty. This was when I tried to
>> write to the local drive.
>> I also tried an NFS write, but still no success.
>> Any idea what could be missing in my steps? Or any other debug
>> suggestions? Any other kdump users on Beowulf?
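
One way to narrow down the steps quoted above is to confirm, before triggering a test crash, that the memory reservation from grub.conf actually took effect. The following is only a rough sketch and not something from the thread itself: it is written in Python, the function names are made up for illustration, and it assumes the running kernel lists a "Crash kernel" region in /proc/iomem when a crashkernel= reservation succeeded.

```python
#!/usr/bin/env python3
"""Sanity checks for a kdump setup (illustrative sketch, hypothetical names)."""


def crashkernel_param(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            return token.split("=", 1)[1]
    return None


def has_crash_region(iomem_text):
    """Return True if /proc/iomem-style text lists a 'Crash kernel' region."""
    return any(
        line.strip().endswith("Crash kernel")
        for line in iomem_text.splitlines()
    )


def read_text(path):
    """Read a file, returning None if it is missing or unreadable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return None


def main():
    cmdline = read_text("/proc/cmdline") or ""
    iomem = read_text("/proc/iomem") or ""

    value = crashkernel_param(cmdline)
    if value is None:
        print("No crashkernel= parameter on the running kernel's command line.")
    else:
        print("crashkernel=%s was passed at boot." % value)

    if has_crash_region(iomem):
        print("A 'Crash kernel' memory region is reserved.")
    else:
        print("No 'Crash kernel' region found; the reservation may have failed.")


if __name__ == "__main__":
    main()
```

If the parameter is present but no region is reserved, the capture kernel has nowhere to be loaded, which would be consistent with a reboot that leaves /var/crash empty. This does not cover the other possible causes, such as the dump target in kdump.conf.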