[Beowulf] kdump / kexec to obtain crash dumps from randomly crashing nodes.
rpnabar at gmail.com
Thu Oct 9 11:57:39 PDT 2008
On my CentOS system I installed kexec/kdump to investigate the cause of
some random system-crashes by getting access to a crash-dump. I installed
the rpm for kexec and then made the change to grub.conf that reserves the
additional memory for the new kernel.
I also configured kdump.conf and started the kexec service. Then I tried to
simulate a crash by echoing c to /proc/sysrq-trigger.
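
For reference, the steps above can be sanity-checked before triggering the test crash. What follows is only a rough sketch, not a definitive diagnostic: it assumes the reservation shows up as a crashkernel= argument in /proc/cmdline, that the kernel exposes /sys/kernel/kexec_crash_loaded (older kernels may not, in which case it prints None), and that dumps go to /var/crash. The helper names crashkernel_arg, read_text and report are made up for illustration.

```python
#!/usr/bin/env python
# Rough pre-crash sanity check for a kdump setup (illustrative sketch only).
import os
import re


def crashkernel_arg(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    match = re.search(r"(?:^|\s)crashkernel=(\S+)", cmdline)
    return match.group(1) if match else None


def read_text(path):
    """Return the stripped contents of path, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except (IOError, OSError):
        return None


def report(dump_dir="/var/crash"):
    """Collect one status line per check; nothing here modifies the system."""
    lines = []

    # 1. Was memory actually reserved for the capture kernel at boot?
    cmdline = read_text("/proc/cmdline") or ""
    lines.append("crashkernel= on command line: %s" % crashkernel_arg(cmdline))

    # 2. Is a capture kernel loaded? "1" means yes, "0" means no.
    #    Assumption: this sysfs file exists on the running kernel.
    loaded = read_text("/sys/kernel/kexec_crash_loaded")
    lines.append("kexec_crash_loaded: %s" % loaded)

    # 3. What is currently in the dump directory?
    try:
        entries = sorted(os.listdir(dump_dir))
    except OSError:
        entries = None
    lines.append("%s contents: %s" % (dump_dir, entries))

    return lines


if __name__ == "__main__":
    for line in report():
        print(line)
```

If the first check prints None, the grub.conf change did not take effect on the running kernel. If the second prints 0, no capture kernel is loaded, so a crash would just reboot the machine without writing a vmcore.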
The system does crash and then, after a while, reboots itself. But I see no
vmcore when it comes back up; /var/crash is empty. This was when I tried to
write the dump to the local drive.
I also tried writing the dump over NFS, but still no success.
Any idea what could be missing in my steps? Or any other debug
suggestions? Any other kdump users on Beowulf?