kernel oopses
Robert Latham robl at mcs.anl.gov
Tue Jan 29 07:39:33 PST 2002
On Mon, Jan 21, 2002 at 06:13:44PM -0800, Martin Siegert wrote:
> This is somewhat off topic - sorry for that.

it's a great topic for clusters. in an ideal world, the kernel never oopses, but when you have N kernels and possibly dodgy hardware, it happens. i get frustrated with this list because topics like Martin's get ignored, while topics like cooling with LN2, game console clusters and anything athlon get multi-day discussions.

[snip problem report ]

> The first thing I would like to do is to log the oops message. Right now
> it goes to the console only - it does not appear in the log files
> although syslog sends everything of severity *.info to /var/log/messages.

i guess you've read Documentation/oops-tracing.txt, but if not, it's a good start. depending on where the panic happens, the part of the kernel that would normally write that oops out to disk doesn't run. So you've got a few options:

. typing off the screen: sucks. a lot. and is highly error prone. and the kernel console blanking mechanism might kick in ( and since the kernel has panicked, it won't listen for input signals and unblank itself ). but if you've got no other option... ( one time a guy took a picture of the oops with a digital camera and sent that to me. that was fun. I don't have any character recognition software, but if someone knows of a linux OCR tool that won't mind a screenful of hex, i'd like to hear about it )

. serial console: not bad. if it's just one machine, you can pass parameters to your kernel and capture all kernel messages over the serial port. Documentation/serial-console.txt has all the info you need.

. netconsole: http://people.redhat.com/mingo/netconsole-patches/ like a serial console, but using your network device instead of a serial device. It's a kernel patch and a convenience script for the sender and a userspace tool for the receiver to display the messages. Patching a kernel and setting up yet another tool might be a bit much, but man is it cool to see it work :>

. patch your kernel to support "dump log to swapfile" or "dump log to disk". I haven't set something like this up, but always meant to try it out...

Basically the name of the game is to get that oops into a form you can feed to ksymoops, then hope the backtrace it prints out gives you a clue. ( like "oh, the last thing it called was do_scsi_service... maybe i have a dodgy scsi controller" ).

Anybody else know of good ways ( even funny bad ways might be entertaining ) to capture an oops?

==rob

-- 
Rob Latham
A215 0178 EA2D B059 8CDF B29D F333 664A 4280 315B
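
For the netconsole option above, the receiving side can be quite small. What follows is only a rough sketch, not the userspace tool that ships with the patches linked above. It assumes the sender emits kernel messages as plain-text UDP datagrams to port 6666, which is what the later mainline netconsole module does by default; the 2002 patch set may use a different format or port. The names open_listener and capture are made up for illustration.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket. Port 6666 is an assumption; match the sender's config."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def capture(sock, out, max_packets=None):
    """Copy each datagram's payload to `out`, flushing after every packet.

    Flushing per packet matters here: the sending machine may die right
    after the oops, so nothing should sit in a buffer on the receiver.
    Runs until `max_packets` datagrams have been handled (forever if None).
    """
    seen = 0
    while max_packets is None or seen < max_packets:
        data, _sender = sock.recvfrom(4096)
        # Console output is normally ASCII; replace anything that is not.
        out.write(data.decode("ascii", errors="replace"))
        out.flush()
        seen += 1
    return seen
```

To use it, run something like capture(open_listener(), open("oops.log", "a")) on a machine that stays up, then feed the saved text to ksymoops. UDP gives no delivery guarantee, so a serial console is still the safer choice when only one machine is involved.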
