[Beowulf] Re: Monitoring crashing machines

Lawrence Stewart larry.stewart at sicortex.com
Tue Sep 9 18:48:46 PDT 2008


On Sep 9, 2008, at 7:41 PM, Robert G. Brown wrote:

> On Tue, 9 Sep 2008, David Mathog wrote:
>
>> word.  In the old days some of those crash events spewed garbage to the
>> printer, and that resulted in a ream of nonsense on the floor, and more
>> often than not, the paper mashed into an accordion behind a pinfeed jam.
>
> Nobody said it was EASY back then, right?  Even when a system DIDN'T
> crash, it dumped reams of fanfold into the takeup box, most of it never
> examined by human mind. ;-)


A non-HPC story... from someone who used to work at the Stanford IT
shop way back when.

He was a systems analyst or programmer working on upgrading various
department JCL decks and batch jobs for some systems conversion, maybe
new DASD or something.  While testing a job for one department, the
report seemed to come out correctly, but it was immediately followed by
a five-inch-thick abend dump.  Evidently, the space allocated on the old
disk was longer than the file data, but shorter than the program was
expecting.  It would process the report, and then run off the end of the
file and crash.  The analyst converted the file for the new disk, set
the length correctly, and went on to the next job.

A month or two later, the department calls in to inquire "Where's the
numbers report?"  After some confusion back and forth, it seems that the
department had been dutifully filing the abend dumps in a row of file
cabinets, and wanted to know why they had gone missing after the
upgrade...

-Larry

PS I never did work with old style big iron myself.  I probably would
have gotten fired for leaving my coffee cup on top of one of the
printers when it opened for more paper.

PPS When I got started, we had a printer on which the "0" was worn out.
I had to patch the device driver to substitute capital "O".



