[Beowulf] Re: Monitoring crashing machines
Lawrence Stewart
larry.stewart at sicortex.com
Tue Sep 9 18:48:46 PDT 2008
On Sep 9, 2008, at 7:41 PM, Robert G. Brown wrote:
> On Tue, 9 Sep 2008, David Mathog wrote:
>
>> word. In the old days some of those crash events spewed garbage to the
>> printer, and that resulted in a ream of nonsense on the floor, and more
>> often than not, the paper mashed into an accordion behind a pinfeed jam.
>
> Nobody said it was EASY back then, right? Even when a system DIDN'T
> crash, it dumped reams of fanfold into the takeup box, most of it never
> examined by human mind. ;-)
A non HPC story... from someone who used to work at the Stanford IT
shop way back when.
He was a systems analyst or programmer working on upgrading various
department JCL decks and batch jobs for some systems conversion, maybe
new DASD or something. While testing a job for one department, the
report seemed to come out correctly, but it was immediately followed by
a five inch thick abend dump. Evidently, the space allocated on the old
disk was longer than the file data, but shorter than the program was
expecting. It would process the report, and then run off the end of the
file and crash. The analyst converted the file for the new disk, set the
length correctly, and went on to the next job.
A month or two later, the department calls in to inquire "Where's the
numbers report?" After some confusion back and forth, it seems that the
department had been dutifully filing the abend dumps in a row of file
cabinets, and wanted to know why they had gone missing after the
upgrade...
-Larry
PS I never did work with old style big iron myself. I probably would
have gotten fired for leaving my coffee cup on top of one of the
printers when it opened for more paper.
PPS When I got started, we had a printer on which the "0" was worn out.
I had to patch the device driver to substitute capital "O".