[Beowulf] lustre 'lctl dl' weirdness?

Michael Di Domenico mdidomenico4 at gmail.com
Thu Aug 27 13:27:02 PDT 2009

I posted this to the lustre-discuss mailing list, but I have not
heard anything all day, so I'm wondering if anyone here might have an
idea.

We had a problem in the datacenter this morning where a bunch of
servers went down hard, including my Lustre filesystem and just
about every other machine in the building.

When I try to bring the MDS/MGS back online, it does mount, and
'lctl dl' shows that everything is there and UP. I know it's not,
though, because I have not mounted the OSSs yet.

Is this some junk left over?  Can/Should it be cleared out?

If I go through and mount all the OSSs, they mount and I can mount the
filesystem on the client, but neither ls nor df of the mountpoint works.

I've tried the recovery methods that I know of, but I can't seem to
get the MDS/MGS to come up in what appears to be a clean state.

Is there some magic command, or a file that needs to be deleted, to
abort everything and restart?


