[Beowulf] Lustre failover

Wed Sep 10 09:09:32 PDT 2008

On Wednesday 10 September 2008, Mark Hahn wrote:
> >> active/active seems strange to me - it implies that the bottleneck
> >> is the OSS (OST server), rather than the disk itself.  and a/a means
> >> each OSS has to do more locking for the shared disk, which would seem
> >> to make the problem worse...
> >
> > No, you can do active/active with several systems
> >
> >      Raid1
> >     /     \
> > OSS1        OSS2
> >    \      /
> >     Raid2
> >
> >
> > (Raid1 and Raid2 are hardware raid systems).
> >
> > Now OSS1 will primarily serve Raid1 and OSS2 will primarily serve Raid2.
> > So
>
> yes, I know - that's how HP SFS is set up.  the OP was talking
> active-active, though, meaning that IO at any instant can go to either OSS
> and still make it onto a particular raid.  otherwise it's active/passive,
> what SFS does.

I have a really hard time understanding how Lustre could manage an
active/active OST. An OST is essentially an ldiskfs (ext4) filesystem on a
block device, and such a filesystem does not work when more than one entity
modifies the data at the same time.

I think that what the Lustre manual is referring to is a setup with two OSTs
on a pair of servers. In this config each server would be active for one OST
and passive for the other (and vice versa).
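
To make the pairing concrete, here is a rough Python sketch of that layout.
It is only a toy model of my reading of the manual, not anything Lustre
actually runs: the Ost class, the serving_map function and the OST labels are
names I made up for illustration, and the server names are taken from the
diagram quoted above. The point is that each OST is mounted by exactly one
server at any time, so both servers are busy but no OST is ever shared.

```python
# Toy model of the paired failover layout; all names here are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Ost:
    name: str
    primary: str   # server that normally mounts this OST
    failover: str  # server that takes over if the primary is down


def serving_map(osts, alive):
    """Map each OST name to the single server that mounts it (None if neither is up)."""
    result = {}
    for ost in osts:
        if ost.primary in alive:
            result[ost.name] = ost.primary
        elif ost.failover in alive:
            result[ost.name] = ost.failover
        else:
            result[ost.name] = None
    return result


# Two OSTs on a pair of servers, each server primary for one of them.
OSTS = [
    Ost("OST on Raid1", primary="OSS1", failover="OSS2"),
    Ost("OST on Raid2", primary="OSS2", failover="OSS1"),
]

if __name__ == "__main__":
    # Normal operation: the load is split, one OST per server.
    print(serving_map(OSTS, alive={"OSS1", "OSS2"}))
    # OSS1 has failed: OSS2 mounts both OSTs.
    print(serving_map(OSTS, alive={"OSS2"}))
```

With both servers up, each one serves its own OST; with OSS1 down, OSS2
serves both. At no point do two servers write to the same ldiskfs device.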

/Peter