[Beowulf] Question on high performance, low cost Fileserver

Paulo Afonso Lopes pal at di.fct.unl.pt
Mon Nov 14 02:30:29 PST 2005


GFS and GPFS are SAN-based. I do not have any experience with Lustre, but
it seems (at least in a vendor-supported configuration) to be based on a
"back-end" SAN.

With currently available solutions, these are the kinds of decisions you
have to make:
- Do you rate availability/fault tolerance as important?
  If you do (why else would you say PVFS is not for home dirs?) you must
use a disk array based solution, either with an FC-SAN or NAS or iSCSI.

Then, you must choose your "file system" (this does not apply to the NAS
option). You'll have to decide if:
- You need "POSIX locking": if you do, you can't use PVFS
- You want to support applications that require both high I/O bandwidth
and heavy file sharing (reading and writing the same file): if you do, you
must exclude GFS, use GPFS in "data shipping mode", and modify your
applications
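
The two checks above can be written down as a small filter. This is only a
sketch of the reasoning in this message, not a selection tool: the function
name and the two boolean parameters are made up for illustration, and the
candidate list is limited to the three file systems discussed here.

```python
def candidate_file_systems(needs_posix_locking, needs_shared_rw_bandwidth):
    """Apply the two checks from this message to PVFS, GFS and GPFS.

    Returns a dict mapping each surviving file system to a note
    (an empty string when no caveat applies). Hypothetical helper.
    """
    candidates = {"PVFS": "", "GFS": "", "GPFS": ""}
    if needs_posix_locking:
        # PVFS is ruled out when POSIX locking is required.
        candidates.pop("PVFS")
    if needs_shared_rw_bandwidth:
        # High bandwidth plus heavy sharing of the same file rules out GFS;
        # GPFS stays only with data shipping mode and modified applications.
        candidates.pop("GFS")
        candidates["GPFS"] = "data shipping mode, applications must be modified"
    return candidates


if __name__ == "__main__":
    # Home directories that need locking but little sharing.
    print(candidate_file_systems(True, False))
    # Locking plus heavily shared, high-bandwidth files.
    print(candidate_file_systems(True, True))
```

Availability is deliberately left out of the function, since that decision
is about the storage hardware (disk arrays, SAN/NAS/iSCSI) rather than the
file system itself.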

(Note: you can have a resilient PVFS configuration if you use a SAN with
disk arrays instead of "internal" disks, and add some HA software - of
course, you can "transfer" disks manually, via command scripts, if you do
not want to use HA software)

You also need to put the "high cost" of a SAN into context: if you want to
move data at high speeds in a COTS (Gigabit Eth) LAN, you will consume a
large share of the available CPU (e.g. around 40% of a 2.6GHz Xeon to reach
around 80MB/s sustained in one node). If you go for "fancy" interconnects
(Infiniband, Myrinet, ...) you are in the same "cost territory" as FC/SANs.
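
To put a rough number on the CPU cost, the single data point above (about
40% of one CPU for about 80MB/s) can be extrapolated. The sketch below
assumes the cost grows linearly with throughput, which is my simplification
and not something I have measured; the function name and the 125MB/s figure
(1 Gbit/s divided by 8, ignoring protocol overhead) are also only
illustrative.

```python
def cpu_fraction(target_mb_s, ref_mb_s=80.0, ref_cpu_fraction=0.40):
    """Estimate the CPU fraction needed to sustain target_mb_s.

    Assumes CPU cost scales linearly from the reference point
    (80 MB/s at 40% of one CPU); real behaviour may differ.
    """
    if target_mb_s < 0:
        raise ValueError("throughput must be non-negative")
    return target_mb_s * ref_cpu_fraction / ref_mb_s


# Nominal Gigabit Ethernet line rate in MB/s, ignoring protocol overhead.
GIGE_LINE_RATE_MB_S = 125.0

if __name__ == "__main__":
    share = cpu_fraction(GIGE_LINE_RATE_MB_S)
    print(f"Estimated CPU share at GigE line rate: {share:.0%}")
```

Under that linear assumption, saturating a single Gigabit link would take
well over half of one such CPU, which is the cost to weigh against FC/SAN
hardware.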

By NOT using "asymmetrical" file systems (such as PVFS) and using "cluster
file systems" such as GFS or GPFS instead, you may (depending on your
requirements) dispense with I/O nodes altogether (client nodes on a SAN can
access data directly).

I have never been involved in a large configuration like the one you're
planning to build, but I honestly think that you should go for a "mix": an
HA file system (e.g., GFS) for homes, etc. (mostly unshared file access)
and PVFS for the directories where the files for HPC applications live. I
don't think there is a single currently available file system that can do
both things well.

HTH

paulo


> We are looking into designing a low cost, high performance storage system.
> Requirements as below:
>
> - Starts at 3TB, should scale up by adding more servers to say 10-12TB
> - Use commodity technologies (x86_64, IB, GE, Linux), preferably all OSS
> components
> - Provide high I/O which scales with addition of storage nodes.
> - To be used for hosting user home dirs so reliability is important
> - The HPC cluster starts with 6 AMD64 nodes and is expected to scale to
> 1000+nodes in a year.
> - Preferably without FC/SAN
>
> We do have experience with IBM GPFS, PVFS (1,2), NetApps, PolyServe but
> not with GFS and LUSTRE.
>
> PVFS is not reliable enough for home dirs (OK for scratch), GPFS cannot
> do RAID5 like striping across nodes, needs SAN for RAID1 like mirroring
> (cost $$$), PolyServe is too expensive (per CPU pricing)
>
> Is GFS or Lustre suitable for the above needs? Any other commercial
> solution?
>
--
Paulo Afonso Lopes                        | Tel: +351- 21 294 8536
Departamento de Informática               | 294 8300 ext.10763
Faculdade de Ciências e Tecnologia        | Fax: +351- 21 294 8541
Universidade Nova de Lisboa               | e-mail: pal at di.fct.unl.pt
2829-516 Caparica, PORTUGAL



