[Beowulf] High Performance for Large Database
Laurence Liew
laurence at scalablesystems.com
Tue Nov 16 18:26:37 PST 2004
Hi.
When GFS was commercial and only available from Sistina, putting a GFS +
SAN solution together for a Beowulf customer made the storage portion
around 40% of the overall cost of a 16-node cluster.
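Roughly, the arithmetic looked like this. The prices below are
illustrative placeholders I am plugging in for the sake of the example,
not actual quotes from that time:

# Back-of-envelope sketch of the ~40% storage share mentioned above.
# All prices are assumed/illustrative.

node_count     = 16
price_per_node = 3000       # assumed: one compute node, USD
interconnect   = 8000       # assumed: switch + NICs + cables
san_storage    = 38000      # assumed: FC switch + HBAs + RAID array

compute_side = node_count * price_per_node + interconnect
total        = compute_side + san_storage

print("storage share of total: %.0f%%" % (100.0 * san_storage / total))
# prints roughly 40%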
That was when we investigated GFS over GNBD - but the performance left
much to be desired...
Anyway - today I think Lustre and PVFS (or PVFS2) are much more
suitable filesystems for HPC-type workloads.
BTW: over in Singapore and the surrounding countries, a 64-node cluster
is considered large... so a SAN solution *IS* a significant cost for most
academic customers.
Cheers!
laurence
Craig Tierney wrote:
> On Tue, 2004-11-16 at 02:01, Laurence Liew wrote:
>
>>Hi,
>>
>> From what I understand it has to do with "locking" on the SAN devices
>>by the GFS drivers.
>>
>>Yes, you are right... most implementations will have separate IO and
>>compute nodes... in fact that is the recommended way. What I had
>>meant in my earlier statement was that I prefer data to be distributed
>>amongst nodes (IO nodes) rather than have it centralised in a single
>>SAN backend.
>
>
> Do you have an issue with a single storage unit, or actually using
> a SAN? You could connect, dare I say "cluster", smaller FC-based
> storage units together. You will get much better price/performance
> than going with larger storage units. This solution would work
> for shared filesystems like GFS, CXFS, or StorNext. You could
> connect the same units directly to IO nodes for distributed filesystems
> like Lustre, PVFS1/2, or Ibrix.
>
> Craig
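To illustrate the distributed model we are both describing: a
Lustre/PVFS-style filesystem stripes each file across dedicated IO
nodes instead of funnelling every client through a single SAN array.
The sketch below is purely illustrative Python - the IO-node names and
stripe size are made up, and it is not actual Lustre or PVFS code:

# Illustrative only: round-robin striping of a file across dedicated
# IO nodes, the layout idea behind Lustre/PVFS-style filesystems.

STRIPE_SIZE = 64 * 1024                     # assumed 64 KiB stripe size
IO_NODES    = ["io0", "io1", "io2", "io3"]  # hypothetical IO servers

def stripe_layout(file_size):
    """Return {io_node: [stripe indexes]} for a file of the given size."""
    layout = dict((node, []) for node in IO_NODES)
    n_stripes = (file_size + STRIPE_SIZE - 1) // STRIPE_SIZE
    for i in range(n_stripes):
        layout[IO_NODES[i % len(IO_NODES)]].append(i)
    return layout

# A 1 MiB file ends up with 4 stripes on each of the 4 IO nodes, so
# reads and writes fan out across servers rather than hitting one array.
for node, stripes in stripe_layout(1024 * 1024).items():
    print(node, stripes)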
>
>
>
>>GFS + NFS is painful and slow, as you have experienced... hopefully
>>RHEL v4 will bring better performance and new features to address
>>HPC with GFS (unlikely, but just hoping).
>>
>>Laurence
>>
>>Craig Tierney wrote:
>>
>>>On Mon, 2004-11-15 at 06:26, Laurence Liew wrote:
>>>
>>>
>>>>Hi
>>>>
>>>>The current version of GFS has a 64-node limit... something to do with
>>>>the maximum number of connections through a SAN switch.
>>>
>>>
>>>I would suspect the problem is that GFS doesn't scale past
>>>64 nodes. There is no inherent limitation in Linux on the
>>>size of a SAN (well, if there is, it is much larger than 64 nodes).
>>>Other shared filesystems, like StorNEXT and CXFS, are limited
>>>to 128 nodes due to scalability reasons.
>>>
>>>
>>>
>>>>I believe the limit could be removed in RHEL v4.
>>>>
>>>>BTW, GFS was built for the enterprise and not specifically for HPC...
>>>>the use of a SAN (all nodes need to be connected to a single SAN
>>>>storage array) may be a bottleneck...
>>>>
>>>>I would still prefer the model of PVFS1/2 and Lustre where the data is
>>>>distributed amongst the compute nodes
>>>
>>>
>>>You can do this, but does anyone do it? I suspect that most
>>>implementations are set up where the servers are not on the compute
>>>nodes. This provides for more consistent performance across
>>>the cluster. Also, are you going to install redundant storage in
>>>all of your compute nodes so that you can build a FS across the
>>>compute nodes? Unless the FS is for scratch only, I don't want
>>>to have to explain to the users why the system keeps losing their data.
>>>Even if you just use some RAID1 or RAID5 ATA controllers in a
>>>few storage servers, you will be able to build a faster and more
>>>fault-tolerant system than just using the disks in the compute nodes.
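A rough way to see Craig's fault-tolerance point: compare the chance of
losing data in a year for a filesystem striped over raw compute-node
disks against a few RAID5 storage servers. The per-disk failure rate
and the server counts below are assumed figures for illustration, and
the RAID5 estimate ignores rebuild windows, so treat the output as
order-of-magnitude only:

# Assumed annual failure probability per disk - illustrative only.
p = 0.03

# Case 1: scratch FS striped over one unprotected disk in each of 64
# compute nodes: any single disk failure loses (part of) the data.
n_compute = 64
p_loss_striped = 1 - (1 - p) ** n_compute

# Case 2: 8 storage servers, each an 8-disk RAID5 set: an array only
# loses data if two or more of its disks fail (crudely ignoring
# rebuild time, hot spares, etc.).
def raid5_array_loss(p_disk, n_disks):
    p_zero_fail = (1 - p_disk) ** n_disks
    p_one_fail  = n_disks * p_disk * (1 - p_disk) ** (n_disks - 1)
    return 1 - (p_zero_fail + p_one_fail)

p_loss_raid = 1 - (1 - raid5_array_loss(p, 8)) ** 8

print("striped raw compute-node disks: %.0f%% chance of loss/year"
      % (100 * p_loss_striped))
print("8 x 8-disk RAID5 storage servers: %.0f%% chance of loss/year"
      % (100 * p_loss_raid))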
>>>
>>>
>>>
>>>>I suspect GFS could prove useful, however, for enterprise clusters of
>>>>say 32-128 nodes, where the number of IO nodes (GFS nodes exporting
>>>>NFS) can be small (fewer than 8)... it could work well.
>>>
>>>
>>>I had some experience with an NFS-exported GFS system about 12
>>>months ago and it wasn't very pleasant. I could feel the latency
>>>in the metadata operations when accessing the front ends of the
>>>cluster interactively. It didn't surprise me, because other experience
>>>I have had with shared filesystems has been similar.
>>>
>>>Craig
>>>
>>>
>>>
>>>>Cheers!
>>>>Laurence
>>>>
>>>>Chris Samuel wrote:
>>>>
>>>>
>>>>>On Wed, 10 Nov 2004 12:08 pm, Laurence Liew wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>You may wish to try GFS (open sourced by Red Hat after buying
>>>>>>Sistina)... it may give better performance.
>>>>>
>>>>>
>>>>>Anyone here using the GPL'd version of GFS on large clusters?
>>>>>
>>>>>I'd be really interested to hear how folks find that.
--
Laurence Liew, CTO Email: laurence at scalablesystems.com
Scalable Systems Pte Ltd Web : http://www.scalablesystems.com
(Reg. No: 200310328D)
7 Bedok South Road Tel : 65 6827 3953
Singapore 469272 Fax : 65 6827 3922