[Beowulf] Doing i/o at a small cluster
Vincent Diepeveen
diep at xs4all.nl
Sat Aug 18 07:55:12 PDT 2012
On Aug 18, 2012, at 1:04 PM, Andrew Holway wrote:
> 2012/8/17 Vincent Diepeveen <diep at xs4all.nl>:
>> The homepage looks very commercial and they have a free trial on it.
>> You refer to the free trial?
>
> http://nexentastor.org/ - Sorry, wrong link. It's a commercially backed
> open-source project.
>
>> Means buy raid controller. That's extra cost. That depends upon what
>> it costs.
>
> You just need to attach the disks to some SATA port. ZFS does all the
> raid stuff internally in software.
>
>> But it does mean that every node and every diskread and write you do,
>> that they all hammer at the same time at that single basket.
>
> ZFS seems to handle this kind of stuff elegantly. As with hardware
> raid each disk can be accessed individually. NFSoRDMA would ensure
> speedy access times.
>
>>
>> I don't see how you can get out of 1 cheap box good performance like
>> that.
>
> Try it and see. It would certainly be much less of a headache than
> some kind of distributed filesystem which, in my opinion, is complete
> overkill for a 4 node cluster. All of the admins that I know that look
> after these systems have the haunted look of a village elder that must
> choose which of the village's daughters must be fed to the Lustre
> monster every 6 months.
>
> Don't forget to put in as much memory as you can afford, and ideally an
> SSD for read cache (assuming that you access the same blocks over and
> over in some fashion)
I designed a data structure myself that's close to ZFS, according to
someone who was working for Sun in Bangalore at the time; this was
before ZFS was popular, or even introduced (I'm not sure - it was
2001-2002 or so), but I'm not aware of how the filesystem has been
extended since then to satisfy professional needs :)
Note I wasn't aware it has since become available on Linux as open
source. Does it work there?
My workload is streaming a dataset of around 1.3TB over and over
again, and each time something in the dataset gets modified. So the
output is a bitstream that you store, and this bitstream, from all
cores together, is again around 1.3TB.
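That read-modify-write streaming pattern can be sketched roughly like
this (a minimal sketch; the file names, chunk size, and the identity
transform are placeholders, not the actual engine):

```python
import os

CHUNK = 8 * 1024 * 1024  # read granularity per iteration (placeholder value)

def stream_pass(src_path, dst_path, modify):
    """One pass over the dataset: read a chunk, transform it, and append
    the resulting bitstream to the output file."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(CHUNK)
            if not chunk:
                break
            dst.write(modify(chunk))

# Hypothetical usage: one full pass with an identity transform.
with open("dataset.bin", "wb") as f:
    f.write(os.urandom(64 * 1024))
stream_pass("dataset.bin", "output.bin", lambda b: b)
```

The point of the sketch is that every pass touches the whole dataset
sequentially, which is why sustained streaming bandwidth, not IOPS, is
the figure that matters here.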
Note: when I write TB I mean terabytes. All those RAID cards quote
Gb = gigaBITS.
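The bytes-versus-bits distinction is a factor of eight, e.g.:

```python
# Marketing figures in gigaBITS vs actual gigaBYTES: divide by 8.
gbit_per_s = 6.0                 # e.g. a "6 Gb/s" SATA link
gbyte_per_s = gbit_per_s / 8.0   # = 0.75 GB/s, before protocol overhead
print(gbyte_per_s)
```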
1.3TB of SSD per node would speed it up considerably, but that's too
expensive.
I do agree about maintenance, but my cluster is no larger than 8
nodes here, and I do want that performance of 0.5GB/s per node, so in
the case of 8 nodes it should be 4GB/s aggregated bandwidth to the
i/o, and not the nearly 800MB/s that most RAID cards that are cheap
on eBay seem to deliver.
So some sort of distributed file system seems the best option, and a
lot cheaper and a lot faster than a dedicated fileserver
that will not be able to keep up.
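The back-of-the-envelope arithmetic behind that conclusion can be
checked quickly (the per-node target and the RAID-card figure are the
numbers from this thread; the rest is just unit conversion):

```python
# Per-node target and cheap-RAID-card figure quoted in the thread.
NODES = 8
PER_NODE_GBPS = 0.5        # GB/s (gigaBYTES) wanted per node
RAID_CARD_MBPS = 800       # MB/s a cheap eBay RAID card delivers

aggregate_gbps = NODES * PER_NODE_GBPS    # 4.0 GB/s aggregate demand
raid_card_gbps = RAID_CARD_MBPS / 1000.0  # 0.8 GB/s from one card

shortfall = aggregate_gbps / raid_card_gbps  # single server is ~5x short
print(aggregate_gbps, raid_card_gbps, shortfall)
```

A single fileserver built around one such card is roughly a factor of
five short of the aggregate demand, which is the argument for spreading
the i/o across the nodes instead.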