[Beowulf] dedupe filesystem

Joe Landman landman at scalableinformatics.com
Fri Jun 5 10:43:27 PDT 2009

John Hearns wrote:

>> In 2009, twenty years later, I think he might have a different take on this.
>>  I put all my bits onto floppies when I left there, and moved the important
>> ones to spinning rust.  I can still read the floppies.  I doubt he can still
>> read the tapes.
> This is I think referred to as 'digital archaeology'

:)  (visions of scientists holding up some rust as they dig through a 
plastic tape, and saying "look, I found a bit!!!")

>> The point is that tape folks talk about longevity.  But this makes a number
>> of important assumptions about the media, the drives, and availability of
>> replacement drives, which, as my advisor in graduate school discovered after
>> her drive died, are not necessarily correct or accurate.
> There. You have the concept - now, to add value to my SATA eating
> expanding storage array, you need to engineer it
> so your company can come along and bolt onto it the next type of
> storage - cakes of Blu-ray disks, multi packs of thumb drives, or
> whatever. The smart storage array will already be migrating your data
> before you even know it is out of date.
> The hard part comes in disguising the bills to the Chief Finance Officer.

Actually, what you described is *exactly* cloud storage.  And the CFO 
would (generally) love to pay for it.  Add whatever capacity you need, 
and pay for it ... only when you need it.  That lowers the cost per TB 
or per GB ... however you want to view it.  Your cost to run 1TB 
includes power, cooling, space, etc.  Your cost to grow it is whatever 
quantum of storage you currently buy, in whatever size you buy it. 
What if, rather than in large "kerchunk" amounts (with gleeful sales 
critters rubbing hands together), it came in effectively whatever size 
amount you needed?
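
A rough sketch of that arithmetic, for anyone who wants to play with it. 
The 48TB chunk size and the demand figures are made-up illustration 
numbers, not pricing from any vendor, and the function names are mine. 
It compares the capacity you end up paying for when storage only comes 
in fixed "kerchunk" increments against paying for exactly what you need:

```python
import math


def provisioned_chunked(needed_tb, chunk_tb):
    """Capacity you must buy when storage is only sold in fixed chunks."""
    return math.ceil(needed_tb / chunk_tb) * chunk_tb


def provisioned_on_demand(needed_tb):
    """Capacity you pay for when you can buy exactly what you need."""
    return needed_tb


def idle_capacity(needed_tb, chunk_tb):
    """Capacity paid for but not needed under chunked purchasing."""
    return provisioned_chunked(needed_tb, chunk_tb) - provisioned_on_demand(needed_tb)


if __name__ == "__main__":
    chunk_tb = 48  # hypothetical size of one purchasable storage unit
    for needed_tb in (10, 48, 50, 100):  # hypothetical demand figures
        bought = provisioned_chunked(needed_tb, chunk_tb)
        idle = idle_capacity(needed_tb, chunk_tb)
        print(f"need {needed_tb:4d} TB: buy {bought:4d} TB chunked, {idle:3d} TB idle")
```

The idle column is the part the CFO pays for (power, cooling, space 
included) without getting any use out of it.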

Without turning this into a commercial, we are working with a few folks 
in this regime.  Anyone interested in this stuff, bug me offline.

Do remember, TANSTAAFL though ... you still pay the loaded cost of the 
storage, plus the bandwidth cost and the latency to get at your data.
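
To put a number on the bandwidth and latency part, here is a 
back-of-the-envelope sketch.  The link speed, latency, and data size are 
assumed values for illustration only, and it ignores protocol overhead 
and contention, so treat the result as a lower bound:

```python
def retrieval_seconds(size_gb, bandwidth_mbit_s, latency_s):
    """Estimate time to pull data back over a network link.

    Uses decimal units (1 GB = 10**9 bytes, 1 Mbit = 10**6 bits) and
    ignores protocol overhead, so this is an optimistic estimate.
    """
    bits = size_gb * 8 * 1000**3
    return latency_s + bits / (bandwidth_mbit_s * 1000**2)


if __name__ == "__main__":
    # Hypothetical case: 1 TB over a 100 Mbit/s link with 50 ms latency.
    seconds = retrieval_seconds(size_gb=1000, bandwidth_mbit_s=100, latency_s=0.05)
    print(f"{seconds / 3600:.1f} hours")
```

For bulk retrieval the bandwidth term dominates; latency matters more 
when you are fetching lots of small objects one at a time.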

I have a feeling that, if governments really invest in infrastructure, 
this might be much less of an issue going forward ...

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
