[Beowulf] Slide on big data

Douglas Eadline deadline at eadline.org
Wed Feb 19 06:14:29 PST 2014


Right now "Big Data" is much like other fuzzy marketing
terms such as Cloud and Grid.

And "Big" is a relative term. There are several
aspects of "Big Data" that I have noticed
(often summarized by the three V's of Big Data:
Volume, Variety, and Velocity):

- often an unstructured collection of data not
  easily processed using relational tools
- often unused organizational data collected over time,
  or a recent upswing in transaction-type data from the web
  or other sources, sensors, simulations, experiments
- strains existing infrastructure (which varies by organization)
- rapid (real-time) analysis of data is desired
- big in size, but not that big, e.g.:

On two analytics clusters at Yahoo and Microsoft,
median job input sizes are under 14 GB, and 90%
of jobs on a Facebook cluster have input sizes
under 100 GB. (“Nobody ever got fired for using
Hadoop on a cluster,” HotCDP 2012)

"Big Data's sweet spot starts at 110GB and the discovery
that the most common amount of data the average company has under
management is between 10 to 30TB"
(http://www.sisense.com/blog/bruno/2013/01/13/big-data-surprises)


And yes, I have been swimming in the "Big Data" pool recently.


--
Doug

> Pardon me, what exactly IS Big Data :)
>
>
> On Tue, Feb 18, 2014 at 3:25 PM, Prentice Bisbal <
> prentice.bisbal at rutgers.edu> wrote:
>
>> So I stumbled upon this on reddit yesterday. It would be funny if it
>> wasn't so true:
>>
>> http://i.imgur.com/n4BvQMi.jpg
>>
>> For those of you who don't want to click the link, the slide says the
>> following:
>>
>> Big Data: What is it?
>>
>> Big Data is like teenage sex:
>> - Everyone talks about it
>> - Nobody really knows how to do it
>> - Everyone thinks everyone else is doing it
>> - So, everyone claims they are doing it
>>
>> --
>> Prentice
>>
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf
>>
>
>
>
> --
> Audis,
> 1416J
>
>


