[Beowulf] Slide on big data
Douglas Eadline
deadline at eadline.org
Wed Feb 19 12:19:35 PST 2014
>
> On 02/19/2014 09:14 AM, Douglas Eadline wrote:
>> Right now Big Data is more like other fuzzy marketing
>> words, e.g. Cloud, Grid, etc.
>>
>> And, Big is a relative term. There are several
>> aspects of "Big Data" that I have noticed:
>> (Often summarized by the three V's of Big Data:
>> Volume, Variety and Velocity)
>>
>> - often an un-structured collection of data not
>> easily processed using relational tools
>> - often unused organization data collected over time
>> or a recent upswing in transaction-type data from the web
>> or other sources, sensors, simulations, experiments
>> - strains existing infrastructure (which varies by organization)
>> - rapid analysis (real-time) of data is desired
>> - Big in size, but not that big, e.g.:
>>
>> At two analytics clusters at Yahoo and Microsoft,
>> median job input sizes are under 14 GB, and 90%
>> of jobs on a Facebook cluster have input sizes
>> under 100 GB. (Nobody ever got fired for using
>> Hadoop on a cluster, HotCDP 2012)
>>
>> "Big Data's sweet spot starts at 110GB and the discovery
>> that the most common amount of data the average company has under
>> management is between 10 to 30TB"
>> (http://www.sisense.com/blog/bruno/2013/01/13/big-data-surprises)
>>
>>
>> And, yes I have been swimming in the "Big Data" pool recently.
>
> Have you been drinking Kool-Aid made with water from that pool? ;)
Had a few cups while working on the Hadoop YARN book.
I still prefer HPC beer though.
--
Doug
>>
>>
>> --
>> Doug
>>
>>> Pardon me, what exactly IS Big Data :)
>>>
>>>
>>> On Tue, Feb 18, 2014 at 3:25 PM, Prentice Bisbal <
>>> prentice.bisbal at rutgers.edu> wrote:
>>>
>>>> So I stumbled upon this on reddit yesterday. It would be funny if it
>>>> wasn't so true:
>>>>
>>>> http://i.imgur.com/n4BvQMi.jpg
>>>>
>>>> For those of you who don't want to click the link, the slide says the
>>>> following:
>>>>
>>>> Big Data: What is it?
>>>>
>>>> Big Data is like teenage sex:
>>>> - Everyone talks about it
>>>> - Nobody really knows how to do it
>>>> - Everyone thinks everyone else is doing it
>>>> - So, everyone claims they are doing it
>>>>
>>>> --
>>>> Prentice
>>>>
>>>> _______________________________________________
>>>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
>>>> Computing
>>>> To change your subscription (digest mode or unsubscribe) visit
>>>> http://www.beowulf.org/mailman/listinfo/beowulf
>>>>
>>>
>>>
>>> --
>>> Audis,
>>> 1416J
>>>
>>> --
>>> Mailscanner: Clean
>>>
>>
>>
>
>
>