deadline at eadline.org
Sat Feb 7 10:38:36 PST 2015
> Hello Jonathan.
> Here it is a good document to get you thinking.
> Although Doug said "Oh, and Hadoop clusters are not going to supplant your
I should have continued, ... and there will be overlap.
> I believe that there is an ongoing effort to converge Cloud computing (eg.
> Hadoop) and HPC.
> The key things are exposed in the link I provided.
> To me the convergence is summarized in:
> -strong scalability.
> -reliability/fault tolerance.
> -programming productivity.
> -standardized/cheap infrastructure.
> ------ Original Message ------
> Received: 09:20 AM PST, 02/07/2015
> From: Jonathan Aquilina <jaquilina at eagleeyet.net>
> To: Douglas Eadline <deadline at eadline.org>
> Cc: Beowulf <beowulf at beowulf.org>
> Subject: Re: [Beowulf] hadoop
>> Hey Douglas,
>> Thanks for the information. What has me curious is whether it can be used,
>> for example, in applications which don't involve large amounts of data.
>> If you or anyone has any resources like ebooks or useful websites to
>> read up on it, it would be great if you could send them
>> reason being where I am working we deal with lots of live telemetry in
>> terms of positioning etc. and since we are going to be moving our system
>> away from windows to open source technologies such as angular.js for the
>> web site of our platform as well as mongodb and nodejs, we will be
>> implementing hadoop from amazon to take advantage of Amazon's elastic
>> map reduce.
>> Jonathan Aquilina
>> Founder Eagle Eye T
>> On 2015-02-07 17:33, Douglas Eadline wrote:
>> > Jonathan
>> > I understand your confusion. Hadoop and Big Data reached
>> > overused but not well understood status years ago.
>> > First, Hadoop started out as a MapReduce engine. This all
>> > changed with Hadoop V2 and YARN (Yet Another Resource Negotiator).
>> > Hadoop V2 can be considered a platform on which applications that need
>> > parallel access to large amounts of unstructured data (i.e. raw data
>> > in a traditional database. It can also be used with its own database
>> > which is based on Google Big Table.
>> > The idea is this: a "Hadoop" cluster has a large amount of storage
>> > using HDFS (or possibly another parallel filesystem). This is often
>> > referred to as the "Data Lake." Raw data is dumped in the lake. There is no
>> > ETL (Extract Transform and Load) step. Various Hadoop YARN frameworks
>> > this data. YARN provides a very dynamic resource allocation model and
>> > the ability to provide data locality to your application (i.e. the
>> > MapReduce idea was "move the computation to the data").
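For readers new to the model, the MapReduce idea mentioned above can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle, and reduce steps on one machine. It does not use Hadoop's or YARN's actual API, and the function names (map_phase, shuffle, reduce_phase, run_job) are hypothetical. In a real cluster the map step would run on the nodes that already hold each block of data, which is what "move the computation to the data" refers to.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one raw text record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values seen for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce sequentially over an iterable of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical raw records standing in for files dumped in the data lake.
    raw = ["raw data in the lake", "data in the lake"]
    print(run_job(raw))
```

Because each call to map_phase looks at only one record and each call to reduce_phase looks at only one key, both steps can be spread across many nodes without coordination. That independence is what a framework exploits.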
>> > Thus in a Hadoop V2 cluster you can have MapReduce applications (which
>> > support many of the popular apps like Pig and Hive). It also
>> > Spark, Storm, Giraph and even MPI (not the most efficient but it
>> > There are many other applications being ported to YARN.
>> > Second, Big Data is usually defined by Volume, Velocity, and Variety.
>> > The definition seems to be whatever a vendor wants it to be, however.
>> > It reminds me of products that suddenly became "grid ready" in years
>> > Again, such designations mean as much as "now works with binary data."
>> > Finally, if you are interested in Hadoop YARN you can check out the
>> > "Apache Hadoop YARN: Moving beyond MapReduce and Batch Processing with
>> > Apache Hadoop 2" (I helped write it). There are also many online
>> > The first chapter of the book has the history of Hadoop as written by
>> > one of the developers. It is quite interesting to read and helps dispel
>> > many of the Hadoop myths. You can read this chapter for free here:
>> > That is enough Hadoop for Saturday morning. Oh, and Hadoop clusters
>> > are not going to supplant your HPC cluster.
>> > --
>> > Doug
>> >> Can someone explain to me what exactly the purpose of hadoop is and
>> >> what we mean when we say big data? Is this for data storage and retrieval?
>> >> crunching?
>> >> --
>> >> Regards, Jonathan Aquilina Founder Eagle Eye T
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf