Analysing Data with Hadoop
While the MapReduce programming model is at the heart of Hadoop, it is low-level, which makes it an unproductive way for developers to write complex analysis jobs.
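To give a sense of how low-level the model is, the sketch below shows the mapper and reducer for a simple word count written against the standard Hadoop Java MapReduce API; the class and field names are illustrative, and a separate driver class to configure and submit the job would still be required.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Mapper: emit (word, 1) for every word in an input line.
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        for (String token : line.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}

// Reducer: sum the emitted counts for each word.
class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text word, Iterable<IntWritable> counts, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable count : counts) {
            sum += count.get();
        }
        context.write(word, new IntWritable(sum));
    }
}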
To increase developer productivity, several higher-level languages and APIs have been created that abstract away the low-level details of the MapReduce programming model.
There are several choices available for writing data analysis jobs.
The Hive and Pig projects are popular choices: Hive provides an SQL-like language (HiveQL), while Pig provides a procedural data-flow language (Pig Latin).
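As a rough illustration of the productivity gain, the same word count can be expressed as a single HiveQL query. The sketch below submits it from Java over JDBC; the HiveServer2 address, the docs table, and its line column are assumptions made for the example, and a Pig Latin script would be similarly compact.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveWordCount {
    public static void main(String[] args) throws Exception {
        // Assumed connection details: HiveServer2 on localhost, default database.
        try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
             Statement stmt = conn.createStatement()) {
            // One declarative query replaces the mapper, reducer, and driver classes;
            // Hive compiles it into MapReduce jobs behind the scenes.
            ResultSet results = stmt.executeQuery(
                "SELECT word, COUNT(*) AS cnt "
                + "FROM (SELECT explode(split(line, '\\\\s+')) AS word FROM docs) words "
                + "GROUP BY word");
            while (results.next()) {
                System.out.println(results.getString("word") + "\t" + results.getLong("cnt"));
            }
        }
    }
}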
HBase is also a popular way to store and analyze data in HDFS. It is a column-oriented database and, unlike batch-oriented MapReduce, provides low-latency random read and write access to data.
MapReduce jobs can read and write data in HBase’s table format, but data processing is often done via HBase’s own client API.
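The following is a minimal sketch of that client API using the Connection/Table classes from recent HBase releases; the table name users, the column family info, and the stored values are assumptions for illustration. It writes one cell and reads it back, the kind of low-latency random access that a MapReduce job cannot offer.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseRandomAccess {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();  // reads hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("users"))) {
            // Random write: one cell addressed by row key, column family, and qualifier.
            Put put = new Put(Bytes.toBytes("user123"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Alice"));
            table.put(put);

            // Random read of the same cell.
            Result result = table.get(new Get(Bytes.toBytes("user123")));
            byte[] name = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
            System.out.println(Bytes.toString(name));
        }
    }
}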