Anatomy of a MapReduce Job Run
The Hadoop framework comprises two main components:
- Hadoop Distributed File System (HDFS) for data storage (see the short read example after this list)
- MapReduce for data processing
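As a small illustration of the storage side, the sketch below reads a file from HDFS through Hadoop's FileSystem API. It is a minimal sketch, not part of the discussion above: the class name HdfsRead and the path /user/demo/input.txt are placeholders, and it assumes the cluster configuration files (core-site.xml, hdfs-site.xml) are on the classpath.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsRead {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();       // picks up core-site.xml / hdfs-site.xml
    FileSystem fs = FileSystem.get(conf);           // handle to the configured file system (HDFS)
    Path file = new Path("/user/demo/input.txt");   // placeholder HDFS path

    // Stream the file line by line; fs.open() returns an InputStream-compatible handle.
    try (BufferedReader reader =
             new BufferedReader(new InputStreamReader(fs.open(file)))) {
      String line;
      while ((line = reader.readLine()) != null) {
        System.out.println(line);
      }
    }
  }
}
```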
A typical Hadoop MapReduce job is divided into a set of Map and Reduce tasks that execute on a Hadoop cluster.
The execution flow occurs as follows:
- The input data is divided into splits, each a small subset of the input.
- A Map task processes each split.
- The intermediate output of the Map tasks is then transferred to the Reduce tasks through an intermediate step called the 'shuffle'.
- The Reduce task(s) process this intermediate data to produce the final output of the MapReduce job (a minimal word-count job is sketched after this list).
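To make the flow concrete, the sketch below shows the classic word-count job using the standard org.apache.hadoop.mapreduce API. It is a minimal example under simple assumptions: the class names (WordCount, TokenizerMapper, IntSumReducer) and the input/output paths passed on the command line are illustrative, and the default TextInputFormat is assumed, so each Map call receives one line of text.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map task: runs once per record of an input split, emitting (word, 1) pairs.
  public static class TokenizerMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce task: receives all values for a key after the shuffle and sums them.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // input directory on HDFS
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output directory on HDFS
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Each Map task emits (word, 1) pairs from its split, the shuffle groups all values for the same word together, and each Reduce call sums them to produce the final counts.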