
Tuesday, April 12, 2022

Anatomy of a Map-Reduce Job Run

The Hadoop framework comprises two main components:

  • Hadoop Distributed File System (HDFS) for data storage
  • MapReduce for data processing

A typical Hadoop MapReduce job is divided into a set of Map and Reduce tasks that execute on a Hadoop cluster. 
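For illustration, a minimal driver for the classic word-count job might be configured and submitted as in the sketch below. This is a sketch using the standard org.apache.hadoop.mapreduce API; WordCountMapper and WordCountReducer are placeholder class names for this example (sketched after the execution-flow list).

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountDriver.class);

        // The framework splits the input and runs one Map task per split.
        job.setMapperClass(WordCountMapper.class);
        // Reduce tasks receive the shuffled intermediate data.
        job.setReducerClass(WordCountReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Input and output paths are taken from the command line.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Submit the job and block until it completes.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}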

The execution flow is as follows:

  • The input data is split into small, independent chunks (input splits).
  • A Map task processes each split.
  • The intermediate output of the Map tasks is then passed to the Reduce task(s) after an intermediate phase called the ‘shuffle’.
  • The Reduce task(s) work on this intermediate data to generate the final result of the MapReduce job; a minimal code sketch follows this list.
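To make the flow concrete, here is a minimal sketch of the Map and Reduce tasks for a word-count job, matching the hypothetical driver above. Both classes are shown in one listing for brevity; in a real project each would live in its own file.

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Map task: runs once per input split, emitting an intermediate
// (word, 1) pair for every word it encounters.
public class WordCountMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer tokens = new StringTokenizer(value.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE); // intermediate (key, value) pair
        }
    }
}

// Reduce task: after the shuffle, receives all intermediate values
// for one word and sums them to produce the final count.
class WordCountReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum)); // final (word, count)
    }
}

The shuffle guarantees that all intermediate pairs sharing a key arrive at the same Reduce task, which is what makes the per-word summation above correct.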
