HDFS Interfaces

HDFS Interfaces :

HDFS interfaces provide the following features :

  1. Create new file
  2. Upload files/folders
  3. Set Permission
  4. Copy
  5. Move
  6. Rename
  7. Delete
  8. Drag and Drop
  9. HDFS File viewer
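
Several of the features listed above also have counterparts in the `hdfs dfs` command-line shell. The sketch below is only an illustration of that mapping, assuming a Hadoop client is installed; the helper name `hdfs_command` and the action names are hypothetical, chosen here to mirror the list. It builds the argument list for each operation without running anything.

```python
# Flags for the `hdfs dfs` shell, keyed by (hypothetical) action names
# that mirror the feature list above.
HDFS_FLAGS = {
    "create": ["-touchz"],      # create an empty (zero-length) file
    "upload": ["-put"],         # local source first, HDFS destination second
    "copy": ["-cp"],
    "move": ["-mv"],
    "rename": ["-mv"],          # a rename is a move to a new name
    "delete": ["-rm", "-r"],    # -r also removes directories recursively
    "view": ["-cat"],           # print a file's contents
}


def hdfs_command(action, *paths, mode=None):
    """Build the argument list for one file operation without running it."""
    if action == "set_permission":
        if mode is None:
            raise ValueError("set_permission needs a mode such as '644'")
        return ["hdfs", "dfs", "-chmod", mode, *paths]
    if action not in HDFS_FLAGS:
        raise ValueError(f"unsupported action: {action}")
    return ["hdfs", "dfs", *HDFS_FLAGS[action], *paths]
```

On a machine with a configured Hadoop client, the returned list could be passed to `subprocess.run(cmd, check=True)`. Drag and drop is a feature of graphical interfaces and has no shell equivalent, so it is left out.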

Data Flow :

  • MapReduce is used to process huge amounts of data.
  • To handle the incoming data in a parallel and distributed manner, the data flows through the following phases :
  • Input Reader :
      • The input reader reads the incoming data and splits it into data blocks of the appropriate size (64 MB to 128 MB).
      • Once the input reader has read the data, it generates the corresponding key-value pairs.
      • The input files reside in HDFS.
  • Map Function :
      • The Map function processes the incoming key-value pairs and generates the corresponding output key-value pairs.
      • The Map function's input and output types may differ from each other.
  • Partition Function :
      • The partition function assigns the output of each Map function to the appropriate reducer.
      • It is given the key and the value as input.
      • It returns the index of the reducer that should receive the pair.
  • Shuffling and Sorting :
      • The data is shuffled between nodes so that it moves out of the map phase and is ready to be processed by the Reduce function.
      • The input to the Reduce function is sorted by key.
  • Reduce Function :
      • The Reduce function is invoked once for each unique key.
      • These keys are already arranged in sorted order.
      • The Reduce function iterates over the values associated with each key and generates the corresponding output.
  • Output Writer :
      • Once the data has flowed through all of the phases above, the output writer executes.
      • The role of the output writer is to write the Reduce output to stable storage.
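
The phases above can be illustrated with a small in-memory word count. This is a sketch only, not Hadoop's implementation: it runs in a single process, every function name is hypothetical, the input is a list of strings rather than HDFS blocks, and the partitioner uses a CRC32 hash purely as an assumption to keep the assignment of keys to reducers deterministic.

```python
from collections import defaultdict
from zlib import crc32


def input_reader(lines):
    """Input reader: emit one (line number, text) key-value pair per line."""
    for line_number, text in enumerate(lines):
        yield line_number, text


def map_function(key, value):
    """Map: emit (word, 1) for every word; output types differ from input types."""
    for word in value.lower().split():
        yield word, 1


def partition_function(key, value, num_reducers):
    """Partition: return the index of the reducer that receives this key."""
    return crc32(key.encode("utf-8")) % num_reducers


def shuffle_and_sort(mapped_pairs, num_reducers):
    """Shuffle: group values by key per reducer, then sort each group by key."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in mapped_pairs:
        index = partition_function(key, value, num_reducers)
        partitions[index][key].append(value)
    return [sorted(partition.items()) for partition in partitions]


def reduce_function(key, values):
    """Reduce: called once per unique key, iterating over that key's values."""
    return key, sum(values)


def output_writer(results):
    """Output writer: format results as text (a real job writes to stable storage)."""
    return "\n".join(f"{key}\t{count}" for key, count in results)


def run_job(lines, num_reducers=2):
    """Run all phases in order and return one result list per reducer."""
    mapped = [
        pair
        for key, value in input_reader(lines)
        for pair in map_function(key, value)
    ]
    return [
        [reduce_function(key, values) for key, values in partition]
        for partition in shuffle_and_sort(mapped, num_reducers)
    ]
```

Calling `run_job(["big data big cluster", "data flow"])` returns one list of `(word, count)` pairs per reducer, each sorted by key; passing any of those lists to `output_writer` yields the tab-separated text that a reducer would write out. Which reducer receives which word depends on the hash, so the split between the lists is not meaningful in itself.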
