Unit Tests With MRUnit

  • Hadoop MapReduce jobs have a unique code architecture that follows a specific template with specific constructs. 
  • This architecture raises interesting issues when doing test-driven development (TDD) and writing unit tests.   
  • With MRUnit, you can craft test input, push it through your mapper and/or reducer, and verify its output all in a JUnit test.
  • As with other JUnit tests, this lets you debug your code using the JUnit test as a driver.
  • A map/reduce pair can be tested using MRUnit’s MapReduceDriver; a combiner can be tested using MapReduceDriver as well.
  • A PipelineMapReduceDriver allows you to test a workflow of map/reduce jobs. 
  • Currently, partitioners do not have a test driver under MRUnit.  
  • MRUnit allows you to do TDD and write lightweight unit tests that accommodate Hadoop’s specific architecture and constructs.

Example: We’re processing road surface data used to create maps. The input contains both linear surfaces and intersections. The mapper takes a collection of these mixed surfaces as input, discards anything that isn’t a linear road surface (i.e., intersections), then processes each road surface and writes it out to HDFS. We can keep a count of the non-road surfaces in the input and eventually print it out. For debugging purposes, we can additionally print out how many road surfaces were processed.
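
The filtering and counting in this example can be sketched in plain Java, without any Hadoop or MRUnit types, to make clear what a unit test would check. Everything named here is an assumption made for illustration: the class RoadSurfaceFilter, the SurfaceCounter enum, and the "TYPE,id" record format do not come from the original job, and an in-memory list stands in for the output written to HDFS.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the mapper's logic; not the original job's code.
class RoadSurfaceFilter {

    // Hypothetical counter names for the two tallies described in the example.
    public enum SurfaceCounter { ROADS, NON_ROADS }

    private final Map<SurfaceCounter, Long> counters = new EnumMap<>(SurfaceCounter.class);
    private final List<String> output = new ArrayList<>();

    // Handles one input record. Assumes the made-up format "TYPE,id",
    // where TYPE is "ROAD" for a linear road surface.
    public void map(String record) {
        String type = record.split(",", 2)[0].trim();
        if (type.equals("ROAD")) {
            output.add(record);                  // keep linear road surfaces
            increment(SurfaceCounter.ROADS);
        } else {
            increment(SurfaceCounter.NON_ROADS); // discard intersections and anything else
        }
    }

    private void increment(SurfaceCounter counter) {
        counters.merge(counter, 1L, Long::sum);
    }

    public long count(SurfaceCounter counter) {
        return counters.getOrDefault(counter, 0L);
    }

    public List<String> output() {
        return output;
    }

    public static void main(String[] args) {
        RoadSurfaceFilter filter = new RoadSurfaceFilter();
        for (String record : new String[] {"ROAD,r1", "INTERSECTION,i1", "ROAD,r2"}) {
            filter.map(record);
        }
        // Prints: roads=2 nonRoads=1
        System.out.println("roads=" + filter.count(SurfaceCounter.ROADS)
                + " nonRoads=" + filter.count(SurfaceCounter.NON_ROADS));
    }
}
```

In a real job this logic would sit inside a Hadoop Mapper, and the two tallies would be Hadoop counters. An MRUnit test would then feed the same three records through a MapDriver with withInput, declare the two expected road records with withOutput, call runTest, and check the counter values afterwards.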
