Unit Tests With MRUnit:
- Hadoop MapReduce jobs have a unique code architecture that follows a specific template with specific constructs.
- This architecture raises interesting issues when doing test-driven development (TDD) and writing unit tests.
- With MRUnit, you can craft test input, push it through your mapper and/or reducer, and verify its output all in a JUnit test.
- As with other JUnit tests, this allows you to debug your code using the JUnit test as a driver.
- A map/reduce pair can be tested using MRUnit’s MapReduceDriver. A combiner can be tested using MapReduceDriver as well.
- A PipelineMapReduceDriver allows you to test a workflow of map/reduce jobs.
- Currently, partitioners do not have a test driver under MRUnit.
- MRUnit allows you to do test-driven development (TDD) and write lightweight unit tests that accommodate Hadoop’s specific architecture and constructs.
Example: We’re processing road surface data used to create maps. The input contains both linear surfaces and intersections. The mapper takes a collection of these mixed surfaces as input, discards anything that isn’t a linear road surface (that is, the intersections), then processes each remaining road surface and writes it out to HDFS. We can keep a count of the non-road surfaces in the input and eventually print it out. For debugging purposes, we can additionally print out how many road surfaces were processed.
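
The filter-and-count behavior described in this example can be sketched in plain Java. This is only an illustration under stated assumptions: it deliberately uses no Hadoop or MRUnit types so that it runs on its own, and every name in it (SurfaceFilterSketch, Surface, SurfaceType, Result, process) is hypothetical rather than taken from the original job. The list of written ids stands in for the output that would go to HDFS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the mapper logic; no Hadoop or MRUnit classes are used.
class SurfaceFilterSketch {

    // Assumed record kinds: the example only distinguishes roads from intersections.
    enum SurfaceType { ROAD, INTERSECTION }

    // Assumed minimal record shape: an id plus its type.
    static final class Surface {
        final String id;
        final SurfaceType type;

        Surface(String id, SurfaceType type) {
            this.id = id;
            this.type = type;
        }
    }

    // Outcome of one pass: the records kept, plus the two tallies from the example.
    static final class Result {
        final List<String> written = new ArrayList<>(); // stands in for HDFS output
        long roadSurfaces;     // debugging count of processed road surfaces
        long nonRoadSurfaces;  // count of discarded non-road inputs
    }

    // Keep linear road surfaces, discard everything else, and tally both groups.
    static Result process(List<Surface> input) {
        Result result = new Result();
        for (Surface s : input) {
            if (s.type == SurfaceType.ROAD) {
                result.written.add(s.id);
                result.roadSurfaces++;
            } else {
                result.nonRoadSurfaces++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = process(Arrays.asList(
                new Surface("r1", SurfaceType.ROAD),
                new Surface("x1", SurfaceType.INTERSECTION),
                new Surface("r2", SurfaceType.ROAD)));
        System.out.println(r.written + " roads=" + r.roadSurfaces
                + " nonRoads=" + r.nonRoadSurfaces);
    }
}
```

In an actual job these tallies would more likely be Hadoop counters, and an MRUnit test would push the same kind of mixed input through the real mapper and then check both the emitted records and the counter values.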