Hadoop Recipe – Using Custom Java Counters

This post starts the Hadoop Recipe series, in which I pick a topic and provide sample code around it. Each recipe is small and concise, and provides ready-to-use hints on the topic covered.

This post covers the use of custom counters in Java in the Hadoop world. Counters are very helpful in MapReduce: we can use them to watch a job's progress, or as an indirect way of debugging or validating it. For example, you may want a counter that tells you how many records of a specific type were processed out of the complete data set, such as how many requests in an Apache access log contained a specific keyword. There are many similar use cases.

Hadoop provides some built-in counters that are always available, such as the number of map input records and the number of bytes processed.

Let's see how we can use a custom counter in Java code.

Define the counter

public static enum COUNTERS {
    ERROR_COUNT,
    MISSING_FIELDS_RECORD_COUNT
}

Defining counters is very simple: we declare an enum with one constant for each counter we want to use.
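To show what the enum gives you, here is a minimal plain-Java sketch with no Hadoop dependency. As far as I know, when an enum constant is used as a counter key, Hadoop groups it under the fully qualified name of the enum's class and uses the constant's name as the counter name. The class CounterNamesDemo and the helpers groupName and counterName are hypothetical names made up for this illustration, not part of the Hadoop API.

```java
// Plain-Java sketch (no Hadoop dependency): derives the two strings that
// identify an enum-based counter. CounterNamesDemo, groupName and
// counterName are hypothetical names used only for this illustration.
class CounterNamesDemo {

    // Same enum as in the recipe.
    enum COUNTERS {
        ERROR_COUNT,
        MISSING_FIELDS_RECORD_COUNT
    }

    // Group name: the fully qualified name of the enum's class.
    static String groupName(Enum<?> key) {
        return key.getDeclaringClass().getName();
    }

    // Counter name: the name of the enum constant.
    static String counterName(Enum<?> key) {
        return key.name();
    }

    public static void main(String[] args) {
        // Print "group -> counter" for every constant in the enum.
        for (COUNTERS c : COUNTERS.values()) {
            System.out.println(groupName(c) + " -> " + counterName(c));
        }
    }
}
```

A practical consequence: renaming the enum or moving it to another package changes the group under which the counters appear in the job output.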

Using the Counters

@Override
public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    // mapper code here; `error` and `missingRecords` stand for
    // boolean conditions computed from the record

    // if error condition, increment the error counter
    if (error) {
        context.getCounter(COUNTERS.ERROR_COUNT).increment(1);
    }

    // if the record has missing fields, increment that counter
    if (missingRecords) {
        context.getCounter(COUNTERS.MISSING_FIELDS_RECORD_COUNT).increment(1);
    }
}

Using counters is just as simple: when a given condition holds, increment the corresponding counter. The example increments by one, but you can increment by a larger value as well.
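The conditions themselves depend on your data. As one possible sketch, the check below decides which counter (if any) a record should bump, kept free of Hadoop classes so it can be tried on its own. Everything in it is an assumption for illustration: the class RecordChecks, the method counterFor, the tab-separated record format, and the rule that a blank line counts as an error.

```java
import java.util.Optional;

// Plain-Java sketch of the per-record checks. RecordChecks, counterFor,
// the tab-separated format and the "blank line is an error" rule are all
// assumptions for illustration, not part of the original recipe.
class RecordChecks {

    // Same enum as in the recipe.
    enum COUNTERS {
        ERROR_COUNT,
        MISSING_FIELDS_RECORD_COUNT
    }

    // Returns the counter to increment for this line, or empty if the
    // record looks fine.
    static Optional<COUNTERS> counterFor(String line, int expectedFields) {
        if (line == null || line.trim().isEmpty()) {
            return Optional.of(COUNTERS.ERROR_COUNT);
        }
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        String[] fields = line.split("\t", -1);
        if (fields.length < expectedFields) {
            return Optional.of(COUNTERS.MISSING_FIELDS_RECORD_COUNT);
        }
        for (String field : fields) {
            if (field.isEmpty()) {
                return Optional.of(COUNTERS.MISSING_FIELDS_RECORD_COUNT);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String[] samples = { "a\tb\tc", "a\tb", "a\t\tc", "   " };
        for (String s : samples) {
            System.out.println("[" + s + "] -> " + counterFor(s, 3));
        }
    }
}
```

Inside the mapper, a check like this could be wired in with something along the lines of counterFor(value.toString(), 3).ifPresent(c -> context.getCounter(c).increment(1)), provided the method uses the same COUNTERS enum as the job.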

Viewing the Counter values

You can view the counter values in the JobTracker UI or programmatically. You can print all the counters or a specific one. The following example shows how to print the value of a specific counter:


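// Code in the job driver class, after the job has finished
// (for example, after job.waitForCompletion(true) returns).
// Counter here is org.apache.hadoop.mapreduce.Counter.
Counter errorCounter = job.getCounters().findCounter(COUNTERS.ERROR_COUNT);
System.out.println("Error Counter = " + errorCounter.getValue());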
