Finding the Input File for a Hadoop Map Task
I had many map-only jobs whose main function was to clean the incoming log stream and emit a refined output log with consistent fields. Because of code bugs or variations in the input, these map jobs would often get killed or fail to produce the desired output. In the quest of [...]
