Compiling Hadoop code

After wandering for around a month and a half and pulling my hair out, it feels good when the work starts heading in a definite direction. Understanding how Hadoop works and actually writing code for it are miles apart.

One of the problems I faced was that I couldn’t compile any of the Hadoop code I wrote, not even the examples given in the books. The errors looked something like this:

xyz.java:5: package org.apache.hadoop.fs does not exist
import org.apache.hadoop.fs.Path;
^
xyz.java:6: package org.apache.hadoop.io does not exist
import org.apache.hadoop.io.*;
^

and so on.

The underlying problem is the classpath. The Hadoop library jars are not on the compiler's default classpath, so javac cannot resolve the org.apache.hadoop packages unless we point it at them. This can be done with:

$ javac -classpath hadoop-common-0.21.0.jar <filename.java>

You can add the -verbose option to the command line to see what is actually going on during compilation.
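If you want to confirm that a given classpath really exposes the Hadoop classes, a small check like the one below can help. This is only a sketch: ClasspathCheck is a hypothetical helper of my own, not part of Hadoop, and it inspects the classpath the JVM was started with (the java command accepts the same -classpath option as javac).

```java
// Hypothetical helper, not part of Hadoop: reports whether a class can be
// located on the classpath the JVM was started with.
final class ClasspathCheck {

    // Returns true if the named class can be loaded, false otherwise.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate and load the class,
            // do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Classes from the two packages named in the compiler errors.
        String[] names = {
            "org.apache.hadoop.fs.Path",
            "org.apache.hadoop.io.Text"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "missing"));
        }
    }
}
```

Run without any Hadoop jar, both classes should be reported as missing. On Linux, running it as java -classpath hadoop-common-0.21.0.jar:. ClasspathCheck should report them as found, assuming the jar sits in the current directory.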

I did this on Linux, but the OS doesn’t really matter; the same syntax works on Windows too.
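One platform detail does show up once the classpath has more than one entry: Linux separates entries with a colon, while Windows uses a semicolon. The sketch below builds the string portably with File.pathSeparator. ClasspathBuilder is a hypothetical helper, and the second jar name is only an illustration of adding another library.

```java
import java.io.File;

// Hypothetical helper: joins classpath entries with the separator
// of the platform the program runs on (":" on Linux, ";" on Windows).
final class ClasspathBuilder {

    static String join(String... entries) {
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        // The first jar is the one used above; the second is an assumed
        // extra library, and "." adds the current directory.
        String cp = join("hadoop-common-0.21.0.jar", "hadoop-mapred-0.21.0.jar", ".");
        System.out.println("javac -classpath " + cp + " <filename.java>");
    }
}
```

The printed line is the javac command with a separator that matches the machine it was generated on.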

With this, your Hadoop code compiles. Jar up the class files and then execute them.
