Apache Oozie is a workflow scheduling system used to manage Hadoop jobs. In this post we shall look at a quick setup.
Pre-Conditions
JDK 1.7+
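As a rough sanity check before building, the JDK requirement can be verified with a small sketch like the one below. The function names java_major and check_jdk are made up for this post, and the parsing assumes the usual `java -version` output of the form `... version "1.7.0_80"`.

```shell
# Sketch only: java_major and check_jdk are hypothetical helper names.

# Reduce a Java version string to its major number.
# Legacy scheme: 1.7.0_80 -> 7, 1.8.0_292 -> 8. Newer scheme: 11.0.2 -> 11.
java_major() {
  local v="$1"
  v="${v#1.}"        # drop the legacy "1." prefix if present
  v="${v%%[._-]*}"   # keep only the leading number
  echo "$v"
}

# Return 0 if the java on PATH reports version 1.7 or newer.
# Assumes `java -version` prints a line containing: version "<number>"
check_jdk() {
  local raw major
  raw=$(java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1)
  [ -n "$raw" ] || { echo "could not determine Java version" >&2; return 1; }
  major=$(java_major "$raw")
  [ "$major" -ge 7 ]
}

# Usage:
#   check_jdk && echo "JDK is recent enough"
```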
Building Oozie
Download Oozie source distribution.
Extract the distribution.
The build can be customized for the Hadoop version in use, but here we will do a simple uber build.
Execute the following command from the directory where the distribution was extracted:
bash-3.2$ bin/mkdistro.sh -DskipTests -Puber
Once the build is done, the Oozie distribution can be found under distro/target:
bash-3.2$ cd distro/target/
bash-3.2$ ls
antrun          maven-shared-archive-resources  test-classes
archive-tmp     oozie-4.2.0-distro              tomcat
classes         oozie-4.2.0-distro.tar.gz
maven-archiver  oozie-distro-4.2.0.jar
Next, copy oozie-4.2.0-distro.tar.gz to a desired location and extract it. The extracted directory becomes our Oozie installation home.
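The copy-and-extract step could be scripted roughly as follows. This is a minimal sketch: install_oozie is a hypothetical helper name, the destination directory is whatever you choose, and the top-level directory name is read from the archive itself rather than assumed.

```shell
# Sketch only: install_oozie is a hypothetical helper, not part of Oozie.
# Extracts the distribution tarball into a destination directory and prints
# the resulting installation home.
install_oozie() {
  local tarball="$1" dest="$2" top
  [ -f "$tarball" ] || { echo "tarball not found: $tarball" >&2; return 1; }
  mkdir -p "$dest" || return 1
  # Read the top-level directory name from the first archive entry.
  top=$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1)
  [ -n "$top" ] || { echo "could not read archive: $tarball" >&2; return 1; }
  tar -xzf "$tarball" -C "$dest" || return 1
  echo "$dest/$top"
}

# Usage (the destination path is a placeholder):
#   OOZIE_HOME=$(install_oozie distro/target/oozie-4.2.0-distro.tar.gz "$HOME/apps")
```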
Setting up extjs
Download ExtJS 2.2 to enable the web console. It is available at the following links:
http://dev.sencha.com/deploy/ext-2.2.zip
http://archive.cloudera.com/gplextras/misc/ext-2.2.zip
Create a directory named libext in the Oozie installation home and copy the downloaded ext-2.2.zip into it.
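In shell terms, this step might look like the sketch below. add_extjs is a hypothetical helper name; both the installation home and the path of the downloaded zip are passed in, since they depend on your setup.

```shell
# Sketch only: add_extjs is a hypothetical helper, not part of Oozie.
# Creates <oozie home>/libext if needed and copies the ExtJS zip into it.
add_extjs() {
  local oozie_home="$1" ext_zip="$2"
  [ -d "$oozie_home" ] || { echo "no such directory: $oozie_home" >&2; return 1; }
  [ -f "$ext_zip" ] || { echo "zip not found: $ext_zip" >&2; return 1; }
  mkdir -p "$oozie_home/libext" || return 1
  cp "$ext_zip" "$oozie_home/libext/"
}

# Usage (both paths are placeholders):
#   add_extjs "$OOZIE_HOME" "$HOME/Downloads/ext-2.2.zip"
```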
Starting Oozie
From the Oozie installation home, execute the following:
bash-3.2$ bin/oozied.sh start
This shall start the Oozie server in an embedded Tomcat. You can access Oozie console at
You should see the following screen.
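If the console does not come up, the server can also be checked from the command line with the oozie client's admin -status option, which reports the system mode. The sketch below wraps that call. oozie_status_ok is a hypothetical helper name, and the server URL is passed in as an argument because it depends on your installation.

```shell
# Sketch only: oozie_status_ok is a hypothetical helper, not part of Oozie.
# Runs `oozie admin -status` against the given server URL and succeeds only
# if the server reports normal mode.
oozie_status_ok() {
  local oozie_home="$1" oozie_url="$2" out
  out=$("$oozie_home/bin/oozie" admin -oozie "$oozie_url" -status 2>&1) || return 1
  case "$out" in
    *"System mode: NORMAL"*) return 0 ;;
    *) echo "unexpected status: $out" >&2; return 1 ;;
  esac
}

# Usage (OOZIE_URL is a placeholder for your server's URL):
#   oozie_status_ok "$OOZIE_HOME" "$OOZIE_URL" && echo "Oozie is up"
```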