How to compile Hadoop 2.7.1 on Raspberry Pi B+

Last updated: 28-01-2016

One day a friend of mine gave me an old Raspberry Pi B+, and I found this tutorial on the internet by The University of Glasgow’s Raspberry Pi Project Blog. I tried to compile Hadoop on it in such a way that the native libraries are loaded (you can check this with hadoop checknative -a).

But why are the native libraries (libhadoop, zlib, snappy, lz4, bzip2 and openssl) so important? I got the answer from the Hadoop page:

“Hadoop has native implementations of certain components for performance reasons and for non-availability of Java implementations.”

The Raspberry Pi B+ is not very fast, so using as much native code as possible is better! (The Raspberry Pi Zero has also been released: it is 40% faster than the B+, but it has no Ethernet support.)

So I started to compile it on the hardware itself. I started from this tutorial by Headamage for the Raspberry Pi 2 and, after some days of experiments, I successfully compiled Hadoop!
First: I used an external USB disk, also because compiling a project of this size can wear out your SD card. You need at least 1.5GB of free space on an ext2 partition (or some other file system that supports symbolic links) just to compile Hadoop without protobuf, so I suggest 3GB of free disk space for everything.
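Both requirements on the external disk (symbolic link support and free space) can be checked before starting. The following is only a minimal sketch: the mount point /mnt/usbdisk and the names BUILD_DIR, supports_symlinks and free_disk_mb are hypothetical, so adjust them to your setup.

```shell
# Succeeds if symbolic links can be created inside the given directory.
supports_symlinks() {
    dir="$1"
    link="$dir/.symlink_test_$$"
    if ln -s "$dir" "$link" 2>/dev/null; then
        rm -f "$link"
        return 0
    fi
    return 1
}

# Print the free space, in MB, of the file system that holds the given directory.
free_disk_mb() {
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 }'
}

# Hypothetical mount point of the external USB disk.
BUILD_DIR="${BUILD_DIR:-/mnt/usbdisk}"

if [ -d "$BUILD_DIR" ]; then
    if supports_symlinks "$BUILD_DIR"; then
        echo "Symbolic links: supported"
    else
        echo "Symbolic links: NOT supported, use another file system"
    fi
    # 3GB is the amount suggested above.
    echo "Free space: $(free_disk_mb "$BUILD_DIR")MB (3072MB suggested)"
fi
```

A FAT-formatted disk would typically fail the first check, which is why an ext2 partition is suggested.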

The Compilation Process

The first thing to do is to run your distro in console mode and check that you have at least 280/300MB of free RAM.
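The RAM check can be scripted too. This sketch assumes a Linux kernel that exposes /proc/meminfo and reads MemAvailable, falling back to MemFree on older kernels; the function name available_ram_mb and the variable MIN_MB are made up for this example, and the numbers may differ slightly from what the free command prints.

```shell
# Print available RAM in MB, read from a meminfo-style file (default: /proc/meminfo).
# Falls back to MemFree when the kernel does not report MemAvailable.
available_ram_mb() {
    meminfo="${1:-/proc/meminfo}"
    awk '
        /^MemAvailable:/ { avail = $2 }
        /^MemFree:/      { free = $2 }
        END { kb = (avail != "" ? avail : free); printf "%d\n", kb / 1024 }
    ' "$meminfo"
}

MIN_MB=300   # upper end of the 280/300MB figure above

if [ -r /proc/meminfo ]; then
    ram_mb=$(available_ram_mb)
    if [ "$ram_mb" -ge "$MIN_MB" ]; then
        echo "OK: ${ram_mb}MB available"
    else
        echo "Only ${ram_mb}MB available: stop some services before compiling"
    fi
fi
```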

If you have enough free RAM, follow this tutorial up to the 3rd step: it lists all the packages you need to install and explains how to install protobuf.
Don’t forget to download and unpack the hadoop-2.7.1 source on the external disk!

Hadoop compilation steps

You have to set the JAVA_HOME environment variable to avoid errors while compiling: it is used in the Makefiles. So type: export JAVA_HOME=/path/to/jdk/
I used the Oracle JDK 8u65 (a JDK, not a JRE!) for Linux ARM v6/v7 Hard Float ABI rather than OpenJDK, because the Oracle JDK performs better than OpenJDK.
You have to set the Java heap space for Maven:
export MAVEN_OPTS="-Xmx256m"
I tried different heap sizes and 256MB is the minimum!
Now you can execute this Maven command:
mvn package -Pdist,native -DskipTests=true -Dtar -Dmaven.javadoc.skip=true
Besides skipping test execution (you can also use -Dmaven.test.skip=true to skip compiling the tests as well), this command also skips Javadoc creation.
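The steps above can be collected into a small script. This is only a sketch: JDK_DIR and the helper is_jdk are names invented for this example, /path/to/jdk is the same placeholder used above, and the script only prepares the environment and prints the Maven command instead of running it.

```shell
# Placeholder path: point it at your unpacked JDK.
JDK_DIR="${JDK_DIR:-/path/to/jdk}"

# Succeeds only if the directory looks like a JDK (it has bin/javac), not just a JRE.
is_jdk() {
    [ -x "$1/bin/javac" ]
}

if is_jdk "$JDK_DIR"; then
    export JAVA_HOME="$JDK_DIR"
    export MAVEN_OPTS="-Xmx256m"   # note the plain ASCII quotes
    echo "Environment ready. From the hadoop-2.7.1 source directory, run:"
    echo "mvn package -Pdist,native -DskipTests=true -Dtar -Dmaven.javadoc.skip=true"
else
    echo "No javac under $JDK_DIR: set JDK_DIR to a full JDK" >&2
fi
```

Checking for bin/javac first catches the case where JAVA_HOME points to a JRE, which would otherwise only fail some time into the build.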

Finally, after 2-3 hours of compilation, you have the compiled code in a tar.gz file in the “/hadoop-src/hadoop-dist/target/” directory: here you can download my result.

There’s another way to compile the libraries, also from Glasgow: see the 5th step 😛

Installation

You can download the latest version of your distro and follow the tutorial by The University of Glasgow’s Raspberry Pi Project to install your compiled Hadoop.
I tried my compiled code on the latest RASPBIAN JESSIE LITE (released on 2015-11-21), and I found that I only needed to install libssl-dev to support OpenSSL, while zlib is already installed and lz4 is built into Hadoop.
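To see which native libraries are still missing after the installation, the output of hadoop checknative -a can be filtered. This sketch assumes that each line of the output has the form "name: true ..." or "name: false ..."; the function name missing_native_libs is invented for this example.

```shell
# Read `hadoop checknative -a` output on stdin and print the name of each
# library reported as not loaded.
# Assumption: lines look like "openssl: false ..." or "zlib: true /path/to/lib".
missing_native_libs() {
    awk '$2 == "false" { sub(/:$/, "", $1); print $1 }'
}

# Usage on a machine where Hadoop is installed:
#   hadoop checknative -a 2>/dev/null | missing_native_libs
```

If openssl shows up in the list on Raspbian, installing libssl-dev (for example with sudo apt-get install libssl-dev) should be enough, as described above.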

Have fun!