Sunday, August 30, 2015

Installing Mahout on Spark 1.4.1

Installing Mahout and Spark

In this post I describe the steps to install Mahout with Apache Spark 1.4.1 (the latest version at the time of writing). The possible errors and their remedies are listed in a follow-up post.

Installing Mahout & Spark on your local machine

1) Download Apache Spark 1.4.1 and unpack the archive file

2) Change to the directory where you unpacked Spark and type sbt/sbt assembly to build it

3) Make sure the right version of Maven (3.3) is installed on your system. If it is not, install Maven before building Mahout

4) Create a directory for Mahout somewhere on your machine, change to it and check out the master branch of Apache Mahout from GitHub: git clone https://github.com/apache/mahout mahout

5) Change to the mahout directory and build Mahout using mvn -DskipTests clean install
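Step 3 is easy to get wrong, so here is a small sketch of how the Maven check could be scripted before running the build. The function names version_ge and check_maven are my own (hypothetical), the 3.3 minimum is taken from step 3, and the comparison assumes a sort that supports -V (GNU coreutils).

```shell
#!/bin/sh
# Sketch only: version_ge and check_maven are illustrative helper names,
# not part of Mahout, Spark or Maven.

# Succeeds when version $1 is greater than or equal to version $2.
# Assumes a sort that supports -V (version ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Succeeds when mvn is on the PATH and at least the required version
# (default 3.3, the version named in step 3).
check_maven() {
    required="${1:-3.3}"
    if ! command -v mvn >/dev/null 2>&1; then
        echo "mvn not found on PATH" >&2
        return 1
    fi
    # The version banner starts with "Apache Maven <version>".
    found="$(mvn -version 2>/dev/null | awk '/^Apache Maven/ {print $3; exit}')"
    if [ -z "$found" ]; then
        echo "could not read the Maven version" >&2
        return 1
    fi
    if version_ge "$found" "$required"; then
        echo "Maven $found is OK"
    else
        echo "Maven $found is older than $required" >&2
        return 1
    fi
}
```

With these functions loaded, step 5 could be run as check_maven && mvn -DskipTests clean install, so the build only starts when the Maven version looks right.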


Starting Mahout's Spark shell

1) Go to the directory where you unpacked Spark and type sbin/start-all.sh to start Spark locally

2) Open a browser and point it to http://localhost:8080/ to check whether Spark started successfully. Copy the URL of the Spark master at the top of the page (it starts with spark://)

3) Define the following environment variables:

export MAHOUT_HOME=[directory into which you checked out Mahout]

export SPARK_HOME=[directory where you unpacked Spark]

export MASTER=[url of the Spark master]

4) Finally, change to the directory where you checked out Mahout and type bin/mahout spark-shell. You should see the shell start and get the prompt mahout>
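If the shell does not start, the three variables from step 3 are the first thing to check. The sketch below shows one way to do that. The function name check_mahout_env is my own (hypothetical), and the spark:// test simply follows the note in step 2 that the master URL starts with spark://.

```shell
#!/bin/sh
# Sketch only: check_mahout_env is an illustrative helper name,
# not a command shipped with Mahout or Spark.

# Succeeds when MAHOUT_HOME, SPARK_HOME and MASTER look usable.
check_mahout_env() {
    status=0

    # All three variables from step 3 must be set and non-empty.
    for name in MAHOUT_HOME SPARK_HOME MASTER; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            status=1
        fi
    done

    # The two home variables must point at existing directories.
    for name in MAHOUT_HOME SPARK_HOME; do
        eval "value=\${$name:-}"
        if [ -n "$value" ] && [ ! -d "$value" ]; then
            echo "$name does not point to a directory: $value" >&2
            status=1
        fi
    done

    # The master URL copied from the Spark web page starts with spark://
    case "${MASTER:-}" in
        spark://*) ;;
        "") ;;  # already reported above as not set
        *)
            echo "MASTER should start with spark://, got: $MASTER" >&2
            status=1
            ;;
    esac

    return "$status"
}
```

With this function loaded, check_mahout_env && "$MAHOUT_HOME"/bin/mahout spark-shell starts the shell only when the three variables pass these basic checks.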



In the next post I will discuss the errors you may run into while installing Mahout, along with their solutions.

Next : Resolved issues - Installing Mahout 0.11.0 with Spark 1.4.1
