Santosh Srinivas

on 07 Nov 2016, tagged onApache Spark, Analytics, Data Minin

I've finally got to a long pending to-do-item to play with Apache Spark.

The following installation steps worked for me on Ubuntu 16.04.

The below options worked for me:

Download Apache Spark

  • Unzip and move Spark
cd ~/Downloads/
tar xzvf spark-2.0.1-bin-hadoop2.7.tgz
mv spark-2.0.1-bin-hadoop2.7/ spark
sudo mv spark/ /usr/lib/
  • Install SBT

As mentioned at sbt - Download

echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2EE0EA64E40A89B84B2DF73499E82A75642AC823
sudo apt-get update
sudo apt-get install sbt
  • Make sure Java is installed

If not, install java

sudo apt-add-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
  • Configure Spark
cd /usr/lib/spark/conf/
cp spark-env.sh.template spark-env.sh
vi spark-env.sh

Add the following lines

JAVA_HOME=/usr/lib/jvm/java-8-oracle
SPARK_WORKER_MEMORY=4g
  • Configure IPv6

Basically, disable IPv6 using sudo vi /etc/sysctl.conf and add below lines

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
  • Configure .bashrc

I modified .bashrc in Sublime Text using subl ~/.bashrc and added the following lines

export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export SBT_HOME=/usr/share/sbt-launcher-packaging/bin/sbt-launch.jar
export SPARK_HOME=/usr/lib/spark
export PATH=$PATH:$JAVA_HOME/bin
export PATH=$PATH:$SBT_HOME/bin:$SPARK_HOME/bin:$SPARK_HOME/sbin
  • Configure fish (Optional - But I love the fish shell)

Modify config.fish using subl ~/.config/fish/config.fish and add the following lines

#Credit: http://fishshell.com/docs/current/tutorial.html#tut_startup

set -x PATH $PATH /usr/lib/spark
set -x PATH $PATH /usr/lib/spark/bin
set -x PATH $PATH /usr/lib/spark/sbin
  • Test Spark (Should work both in fish and bash)

Run pyspark (this is available in /usr/lib/spark/bin/) and test out.

For example ....

>>> a = 5
>>> b = 3
>>> a+b
8
>>> print(“Welcome to Spark”)
Welcome to Spark
## type Ctrl-d to exit

Try also, the built in run-example using run-example org.apache.spark.examples.SparkPi

That's it! You are ready to rock on using Apache Spark!

Next, I plan to checkout analysis using R as mentioned inhttp://www.milanor.net/blog/wp-content/uploads/2016/11/interactiveDataAnalysiswithSparkR_v5.pdf

Installing Apache Spark on Ubuntu 16.04的更多相关文章

  1. Install and Configure Apache Kafka on Ubuntu 16.04

    https://devops.profitbricks.com/tutorials/install-and-configure-apache-kafka-on-ubuntu-1604-1/ by hi ...

  2. Install LAMP Stack On Ubuntu 16.04

    原文:http://www.unixmen.com/how-to-install-lamp-stack-on-ubuntu-16-04/ LAMP is a combination of operat ...

  3. Ubuntu 16.04 LAMP server tutorial with Apache 2.4, PHP 7 and MariaDB (instead of MySQL)

    https://www.howtoforge.com/tutorial/install-apache-with-php-and-mysql-on-ubuntu-16-04-lamp/ This tut ...

  4. digitalocean --- How To Install Apache Tomcat 8 on Ubuntu 16.04

    https://www.digitalocean.com/community/tutorials/how-to-install-apache-tomcat-8-on-ubuntu-16-04 Intr ...

  5. 安装Hadoop及Spark(Ubuntu 16.04)

    安装Hadoop及Spark(Ubuntu 16.04) 安装JDK 下载jdk(以jdk-8u91-linux-x64.tar.gz为例) 新建文件夹 sudo mkdir /usr/lib/jvm ...

  6. 解决Ubuntu 16.04 上Android Studio2.3上面运行APP时提示DELETE_FAILED_INTERNAL_ERROR Error while Installing APKs的问题

    本人工作环境:Ubuntu 16.04 LTS + Android Studio 2.3 AVD启动之后,运行APP,报错提示: DELETE_FAILED_INTERNAL_ERROR Error ...

  7. Installing Moses on Ubuntu 16.04

    Installing Moses on Ubuntu 16.04 The process of installation To install requirements sudo apt-get in ...

  8. Installing Hyperledger Fabric v1.1 on Ubuntu 16.04 — Part I

    There is an entire library of Blockchain APIs which you can select according to the needs that suffi ...

  9. 如何在Ubuntu 16.04上安装Apache Web服务器

    转载自:https://www.howtoing.com/how-to-install-the-apache-web-server-on-ubuntu-16-04 介绍 Apache HTTP服务器是 ...

随机推荐

  1. 【BZOJ】2395: [Balkan 2011]Timeismoney

    题解 最小乘积生成树! 我们把,x的总和和y的总和作为x坐标和y左边,画在坐标系上 我们选择两个初始点,一个是最靠近y轴的A,也就是x总和最小,一个是最靠近x轴的B,也就是y总和最小 连接两条直线,在 ...

  2. Data时间格式化

    //时间戳转时间 function timeStamp2String(time) { var datetime = new Date(); datetime.setTime(time); var ye ...

  3. iOS 9应用开发教程之编辑界面与编写代码

    iOS 9应用开发教程之编辑界面与编写代码 编辑界面 在1.2.2小节中提到过编辑界面(Interface builder),编辑界面是用来设计用户界面的,单击打开Main.storyboard文件就 ...

  4. 【Memory】chrome调试面板

    本篇文章以chrome版本67.0.3396.99为例,介绍如何使用Chrome和DevTools查找影响页面性能的内存问题,包括内存泄漏.内存膨胀和频繁的垃圾回收. 一.参考链接 https://d ...

  5. Opencv学习笔记5:Opencv处理彩虹图、铜色图、灰度反转图

    一.概述: 人类能够观察到的光的波长范围是有限的,并且人类视觉有一个特点,只能分辨出二十几种灰度,也就是说即使采集到的灰度图像分辨率超级高,有上百个灰度级,但是很遗憾,人们只能看出二十几个,也就是说信 ...

  6. LOJ P3959 宝藏 状压dp noip

    https://www.luogu.org/problemnew/show/P3959 考场上我怎么想不出来这么写的,状压白学了. 直接按层次存因为如果某个点在前面存过了则肯定结果更优所以不用在意各点 ...

  7. 【贪心】Google Code Jam Round 1A 2018 Waffle Choppers

    题意:给你一个矩阵,有些点是黑的,让你横切h刀,纵切v刀,问你是否能让切出的所有子矩阵的黑点数量相等. 设黑点总数为sum,sum必须能整除(h+1),进而sum/(h+1)必须能整除(v+1). 先 ...

  8. bzoj 3996 最小割

    公式推出来后想了半天没思路,居然A是01矩阵..... 如果一个问题是求最值,并那么尝试先将所有可能收益加起来,然后矛盾部分能否用最小割表达(本题有两个矛盾,第一个是选还是不选,第二个是i,j有一个不 ...

  9. Alpha冲刺(1/10)——追光的人

    1.队友信息 队员学号 队员博客 221600219 小墨 https://www.cnblogs.com/hengyumo/ 221600240 真·大能猫 https://www.cnblogs. ...

  10. php 可以动态的new一个变量类名

    <?PHPheader("content-type:text/html; charset=utf-8");//echo ucfirst('a b'); class Stude ...