It is very easy to install a Spark cluster (Standalone mode). In my example, I used three machines.

All machines run a OS of ubuntu 12.04 32bit. One machine is named "master", the other two are

named "node01" and "node02" respectively. The name of a machine can be set in:  /etc/hostname.

Further more, every nodes (machines) should the same user name.

1. On every node: Install Java and set Java environment in ~/.bashrc as:

  #set java environment

  export JAVA_HOME=/usr/local/jdk1.7.0_67

  export JRE_HOME=$JAVA_HOME/jre

  export PATH=$JAVA_HOME/bin:$PATH

  export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib

Note that in my example, I used Java jdk1.7.0_67 and put it under /usr/local.

2. On every node: Install Scala and set corresponding environment variables in ~/.bashrc as:

export SCALA_HOME=/usr/local/scala-2.10.4

export PATH=$SCALA_HOME/bin:$PATH

Note that in my example, I used Scala scala-2.10.4 and put it under /usr/local.

3. On every node: Install Spark.

Download any version of Spark from http://spark.apache.org/downloads.html , in my example, I

chose spark-1.1.0-bin-hadoop2.4.tgz and extract it to /usr/local.

    Set in ~/.bashrc:

export SPARK_HOME=/usr/local/spark-1.1.0-bin-hadoop2.4

4. Set up ssh such that every two nodes in the cluster can ssh each other without password. This step

is also needed when you set up a hadoop cluster, there are abundant tutorials on the Internet, so

the details is omitted here.

5. On every node:

  $ sudo vim /etc/hosts

and set the IP address of the nodes in the network. For example, I set the hosts file on every node to:

  127.0.0.1        localhost

  223.3.86.xxx  master

  223.3.81.xxx  node01

  223.3.70.xxx  node02

6. On master node: Enter the root folder of Spark, and edit con/slaves. In my example:

  $ cd /usr/local/spark-1.1.0-bin-hadoop2.4

  $ sudo vim conf/slaves

Edit slaves file to:

  master

  node01

  node02

7. On master node: Enter the root folder of Spark and start spark cluster.

  $ cd /usr/local/spark-1.1.0-bin-hadoop2.4

  $ sbin/start-all.sh

8. Open http://master:8080/ using your web browser to monitoring the cluster.

9. Run Spark examples:

Locally:

$ MASTER=local[4] $SPARK_HOME/bin/run-example SparkLR

On cluster:

$ MASTER=spark://master:7077 $SPARK_HOME/bin/run-example SparkLR

For any questions, feel free to contact me.  Email: wuzimian2006@163.com  QQ: 726590906

随机推荐

  1. UDP异步通信

    先看效果图 Server: using System; using System.Collections.Generic; using System.Text; using System.Net; u ...

  2. PLSQL_性能优化系列17_Oracle Merge Into和Update更新效率

    2015-05-21 Created By BaoXinjian 一.摘要 以前只考虑 merge into 只是在特定场合下方便才使用的,今天才发现,merge into 竟然会比 update 在 ...

  3. AppCan认为,移动APP开发不是技术活

    很多粉丝反应,AppCan的文章太专业了,技术大大们毫不费劲,小白看的晕乎乎. 时代变了,5年前,AppCan的受众只有开发者.现在,政府高管.集团董事长.非技术类管理者.中小企业主.各行各业的管理者 ...

  4. Ajax发送FormData对象封装的表单数据

    前端页面: <!doctype html> <html lang="en"> <head> <meta charset="UTF ...

  5. Excel加载期间出现问题 解决方案

         今天在处理Excle表格的时候出现了如图所示的问题,资料比较重要,需要进行恢复:       出现问题的原因就是在制作的时候,产生了某些临时的htm文件,但是只保留了excel,将那些临时文 ...

  6. Shell Script(1)----variable compare

    PS:在学习python的时间里,抽空复习自己学习的Linux下的shell脚本知识点 1.数据类型 学习一门语言,比较关心其数据的表示方式,以及数据的类型,这里首先看一个bash shell的脚本 ...

  7. (简单) POJ 3984 迷宫问题,BFS。

    Description 定义一个二维数组: int maze[5][5] = { 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, ...

  8. Javascript Sting(字符串)对象

    一, 如何计算字符串的长度? 可以通过.length属性来计算 <script type="text/javascript"> var txt="Hello ...

  9. react使用proxy代理配置

    proxy,默认为NULL,类型为URL,一个为了发送http请求的代理 在package.json文件中使用proxy配置可以解决跨域问题 使用注意事项: create-react-app脚手架低于 ...

  10. GLSL传递数组

    static const char *microshaderFragSource = { "varying vec4 color;\n" "uniform bool te ...