Spark Standalone Mode Configuration

For currently popular distributed framework Spark, here it shows the intro and steps to configure the spark standalone mode on several machines.

It is easy to configure it from stratch. The following instruction I note down is based on the spark-2.0.2-bin-hadoop2.7 as example on the linux debian machines for scala programming.

Assume you have two machines with IP: 192.168.0.51 and 192.168.0.52

1. Preinstall java, scala, sbt

check: https://www.scala-lang.org/download/install.html

http://www.scala-sbt.org/0.13/docs/Installing-sbt-on-Linux.html

2. Download prebuilt spark version with hadoop. or you can compile on your own

the link can be referenced: https://spark.apache.org/downloads.html

3. Unzip the file and create the link for easy visit later

e.g. execute: ln -s /usr/local/spark-2.0.2-bin-hadoop2.7 /usr/local/spark

4. Configure the spark environments:

(1) configure slaves file: /usr/local/spark-2.0.2-bin-hadoop2.7/conf/slaves

# A Spark Worker will be started on each of the machines listed below.

192.168.0.51

192.168.0.52

(2) configure spar_env.sh. e.g.

#spark-env.sh

export SCALA_HOME=/usr/local/scala

export JAVA_HOME=/home/local/jdk

#export SPARK_LOCAL_IP=localhost

export SPARK_EXECUTOR_MEMORY=6g

export SPARK_EXECUTOR_CORES=6

export SPARK_MASTER_IP=192.168.0.51

export SPARK_MASTER_PORT=8070

export SPARK_MASTER_WEBUI_PORT=8080

#export SPARK_WORKER_INSTANCES=1

export SPARK_WORKER_PORT=8092

#export SPARK_WORKER_MEMORY=4g

#export SPARK_WORKER_CORES=4

5. Set up passwordless ssh access key

(1) Generate ssh key without password

$ ssh-keygen -t rsa -P ""

(2) Copy id_rsa.pub to authorized-keys

$  cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

(3) Start ssh localhost if you want to work in only one localhost machine for spark standalone

$ ssh localhost

6. Start spark

$SPARK_HOME/sbin/start-all.sh

execute jps to check worker and master have been up

7. Write and run your application

execute: sbt package

execute: $SPARK_HOME/bin/spark-submit \

　　　　--class "main.scala.MainAppTest" \

　　　　--master local[4] \

　　　　　xxxxxxxx.jar

Spark Standalone Mode Configuration的更多相关文章

（二）win7下用Intelij IDEA 远程调试spark standalone 集群
关于这个spark的环境搭建了好久,踩了一堆坑,今天环境: WIN7笔记本 spark 集群(4个虚拟机搭建的) Intelij IDEA15 scala-2.10.4 java-1.7.0 版本 ...
【原】Spark Standalone模式
Spark Standalone模式安装Spark Standalone集群手动启动集群集群创建脚本提交应用到集群创建Spark应用资源调度及分配监控与日志与Hadoop共存配置网络 ...
Spark standalone模式的安装（spark-1.6.1-bin-hadoop2.6.tgz）（master、slave1和slave2）
前期博客 Spark运行模式概述 Spark standalone简介与运行wordcount(master.slave1和slave2) 开篇要明白 (1)spark-env.sh 是环境变量配 ...
spark standalone ha spark submit
when you build a spark standalone ha cluster, when you submit your app, you should send it to the l ...
Spark standalone HA
配置Spark standalone HA 主机:node1,node2,node3 master: node1,node2 slave:node2,node3 修改配置文件: node1,node3 ...
spark standalone zookeeper HA部署方式
虽然spark master挂掉的几率很低,不过还是被我遇到了一次.以前在spark standalone的文章中也介绍过standalone的ha,现在详细说下部署流程,其实也比较简单. 一.机器 ...
Windows下IntelliJ IDEA中运行Spark Standalone
ZHUAN http://www.cnblogs.com/one--way/archive/2016/08/29/5818989.html http://www.cnblogs.com/one--wa ...
Spark standalone安装（最小化集群部署）
Spark standalone安装-最小化集群部署(Spark官方建议使用Standalone模式) 集群规划: 主机 IP ...
Spark Standalone模式应用程序开发
作者:过往记忆 | 新浪微博:左手牵右手TEL | 能够转载, 但必须以超链接形式标明文章原始出处和作者信息及版权声明博客地址:http://www.iteblog.com/文章标题:<Spar ...

随机推荐

强制解包看 Swift 的设计
不知道大家有没有发现,在一个 Objective-C 和 Swift 混编的 App 中,当把一个 OC 中的参数转到 Swift 时,Swift 会自动把这个变量进行强制解包.举个例子,我在 OC ...
在服务器上用Fiddler抓取HTTPS流量
转自:http://yoursunny.com/t/2011/FiddlerHTTPS/在服务器上用Fiddler抓取HTTPS流量阳光男孩发表于2011-03-19 开发互联网应用的过程中,常常 ...
【Netty】Netty之ByteBuf
一.前言前面已经学习了Netty中传输部分,现在接着学习Netty中的ByteBuf. 二.ByteBuf 2.1 ByteBuf API 在网络上传输的数据形式为Byte,Java NIO提供了B ...
HBuilder 安装使用教程
前段时间朋友让我帮忙打包一个 IPA 文件(使用 HTML5 开发的 Web 应用),了解到 HBuilder 这款 H5 开发神器.之前一直使用 WebStorm 开发 H5,闲来无事也学习下 HB ...
python学习之爬虫(一) ——————爬取网易云歌词
接触python也有一段时间了,一提到python,可能大部分pythoner都会想到爬虫,没错,今天我们的话题就是爬虫!作为一个小学生,关于爬虫其实本人也只是略懂,怀着"Done is b ...
Windows、Office系列产品精华部分集锦
提示有了这个帖子麻麻再也不用担心我因为四处找Microsoft家的软件和系统而四处劳累所烦恼了! 首先,你们最爱的老XP同志,XP同志虽然退休了,但是依然坚持在岗位上,向他致敬!! Windows ...
解密Lazy<T>
1.Lazy<T>的使用无意间看到一段代码,在创建对象的时候使用了Lazy,顾名思义Lazy肯定是延迟加载,那么它具体是如何创建对象,什么时候创建对象了? 先看这段示列代码: publi ...
sql备份文件兼容性问题
第一步: 右键需要备份的数据库,选择"属性"
Core ML 机器学习
在WWDC 2017开发者大会上,苹果宣布了一系列新的面向开发者的机器学习 API,包括面部识别的视觉 API.自然语言处理 API,这些 API 集成了苹果所谓的 Core ML 框架.Core M ...
利用document的readyState去实现页面加载中的效果
打开新的网页时,为了增强友好性体验,告知用户网页正在加载数据需要呈现一个"页面加载中"之类的提示,只需要利用document就可以实现. 实现示例代码如下: <style&g ...

Spark Standalone Mode Configuration

Spark Standalone Mode Configuration的更多相关文章

随机推荐

热门专题