The Advanced Scala Path: Deploying a Spark Standalone Cluster

                                       Author: Yin Zhengjie

Copyright notice: This is an original work. Reproduction is prohibited; violations will be pursued under the law.

  Hadoop addresses both big-data storage and computation: storage with the HDFS distributed file system, and computation with the MapReduce framework. If you have worked with MapReduce, and especially with Hive (whose queries still compile down to MapReduce jobs), you have probably noticed how slowly MapReduce runs. That is why so many people are moving to Spark, and in this post we will deploy a Spark cluster together.

I. Preparing the Environment

  If you have not yet deployed a Hadoop cluster on your servers, see my earlier notes on deploying highly available Hadoop: https://www.cnblogs.com/yinzhengjie/p/9154265.html

1>. Start the HDFS distributed file system

[yinzhengjie@s101 download]$ more `which xzk.sh`
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

#Check whether the user passed an argument
if [ $# -ne 1 ];then
    echo "Invalid argument. Usage: $0 {start|stop|restart|status}"
    exit
fi

#Grab the command the user typed
cmd=$1

#Dispatch function
function zookeeperManger(){
    case $cmd in
    start)
        echo "Starting service"
        remoteExecution start
        ;;
    stop)
        echo "Stopping service"
        remoteExecution stop
        ;;
    restart)
        echo "Restarting service"
        remoteExecution restart
        ;;
    status)
        echo "Checking status"
        remoteExecution status
        ;;
    *)
        echo "Invalid argument. Usage: $0 {start|stop|restart|status}"
        ;;
    esac
}

#Function that runs zkServer.sh on each ZooKeeper node (s102-s104)
function remoteExecution(){
    for (( i=102 ; i<=104 ; i++ )) ; do
        tput setaf 2
        echo ========== s$i zkServer.sh $1 ================
        tput setaf 7
        ssh s$i "source /etc/profile ; zkServer.sh $1"
    done
}

#Invoke the dispatch function
zookeeperManger
[yinzhengjie@s101 download]$


[yinzhengjie@s101 download]$ more `which xcall.sh`
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

#Check whether the user passed an argument
if [ $# -lt 1 ];then
    echo "Please pass an argument"
    exit
fi

#Grab the command the user typed
cmd=$@

for (( i=101;i<=105;i++ ))
do
    #Turn the terminal output green
    tput setaf 2
    echo ============= s$i $cmd ============
    #Restore the terminal to its original white-gray color
    tput setaf 7
    #Run the command on the remote host
    ssh s$i $cmd
    #Check whether the command succeeded
    if [ $? == 0 ];then
        echo "Command executed successfully"
    fi
done
[yinzhengjie@s101 download]$


[yinzhengjie@s101 download]$ more `which xrsync.sh`
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

#Check whether the user passed an argument
if [ $# -lt 1 ];then
    echo "Please pass an argument";
    exit
fi

#Get the file path
file=$@

#Get the base name
filename=`basename $file`

#Get the parent directory
dirpath=`dirname $file`

#Get the absolute path
cd $dirpath
fullpath=`pwd -P`

#Sync the file to the other nodes (s102-s105)
for (( i=102;i<=105;i++ ))
do
    #Turn the terminal output green
    tput setaf 2
    #Note: "%file" is a literal typo in the original script ("$file" was likely intended),
    #which is why the banners in the output below print "%file" verbatim.
    echo =========== s$i %file ===========
    #Restore the terminal to its original white-gray color
    tput setaf 7
    #Push the file with rsync
    rsync -lr $filename `whoami`@s$i:$fullpath
    #Check whether the command succeeded
    if [ $? == 0 ];then
        echo "Command executed successfully"
    fi
done
[yinzhengjie@s101 download]$


[yinzhengjie@s101 download]$ xzk.sh start
Starting service
========== s102 zkServer.sh start ================
ZooKeeper JMX enabled by default
Using config: /soft/zk/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
========== s103 zkServer.sh start ================
ZooKeeper JMX enabled by default
Using config: /soft/zk/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
========== s104 zkServer.sh start ================
ZooKeeper JMX enabled by default
Using config: /soft/zk/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[yinzhengjie@s101 download]$
[yinzhengjie@s101 download]$ xcall.sh jps
============= s101 jps ============
Jps
Command executed successfully
============= s102 jps ============
Jps
QuorumPeerMain
Command executed successfully
============= s103 jps ============
QuorumPeerMain
Jps
Command executed successfully
============= s104 jps ============
Jps
QuorumPeerMain
Command executed successfully
============= s105 jps ============
Jps
Command executed successfully
[yinzhengjie@s101 download]$


[yinzhengjie@s101 download]$ start-dfs.sh
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7./share/hadoop/common/lib/slf4j-log4j12-1.7..jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.-bin/lib/log4j-slf4j-impl-2.4..jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Starting namenodes on [s101 s105]
s101: starting namenode, logging to /soft/hadoop-2.7./logs/hadoop-yinzhengjie-namenode-s101.out
s105: starting namenode, logging to /soft/hadoop-2.7./logs/hadoop-yinzhengjie-namenode-s105.out
s103: starting datanode, logging to /soft/hadoop-2.7./logs/hadoop-yinzhengjie-datanode-s103.out
s104: starting datanode, logging to /soft/hadoop-2.7./logs/hadoop-yinzhengjie-datanode-s104.out
s102: starting datanode, logging to /soft/hadoop-2.7./logs/hadoop-yinzhengjie-datanode-s102.out
Starting journal nodes [s102 s103 s104]
s104: starting journalnode, logging to /soft/hadoop-2.7./logs/hadoop-yinzhengjie-journalnode-s104.out
s102: starting journalnode, logging to /soft/hadoop-2.7./logs/hadoop-yinzhengjie-journalnode-s102.out
s103: starting journalnode, logging to /soft/hadoop-2.7./logs/hadoop-yinzhengjie-journalnode-s103.out
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7./share/hadoop/common/lib/slf4j-log4j12-1.7..jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.-bin/lib/log4j-slf4j-impl-2.4..jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Starting ZK Failover Controllers on NN hosts [s101 s105]
s101: starting zkfc, logging to /soft/hadoop-2.7./logs/hadoop-yinzhengjie-zkfc-s101.out
s105: starting zkfc, logging to /soft/hadoop-2.7./logs/hadoop-yinzhengjie-zkfc-s105.out
[yinzhengjie@s101 download]$
[yinzhengjie@s101 download]$ xcall.sh jps
============= s101 jps ============
Jps
NameNode
DFSZKFailoverController
Command executed successfully
============= s102 jps ============
QuorumPeerMain
JournalNode
Jps
DataNode
Command executed successfully
============= s103 jps ============
JournalNode
QuorumPeerMain
Jps
DataNode
Command executed successfully
============= s104 jps ============
QuorumPeerMain
Jps
DataNode
JournalNode
Command executed successfully
============= s105 jps ============
DFSZKFailoverController
NameNode
Jps
Command executed successfully
[yinzhengjie@s101 download]$


2>. Upload test data to the HDFS cluster

[yinzhengjie@s101 download]$ cat temp
++023450FM-+000599999V0202701N015919999999N0000001N9-+99999102001ADDGF108991999999999999999999
++023450FM-+000599999V0202901N008219999999N0000001N9++99999102001ADDGF104991999999999999999999
++023450FM-+000599999V0209991C000019999999N0000001N9-+99999102001ADDGF108991999999999999999999
++023450FM-+000599999V0201801N008219999999N0000001N9-+99999101831ADDGF108991999999999999999999
++023450FM-+000599999V0201801N009819999999N0000001N9-+99999101761ADDGF108991999999999999999999
++023450FM-+000599999V0201801N009819999999N0000001N9++99999101751ADDGF108991999999999999999999
++023450FM-+000599999V0202001N009819999999N0000001N9-+99999101701ADDGF106991999999999999999999
++023450FM-+000599999V0202701N015919999999N0000001N9-+99999102001ADDGF108991999999999999999999
++023450FM-+000599999V0202901N008219999999N0000001N9-+99999102001ADDGF104991999999999999999999
++023450FM-+000599999V0209991C000019999999N0000001N9++99999102001ADDGF108991999999999999999999
++023450FM-+000599999V0201801N008219999999N0000001N9-+99999101831ADDGF108991999999999999999999
++023450FM-+000599999V0201801N009819999999N0000001N9++99999101761ADDGF108991999999999999999999
++023450FM-+000599999V0201801N009819999999N0000001N9++99999101751ADDGF108991999999999999999999
++023450FM-+000599999V0202001N009819999999N0000001N9++99999101701ADDGF1069919999999999999999990029029070999991901010106004++023450FM-+000599999V0202701N015919999999N0000001N9-+99999102001ADDGF108991999999999999999999
++023450FM-+000599999V0202901N008219999999N0000001N9-+99999102001ADDGF104991999999999999999999
++023450FM-+000599999V0209991C000019999999N0000001N9++99999102001ADDGF108991999999999999999999
++023450FM-+000599999V0201801N008219999999N0000001N9++99999101831ADDGF108991999999999999999999
++023450FM-+000599999V0201801N009819999999N0000001N9++99999101761ADDGF108991999999999999999999
++023450FM-+000599999V0201801N009819999999N0000001N9-+99999101751ADDGF108991999999999999999999
++023450FM-+000599999V0202001N009819999999N0000001N9++99999101701ADDGF1069919999999999999999990029029070999991901010106004++023450FM-+000599999V0202701N015919999999N0000001N9-+99999102001ADDGF108991999999999999999999
++023450FM-+000599999V0202701N015919999999N0000001N9-+99999102001ADDGF108991999999999999999999
++023450FM-+000599999V0202901N008219999999N0000001N9++99999102001ADDGF104991999999999999999999
++023450FM-+000599999V0209991C000019999999N0000001N9-+99999102001ADDGF108991999999999999999999
++023450FM-+000599999V0201801N008219999999N0000001N9++99999101831ADDGF108991999999999999999999
++023450FM-+000599999V0201801N009819999999N0000001N9++99999101761ADDGF108991999999999999999999
++023450FM-+000599999V0201801N009819999999N0000001N9++99999101751ADDGF108991999999999999999999
++023450FM-+000599999V0202001N009819999999N0000001N9-+99999101701ADDGF106991999999999999999999
++023450FM-+000599999V0202701N015919999999N0000001N9++99999102001ADDGF108991999999999999999999
++023450FM-+000599999V0202901N008219999999N0000001N9-+99999102001ADDGF104991999999999999999999
++023450FM-+000599999V0209991C000019999999N0000001N9++99999102001ADDGF108991999999999999999999
++023450FM-+000599999V0201801N008219999999N0000001N9++99999101831ADDGF108991999999999999999999
++023450FM-+000599999V0201801N009819999999N0000001N9-+99999101761ADDGF108991999999999999999999
++023450FM-+000599999V0201801N009819999999N0000001N9++99999101751ADDGF108991999999999999999999
++023450FM-+000599999V0202001N009819999999N0000001N9++99999101701ADDGF1069919999999999999999990029029070999991901010106004++023450FM-+000599999V0202701N015919999999N0000001N9-+99999102001ADDGF108991999999999999999999
++023450FM-+000599999V0202901N008219999999N0000001N9++99999102001ADDGF104991999999999999999999
++023450FM-+000599999V0209991C000019999999N0000001N9-+99999102001ADDGF108991999999999999999999
++023450FM-+000599999V0201801N008219999999N0000001N9++99999101831ADDGF108991999999999999999999
++023450FM-+000599999V0201801N009819999999N0000001N9-+99999101761ADDGF108991999999999999999999
++023450FM-+000599999V0201801N009819999999N0000001N9++99999101751ADDGF108991999999999999999999
++023450FM-+000599999V0202001N009819999999N0000001N9++99999101701ADDGF1069919999999999999999990029029070999991901010106004++023450FM-+000599999V0202701N015919999999N0000001N9-+99999102001ADDGF108991999999999999999999
[yinzhengjie@s101 download]$


[yinzhengjie@s101 download]$ hdfs dfs -mkdir -p  /home/yinzhengjie/data
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7./share/hadoop/common/lib/slf4j-log4j12-1.7..jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.-bin/lib/log4j-slf4j-impl-2.4..jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[yinzhengjie@s101 download]$
[yinzhengjie@s101 download]$ hdfs dfs -ls -R /
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7./share/hadoop/common/lib/slf4j-log4j12-1.7..jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.-bin/lib/log4j-slf4j-impl-2.4..jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
drwxr-xr-x - yinzhengjie supergroup -- : /home
drwxr-xr-x - yinzhengjie supergroup -- : /home/yinzhengjie
drwxr-xr-x - yinzhengjie supergroup -- : /home/yinzhengjie/data
[yinzhengjie@s101 download]$


[yinzhengjie@s101 download]$ hdfs dfs -put temp /home/yinzhengjie/data
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7./share/hadoop/common/lib/slf4j-log4j12-1.7..jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.-bin/lib/log4j-slf4j-impl-2.4..jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[yinzhengjie@s101 download]$
[yinzhengjie@s101 download]$
[yinzhengjie@s101 download]$ hdfs dfs -ls -R /
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7./share/hadoop/common/lib/slf4j-log4j12-1.7..jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.-bin/lib/log4j-slf4j-impl-2.4..jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
drwxr-xr-x - yinzhengjie supergroup -- : /home
drwxr-xr-x - yinzhengjie supergroup -- : /home/yinzhengjie
drwxr-xr-x - yinzhengjie supergroup -- : /home/yinzhengjie/data
-rw-r--r-- yinzhengjie supergroup -- : /home/yinzhengjie/data/temp
[yinzhengjie@s101 download]$


 

II. Deploying the Spark Cluster

1>. Create the slaves file

[yinzhengjie@s101 ~]$ cp /soft/spark/conf/slaves.template /soft/spark/conf/slaves
[yinzhengjie@s101 ~]$ more /soft/spark/conf/slaves | grep -v ^# | grep -v ^$
s102
s103
s104
[yinzhengjie@s101 ~]$

2>. Edit the spark-env.sh configuration file

[yinzhengjie@s101 ~]$ cp /soft/spark/conf/spark-env.sh.template /soft/spark/conf/spark-env.sh
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ echo export JAVA_HOME=/soft/jdk >> /soft/spark/conf/spark-env.sh
[yinzhengjie@s101 ~]$ echo SPARK_MASTER_HOST=s101 >> /soft/spark/conf/spark-env.sh
[yinzhengjie@s101 ~]$ echo SPARK_MASTER_PORT=7077 >> /soft/spark/conf/spark-env.sh
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ grep -v ^# /soft/spark/conf/spark-env.sh | grep -v ^$
export JAVA_HOME=/soft/jdk
SPARK_MASTER_HOST=s101
SPARK_MASTER_PORT=7077
[yinzhengjie@s101 ~]$

3>. Distribute the Spark installation from s101 to the other nodes

[yinzhengjie@s101 ~]$ xrsync.sh /soft/spark
spark/ spark-2.1.0-bin-hadoop2.7/
[yinzhengjie@s101 ~]$ xrsync.sh /soft/spark
spark/ spark-2.1.0-bin-hadoop2.7/
[yinzhengjie@s101 ~]$ xrsync.sh /soft/spark/
=========== s102 %file ===========
Command executed successfully
=========== s103 %file ===========
Command executed successfully
=========== s104 %file ===========
Command executed successfully
=========== s105 %file ===========
Command executed successfully
[yinzhengjie@s101 ~]$ xrsync.sh /soft/spark-2.1.0-bin-hadoop2.7/
=========== s102 %file ===========
Command executed successfully
=========== s103 %file ===========
Command executed successfully
=========== s104 %file ===========
Command executed successfully
=========== s105 %file ===========
Command executed successfully
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ su root
Password:
[root@s101 yinzhengjie]#
[root@s101 yinzhengjie]# xrsync.sh /etc/profile
=========== s102 %file ===========
Command executed successfully
=========== s103 %file ===========
Command executed successfully
=========== s104 %file ===========
Command executed successfully
=========== s105 %file ===========
Command executed successfully
[root@s101 yinzhengjie]#
[root@s101 yinzhengjie]# exit
exit
[yinzhengjie@s101 ~]$

4>. Create core-site.xml and hdfs-site.xml symlinks in the conf/ directory on every Spark node

[yinzhengjie@s101 ~]$ xcall.sh "ln -s /soft/hadoop/etc/hadoop/core-site.xml /soft/spark/conf/core-site.xml"
============= s101 ln -s /soft/hadoop/etc/hadoop/core-site.xml /soft/spark/conf/core-site.xml ============
Command executed successfully
============= s102 ln -s /soft/hadoop/etc/hadoop/core-site.xml /soft/spark/conf/core-site.xml ============
Command executed successfully
============= s103 ln -s /soft/hadoop/etc/hadoop/core-site.xml /soft/spark/conf/core-site.xml ============
Command executed successfully
============= s104 ln -s /soft/hadoop/etc/hadoop/core-site.xml /soft/spark/conf/core-site.xml ============
Command executed successfully
============= s105 ln -s /soft/hadoop/etc/hadoop/core-site.xml /soft/spark/conf/core-site.xml ============
Command executed successfully
[yinzhengjie@s101 ~]$ xcall.sh "ln -s /soft/hadoop/etc/hadoop/hdfs-site.xml /soft/spark/conf/hdfs-site.xml"
============= s101 ln -s /soft/hadoop/etc/hadoop/hdfs-site.xml /soft/spark/conf/hdfs-site.xml ============
Command executed successfully
============= s102 ln -s /soft/hadoop/etc/hadoop/hdfs-site.xml /soft/spark/conf/hdfs-site.xml ============
Command executed successfully
============= s103 ln -s /soft/hadoop/etc/hadoop/hdfs-site.xml /soft/spark/conf/hdfs-site.xml ============
Command executed successfully
============= s104 ln -s /soft/hadoop/etc/hadoop/hdfs-site.xml /soft/spark/conf/hdfs-site.xml ============
Command executed successfully
============= s105 ln -s /soft/hadoop/etc/hadoop/hdfs-site.xml /soft/spark/conf/hdfs-site.xml ============
Command executed successfully
[yinzhengjie@s101 ~]$

5>. Start the Spark cluster

[yinzhengjie@s101 ~]$ /soft/spark/sbin/start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /soft/spark/logs/spark-yinzhengjie-org.apache.spark.deploy.master.Master--s101.out
s102: starting org.apache.spark.deploy.worker.Worker, logging to /soft/spark/logs/spark-yinzhengjie-org.apache.spark.deploy.worker.Worker--s102.out
s104: starting org.apache.spark.deploy.worker.Worker, logging to /soft/spark/logs/spark-yinzhengjie-org.apache.spark.deploy.worker.Worker--s104.out
s103: starting org.apache.spark.deploy.worker.Worker, logging to /soft/spark/logs/spark-yinzhengjie-org.apache.spark.deploy.worker.Worker--s103.out
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
NameNode
DFSZKFailoverController
Master
Jps
Command executed successfully
============= s102 jps ============
DataNode
QuorumPeerMain
Worker
JournalNode
Jps
Command executed successfully
============= s103 jps ============
Worker
Jps
JournalNode
QuorumPeerMain
DataNode
Command executed successfully
============= s104 jps ============
Worker
Jps
JournalNode
DataNode
QuorumPeerMain
Command executed successfully
============= s105 jps ============
DFSZKFailoverController
Jps
NameNode
Command executed successfully
[yinzhengjie@s101 ~]$

6>. Launch spark-shell and connect to the Spark cluster

[yinzhengjie@s101 ~]$ spark-shell --master spark://s101:7077
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
// :: WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
// :: WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/soft/spark-2.1.0-bin-hadoop2.7/jars/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/soft/spark/jars/datanucleus-api-jdo-3.2.6.jar."
// :: WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/soft/spark/jars/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/soft/spark-2.1.0-bin-hadoop2.7/jars/datanucleus-core-3.2.10.jar."
// :: WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/soft/spark-2.1.0-bin-hadoop2.7/jars/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/soft/spark/jars/datanucleus-rdbms-3.2.9.jar."
// :: ERROR ObjectStore: Version information found in metastore differs 2.1. from expected schema version 1.2.. Schema verififcation is disabled hive.metastore.schema.verification so setting version.
// :: WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
// :: WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Spark context Web UI available at http://172.30.100.101:4040
Spark context available as 'sc' (master = spark://s101:7077, app id = app-20180727051910-0000).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.0
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_131)
Type in expressions to have them evaluated.
Type :help for more information.

scala>

Note that 7077 is the default master port; it can be changed via SPARK_MASTER_PORT in spark-env.sh.
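As a quick sanity check that the shell really is attached to the standalone master, you can query sc.master from the REPL (a minimal example; the res number will vary with your session):

scala> sc.master
res0: String = spark://s101:7077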

7>. View the web UI

The standalone master serves its web UI on port 8080 by default (http://s101:8080), where you should see the three workers (s102-s104) registered.

8>. Implement WordCount on the Spark cluster

val rdd1 = sc.parallelize(Array[String]("hello world1" , "hello world2" , "hello world3"))
rdd1.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).collect.sortBy(t => -t._2)
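The same pattern works against the temp file we uploaded to HDFS in step I.2. Below is a minimal sketch; it assumes the core-site.xml/hdfs-site.xml symlinks from step II.4 are in place so the bare HDFS path resolves through the HA nameservice:

val lines = sc.textFile("/home/yinzhengjie/data/temp")                   // read the test data from HDFS
val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)  // count each token
counts.collect.sortBy(t => -t._2).take(10).foreach(println)              // print the 10 most frequent tokens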

III. Running Code on the Spark Cluster

1>. Create JAR from Modules

2>. Select the project to package

3>. Remove the third-party libraries

4>. Click Build Artifacts...

5>. Choose Build

6>. Inspect the built artifact

7>. Upload the built JAR to the server

8>. Submit the JAR to the cluster
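For reference, here is a minimal sketch of what the driver class packaged into such a JAR might look like; the object name, input path, and submit command are illustrative, not taken from the original project:

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical standalone driver: word count over a path passed as the first argument.
object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WordCount")  // master URL is supplied by spark-submit
    val sc = new SparkContext(conf)
    sc.textFile(args(0))                                // e.g. /home/yinzhengjie/data/temp
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .collect()
      .sortBy(-_._2)
      .foreach(println)
    sc.stop()
  }
}

It would then be submitted with something like: spark-submit --master spark://s101:7077 --class WordCount wordcount.jar /home/yinzhengjie/data/temp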
