大数据-spark HA集群搭建

一、安装scala

我们安装的是scala-2.11.8 5台机器全部安装

下载需要的安装包，放到特定的目录下/opt/workspace/并进行解压

1、解压缩

[root@master1 ~]# cd /opt/workspace

[root@master1 workspace]#tar -zxvf scala-2.11..tar.gz

2、配置环境变量 /etc/profile文件中添加spark配置

[root@master1 ~]# vi /etc/profile

# Scala Config

export SCALA_HOME=/opt/software/scala-2.11.8

export PATH=$SCALA_HOME/bin:$PATH

[root@master1 ~]# source /etc/profile

3、启动scala

[root@master1 workspace]# vim /etc/profile
[root@master1 workspace]# scala -version
-bash: /opt/workspace/scala-2.11.8/bin/scala: 权限不够
[root@master1 workspace]# chmod +x /opt/workspace/scala-2.11.8/bin/scala
[root@master1 workspace]# scala -version
Scala code runner version 2.11.8 -- Copyright 2002-2016, LAMP/EPFL
[root@master1 workspace]# scala
Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_181).
Type in expressions for evaluation. Or try :help.

scala>

二、安装spark

1、下载spark对应版本

因为后期需要安装Hive，并且会运行hive on spark模式，为避免jar冲突，我们去掉了spark中的hive部分。

我们应用的是spark-2.3.0-bin-hadoop2-without-hive.tgz 自己编译的版本

可参考https://blog.csdn.net/sinat_25943197/article/details/81906060进行编译

2、文件解压

[root@master1 workspace]# tar -zxvf spark-2.3.0-bin-hadoop2-without-hive.tgz

3、配置文件 spark-env.sh slaves、/etc/profile

/etc/profile文件中添加

# Spark Config

export SPARK_HOME=/opt/workspace/spark-2.3.-bin-hadoop2-without-hive
export PATH=.:${JAVA_HOME}/bin:${SCALA_HOME}/bin:${MAVEN_HOME}/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/bin:${SPARK_HOME}/bin:$SQOOP_HOME/bin:${ZK_HOME}/bin:$PATH

source /etc/profile

spark-env.sh.template重新命名为spark-env.sh文件、配置如下：

#    http://www.apache.org/licenses/LICENSE-2.0

#

# Unless required by applicable law or agreed to in writing, software

# distributed under the License is distributed on an "AS IS" BASIS,

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

# See the License for the specific language governing permissions and

# limitations under the License.

#

# This file is sourced when running various Spark programs.

# Options read when launching programs locally with

# ./bin/run-example or ./bin/spark-submit

# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files

# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node

# - SPARK_PUBLIC_DNS, to set the public dns name of the driver program

# Options read by executors and drivers running inside the cluster

# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node

# - SPARK_PUBLIC_DNS, to set the public DNS name of the driver program

# - SPARK_LOCAL_DIRS, storage directories to use on this node for shuffle and RDD data

# - MESOS_NATIVE_JAVA_LIBRARY, to point to your libmesos.so if you use Mesos

# Options read in YARN client/cluster mode

# - SPARK_CONF_DIR, Alternate conf dir. (Default: ${SPARK_HOME}/conf)

# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files

# - YARN_CONF_DIR, to point Spark towards YARN configuration files when you use YARN

# - SPARK_EXECUTOR_CORES, Number of cores for the executors (Default: ).

# - SPARK_EXECUTOR_MEMORY, Memory per Executor (e.g. 1000M, 2G) (Default: 1G)

# - SPARK_DRIVER_MEMORY, Memory for Driver (e.g. 1000M, 2G) (Default: 1G)

#export SPARK_MASTER_IP=master1

export SPARK_SSH_OPTS="-p 61333"

export SPARK_MASTER_PORT=

export SPARK_WORKER_INSTANCES=

export SCALA_HOME=/opt/workspace/scala-2.11.

export JAVA_HOME=/opt/workspace/jdk1.

export HADOOP_HOME=/opt/workspace/hadoop-2.9.

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export SPARK_HOME=/opt/workspace/spark-2.3.-bin-hadoop2-without-hive

export SPARK_CONF_DIR=$SPARK_HOME/conf

export SPARK_EXECUTOR_MEMORY=5120M

export SPARK_DIST_CLASSPATH=$(/opt/workspace/hadoop-2.9./bin/hadoop classpath)

export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=master1:2181,master2:2181,slave1:2181,slave2:2181,slave3:2181 -Dspark.deploy.zookeeper.dir=/spark"

# Options for the daemons used in the standalone deploy mode

# - SPARK_MASTER_HOST, to bind the master to a different IP address or hostname

# - SPARK_MASTER_PORT / SPARK_MASTER_WEBUI_PORT, to use non-default ports for the master

# - SPARK_MASTER_OPTS, to set config properties only for the master (e.g. "-Dx=y")

# - SPARK_WORKER_CORES, to set the number of cores to use on this machine

# - SPARK_WORKER_MEMORY, to set how much total memory workers have to give executors (e.g. 1000m, 2g)

# - SPARK_WORKER_PORT / SPARK_WORKER_WEBUI_PORT, to use non-default ports for the worker

# - SPARK_WORKER_DIR, to set the working directory of worker processes

# - SPARK_WORKER_OPTS, to set config properties only for the worker (e.g. "-Dx=y")

# - SPARK_DAEMON_MEMORY, to allocate to the master, worker and history server themselves (default: 1g).

# - SPARK_HISTORY_OPTS, to set config properties only for the history server (e.g. "-Dx=y")

# - SPARK_SHUFFLE_OPTS, to set config properties only for the external shuffle service (e.g. "-Dx=y")

# - SPARK_DAEMON_JAVA_OPTS, to set config properties for all daemons (e.g. "-Dx=y")

# - SPARK_DAEMON_CLASSPATH, to set the classpath for all daemons

# - SPARK_PUBLIC_DNS, to set the public dns name of the master or workers

# Generic options for the daemons used in the standalone deploy mode

# - SPARK_CONF_DIR      Alternate conf dir. (Default: ${SPARK_HOME}/conf)

# - SPARK_LOG_DIR       Where log files are stored.  (Default: ${SPARK_HOME}/logs)

# - SPARK_PID_DIR       Where the pid file is stored. (Default: /tmp)

# - SPARK_IDENT_STRING  A string representing this instance of spark. (Default: $USER)

# - SPARK_NICENESS      The scheduling priority for daemons. (Default: )

# - SPARK_NO_DAEMONIZE  Run the proposed command in the foreground. It will not output a PID file.

# Options for native BLAS, like Intel MKL, OpenBLAS, and so on.

# You might get better performance to enable these options if using native BLAS (see SPARK-).

# - MKL_NUM_THREADS=        Disable multi-threading of Intel MKL

# - OPENBLAS_NUM_THREADS=   Disable multi-threading of OpenBLAS

slaves.template文件重新命名为slaves、配置如下：

# Licensed to the Apache Software Foundation (ASF) under one or more

# contributor license agreements.  See the NOTICE file distributed with

# this work for additional information regarding copyright ownership.

# The ASF licenses this file to You under the Apache License, Version 2.0

# (the "License"); you may not use this file except in compliance with

# the License.  You may obtain a copy of the License at

#

#    http://www.apache.org/licenses/LICENSE-2.0

#

# Unless required by applicable law or agreed to in writing, software

# distributed under the License is distributed on an "AS IS" BASIS,

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

# See the License for the specific language governing permissions and

# limitations under the License.

#

# A Spark Worker will be started on each of the machines listed below.

slave1

slave2

slave3

4、启动spark

[root@master1 workspace]# ./spark-2.3.0-bin-hadoop2-without-hive/sbin/start-all.sh

报错：默认是22端口，进行ssh端口修改

解决：在spark-env.sh中增加端口

export SPARK_SSH_OPTS="-p 61333"

重新启动spark

启动成功

5、手动启动备用master

[root@master2 workspace]# ./spark-2.3.0-bin-hadoop2-without-hive/sbin/start-master.sh

参考：https://blog.csdn.net/sinat_25943197/article/details/81906060

大数据-spark HA集群搭建的更多相关文章

大数据-HBase HA集群搭建
1.下载对应版本的Hbase,在我们搭建的集群环境中选用的是hbase-1.4.6 将下载完成的hbase压缩包放到对应的目录下,此处我们的目录为/opt/workspace/ 2.对已经有的压缩包进 ...
大数据-hadoop HA集群搭建
一.安装hadoop.HA及配置journalnode 实现namenode HA 实现resourcemanager HA namenode节点之间通过journalnode同步元数据首先下载需要 ...
大数据学习——HADOOP集群搭建
4.1 HADOOP集群搭建 4.1.1集群简介 HADOOP集群具体来说包含两个集群:HDFS集群和YARN集群,两者逻辑上分离,但物理上常在一起 HDFS集群: 负责海量数据的存储,集群中的角色主 ...
大数据中Hadoop集群搭建与配置
前提环境是之前搭建的4台Linux虚拟机,详情参见 Linux集群搭建该环境对应4台服务器,192.168.1.60.61.62.63,其中60为主机,其余为从机软件版本选择: Java:JDK1 ...
大数据中HBase集群搭建与配置
hbase是分布式列式存储数据库,前提条件是需要搭建hadoop集群,需要Zookeeper集群提供znode锁机制,hadoop集群已经搭建,参考 Hadoop集群搭建 ,该文主要介绍Zookeep ...
大数据平台Hadoop集群搭建
一.概念 Hadoop是由java语言编写的,在分布式服务器集群上存储海量数据并运行分布式分析应用的开源框架,其核心部件是HDFS与MapReduce.HDFS是一个分布式文件系统,类似mogilef ...
大数据学习——Storm集群搭建
安装storm之前要安装zookeeper 一.安装storm步骤 1.下载安装包 2.解压安装包 .tar.gz storm 3.修改配置文件 mv /root/apps/storm/conf/st ...
大数据中Linux集群搭建与配置
因测试需要,一共安装4台linux系统,在windows上用vm搭建. 对应4个IP为192.168.1.60.61.62.63,这里记录其中一台的搭建过程,其余的可以直接复制虚拟机,并修改相关配置即 ...
大数据学习——hadoop集群搭建2.X
1.准备Linux环境 1.0先将虚拟机的网络模式选为NAT 1.1修改主机名 vi /etc/sysconfig/network NETWORKING=yes HOSTNAME=itcast ### ...

随机推荐

Java 8特性
1. Java8的新特性 1.1. Lambda表达式和函数式接口最简单的Lambda表达式可以用逗号分隔的参数列表.->符号和功能语句块来表示.示例如下: Arrays.asList( &q ...
lintcode-单例
单例是最为最常见的设计模式之一.对于任何时刻,如果某个类只存在且最多存在一个具体的实例,那么我们称这种设计模式为单例.例如,对于 class Mouse (不是动物的mouse哦),我们应将其设计为 ...
ant+proguard签名打包 .jar
ant+proguard签名打包 .jar 摘自: https://blog.csdn.net/a_ycmbc/article/details/53432812 2016年12月02日 14:52:3 ...
MVC各个层的作用
(1)控制器的作用是调用模型,并调用视图,将模型产生的数据传递给视图.并让相关视图去显示.(2)模型的作用是获取数据并处理数据.(3)视图的作用是将取得的数据进行组织.美化等,并最终向用户终端输出.
es-多文档简单查询（_mget）
1.多文档查询 (1)url:POST http://localhost:9200/_mget?pretty/ 参数: { "docs": [{ "_index" ...
Caused by: org.hibernate.HibernateException: Unable to build the default ValidatorFactory
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'testAction': ...
CodeForces 288B Polo the Penguin and Houses （暴力或都快速幂）
题意:给定 n 和k,n 表示有n个房子,然后每个有一个编号,一只鹅要从一个房间中开始走,下一站就是房间的编号,现在要你求出有多少种方法编号并满足下面的要求: 1.如果从1-k房间开始走,一定能直到 ...
CodeForces 474C Captain Marmot (数学，旋转，暴力)
题意:给定 4n * 2 个坐标,分成 n组,让你判断,点绕点的最少次数使得四个点是一个正方形的顶点. 析:那么就一个一个的判断,n 很小,不会超时,四个点分别从不转然后转一次,转两次...转四次,就 ...
delphi实现截全屏功能
procedure TForm1.Button10Click(Sender: TObject);var bmp: TBitmap; can: TCanvas; dc: HDC; Image1: TIm ...
mybatis 单表的增删改查
添加数据返回id mapper.xml mapper -> insert -> selectKey mybatis 内置别名

大数据-spark HA集群搭建

大数据-spark HA集群搭建的更多相关文章

随机推荐

热门专题