There are many open-source projects in the Hadoop ecosystem that are widely used by companies. This article walks through installing the most common ones.

Install JDK

Install Hadoop

Install Hbase

Install Hive

Install Spark

Install Impala

Install Sqoop

Install Alluxio

Install JDK

Step 1: download the package from the official site and choose the appropriate version.

Step 2: unzip the package and copy to destination folder

tar zxf jdk-8u111-linux-x64.tar.gz

cp -R jdk1.8.0_111 /usr/share/

Step 3: set PATH and JAVA_HOME

vi ~/.bashrc

export JAVA_HOME=/usr/share/jdk1.8.0_111
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

source ~/.bashrc

Step 4: start a new shell session (or reboot) so the changes take effect

Step 5: check java version

java -version

javac -version
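
Optionally, compile and run a tiny program to confirm both javac and java work (a throwaway sketch; the /tmp/Hello.java path is arbitrary):

cat > /tmp/Hello.java <<'EOF'
public class Hello {
    public static void main(String[] args) {
        // print the runtime version to confirm the JDK on PATH is the one just installed
        System.out.println("JDK works: " + System.getProperty("java.version"));
    }
}
EOF

javac -d /tmp /tmp/Hello.java

java -cp /tmp Hello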

Install Hadoop

Follow the steps below to install Hadoop in pseudo-distributed (single-node) mode.

Step 1: download package from apache site

Step 2: unzip the package and copy to destination folder

tar zxf hadoop-2.7.3.tar.gz

cp -R hadoop-2.7.3/* /usr/share/hadoop

Step 3: create a 'hadoop' folder under '/home'

mkdir /home/hadoop

Step 4: set PATH and HADOOP_HOME

vi ~/.bashrc

export HADOOP_HOME=/usr/share/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME

source ~/.bashrc

Step 5: check hadoop version

hadoop version

Step 6: configure HDFS, core-site, YARN, and MapReduce

cd $HADOOP_HOME/etc/hadoop

vi hadoop-env.sh

export JAVA_HOME=/usr/share/jdk1.8.0_111

vi core-site.xml

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>

vi hdfs-site.xml

<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hadoopinfra/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoop/hadoopinfra/hdfs/datanode</value>
</property>

vi yarn-site.xml

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

cp mapred-site.xml.template mapred-site.xml

vi mapred-site.xml

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

Step 7: initialize hadoop namenode

hdfs namenode -format

Step 8: start hadoop

start-dfs.sh

start-yarn.sh

Step 9: open the Hadoop web UIs to verify it is running

http://localhost:50070/ (HDFS NameNode)

http://localhost:8088/ (YARN ResourceManager)
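
Optionally, run one of the bundled example MapReduce jobs as an end-to-end sanity check (a minimal sketch, assuming the examples jar shipped with Hadoop 2.7.3 and the current user's HDFS home directory):

hdfs dfs -mkdir -p /user/$(whoami)/input

hdfs dfs -put $HADOOP_HOME/etc/hadoop/*.xml /user/$(whoami)/input

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar grep /user/$(whoami)/input /user/$(whoami)/output 'dfs[a-z.]+'

hdfs dfs -cat /user/$(whoami)/output/*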

Install HBase

Follow the steps below to install HBase in standalone mode.

Step 1: check that Hadoop is installed

hadoop version

Step 2: download version 1.2.4 of hbase from apache site

Step 3: unzip the package and copy to destination folder

tar zxf hbase-1.2.4-bin.tar.gz

cp -R hbase-1.2.4/* /usr/share/hbase

Step 4: configure hbase env

cd /usr/share/hbase/conf

vi hbase-env.sh

export JAVA_HOME=/usr/share/jdk1.8.0_111

Step 5: modify hbase-site.xml

vi hbase-site.xml

<configuration>
<!-- Set the path where you want HBase to store its files. -->
<property>
<name>hbase.rootdir</name>
<value>file:/home/hadoop/HBase/HFiles</value>
</property>
<!-- Set the path where you want HBase to store its built-in ZooKeeper files. -->
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/zookeeper</value>
</property>
</configuration>

Step 6: start HBase and check the HBase root directory (with the standalone config above, HBase stores its files on the local filesystem rather than in HDFS)

cd /usr/share/hbase/bin

./start-hbase.sh

ls /home/hadoop/HBase/HFiles

Step 7: check HBase via the web interface

http://localhost:16010
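
As an optional sanity check, you can drive the HBase shell non-interactively from bash (a minimal sketch; the table 'test' and column family 'cf' are throwaway examples):

/usr/share/hbase/bin/hbase shell <<'EOF'
create 'test', 'cf'
put 'test', 'row1', 'cf:a', 'value1'
scan 'test'
EOF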

Install Hive

Step 1: download version 1.2.1 of hive from apache site

Step 2: unzip the package and copy to destination folder

tar zxf apache-hive-1.2.1-bin.tar.gz

cp -R apache-hive-1.2.1-bin/* /usr/share/hive

Step 3: set HIVE_HOME

vi ~/.bashrc

export HIVE_HOME=/usr/share/hive
export PATH=$PATH:$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:/usr/share/hadoop/lib/*:.
export CLASSPATH=$CLASSPATH:/usr/share/hive/lib/*:.

source ~/.bashrc

Step 4: configure env for hive

cd $HIVE_HOME/conf

cp hive-env.sh.template hive-env.sh

vi hive-env.sh

export HADOOP_HOME=/usr/share/hadoop

Step 5: download version 10.12.1.1 of Apache Derby from apache site

Step 6: unzip derby package and copy to destination folder

tar zxf db-derby-10.12.1.1-bin.tar.gz

cp -R db-derby-10.12.1.1-bin/* /usr/share/derby

Step 7: setup DERBY_HOME

vi ~/.bashrc

export DERBY_HOME=/usr/share/derby
export PATH=$PATH:$DERBY_HOME/bin
export CLASSPATH=$CLASSPATH:$DERBY_HOME/lib/derby.jar:$DERBY_HOME/lib/derbytools.jar

source ~/.bashrc

Step 8: create a directory for the metastore data

mkdir $DERBY_HOME/data
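
Note that the metastore connection configured below uses Derby in network (client/server) mode on port 1527, so the Derby network server has to be running; one way to start it in the background (a sketch, assuming the paths above):

nohup $DERBY_HOME/bin/startNetworkServer -h 0.0.0.0 > /tmp/derby.log 2>&1 &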

Step 9: configure the metastore of Hive

cd $HIVE_HOME/conf

cp hive-default.xml.template hive-site.xml

<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore </description>
</property>

Step 10: create a file named jpox.properties and add the following content into it

touch jpox.properties

vi jpox.properties

javax.jdo.PersistenceManagerFactoryClass = org.jpox.PersistenceManagerFactoryImpl
org.jpox.autoCreateSchema = false
org.jpox.validateTables = false
org.jpox.validateColumns = false
org.jpox.validateConstraints = false
org.jpox.storeManagerType = rdbms
org.jpox.autoCreateSchema = true
org.jpox.autoStartMechanismMode = checked
org.jpox.transactionIsolation = read_committed
javax.jdo.option.DetachAllOnCommit = true
javax.jdo.option.NontransactionalRead = true
javax.jdo.option.ConnectionDriverName = org.apache.derby.jdbc.ClientDriver
javax.jdo.option.ConnectionURL = jdbc:derby://localhost:1527/metastore_db;create=true
javax.jdo.option.ConnectionUserName = APP
javax.jdo.option.ConnectionPassword = mine

Step 11: enter the Hive shell and execute the command 'show tables'

cd $HIVE_HOME/bin

hive

hive> show tables;
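
To confirm the metastore works end to end, you can also create and drop a throwaway table from the command line (a minimal sketch; the table name 'demo' is just an example):

hive -e "CREATE TABLE demo (id INT, name STRING); SHOW TABLES; DROP TABLE demo;"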

Install Spark

Step 1: download version 2.12.0 of scala from scala site

Step 2: unzip the package and copy to destination folder

tar zxf scala-2.12.0.tgz

cp -R scala-2.12.0/* /usr/share/scala

Step 3: set PATH for scala

vi ~/.bashrc

export PATH=$PATH:/usr/share/scala/bin

source ~/.bashrc

Step 4: check scala version

scala -version

Step 5: download version 2.0.2 of spark from apache site

Step 6: unzip the package and copy to destination folder

tar zxf spark-2.0.2-bin-hadoop2.7.tgz

cp -R spark-2.0.2-bin-hadoop2.7/* /usr/share/spark

Step 7: setup PATH

vi ~/.bashrc

export PATH=$PATH:/usr/share/spark/bin

source ~/.bashrc

Step 8: start spark-shell to see if Spark is installed successfully

spark-shell
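
A quick non-interactive check is to pipe a tiny job into the shell (a minimal sketch; the numbers are arbitrary):

echo 'sc.parallelize(1 to 100).sum()' | spark-shell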

Install Impala

Step 1: download version 2.7.0 of impala from impala site

Step 2: unzip the package and copy to destination folder

tar zxf apache-impala-incubating-2.7.0.tar.gz

cp -R apache-impala-incubating-2.7.0/* /usr/share/impala

Step 3: set PATH and IMPALA_HOME

vi ~/.bashrc

export IMPALA_HOME=/usr/share/impala
export PATH=$PATH:/usr/share/impala

source ~/.bashrc

Step 4: to be continued...

Install Sqoop

Prerequisite: Hadoop (HDFS and MapReduce) must already be installed

Step 1: download version 1.4.6 of sqoop from apache site

Step 2: unzip the package and copy to destination folder

tar zxf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz

cp -R sqoop-1.4.6.bin__hadoop-2.0.4-alpha/* /usr/share/sqoop

Step 3: set SQOOP_HOME and PATH

vi ~/.bashrc

export SQOOP_HOME=/usr/share/sqoop
export PATH=$PATH:$SQOOP_HOME/bin

source ~/.bashrc

Step 4: configure sqoop

cd $SQOOP_HOME/conf

mv sqoop-env-template.sh sqoop-env.sh

vi sqoop-env.sh

export HADOOP_COMMON_HOME=/usr/share/hadoop
export HADOOP_MAPRED_HOME=/usr/share/hadoop

Step 5: download version 5.1.40 of mysql-connector-java from site

Step 6: unzip the package and move related jar file into destination folder

tar -zxf mysql-connector-java-5.1.40.tar.gz
cd mysql-connector-java-5.1.40
mv mysql-connector-java-5.1.40-bin.jar /usr/share/sqoop/lib

Step 7: verify if sqoop is installed successfully

cd $SQOOP_HOME/bin

sqoop-version
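
Once the version prints, a typical import from MySQL into HDFS looks like the following (a hedged sketch; the database 'testdb', table 'employees', and the credentials are hypothetical placeholders):

sqoop import --connect jdbc:mysql://localhost:3306/testdb --username dbuser --password dbpass --table employees --target-dir /user/$(whoami)/employees -m 1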

Install Alluxio

Step 1: download version 1.3.0 of alluxio from site

Step 2: unzip the package and move it to destination folder

tar zxf alluxio-1.3.0-hadoop2.7-bin.tar.gz

cp -R alluxio-1.3.0-hadoop2.7-bin/* /usr/share/alluxio

Step 3: create alluxio-env

cd /usr/share/alluxio

bin/alluxio bootstrapConf localhost local

vi conf/alluxio-env.sh

export ALLUXIO_UNDERFS_ADDRESS=/tmp

Step 4: format alluxio file system and start alluxio

cd /usr/share/alluxio

bin/alluxio format

bin/alluxio-start.sh local

Step 5: verify if alluxio is running by visiting http://localhost:19999

Step 6: run predefined tests

cd /usr/share/alluxio

bin/alluxio runTests
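
After the tests pass, you can also try a few basic filesystem operations through the Alluxio CLI (a minimal sketch; the /demo path is just an example):

bin/alluxio fs mkdir /demo

bin/alluxio fs copyFromLocal /etc/hosts /demo/hosts

bin/alluxio fs ls /demo

bin/alluxio fs cat /demo/hosts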
