Hadoop 2.x(YARN)安装配置LZO
今天尝试在Hadoop 2.x(YARN)上安装和配置LZO,遇到了很多坑,网上的资料都是基于Hadoop 1.x的,基本没有对于Hadoop 2.x上应用LZO,我在这边记录整个安装配置过程
1. 安装LZO
下载lzo 2.06版本,编译64位版本,同步到集群中
wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.06.tar.gz
export CFLAGS=-m64
./configure -enable-shared -prefix=/usr/local/hadoop/lzo/
make && make test && make install
同步 /usr/local/hadoop/lzo/到整个集群上
2. 安装Hadoop-LZO
git clone https://github.com/twitter/hadoop-lzo.git
export CFLAGS=-m64
export CXXFLAGS=-m64
export C_INCLUDE_PATH=/usr/local/hadoop/lzo/include
export LIBRARY_PATH=/usr/local/hadoop/lzo/lib
mvn clean package -Dmaven.test.skip=true
tar -cBf - -C target/native/Linux-amd64-64/lib . | tar -xBvf - -C /usr/local/hadoop/hadoop-2.1.0-beta/lib/native/
cp target/hadoop-lzo-0.4.18-SNAPSHOT.jar /usr/local/hadoop/hadoop-2.1.0-beta/share/hadoop/common/
lib/native下的文件,包含native libraries和native compression
-rw-r--r-- 1 hadoop hadoop 104206 Sep 2 10:44 libgplcompression.a
-rw-rw-r-- 1 hadoop hadoop 1121 Sep 2 10:44 libgplcompression.la
lrwxrwxrwx 1 hadoop hadoop 26 Sep 2 10:47 libgplcompression.so -> libgplcompression.so.0.0.0
lrwxrwxrwx 1 hadoop hadoop 26 Sep 2 10:47 libgplcompression.so.0 -> libgplcompression.so.0.0.0
-rwxrwxr-x 1 hadoop hadoop 67833 Sep 2 10:44 libgplcompression.so.0.0.0
-rw-rw-r-- 1 hadoop hadoop 835968 Aug 29 17:12 libhadoop.a
-rw-rw-r-- 1 hadoop hadoop 1482132 Aug 29 17:12 libhadooppipes.a
lrwxrwxrwx 1 hadoop hadoop 18 Aug 29 17:12 libhadoop.so -> libhadoop.so.1.0.0
-rwxrwxr-x 1 hadoop hadoop 465801 Aug 29 17:12 libhadoop.so.1.0.0
-rw-rw-r-- 1 hadoop hadoop 580384 Aug 29 17:12 libhadooputils.a
-rw-rw-r-- 1 hadoop hadoop 273642 Aug 29 17:12 libhdfs.a
lrwxrwxrwx 1 hadoop hadoop 16 Aug 29 17:12 libhdfs.so -> libhdfs.so.0.0.0
-rwxrwxr-x 1 hadoop hadoop 181171 Aug 29 17:12 libhdfs.so.0.0.0
将 hadoop-lzo-0.4.18-SNAPSHOT.jar和/usr/local/hadoop/hadoop-2.1.0-beta/lib/native/ 同步到整个集群中
export LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib
core-site加入
<property>
<name>io.compression.codecs</name> <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
<property>
<name>io.compression.codec.lzo.class</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
mapred-site.xml加入
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
<property>
<name>mapred.map.output.compression.codec</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
<property>
<name>mapred.child.env</name>
<value>LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib</value>
</property>
2013-09-02 11:20:12,004 INFO [main] com.hadoop.compression.lzo.GPLNativeCodeLoader: Loaded native gpl library
2013-09-02 11:20:12,006 WARN [main] com.hadoop.compression.lzo.LzoCompressor: java.lang.UnsatisfiedLinkError: Cannot load liblzo2.so.2 (liblzo2.so.2: cannot open shared object file: No such file or directory)!
2013-09-02 11:20:12,006 ERROR [main] com.hadoop.compression.lzo.LzoCodec: Failed to load/initialize native-lzo library
同步hadoop-env.sh, core-site.xml, mapred-site.xml到集群
static {
if (GPLNativeCodeLoader.isNativeCodeLoaded()) {
nativeLzoLoaded = LzoCompressor.isNativeLzoLoaded() &&
LzoDecompressor.isNativeLzoLoaded();
if (nativeLzoLoaded) {
LOG.info("Successfully loaded & initialized native-lzo library [hadoop-lzo rev " + getRevisionHash() + "]");
} else {
LOG.error("Failed to load/initialize native-lzo library");
}
} else {
LOG.error("Cannot load native-lzo without native-hadoop");
}
}
Java_com_hadoop_compression_lzo_LzoCompressor_initIDs(
JNIEnv *env, jclass class
) {
// Load liblzo2.so
liblzo2 = dlopen(HADOOP_LZO_LIBRARY, RTLD_LAZY | RTLD_GLOBAL);
if (!liblzo2) {
char* msg = (char*)malloc(1000);
snprintf(msg, 1000, "%s (%s)!", "Cannot load " HADOOP_LZO_LIBRARY, dlerror());
THROW(env, "java/lang/UnsatisfiedLinkError", msg);
return;
} LzoCompressor_clazz = (*env)->GetStaticFieldID(env, class, "clazz",
"Ljava/lang/Class;");
LzoCompressor_finish = (*env)->GetFieldID(env, class, "finish", "Z");
LzoCompressor_finished = (*env)->GetFieldID(env, class, "finished", "Z");
LzoCompressor_uncompressedDirectBuf = (*env)->GetFieldID(env, class,
"uncompressedDirectBuf",
"Ljava/nio/ByteBuffer;");
LzoCompressor_uncompressedDirectBufLen = (*env)->GetFieldID(env, class,
"uncompressedDirectBufLen", "I");
LzoCompressor_compressedDirectBuf = (*env)->GetFieldID(env, class,
"compressedDirectBuf",
"Ljava/nio/ByteBuffer;");
LzoCompressor_directBufferSize = (*env)->GetFieldID(env, class,
"directBufferSize", "I");
LzoCompressor_lzoCompressor = (*env)->GetFieldID(env, class,
"lzoCompressor", "J");
LzoCompressor_lzoCompressionLevel = (*env)->GetFieldID(env, class,
"lzoCompressionLevel", "I");
LzoCompressor_workingMemoryBufLen = (*env)->GetFieldID(env, class,
"workingMemoryBufLen", "I");
LzoCompressor_workingMemoryBuf = (*env)->GetFieldID(env, class,
"workingMemoryBuf",
"Ljava/nio/ByteBuffer;"); // record lzo library version
void* lzo_version_ptr = NULL;
LOAD_DYNAMIC_SYMBOL(lzo_version_ptr, env, liblzo2, "lzo_version");
liblzo2_version = (NULL == lzo_version_ptr) ? 0
: (jint) ((unsigned (__LZO_CDECL *)())lzo_version_ptr)();
}
CREATE TABLE lzo_test(
col String
)
STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
下载lzop工具,load一个lzo文件进lzo_test表中,执行“select * from lzo_test"和"select count(1) from lzo_test"正确
hadoop jar /usr/local/hadoop/hadoop-2.1.0-beta/share/hadoop/common/hadoop-lzo-0.4.18-SNAPSHOT.jar com.hadoop.compression.lzo.DistributedLzoIndexer /user/hive/warehouse/lzo_test/
hadoop jar /usr/local/hadoop/hadoop-2.1.0-beta/share/hadoop/common/hadoop-lzo-0.4.18-SNAPSHOT.jar com.hadoop.compression.lzo.LzoIndexer /user/hive/warehouse/lzo_test/
Hadoop 2.x(YARN)安装配置LZO的更多相关文章
- 每天收获一点点------Hadoop基本介绍与安装配置
一.Hadoop的发展历史 说到Hadoop的起源,不得不说到一个传奇的IT公司—全球IT技术的引领者Google.Google(自称)为云计算概念的提出者,在自身多年的搜索引擎业务中构建了突破性的G ...
- Hadoop集群_Hadoop安装配置
1.集群部署介绍 1.1 Hadoop简介 Hadoop是Apache软件基金会旗下的一个开源分布式计算平台.以Hadoop分布式文件系统(HDFS,Hadoop Distributed Filesy ...
- 三、hadoop、yarn安装配置
本文hadoop的安装版本为hadoop-2.6.5 关闭防火墙 systemctl stop firewalld 一.安装JDK 1.下载java jdk1.8版本,放在/mnt/sata1目录下, ...
- Hadoop学习笔记: 安装配置Hadoop
安装前的一些环境配置: 1. 给用户添加sudo权限,输入su - 进入root账号,然后输入visudo,进入编辑模式,找到这一行:"root ALL=(ALL) ALL"在下面 ...
- Storm on Yarn 安装配置
1.背景知识 在不修改Storm任何源代码的情况下,让Storm运行在YARN上,最简单的实现方法是将Storm的各个服务组件(包括Nimbus和Supervisor),作为单独的任务运行在YARN上 ...
- Hadoop 2.7.3 安装配置及测试
1.概述 Hadoop是一个由Apache基金会所开发的分布式系统基础架构.用户可以在不了解分布式底层细节的情况下,开发分布式程序.hadoop三种安装模式:单机模式,伪分布式,真正分布式.因在实际生 ...
- [Hadoop]Hive-1.2.x安装配置+Mysql安装
HIve的元数据存储在mysql中,需要配置与MySQL建立连接,除了安装MySQL外还要安装连接的jar包:mysql-connector-java-5.1.47.tar.gz 安装环境:Cen ...
- Hadoop学习笔记: 安装配置Hive
1. 在官网http://hive.apache.org/下载所需要版本的Hive,以下我们就以hive 2.1.0版为例. 2. 将下载好的压缩包放到指定文件夹解压,tar -zxvf apache ...
- Hadoop集群_VSFTP安装配置
原作者写的太好了,我这个菜鸟不自觉就转载了,原文链接:http://www.cnblogs.com/xia520pi/archive/2012/05/16/2503864.html 如果,您认为阅读这 ...
随机推荐
- LintCode-乱序字符串
题目描述: 给出一个字符串数组S,找到其中所有的乱序字符串(Anagram).如果一个字符串是乱序字符串,那么他存在一个字母集合相同,但顺序不同的字符串也在S中. 注意事项 所有的字符串都只包含小写字 ...
- WINDOWS操作系统中可以允许最大的线程数(线程栈预留1M空间)(56篇Windows博客值得一看)
WINDOWS操作系统中可以允许最大的线程数 默认情况下,一个线程的栈要预留1M的内存空间 而一个进程中可用的内存空间只有2G,所以理论上一个进程中最多可以开2048个线程 但是内存当然不可能完全拿来 ...
- C++模板:字典树
//插入 void insert(char *s,char *s1){ for(int l=strlen(s),x=0,i=0;i<l;i++){ if(!trie[x].son[s[i]-'a ...
- out/target/common/obj/PACKAGING/public_api.txt android.view.KeyEvent.KEYCODE_has changed value from
编译出错: out/target/common/obj/PACKAGING/public_api.txt:22549: error 17: Field android.view.KeyEvent.KE ...
- 没有login页面
"/"应用程序中的服务器错误. 无法找到资源. 说明:HTTP 404.您正在查找的资源(或者它的一个依赖项)可能已被移除,或其名称已更改,或暂时不可用.请检查以下 URL 并确保 ...
- 解决 win10 预览版开始菜单打不开的问题
除了该文章[http://jingyan.baidu.com/article/64d05a025d2668de55f73b9e.html]里面说的解决方法之外,我只加上一点 . 打开本机防火墙,或者调 ...
- Oracle存储过程function语法及案例
create or replace function F01_SX03_SUM(statdate varchar2, code varchar2, para varchar2) RETURN numb ...
- Nginx 之五: Nginx服务器的负载均衡、缓存与动静分离功能
一.负载均衡: 通过反向代理客户端的请求到一个服务器群组,通过某种算法,将客户端的请求按照自定义的有规律的一种调度调度给后端服务器. Nginx的负载均衡使用upstream定义服务器组,后面跟着组名 ...
- Java疯狂讲义(四)
- 在Github上搭建你的博客
title: blog on github date: 2014-03-24 20:29:47 tags: [blog,github,hexo] --- **用Github写博文** 参考http:/ ...