环境:
hadoop2.2.0
hive0.13.1
Ubuntu 14.04 LTS
java version "1.7.0_60"
Oracle10g

***欢迎转载。请注明来源*** 
 

http://blog.csdn.net/u010967382/article/details/38709751

到下面地址下载安装包
http://mirrors.cnnic.cn/apache/hive/stable/apache-hive-0.13.1-bin.tar.gz

安装包解压到server上
/home/fulong/Hive/apache-hive-0.13.1-bin

改动环境变量,加入下面内容
export HIVE_HOME=/home/fulong/Hive/apache-hive-0.13.1-bin
export PATH=$HIVE_HOME/bin:$PATH

进到conf文件夹下拷贝模板配置文件重命名
fulong@FBI006:~/Hive/apache-hive-0.13.1-bin/conf$ ls
hive-default.xml.template  hive-exec-log4j.properties.template
hive-env.sh.template       hive-log4j.properties.template
fulong@FBI006:~/Hive/apache-hive-0.13.1-bin/conf$ cp hive-env.sh.template hive-env.sh
fulong@FBI006:~/Hive/apache-hive-0.13.1-bin/conf$ cp hive-default.xml.template hive-site.xml
fulong@FBI006:~/Hive/apache-hive-0.13.1-bin/conf$ ls
hive-default.xml.template  hive-env.sh.template                 hive-log4j.properties.template
hive-env.sh                hive-exec-log4j.properties.template  hive-site.xml

改动配置文件hive-env.sh中的下面几处。分别制定Hadoop的根文件夹,Hive的conf和lib文件夹
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/home/fulong/Hadoop/hadoop-2.2.0

# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/home/fulong/Hive/apache-hive-0.13.1-bin/conf

# Folder containing extra ibraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/home/fulong/Hive/apache-hive-0.13.1-bin/lib

改动配置文件hive-site.sh中的下面几处连接Oracle相关參数
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:oracle:thin:@192.168.0.138:1521:orcl</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>oracle.jdbc.driver.OracleDriver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hivefbi</value>
  <description>password to use against metastore database</description>
</property>


配置log4j
在$HIVE_HOME下创建log4j文件夹,用于存储日志文件
拷贝模板重命名
fulong@FBI006:~/Hive/apache-hive-0.13.1-bin/conf$ cp hive-log4j.properties.template hive-log4j.properties

改动存放日志的文件夹
hive.log.dir=/home/fulong/Hive/apache-hive-0.13.1-bin/log4j

拷贝Oracle JDBC的jar包
将相应Oracle的jdbc包复制到$HIVE_HOME/lib下

启动Hive
fulong@FBI006:~/Hive/apache-hive-0.13.1-bin$ hive
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
14/08/20 17:14:05 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead

Logging initialized using configuration in file:/home/fulong/Hive/apache-hive-0.13.1-bin/conf/hive-log4j.properties
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/fulong/Hadoop/hadoop-2.2.0/lib/native/libhadoop.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
hive>

验证
打算创建一张表存储搜狗实验室下载的用户搜索行为日志。

数据下载地址:
http://www.sogou.com/labs/dl/q.html

首先创建表:
hive> create table searchlog (time string,id string,sword string,rank int,clickrank int,url string) row format delimited fields
terminated by '\t' lines terminated by '\n' stored as textfile;

此时会报错:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:javax.jdo.JDODataStoreException: An exception was thrown while adding/validating class(es) : ORA-01754: a table may contain
only one column of type LONG

解决的方法:
用解压缩工具打开${HIVE_HOME}/lib中的hive-metastore-0.13.0.jar,发现名为package.jdo的文件。打开该文件并找到以下的内容。
<field name="viewOriginalText" default-fetch-group="false">
        <column name="VIEW_ORIGINAL_TEXT" jdbc-type="LONGVARCHAR"/>
</field>
<field name="viewExpandedText" default-fetch-group="false">
        <column name="VIEW_EXPANDED_TEXT" jdbc-type="LONGVARCHAR"/>
</field>
能够发现列VIEW_ORIGINAL_TEXT和VIEW_EXPANDED_TEXT的类型都为LONGVARCHAR,相应于Oracle中的LONG,这样就与Oracle表仅仅能存在一列类型为LONG的列的要求相矛盾,所以就出现错误了。

依照Hive官网的建议将该两列的jdbc-type的值改为CLOB。改动后的内容例如以下所看到的。
<field name="viewOriginalText"default-fetch-group="false">
             <column name="VIEW_ORIGINAL_TEXT" jdbc-type="CLOB"/>
</field>
<field name="viewExpandedText"default-fetch-group="false">
             <column name="VIEW_EXPANDED_TEXT" jdbc-type="CLOB"/>
</field>

改动以后,重新启动hive。


又一次运行创建表的命令。创建表成功:
hive> create table searchlog (time string,id string,sword string,rank int,clickrank int,url string) row format delimited fields terminated by '\t' lines terminated by '\n' stored
as textfile;
OK
Time taken: 0.986 seconds

将本地数据载入进表中:
hive> load data local inpath '/home/fulong/Downloads/SogouQ.reduced' overwrite into table searchlog;
Copying data from file:/home/fulong/Downloads/SogouQ.reduced
Copying file: file:/home/fulong/Downloads/SogouQ.reduced
Loading data to table default.searchlog
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://fulonghadoop/user/hive/warehouse/searchlog
Table default.searchlog stats: [numFiles=1, numRows=0, totalSize=152006060, rawDataSize=0]
OK
Time taken: 25.705 seconds

查看全部表:
hive> show tables;
OK
searchlog
Time taken: 0.139 seconds, Fetched: 1 row(s)

统计行数:
hive> select count(*) from searchlog;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1407233914535_0001, Tracking URL = http://FBI003:8088/proxy/application_1407233914535_0001/
Kill Command = /home/fulong/Hadoop/hadoop-2.2.0/bin/hadoop job  -kill job_1407233914535_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2014-08-20 18:03:17,667 Stage-1 map = 0%,  reduce = 0%
2014-08-20 18:04:05,426 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.46 sec
2014-08-20 18:04:27,317 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.74 sec
MapReduce Total cumulative CPU time: 4 seconds 740 msec
Ended Job = job_1407233914535_0001
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1   Cumulative CPU: 4.74 sec   HDFS Read: 152010455 HDFS Write: 8 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 740 msec
OK
Time taken: 103.154 seconds, Fetched: 1 row(s)





版权声明:本文博客原创文章。博客,未经同意,不得转载。

【甘道夫】Hive 0.13.1 on Hadoop2.2.0 + Oracle10g部署详细解释的更多相关文章

  1. hive-安装0.13.1(hadoop2.2.0)

    hadoop2.2.0 hive0.13.1 (事先已经安装好hadoop.MySQL以及在MySQL中建好了hive专用账号,数据创建不创建都可以) 1.下载解压 2.把MySQL驱动加入hive的 ...

  2. 【甘道夫】Apache Hadoop 2.5.0-cdh5.2.0 HDFS Quotas 配额控制

    前言 HDFS为管理员提供了针对文件夹的配额控制特性,能够控制名称配额(指定文件夹下的文件&文件夹总数),或者空间配额(占用磁盘空间的上限). 本文探究了HDFS的配额控制特性,记录了各类配额 ...

  3. 【甘道夫】Win7x64环境下编译Apache Hadoop2.2.0的Eclipse小工具

    目标: 编译Apache Hadoop2.2.0在win7x64环境下的Eclipse插件 环境: win7x64家庭普通版 eclipse-jee-kepler-SR1-win32-x86_64.z ...

  4. 【甘道夫】MapReduce实现矩阵乘法--实现代码

    之前写了一篇分析MapReduce实现矩阵乘法算法的文章: [甘道夫]Mapreduce实现矩阵乘法的算法思路 为了让大家更直观的了解程序运行,今天编写了实现代码供大家參考. 编程环境: java v ...

  5. 【甘道夫】Win7环境下Eclipse连接Hadoop2.2.0

    准备: 确保hadoop2.2.0集群正常执行 1.eclipse中建立javaproject,导入hadoop2.2.0相关jar包 2.在src根文件夹下拷入log4j.properties,通过 ...

  6. 【甘道夫】Hadoop2.2.0 NN HA具体配置+Client透明性试验【完整版】

    引言: 前面转载过一篇团队兄弟[伊利丹]写的NN HA实验记录,我也基于他的环境实验了NN HA对于Client的透明性. 本篇文章记录的是亲自配置NN HA的具体全过程,以及全面測试HA对clien ...

  7. 【甘道夫】Ubuntu14 server + Hadoop2.2.0环境下Sqoop1.99.3部署记录

    第一步.下载.解压.配置环境变量: 官网下载sqoop1.99.3 http://mirrors.cnnic.cn/apache/sqoop/1.99.3/ 将sqoop解压到目标文件夹,我的是 /h ...

  8. 【甘道夫】Hadoop2.2.0环境使用Sqoop-1.4.4将Oracle11g数据导入HBase0.96,并自己主动生成组合行键

    目的: 使用Sqoop将Oracle中的数据导入到HBase中,并自己主动生成组合行键! 环境: Hadoop2.2.0 Hbase0.96 sqoop-1.4.4.bin__hadoop-2.0.4 ...

  9. 【甘道夫】Sqoop1.99.3基础操作--导入Oracle的数据到HDFS

    第一步:进入clientShell fulong@FBI008:~$ sqoop.sh client Sqoop home directory: /home/fulong/Sqoop/sqoop-1. ...

随机推荐

  1. Swift - 自定义UIActivity分享

    UIActivity可以十分方便地将文字.图片等内容进行分享,比如分享到微信.微博.发送邮件.短信等等.我们不仅可以分享内容出来,也可以在自己的App里添加自己的分享按钮或隐藏已有的分享按钮来实现定制 ...

  2. 转]解析C语言中的sizeof

    解析C语言中的sizeof 一.sizeof的概念 sizeof是C语言的一种单目操作符,如C语言的其他操作符++.--等.它并不是函数.sizeof操作符以字节形式给出 了其操作数的存储大小.操作数 ...

  3. java垃圾回收那点事(二)不同gc策略的heap分配

    在前面的文章中曾提到了在java虚拟机启动的时候会对G1,CMS, SerialGC定义不同的heap的类,并且定义不同的policy. CollectorPolicy CollectorPolicy ...

  4. Android开发之Sqlite的使用

    在Android中存储数据可以用文件.数据库.网络,其中文件和数据库是最常用的,数据库我们常用的就是Sqlite,它是一种经量级的.嵌入式的关系型数据库:在android中当需要操作SQLite数据库 ...

  5. Linux实现字符设备驱动的基础步骤

    Linux应用层想要操作kernel层的API,比方想操作相关GPIO或寄存器,能够通过写一个字符设备驱动来实现. 1.先在rootfs中的 /dev/ 下生成一个字符设备.注意主设备号 和 从设备号 ...

  6. poj 3270 更换使用

    1.确定初始和目标状态. 明确.目标状态的排序状态. 2.得出置换群,.比如,数字是8 4 5 3 2 7,目标状态是2 3 4 5 7 8.能写为两个循环:(8 2 7)(4 3 5). 3.观察当 ...

  7. 使用Ajax以及Jquery.form异步上传图片

    一.前言 之前做图片上传一直用的第三方插件,Uploadify  这个应该是用的比較多的,相同也用过别的,在方便了自己的同一时候也非常赞叹人家的功能. 思来想去,仅仅会用别的人东西,始终自己学到的少, ...

  8. hdu3790最短路径问题 (用优先队列实现的)

    Problem Description 给你n个点,m条无向边,每条边都有长度d和花费p,给你起点s终点t,要求输出起点到终点的最短距离及其花费,如果最短距离有多条路线,则输出花费最少的.   Inp ...

  9. Dubbo与Zookeeper、SpringMVC整合和使用(负载均衡、容错)(转)

    互联网的发展,网站应用的规模不断扩大,常规的垂直应用架构已无法应对,分布式服务架构以及流动计算架构势在必行,Dubbo是一个分布式服务框架,在这种情况下诞生的.现在核心业务抽取出来,作为独立的服务,使 ...

  10. (六)unity4.6Ugui中国教程文档-------概要-UGUI Animation Integration

     大家好,我是太阳广东.   转载请注明出处:http://write.blog.csdn.net/postedit/38922399 更全的内容请看我的游戏蛮牛地址:mod=guide& ...