[rc@vq18ptkh01 ~]$ hadoop fs -ls /
drwxr-xr-x+ - jc_rc supergroup 0 2016-11-03 11:46 /dt
[rc@vq18ptkh01 ~]$ hadoop fs -copyFromLocal wifi_phone_list_1030.csv /dt
[rc@vq18ptkh01 ~]$ hadoop fs -copyFromLocal wifi_phone_list_1031.csv /dt
[rc@vq18ptkh01 ~]$ hadoop fs -copyFromLocal wifi_phone_list_1101.csv /dt
[rc@vq18ptkh01 ~]$ hadoop fs -ls /dt
16/11/03 11:53:16 INFO hdfs.PeerCache: SocketCache disabled.
Found 3 items
-rw-r--r--+ 3 jc_rc supergroup 1548749 2016-11-03 11:48 /dt/wifi_phone_list_1030.csv
-rw-r--r--+ 3 jc_rc supergroup 1262964 2016-11-03 11:52 /dt/wifi_phone_list_1031.csv
-rw-r--r--+ 3 jc_rc supergroup 979619 2016-11-03 11:52 /dt/wifi_phone_list_1101.csv

[rc@vq18ptkh01 ~]$ beeline
Connecting to jdbc:hive2://1.8.15.1:24002,10.78.152.24:24002,1.8.15.2:24002,1.8.12.42:24002,1.8.15.62:24002/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;sasl.qop=auth-conf;auth=KERBEROS;principal=hive/hadoop.hadoop.com@HADOOP.COM
Debug is true storeKey false useTicketCache true useKeyTab false doNotPrompt false ticketCache is null isInitiator true KeyTab is null refreshKrb5Config is false principal is null tryFirstPass is false useFirstPass is false storePass is false clearPass is false
Acquire TGT from Cache
Principal is jc_rc@HADOOP.COM
Commit Succeeded
Connected to: Apache Hive (version 1.3.0)
Driver: Hive JDBC (version 1.3.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.3.0 by Apache Hive
0: jdbc:hive2://1.8.15.2:21066/> use r_hive_db;
No rows affected (0.547 seconds)
0: jdbc:hive2://1.8.15.2:21066/> create table tmp_wifi1030(imisdn string,starttime string,endtime string) row format delimited fields terminated by ',' stored as textfile;
0: jdbc:hive2://1.8.15.2:21066/> show tables;
+---------------+--+
|   tab_name    |
+---------------+--+
| tmp_wifi1030  |
+---------------+--+
1 row selected (0.401 seconds)

[rc@vq18ptkh01 ~]$ wc -l wifi_phone_list_1030.csv
25390 wifi_phone_list_1030.csv
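The `wc -l` above gives a reference row count; before loading, it also pays to confirm that every CSV row splits into exactly the three comma-separated fields the `tmp_wifi1030` schema expects, since a stray delimiter silently shifts values into the wrong columns. A minimal local sketch with made-up sample rows:

```shell
# Create a tiny sample in the same 3-column, comma-delimited layout
# as wifi_phone_list_1030.csv (the values below are made up).
cat > /tmp/wifi_sample.csv <<'EOF'
18800000001,2016-10-30 23:58:56.000,2016-10-31 00:01:07.000
18800000002,2016-10-30 23:58:57.000,2016-10-31 00:01:49.000
EOF

# Every line should have exactly 3 fields when split on ','.
awk -F',' 'NF != 3 { bad++ } END { print bad+0, "malformed lines" }' /tmp/wifi_sample.csv
# prints: 0 malformed lines

# Row count to compare against count(*) in Hive after the load.
wc -l < /tmp/wifi_sample.csv
# prints: 2
```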
0: jdbc:hive2://1.8.15.2:21066/> load data inpath 'hdfs:/dt/wifi_phone_list_1030.csv' into table tmp_wifi1030;
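Worth noting: with an HDFS source path, `load data inpath` moves the file into the table's warehouse directory rather than copying it, which is why a later `hadoop fs -ls /dt` no longer lists the CSVs. The semantics are mv-like; a local analogy (all paths here are made up for illustration):

```shell
# Stand-ins for the HDFS staging dir and the Hive warehouse dir.
mkdir -p /tmp/src /tmp/warehouse
printf 'row\n' > /tmp/src/data.csv

# LOAD DATA INPATH behaves like this move: the source disappears.
mv /tmp/src/data.csv /tmp/warehouse/

ls /tmp/src        # prints nothing; the staged file is gone
ls /tmp/warehouse  # prints: data.csv
```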
0: jdbc:hive2://1.8.15.2:21066/> select * from tmp_wifi1030;
+----------------------+--------------------------+--------------------------+--+
| tmp_wifi1030.imisdn  |  tmp_wifi1030.starttime  |   tmp_wifi1030.endtime   |
+----------------------+--------------------------+--------------------------+--+
| 18806503523          | 2016-10-30 23:58:56.000  | 2016-10-31 00:01:07.000  |
| 15700125216          | 2016-10-30 23:58:57.000  | 2016-10-31 00:01:49.000  |
| ...                  | ...                      | ...                      |
+----------------------+--------------------------+--------------------------+--+
25,390 rows selected (5.649 seconds)

0: jdbc:hive2://1.8.15.2:21066/> select count(*) from tmp_wifi1030;
INFO : Number of reduce tasks determined at compile time: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO : set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO : set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO : set mapreduce.job.reduces=<number>
INFO : number of splits:1
INFO : Submitting tokens for job: job_1475071482566_2471703
INFO : Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster, Ident: (HDFS_DELEGATION_TOKEN token 19416140 for jc_rc)
INFO : Kind: HIVE_DELEGATION_TOKEN, Service: HiveServer2ImpersonationToken, Ident: 00 05 6a 63 5f 72 63 05 6a 63 5f 72 63 21 68 69 76 65 2f 68 61 64 6f 6f 70 2e 68 61 64 6f 6f 70 2e 63 6f 6d 40 48 41 44 4f 4f 50 2e 43 4f 4d 8a 01 58 28 57 df 96 8a 01 58 4c 64 63 96 8d 0d 65 ff 8e 03 97
INFO : The url to track the job: https://pc-z1:26001/proxy/application_1475071482566_2471703/
INFO : Starting Job = job_1475071482566_2471703, Tracking URL = https://pc-z1:26001/proxy/application_1475071482566_2471703/
INFO : Kill Command = /opt/huawei/Bigdata/FusionInsight_V100R002C60SPC200/FusionInsight-Hive-1.3.0/hive-1.3.0/bin/..//../hadoop/bin/hadoop job -kill job_1475071482566_2471703
INFO : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
INFO : 2016-11-03 12:04:58,351 Stage-1 map = 0%, reduce = 0%
INFO : 2016-11-03 12:05:04,702 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.72 sec
INFO : 2016-11-03 12:05:12,096 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.86 sec
INFO : MapReduce Total cumulative CPU time: 4 seconds 860 msec
INFO : Ended Job = job_1475071482566_2471703
+--------+--+
| _c0 |
+--------+--+
| 25390 |
+--------+--+
1 row selected (25.595 seconds)

0: jdbc:hive2://1.8.15.62:21066/> select * from default.d_s1mme limit 10;
[wide output trimmed: d_s1mme has dozens of columns (d_s1mme.length, d_s1mme.city, interface, xdr_id, rat, imsi, imei, msisdn, procedure start/end times, mme_ue_s1ap_id, tac, cell_id, ..., plus the p_hour partition column, here 2016101714), so the rows do not render legibly]
10 rows selected (0.6 seconds)

create table tmp_mr_s1_mme1030 as
select a.length,a.city,a.interface,a.xdr_id,a.rat,a.imsi,a.imei,a.msisdn,a.procedure_start_time,a.procedure_end_time,a.mme_ue_s1ap_id,a.mme_group_id,a.mme_code,a.user_ipv4,a.tac,a.cell_id,a.other_tac,a.other_eci
from default.d_s1mme a join r_hive_db.tmp_wifi1030 b on a.msisdn=b.imisdn and a.p_hour>='20161030' and a.p_hour<'20161031';

0: jdbc:hive2://1.8.15.2:21066/> create table tmp_mr_s1_mme_enbs1030 as
0: jdbc:hive2://1.8.15.2:21066/> select cell_id/256 from tmp_mr_s1_mme1030;
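The `cell_id/256` above extracts the eNodeB ID: the 28-bit E-UTRAN Cell Identity (ECI) packs a 20-bit eNodeB ID above an 8-bit local cell ID, so eNodeB ID = ECI div 256 and local cell ID = ECI mod 256. (In HiveQL, `/` is floating-point division, so `floor(cell_id/256)` or the integer `div` operator gives the eNodeB ID cleanly.) A shell sketch with a made-up ECI value:

```shell
# Decompose a (made-up) 28-bit ECI into eNodeB ID and local cell ID.
eci=195951677

enb_id=$(( eci / 256 ))   # upper 20 bits: the eNodeB ID
cell=$(( eci % 256 ))     # lower 8 bits: the cell within that eNodeB

echo "eci=$eci enb_id=$enb_id cell=$cell"
# prints: eci=195951677 enb_id=765436 cell=61

# Reassemble as a consistency check.
echo $(( enb_id * 256 + cell ))
# prints: 195951677
```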
0: jdbc:hive2://1.8.15.62:21066/> create table tmp_mr_s1_mme_cellids1030 as select distinct cast(cell_id as bigint) as cellid from tmp_mr_s1_mme1030;

0: jdbc:hive2://1.8.15.62:21066/> set hive.merge.mapfiles;
+---------------------------+--+
| set |
+---------------------------+--+
| hive.merge.mapfiles=true |
+---------------------------+--+
1 row selected (0.022 seconds)
0: jdbc:hive2://1.8.15.62:21066/> set hive.merge.mapredfields;
+---------------------------------------+--+
|                  set                  |
+---------------------------------------+--+
| hive.merge.mapredfields is undefined  |
+---------------------------------------+--+
1 row selected (0.022 seconds)

(The property name was mistyped here; the real setting is hive.merge.mapredfiles, which merges the small output files of map-reduce jobs, hence the "is undefined" reply.)
0: jdbc:hive2://1.8.15.62:21066/> set hive.merge.size.per.task=1024000000;
No rows affected (0.012 seconds)
0: jdbc:hive2://1.8.15.62:21066/> set hive.merge.smallfiles.avgsize=1024000000;
No rows affected (0.012 seconds)
0: jdbc:hive2://1.8.15.62:21066/> use r_hive_db;
No rows affected (0.031 seconds)
0: jdbc:hive2://1.8.15.62:21066/> insert overwrite directory '/dt/' row format delimited fields terminated by '|' select * from tmp_mr_s1_mme_cellids1030;
INFO : Number of reduce tasks is set to 0 since there's no reduce operator
INFO : number of splits:17
INFO : Submitting tokens for job: job_1475071482566_2477152
INFO : Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster, Ident: (HDFS_DELEGATION_TOKEN token 19422634 for jc_rc)
INFO : Kind: HIVE_DELEGATION_TOKEN, Service: HiveServer2ImpersonationToken, Ident: 00 05 6a 63 5f 72 63 05 6a 63 5f 72 63 21 68 69 76 65 2f 68 61 64 6f 6f 70 2e 68 61 64 6f 6f 70 2e 63 6f 6d 40 48 41 44 4f 4f 50 2e 43 4f 4d 8a 01 58 28 d2 8f 0b 8a 01 58 4c df 13 0b 8d 0d 6c 4b 8e 03 98
INFO : The url to track the job: https://pc-z1:26001/proxy/application_1475071482566_2477152/
INFO : Starting Job = job_1475071482566_2477152, Tracking URL = https://pc-z1:26001/proxy/application_1475071482566_2477152/
INFO : Kill Command = /opt/huawei/Bigdata/FusionInsight_V100R002C60SPC200/FusionInsight-Hive-1.3.0/hive-1.3.0/bin/..//../hadoop/bin/hadoop job -kill job_1475071482566_2477152
INFO : Hadoop job information for Stage-1: number of mappers: 17; number of reducers: 0
INFO : 2016-11-03 14:40:52,492 Stage-1 map = 0%, reduce = 0%
INFO : 2016-11-03 14:40:58,835 Stage-1 map = 76%, reduce = 0%, Cumulative CPU 28.78 sec
INFO : 2016-11-03 14:40:59,892 Stage-1 map = 88%, reduce = 0%, Cumulative CPU 33.55 sec
INFO : 2016-11-03 14:41:10,486 Stage-1 map = 94%, reduce = 0%, Cumulative CPU 37.13 sec
INFO : 2016-11-03 14:41:11,549 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 41.13 sec
INFO : MapReduce Total cumulative CPU time: 41 seconds 130 msec
INFO : Ended Job = job_1475071482566_2477152
INFO : Stage-3 is filtered out by condition resolver.
INFO : Stage-2 is selected by condition resolver.
INFO : Stage-4 is filtered out by condition resolver.
INFO : Number of reduce tasks is set to 0 since there's no reduce operator
INFO : number of splits:1
INFO : Submitting tokens for job: job_1475071482566_2477181
INFO : Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster, Ident: (HDFS_DELEGATION_TOKEN token 19422663 for jc_rc)
INFO : Kind: HIVE_DELEGATION_TOKEN, Service: HiveServer2ImpersonationToken, Ident: 00 05 6a 63 5f 72 63 05 6a 63 5f 72 63 21 68 69 76 65 2f 68 61 64 6f 6f 70 2e 68 61 64 6f 6f 70 2e 63 6f 6d 40 48 41 44 4f 4f 50 2e 43 4f 4d 8a 01 58 28 d2 8f 0b 8a 01 58 4c df 13 0b 8d 0d 6c 4b 8e 03 98
INFO : The url to track the job: https://pc-z1:26001/proxy/application_1475071482566_2477181/
INFO : Starting Job = job_1475071482566_2477181, Tracking URL = https://pc-z1:26001/proxy/application_1475071482566_2477181/
INFO : Kill Command = /opt/huawei/Bigdata/FusionInsight_V100R002C60SPC200/FusionInsight-Hive-1.3.0/hive-1.3.0/bin/..//../hadoop/bin/hadoop job -kill job_1475071482566_2477181
INFO : Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 0
INFO : 2016-11-03 14:41:22,190 Stage-2 map = 0%, reduce = 0%
INFO : 2016-11-03 14:41:28,571 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 2.2 sec
INFO : MapReduce Total cumulative CPU time: 2 seconds 200 msec
INFO : Ended Job = job_1475071482566_2477181
INFO : Moving data to directory /dt from hdfs://hacluster/dt/.hive-staging_hive_2016-11-03_14-40-43_774_4317869403646242426-140183/-ext-10000
No rows affected (46.604 seconds)

[rc@vq18ptkh01 dt]$ hadoop fs -ls /dt
16/11/03 14:46:18 INFO hdfs.PeerCache: SocketCache disabled.
Found 1 items
-rwxrwxrwx+ 3 jc_rc supergroup 26819 2016-11-03 14:41 /dt/000000_0
[rc@vq18ptkh01 dt]$ hadoop fs -copyToLocal /dt/000000_0
16/11/03 14:46:33 INFO hdfs.PeerCache: SocketCache disabled.
[rc@vq18ptkh01 dt]$ ls
000000_0
[rc@vq18ptkh01 dt]$ ls
000000_0 000001_0 000002_0 000003_0 000004_0 000005_0

(The extra part files 000001_0 ... 000005_0 apparently come from a later, larger export whose commands were trimmed from this log: the 000000_0 fetched above was only 26 KB, while the one sent over FTP below is about 1 GB.)
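Transfers of multi-gigabyte part files can abort part-way (as the FTP session below shows), leaving truncated data on the receiving side. Comparing a checksum computed on each end catches that. A self-contained sketch of the sender's half, using a tiny stand-in file for 000000_0:

```shell
# Stand-in for a part file about to be transferred.
printf 'example payload\n' > /tmp/part_000000_0

# Record the digest before sending; on the receiving host, run
# md5sum on the delivered file and compare against this value.
md5sum /tmp/part_000000_0 | awk '{print $1}' > /tmp/part_000000_0.md5
cat /tmp/part_000000_0.md5
```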
[rc@vq18ptkh01 dt]$ ftp 10.70.41.126 21
Connected to 10.70.41.126 (10.70.41.126).
220 10.70.41.126 FTP server ready
Name (10.70.41.126:rc): joy
331 Password required for joy.
Password:
230 User joy logged in.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> put 000000_0 /Temp/a_dt/
local: 000000_0 remote: /Temp/a_dt/
227 Entering Passive Mode (10,70,41,126,168,163).
550 /Temp/a_dt/: Not a regular file

(`put` needs a remote file name, not a directory; retry with the full target path:)
ftp> put
(local-file) 000000_0
(remote-file) /Temp/a_dt/000000_0
local: 000000_0 remote: /Temp/a_dt/000000_0
227 Entering Passive Mode (10,70,41,126,168,207).
150 Opening BINARY mode data connection for /Temp/a_dt/000000_0
226 Transfer complete.
1049905992 bytes sent in 33 secs (31787.20 Kbytes/sec)
ftp> put 000001_0 /Temp/a_dt/000001_0
local: 000001_0 remote: /Temp/a_dt/000001_0
227 Entering Passive Mode (10,70,41,126,168,255).
150 Opening BINARY mode data connection for /Temp/a_dt/000001_0
452 Transfer aborted. No space left on device
ftp> put 000002_0 /Temp/a_dt/000002_0
local: 000002_0 remote: /Temp/a_dt/000002_0
227 Entering Passive Mode (10,70,41,126,169,20).
150 Opening BINARY mode data connection for /Temp/a_dt/000002_0
452 Transfer aborted. No space left on device
ftp> put 000003_0 /Temp/a_dt/000003_0
local: 000003_0 remote: /Temp/a_dt/000003_0
227 Entering Passive Mode (10,70,41,126,169,40).
150 Opening BINARY mode data connection for /Temp/a_dt/000003_0
452 Transfer aborted. No space left on device
ftp> put 000004_0 /Temp/a_dt/000004_0
local: 000004_0 remote: /Temp/a_dt/000004_0
227 Entering Passive Mode (10,70,41,126,169,66).
150 Opening BINARY mode data connection for /Temp/a_dt/000004_0
452 Transfer aborted. No space left on device
ftp> put 000005_0 /Temp/a_dt/000005_0
local: 000005_0 remote: /Temp/a_dt/000005_0
227 Entering Passive Mode (10,70,41,126,169,85).
150 Opening BINARY mode data connection for /Temp/a_dt/000005_0
226 Transfer complete.
23465237 bytes sent in 0.747 secs (31391.79 Kbytes/sec)
ftp>
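The string of `452 Transfer aborted. No space left on device` replies means the remote filesystem filled up after the first ~1 GB file landed; only the last, smaller part still fit. Checking free space against the file size before a bulk `put` avoids these half-done transfers. A local sketch of that check (paths are illustrative):

```shell
# Stand-in for a file about to be transferred.
printf 'payload\n' > /tmp/to_send

# File size, rounded up to 1K blocks.
file_kb=$(( ( $(stat -c %s /tmp/to_send) + 1023 ) / 1024 ))

# Available space, in 1K blocks, on the target filesystem (/tmp here;
# on the real receiver this would be the filesystem holding /Temp/a_dt).
avail_kb=$(df -P /tmp | awk 'NR==2 {print $4}')

if [ "$avail_kb" -ge "$file_kb" ]; then
  echo "enough space for ${file_kb}K"
else
  echo "not enough space: need ${file_kb}K, have ${avail_kb}K"
fi
```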

To inspect the contents of a file on HDFS that is too large to load in one go, pipe it through a pager:

hadoop fs -cat /user/my/ab.txt | more
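When even paging is too slow, taking just the head or tail of the stream is usually enough; the same pipelines apply to `hadoop fs -cat` (and `hadoop fs -tail` prints the last kilobyte of a file directly). Shown here on a local file so the commands are self-contained; piping `hadoop fs -cat` into `head` may additionally log a harmless broken-pipe warning once `head` exits.

```shell
# Build a throwaway 10,000-line file as a stand-in for a big HDFS file.
seq 1 10000 > /tmp/big.txt

head -n 3 /tmp/big.txt   # first few lines (cf. hadoop fs -cat ... | head -n 3)
# prints: 1 2 3 (one per line)

tail -n 2 /tmp/big.txt   # last few lines (cf. hadoop fs -tail ...)
# prints: 9999 10000 (one per line)

wc -l < /tmp/big.txt     # row count without paging through everything
# prints: 10000
```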
