Sqoop Export HDFS
Sqoop Export use case: direct export
First make a copy of an existing table, then export the data imported in the previous post (Sqoop Import HDFS) into that copied table.
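The target table of the export has to exist beforehand. A minimal sketch of creating the copy with the mysql command-line client, assuming the dataweb database and user_info table used in the commands below:

mysql -u root -p -e "CREATE TABLE dataweb.user_info_copy LIKE dataweb.user_info;"

CREATE TABLE ... LIKE copies only the table definition, so the copy starts out empty.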

sqoop export \
--connect 'jdbc:mysql://202.193.60.117/dataweb?useUnicode=true&characterEncoding=utf-8' \
--username root \
--password-file /user/hadoop/.password \
--table user_info_copy \
--export-dir /user/hadoop/user_info \
--input-fields-terminated-by ","
The field delimiter here must match the delimiter used when the data was written to HDFS; a mismatch fails with "Caused by: java.lang.RuntimeException: Can't parse input data" (see my earlier post on this error when exporting Hive data to MySQL).
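The --password-file option reads the MySQL password from a file in HDFS. A minimal sketch of preparing that file, assuming the placeholder password below is replaced with the real one (the Sqoop documentation recommends making the file readable only by its owner, and echo -n avoids a trailing newline that would break authentication):

echo -n 'your_mysql_password' > .password
hdfs dfs -put .password /user/hadoop/.password
hdfs dfs -chmod 400 /user/hadoop/.password
rm .password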
Key lines from the run output:
INFO mapreduce.Job: Job job_1529567189245_0010 completed successfully
  Job Counters
    Launched map tasks=3
    Data-local map tasks=3        (3 map tasks were used; the map count can be set explicitly, as shown below)
INFO mapreduce.ExportJobBase: Transferred bytes in 38.2702 seconds (18.1865 bytes/sec)
INFO mapreduce.ExportJobBase: Exported records.
After the job finishes, check the database manually; user_info_copy now contains the exported rows, which confirms the export succeeded.
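A minimal sketch of that check with the mysql client, assuming the dataweb database from the connection string:

mysql -u root -p -e "SELECT * FROM dataweb.user_info_copy;"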
Specifying the number of map tasks
sqoop export \
--connect 'jdbc:mysql://202.193.60.117/dataweb?useUnicode=true&characterEncoding=utf-8' \
--username root \
--password-file /user/hadoop/.password \
--table user_info_copy \
--export-dir /user/hadoop/user_info \
--input-fields-terminated-by "," \
-m 1
The -m 1 option sets the number of map tasks to 1.
Clear out the data in the target table first, then run the export again.
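A minimal sketch of clearing the table with the mysql client, assuming the dataweb database (this deletes every row in user_info_copy):

mysql -u root -p -e "TRUNCATE TABLE dataweb.user_info_copy;"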

INFO mapreduce.Job: Job job_1529567189245_0011 completed successfully
  Job Counters
    Launched map tasks=1
    Data-local map tasks=1        (only one map task this time)
INFO mapreduce.ExportJobBase: Transferred bytes in 25.1976 seconds (12.9774 bytes/sec)   (the run also took less time than the 3-map run above)
INFO mapreduce.ExportJobBase: Exported records.

Sqoop Export use case: insert and update
First make a small change to the rows that are already in the table, then run the export again; after the export the modified rows are restored to their original values.
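For example, one row could be tweaked like this before re-running the export, assuming a row with id=1 exists and using a hypothetical new value:

mysql -u root -p -e "UPDATE dataweb.user_info_copy SET name='changed_by_hand' WHERE id=1;"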

Run the command:
sqoop export \
--connect 'jdbc:mysql://202.193.60.117/dataweb?useUnicode=true&characterEncoding=utf-8' \
--username root \
--password-file /user/hadoop/.password \
--table user_info_copy \
--export-dir /user/hadoop/user_info \
--input-fields-terminated-by "," \
-m 1 \
--update-key id \
--update-mode allowinsert
The --update-mode defaults to updateonly, which only updates existing rows; allowinsert also inserts rows whose --update-key value does not match any existing row.
After the job finishes, the rows are changed back to their original values.

Sqoop Export use case: transaction handling
Before the HDFS data is written into the target table, it is first exported into a staging table; only if the whole export succeeds are the rows then moved into the target table. The staging table must be created in advance with the same structure as the target table; copying the target table's definition is the simplest way.
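A minimal sketch of creating the staging table, assuming the mysql client and the dataweb database:

mysql -u root -p -e "CREATE TABLE dataweb.user_info_tmp LIKE dataweb.user_info_copy;"

The --clear-staging-table flag below empties the staging table before the export starts, so leftovers from a failed run do not get mixed in.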
sqoop export \
--connect 'jdbc:mysql://202.193.60.117/dataweb?useUnicode=true&characterEncoding=utf-8' \
--username root \
--password-file /user/hadoop/.password \
--table user_info_copy \
--staging-table user_info_tmp \
--clear-staging-table \
--export-dir /user/hadoop/user_info \
--input-fields-terminated-by ","
INFO mapreduce.Job: Job job_1529567189245_0014 completed successfully
INFO mapreduce.ExportJobBase: Transferred bytes in 36.8371 seconds (18.894 bytes/sec)
INFO mapreduce.ExportJobBase: Exported records.
INFO mapreduce.ExportJobBase: Starting to migrate data from staging table to destination.
INFO manager.SqlManager: Migrated 3 records from `user_info_tmp` to `user_info_copy`
The column-mismatch problem
First import the table from MySQL into HDFS, but only a subset of its columns, selected with --columns (the id column is left out here); then export that data from HDFS back into the database.
[hadoop@centpy hadoop-2.6.]$ sqoop import --connect jdbc:mysql://202.193.60.117/dataweb
> --username root
> --password-file /user/hadoop/.password
> --table user_info
> --columns name,password,intStatus
> --target-dir /user/hadoop/user_info
> --delete-target-dir
> --fields-terminated-by ","
> -m 1
[hadoop@centpy hadoop-2.6.]$ hdfs dfs -cat /user/hadoop/user_info/part-m-*
admin,,
hello,,
hahaha,haha,
The data imported here has one column fewer than the database table (no id). If we now export it back the way we did before, without --columns, the job is bound to fail, because the file no longer matches the table's column layout:
[hadoop@centpy hadoop-2.6.]$ sqoop export \
> --connect 'jdbc:mysql://202.193.60.117/dataweb?useUnicode=true&characterEncoding=utf-8' \
> --username root \
> --password-file /user/hadoop/.password \
> --table user_info_copy \
> --export-dir /user/hadoop/user_info \
> --input-fields-terminated-by "," \
> -m 1

To export a file whose columns do not match the table, the columns must be listed explicitly with --columns on the export as well:
[hadoop@centpy hadoop-2.6.]$ sqoop export \
> --connect 'jdbc:mysql://202.193.60.117/dataweb?useUnicode=true&characterEncoding=utf-8' \
> --username root \
> --password-file /user/hadoop/.password \
> --table user_info_copy \
> --columns name,password,intStatus \
> --export-dir /user/hadoop/user_info \
> --input-fields-terminated-by ","
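Note that this only works if the omitted id column can fill itself in, for example because it is AUTO_INCREMENT or nullable with a default. A hedged sketch of one way to set that up, assuming id is an integer primary key in user_info_copy:

mysql -u root -p -e "ALTER TABLE dataweb.user_info_copy MODIFY id INT NOT NULL AUTO_INCREMENT;"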
That is the main content of this section. These are notes from my own learning process; I hope they give you some guidance. If they help, a word of support is appreciated; if not, please bear with me, and do point out any mistakes. Follow the blog if you want to catch future updates. Thanks!
转自http://www.cnblogs.com/xing901022/p/3961661.html 有改动 struts2其实就是为我们封装了servlet,简化了jsp跳转的复杂操作,并且提供了易 ...