Notes on issues encountered while testing Sqoop after installation

The command that was run:

sqoop import --connect "jdbc:mysql://x.x.x.x:3306/intelligent_qa_bms?useUnicode=true&characterEncoding=utf-8&zeroDateTimeBehavior=convertToNull"  --username root  --password xxxx  --query "select id,siteName,type,section,title,content,url,word_count,status,publishDate,crawlDate from crawl_znwd_all where  updateDate <= 20190717 and word_count <= 800 AND status=2 and \$CONDITIONS;" -m 1 --null-string 'null' --null-non-string 'null' --fields-terminated-by '¥' --lines-terminated-by '\n' --hive-drop-import-delims --target-dir /znwd/input --as-textfile --delete-target-dir
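One detail worth noting in the command above: because the `--query` value is wrapped in double quotes, the shell would normally expand `$CONDITIONS` itself (usually to an empty string), silently breaking the query Sqoop receives, so it must be escaped as `\$CONDITIONS`. A minimal sketch of the quoting behavior (the query text here is a trimmed-down illustration, not the full query):

```shell
# Inside double quotes the shell expands $CONDITIONS, so the backslash
# escape is required to keep the literal placeholder for Sqoop.
QUERY="select id from crawl_znwd_all where status=2 and \$CONDITIONS"
echo "$QUERY"
# -> select id from crawl_znwd_all where status=2 and $CONDITIONS
```

With single quotes around the query the escape would be unnecessary, but with double quotes (as used above) it is mandatory.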

Error 1:

mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
19/07/18 10:54:16 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/07/18 10:54:17 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/07/18 10:54:18 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/07/18 10:54:19 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/07/18 10:54:20 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

Fix:

Add the following to mapred-site.xml:

<property>
  <name>mapreduce.jobhistory.address</name>
  <value>HDFSNameNode:10020</value>
</property>
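Configuring the address alone is not enough if the JobHistory server process is not actually running; in Hadoop 2.x it is started separately from the other daemons. A sketch, assuming the standard sbin layout of the Hadoop 2.7.3 install visible in the logs and that HADOOP_HOME points at it:

```shell
# Start the MapReduce JobHistory server (listens on port 10020 by default)
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver

# Verify the process is up before re-running the import
jps | grep JobHistoryServer
```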

Error 2:

19/07/18 11:04:51 INFO mapreduce.Job: map 0% reduce 100%
19/07/18 11:04:51 INFO mapreduce.Job: Job job_1562900696500_0017 failed with state FAILED due to:
19/07/18 11:04:51 INFO mapreduce.ImportJobBase: The MapReduce job has already been retired. Performance
19/07/18 11:04:51 INFO mapreduce.ImportJobBase: counters are unavailable. To get this information,
19/07/18 11:04:51 INFO mapreduce.ImportJobBase: you will need to enable the completed job store on
19/07/18 11:04:51 INFO mapreduce.ImportJobBase: the jobtracker with:
19/07/18 11:04:51 INFO mapreduce.ImportJobBase: mapreduce.jobtracker.persist.jobstatus.active = true
19/07/18 11:04:51 INFO mapreduce.ImportJobBase: mapreduce.jobtracker.persist.jobstatus.hours = 1
19/07/18 11:04:51 INFO mapreduce.ImportJobBase: A jobtracker restart is required for these settings
19/07/18 11:04:51 INFO mapreduce.ImportJobBase: to take effect.
19/07/18 11:04:51 ERROR tool.ImportTool: Import failed: Import job failed!

Fix:

This turned out to be a permissions problem: the user running the import had no permission on the HDFS target directory. Switching to a user that does have permission on that directory resolved it.

I had also added the following to mapred-site.xml (this was added first and did not fix the problem on its own; the error only went away after switching users, so the root cause really was permissions and this configuration is optional):

<property>
  <name>mapreduce.jobtracker.persist.jobstatus.active</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.jobtracker.persist.jobstatus.hours</name>
  <value>1</value>
</property>
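To confirm that it is indeed a permissions problem before switching users, it helps to check who owns the target directory. The path `/znwd/input` and the user `zhaolei` below are taken from the command and logs above; whether you re-run as the owner or change ownership is a site-specific choice:

```shell
# Check ownership and permissions of the import target directory
hdfs dfs -ls /znwd

# Either run the sqoop job as the directory's owner, or (as an HDFS
# superuser) hand the directory over to the user running the job:
hdfs dfs -chown -R zhaolei /znwd/input
```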

Error 3:

19/07/18 11:15:10 INFO mapreduce.Job: Task Id : attempt_1562900696500_0019_m_000000_0, Status : FAILED
Container [pid=89807,containerID=container_1562900696500_0019_01_000002] is running beyond virtual memory limits. Current usage: 549.3 MB of 1 GB physical memory used; 4.0 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1562900696500_0019_01_000002 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 89807 89804 89807 89807 (bash) 4 3 115920896 370 /bin/bash -c /home/admin/jdk1.8.0_144/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx2048M -Djava.io.tmpdir=/admin_data/hadoop_tmp/nm-local-dir/usercache/zhaolei/appcache/application_1562900696500_0019/container_1562900696500_0019_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/admin/hadoop-2.7.3/logs/userlogs/application_1562900696500_0019/container_1562900696500_0019_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.10.15 36756 attempt_1562900696500_0019_m_000000_0 2 1>/home/admin/hadoop-2.7.3/logs/userlogs/application_1562900696500_0019/container_1562900696500_0019_01_000002/stdout 2>/home/admin/hadoop-2.7.3/logs/userlogs/application_1562900696500_0019/container_1562900696500_0019_01_000002/stderr
|- 90025 89807 89807 89807 (java) 1477 86 4182704128 140259 /home/admin/jdk1.8.0_144/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx2048M -Djava.io.tmpdir=/admin_data/hadoop_tmp/nm-local-dir/usercache/zhaolei/appcache/application_1562900696500_0019/container_1562900696500_0019_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/admin/hadoop-2.7.3/logs/userlogs/application_1562900696500_0019/container_1562900696500_0019_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.10.15 36756 attempt_1562900696500_0019_m_000000_0 2

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

Fix:

This error did not affect the final result, but if you want a clean run it is still worth fixing:

Add the following to yarn-site.xml (note that mapreduce.reduce.memory.mb is a MapReduce job setting that is conventionally placed in mapred-site.xml):

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>300000</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>30000</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>3000</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>2000</value>
</property>

For an explanation of these parameters, see: https://www.cnblogs.com/xjh713/p/9681442.html
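As a sanity check on why the container was killed: YARN multiplies a container's physical memory allocation by yarn.nodemanager.vmem-pmem-ratio (default 2.1) to get its virtual memory ceiling, so the 1 GB container above was limited to about 2.1 GB of virtual memory, which a `-Xmx2048M` JVM plus native overhead easily exceeds. A quick calculation, assuming the 2.1 default:

```shell
# Default vmem-pmem ratio in Hadoop 2.x is 2.1:
# 1024 MB physical * 2.1 = 2150.4 MB virtual ceiling, while the task's
# JVM used about 4.0 GB -> container killed with exit code 143.
physical_mb=1024
limit_mb=$(awk -v p="$physical_mb" 'BEGIN { printf "%.1f", p * 2.1 }')
echo "virtual memory limit: ${limit_mb} MB"
# -> virtual memory limit: 2150.4 MB
```

This matches the "4.0 GB of 2.1 GB virtual memory used" figure in the error message, which is why raising the container memory (or the ratio) makes the warning go away.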
