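Hadoop Ecosystem - Oozie in Practice: Scheduling Multiple Jobs in Sequence
Author: Yin Zhengjie (尹正杰)
Copyright notice: this is an original work. Reproduction is not permitted; violations will be pursued legally.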
1>. Start the Hadoop cluster
[root@yinzhengjie hadoop-2.5.0-cdh5.3.6]# sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [s101]
s101: starting namenode, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/hadoop-root-namenode-yinzhengjie.out
s102: starting datanode, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/hadoop-root-datanode-s102.out
s104: starting datanode, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/hadoop-root-datanode-s104.out
s103: starting datanode, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/hadoop-root-datanode-s103.out
Starting secondary namenodes [s101]
s101: starting secondarynamenode, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/hadoop-root-secondarynamenode-yinzhengjie.out
starting yarn daemons
starting resourcemanager, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/yarn-yinzhengjie-resourcemanager-s101.out
s104: starting nodemanager, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/yarn-root-nodemanager-s104.out
s102: starting nodemanager, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/yarn-root-nodemanager-s102.out
s103: starting nodemanager, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/yarn-root-nodemanager-s103.out
[root@yinzhengjie hadoop-2.5.0-cdh5.3.6]#
Start the JobHistory server:
[root@yinzhengjie hadoop-2.5.0-cdh5.3.6]# sbin/mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/mapred-yinzhengjie-historyserver-s101.out
[root@yinzhengjie hadoop-2.5.0-cdh5.3.6]#
Start Oozie:
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# bin/oozied.sh start
Setting OOZIE_HOME: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6
Setting OOZIE_CONFIG: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/conf
Sourcing: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/conf/oozie-env.sh
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Setting OOZIE_CONFIG_FILE: oozie-site.xml
Setting OOZIE_DATA: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/data
Setting OOZIE_LOG: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/logs
Setting OOZIE_LOG4J_FILE: oozie-log4j.properties
Setting OOZIE_LOG4J_RELOAD:
hostname: Name or service not known
Setting OOZIE_HTTP_HOSTNAME:
Setting OOZIE_HTTP_PORT:
Setting OOZIE_ADMIN_PORT:
Setting OOZIE_HTTPS_PORT:
Setting OOZIE_BASE_URL: http://:11000/oozie
Setting CATALINA_BASE: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server
Setting OOZIE_HTTPS_KEYSTORE_FILE: /root/.keystore
Setting OOZIE_HTTPS_KEYSTORE_PASS: password
Setting OOZIE_INSTANCE_ID:
Setting CATALINA_OUT: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/logs/catalina.out
Setting CATALINA_PID: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server/temp/oozie.pid
Using CATALINA_OPTS: -Xmx1024m -Dderby.stream.error.file=/home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/logs/derby.log
Adding to CATALINA_OPTS: -Doozie.home.dir=/home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6 -Doozie.config.dir=/home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/conf -Doozie.log.dir=/home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/logs -Doozie.data.dir=/home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/data -Doozie.instance.id= -Doozie.config.file=oozie-site.xml -Doozie.log4j.file=oozie-log4j.properties -Doozie.log4j.reload= -Doozie.http.hostname= -Doozie.admin.port= -Doozie.http.port= -Doozie.https.port= -Doozie.base.url=http://:11000/oozie -Doozie.https.keystore.file=/root/.keystore -Doozie.https.keystore.pass=password -Djava.library.path=
Using CATALINA_BASE: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server
Using CATALINA_HOME: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server
Using CATALINA_TMPDIR: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server/temp
Using JRE_HOME: /soft/jdk
Using CLASSPATH: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server/bin/bootstrap.jar
Using CATALINA_PID: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server/temp/oozie.pid
Existing PID file found during start.
Removing/clearing stale PID file.
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
Check that all daemons started successfully:
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# xcall.sh jps
============= s101 jps ============
ResourceManager
SecondaryNameNode
JobHistoryServer
Jps
Bootstrap
NameNode
Command executed successfully
============= s102 jps ============
Jps
NodeManager
DataNode
Command executed successfully
============= s103 jps ============
DataNode
Jps
NodeManager
Command executed successfully
============= s104 jps ============
NodeManager
DataNode
Jps
Command executed successfully
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
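The jps listing can also be checked mechanically. The following is only a minimal sketch and not part of the original setup: `check_daemons` is a hypothetical helper, and the PIDs in the sample listings are made up (real jps prints a PID before each name). `Bootstrap` is the Tomcat process that hosts the Oozie server, which is why it appears in the expected list for s101.

```shell
#!/bin/bash
# check_daemons is a hypothetical helper (not part of Hadoop or Oozie).
# $1 is the text printed by jps; the remaining arguments are the
# process names expected to appear in it.
check_daemons() {
  local jps_out="$1"
  shift
  local missing=""
  local name
  for name in "$@"; do
    # The process name is the last field of each jps line.
    if ! printf '%s\n' "$jps_out" | awk '{print $NF}' | grep -qxF "$name"; then
      missing="$missing $name"
    fi
  done
  # Print the missing names separated by spaces (empty when none are missing).
  echo "${missing# }"
}

# Daemons expected on s101, taken from the listing above.
expected="NameNode SecondaryNameNode ResourceManager JobHistoryServer Bootstrap"

# Sample listings with made-up PIDs; on the cluster, pass "$(jps)" instead.
all_up="2101 NameNode
2290 SecondaryNameNode
2444 ResourceManager
2760 JobHistoryServer
2913 Bootstrap
3050 Jps"
oozie_down="2101 NameNode
2290 SecondaryNameNode
2444 ResourceManager
2760 JobHistoryServer
3050 Jps"

# $expected is left unquoted on purpose so it splits into separate names.
missing_ok="$(check_daemons "$all_up" $expected)"
missing_bad="$(check_daemons "$oozie_down" $expected)"
echo "all up, missing: '${missing_ok}'"
echo "oozie down, missing: '${missing_bad}'"
```

With the second sample the helper reports `Bootstrap` as missing, which would mean the Oozie server did not come up.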
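Then check in a browser that the Oozie web UI is up at http://s101:11000/oozie (the same address the client uses in step 7).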

2>. Extract the official example templates
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# tar -zxf oozie-examples.tar.gz
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
3>. Write the scripts
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# cat yinzhengjie-oozie-jobs/shell/test-.sh
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

/bin/date -d today +"%Y-%m-%d %T" > /home/yinzhengjie/data/access-.log
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# cat yinzhengjie-oozie-jobs/shell/test-.sh
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

/bin/date -d today +"%Y-%m-%d %T" > /home/yinzhengjie/data/access-.log
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
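The only real command in both scripts is the date call, so it is easy to try locally before handing the scripts to Oozie. The sketch below makes two assumptions: it writes to a temporary file instead of /home/yinzhengjie/data, and it expects GNU date (the `-d today` option is a GNU extension).

```shell
#!/bin/bash
# Write the timestamp to a temporary file (assumption: used here only so
# the check does not depend on /home/yinzhengjie/data existing).
out="$(mktemp)"
date -d today +"%Y-%m-%d %T" > "$out"

# Show the line that was written; it has the form "YYYY-MM-DD HH:MM:SS".
stamp="$(cat "$out")"
echo "$stamp"
rm -f "$out"
```

An Oozie shell action runs on whichever worker node is assigned the task, so the target directory of the real scripts presumably has to exist on every NodeManager host, and the log file ends up on the node that ran the action.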
4>. Edit the job.properties configuration file
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# more yinzhengjie-oozie-jobs/shell/job.properties
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# HDFS address
nameNode=hdfs://s101:8020
# ResourceManager address
jobTracker=s101:
# Queue name
queueName=default
examplesRoot=yinzhengjie-oozie-jobs
# HDFS directory that holds the Oozie shell workflow application
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/shell
# Names of the scripts to execute
EXEC1=test-.sh
EXEC2=test-.sh
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
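To see where `oozie.wf.application.path` ends up pointing, the substitution can be replayed by hand. This is a sketch under one assumption: `${user.name}` resolves to `root`, because the job is submitted from a root shell. The variable `user_name` is a stand-in for that property, since shell variable names cannot contain a dot.

```shell
#!/bin/bash
# Values copied from job.properties.
nameNode="hdfs://s101:8020"
examplesRoot="yinzhengjie-oozie-jobs"

# Assumption: the submitting user is root, so ${user.name} becomes "root".
user_name="root"

# Same composition as oozie.wf.application.path.
app_path="${nameNode}/user/${user_name}/${examplesRoot}/shell"
echo "$app_path"
# prints hdfs://s101:8020/user/root/yinzhengjie-oozie-jobs/shell
```

The resolved directory matches the /user/root/yinzhengjie-oozie-jobs/shell path that workflow.xml hard-codes in its `<file>` elements, so submitting as a different user would likely require changing those paths as well.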
5>. Edit the workflow.xml configuration file
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# cat yinzhengjie-oozie-jobs/shell/workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
    <start to="yinzhengjie-shell-node1"/>
    <action name="yinzhengjie-shell-node1">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${EXEC1}</exec>
            <file>/user/root/yinzhengjie-oozie-jobs/shell/${EXEC1}#${EXEC1}</file>
            <!-- <argument>my_output=Hello Oozie</argument>-->
            <capture-output/>
        </shell>
        <ok to="yinzhengjie-shell-node2"/>
        <error to="fail"/>
    </action>
    <action name="yinzhengjie-shell-node2">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${EXEC2}</exec>
            <file>/user/root/yinzhengjie-oozie-jobs/shell/${EXEC2}#${EXEC2}</file>
            <!-- <argument>my_output=Hello Oozie</argument>-->
            <capture-output/>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
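The sequencing of the two jobs lives entirely in the `<start>`, `<ok>` and `<error>` transitions of the workflow above. As a rough illustration, the sketch below lists those transitions with awk. It works on a trimmed copy of the workflow (action bodies omitted, kill message shortened) written to a temporary file, and it relies on each element sitting on its own line, so it is a reading aid rather than an XML parser.

```shell
#!/bin/bash
# Trimmed copy of the workflow: only the control-flow elements are kept.
wf="$(mktemp)"
cat > "$wf" <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
    <start to="yinzhengjie-shell-node1"/>
    <action name="yinzhengjie-shell-node1">
        <ok to="yinzhengjie-shell-node2"/>
        <error to="fail"/>
    </action>
    <action name="yinzhengjie-shell-node2">
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF

# Split each line on double quotes: field 2 is the first attribute value,
# i.e. the node name for <action> and the target for <start>/<ok>/<error>.
edges="$(awk -F'"' '
  /<start /  { print "start -> " $2 }
  /<action / { node = $2 }
  /<ok /     { print node " -ok-> " $2 }
  /<error /  { print node " -error-> " $2 }
' "$wf")"
echo "$edges"
rm -f "$wf"
```

The output shows five transitions: the start node leads to yinzhengjie-shell-node1, its `ok` transition leads to yinzhengjie-shell-node2, whose `ok` transition leads to `end`, and both `error` transitions lead to the `fail` kill node. The second script therefore only runs if the first one succeeds.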
6>. Upload the job configuration to HDFS
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -put yinzhengjie-oozie-jobs/shell/ /user/root/yinzhengjie-oozie-jobs/shell
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -ls -R /user/root/yinzhengjie-oozie-jobs/shell
-rw-r--r-- root supergroup -- : /user/root/yinzhengjie-oozie-jobs/shell/blog.sh
-rw-r--r-- root supergroup -- : /user/root/yinzhengjie-oozie-jobs/shell/job.properties
-rw-r--r-- root supergroup -- : /user/root/yinzhengjie-oozie-jobs/shell/test-.sh
-rw-r--r-- root supergroup -- : /user/root/yinzhengjie-oozie-jobs/shell/test-.sh
-rw-r--r-- root supergroup -- : /user/root/yinzhengjie-oozie-jobs/shell/workflow.xml
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
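When workflow.xml or the scripts are edited later, the upload has to be repeated, and a plain `-put` into a directory that already exists nests the source directory inside it instead of replacing it. A possible re-upload sequence is sketched below. `DRY_RUN` and `run` are hypothetical helpers introduced for this sketch: by default the commands are only printed, so nothing touches the cluster until `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Paths taken from the upload command above.
HDFS=/home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs
APP_PARENT=/user/root/yinzhengjie-oozie-jobs

# DRY_RUN is a hypothetical switch: with the default of 1 the commands are
# only printed; set DRY_RUN=0 on the cluster to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

plan="$(
  # Make sure the parent directory exists.
  run "$HDFS" dfs -mkdir -p "$APP_PARENT"
  # Remove the previous copy; -f keeps this from failing if it is absent.
  run "$HDFS" dfs -rm -r -f "$APP_PARENT/shell"
  # Upload the local shell directory into the parent directory.
  run "$HDFS" dfs -put yinzhengjie-oozie-jobs/shell "$APP_PARENT/"
)"
echo "$plan"
```

Run it from the Oozie installation directory, the same place the original `-put` was issued from, so that the relative source path resolves.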
7>. Run the job
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# bin/oozie job -oozie http://s101:11000/oozie -config yinzhengjie-oozie-jobs/shell/job.properties -run
job: --oozie-root-W
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
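The submit command answers with a single line of the form `job: <workflow-id>`, and that ID is what the `-info` option of `oozie job` expects when checking on the run. The sketch below shows how the ID could be extracted and the follow-up command assembled. The ID in it is a made-up placeholder, not the one from the run above, and the command is only printed, not executed.

```shell
#!/bin/bash
# The submit command prints one line of the form "job: <workflow-id>".
# The ID below is a made-up placeholder, not the ID from the run above.
submit_output="job: 0000000-000000000000000-oozie-root-W"

# Strip the "job: " prefix to get the bare workflow ID.
job_id="${submit_output#job: }"

# Command that would query the workflow status (printed, not executed here).
info_cmd="bin/oozie job -oozie http://s101:11000/oozie -info ${job_id}"
echo "$info_cmd"
```

On the cluster, the status listing should show yinzhengjie-shell-node1 and yinzhengjie-shell-node2 as separate actions, which is a quick way to confirm that both scripts ran in the intended order.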