Log aggregation in Hadoop

The problem:

Clicking the logs link above produces the following error:

The solution is:

By default, Hadoop stores the logs of each container in the node where that container was hosted. While this is irrelevant if you're just testing some Hadoop executions in a single-node environment (as all the logs will be in your machine anyway), with a cluster of nodes, keeping track of the logs can become quite a bother. In addition, since logs are kept on the normal filesystem, you may run into storage problems if you keep logs for a long time or have heterogeneous storage capabilities.

Log aggregation is a new feature that allows Hadoop to store the logs of each application in a central directory in HDFS. To activate it, just add the following to yarn-site.xml and restart the Hadoop services:

<property>
  <description>Whether to enable log aggregation</description>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>

By adding this option, you're telling Hadoop to move the application logs to hdfs:///logs/userlogs/<your user>/<app id>. You can change this path (through yarn.nodemanager.remote-app-log-dir) and other options related to log aggregation by overriding the properties listed in yarn-default.xml (just do a search for log-aggregation).
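As a minimal sketch, the helper below builds the aggregated-log directory for one application, assuming the hdfs:///logs/userlogs/<your user>/<app id> layout described above. The function name aggregated_log_dir, the LOG_ROOT variable, and any sample IDs are illustrative and not part of Hadoop; adjust LOG_ROOT if your cluster writes its logs somewhere else.

```shell
#!/usr/bin/env bash
# Root of the aggregated logs; assumed from the layout described above.
LOG_ROOT="${LOG_ROOT:-hdfs:///logs/userlogs}"

# Print the directory expected to hold the aggregated logs of one application.
# Usage: aggregated_log_dir <user> <applicationId>
aggregated_log_dir() {
  local user="$1" app_id="$2"
  if [ -z "$user" ] || [ -z "$app_id" ]; then
    echo "usage: aggregated_log_dir <user> <applicationId>" >&2
    return 1
  fi
  # Strip a trailing slash from the root so the path has no double slash.
  printf '%s/%s/%s\n' "${LOG_ROOT%/}" "$user" "$app_id"
}
```

Once an application has finished, something like hdfs dfs -ls "$(aggregated_log_dir "$USER" <applicationId>)" should list the files that were aggregated for it.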

However, these aggregated logs are not stored in a human-readable format, so you can't just cat their contents. Fortunately, Hadoop developers have included several handy command line tools for reading them:

# Read logs from any YARN application

$HADOOP_HOME/bin/yarn logs -applicationId <applicationId>

# Read logs from MapReduce jobs

$HADOOP_HOME/bin/mapred job -logs <jobId>

# Read it in a scrollable window with search (type '/' followed by your query).

$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> | less

# Or just save it to a file and use your favourite editor

$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> > log.txt
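The two commands above take different identifiers. A MapReduce job ID and its YARN application ID normally share the same numeric suffix and differ only in the prefix (job_ versus application_), so one can be derived from the other. The sketch below assumes that convention holds on your cluster; job_to_app_id is a hypothetical helper name, not a Hadoop tool.

```shell
#!/usr/bin/env bash
# Convert a MapReduce job ID (job_<timestamp>_<sequence>) into the
# corresponding YARN application ID (application_<timestamp>_<sequence>).
job_to_app_id() {
  local job_id="$1"
  case "$job_id" in
    job_[0-9]*_[0-9]*)
      # Drop the "job_" prefix and put "application_" in its place.
      printf 'application_%s\n' "${job_id#job_}"
      ;;
    *)
      echo "not a MapReduce job ID: $job_id" >&2
      return 1
      ;;
  esac
}
```

With this in place, $HADOOP_HOME/bin/yarn logs -applicationId "$(job_to_app_id <jobId>)" reads the logs of a MapReduce job through the generic YARN command.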

You can also access these logs via a web app for MapReduce jobs by using the JobHistory daemon. This daemon can be started/stopped by running the following:

# Start JobHistory daemon

$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver

# Stop JobHistory daemon

$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver
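To confirm that the daemon actually came up, one option is to look for its process in the output of jps, the JDK tool that lists running JVMs. The sketch below assumes the daemon appears there under the class name JobHistoryServer and that jps is on your PATH; history_server_running is an illustrative function name.

```shell
#!/usr/bin/env bash
# Succeed if jps-style output on stdin lists a JobHistoryServer process.
# jps prints one "<pid> <main class>" pair per line.
history_server_running() {
  awk '$2 == "JobHistoryServer" { found = 1 } END { exit !found }'
}
```

A typical call would be jps | history_server_running && echo "JobHistory daemon is up", run on the node where the daemon was started.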

My Fabric script includes an optional variable for setting the node on which to launch this daemon, so it is automatically started/stopped when you run fab start or fab stop.

Unfortunately, a generic history daemon for universal web access to aggregated logs does not exist yet. However, as you can see by checking YARN-321, there's considerable work being done in this area. When this gets introduced I'll update this section.
