hadoop中日志聚集问题
遇到的问题:

当点击上面的logs时,会出现下面问题:

这个解决方案为:
By default, Hadoop stores the logs of each container in the node where that container was hosted. While this is irrelevant if you're just testing some Hadoop executions in a single-node environment (as all the logs will be in your machine anyway), with a cluster of nodes, keeping track of the logs can become quite a bother. In addition, since logs are kept on the normal filesystem, you may run into storage problems if you keep logs for a long time or have heterogeneous storage capabilities.
Log aggregation is a new feature that allows Hadoop to store the logs of each application in a central directory in HDFS. To activate it, just add the following to yarn-site.xmland restart the Hadoop services:
<property>
<description>Whether to enable log aggregation</description>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
By adding this option, you're telling Hadoop to move the application logs to hdfs:///logs/userlogs/<your user>/<app id>. You can change this path and other options related to log aggregation by specifying some other properties mentioned in the default yarn-site.xml (just do a search for log.aggregation).
However, these aggregated logs are not stored in a human readable format so you can't just cat their contents. Fortunately, Hadoop developers have included several handy command line tools for reading them:
# Read logs from any YARN application
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> # Read logs from MapReduce jobs
$HADOOP_HOME/bin/mapred job -logs <jobId> # Read it in a scrollable window with search (type '/' followed by your query).
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> | less # Or just save it to a file and use your favourite editor
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> > log.txt
You can also access these logs via a web app for MapReduce jobs by using the JobHistory daemon. This daemon can be started/stopped by running the following:
# Start JobHistory daemon
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver
# Stop JobHistory daemon
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver
My Fabric script includes an optional variable for setting the node where to launch this daemon so it is automatically started/stopped when you run fab start or fab stop.
Unfortunately, a generic history daemon for universal web access to aggregated logs does not exist yet. However, as you can see by checking YARN-321, there's considerable work being done in this area. When this gets introduced I'll update this section.
hadoop中日志聚集问题的更多相关文章
- Hadoop基础-完全分布式模式部署yarn日志聚集功能
Hadoop基础-完全分布式模式部署yarn日志聚集功能 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 其实我们不用配置也可以在服务器后台通过命令行的形式查看相应的日志,但为了更方 ...
- hadoop配置历史服务器&&配置日志聚集
配置历史服务器 1.在mapred-site.xml中写入一下配置 <property> <name>mapreduce.jobhistory.address</name ...
- hadoop 3.x 配置日志聚集功能
打开$HADOOP_HOME/etc/hadoop/yarn-site.xml,增加以下配置(在此配置文件中尽量不要使用中文注释) <!--logs--> <property> ...
- 开启spark日志聚集功能
spark监控应用方式: 1)在运行过程中可以通过web Ui:4040端口进行监控 2)任务运行完成想要监控spark,需要启动日志聚集功能 开启日志聚集功能方法: 编辑conf/spark-env ...
- Yarn 的日志聚集功能配置使用
需要 hadoop 的安装目录/etc/hadoop/yarn-site.xml 中进行配置 配置内容 <property> <name>yarn.log-aggregati ...
- 5,Hadoop中的文件
1,文件结构 · bin:脚本和命令目录. · etc:配置文件目录. · sbin:命令目录,主要包含HDFS和YARN中各类服务的启动和关闭,依赖于bin中的脚本. · share:各个模块编译后 ...
- 再谈SQL Server中日志的的作用
简介 之前我已经写了一个关于SQL Server日志的简单系列文章.本篇文章会进一步挖掘日志背后的一些概念,原理以及作用.如果您没有看过我之前的文章,请参阅: 浅谈SQL Server ...
- Hive分析hadoop进程日志
想把hadoop的进程日志导入hive表进行分析,遂做了以下的尝试. 关于hadoop进程日志的解析 使用正则表达式获取四个字段,一个是日期时间,一个是日志级别,一个是类,最后一个是详细信息, 然后在 ...
- hadoop中常见元素的解释
secondarynamenode 图: secondarynamenode根据文件的的大小对namenode的编辑日志和镜像日志 进行合并. 光从字面上来理解,很容易让一些初学者先入为主的认为:Se ...
随机推荐
- recursion lead to out of memory
There are two storage areas involved: the stack and the heap. The stack is where the current state o ...
- G家二面
题目都很基本,都属于听说过但是不会做的…都是操作系统,compiler的概念题… 概念题郁闷就郁闷在不会就是不会,就算能扯两句也会被问倒… 算法就一个,pow(x, y),5分钟不到…… 不是听说G家 ...
- struts2学习笔记(4)——数据类型转换
回过头来看昨天的那个例子. 在昨天的例子中,只转换了一个Point类,如果想转换多个Point类怎么办呢?在昨天的例子上面做一个小的修改. 首先在input.jsp页面中修改几个输入框. <s: ...
- 搭建hadoop环境,在win7的eclipse上远程操作(Linux上)hadoop2.6.0出错的一些总结
问题1:在DFS Lcation 上不能对文件进行操作: 解决方法: 在hadoop上的每个节点上修改该文件 conf/mapred-site.xml,增加: <property> < ...
- 日志工具logback的简介与配置
Logback是由log4j创始人设计的又一个开源日志组件.logback当前分成三个模块:logback-core,logback- classic和logback-access.logback-c ...
- JS操作Radio与Select
//radio的chang事件,以及获取选中的radio的值 $("input[name=radioName]").on("change", function( ...
- ibatis动态查询
在复杂查询过程中,我们常常需要根据用户的选择决定查询条件,这里发生变化的并不只是SQL 中的参数,包括Select 语句中所包括的字段和限定条件,都可能发生变化.典型情况,如在一个复杂的组合查询页面, ...
- Executing Raw SQL Queries using Entity Framework
原文 Executing Raw SQL Queries using Entity Framework While working with Entity Framework developers m ...
- java对象实例化
JAVA类,只要知道了类名(全名)就可以创建其实例对象,通用的方法是直接使用该类提供的构造方法,如 NewObject o = new NewObject(); NewObject o = new N ...
- Android列表视图(List View)
Android列表视图(ListView) ListView是一个显示滚动项列表的示视图组(viewgroup),通过使用适配器(Adapter)把这些列表项自动插入到列表中.适配器比如从一个数组或是 ...