遇到的问题:

当点击上面的logs时,会出现下面问题:

这个解决方案为:

By default, Hadoop stores the logs of each container in the node where that container was hosted. While this is irrelevant if you're just testing some Hadoop executions in a single-node environment (as all the logs will be in your machine anyway), with a cluster of nodes, keeping track of the logs can become quite a bother. In addition, since logs are kept on the normal filesystem, you may run into storage problems if you keep logs for a long time or have heterogeneous storage capabilities.

Log aggregation is a new feature that allows Hadoop to store the logs of each application in a central directory in HDFS. To activate it, just add the following to yarn-site.xmland restart the Hadoop services:

 <property>
<description>Whether to enable log aggregation</description>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>

By adding this option, you're telling Hadoop to move the application logs to hdfs:///logs/userlogs/<your user>/<app id>. You can change this path and other options related to log aggregation by specifying some other properties mentioned in the default yarn-site.xml (just do a search for log.aggregation).

However, these aggregated logs are not stored in a human readable format so you can't just cat their contents. Fortunately, Hadoop developers have included several handy command line tools for reading them:

# Read logs from any YARN application
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> # Read logs from MapReduce jobs
$HADOOP_HOME/bin/mapred job -logs <jobId> # Read it in a scrollable window with search (type '/' followed by your query).
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> | less # Or just save it to a file and use your favourite editor
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> > log.txt

You can also access these logs via a web app for MapReduce jobs by using the JobHistory daemon. This daemon can be started/stopped by running the following:

# Start JobHistory daemon
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver
# Stop JobHistory daemon
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver

My Fabric script includes an optional variable for setting the node where to launch this daemon so it is automatically started/stopped when you run fab start or fab stop.

Unfortunately, a generic history daemon for universal web access to aggregated logs does not exist yet. However, as you can see by checking YARN-321, there's considerable work being done in this area. When this gets introduced I'll update this section.

hadoop中日志聚集问题的更多相关文章

  1. Hadoop基础-完全分布式模式部署yarn日志聚集功能

    Hadoop基础-完全分布式模式部署yarn日志聚集功能 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 其实我们不用配置也可以在服务器后台通过命令行的形式查看相应的日志,但为了更方 ...

  2. hadoop配置历史服务器&&配置日志聚集

    配置历史服务器 1.在mapred-site.xml中写入一下配置 <property> <name>mapreduce.jobhistory.address</name ...

  3. hadoop 3.x 配置日志聚集功能

    打开$HADOOP_HOME/etc/hadoop/yarn-site.xml,增加以下配置(在此配置文件中尽量不要使用中文注释) <!--logs--> <property> ...

  4. 开启spark日志聚集功能

    spark监控应用方式: 1)在运行过程中可以通过web Ui:4040端口进行监控 2)任务运行完成想要监控spark,需要启动日志聚集功能 开启日志聚集功能方法: 编辑conf/spark-env ...

  5. Yarn 的日志聚集功能配置使用

    需要  hadoop 的安装目录/etc/hadoop/yarn-site.xml 中进行配置 配置内容 <property> <name>yarn.log-aggregati ...

  6. 5,Hadoop中的文件

    1,文件结构 · bin:脚本和命令目录. · etc:配置文件目录. · sbin:命令目录,主要包含HDFS和YARN中各类服务的启动和关闭,依赖于bin中的脚本. · share:各个模块编译后 ...

  7. 再谈SQL Server中日志的的作用

    简介     之前我已经写了一个关于SQL Server日志的简单系列文章.本篇文章会进一步挖掘日志背后的一些概念,原理以及作用.如果您没有看过我之前的文章,请参阅:     浅谈SQL Server ...

  8. Hive分析hadoop进程日志

    想把hadoop的进程日志导入hive表进行分析,遂做了以下的尝试. 关于hadoop进程日志的解析 使用正则表达式获取四个字段,一个是日期时间,一个是日志级别,一个是类,最后一个是详细信息, 然后在 ...

  9. hadoop中常见元素的解释

    secondarynamenode 图: secondarynamenode根据文件的的大小对namenode的编辑日志和镜像日志 进行合并. 光从字面上来理解,很容易让一些初学者先入为主的认为:Se ...

随机推荐

  1. POJ 3104

    Drying Time Limit: 2000MS   Memory Limit: 65536K Total Submissions: 7959   Accepted: 2014 Descriptio ...

  2. C# foreach循环绑定key数组和value 数组(备用)

    <div class="ContextualTab inner_warp clearfix" data-max="2" data-blur=false d ...

  3. hdu 4187 Alphabet Soup

    这题的主要就是找循环节数,这里用找字符串最小覆盖来实现,也就是n-next[n],证明在这http://blog.csdn.net/fjsd155/article/details/6866991 #i ...

  4. ASP.NET 免费开源控件

    AspNetPager分页控件(当前版本:7.5.1) AspNetPager分页控件是应用于ASP.NET WebForm网站或应用程序中的自定义分页控件,支持默认的回发(Postback)分页和U ...

  5. Sina App Engine(SAE)入门教程(9)- SaeMail(邮件)使用

    参考资料: SAE mail api 文档 怎么使用? 参见代码: <?php $mail = new SaeMail(); $f = new SaeFetchurl(); $img_data ...

  6. 开发版本控制git

    git init 在git命令行中依次输入 touch readme.txt并回车, git add . 点代表所有, git commit -m "init first"并回车, ...

  7. Android Cursor空指针的问题

    最近几天无聊自己动手写个音乐播放器,用到Cursor来取得数据库中音乐文件的信息,但是当用到Cursor的时候总是报空指针错误,后来发现是模拟器上没有音乐文件,使用Cursor的时候 ,若Cursor ...

  8. FileObverse文件观察者的Debug报告

    FileObverse文件观察者的Debug报告 2014年9月18日 9:03

  9. Oracle 数据集成的实际解决方案

    就针对市场与企业的发展的需求,Oracle公司提供了一个相对统一的关于企业级的实时数据解决方案,即Oracle数据集成的解决方案.以下的文章主要是对其解决方案的具体描述,望你会有所收获. Oracle ...

  10. hMailserver设置外部反病毒扫描程序

    刚在5dmail上发现有人提出一个问题,他在hmailserver的外部病毒扫描程序中使用了瑞星那个娱乐货,结果呢,说瑞星太勇猛了,所有附件都认为病毒了,这是怎么个情况呢? 先从hmailadmin里 ...