遇到的问题:

当点击上面的logs时,会出现下面问题:

这个解决方案为:

By default, Hadoop stores the logs of each container in the node where that container was hosted. While this is irrelevant if you're just testing some Hadoop executions in a single-node environment (as all the logs will be in your machine anyway), with a cluster of nodes, keeping track of the logs can become quite a bother. In addition, since logs are kept on the normal filesystem, you may run into storage problems if you keep logs for a long time or have heterogeneous storage capabilities.

Log aggregation is a new feature that allows Hadoop to store the logs of each application in a central directory in HDFS. To activate it, just add the following to yarn-site.xmland restart the Hadoop services:

 <property>
<description>Whether to enable log aggregation</description>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>

By adding this option, you're telling Hadoop to move the application logs to hdfs:///logs/userlogs/<your user>/<app id>. You can change this path and other options related to log aggregation by specifying some other properties mentioned in the default yarn-site.xml (just do a search for log.aggregation).

However, these aggregated logs are not stored in a human readable format so you can't just cat their contents. Fortunately, Hadoop developers have included several handy command line tools for reading them:

# Read logs from any YARN application
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> # Read logs from MapReduce jobs
$HADOOP_HOME/bin/mapred job -logs <jobId> # Read it in a scrollable window with search (type '/' followed by your query).
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> | less # Or just save it to a file and use your favourite editor
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> > log.txt

You can also access these logs via a web app for MapReduce jobs by using the JobHistory daemon. This daemon can be started/stopped by running the following:

# Start JobHistory daemon
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver
# Stop JobHistory daemon
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver

My Fabric script includes an optional variable for setting the node where to launch this daemon so it is automatically started/stopped when you run fab start or fab stop.

Unfortunately, a generic history daemon for universal web access to aggregated logs does not exist yet. However, as you can see by checking YARN-321, there's considerable work being done in this area. When this gets introduced I'll update this section.

hadoop中日志聚集问题的更多相关文章

  1. Hadoop基础-完全分布式模式部署yarn日志聚集功能

    Hadoop基础-完全分布式模式部署yarn日志聚集功能 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 其实我们不用配置也可以在服务器后台通过命令行的形式查看相应的日志,但为了更方 ...

  2. hadoop配置历史服务器&&配置日志聚集

    配置历史服务器 1.在mapred-site.xml中写入一下配置 <property> <name>mapreduce.jobhistory.address</name ...

  3. hadoop 3.x 配置日志聚集功能

    打开$HADOOP_HOME/etc/hadoop/yarn-site.xml,增加以下配置(在此配置文件中尽量不要使用中文注释) <!--logs--> <property> ...

  4. 开启spark日志聚集功能

    spark监控应用方式: 1)在运行过程中可以通过web Ui:4040端口进行监控 2)任务运行完成想要监控spark,需要启动日志聚集功能 开启日志聚集功能方法: 编辑conf/spark-env ...

  5. Yarn 的日志聚集功能配置使用

    需要  hadoop 的安装目录/etc/hadoop/yarn-site.xml 中进行配置 配置内容 <property> <name>yarn.log-aggregati ...

  6. 5,Hadoop中的文件

    1,文件结构 · bin:脚本和命令目录. · etc:配置文件目录. · sbin:命令目录,主要包含HDFS和YARN中各类服务的启动和关闭,依赖于bin中的脚本. · share:各个模块编译后 ...

  7. 再谈SQL Server中日志的的作用

    简介     之前我已经写了一个关于SQL Server日志的简单系列文章.本篇文章会进一步挖掘日志背后的一些概念,原理以及作用.如果您没有看过我之前的文章,请参阅:     浅谈SQL Server ...

  8. Hive分析hadoop进程日志

    想把hadoop的进程日志导入hive表进行分析,遂做了以下的尝试. 关于hadoop进程日志的解析 使用正则表达式获取四个字段,一个是日期时间,一个是日志级别,一个是类,最后一个是详细信息, 然后在 ...

  9. hadoop中常见元素的解释

    secondarynamenode 图: secondarynamenode根据文件的的大小对namenode的编辑日志和镜像日志 进行合并. 光从字面上来理解,很容易让一些初学者先入为主的认为:Se ...

随机推荐

  1. [转载]Spring Annotation Based Configuration

    Annotation injection is performed before XML injection, thus the latter configuration will override ...

  2. java基础知识回顾之---java StringBuffer类

    /*         * StringBuffer:就是字符串缓冲区,线程安全.         * 用于存储数据的容器.         * 特点:         * 1,长度的可变的.      ...

  3. MAC 上升级python为最新版本

    第1步:下载Python3.4 下载地址如下: 下载Mac OS X 64-bit/32-bit installer https://www.python.org/downloads/release/ ...

  4. HDU2521反素数

    只是了解下这种简单的数论定义,解释可以戳这个 http://www.cnblogs.com/Findxiaoxun/p/3460450.html ,然后按Ctrl+ F搜索   反素数  ,找到那一部 ...

  5. [iOS]如何给Label或者TextView赋HTML数据

    // // ViewController.m // text // // Created by 李东旭 on 16/1/22. // Copyright © 2016年 李东旭. All rights ...

  6. Netstat 命令

    简介 Netstat 命令用于显示各种网络相关信息,如网络连接,路由表,接口状态 (Interface Statistics),masquerade 连接,多播成员 (Multicast Member ...

  7. (三)C#关于txt文件的读取和写入

    using System; using System.Collections.Generic; using System.IO; using System.Linq; using System.Tex ...

  8. (二)CSS基础语法

    CSS语法规则由两个主要的部分构成:选择器,以及一条或者多条声明. 下面的示意图为您展示了CSS语法结构: 例如: h1{color:red;font-size:14px;} 值得不同写法和单位 其中 ...

  9. Java开发之单例设计模式

    设计模式之单例模式: 一.单例模式实现特点:①单例类在整个应用程序中只能有一个实例(通过私有无参构造器实现):②单例类必须自己创建这个实例并且可供其他对象访问(通过静态公开的访问权限修饰的getIns ...

  10. IP地址分类及私网IP

    5类IP地址: IP地址共有32位字节,其中A~C类IP地址由类标识号.网络地址和主机地址组成,A类标识最高位为0,网络地址为1字节,主机地址为3字节, B类标识最高位为10,网络地址为2字节,主机地 ...