4 weekend110: source-code analysis of how TextInputFormat plans splits + an MR implementation of an inverted index + submitting multiple jobs from the same main method



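OK, now let's walk through the weekend110 source-code analysis of how TextInputFormat plans input splits.
The default InputFormat is TextInputFormat; once you understand this one, the others follow the same pattern.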
















































This is the entry point for today's weekend110 source-code analysis of TextInputFormat split planning.















































[LocatedFileStatus{path=hdfs://weekend110:9000/wc/srcdata/words.log; isDirectory=false; length=90; replication=1; blocksize=134217728; modification_time=1469247371536; access_time=1469501356933; owner=hadoop; group=supergroup; permission=rw-r--r--; isSymlink=false}]













































[hdfs://weekend110:9000/wc/srcdata/words.log:0+90]
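
To make the split printed above easier to reason about, here is a minimal standalone sketch of the split-planning arithmetic. It is a simplified model, not Hadoop's actual code: the class and method names `SplitPlanSketch` and `planSplits` are made up for illustration, the minimum and maximum split sizes passed in `main` are assumed defaults, and block locations, zero-length files, and non-splittable files are ignored. The split-size formula and the 1.1 slop factor follow what `FileInputFormat` does in Hadoop 2.x, as far as I know.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, standalone model of how splits are planned for a single file.
class SplitPlanSketch {
    // The last chunk may be up to 10% larger than splitSize before it is cut again.
    static final double SPLIT_SLOP = 1.1;

    // splitSize = max(minSize, min(maxSize, blockSize))
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Returns splits formatted as "path:start+length", like the debugger output above.
    static List<String> planSplits(String path, long length, long blockSize,
                                   long minSize, long maxSize) {
        List<String> splits = new ArrayList<>();
        long splitSize = computeSplitSize(blockSize, minSize, maxSize);
        long remaining = length;
        // Cut full-size splits while more than 1.1 * splitSize bytes remain.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits.add(path + ":" + (length - remaining) + "+" + splitSize);
            remaining -= splitSize;
        }
        // Whatever is left becomes the final split.
        if (remaining != 0) {
            splits.add(path + ":" + (length - remaining) + "+" + remaining);
        }
        return splits;
    }

    public static void main(String[] args) {
        // Values taken from the LocatedFileStatus shown earlier: length=90, blocksize=134217728.
        // minSize=1 and maxSize=Long.MAX_VALUE are assumed defaults.
        System.out.println(planSplits("hdfs://weekend110:9000/wc/srcdata/words.log",
                90L, 134217728L, 1L, Long.MAX_VALUE));
    }
}
```

With a 90-byte file and a 128 MB block size, the loop never runs and the whole file ends up in a single split starting at offset 0, which is consistent with the `0+90` seen in the debugger.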







[hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp
Found 1 items
drwx------ - hadoop supergroup 0 2016-07-23 12:25 /tmp/hadoop-yarn
[hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn
Found 1 items
drwx------ - hadoop supergroup 0 2016-07-23 12:26 /tmp/hadoop-yarn/staging
[hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn/staging
Found 2 items
drwx------ - hadoop supergroup 0 2016-07-23 12:25 /tmp/hadoop-yarn/staging/hadoop
drwxr-xr-x - hadoop supergroup 0 2016-07-23 12:26 /tmp/hadoop-yarn/staging/history
[hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn/staging/history
Found 1 items
drwxrwxrwt - hadoop supergroup 0 2016-07-23 12:26 /tmp/hadoop-yarn/staging/history/done_intermediate
[hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn/staging/history/done_intermediate
Found 1 items
drwxrwx--- - hadoop supergroup 0 2016-07-28 09:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop
[hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop
Found 48 items
-rwxrwx--- 1 hadoop supergroup 32973 2016-07-23 12:29 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469234255012_0001-1469247921943-hadoop-wc.jar-1469248148068-1-1-SUCCEEDED-default-1469248027901.jhist
-rwxrwx--- 1 hadoop supergroup 347 2016-07-23 12:29 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469234255012_0001.summary
-rwxrwx--- 1 hadoop supergroup 91579 2016-07-23 12:29 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469234255012_0001_conf.xml
-rwxrwx--- 1 hadoop supergroup 32957 2016-07-25 19:45 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469446305412_0001-1469447061251-hadoop-wc.jar-1469447138744-1-1-SUCCEEDED-default-1469447093632.jhist
-rwxrwx--- 1 hadoop supergroup 347 2016-07-25 19:45 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469446305412_0001.summary
-rwxrwx--- 1 hadoop supergroup 91579 2016-07-25 19:45 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469446305412_0001_conf.xml
-rwxrwx--- 1 hadoop supergroup 33003 2016-07-26 20:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469500058449_0001-1469536528574-hadoop-flow.jar-1469536711053-1-1-SUCCEEDED-default-1469536621793.jhist
-rwxrwx--- 1 hadoop supergroup 349 2016-07-26 20:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469500058449_0001.summary
-rwxrwx--- 1 hadoop supergroup 91594 2016-07-26 20:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469500058449_0001_conf.xml
-rwxrwx--- 1 hadoop supergroup 32975 2016-07-27 09:07 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0001-1469581609069-hadoop-flow.jar-1469581669098-1-1-SUCCEEDED-default-1469581639942.jhist
-rwxrwx--- 1 hadoop supergroup 349 2016-07-27 09:07 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0001.summary
-rwxrwx--- 1 hadoop supergroup 91594 2016-07-27 09:07 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0001_conf.xml
-rwxrwx--- 1 hadoop supergroup 32966 2016-07-27 09:13 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0002-1469581980369-hadoop-flow.jar-1469582016624-1-1-SUCCEEDED-default-1469581991321.jhist
-rwxrwx--- 1 hadoop supergroup 348 2016-07-27 09:13 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0002.summary
-rwxrwx--- 1 hadoop supergroup 91594 2016-07-27 09:13 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0002_conf.xml
-rwxrwx--- 1 hadoop supergroup 32947 2016-07-27 09:34 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0003-1469583259497-hadoop-flow.jar-1469583283697-1-1-SUCCEEDED-default-1469583266059.jhist
-rwxrwx--- 1 hadoop supergroup 347 2016-07-27 09:34 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0003.summary
-rwxrwx--- 1 hadoop supergroup 91594 2016-07-27 09:34 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0003_conf.xml
-rwxrwx--- 1 hadoop supergroup 32973 2016-07-27 09:56 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0004-1469584535785-hadoop-flow.jar-1469584574236-1-1-SUCCEEDED-default-1469584549659.jhist
-rwxrwx--- 1 hadoop supergroup 347 2016-07-27 09:56 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0004.summary
-rwxrwx--- 1 hadoop supergroup 91594 2016-07-27 09:56 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0004_conf.xml
-rwxrwx--- 1 hadoop supergroup 32994 2016-07-27 16:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0001-1469609254627-hadoop-flowSort.jar-1469609480611-1-1-SUCCEEDED-default-1469609373636.jhist
-rwxrwx--- 1 hadoop supergroup 353 2016-07-27 16:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0001.summary
-rwxrwx--- 1 hadoop supergroup 91630 2016-07-27 16:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0001_conf.xml
-rwxrwx--- 1 hadoop supergroup 32989 2016-07-27 17:01 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0002-1469609990434-hadoop-flowSort.jar-1469610090600-1-1-SUCCEEDED-default-1469610004692.jhist
-rwxrwx--- 1 hadoop supergroup 353 2016-07-27 17:01 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0002.summary
-rwxrwx--- 1 hadoop supergroup 91622 2016-07-27 17:01 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0002_conf.xml
-rwxrwx--- 1 hadoop supergroup 52581 2016-07-27 22:28 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0001-1469629441509-hadoop-flowArea.jar-1469629695512-1-0-FAILED-default-1469629461365.jhist
-rwxrwx--- 1 hadoop supergroup 352 2016-07-27 22:28 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0001.summary
-rwxrwx--- 1 hadoop supergroup 91494 2016-07-27 22:28 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0001_conf.xml
-rwxrwx--- 1 hadoop supergroup 30548 2016-07-27 22:42 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0002-1469629856935-hadoop-flowArea.jar-1469630543551-1-0-FAILED-default-1469630477324.jhist
-rwxrwx--- 1 hadoop supergroup 350 2016-07-27 22:42 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0002.summary
-rwxrwx--- 1 hadoop supergroup 91494 2016-07-27 22:42 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0002_conf.xml
-rwxrwx--- 1 hadoop supergroup 30560 2016-07-27 22:55 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0003-1469630391568-hadoop-flowArea.jar-1469631307275-1-0-FAILED-default-1469631249046.jhist
-rwxrwx--- 1 hadoop supergroup 350 2016-07-27 22:55 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0003.summary
-rwxrwx--- 1 hadoop supergroup 91494 2016-07-27 22:55 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0003_conf.xml
-rwxrwx--- 1 hadoop supergroup 54558 2016-07-28 09:12 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0001-1469668063936-hadoop-flowArea.jar-1469668319036-1-0-FAILED-default-1469668087466.jhist
-rwxrwx--- 1 hadoop supergroup 352 2016-07-28 09:11 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0001.summary
-rwxrwx--- 1 hadoop supergroup 91494 2016-07-28 09:12 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0001_conf.xml
-rwxrwx--- 1 hadoop supergroup 30329 2016-07-28 09:25 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0002-1469669047716-hadoop-flow.jar-1469669116225-1-0-FAILED-default-1469669070963.jhist
-rwxrwx--- 1 hadoop supergroup 346 2016-07-28 09:25 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0002.summary
-rwxrwx--- 1 hadoop supergroup 91595 2016-07-28 09:25 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0002_conf.xml
-rwxrwx--- 1 hadoop supergroup 30331 2016-07-28 09:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0003-1469669444122-hadoop-flow.jar-1469669914163-1-0-FAILED-default-1469669867080.jhist
-rwxrwx--- 1 hadoop supergroup 346 2016-07-28 09:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0003.summary
-rwxrwx--- 1 hadoop supergroup 91595 2016-07-28 09:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0003_conf.xml
-rwxrwx--- 1 hadoop supergroup 32950 2016-07-28 09:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0004-1469670210160-hadoop-flow.jar-1469670688549-1-1-SUCCEEDED-default-1469670670491.jhist
-rwxrwx--- 1 hadoop supergroup 347 2016-07-28 09:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0004.summary
-rwxrwx--- 1 hadoop supergroup 91619 2016-07-28 09:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0004_conf.xml
[hadoop@weekend110 ~]$














file:/tmp/hadoop-Administrator/mapred/staging/Administrator1242101173/.staging/job_local1242101173_0001/job.xml



job-id : job_local1242101173_0001uber-mode : falsemap-progress : 1.0reduce-progress : 1.0cleanup-progress : 1.0setup-progress : 1.0runstate : SUCCEEDEDstart-time : 0user-name : Administratorpriority : NORMALscheduling-info : NAnum-used-slots0num-reserved-slots0used-mem0reserved-mem0needed-mem0

Subject:
Principal: NTUserPrincipal: Administrator
Principal: NTSidUserPrincipal: S-1-5-21-2155837731-1039603112-1552600933-500
Principal: NTDomainPrincipal: WIN-BQOBV63OBNM
Principal: NTSidDomainPrincipal: S-1-5-21-2155837731-1039603112-1552600933
Principal: NTSidPrimaryGroupPrincipal: S-1-5-21-2155837731-1039603112-1552600933-513
Principal: NTSidGroupPrincipal: S-1-1-0
Principal: NTSidGroupPrincipal: S-1-5-114
Principal: NTSidGroupPrincipal: S-1-5-32-544
Principal: NTSidGroupPrincipal: S-1-5-32-545
Principal: NTSidGroupPrincipal: S-1-5-4
Principal: NTSidGroupPrincipal: S-1-2-1
Principal: NTSidGroupPrincipal: S-1-5-11
Principal: NTSidGroupPrincipal: S-1-5-15
Principal: NTSidGroupPrincipal: S-1-5-113
Principal: NTSidGroupPrincipal: S-1-5-5-0-112222
Principal: NTSidGroupPrincipal: S-1-2-0
Principal: NTSidGroupPrincipal: S-1-5-64-10
Principal: NTSidGroupPrincipal: S-1-16-12288
Principal: Administrator
Public Credential: NTNumericCredential: 2088
Private Credential: org.apache.hadoop.security.Credentials@77084cb5







That concludes the weekend110 source-code analysis of TextInputFormat split planning.





Building an index


Now let's look at an MR program that implements an inverted index.



























I see:
When we analyzed the split-planning source code earlier, we saw that an InputSplit carries the block information, the file path, and so on.
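
Because the split knows its file path, a mapper can tag every word with the name of the file it came from. In Hadoop's new API this is usually done by casting `context.getInputSplit()` to `FileSplit` and calling `getPath().getName()`. The sketch below is plain Java rather than Hadoop code, so it runs without a cluster; it only simulates what step one of the inverted index might compute. The class name, the `-->` separator, and the sample file contents are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of step one: count occurrences per (word, file) pair.
class InverseIndexStepOneSketch {
    // Input: file name -> file contents. Output: "word-->fileName" -> count.
    static Map<String, Integer> stepOne(Map<String, String> filesByName) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, String> file : filesByName.entrySet()) {
            // In a real mapper, the file name would come from the InputSplit.
            String fileName = file.getKey();
            for (String word : file.getValue().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                // "Map" emits (word-->fileName, 1); "reduce" sums the ones.
                counts.merge(word + "-->" + fileName, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from /ii/data.
        Map<String, String> files = new TreeMap<>();
        files.put("a.txt", "hello tom hello jerry");
        files.put("b.txt", "hello tom");
        stepOne(files).forEach((key, count) -> System.out.println(key + "\t" + count));
    }
}
```

Putting the file name into the key means the framework's ordinary grouping by key does the per-file counting; no custom data structure is needed in the reducer.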












































































[hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop jar ii.jar cn.itcast.hadoop.mr.ii.InverseIndexStepOne /ii/data /ii/stepone





Why does this work? Because,




































































Take this result and use it as the input.
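
Here is a rough sketch of what a second step could do with that input: regroup the per-file counts under each word. Again this is plain Java, not the actual job. The line format `word-->file<TAB>count`, the output format, and the class name are assumptions, since the real output of step one is only visible in the screenshots.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of step two: turn "word-->file<TAB>count" lines into one entry per word.
class InverseIndexStepTwoSketch {
    static Map<String, String> stepTwo(List<String> stepOneLines) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : stepOneLines) {
            String[] keyAndCount = line.split("\t");          // ["word-->file", "count"]
            String[] wordAndFile = keyAndCount[0].split("-->"); // ["word", "file"]
            // "Map" re-keys by word alone; "reduce" collects the file-->count postings.
            grouped.computeIfAbsent(wordAndFile[0], w -> new ArrayList<>())
                   .add(wordAndFile[1] + "-->" + keyAndCount[1]);
        }
        Map<String, String> index = new TreeMap<>();
        grouped.forEach((word, postings) -> index.put(word, String.join(" ", postings)));
        return index;
    }

    public static void main(String[] args) {
        // Hypothetical step-one output lines.
        List<String> lines = Arrays.asList(
                "hello-->a.txt\t2", "hello-->b.txt\t1", "tom-->a.txt\t1");
        stepTwo(lines).forEach((word, postings) -> System.out.println(word + "\t" + postings));
    }
}
```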




That concludes the weekend110 MR implementation of the inverted index.



Next: submitting multiple jobs from the same main method.


To sum up: this approach is not recommended. Here, of course, it's just for fun.
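
For reference, the control flow of submitting several jobs from one main method can be sketched as follows. This is a plain-Java stand-in, not Hadoop code: each `BooleanSupplier` is a hypothetical placeholder for a call like `job.waitForCompletion(true)`, which blocks until the job finishes and returns whether it succeeded. The class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Sketch of running dependent jobs one after another from a single main method.
class JobChainSketch {
    // Runs the jobs in order and stops at the first failure.
    // Returns 0 if every job succeeded, 1 otherwise (usable as a process exit code).
    static int runInOrder(List<BooleanSupplier> jobs) {
        for (BooleanSupplier job : jobs) {
            if (!job.getAsBoolean()) {
                return 1; // a later job must not run on missing input
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<BooleanSupplier> jobs = new ArrayList<>();
        // Stand-ins for stepOneJob.waitForCompletion(true) and stepTwoJob.waitForCompletion(true).
        jobs.add(() -> { log.add("step one"); return true; });
        jobs.add(() -> { log.add("step two"); return true; });
        int exitCode = runInOrder(jobs);
        System.out.println(log + " exit=" + exitCode);
    }
}
```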