Counting how many times a few chosen words appear in an article

The -file option ships local files from the client to every machine that runs the MapReduce tasks.

1. Pick an article, The_Man_of_Property.txt, shown below:

He was proud of him! He could not but feel that in similar circumstances he himself would have been tempted to enlarge his replies, but his instinct told him that this taciturnity was the very thing. He sighed with relief, however, when Soames, slowly turning, and without any change of expression, descended from the box.
When it came to the turn of Bosinney’s Counsel to address the Judge, James redoubled his attention, and he searched the Court again and again to see if Bosinney were not somewhere concealed.
Young Chankery began nervously; he was placed by Bosinney’s absence in an awkward position. He therefore did his best to turn that absence to account.
He could not but fear — he said — that his client had met with an accident. He had fully expected him there to give evidence; they had sent round that morning both to Mr. Bosinney’s office and to his rooms (though he knew they were one and the same, he thought it was as well not to say so), but it was not known where he was, and this he considered to be ominous, knowing how anxious Mr. Bosinney had been to give his evidence. He had not, however, been instructed to apply for an adjournment, and in default of such instruction he conceived it his duty to go on. The plea on which he somewhat confidently relied, and which his client, had he not unfortunately been prevented in some way from attending, would have supported by his evidence, was that such an expression as a ‘free hand’ could not be limited, fettered, and rendered unmeaning, by any verbiage which might follow it. He would go further and say that the correspondence showed that whatever he might have said in his evidence, Mr. Forsyte had in fact never contemplated repudiating liability on any of the work ordered or executed by his architect. The defendant had certainly never contemplated such a contingency, or, as was demonstrated by his letters, he would never have proceeded with the work — a work of extreme delicacy, carried out with great care and efficiency, to meet and satisfy the fastidious taste of a connoisseur, a rich man, a man of property. He felt strongly on this point, and feeling strongly he used, perhaps, rather strong words when he said that this action was of a most unjustifiable, unexpected, indeed — unprecedented character. 
If his Lordship had had the opportunity that he himself had made it his duty to take, to go over this very fine house and see the great delicacy and beauty of the decorations executed by his client — an artist in his most honourable profession — he felt convinced that not for one moment would his Lordship tolerate this, he would use no stronger word than daring attempt to evade legitimate responsibility.
Taking the text of Soames’ letters, he lightly touched on ‘Boileau v. The Blasted Cement Company, Limited.’ “It is doubtful,” he said, “what that authority has decided; in any case I would submit that it is just as much in my favour as in my friend’s.” He then argued the ‘nice point’ closely. With all due deference he submitted that Mr. Forsyte’s expression nullified itself. His client not being a rich man, the matter was a serious one for him; he was a very talented architect, whose professional reputation was undoubtedly somewhat at stake. He concluded with a perhaps too personal appeal to the Judge, as a lover of the arts, to show himself the protector of artists, from what was occasionally — he said occasionally — the too iron hand of capital. “What,” he said, “will be the position of the artistic professions, if men of property like this Mr. Forsyte refuse, and are allowed to refuse, to carry out the obligations of the commissions which they have given.” He would now call his client, in case he should at the last moment have found himself able to be present.
The name Philip Baynes Bosinney was called three times by the Ushers, and the sound of the calling echoed with strange melancholy throughout the Court and Galleries.
The crying of this name, to which no answer was returned, had upon James a curious effect: it was like calling for your lost dog about the streets. And the creepy feeling that it gave him, of a man missing, grated on his sense of comfort and security-on his cosiness. Though he could not have said why, it made him feel uneasy.
He looked now at the clock — a quarter to three! It would be all over in a quarter of an hour. Where could the young fellow be?
It was only when Mr. Justice Bentham delivered judgment that he got over the turn he had received.
Behind the wooden erection, by which he was fenced from more ordinary mortals, the learned Judge leaned forward. The electric light, just turned on above his head, fell on his face, and mellowed it to an orange hue beneath the snowy crown of his wig; the amplitude of his robes grew before the eye; his whole figure, facing the comparative dusk of the Court, radiated like some majestic and sacred body. He cleared his throat, took a sip of water, broke the nib of a quill against the desk, and, folding his bony hands before him, began.
A divorce! Thus close, the word was paralyzing, so utterly at variance with all the principles that had hitherto guided his life. Its lack of compromise appalled him; he felt — like the captain of a ship, going to the side of his vessel, and, with his own hands throwing over the most precious of his bales. This jettisoning of his property with his own hand seemed uncanny to Soames. It would injure him in his profession: He would have to get rid of the house at Robin Hill, on which he had spent so much money, so much anticipation — and at a sacrifice. And she! She would no longer belong to him, not even in name! She would pass out of his life, and he — he should never see her again!
He traversed in the cab the length of a street without getting beyond the thought that he should never see her again!
r her hair; and at this scent the burning sickness of his jealousy seized him again.
Struggling into his fur,the watch was a three-cornered note addressed ‘Soames Forsyte,’ in‘Ierceived under the softness and immobility of this figure something desperate and resolved; something not to be turned away, something dangerous. She tore off her hat, and, putting both hands to her brow, pressed back the bronze mass of her hair.

2. Upload the article to the /mapreduce directory on HDFS

hadoop fs -put The_Man_of_Property.txt /mapreduce

3. The whitelist file white_list is shown below. The job counts how many times each word in it appears in The_Man_of_Property.txt

suitable
against
recent
across
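
Before submitting anything to the cluster, it can help to know what kind of result to expect. The sketch below is not from the original post: it counts whitelisted words in a string in plain Python, using the same whitespace tokenization as map.py. The function name count_whitelisted and the sample sentence are illustrative assumptions.

```python
from collections import Counter


def count_whitelisted(text, white_list):
    """Count whitespace-separated tokens of text that appear in white_list."""
    wanted = set(white_list)
    # split() with no arguments splits on any run of whitespace, so
    # punctuation stays attached to a word: "against," is not "against".
    return Counter(tok for tok in text.split() if tok in wanted)


white_list = ["suitable", "against", "recent", "across"]
sample = "broke the nib of a quill against the desk"  # fragment of the article
counts = count_whitelisted(sample, white_list)
for word in white_list:
    print("%s\t%d" % (word, counts[word]))
```

Unlike this sketch, the Hadoop job emits no line at all for a whitelisted word that never occurs, because the mapper produces no record for it.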

4. map.py (the map-side code):

#!/usr/bin/python
import sys


def read_local_file(path):
    # Load the whitelist (one word per line) into a set for fast lookup.
    word_set = set()
    with open(path, 'r') as file_in:
        for line in file_in:
            word = line.strip()
            if word:
                word_set.add(word)
    return word_set


def mapper_func(path):
    word_set = read_local_file(path)
    for line in sys.stdin:
        for word in line.strip().split():
            # Emit "word<TAB>1" for every whitelisted word; the reducer sums the 1s.
            if word in word_set:
                print("%s\t%s" % (word, 1))


if __name__ == "__main__":
    # Usage: python map.py mapper_func white_list
    func = getattr(sys.modules[__name__], sys.argv[1])
    args = sys.argv[2:]
    func(*args)
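
The mapper reads from standard input, which makes it awkward to check without a cluster. A minimal sketch, assuming the per-line logic is pulled out into a generator that takes any iterable of lines (the name map_lines and the two sample lines are hypothetical, not part of the original script):

```python
import io


def map_lines(lines, word_set):
    """Yield one 'word<TAB>1' record per whitelisted token."""
    for line in lines:
        for word in line.strip().split():
            if word in word_set:
                yield "%s\t1" % word


# An in-memory stream stands in for sys.stdin.
fake_stdin = io.StringIO("he leaned against the wall\nacross the street, against all advice\n")
for record in map_lines(fake_stdin, {"against", "across"}):
    print(record)
```

Passing the input stream as a parameter keeps the logic identical to map.py while letting it run on a list of strings or a file object.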

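5. red.py (the reduce-side code):

#!/usr/bin/python
import sys


def reducer_func():
    # Reducer input arrives sorted by key, so equal words are adjacent.
    word = None
    total = 0
    for line in sys.stdin:
        ss = line.strip().split('\t')
        if len(ss) != 2:
            continue  # skip malformed lines
        cur_word, cnt = ss[0], int(ss[1])
        if cur_word != word:
            # A new word starts: flush the previous word's total.
            if word is not None:
                print("%s\t%s" % (word, total))
            word = cur_word
            total = 0
        total += cnt
    # Flush the last word, if any input was seen.
    if word is not None:
        print("%s\t%s" % (word, total))


if __name__ == "__main__":
    # Usage: python red.py reducer_func
    func = getattr(sys.modules[__name__], sys.argv[1])
    args = sys.argv[2:]
    func(*args)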
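
Because Hadoop streaming hands the reducer its input sorted by key, the "track the current word" loop can also be written with itertools.groupby. The following is an alternative sketch, not the author's code. The name reduce_sorted and the sample records are assumptions, and sorted() merely stands in for the shuffle/sort phase.

```python
from itertools import groupby


def reduce_sorted(records):
    """Sum the counts of adjacent records that share the same key.

    records must already be sorted by key, as reducer input is.
    """
    pairs = (rec.rstrip("\n").split("\t") for rec in records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(int(cnt) for _, cnt in group)


mapper_output = ["against\t1", "across\t1", "against\t1"]
# sorted() plays the role of the sort between map and reduce.
for word, total in reduce_sorted(sorted(mapper_output)):
    print("%s\t%d" % (word, total))
```

groupby only merges neighbouring items, so feeding it unsorted records would produce several partial totals for the same word.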

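6. The run.sh script: it requests 2 map tasks and deletes the output path first (MapReduce refuses to run if the output path already exists)

HADOOP="/usr/local/src/hadoop-1.2.1/bin/hadoop"
HADOOP_STREAMING="/usr/local/src/hadoop-1.2.1/contrib/streaming/hadoop-streaming-1.2.1.jar"
INPUT_PATH="/mapreduce/The_Man_of_Property.txt"
OUTPUT_PATH="/mapreduce/out"

# Remove any previous output; the job fails if the output path exists.
$HADOOP fs -rmr "$OUTPUT_PATH"

# Each -file ships a local file to the task machines, where it is
# available in the task's working directory under its bare name.
$HADOOP jar "$HADOOP_STREAMING" \
    -input "$INPUT_PATH" \
    -output "$OUTPUT_PATH" \
    -mapper "python map.py mapper_func white_list" \
    -reducer "python red.py reducer_func" \
    -file "./map.py" \
    -file "./red.py" \
    -file "./white_list" \
    -jobconf "mapred.map.tasks=2"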

7. Run the shell script

bash run.sh
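
Once the job finishes, the result is written as text files under /mapreduce/out. A minimal sketch, assuming the output has been fetched as text (for example by piping the part files from hadoop fs -cat into a script), of parsing the tab-separated lines back into a dictionary. The function name parse_counts is hypothetical and the numbers in the sample are made up, not real job output.

```python
def parse_counts(text):
    """Parse 'word<TAB>count' lines, as written by red.py, into a dict."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        word, cnt = line.split("\t")
        counts[word] = int(cnt)
    return counts


made_up_output = "across\t3\nagainst\t5\n"  # illustrative numbers only
print(parse_counts(made_up_output))
```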
