Word histogram
Here is a program that reads a file and builds a histogram of the words in the file:
process_file loops through the lines of the file, passing them one at a time to process_line. The histogram h is being used as an accumulator. process_line uses the string method replace to replace hyphens with spaces before using split to break the line into a list of strings. It traverses the list of words and uses strip and lower to remove punctuation and convert to lower case. (It is a shorthand to say that strings are ‘converted;’ remember that string are immutable, so methods like strip and lower return new strings.)
Finally, process_line updates the histogram by creating a new item incrementing an existing one. To count the total number of words in the file, we can add up the frequencies in the histogram:
from Thinking in Python
Word histogram的更多相关文章
- {ICIP2014}{收录论文列表}
This article come from HEREARS-L1: Learning Tuesday 10:30–12:30; Oral Session; Room: Leonard de Vinc ...
- Create and format Word documents using R software and Reporters package
http://www.sthda.com/english/wiki/create-and-format-word-documents-using-r-software-and-reporters-pa ...
- 84. Largest Rectangle in Histogram *HARD* -- 柱状图求最大面积 85. Maximal Rectangle *HARD* -- 求01矩阵中的最大矩形
1. Given n non-negative integers representing the histogram's bar height where the width of each bar ...
- Bag of word based image retrieval
主要参考维基百科Bag of Word 在DLP领域里,bow(bag of word)是一个稀疏的向量,向量的每个元素记录词的出现次数,相当于对每篇文章都关于词典做词的直方图统计.同样的道理用在co ...
- Word/Excel 在线预览
前言 近日项目中做到一个功能,需要上传附件后能够在线预览.之前也没做过这类似的,于是乎就查找了相关资料,.net实现Office文件预览大概有这几种方式: ① 使用Microsoft的Office组件 ...
- C#中5步完成word文档打印的方法
在日常工作中,我们可能常常需要打印各种文件资料,比如word文档.对于编程员,应用程序中文档的打印是一项非常重要的功能,也一直是一个非常复杂的工作.特别是提到Web打印,这的确会很棘手.一般如果要想选 ...
- C# 给word文档添加水印
和PDF一样,在word中,水印也分为图片水印和文本水印,给文档添加图片水印可以使文档变得更为美观,更具有吸引力.文本水印则可以保护文档,提醒别人该文档是受版权保护的,不能随意抄袭.前面我分享了如何给 ...
- 获取打开的Word文档
using Word = Microsoft.Office.Interop.Word; int _getApplicationErrorCount=0; bool _isMsOffice = true ...
- How to accept Track changes in Microsoft Word 2010?
"Track changes" is wonderful and remarkable tool of Microsoft Word 2010. The feature allow ...
随机推荐
- java 基础概念 -- 数组与内存控制
问题1: Java在声明数组的过程中,是怎样分配内存的? 在栈内存中 建一个数组变量,再在堆内存中 建一个 数组对象.至于详细的内存分配细节,还得看 该初始化是 数组动态初始化 还是 数组静态初始化. ...
- Mysql数据库事务的隔离级别和锁的实现原理分析
Mysql数据库事务的隔离级别和锁的实现原理分析 找到大神了:http://blog.csdn.net/tangkund3218/article/details/51753243 InnoDB使用MV ...
- [Javascirpt] Developer-friendly Flow Charts with flowchart.js
Flowchart.js is a great tool for creating quick, simple flowcharts in a way that keeps you out of a ...
- 【Android】桌面歌词悬浮效果简单实现
在使用"网易云音乐"的时候,发现有一个显示"桌面歌词"的功能,于是就想着自己实现下.查了下资料,是用WindowManage实现的.实现过程中也出现了些问题,看 ...
- POI 导入excel数据自己主动封装成model对象--代码分析
上完代码后,对代码进行基本的分析: 1.主要使用反射api将数数据注入javabean对象 2.代码中的日志信息级别为debug级别 3.获取ExcelImport对象后须要调用init()方法初始化 ...
- C++ STL 源代码学习(之deque篇)
stl_deque.h /** Class invariants: * For any nonsingular iterator i: * i.node is the address of an el ...
- LeetCode OJ 215. Kth Largest Element in an Array 堆排序求解
题目链接:https://leetcode.com/problems/kth-largest-element-in-an-array/ 215. Kth Largest Element in an A ...
- spring web mvc第一天
spring web mvc 感觉就是高大上啊!啥都是配置文件就能够了.所以第一步就是弄清楚配置文件使用和总体框架的流程! Spring web mvc最重要的当然是Controller,也就是首先 ...
- HDU 5296 Annoying problem dfs序 lca set
Annoying problem Problem Description Coco has a tree, whose nodes are conveniently labeled by 1,2,…, ...
- the process android.process.acore has stopped或the process com.phone。。。。
模拟器一启动 The process android.process.acore has stopped unexpectedly 今天不知道怎么回事,模拟器一启动就狂报错, 模拟器已经重新安装过了, ...