Here is a program that reads a file and builds a histogram of the words in the file:

process_file loops through the lines of the file, passing them one at a time to process_line. The histogram h is being used as an accumulator. process_line uses the string method replace to replace hyphens with spaces before using split to break the line into a list of strings. It traverses the list of words and uses strip and lower to remove punctuation and convert to lower case. (It is a shorthand to say that strings are ‘converted;’ remember that string are immutable, so methods like strip and lower return new strings.)

Finally, process_line updates the histogram by creating a new item incrementing an existing one. To count the total number of words in the file, we can add up the frequencies in the histogram:

from Thinking in Python

Word histogram的更多相关文章

  1. {ICIP2014}{收录论文列表}

    This article come from HEREARS-L1: Learning Tuesday 10:30–12:30; Oral Session; Room: Leonard de Vinc ...

  2. Create and format Word documents using R software and Reporters package

    http://www.sthda.com/english/wiki/create-and-format-word-documents-using-r-software-and-reporters-pa ...

  3. 84. Largest Rectangle in Histogram *HARD* -- 柱状图求最大面积 85. Maximal Rectangle *HARD* -- 求01矩阵中的最大矩形

    1. Given n non-negative integers representing the histogram's bar height where the width of each bar ...

  4. Bag of word based image retrieval

    主要参考维基百科Bag of Word 在DLP领域里,bow(bag of word)是一个稀疏的向量,向量的每个元素记录词的出现次数,相当于对每篇文章都关于词典做词的直方图统计.同样的道理用在co ...

  5. Word/Excel 在线预览

    前言 近日项目中做到一个功能,需要上传附件后能够在线预览.之前也没做过这类似的,于是乎就查找了相关资料,.net实现Office文件预览大概有这几种方式: ① 使用Microsoft的Office组件 ...

  6. C#中5步完成word文档打印的方法

    在日常工作中,我们可能常常需要打印各种文件资料,比如word文档.对于编程员,应用程序中文档的打印是一项非常重要的功能,也一直是一个非常复杂的工作.特别是提到Web打印,这的确会很棘手.一般如果要想选 ...

  7. C# 给word文档添加水印

    和PDF一样,在word中,水印也分为图片水印和文本水印,给文档添加图片水印可以使文档变得更为美观,更具有吸引力.文本水印则可以保护文档,提醒别人该文档是受版权保护的,不能随意抄袭.前面我分享了如何给 ...

  8. 获取打开的Word文档

    using Word = Microsoft.Office.Interop.Word; int _getApplicationErrorCount=0; bool _isMsOffice = true ...

  9. How to accept Track changes in Microsoft Word 2010?

    "Track changes" is wonderful and remarkable tool of Microsoft Word 2010. The feature allow ...

随机推荐

  1. java 基础概念 -- 数组与内存控制

    问题1: Java在声明数组的过程中,是怎样分配内存的? 在栈内存中 建一个数组变量,再在堆内存中 建一个 数组对象.至于详细的内存分配细节,还得看 该初始化是 数组动态初始化 还是 数组静态初始化. ...

  2. Mysql数据库事务的隔离级别和锁的实现原理分析

    Mysql数据库事务的隔离级别和锁的实现原理分析 找到大神了:http://blog.csdn.net/tangkund3218/article/details/51753243 InnoDB使用MV ...

  3. [Javascirpt] Developer-friendly Flow Charts with flowchart.js

    Flowchart.js is a great tool for creating quick, simple flowcharts in a way that keeps you out of a ...

  4. 【Android】桌面歌词悬浮效果简单实现

    在使用"网易云音乐"的时候,发现有一个显示"桌面歌词"的功能,于是就想着自己实现下.查了下资料,是用WindowManage实现的.实现过程中也出现了些问题,看 ...

  5. POI 导入excel数据自己主动封装成model对象--代码分析

    上完代码后,对代码进行基本的分析: 1.主要使用反射api将数数据注入javabean对象 2.代码中的日志信息级别为debug级别 3.获取ExcelImport对象后须要调用init()方法初始化 ...

  6. C++ STL 源代码学习(之deque篇)

    stl_deque.h /** Class invariants: * For any nonsingular iterator i: * i.node is the address of an el ...

  7. LeetCode OJ 215. Kth Largest Element in an Array 堆排序求解

    题目链接:https://leetcode.com/problems/kth-largest-element-in-an-array/ 215. Kth Largest Element in an A ...

  8. spring web mvc第一天

    spring  web mvc 感觉就是高大上啊!啥都是配置文件就能够了.所以第一步就是弄清楚配置文件使用和总体框架的流程! Spring web mvc最重要的当然是Controller,也就是首先 ...

  9. HDU 5296 Annoying problem dfs序 lca set

    Annoying problem Problem Description Coco has a tree, whose nodes are conveniently labeled by 1,2,…, ...

  10. the process android.process.acore has stopped或the process com.phone。。。。

    模拟器一启动 The process android.process.acore has stopped unexpectedly 今天不知道怎么回事,模拟器一启动就狂报错, 模拟器已经重新安装过了, ...