【LeetCode】819. Most Common Word 解题报告(Python)
作者: 负雪明烛
id: fuxuemingzhu
个人博客: http://fuxuemingzhu.cn/
题目地址:https://leetcode.com/problems/most-common-word/description/
题目描述
Given a paragraph and a list of banned words, return the most frequent word that is not in the list of banned words. It is guaranteed there is at least one word that isn’t banned, and that the answer is unique.
Words in the list of banned words are given in lowercase, and free of punctuation. Words in the paragraph are not case sensitive. The answer is in lowercase.
Example:
Input:
paragraph = "Bob hit a ball, the hit BALL flew far after it was hit."
banned = ["hit"]
Output: "ball"
Explanation:
"hit" occurs 3 times, but it is a banned word.
"ball" occurs twice (and no other word does), so it is the most frequent non-banned word in the paragraph.
Note that words in the paragraph are not case sensitive,
that punctuation is ignored (even if adjacent to words, such as "ball,"),
and that "hit" isn't the answer even though it occurs more because it is banned.
Note:
1 <= paragraph.length <= 1000.1 <= banned.length <= 100.1 <= banned[i].length <= 10.- The answer is unique, and written in lowercase (even if its occurrences in paragraph may have uppercase symbols, and even if it is a proper noun.)
- paragraph only consists of letters, spaces, or the punctuation symbols
!?',;. - Different words in paragraph are always separated by a space.
- There are no hyphens or hyphenated words.
- Words only consist of letters, never apostrophes or other punctuation symbols.
题目大意
给出了一段文字,并给出了一个过滤词的列表。要把该段文字全部转化为小写,过滤掉标点符号和过滤词。在剩下的单词里,找出出现次数最多的词。
解题方法
正则+统计
字符串问题都不是问题。
因为要过滤掉标点,所以可以使用很多次的replace函数,但这样太蠢了。直接正则搞定,使用sub函数可以实现替换,即把出现过的标点符号替换成空格" "。并且把字符串转成小写。
例如对于case:
"a, a, a, a, b,b,b,c, c"
["a"]
经过正则把标点替换成空格之后的words的结果:
['a', '', 'a', '', 'a', '', 'a', '', 'b', 'b', 'b', 'c', '', 'c']
剩下的工作就比较简单了:找出不是空字符串并且不在过滤词表里的词,使用Counter统计词频。
most_common函数的参数设为1表示找出出现次数最多的词,返回的格式是[["hit",3]]。
Python代码如下:
class Solution:
def mostCommonWord(self, paragraph, banned):
"""
:type paragraph: str
:type banned: List[str]
:rtype: str
"""
p = re.compile(r"[!?',;.]")
sub_para = p.sub(' ', paragraph.lower())
words = sub_para.split(' ')
words = [word for word in words if word and word not in banned]
count = collections.Counter(words)
return count.most_common(1)[0][0]
另外一种正则方法是,只保留字符串。写法如下:
class Solution(object):
def mostCommonWord(self, paragraph, banned):
"""
:type paragraph: str
:type banned: List[str]
:rtype: str
"""
paragraph = re.findall(r"\w+", paragraph.lower())
count = collections.Counter(x for x in paragraph if x not in banned)
return count.most_common(1)[0][0]
日期
2018 年 5 月 27 日 —— 周末的天气很好~
2018 年 11 月 19 日 —— 周一又开始了
2020 年 3 月 26 日 —— 感谢本文评论给出的case
【LeetCode】819. Most Common Word 解题报告(Python)的更多相关文章
- LeetCode 819. Most Common Word (最常见的单词)
Given a paragraph and a list of banned words, return the most frequent word that is not in the list ...
- LeetCode 748 Shortest Completing Word 解题报告
题目要求 Find the minimum length word from a given dictionary words, which has all the letters from the ...
- LeetCode 819. Most Common Word
原题链接在这里:https://leetcode.com/problems/most-common-word/description/ 题目: Given a paragraph and a list ...
- 【LeetCode】62. Unique Paths 解题报告(Python & C++)
作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ 题目地址:https://leetcode.com/problems/unique-pa ...
- 【LeetCode】809. Expressive Words 解题报告(Python)
[LeetCode]809. Expressive Words 解题报告(Python) 标签(空格分隔): LeetCode 作者: 负雪明烛 id: fuxuemingzhu 个人博客: http ...
- 【LeetCode】376. Wiggle Subsequence 解题报告(Python)
[LeetCode]376. Wiggle Subsequence 解题报告(Python) 作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.c ...
- 【LeetCode】649. Dota2 Senate 解题报告(Python)
[LeetCode]649. Dota2 Senate 解题报告(Python) 作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ 题目地 ...
- 【LeetCode】911. Online Election 解题报告(Python)
[LeetCode]911. Online Election 解题报告(Python) 作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ ...
- 【LeetCode】886. Possible Bipartition 解题报告(Python)
[LeetCode]886. Possible Bipartition 解题报告(Python) 作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu ...
随机推荐
- 11 — springboot集成swagger — 更新完毕
1.前言 理论知识滤过,自行百度百科swagger是什么 2.导入依赖 <!-- swagger所需要的依赖--> <dependency> <groupId>io ...
- 巩固javaweb第十四天
巩固内容: 单行文本框: 单行文本框的基本语法格式如下: < input type="text" name="输入信息的字" value=" ...
- 重量级&轻量级
重量级 就是说包的大小,还有就是与个人项目的耦合程度,重量级的框架与项目耦合程度大些 代表EJB容器的服务往往是"买一送三",不要都不行 轻量级 就是相对较小的包,当然与项目的耦合 ...
- 【Java基础】HashMap原理详解
哈希表(hash table) 也叫散列表,是一种非常重要的数据结构,应用场景及其丰富,许多缓存技术(比如memcached)的核心其实就是在内存中维护一张大的哈希表,本文会对java集合框架中Has ...
- 分布式全局ID生成器原理剖析及非常齐全开源方案应用示例
为何需要分布式ID生成器 **本人博客网站 **IT小神 www.itxiaoshen.com **拿我们系统常用Mysql数据库来说,在之前的单体架构基本是单库结构,每个业务表的ID一般从1增,通过 ...
- DevOps和SRE的区别
目录 一.误区 二.DevOps 和 SRE 定义 三.两者产生背景和历史 四.两者的职能不同 五.工作内容不同 六.DevOps 和 SRE 关系 七.附录:技能点 DevOps SRE 一.误区 ...
- 跨平台调用之WebService
一.简介 web service是一种跨编程语言和跨操作系统平台的远程调用技术,是基于网络的.分布式的模块化组件. 跨编程语言就是说服务器端程序采用 Java 编写,客户端程序则可以采用其他编程语言编 ...
- Java后端高频知识点学习笔记1---Java基础
Java后端高频知识点学习笔记1---Java基础 参考地址:牛_客_网 https://www.nowcoder.com/discuss/819297 1.重载和重写的区别 重载:同一类中多个同名方 ...
- 搭建直接通过CPU执行汇编语言的环境
搭建直接通过CPU执行汇编语言环境 我们通过编译写好的汇编语言代码可以生成.bin的机器语言二进制代码.但是这个.bin程序我们该如何运行呢? 这里其实有两个办法: 1: 将其作为一个Windows/ ...
- Java值引用和对象引用区别Demo
转自:http://blog.csdn.net/gundsoul/article/details/4927404 以前就知道JAVA对象分对象引用和值引用,并且还知道8种基础数据类型,即引用时是值引用 ...