Given a paragraph and a list of banned words, return the most frequent word that is not in the list of banned words.  It is guaranteed there is at least one word that isn't banned, and that the answer is unique.

Words in the list of banned words are given in lowercase, and free of punctuation.  Words in the paragraph are not case sensitive.  The answer is in lowercase.

Example:

Input:
paragraph = "Bob hit a ball, the hit BALL flew far after it was hit."
banned = ["hit"]
Output: "ball"
Explanation:
"hit" occurs 3 times, but it is a banned word.
"ball" occurs twice (and no other word does), so it is the most frequent non-banned word in the paragraph.
Note that words in the paragraph are not case sensitive,
that punctuation is ignored (even if adjacent to words, such as "ball,"),
and that "hit" isn't the answer even though it occurs more because it is banned.

Note:

  • 1 <= paragraph.length <= 1000.
  • 1 <= banned.length <= 100.
  • 1 <= banned[i].length <= 10.
  • The answer is unique, and written in lowercase (even if its occurrences in paragraph may have uppercase symbols, and even if it is a proper noun.)
  • paragraph only consists of letters, spaces, or the punctuation symbols !?',;.
  • There are no hyphens or hyphenated words.
  • Words only consist of letters, never apostrophes or other punctuation symbols.

这道题给了我们一个字符串,是一个句子,里面有很多单词,并且还有标点符号,然后又给了我们一个类似黑名单功能的一个字符串数组,让我们在返回句子中出现的频率最高的一个单词。要求非常简单明了,那么思路也就简单粗暴一些吧。因为我们返回的单词不能是黑名单中的,所以我们对于每一个统计的单词肯定都需要去黑名单中检查,为了提高效率,我们可以把黑名单中所有的单词放到一个HashSet中,这样就可以常数级时间内查询了。然后要做的就是处理一下字符串数组,因为字符串中可能有标点符号,所以我们先过滤一遍字符串,这里我们使用了系统自带的两个函数isalpha()和tolower()函数,其实自己写也可以,就放到一个子函数处理一下也不难,这里就偷懒了,遍历每个字符,如果不是字母,就换成空格符号,如果是大写字母,就换成小写的。然后我们又使用一个C++中的读取字符串流的类,Java中可以直接调用split函数,叼的飞起。但谁让博主固执的写C++呢,也无所谓啦,习惯就好,这里我们也是按照空格拆分,将每个单词读出来,这里要使用一个mx变量,统计当前最大的频率,还需要一个HashMap来建立单词和其出现频率之间的映射。然后我们看读取出的单词,如果不在黑名单中内,并且映射值加1后大于mx的话,我们更新mx,并且更新结果res即可,参见代码如下:

class Solution {
public:
string mostCommonWord(string paragraph, vector<string>& banned) {
unordered_set<string> ban(banned.begin(), banned.end());
unordered_map<string, int> strCnt;
int mx = ;
for (auto &c : paragraph) c = isalpha(c) ? tolower(c) : ' ';
istringstream iss(paragraph);
string t = "", res = "";
while (iss >> t) {
if (!ban.count(t) && ++strCnt[t] > mx) {
mx = strCnt[t];
res = t;
}
}
return res;
}
};

参考资料:

https://leetcode.com/problems/most-common-word/

https://leetcode.com/problems/most-common-word/discuss/124286/Clean-6ms-C%2B%2B-solution

https://leetcode.com/problems/most-common-word/discuss/123854/C%2B%2BJavaPython-Easy-Solution-with-Explanation

[LeetCode] Most Common Word 最常见的单词的更多相关文章

  1. LeetCode 819. Most Common Word (最常见的单词)

    Given a paragraph and a list of banned words, return the most frequent word that is not in the list ...

  2. Leetcode819.Most Common Word最常见的单词

    给定一个段落 (paragraph) 和一个禁用单词列表 (banned).返回出现次数最多,同时不在禁用列表中的单词.题目保证至少有一个词不在禁用列表中,而且答案唯一. 禁用列表中的单词用小写字母表 ...

  3. [LeetCode] 244. Shortest Word Distance II 最短单词距离 II

    This is a follow up of Shortest Word Distance. The only difference is now you are given the list of ...

  4. [LeetCode] 245. Shortest Word Distance III 最短单词距离 III

    This is a follow up of Shortest Word Distance. The only difference is now word1 could be the same as ...

  5. [LeetCode] Shortest Completing Word 最短完整的单词

    Find the minimum length word from a given dictionary words, which has all the letters from the strin ...

  6. [leetcode]244. Shortest Word Distance II最短单词距离(允许连环call)

    Design a class which receives a list of words in the constructor, and implements a method that takes ...

  7. leetcode Most Common Word——就是在考察自己实现split

    819. Most Common Word Given a paragraph and a list of banned words, return the most frequent word th ...

  8. C#LeetCode刷题之#819-最常见的单词(Most Common Word)

    问题 该文章的最新版本已迁移至个人博客[比特飞],单击链接 https://www.byteflying.com/archives/3969 访问. 给定一个段落 (paragraph) 和一个禁用单 ...

  9. [LeetCode] 288.Unique Word Abbreviation 独特的单词缩写

    An abbreviation of a word follows the form <first letter><number><last letter>. Be ...

随机推荐

  1. centos7.2 环境下两个数据库的安装部署

    首先假如服务器上已经有一个 数据库mysql5.6.29,端口是3306. 接下来在安装一个mysql数据库,端口是3307的. 一:创建mysql编译目录 mkdir /usr/local/mysq ...

  2. 使用Notepad++开发Java程序

    安装NppExec插件(已安装可跳过) 插件下载地址 我选择了最新的RC2 根据软件位数下载对应的版本,我直接下载了32位对应的dll 解压后里面有两个文件夹和一个dll文件 拷贝到Notepad++ ...

  3. SQLserver 数据库高版本无法还原到低版本的数据解决方法

    sql server 数据库的版本只支持从上往下兼容.即高版本可以兼容低版本 .低版本不能兼容低版本.通常我们在开发时会用比较高的版本.但是部署到客户那边可能他们的数据库版本会比较低. 我们可以通过导 ...

  4. python开发遇到的坑(1)xpath解析ValueError: Unicode strings with encoding declaration are not supported

    Traceback (most recent call last): File "/Users/*******.py", line 37, in <module> Bt ...

  5. python进制转化函数,10进制字符串互转,16进制字符串互转

    来了老弟,emmmmm,今天想到平时经常用到编码转化,把字符串转化为16进制绕过等等的,今天想着用python写个玩,查询了一些资料,看了些bolg 上面的两个函数是将二进制流转化为16进制,data ...

  6. SQL数据插入

    T-SQL中提供了多个将数据插入到表中的语句:insert values.insert select.insert exec.select into和Bulk insert: 1.  insert v ...

  7. Mac下mongodb connect failed 连接错误解决方法

    查看elm 后台node 代码 一直连不上mongodb,报错 MongoDB shell version v3.6.0 connecting to: mongodb://127.0.0.1:2701 ...

  8. Python内置模块之-hashlib

    一 .概述 摘要算法又称哈希算法.散列算法.它通过一个函数,把任意长度的数据转换为一个长度固定的数据串(通常用16进制的字符串表示). 摘要算法的特点 不论data大小,摘要结果是固定长度 单向函数, ...

  9. CSS之链接

    改变链接样式 当设置为若干链路状态的样式,也有一些顺序规则: a:hover 必须跟在 a:link 和 a:visited后面 a:active 必须跟在 a:hover后面 <!DOCTYP ...

  10. 寻找符合条件的最短子字符串——SLIDING WINDOW

    简介 用一个可伸缩的窗口遍历字符串,时间复杂度大致为O(n).适用于“寻找符合某条件的最小子字符串”题型. 题目 链接 求某字符串T中含有某字符串S的所有字符的最小子字符串.如果不存在则返回" ...