Design a search autocomplete system for a search engine. Users may input a sentence (at least one word and end with a special character '#'). For each character they type except '#', you need to return the top 3historical hot sentences that have prefix the same as the part of sentence already typed. Here are the specific rules:

  1. The hot degree for a sentence is defined as the number of times a user typed the exactly same sentence before.
  2. The returned top 3 hot sentences should be sorted by hot degree (The first is the hottest one). If several sentences have the same degree of hot, you need to use ASCII-code order (smaller one appears first).
  3. If less than 3 hot sentences exist, then just return as many as you can.
  4. When the input is a special character, it means the sentence ends, and in this case, you need to return an empty list.

Your job is to implement the following functions:

The constructor function:

AutocompleteSystem(String[] sentences, int[] times): This is the constructor. The input is historical data. Sentences is a string array consists of previously typed sentences. Times is the corresponding times a sentence has been typed. Your system should record these historical data.

Now, the user wants to input a new sentence. The following function will provide the next character the user types:

List<String> input(char c): The input c is the next character typed by the user. The character will only be lower-case letters ('a' to 'z'), blank space (' ') or a special character ('#'). Also, the previously typed sentence should be recorded in your system. The output will be the top 3 historical hot sentences that have prefix the same as the part of sentence already typed.

Example:
Operation: AutocompleteSystem(["i love you", "island","ironman", "i love leetcode"], [5,3,2,2]) 
The system have already tracked down the following sentences and their corresponding times: 
"i love you" : 5 times 
"island" : 3 times 
"ironman" : 2 times 
"i love leetcode" : 2 times 
Now, the user begins another search:

Operation: input('i') 
Output: ["i love you", "island","i love leetcode"] 
Explanation: 
There are four sentences that have prefix "i". Among them, "ironman" and "i love leetcode" have same hot degree. Since ' ' has ASCII code 32 and 'r' has ASCII code 114, "i love leetcode" should be in front of "ironman". Also we only need to output top 3 hot sentences, so "ironman" will be ignored.

Operation: input(' ') 
Output: ["i love you","i love leetcode"] 
Explanation: 
There are only two sentences that have prefix "i ".

Operation: input('a') 
Output: [] 
Explanation: 
There are no sentences that have prefix "i a".

Operation: input('#') 
Output: [] 
Explanation: 
The user finished the input, the sentence "i a" should be saved as a historical sentence in system. And the following input will be counted as a new search.

Note:

    1. The input sentence will always start with a letter and end with '#', and only one blank space will exist between two words.
    2. The number of complete sentences that to be searched won't exceed 100. The length of each sentence including those in the historical data won't exceed 100.
    3. Please use double-quote instead of single-quote when you write test cases even for a character input.
    4. Please remember to RESET your class variables declared in class AutocompleteSystem, as static/class variables are persisted across multiple test cases. Please see here for more details.

这道题让实现一个简单的搜索自动补全系统,当我们用谷歌或者百度进行搜索时,会有这样的体验,输入些单词,搜索框会弹出一些以你输入为开头的一些完整的句子供你选择,这就是一种搜索自动补全系统。根据题目的要求,补全的句子是按之前出现的频率排列的,高频率的出现在最上面,如果频率相同,就按字母顺序来显示。输入规则是每次输入一个字符,然后返回自动补全的句子,如果遇到井字符,表示完整句子结束。那么肯定需要一个 HashMap,建立句子和其出现频率的映射,还需要一个字符串 data,用来保存之前输入过的字符。在构造函数中,给了一些句子,和其出现的次数,直接将其加入 HashMap,然后 data 初始化为空字符串。在 input 函数中,首先判读输入字符是否为井字符,如果是的话,那么表明当前的 data 字符串已经是一个完整的句子,在 HashMap 中次数加1,并且 data 清空,返回空集。否则的话将当前字符加入 data 字符串中,现在就要找出包含 data 前缀的前三高频句子了,使用优先队列来做,设计的思路是,始终用优先队列保存频率最高的三个句子,应该把频率低的或者是字母顺序大的放在队首,以便随时可以移出队列,所以应该是个最小堆,队列里放句子和其出现频率的 pair 对儿,并且根据其频率大小进行排序,要重写优先队列的 comparator。然后遍历 HashMap 中的所有句子,首先要验证当前 data 字符串是否是其前缀,没啥好的方法,就逐个字符比较,用标识符 matched,初始化为 true,如果发现不匹配,则 matched 标记为 false,并 break 掉。然后判断如果 matched 为 true 的话,说明 data 字符串是前缀,那么就把这个 pair 加入优先队列中,如果此时队列中的元素大于三个,那把队首元素移除,因为是最小堆,所以频率小的句子会被先移除。然后就是将优先队列的元素加到结果 res 中,由于先出队列的是频率小的句子,所以要加到结果 res 的末尾,参见代码如下:

class AutocompleteSystem {
public:
AutocompleteSystem(vector<string> sentences, vector<int> times) {
for (int i = ; i < sentences.size(); ++i) {
freq[sentences[i]] += times[i];
}
data = "";
}
vector<string> input(char c) {
if (c == '#') {
++freq[data];
data = "";
return {};
}
data.push_back(c);
auto cmp = [](pair<string, int>& a, pair<string, int>& b) {
return a.second > b.second || (a.second == b.second && a.first < b.first);
};
priority_queue<pair<string, int>, vector<pair<string, int>>, decltype(cmp) > q(cmp);
for (auto f : freq) {
bool matched = true;
for (int i = ; i < data.size(); ++i) {
if (data[i] != f.first[i]) {
matched = false;
break;
}
}
if (matched) {
q.push(f);
if (q.size() > ) q.pop();
}
}
vector<string> res(q.size());
for (int i = q.size() - ; i >= ; --i) {
res[i] = q.top().first; q.pop();
}
return res;
} private:
unordered_map<string, int> freq;
string data;
};

Github 同步地址:

https://github.com/grandyang/leetcode/issues/642

类似题目:

Implement Trie (Prefix Tree)

Top K Frequent Words

参考资料:

https://leetcode.com/problems/design-search-autocomplete-system/

https://leetcode.com/problems/design-search-autocomplete-system/discuss/176550/Java-simple-solution-without-using-Trie-(only-use-HashMap-and-PriorityQueue)

https://leetcode.com/problems/design-search-autocomplete-system/discuss/105379/Straight-forward-hash-table-%2B-priority-queue-solution-in-c%2B%2B-no-trie

LeetCode All in One 题目讲解汇总(持续更新中...)

[LeetCode] Design Search Autocomplete System 设计搜索自动补全系统的更多相关文章

  1. [LeetCode] 642. Design Search Autocomplete System 设计搜索自动补全系统

    Design a search autocomplete system for a search engine. Users may input a sentence (at least one wo ...

  2. StringBoot整合ELK实现日志收集和搜索自动补全功能(详细图文教程)

    @ 目录 StringBoot整合ELK实现日志收集和搜索自动补全功能(详细图文教程) 一.下载ELK的安装包上传并解压 1.Elasticsearch下载 2.Logstash下载 3.Kibana ...

  3. Design Search Autocomplete System

    Design a search autocomplete system for a search engine. Users may input a sentence (at least one wo ...

  4. autocomplete实现联想输入,自动补全

    jQuery.AutoComplete是一个基于jQuery的自动补全插件.借助于jQuery优秀的跨浏览器特性,可以兼容Chrome/IE/Firefox/Opera/Safari等多种浏览器. 特 ...

  5. WinForm AutoComplete 输入提示、自动补全

    一.前言 又临近春节,作为屌丝的我,又要为车票发愁了.记得去年出现了各种12306的插件,最近不忙,于是也着手自己写个抢票插件,当是熟悉一下WinForm吧.小软件还在开发中,待完善后,也写篇博客与大 ...

  6. 仿Google首页搜索自动补全

    仿Google自动补全,实现细节: 后台是简单的servlet(其实就是负责后台处理数据交互的,没必要非跌用个struts...什么的) 传输介质:xml 使用jQuery js框架 功能实现: 如果 ...

  7. jquery input 搜索自动补全、typeahead.js

    最近做个一个功能需要用到自动补全,然后在网上找了很久,踩了各种的坑 最后用typeahead.js这个插件,经过自己的测试完美实现 使用方法:在页面中引入jquery.jquery.typeahead ...

  8. [LeetCode] Design Log Storage System 设计日志存储系统

    You are given several logs that each log contains a unique id and timestamp. Timestamp is a string t ...

  9. [LeetCode] Design In-Memory File System 设计内存文件系统

    Design an in-memory file system to simulate the following functions: ls: Given a path in string form ...

随机推荐

  1. java程序性能调优---------------性能概述

    一.程序的性能通过哪几个方面表现 1.执行速度(程序反应反应是否迅速.响应时间是否足够短) 2.分配内存 (分配内存是否合理,是否过多的消耗内存或者内存溢出) 3.启动时间(程序从运行到可以正常处理业 ...

  2. SQL中哪些情况会引起全表扫描

    1.模糊查询效率很低:原因:like本身效率就比较低,应该尽量避免查询条件使用like:对于like '%...%'(全模糊)这样的条件,是无法使用索引的,全表扫描自然效率很低:另外,由于匹配算法的关 ...

  3. python全栈学习--day9(函数初始)

    Python 函数 函数是组织好的,可重复使用的,用来实现单一,或相关联功能的代码段. 函数能提高应用的模块性,和代码的重复利用率.你已经知道Python提供了许多内建函数,比如print().但你也 ...

  4. Mybatis学习笔记二

    本篇内容,紧接上一篇内容Mybatis学习笔记一 输入映射和输出映射 传递简单类型和pojo类型上篇已介绍过,下面介绍一下包装类型. 传递pojo包装对象 开发中通过可以使用pojo传递查询条件.查询 ...

  5. 2017-2018-1 1623 bug终结者 冲刺004

    bug终结者 冲刺004 by 20162322 朱娅霖 整体连接 简要说明 目前,我们已经完成了欢迎界面,主菜单界面,排行榜界面,选项界面,胜利界面,地板类.小人类.墙体类.箱子类和虚拟按键类. 主 ...

  6. Node入门教程(1)目录

    aicoder.com 全栈实习之简明 Node 入门文档 aicoder.com 线下实习: 不 8000 就业,不还实习费. 如果需要转载本文档,请联系老马,Q: 515154084 JS基础教程 ...

  7. php中(包括织梦cms)set_time_limit(0)不起作用的解决方法

    背景介绍: 在做织梦冗余图片清理的功能时, 由于冗余图片太多,导致每次清理时都会超时, 后来在网上搜索了各种文章,网上有如下的解决方法: set_time_limit(0) ini_set('max_ ...

  8. django三种文件下载方式

    一.概述 在实际的项目中很多时候需要用到下载功能,如导excel.pdf或者文件下载,当然你可以使用web服务自己搭建可以用于下载的资源服务器,如nginx,这里我们主要介绍django中的文件下载. ...

  9. OpenID Connect + OAuth2.0

    一.问题的提出 现代应用程序或多或少都是如下这样的架构: 在这种情况下,前端.中间层和后端都需要进行验证和授权来保护资源,所以不能仅仅在业务逻辑层或者服务接口层来实现基础的安全功能.为了解决这样的问题 ...

  10. maven入门(6)maven的生命周期

    1. 三套生命周期     Maven拥有三套相互独立的生命周期,它们分别为clean,default和site. 每个生命周期包含一些阶段,这些阶段是有顺序的,并且后面的阶段依赖于前面的阶段,用户和 ...