POJ 1451 - T9 - [Trie]

Problem link: http://bailian.openjudge.cn/practice/1451/

Total time limit: 1000 ms    Memory limit: 65536 kB

Description

Background

A while ago it was quite cumbersome to create a message for the Short Message Service (SMS) on a mobile phone. This was because you only have nine keys and the alphabet has more than nine letters, so most characters could only be entered by pressing one key several times. For example, if you wanted to type "hello" you had to press key 4 twice, key 3 twice, key 5 three times, again key 5 three times, and finally key 6 three times. This procedure is very tedious and keeps many people from using the Short Message Service.

This led manufacturers of mobile phones to try and find an easier way to enter text on a mobile phone. The solution they developed is called T9 text input. The "9" in the name means that you can enter almost arbitrary words with just nine keys and without pressing them more than once per character. The idea of the solution is that you simply start typing the keys without repetition, and the software uses a built-in dictionary to look for the "most probable" word matching the input. For example, to enter "hello" you simply press keys 4, 3, 5, 5, and 6 once. Of course, this could also be the input for the word "gdjjm", but since this is no sensible English word, it can safely be ignored. By ruling out all other "improbable" solutions and only taking proper English words into account, this method can speed up writing of short messages considerably. Of course, if the word is not in the dictionary (like a name) then it has to be typed in manually using key repetition again.

More precisely, with every character typed, the phone will show the most probable combination of characters it has found up to that point. Let us assume that the phone knows about the words "idea" and "hello", with "idea" occurring more often. Pressing the keys 4, 3, 5, 5, and 6, one after the other, the phone offers you "i", "id", then switches to "hel", "hell", and finally shows "hello".

Problem

Write an implementation of the T9 text input which offers the most probable character combination after every keystroke. The probability of a character combination is defined to be the sum of the probabilities of all words in the dictionary that begin with this character combination. For example, if the dictionary contains three words "hell", "hello", and "hellfire", the probability of the character combination "hell" is the sum of the probabilities of these words. If some combinations have the same probability, your program is to select the first one in alphabetic order. The user should also be able to type the beginning of words. For example, if the word "hello" is in the dictionary, the user can also enter the word "he" by pressing the keys 4 and 3 even if this word is not listed in the dictionary.

Input
The first line contains the number of scenarios.

Each scenario begins with a line containing the number w of distinct words in the dictionary (0<=w<=1000). These words are given in the next w lines in ascending alphabetic order. Every line starts with the word which is a sequence of lowercase letters from the alphabet without whitespace, followed by a space and an integer p, 1<=p<=100, representing the probability of that word. No word will contain more than 100 letters.

Following the dictionary, there is a line containing a single integer m. Next follow m lines, each consisting of a sequence of at most 100 decimal digits 2-9, followed by a single 1 meaning "next word".

Output
The output for each scenario begins with a line containing "Scenario #i:", where i is the number of the scenario starting at 1.

For every number sequence s of the scenario, print one line for every keystroke stored in s, except for the 1 at the end. In this line, print the most probable word prefix defined by the probabilities in the dictionary and the T9 selection rules explained above. Whenever none of the words in the dictionary match the given number sequence, print "MANUALLY" instead of a prefix.

Terminate the output for every number sequence with a blank line, and print an additional blank line at the end of every scenario.

Sample Input
2
5
hell 3
hello 4
idea 8
next 8
super 3
2
435561
43321
7
another 5
contest 6
follow 3
give 13
integer 6
new 14
program 4
5
77647261
6391
4681
26684371
77771

Sample Output
Scenario #1:
i
id
hel
hell
hello

i
id
ide
idea

Scenario #2:
p
pr
pro
prog
progr
progra
program

n
ne
new

g
in
int

c
co
con
cont
anoth
anothe
another

p
pr
MANUALLY
MANUALLY

Source
Northwestern Europe 2001

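Problem summary:

Old keypad phones generally had nine keys, which made typing English cumbersome. For example, to enter "hello" you had to press key 4 twice, key 3 twice, key 5 three times, key 5 three times again, and finally key 6 three times.

A newer input scheme called "T9" only requires pressing each key once per character; the software uses a built-in dictionary to find the most probable word matching the input. For example, to enter "hello" you press 4, 3, 5, 5, 6 once each. This could also be the input for "gdjjm", but since that is not a sensible English word it can safely be ignored. By ruling out all such improbable candidates and considering only proper English words, this method speeds up writing short messages considerably. If the word is not in the built-in dictionary (a name, for example), it has to be typed with repeated key presses again.

You are given the T9 dictionary, containing $w\ (0 \le w \le 10^3)$ distinct strings (each of length at most $100$, already sorted in ascending lexicographic order), together with their probabilities $p$. You are also given $m$ typing operations; each is a string of at most $100$ digits from $2 \sim 9$, representing the keys pressed in order, followed by a single $1$ that ends the operation. After every keystroke, print the most probable prefix, or "MANUALLY" if no dictionary word matches the keys pressed so far.

The probability of a "character combination" is defined as the sum of the probabilities of all dictionary words that have this combination as a prefix. For example, if the dictionary contains the three words "hell", "hello", and "hellfire", the probability of "hell" is the sum of the probabilities of these three words. If several combinations have the same probability, the program must choose the lexicographically smallest one.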
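To make the selection rule concrete, here is a minimal brute-force sketch that can serve as a reference when checking a faster solution. It is not the approach used in the solution below; the function name `bestPrefix` and the pair-based dictionary layout are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical brute-force reference: returns the most probable prefix for the
// given digit string, or "MANUALLY" if no dictionary word matches.
std::string bestPrefix(const std::vector<std::pair<std::string, int>>& dict,
                       const std::string& digits) {
    // Index i holds the keypad digit for letter 'a' + i.
    static const std::string key = "22233344455566677778889999";
    std::map<std::string, int> total;  // candidate prefix -> summed probability
    for (const auto& entry : dict) {
        const std::string& word = entry.first;
        if (word.size() < digits.size()) continue;
        bool matches = true;
        for (std::size_t k = 0; k < digits.size(); ++k) {
            if (key[word[k] - 'a'] != digits[k]) { matches = false; break; }
        }
        if (matches) total[word.substr(0, digits.size())] += entry.second;
    }
    std::string best = "MANUALLY";
    int bestP = 0;
    // std::map iterates in ascending key order, so the strict '>' keeps the
    // alphabetically first prefix when probabilities are tied.
    for (const auto& cand : total) {
        if (cand.second > bestP) { bestP = cand.second; best = cand.first; }
    }
    return best;
}
```

With the first sample dictionary, `bestPrefix(dict, "43")` gives "id" (8 beats the 7 of "he"), while `bestPrefix(dict, "435")` gives "hel", because "idea" no longer matches the third key.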

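Solution:

Build the trie with the eight digit keys $2 \sim 9$ of the phone as the eight branches of each node. A path from the root then spells a key sequence, and all prefixes typed by the same key sequence share one node.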
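The letter-to-key mapping behind these branches can be sketched as a lookup string indexed by letter. The helper name `toKeys` is hypothetical and only illustrates the mapping on a standard phone keypad.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: maps a lowercase word to the digit string that types it.
std::string toKeys(const std::string& word) {
    // Index i holds the keypad digit for letter 'a' + i:
    // abc=2, def=3, ghi=4, jkl=5, mno=6, pqrs=7, tuv=8, wxyz=9.
    static const std::string key = "22233344455566677778889999";
    std::string digits;
    for (char c : word) digits += key[c - 'a'];
    return digits;
}
```

For instance, both "hello" and "gdjjm" map to "43556", which is why they end up at the same trie node.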

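Each node stores a $pr$ and an $s$: after pressing the keys along the path to this node, the most probable string is $s$, with probability $pr$.

With this layout, every insertion of a string updates $pr$ and $s$ at each node along its path.

The catch is that the same prefix must not be inserted in several installments: a node keeps only its best candidate, so each prefix has to arrive exactly once, carrying its full probability. It is therefore convenient to compute the $pr$ of every prefix first. Since the dictionary is given in lexicographic order, all words sharing a prefix are adjacent, and the total probability of each prefix can be accumulated directly by comparing each word with its predecessor.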
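The accumulation step can be sketched in isolation as follows. The function name `prefixTotals` and the vector-based layout are assumptions for illustration; the idea is that the total of each prefix is carried forward to the last word that has it, and zeroed everywhere else.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper. For words sorted in ascending order, pr[i][k] ends up as
// the total probability of the prefix words[i][0..k] if word i is the LAST word
// with that prefix, and 0 otherwise.
std::vector<std::vector<int>> prefixTotals(const std::vector<std::string>& words,
                                           const std::vector<int>& prob) {
    std::vector<std::vector<int>> pr(words.size());
    for (std::size_t i = 0; i < words.size(); ++i) {
        pr[i].assign(words[i].size(), prob[i]);
        if (i == 0) continue;
        std::size_t common = std::min(words[i].size(), words[i - 1].size());
        // Walk the common prefix with the previous word and move its totals forward.
        for (std::size_t k = 0; k < common && words[i][k] == words[i - 1][k]; ++k) {
            pr[i][k] += pr[i - 1][k];
            pr[i - 1][k] = 0;
        }
    }
    return pr;
}
```

For "hell 3", "hello 4", "idea 8" this leaves all of "hell" at 0, gives "hello" the totals 7, 7, 7, 7, 4, and keeps "idea" at 8 for every prefix, so each prefix reaches the trie once with its full probability.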

This problem is quite helpful for deepening one's understanding of tries.

AC code:

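#include<bits/stdc++.h>
using namespace std;

const int maxn=1e3+10;  // w <= 1000 words, plus slack
const int maxs=105;     // each word has at most 100 letters, plus slack

int n,m;
string s[maxn];         // s[1..n]: dictionary words; s[0]: current query
int pr[maxn][maxs];     // pr[i][k]: probability carried by prefix s[i][0..k]

namespace Trie
{
    const int SIZE=maxn*maxs;
    int sz;
    struct TrieNode{
        int ed;         // number of words ending at this node
        string s;       // most probable prefix for this key sequence
        int pr;         // probability of that prefix
        int nxt[10];    // children indexed by digit; only 2..9 are used
    }trie[SIZE];

    // Node 1 is the root; node 0 stays empty and acts as the "no match" node.
    void init()
    {
        sz=1;
        for(int i=0;i<SIZE;i++)
        {
            trie[i].ed=trie[i].pr=0;
            trie[i].s.clear();
            memset(trie[i].nxt,0,sizeof(trie[i].nxt));
        }
    }

    // key[c-'a'] is the keypad digit for letter c.
    const string key="22233344455566677778889999";

    void insert(int idx)
    {
        string &str=s[idx];
        int p=1;
        for(int k=0;k<(int)str.size();k++)
        {
            int ch=key[str[k]-'a']-'0';
            if(!trie[p].nxt[ch]) trie[p].nxt[ch]=++sz;
            p=trie[p].nxt[ch];
            // Strict '>' keeps the alphabetically first prefix on ties,
            // because words are inserted in ascending order.
            if(pr[idx][k]>trie[p].pr)
            {
                trie[p].pr=pr[idx][k];
                trie[p].s=str.substr(0,k+1);
            }
        }
        trie[p].ed++;
    }

    void search(const string& s)
    {
        int p=1;
        // The last character is the terminating '1', so it is skipped.
        for(int i=0;i+1<(int)s.size();i++)
        {
            int ch=s[i]-'0';
            p=trie[p].nxt[ch];  // once p becomes 0 it stays 0
            if(!p) cout<<"MANUALLY\n";
            else cout<<trie[p].s<<'\n';
        }
    }
}

int main()
{
    ios::sync_with_stdio(0);
    cin.tie(0);

    int T;
    cin>>T;
    for(int kase=1;kase<=T;kase++)
    {
        cin>>n;
        for(int i=1,prob;i<=n;i++)
        {
            cin>>s[i]>>prob;
            for(int k=0;k<(int)s[i].size();k++) pr[i][k]=prob;
        }

        // Move the total of each shared prefix to the last word that has it.
        for(int i=2;i<=n;i++)
        {
            for(int k=0;k<(int)min(s[i].size(),s[i-1].size());k++)
            {
                if(s[i][k]==s[i-1][k])
                {
                    pr[i][k]+=pr[i-1][k];
                    pr[i-1][k]=0;
                }
                else break;
            }
        }

        Trie::init();
        for(int i=1;i<=n;i++) Trie::insert(i);

        cin>>m;
        cout<<"Scenario #"<<kase<<":\n";
        for(int i=1;i<=m;i++)
        {
            cin>>s[0];
            Trie::search(s[0]);
            cout<<'\n';
        }
        cout<<'\n';
    }
}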

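Note:

Out of concern that cin/cout would be too slow, the code disables IO synchronization and unties cin from cout; see https://blog.csdn.net/qian2213762498/article/details/81982380:

Two things affect the performance of cout and cin: synchronization and buffering. Synchronization with C stdio can be disabled with ios::sync_with_stdio(false);. The operating system manages and optimizes buffers, but only to a limited extent. endl first writes '\n' and then flushes the buffer, which is slow, so prefer '\n' over endl for line breaks. In addition, cin is tied to cout by default: when the two are used alternately, every read from cin first flushes cout, which is also slow. This tie can be removed with cin.tie(nullptr);.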
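A minimal sketch of these two settings, together with '\n'-terminated output, might look like this. The helper names `setupFastIO` and `printLines` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: call once before any input or output is performed.
void setupFastIO() {
    std::ios::sync_with_stdio(false);  // stop synchronizing with C stdio
    std::cin.tie(nullptr);             // reads from cin no longer flush cout
}

// Hypothetical helper: writes each line followed by '\n', which, unlike
// std::endl, does not force a flush after every line.
void printLines(std::ostream& out, const std::vector<std::string>& lines) {
    for (const std::string& line : lines) out << line << '\n';
}
```

Note that after untying, output written before a read is not guaranteed to appear first, which matters for interactive programs but not for batch judging like this problem.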
