Moderate: Insert spaces to maximize the number of recognizable words @CareerCup
A recursion problem; note how it combines memoization with a trie.
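The solution below calls `dictionary.contains(word, exact)` on a `CtCILibrary.Trie`, whose source is not shown here. As a rough illustration of what such a lookup might do, here is a minimal sketch. The class name `SimpleTrie`, the `insert` method, and the internal layout are assumptions made for this example; only the `contains(String, boolean)` shape is taken from the listing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for CtCILibrary.Trie. Only the contains(s, exact)
// shape comes from the listing; everything else is an assumption.
class SimpleTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminates; // true if a dictionary word ends at this node
    }

    private final Node root = new Node();

    /** Adds one word to the trie. */
    void insert(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.terminates = true;
    }

    /**
     * exact == false: true if some dictionary word starts with s.
     * exact == true:  true only if s itself is a dictionary word.
     */
    boolean contains(String s, boolean exact) {
        Node node = root;
        for (char c : s.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return false; // no dictionary word has this prefix
            }
        }
        return !exact || node.terminates;
    }
}
```

The prefix query is what lets the recursion stop extending a candidate word as soon as no dictionary word can start with it, and the exact query decides whether the candidate counts as recognized.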
package Moderate;

import java.util.Hashtable;

import CtCILibrary.AssortedMethods;
import CtCILibrary.Trie;

/**
 * Oh, no! You have just completed a lengthy document when you have an unfortunate
 * Find/Replace mishap. You have accidentally removed all spaces, punctuation,
 * and capitalization in the document. A sentence like "I reset the computer. It still
 * didn't boot!" would become "iresetthecomputeritstilldidntboot". You figure that you
 * can add back in the punctuation and capitalization later, once you get the individual
 * words properly separated. Most of the words will be in a dictionary, but some strings,
 * like proper names, will not.
 *
 * Given a dictionary (a list of words), design an algorithm to find the optimal way of
 * "unconcatenating" a sequence of words. In this case, "optimal" is defined to be the
 * parsing which minimizes the number of unrecognized sequences of characters.
 *
 * For example, the string "jesslookedjustliketimherbrother" would be optimally parsed
 * as "JESS looked just like TIM her brother". This parsing has seven unrecognized
 * characters, which we have capitalized for clarity.
 *
 * In short: given a string with all punctuation and spaces removed, find the way to
 * put the spaces back that leaves the fewest unrecognizable words, where a word is
 * recognizable if it appears in the supplied dictionary. The code below measures
 * this as the number of unrecognized characters.
 */
public class S17_14 {

    public static String sentence;
    public static Trie dictionary;

    /* incomplete code */
    public static Result parse(int wordStart, int wordEnd, Hashtable<Integer, Result> cache) {
        if (wordEnd >= sentence.length()) {
            return new Result(wordEnd - wordStart, sentence.substring(wordStart).toUpperCase());
        }
        if (cache.containsKey(wordStart)) {
            return cache.get(wordStart).clone();
        }
        String currentWord = sentence.substring(wordStart, wordEnd + 1);
        boolean validPartial = dictionary.contains(currentWord, false);
        boolean validExact = validPartial && dictionary.contains(currentWord, true);

        /* break current word */
        Result bestExact = parse(wordEnd + 1, wordEnd + 1, cache);
        if (validExact) {
            bestExact.parsed = currentWord + " " + bestExact.parsed;
        } else {
            bestExact.invalid += currentWord.length();
            bestExact.parsed = currentWord.toUpperCase() + " " + bestExact.parsed;
        }

        /* extend current word */
        Result bestExtend = null;
        if (validPartial) {
            bestExtend = parse(wordStart, wordEnd + 1, cache);
        }

        /* find best */
        Result best = Result.min(bestExact, bestExtend);
        cache.put(wordStart, best.clone());
        return best;
    }

    public static int parseOptimized(int wordStart, int wordEnd, Hashtable<Integer, Integer> cache) {
        if (wordEnd >= sentence.length()) {
            return wordEnd - wordStart;
        }
        if (cache.containsKey(wordStart)) {
            return cache.get(wordStart);
        }

        String currentWord = sentence.substring(wordStart, wordEnd + 1);
        boolean validPartial = dictionary.contains(currentWord, false);

        /* break current word */
        int bestExact = parseOptimized(wordEnd + 1, wordEnd + 1, cache);
        if (!validPartial || !dictionary.contains(currentWord, true)) {
            bestExact += currentWord.length();
        }

        /* extend current word */
        int bestExtend = Integer.MAX_VALUE;
        if (validPartial) {
            bestExtend = parseOptimized(wordStart, wordEnd + 1, cache);
        }

        /* find best */
        int min = Math.min(bestExact, bestExtend);
        cache.put(wordStart, min);
        return min;
    }

    public static int parseSimple(int wordStart, int wordEnd) {
        if (wordEnd >= sentence.length()) {
            return wordEnd - wordStart;
        }

        String word = sentence.substring(wordStart, wordEnd + 1);

        /* break current word */
        int bestExact = parseSimple(wordEnd + 1, wordEnd + 1);
        if (!dictionary.contains(word, true)) {
            bestExact += word.length();
        }

        /* extend current word */
        int bestExtend = parseSimple(wordStart, wordEnd + 1);

        /* find best */
        return Math.min(bestExact, bestExtend);
    }

    /* strip punctuation and spaces, then lowercase */
    public static String clean(String str) {
        char[] punctuation = {',', '"', '!', '.', '\'', '?'};
        for (char c : punctuation) {
            str = str.replace(c, ' ');
        }
        return str.replace(" ", "").toLowerCase();
    }

    public static void main(String[] args) {
        dictionary = AssortedMethods.getTrieDictionary();
        sentence = "As one of the top companies in the world, Google will surely attract the attention of computer gurus. This does not, however, mean the company is for everyone.";
        sentence = clean(sentence);
        System.out.println(sentence);
        //Result v = parse(0, 0, new Hashtable<Integer, Result>());
        //System.out.println(v.parsed);
        int v = parseOptimized(0, 0, new Hashtable<Integer, Integer>());
        System.out.println(v);
    }

    static class Result {
        public int invalid = Integer.MAX_VALUE;
        public String parsed = "";

        public Result(int inv, String p) {
            invalid = inv;
            parsed = p;
        }

        public Result clone() {
            return new Result(this.invalid, this.parsed);
        }

        public static Result min(Result r1, Result r2) {
            if (r1 == null) {
                return r2;
            } else if (r2 == null) {
                return r1;
            }
            return r2.invalid < r1.invalid ? r2 : r1;
        }
    }
}
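The listing above depends on `CtCILibrary`, so it cannot be run on its own. As a cross-check on the recursion, here is a self-contained sketch of the same idea: memoize the minimum number of unrecognized characters for each start index. Note the substitutions: it uses a plain `HashSet` instead of the trie (so it tries every end position and loses the prefix pruning), and the class name `UnconcatDemo`, the method names, and the five-word dictionary are all invented for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only: a HashSet replaces the trie, and all names
// here are hypothetical rather than taken from CtCILibrary.
class UnconcatDemo {

    /** Minimum number of characters left outside dictionary words. */
    static int minUnrecognized(String s, Set<String> dict) {
        int[] memo = new int[s.length()];
        Arrays.fill(memo, -1); // -1 marks "not computed yet"
        return solve(s, 0, dict, memo);
    }

    /** Best cost for the suffix of s that begins at index start. */
    private static int solve(String s, int start, Set<String> dict, int[] memo) {
        if (start >= s.length()) {
            return 0; // nothing left to parse
        }
        if (memo[start] >= 0) {
            return memo[start];
        }
        int best = Integer.MAX_VALUE;
        // Try every possible first piece s[start, end).
        for (int end = start + 1; end <= s.length(); end++) {
            String piece = s.substring(start, end);
            // A recognized piece costs nothing; otherwise every character counts.
            int cost = dict.contains(piece) ? 0 : piece.length();
            best = Math.min(best, cost + solve(s, end, dict, memo));
        }
        memo[start] = best;
        return best;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(
                Arrays.asList("looked", "just", "like", "her", "brother"));
        // "JESS" (4) and "TIM" (3) are not in the dictionary.
        System.out.println(minUnrecognized("jesslookedjustliketimherbrother", dict)); // prints 7
    }
}
```

With this tiny dictionary the result matches the seven unrecognized characters quoted in the problem statement. Keying the memo by start index alone works here because each entry is written only once, after every end position for that start has been tried.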