递归题目,注意结合了memo的方法和trie的应用

package Moderate;

import java.util.Hashtable;

import CtCILibrary.AssortedMethods;
import CtCILibrary.Trie; /**
* Oh, no! You have just completed a lengthy document when you have an unfortu-
nate Find/Replace mishap. You have accidentally removed all spaces, punctuation,
and capitalization in the document. A sentence like "I reset the computer. It still
didn't boot!" would become "iresetthecomputeritstilldidntboot". You figure that you
can add back in the punctation and capitalization later, once you get the individual
words properly separated. Most of the words will be in a dictionary, but some strings,
like proper names, will not.
Given a dictionary (a list of words), design an algorithm to find the optimal way of
"unconcatenating" a sequence of words. In this case, "optimal" is defined to be the
parsing which minimizes the number of unrecognized sequences of characters.
For example, the string "jesslookedjustliketimherbrother" would be optimally parsed
as "JESS looked just like TIM her brother". This parsing has seven unrecognized char-
acters, which we have capitalized for clarity. 给一个string,把string内的所有标点,空格都去掉。然后要求找到把空格加回去使得不可辨别的
单词数量达到最少的方法(判断是否可以辨别是通过提供一个字典来判断) *
*/
public class S17_14 { public static String sentence;
public static Trie dictionary; /* incomplete code */
public static Result parse(int wordStart, int wordEnd, Hashtable<Integer, Result> cache) {
if (wordEnd >= sentence.length()) {
return new Result(wordEnd - wordStart, sentence.substring(wordStart).toUpperCase());
}
if (cache.containsKey(wordStart)) {
return cache.get(wordStart).clone();
}
String currentWord = sentence.substring(wordStart, wordEnd + 1);
boolean validPartial = dictionary.contains(currentWord, false);
boolean validExact = validPartial && dictionary.contains(currentWord, true); /* break current word */
Result bestExact = parse(wordEnd + 1, wordEnd + 1, cache);
if (validExact) {
bestExact.parsed = currentWord + " " + bestExact.parsed;
} else {
bestExact.invalid += currentWord.length();
bestExact.parsed = currentWord.toUpperCase() + " " + bestExact.parsed;
} /* extend current word */
Result bestExtend = null;
if (validPartial) {
bestExtend = parse(wordStart, wordEnd + 1, cache);
} /* find best */
Result best = Result.min(bestExact, bestExtend);
cache.put(wordStart, best.clone());
return best;
} public static int parseOptimized(int wordStart, int wordEnd, Hashtable<Integer, Integer> cache) {
if (wordEnd >= sentence.length()) {
return wordEnd - wordStart;
}
if (cache.containsKey(wordStart)) {
return cache.get(wordStart);
} String currentWord = sentence.substring(wordStart, wordEnd + 1);
boolean validPartial = dictionary.contains(currentWord, false); /* break current word */
int bestExact = parseOptimized(wordEnd + 1, wordEnd + 1, cache);
if (!validPartial || !dictionary.contains(currentWord, true)) {
bestExact += currentWord.length();
} /* extend current word */
int bestExtend = Integer.MAX_VALUE;
if (validPartial) {
bestExtend = parseOptimized(wordStart, wordEnd + 1, cache);
} /* find best */
int min = Math.min(bestExact, bestExtend);
cache.put(wordStart, min);
return min;
} public static int parseSimple(int wordStart, int wordEnd) {
if (wordEnd >= sentence.length()) {
return wordEnd - wordStart;
} String word = sentence.substring(wordStart, wordEnd + 1); /* break current word */
int bestExact = parseSimple(wordEnd + 1, wordEnd + 1);
if (!dictionary.contains(word, true)) {
bestExact += word.length();
} /* extend current word */
int bestExtend = parseSimple(wordStart, wordEnd + 1); /* find best */
return Math.min(bestExact, bestExtend);
} public static String clean(String str) {
char[] punctuation = {',', '"', '!', '.', '\'', '?', ','};
for (char c : punctuation) {
str = str.replace(c, ' ');
}
return str.replace(" ", "").toLowerCase();
} public static void main(String[] args) {
dictionary = AssortedMethods.getTrieDictionary();
sentence = "As one of the top companies in the world, Google will surely attract the attention of computer gurus. This does not, however, mean the company is for everyone.";
sentence = clean(sentence);
System.out.println(sentence);
//Result v = parse(0, 0, new Hashtable<Integer, Result>());
//System.out.println(v.parsed);
int v = parseOptimized(0, 0, new Hashtable<Integer, Integer>());
System.out.println(v);
} static class Result {
public int invalid = Integer.MAX_VALUE;
public String parsed = "";
public Result(int inv, String p) {
invalid = inv;
parsed = p;
} public Result clone() {
return new Result(this.invalid, this.parsed);
} public static Result min(Result r1, Result r2) {
if (r1 == null) {
return r2;
} else if (r2 == null) {
return r1;
} return r2.invalid < r1.invalid ? r2 : r1;
}
} }

Moderate 加入空格使得可辨别单词数量最多 @CareerCup的更多相关文章

  1. Storm监控文件夹变化 统计文件单词数量

    监控指定文件夹,读取文件(新文件动态读取)里的内容,统计单词的数量. FileSpout.java,监控文件夹,读取新文件内容 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ...

  2. python核心编程正则表达式练习题1-2匹配由单个空格分隔的任意单词对,也就是性和名

    # 匹配由单个空格分隔的任意单词对,也就是姓和名 import re patt = '[A-Za-z]+ [A-Za-z]+' # 方法一 +加号操作符匹配它左边的正则表达式至少出现一次的情况 # p ...

  3. Python GitHub上星星数量最多的项目

    GitHub上星星数量最多的项目 """ most_popular.py 查看GitHub上获得星星最多的项目都是用什么语言写的 """ i ...

  4. Java基础IO类之字符串流(查字符串中的单词数量)与管道流

    一.字符串流 定义:字符串流(StringReader),以一个字符为数据源,来构造一个字符流. 作用:在Web开发中,我们经常要从服务器上获取数据,数据返回的格式通常一个字符串(XML.JSON), ...

  5. go语言小练习——给定英语文章统计单词数量

    给定一篇英语文章,要求统计出所有单词的个数,并按一定次序输出.思路是利用go语言的map类型,以每个单词作为关键字存储数量信息,代码实现如下: package main import ( " ...

  6. 练习1-21:编写程序entab,将空格串替换为最少数量的制表符和空格。。。(C程序设计语言 第2版)

    #include <stdio.h> #define N 5 main() { int i, j, c, lastc; lastc = 'a'; i = j = ; while ((c=g ...

  7. hadoop-mapreduce-(1)-统计单词数量

    编写map程序 package com.cvicse.ump.hadoop.mapreduce.map; import java.io.IOException; import org.apache.h ...

  8. 在Linux系统下有一个目录/usr/share/dict/ 这个目录里包含了一个词典的文本文件,我们可以利用这个文件来辨别单词是否为词典中的单词。

    #!/bin/bash s=`cat /usr/share/dict/linux.words` for i in $s; do if [ $1 = $i ];then echo "$1 在字 ...

  9. Python的 counter内置函数,统计文本中的单词数量

    counter是 colletions内的一个类 可以理解为一个简单的计数 import collections str1=['a','a','b','d'] m=collections.Counte ...

随机推荐

  1. Swift中可选类型(Optional)的用法 以及? 和 ! 的区别 (转载博客,知识分享)

    本文转载自:代码手工艺人的博客,原文名称:Swift之 ? 和 ! Swift语言使用var定义变量,但和别的语言不同,Swift里不会自动给变量赋初始值,也就是说变量不会有默认值,所以要求使用变量之 ...

  2. Linux下ln链接命令详解

    ln是linux中又一个非常重要命令,它的功能是为某一个文件在另外一个位置建立一个不同的链接,这个命令最常用的参数是-s,具体用法是:ln –s 源文件 目标文件. 当我们需要在不同的目录,用到相同的 ...

  3. 利用js获取时间并输出值

    <!DOCTYPE html><html lang="en"><head> <meta charset="UTF-8" ...

  4. 封装cookie组件

    var Cookie = { // 读取 get: function(name){ var cookieStr = "; "+document.cookie+"; &qu ...

  5. Grnymotion模拟器和Android真机访问PC端Tomcat下的应用

    最近因为要学安卓与服务器交互的知识,所以必须要让android程序能访问一个测试服务器.所以我就考虑让真机或者模拟器访问PC端的Tomcat或者Apache服务. 在介绍步骤之前,有必要说点基础的.我 ...

  6. Java环境配置出现的问题及解决办法

    我使用的安装文件,从博客园一个童鞋那里找到的,连接是http://www.cnblogs.com/SelectError/p/3205582.html#commentform 开发基础环境,版本为Ja ...

  7. Windows Open with Sublime Text

    Windows Registry Editor Version 5.00 [HKEY_CLASSES_ROOT\*\shell\Open with Sublime Text] "Icon&q ...

  8. Rewrite的QSA是什么意思?

    原版的英文: When the replacement URI contains a query string, the default behavior of RewriteRule is to d ...

  9. 运行在TQ2440开发板上以及X86平台上的linux内核编译

    一.运行在TQ2440开发板上的linux内核编译 1.获取源码并解压 直接使用天嵌移植好的“linux-2.6.30.4_20100531.tar.bz2”源码包. 解压(天嵌默认解压到/opt/E ...

  10. 持续集成之戏说Check-in Dance

    尽管Thoughtworks的首席科学家Martion folwer 为“持续集成 ” 下了定义,但由于自身背景与经历的不同,每个人对其都有不同的理解.从狭义上讲,持续集成可以认为是一种基于某种或者某 ...