Oh, no! You have just completed  a lengthy  document when you have an  unfortu-
nate Find/Replace mishap. You have accidentally removed all spaces, punctuation,
and capitalization in the document. A sentence like "I reset the computer. It still
didn't boot!" would become "iresetthecomputeritstilldidntboot". You figure that you
can add back in the punctation and capitalization later, once you get the individual
words properly separated. Most of the words will be in a dictionary, but some strings,
like proper names, will not.
Given a dictionary (a list of words), design an algorithm to find the optimal way of
"unconcatenating" a sequence of words. In this case, "optimal" is defined to be the
parsing which minimizes the number of unrecognized sequences of characters.
For example, the string "jesslookedjustliketimherbrother" would be optimally parsed
as "JESS looked just like TIM her brother". This parsing has seven unrecognized char-
acters, which we have capitalized for clarity.

这是CareerCup Chapter 17的第14题,我没怎么看CareerCup上的解法,但感觉这道题跟Word Break, Palindrome Partition II很像,都是有一个dictionary, 可以用一维DP来做,用一个int[] res = new int[len+1]; res[i] refers to minimized # of unrecognized chars in first i chars, res[0]=0, res[len]即为所求。

有了维护量,现在需要考虑转移方程,如下:

int unrecogNum = dict.contains(s.substring(j, i))? 0 : i-j; //看index从j到i-1的substring在不在dictionary里,如果不在,unrecogNum=j到i-1的char数
res[i] = Math.min(res[i], res[j]+unrecogNum);

亲测,我使用的case都过了,只是不知道有没有不过的Corner Case:

 package fib;

 import java.util.Arrays;
import java.util.HashSet;
import java.util.Set; public class unconcatenating {
public int optway(String s, Set<String> dict) {
if (s==null || s.length()==0) return 0;
int len = s.length();
if (dict.isEmpty()) return len;
int[] res = new int[len+1]; // res[i] refers to minimized # of unrecognized chars in first i chars
Arrays.fill(res, Integer.MAX_VALUE);
res[0] = 0;
for (int i=1; i<=len; i++) {
for (int j=0; j<i; j++) {
String str = s.substring(j, i);
int unrecogNum = dict.contains(str)? 0 : i-j;
res[i] = Math.min(res[i], res[j]+unrecogNum);
}
}
return res[len];
} public static void main(String[] args) {
unconcatenating example = new unconcatenating();
Set<String> dict = new HashSet<String>();
dict.add("reset");
dict.add("the");
dict.add("computer");
dict.add("it");
dict.add("still");
dict.add("didnt");
dict.add("boot");
int result = example.optway("johnresetthecomputeritdamnstilldidntboot", dict);
System.out.print("opt # of unrecognized chars is ");
System.out.println(result);
} }

output是:opt # of unrecognized chars is 8

CareerCup: 17.14 minimize unrecognized characters的更多相关文章

  1. [CareerCup] 17.14 Unconcatenate Words 断词

    17.14 Oh, no! You have just completed a lengthy document when you have an unfortunate Find/Replace m ...

  2. [CareerCup] 17.6 Sort Array 排列数组

    17.6 Given an array of integers, write a method to find indices m and n such that if you sorted elem ...

  3. [CareerCup] 17.2 Tic Tac Toe 井字棋游戏

    17.2 Design an algorithm to figure out if someone has won a game oftic-tac-toe. 这道题让我们判断玩家是否能赢井字棋游戏, ...

  4. [CareerCup] 17.13 BiNode 双向节点

    17.13 Consider a simple node-like data structure called BiNode, which has pointers to two other node ...

  5. [CareerCup] 17.12 Sum to Specific Value 和为特定数

    17.12 Design an algorithm to find all pairs of integers within an array which sum to a specified val ...

  6. [CareerCup] 17.11 Rand7 and Rand5 随机生成数字

    17.11 Implement a method rand7() given rand5(). That is, given a method that generates a random numb ...

  7. [CareerCup] 17.10 Encode XML 编码XML

    17.10 Since XML is very verbose, you are given a way of encoding it where each tag gets mapped to a ...

  8. [CareerCup] 17.9 Word Frequency in a Book 书中单词频率

    17.9 Design a method to find the frequency of occurrences of any given word in a book. 这道题让我们找书中单词出现 ...

  9. [CareerCup] 17.8 Contiguous Sequence with Largest Sum 连续子序列之和最大

    17.8 You are given an array of integers (both positive and negative). Find the contiguous sequence w ...

随机推荐

  1. PHP学习(二)----数组

    数组: 首先说一下对PHP中的理解,建立一个好的理解模型还是很关键的: 1.PHP中的数组实际上可以理解为键值对,key=>value;而对于key的取值,可以是string/integer;v ...

  2. 会php不回缓存行吗?多重实现

    1.普遍缓存技术: 数据缓存:这里所说的数据缓存是指数据库查询PHP缓存机制,每次访问页面的时候,都会先检测相应的缓存数据是否存在,如果不存在,就连接数据库,得到数据,并把查询结果序列化后保存到文件中 ...

  3. 【转】Mysql中的排序规则utf8_unicode_ci、utf8_general_ci的区别总结

    Mysql中utf8_general_ci与utf8_unicode_ci有什么区别呢?在编程语言中,通常用unicode对中文字符做处理,防止出现乱码,那么在MySQL里,为什么大家都使用utf8_ ...

  4. excel15个技巧

    自动定时保存Excel中的文件 点击“工具”菜单“自动保存”项,设置自动保存文件夹的间隔时间.如果在“工具”菜单下没有“自动保存”菜单项,那么执行“工具”菜单下“加载宏…”选上“自动保存”,“确定”. ...

  5. Bash 快捷键大全

    快捷键的一些说明: CTRL=C:这个键是指PC键盘上的Ctrl键 ALT=M:这个键是PC键盘上的ALT键,如果你键盘上没有这个键,可以尝试使用ESC键代替 SHIFT=S:此键是PC上的Shift ...

  6. percona

     三.      mysql安装 安装 Percona Server:vi /etc/yum.repos.d/Percona.repo[percona]name = CentOS $releaseve ...

  7. pomelo架构概览

    pomelo之所以简单易用.功能全面,并且具有高可扩展性.可伸缩性等特点,这与它的技术选型和方案设计是密不可分的.在研究大量游戏引擎设计思路基础上,结合以往游戏开发的经验,确定了pomelo框架的设计 ...

  8. Dreamweaver代码提示快捷键冲突

    平时习惯使用Ctrl+空格进行输入法的切换,然后Dreamweaver代码提示的快捷键也是Ctrl+空格,那么只好把Dw的快捷键改掉吧, 选择菜单:编辑->快捷键 复制副本 选择编辑 找到&qu ...

  9. 从高版本JDK换成低版本JDK报错Unsupported major.minor version 52.0

    ava.lang.UnsupportedClassVersionError: PR/Sort : Unsupported major.minor version 52.0这个错误是由于高版本的java ...

  10. Protocol and Delegate

    为什么使用委托? 答:比如,我上班的工作主要内容包括 (1)写代码(2)写文档(3)测试程序(4)接电话(5)会见客户 (1)(2)我自己全权负责,但是后面(3)(4)(5)我不想或者不方便自己做,所 ...