Oh, no! You have just completed  a lengthy  document when you have an  unfortu-
nate Find/Replace mishap. You have accidentally removed all spaces, punctuation,
and capitalization in the document. A sentence like "I reset the computer. It still
didn't boot!" would become "iresetthecomputeritstilldidntboot". You figure that you
can add back in the punctation and capitalization later, once you get the individual
words properly separated. Most of the words will be in a dictionary, but some strings,
like proper names, will not.
Given a dictionary (a list of words), design an algorithm to find the optimal way of
"unconcatenating" a sequence of words. In this case, "optimal" is defined to be the
parsing which minimizes the number of unrecognized sequences of characters.
For example, the string "jesslookedjustliketimherbrother" would be optimally parsed
as "JESS looked just like TIM her brother". This parsing has seven unrecognized char-
acters, which we have capitalized for clarity.

这是CareerCup Chapter 17的第14题,我没怎么看CareerCup上的解法,但感觉这道题跟Word Break, Palindrome Partition II很像,都是有一个dictionary, 可以用一维DP来做,用一个int[] res = new int[len+1]; res[i] refers to minimized # of unrecognized chars in first i chars, res[0]=0, res[len]即为所求。

有了维护量,现在需要考虑转移方程,如下:

int unrecogNum = dict.contains(s.substring(j, i))? 0 : i-j; //看index从j到i-1的substring在不在dictionary里,如果不在,unrecogNum=j到i-1的char数
res[i] = Math.min(res[i], res[j]+unrecogNum);

亲测,我使用的case都过了,只是不知道有没有不过的Corner Case:

 package fib;

 import java.util.Arrays;
import java.util.HashSet;
import java.util.Set; public class unconcatenating {
public int optway(String s, Set<String> dict) {
if (s==null || s.length()==0) return 0;
int len = s.length();
if (dict.isEmpty()) return len;
int[] res = new int[len+1]; // res[i] refers to minimized # of unrecognized chars in first i chars
Arrays.fill(res, Integer.MAX_VALUE);
res[0] = 0;
for (int i=1; i<=len; i++) {
for (int j=0; j<i; j++) {
String str = s.substring(j, i);
int unrecogNum = dict.contains(str)? 0 : i-j;
res[i] = Math.min(res[i], res[j]+unrecogNum);
}
}
return res[len];
} public static void main(String[] args) {
unconcatenating example = new unconcatenating();
Set<String> dict = new HashSet<String>();
dict.add("reset");
dict.add("the");
dict.add("computer");
dict.add("it");
dict.add("still");
dict.add("didnt");
dict.add("boot");
int result = example.optway("johnresetthecomputeritdamnstilldidntboot", dict);
System.out.print("opt # of unrecognized chars is ");
System.out.println(result);
} }

output是:opt # of unrecognized chars is 8

CareerCup: 17.14 minimize unrecognized characters的更多相关文章

  1. [CareerCup] 17.14 Unconcatenate Words 断词

    17.14 Oh, no! You have just completed a lengthy document when you have an unfortunate Find/Replace m ...

  2. [CareerCup] 17.6 Sort Array 排列数组

    17.6 Given an array of integers, write a method to find indices m and n such that if you sorted elem ...

  3. [CareerCup] 17.2 Tic Tac Toe 井字棋游戏

    17.2 Design an algorithm to figure out if someone has won a game oftic-tac-toe. 这道题让我们判断玩家是否能赢井字棋游戏, ...

  4. [CareerCup] 17.13 BiNode 双向节点

    17.13 Consider a simple node-like data structure called BiNode, which has pointers to two other node ...

  5. [CareerCup] 17.12 Sum to Specific Value 和为特定数

    17.12 Design an algorithm to find all pairs of integers within an array which sum to a specified val ...

  6. [CareerCup] 17.11 Rand7 and Rand5 随机生成数字

    17.11 Implement a method rand7() given rand5(). That is, given a method that generates a random numb ...

  7. [CareerCup] 17.10 Encode XML 编码XML

    17.10 Since XML is very verbose, you are given a way of encoding it where each tag gets mapped to a ...

  8. [CareerCup] 17.9 Word Frequency in a Book 书中单词频率

    17.9 Design a method to find the frequency of occurrences of any given word in a book. 这道题让我们找书中单词出现 ...

  9. [CareerCup] 17.8 Contiguous Sequence with Largest Sum 连续子序列之和最大

    17.8 You are given an array of integers (both positive and negative). Find the contiguous sequence w ...

随机推荐

  1. enhance convenience rather than contribute to the fundamental power of the language

    Computer Science An Overview _J. Glenn Brookshear _11th Edition Universal Programming Languages In  ...

  2. Oracle 安装 INS-30131错误。

    需要学习SDE配置相关知识,其中Oracle数据库安装遇到错误INS-30131,虽然未能最终解决,但找到了初步的思路,记录下来给大家提供参考.下文对很多知识的理解可能存在错误或不够精准,仅作参考. ...

  3. cookie 操作

    //创建并赋值 重新赋值也是这样操作 document.cookie="userId=828"; document.cookie="userName=hulk" ...

  4. ssi服务器端指令

    SSI使用详解 你是否曾经或正在为如何能够在最短的时间内完成对一个包含上千个页面的网站的修改而苦恼?那么可以看一下本文的介绍,或许能够对你有所帮助.什么是SSI?SSI是英文Server Side I ...

  5. Windows下mysql自动备份的最佳方案

    网上有很多关于window下Mysql自动备份的方法,其实不乏一些不好的地方和问题,现总结出一个最好的方法供大家参考: 新建一个记事本,然后重命名为: mysql_backup.bat 然后单击右键选 ...

  6. VB动态添加WebBrowser控件,并拦截弹出窗口(不用引用任何组件)

    新建空白窗体,然后粘帖下面代码: Option ExplicitPublic WithEvents br As VBControlExtender Private Sub br_ObjectEvent ...

  7. 蓝牙BLE MTU规则与约定

    1. 问题引言: 想在gatt client上(一般是手机上)传输长一点的数据给gatt server(一般是一个Bluetooth smart设备,即只有BLE功能的设备),但通过 writeCha ...

  8. 视频播放器开发中遇到的一些小问题MPMoviePlayerController

    1 开发环境是 xcode6  ipad3真机 ios8.1.1越狱 需要添加以下代码  ,否则真机测试没有外音,只有耳机   NSError *setCategoryError = nil;    ...

  9. H264关于RTP协议的实现

    完整的C/S架构的基于RTP/RTCP的H.264视频传输方案.此方案中,在服务器端和客户端分别进行了功能模块设计. 服务器端:RTP封装模块主要是对H.264码流进行打包封装:RTCP分析模块负责产 ...

  10. http://blog.csdn.net/fw0124/article/details/48280083

    http://blog.csdn.net/fw0124/article/details/48280083