Oh, no! You have just completed  a lengthy  document when you have an  unfortu-
nate Find/Replace mishap. You have accidentally removed all spaces, punctuation,
and capitalization in the document. A sentence like "I reset the computer. It still
didn't boot!" would become "iresetthecomputeritstilldidntboot". You figure that you
can add back in the punctation and capitalization later, once you get the individual
words properly separated. Most of the words will be in a dictionary, but some strings,
like proper names, will not.
Given a dictionary (a list of words), design an algorithm to find the optimal way of
"unconcatenating" a sequence of words. In this case, "optimal" is defined to be the
parsing which minimizes the number of unrecognized sequences of characters.
For example, the string "jesslookedjustliketimherbrother" would be optimally parsed
as "JESS looked just like TIM her brother". This parsing has seven unrecognized char-
acters, which we have capitalized for clarity.

这是CareerCup Chapter 17的第14题,我没怎么看CareerCup上的解法,但感觉这道题跟Word Break, Palindrome Partition II很像,都是有一个dictionary, 可以用一维DP来做,用一个int[] res = new int[len+1]; res[i] refers to minimized # of unrecognized chars in first i chars, res[0]=0, res[len]即为所求。

有了维护量,现在需要考虑转移方程,如下:

int unrecogNum = dict.contains(s.substring(j, i))? 0 : i-j; //看index从j到i-1的substring在不在dictionary里,如果不在,unrecogNum=j到i-1的char数
res[i] = Math.min(res[i], res[j]+unrecogNum);

亲测,我使用的case都过了,只是不知道有没有不过的Corner Case:

 package fib;

 import java.util.Arrays;
import java.util.HashSet;
import java.util.Set; public class unconcatenating {
public int optway(String s, Set<String> dict) {
if (s==null || s.length()==0) return 0;
int len = s.length();
if (dict.isEmpty()) return len;
int[] res = new int[len+1]; // res[i] refers to minimized # of unrecognized chars in first i chars
Arrays.fill(res, Integer.MAX_VALUE);
res[0] = 0;
for (int i=1; i<=len; i++) {
for (int j=0; j<i; j++) {
String str = s.substring(j, i);
int unrecogNum = dict.contains(str)? 0 : i-j;
res[i] = Math.min(res[i], res[j]+unrecogNum);
}
}
return res[len];
} public static void main(String[] args) {
unconcatenating example = new unconcatenating();
Set<String> dict = new HashSet<String>();
dict.add("reset");
dict.add("the");
dict.add("computer");
dict.add("it");
dict.add("still");
dict.add("didnt");
dict.add("boot");
int result = example.optway("johnresetthecomputeritdamnstilldidntboot", dict);
System.out.print("opt # of unrecognized chars is ");
System.out.println(result);
} }

output是:opt # of unrecognized chars is 8

CareerCup: 17.14 minimize unrecognized characters的更多相关文章

  1. [CareerCup] 17.14 Unconcatenate Words 断词

    17.14 Oh, no! You have just completed a lengthy document when you have an unfortunate Find/Replace m ...

  2. [CareerCup] 17.6 Sort Array 排列数组

    17.6 Given an array of integers, write a method to find indices m and n such that if you sorted elem ...

  3. [CareerCup] 17.2 Tic Tac Toe 井字棋游戏

    17.2 Design an algorithm to figure out if someone has won a game oftic-tac-toe. 这道题让我们判断玩家是否能赢井字棋游戏, ...

  4. [CareerCup] 17.13 BiNode 双向节点

    17.13 Consider a simple node-like data structure called BiNode, which has pointers to two other node ...

  5. [CareerCup] 17.12 Sum to Specific Value 和为特定数

    17.12 Design an algorithm to find all pairs of integers within an array which sum to a specified val ...

  6. [CareerCup] 17.11 Rand7 and Rand5 随机生成数字

    17.11 Implement a method rand7() given rand5(). That is, given a method that generates a random numb ...

  7. [CareerCup] 17.10 Encode XML 编码XML

    17.10 Since XML is very verbose, you are given a way of encoding it where each tag gets mapped to a ...

  8. [CareerCup] 17.9 Word Frequency in a Book 书中单词频率

    17.9 Design a method to find the frequency of occurrences of any given word in a book. 这道题让我们找书中单词出现 ...

  9. [CareerCup] 17.8 Contiguous Sequence with Largest Sum 连续子序列之和最大

    17.8 You are given an array of integers (both positive and negative). Find the contiguous sequence w ...

随机推荐

  1. Progressive enhancement

    https://en.wikipedia.org/wiki/Progressive_enhancement Progressive enhancement is a strategy for web ...

  2. mysql和oracle 分页查询(转)

    最近简单的对oracle,mysql,sqlserver2005的数据分页查询作了研究,把各自的查询的语句贴出来供大家学习..... (一). mysql的分页查询 mysql的分页查询是最简单的,借 ...

  3. git 第一次初始化

    Command line instructions Git global setup git config --global user.name "{名字}({工号})" git ...

  4. oracle EBS 资产定义

    一.资产定义也就是江项目任务上的特定(能生成资产的)物料按照一定格式生成资产信息,其中每个独立物料生成一条资产,具体操作步骤如下: 1.省本部库存超级用户系统内生成领料单.审批领料单.最后进行出库处理 ...

  5. ASP.NET WebForm与ASP.NET MVC的不同点

    ASP.NET WebForm ASP.NET MVC ASP.NET Web Form 遵循传统的事件驱动开发模型 ASP.NET MVC是轻量级的遵循MVC模式的请求处理响应的基本开发模型 ASP ...

  6. How To Install Tinc and Set Up a Basic VPN on Ubuntu 14.04

    Introduction In this tutorial, we will go over how to use Tinc, an open source Virtual Private Netwo ...

  7. 获得输入框的文本document.getElementById('id').value;

    <input id="demo" type="text" value="" > x=document.getElementByI ...

  8. Ubuntu 14.04安装OpenCV 3.1

    从OpenCV官网上下载OpenCV官网上下载OpenCV的未编译源代码: 点击这里 国内很多网络打开OpenCV官网速度缓慢,可以点击如下地址直接从GitHub上下载OpenCV 3.1的源代码 下 ...

  9. SQL server 2012

    MICROSOFT SQL SERVER 2012 企业核心版激活码序列号: FH666-Y346V-7XFQ3-V69JM-RHW28MICROSOFT SQL SERVER 2012 商业智能版激 ...

  10. shell脚本编程-使用结构化命令(if/else)(转)

    11.1 使用if-then语句 格式如下 if语句会执行if行定义的那个命令,如果该命令的退出状态码是0,则then部分的语句就会执行,其他值,则不会   1 2 3 4 if command th ...