CareerCup: 17.14 minimize unrecognized characters
Oh, no! You have just completed a lengthy document when you have an unfortu-
nate Find/Replace mishap. You have accidentally removed all spaces, punctuation,
and capitalization in the document. A sentence like "I reset the computer. It still
didn't boot!" would become "iresetthecomputeritstilldidntboot". You figure that you
can add back in the punctation and capitalization later, once you get the individual
words properly separated. Most of the words will be in a dictionary, but some strings,
like proper names, will not.
Given a dictionary (a list of words), design an algorithm to find the optimal way of
"unconcatenating" a sequence of words. In this case, "optimal" is defined to be the
parsing which minimizes the number of unrecognized sequences of characters.
For example, the string "jesslookedjustliketimherbrother" would be optimally parsed
as "JESS looked just like TIM her brother". This parsing has seven unrecognized char-
acters, which we have capitalized for clarity.
这是CareerCup Chapter 17的第14题,我没怎么看CareerCup上的解法,但感觉这道题跟Word Break, Palindrome Partition II很像,都是有一个dictionary, 可以用一维DP来做,用一个int[] res = new int[len+1]; res[i] refers to minimized # of unrecognized chars in first i chars, res[0]=0, res[len]即为所求。
有了维护量,现在需要考虑转移方程,如下:
int unrecogNum = dict.contains(s.substring(j, i))? 0 : i-j; //看index从j到i-1的substring在不在dictionary里,如果不在,unrecogNum=j到i-1的char数
res[i] = Math.min(res[i], res[j]+unrecogNum);
亲测,我使用的case都过了,只是不知道有没有不过的Corner Case:
package fib; import java.util.Arrays;
import java.util.HashSet;
import java.util.Set; public class unconcatenating {
public int optway(String s, Set<String> dict) {
if (s==null || s.length()==0) return 0;
int len = s.length();
if (dict.isEmpty()) return len;
int[] res = new int[len+1]; // res[i] refers to minimized # of unrecognized chars in first i chars
Arrays.fill(res, Integer.MAX_VALUE);
res[0] = 0;
for (int i=1; i<=len; i++) {
for (int j=0; j<i; j++) {
String str = s.substring(j, i);
int unrecogNum = dict.contains(str)? 0 : i-j;
res[i] = Math.min(res[i], res[j]+unrecogNum);
}
}
return res[len];
} public static void main(String[] args) {
unconcatenating example = new unconcatenating();
Set<String> dict = new HashSet<String>();
dict.add("reset");
dict.add("the");
dict.add("computer");
dict.add("it");
dict.add("still");
dict.add("didnt");
dict.add("boot");
int result = example.optway("johnresetthecomputeritdamnstilldidntboot", dict);
System.out.print("opt # of unrecognized chars is ");
System.out.println(result);
} }
output是:opt # of unrecognized chars is 8
CareerCup: 17.14 minimize unrecognized characters的更多相关文章
- [CareerCup] 17.14 Unconcatenate Words 断词
17.14 Oh, no! You have just completed a lengthy document when you have an unfortunate Find/Replace m ...
- [CareerCup] 17.6 Sort Array 排列数组
17.6 Given an array of integers, write a method to find indices m and n such that if you sorted elem ...
- [CareerCup] 17.2 Tic Tac Toe 井字棋游戏
17.2 Design an algorithm to figure out if someone has won a game oftic-tac-toe. 这道题让我们判断玩家是否能赢井字棋游戏, ...
- [CareerCup] 17.13 BiNode 双向节点
17.13 Consider a simple node-like data structure called BiNode, which has pointers to two other node ...
- [CareerCup] 17.12 Sum to Specific Value 和为特定数
17.12 Design an algorithm to find all pairs of integers within an array which sum to a specified val ...
- [CareerCup] 17.11 Rand7 and Rand5 随机生成数字
17.11 Implement a method rand7() given rand5(). That is, given a method that generates a random numb ...
- [CareerCup] 17.10 Encode XML 编码XML
17.10 Since XML is very verbose, you are given a way of encoding it where each tag gets mapped to a ...
- [CareerCup] 17.9 Word Frequency in a Book 书中单词频率
17.9 Design a method to find the frequency of occurrences of any given word in a book. 这道题让我们找书中单词出现 ...
- [CareerCup] 17.8 Contiguous Sequence with Largest Sum 连续子序列之和最大
17.8 You are given an array of integers (both positive and negative). Find the contiguous sequence w ...
随机推荐
- nginx下使用memcache
nginx配置支持memcache,但不支持写,支持读,所以读取部分由程序设置,整个代码如下nginx的server段配置如下:#将静态文件放入memcachelocation ~* \.(gif|j ...
- 函数式编程Map()&Reduce()
.forEach():每个元素都调用指定函数,可传三个参数:数组元素丶元素索引丶数组本身丶 , , , , , , , ]; a.forEach(function(v,i,a){a[i]=v+;}); ...
- 数据库留言板例题:session和cookie区别
session和cookie区别: <?php session_start(); //session_start();必须写在所有的php代码前边 ?> <!DOCTYPE html ...
- freebsd 禁用root登录ssh并给普通用户登录权限
转自http://www.linux521.com/2009/system/200904/2021.html http://www.myhack58.com/Article/48/67/2011/30 ...
- readonly=“readonly”与readonly=“true”
<input id="u" readonly /> <input id="u" readonly="readonly" / ...
- Maven实战(七)settings.xml相关配置
一.简介 settings.xml对于maven来说相当于全局性的配置,用于所有的项目,当Maven运行过程中的各种配置,例如pom.xml,不想绑定到一个固定的project或者要分配给用户时,我们 ...
- Java学习-008-判断文件类型实例
此文源码主要为应用 Java 如何判断文件类型的源码及其测试源码.若有不足之处,敬请大神指正,不胜感激!源代码测试通过日期为:2015-2-2 23:02:00,请知悉. Java 判断文件类型源码如 ...
- Java学习-002-Java初识
此文主要讲述什么是 Java,以及 Java 常识性知识,方便亲们进一步了解 Java 语言相关的常识. 一.Java 概述 Java 语言是美国 Sun Microsystems 公司于 1995 ...
- http://localhost/certsrv 错误找不到页面解决方法
http://localhost/certsrv 错误找不到页面解决方法 最近公司需要后台启动安全证书,可安装了“Active Directory证书服务” 后,http://localhost/ce ...
- Advanced REST client的使用说明
1. 为什么要使用REST Client 在实际企业开发过程中经常会有这样的需求: 1.我当前开发的这个系统是需要调用其他系统的接口,也就是我们需要频繁的测试接口,尝试不同的入参参数去查看返回结果, ...