LintCode-Word Segmentation

Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

Example

Given

s = "lintcode",

dict = ["lint", "code"].

Return true because "lintcode" can be segmented as "lint code".

Analysis:

It is a DP problem. However, we need to use charAt() instead of substring() to optimize speed. Also, we can first check whether each char in s has appeared in dict, if not, then directly return false. (This is used to pass the last test case in LintCode).

Solution:

 public class Solution {

     /**

      * @param s: A string s

      * @param dict: A dictionary of words dict

      */

     public boolean wordSegmentation(String s, Set<String> dict) {

         if (s.length()==0) return true;

         char[] chars = new char[256];

         for (String word : dict)

             for (int i=0;i<word.length();i++)

                 chars[word.charAt(i)]++;

         for (int i = 0;i<s.length();i++)

             if (chars[s.charAt(i)]==0) return false;

         boolean[] d = new boolean[s.length()+1];

         Arrays.fill(d,false);

         d[0] = true;

         for (int i=1;i<=s.length();i++){

         StringBuilder builder = new StringBuilder();

             for (int j=i-1;j>=0;j--){

                 builder.insert(0,s.charAt(j));

                 String cur = builder.toString();

                 if (d[j] && dict.contains(cur)){

                     d[i]=true;

                     break;

                 }

             }

         }

         return d[s.length()];

     }

 }

LintCode-Word Segmentation的更多相关文章

Solution for automatic update of Chinese word segmentation full-text index in NEO4J
Solution for automatic update of Chinese word segmentation full-text index in NEO4J 1. Sample data 2 ...
长短时间记忆的中文分词 (LSTM for Chinese Word Segmentation)
翻译学长的一片论文:Long Short-Term Memory Neural Networks for Chinese Word Segmentation 传统的neural Model for C ...
[LintCode] Word Break
Given a string s and a dictionary of words dict, determine if s can be break into a space-separated ...
[Lintcode]Word Squares(DFS|字符串)
题意略分析 0.如果直接暴力1000^5会TLE,因此考虑剪枝 1.如果当前需要插入第i个单词,其剪枝如下 1.1 其前缀(0~i-1)已经知道,必定在前缀对应的集合中找 – 第一个词填了ball ...
zpar使用方法之Chinese Word Segmentation
第一步在这里: http://people.sutd.edu.sg/~yue_zhang/doc/doc/qs.html 你可以找到这句话, 所以在命令行中分别敲入 make zpar make zp ...
论文阅读及复现 | Effective Neural Solution for Multi-Criteria Word Segmentation
主要思想这篇文章主要是利用多个标准进行中文分词,和之前复旦的那篇文章比,它的方法更简洁,不需要复杂的结构,但比之前的方法更有效. 方法堆叠的LSTM,最上层是CRF. 最底层是字符集的Bi-LST ...
Java——word分词·自定义词库
word: https://github.com/ysc/word word-1.3.1.jar 需要JDK8word-1.2.jar c语言给解析成了“语言”,自定义词库必须为UTF-8 程序一旦运 ...
【中文分词】最大熵马尔可夫模型MEMM
Xue & Shen '2003 [2]用两种序列标注模型--MEMM (Maximum Entropy Markov Model)与CRF (Conditional Random Field ...
【中文分词】二阶隐马尔可夫模型2-HMM
在前一篇中介绍了用HMM做中文分词,对于未登录词(out-of-vocabulary, OOV)有良好的识别效果,但是缺点也十分明显--对于词典中的(in-vocabulary, IV)词却未能很好地 ...
【中文分词】隐马尔可夫模型HMM
Nianwen Xue在<Chinese Word Segmentation as Character Tagging>中将中文分词视作为序列标注问题(sequence labeling ...

随机推荐

Asp.net绑定带层次下拉框（select控件）
1.效果图 2.数据库中表数据结构 3.前台页面 <select id="pid" runat="server" style="width:16 ...
Mysql中IFNULL与IN操作
Mysql IFNULL操作项目中用到的,当SQL查询某个字段为空的时候,查询结果中设置其值为默认值.最笨的方法当然是对查询结果进行处理了,遍历查询结果,当为空的时候,设置其值: 代码如下复制代码 ...
JDBC连接MySQL数据库及示例
JDBC是Sun公司制定的一个可以用Java语言连接数据库的技术. 一.JDBC基础知识 JDBC(Java Data Base Connectivity,java数据库连接)是一 ...
Java--CJDP
was定义,包定义, 1. Java的接口概念进行封装,方便的使用 2. 包定义,Java 中多种包,进行迁移使用,包的导入,例如对数据库的操作Hibernate 3. 配置文件xml和json,对 ...
LLVM language 参考手册（译）（4）
函数(Functions) LLVM函数定义由“define” 关键字,一个可选的链接标识,一个可选的可见性模式,一个可选的DLL存储类别,一个可选的调用约定,一个可选的 unnamed_addr 属 ...
Android开发代码规范
目录 1.命名基本原则 2.命名基本规范 2.1编程基本命名规范 2.2分类命名规范 3.分类命名规范 3.1基本数据类型命名规范 3.2控件命名规范 3.3变量命名规范 3.4整个项目的目录规范化 ...
multiple backgrounds 多重背景
语法缩写如下: background : [background-color] | [background-image] | [background-position][/background-siz ...
Entity Framework Code First 常用方法集成
using System; using System.Collections.Generic; using System.Data.Entity; using System.Linq; using S ...
一个好用的PHP验证码类
分享一个好用的php验证码类,包括调用示例. 说明: 如果不适用指定的字体,那么就用imagestring()函数,如果需要遇到指定的字体,就要用到imagettftext()函数.字体的位置在C盘下 ...
php错误收集
1.Notice: Undefined offset: 1 in F:\www\my\test.php on line 39,原因offset:接下去的数字是出错的数组下标,一般是超出了数组的取值范围 ...

LintCode-Word Segmentation

LintCode-Word Segmentation的更多相关文章

随机推荐

热门专题