KMP Demo

The key to KMP is building a lookup table that records how much of the string's prefix matches a postfix (suffix). The value at an index is the maximum length of a substring that exists as both a prefix and a postfix. As a prefix, this substring must start at index 0; as a postfix, it must end at the current index.

For example, take the string "ababc".

The KMP table will look like this:
 a  b  a  b  c
-1  0  1  2  0
(Note: a substring is never matched with itself, so index 0 is skipped and marked -1.)
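
To make the definition concrete, here is a minimal sketch that fills the table straight from the definition by trying every candidate length. The names `BorderTableByDefinition` and `build` are made up for this illustration, and it stores 0 at index 0 where the table above uses -1 as a "skipped" marker. It is deliberately slow and only meant to show what each cell means, not how KMP actually builds the table.

```java
import java.util.Arrays;

class BorderTableByDefinition {
    // table[i] = length of the longest substring that is both a prefix
    // (starting at index 0) and a postfix (ending at index i) of s[0..i],
    // excluding s[0..i] itself. Brute force, for illustration only.
    static int[] build(String s) {
        int[] table = new int[s.length()];
        for (int i = 1; i < s.length(); i++) {
            // Try the longest candidate first; len <= i excludes the whole string.
            for (int len = i; len > 0; len--) {
                String prefix = s.substring(0, len);
                String postfix = s.substring(i - len + 1, i + 1);
                if (prefix.equals(postfix)) {
                    table[i] = len;
                    break;
                }
            }
        }
        return table;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(build("ababc"))); // prints [0, 0, 1, 2, 0]
    }
}
```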

So how does this table help us find a match faster?

Well, the answer is that if we try to match the char after the postfix with the target string and fail, we can smartly shift the string so that the matching prefix takes the place of the postfix. Then we try to match the char after the prefix with the same char in the target, without re-checking anything before it.

Take the string above as an example.

Now we try to match the string "ababc" against "abababc".

Initially the two strings are aligned as below:
 0 1 2 3 4 5 6
 a b a b a b c (string x)
 a b a b c     (string y)
-1 0 1 2       (table for y)

We find that the chars at index 4 do not match ('a' in x, 'c' in y), so we use the lookup table to shift string y wisely.

We find table[3] = 2, which means the last 2 matched chars of y are the same as its first 2 chars. So we can shift string y rightward by 4 - 2 = 2 and still have the same, but shorter, prefix matched before index 4, like this:
 0 1 2 3 4 5 6
 a b a b a b c (string x)
     a b a b c (string y)
    -1 0 1 2

If there is a long gap between the prefix and the postfix, this shift can save us a lot of time. The brute-force way cannot do that because it has no information about the string; it has to compare every possible pair of chars. In KMP, we know the structure of string y, so we can move smartly: we jump directly to the next possible match and discard the useless pairs of chars.
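
The shift can be sketched in a few lines of Java. This is only an illustration under some assumptions: the names `ShiftOnMismatch`, `shift` and `firstMatch` are invented here, the table is passed in by hand, and it stores 0 at index 0 instead of the -1 marker. It tracks where y is aligned in x and moves that alignment by `j - table[j - 1]` after a mismatch at index j of y.

```java
class ShiftOnMismatch {
    // How far y moves to the right after a mismatch at y[j].
    static int shift(int[] table, int j) {
        // Nothing matched yet: move one step. Otherwise keep the
        // table[j - 1] chars that still match and skip the rest.
        return j == 0 ? 1 : j - table[j - 1];
    }

    // Returns the first index of x at which y occurs, or -1 if there is none.
    static int firstMatch(String x, String y, int[] table) {
        int start = 0; // index in x where y is currently aligned
        int j = 0;     // number of chars of y matched at this alignment
        while (start + y.length() <= x.length()) {
            while (j < y.length() && x.charAt(start + j) == y.charAt(j)) {
                j++;
            }
            if (j == y.length()) {
                return start;
            }
            int s = shift(table, j);
            System.out.println("mismatch at y[" + j + "], shift by " + s);
            start += s;
            // These chars are already known to match after the shift.
            j = (j == 0) ? 0 : table[j - 1];
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] table = {0, 0, 1, 2, 0}; // table for "ababc"
        System.out.println(firstMatch("abababc", "ababc", table)); // prints 2
    }
}
```

For the example above, the only mismatch is at y[4], the shift is 2, and the comparison resumes at index 4 of x without going back.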

We are almost done with KMP, but we still have one special case that needs to be taken care of.

Say now we have an input like this:
 0 1 2 3 4 5
 a a b a a a
-1 1 0 1 2

How should we fill in the next cell (index 5) of the KMP table for this string?

Say the pointer into the prefix is 'x', which is at index 2 now, and the pointer into the postfix is 'y', which is at index 5 now. We need to match the 'b' pointed to by x with the 'a' pointed to by y. It is an unmatched pair, so how should we update the cell?

Well, we should not just reset it to 0; that would make us skip the valid shorter matching substring "aa".

What we do instead is shorten the matched substring (here by 1 unit) and try to match the shorter substring "aa". This is done by moving pointer x to the index recorded in table[indexOf(x) - 1] while pointer y stays still. Here table[1] = 1, so x moves to index 1, the 'a' there matches the 'a' at y, and the cell at index 5 becomes 2. This works because, by following the values in the KMP table, we can always be sure that the earlier parts of the prefix and postfix still match even after we have shortened them, so we only need to care about the char right after the matched part in the prefix and the postfix.
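
Here is a small sketch of that fallback step, printing each time pointer x moves back. The class name `FallbackTrace` and the method `build` are made up for this illustration, and index 0 holds 0 rather than the -1 marker.

```java
import java.util.Arrays;

class FallbackTrace {
    static int[] build(String s) {
        int[] table = new int[s.length()];
        int x = 0; // pointer into the prefix, also the current matched length
        for (int y = 1; y < s.length(); y++) {
            // On a mismatch, fall back to a shorter matched prefix; y stays still.
            while (x > 0 && s.charAt(x) != s.charAt(y)) {
                System.out.println("y=" + y + ": x falls back from " + x + " to " + table[x - 1]);
                x = table[x - 1];
            }
            if (s.charAt(x) == s.charAt(y)) {
                x++; // the match grows by one char
            }
            table[y] = x;
        }
        return table;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(build("aabaaa"))); // prints [0, 1, 0, 1, 2, 2]
    }
}
```

For "aabaaa" the fallback happens twice: at y = 2 (x goes from 1 to 0) and at y = 5 (x goes from 2 to 1, after which the chars match and the cell becomes 2).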

Code: [Java]

// Java program for implementation of the KMP pattern
// searching algorithm
class KMP_String_Matching {
	void KMPSearch(String pat, String txt)
	{
		int M = pat.length();
		int N = txt.length();

		// An empty pattern has nothing to search for
		// (this also avoids pat.charAt(0) failing below)
		if (M == 0)
			return;

		// lps[] holds the longest prefix suffix values for the pattern
		int[] lps = new int[M];
		int j = 0; // index for pat

		// Preprocess the pattern (calculate the lps[] array)
		computeLPSArray(pat, M, lps);

		int i = 0; // index for txt
		while (i < N) {
			if (pat.charAt(j) == txt.charAt(i)) {
				j++;
				i++;
			}
			if (j == M) {
				System.out.println("Found pattern at index " + (i - j));
				// keep going to look for further matches
				j = lps[j - 1];
			}
			// mismatch after j matches
			else if (i < N && pat.charAt(j) != txt.charAt(i)) {
				// The first lps[j-1] characters of pat are known to
				// match already, so do not compare them again
				if (j != 0)
					j = lps[j - 1];
				else
					i = i + 1;
			}
		}
	}

	void computeLPSArray(String pat, int M, int[] lps)
	{
		// length of the previous longest prefix suffix
		int len = 0;
		int i = 1;
		lps[0] = 0; // lps[0] is always 0

		// the loop calculates lps[i] for i = 1 to M-1
		while (i < M) {
			if (pat.charAt(i) == pat.charAt(len)) {
				len++;
				lps[i] = len;
				i++;
			}
			else // (pat[i] != pat[len])
			{
				// This is tricky. Consider the example
				// AAACAAAA and i = 7. The idea is similar
				// to the search step.
				if (len != 0) {
					// fall back to a shorter prefix;
					// note that we do not increment i here
					len = lps[len - 1];
				}
				else // if (len == 0)
				{
					lps[i] = len;
					i++;
				}
			}
		}
	}

	// Driver program to test the above function
	public static void main(String[] args)
	{
		String txt = "ABABDABACDABABCABAB";
		String pat = "ABABCABAB";
		new KMP_String_Matching().KMPSearch(pat, txt);
	}
}

// This code has been contributed by Amit Khandelwal.

Reference:

https://www.geeksforgeeks.org/java-program-for-kmp-algorithm-for-pattern-searching-2/

https://leetcode.com/problems/shortest-palindrome/discuss/60113/Clean-KMP-solution-with-super-detailed-explanation
