poj_3461: KMP Algorithm Analysis

A - Oulipo

Time Limit: 1000MS   Memory Limit: 65536KB   64bit IO Format: %I64d & %I64u

Description

The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book:

Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…

Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces.

So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.

Input

The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:

One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000.

Output

For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.

Sample Input

3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN

Sample Output

1
3
0

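This is a pure KMP problem, and KMP really is a brilliant algorithm. I started typing before I fully understood it and ended up computing the next values for the long string (the text) instead of the word. The key to KMP is the next array, which records the repeated substrings inside the word: on a mismatch we only need to fall back to the corresponding position in the first copy of the repeated substring (the later copies merely repeat the earlier one) and continue matching from there, rather than starting over. This turns what would be a nested loop into a linear-time scan, O(|W| + |T|).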
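For comparison, here is a minimal sketch of the nested-loop approach that KMP replaces. The function name count_naive is made up for illustration and does not appear in the original solution. It tries every start position in the text, so in the worst case (for example, a word of all 'T's against a long run of 'T's) it does about |T| * |W| character comparisons.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Brute-force baseline (hypothetical helper, not part of the original post):
// try every start position in t and compare character by character.
int count_naive(const std::string& w, const std::string& t) {
    assert(!w.empty());  // the problem guarantees |W| >= 1
    int count = 0;
    for (std::size_t start = 0; start + w.size() <= t.size(); ++start) {
        std::size_t k = 0;
        while (k < w.size() && t[start + k] == w[k]) ++k;
        if (k == w.size()) ++count;  // overlapping occurrences are counted too
    }
    return count;
}
```

On the three sample cases it should give the same counts as the KMP solution, which makes it handy as a reference when testing on small inputs.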

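The implementation of KMP has two main steps.

1. Compute the next array.

I assume the string is stored at positions 0 through len-1.

First define next[0] as 0 (some people define it as -1; my habit is 0).

Loop from 1 to len-1.

Let i be the loop variable, and let j be the running value for the next array (initially j=0). When s[i]==s[j], a repeated substring has appeared, so j++.

If they are not equal, there is a mismatch at this position. But note: a mismatch does not mean the current next value is 0. We have to fall back to the previous repeat point that corresponds to the mismatch position,

that is, while (j>0 && s[i]!=s[j]) j=next[j-1]; (This deserves emphasis, because it is exactly where I got tripped up. The mismatch is at j, so we fall back through next[j-1]. Because array indices are one less than the number of characters matched, the value next[j-1] is already the index of the fallback point for the mismatch, so no +1 is needed. If the mismatches continue all the way down, j ends up as 0.)

If the fallback point of a mismatch is hard to picture, let's look at an example: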

ID	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17
str	a	b	a	b	c	a	b	a	b	a	b	a	b	c	a	b	a	b
Next	0	0	1	2	0	1	2	3	4	3	4	3	4	5	6	7	8	9

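Pay particular attention to index 9. Here s[i=9]!=s[j=4], so we look for the fallback point: j=next[j-1=3]=2. In other words, the next value of the character just before the mismatch tells us that the fallback point for the mismatch has index 2. Now s[j=2]==s[i=9], so the comparison succeeds: by falling back we found a position that pairs with the mismatched character. (The other possibility is that the fallback never finds a matching position and j drops to 0, in which case nothing more is done.) Since the pairing succeeded, position 9 extends the prefix that ends just before j=2, so j++ makes j=3, and therefore next[9]=3.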
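The table can be checked mechanically. Below is a small sketch of the next-array computation exactly as described, with next[0]=0 and fallback through next[j-1]. The function name build_next and the use of std::vector are choices made for this illustration, not the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sketch of step 1 (hypothetical helper name): next[i] is the length of the
// longest proper prefix of s[0..i] that is also a suffix of s[0..i].
std::vector<std::size_t> build_next(const std::string& s) {
    std::vector<std::size_t> next(s.size(), 0);
    std::size_t j = 0;  // length of the prefix currently matched
    for (std::size_t i = 1; i < s.size(); ++i) {
        // On a mismatch, fall back to the previous repeat point.
        while (j > 0 && s[i] != s[j]) j = next[j - 1];
        // If the fallback found a pairing position, the prefix grows by one.
        if (s[i] == s[j]) ++j;
        next[i] = j;
    }
    return next;
}
```

Calling build_next("ababcababababcabab") reproduces the Next row of the table, including the fallback from j=4 to j=2 at index 9.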

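2. The matching process

The matching loop runs over the long string (call the loop variable i and the long string ls), while a second variable q, starting at 0, tracks the position in s.

Initially i and q are both 0. If ls[i]==s[q], then q++ and i++.

If they differ, s has to fall back to the fallback point for its mismatch position. If q==0 there is no fallback point, so we simply do i++; otherwise q=next[q-1] (q-1 for the same reason as above).

This repeats for many iterations.

When q==len, a full match has been found, so ans++ (one occurrence of the word has been matched) and q=next[q-1] (same principle: q moves to the position just after the fallback point of the last character and continues matching against ls[i], where i has already advanced by one).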
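The two steps can be put together in a compact, self-contained sketch. This version uses std::string and folds the fallback into a while loop, which is a common equivalent formulation of the if/else described above; the name count_occurrences is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sketch only (hypothetical helper): count possibly overlapping occurrences
// of w in t using the next array with next[0] = 0.
int count_occurrences(const std::string& w, const std::string& t) {
    assert(!w.empty());  // the problem guarantees |W| >= 1

    // Step 1: next array of the word (not of the text).
    std::vector<std::size_t> next(w.size(), 0);
    for (std::size_t i = 1, j = 0; i < w.size(); ++i) {
        while (j > 0 && w[i] != w[j]) j = next[j - 1];
        if (w[i] == w[j]) ++j;
        next[i] = j;
    }

    // Step 2: scan the text once; q is the number of word characters matched.
    int ans = 0;
    std::size_t q = 0;
    for (std::size_t i = 0; i < t.size(); ++i) {
        while (q > 0 && t[i] != w[q]) q = next[q - 1];
        if (t[i] == w[q]) ++q;
        if (q == w.size()) {   // a full occurrence ends at t[i]
            ++ans;
            q = next[q - 1];   // fall back so overlapping matches are found
        }
    }
    return ans;
}
```

The fallback q = next[q - 1] after a full match is what allows overlaps: for the word AZA in AZAZAZA, q drops from 3 to 1 after each hit, so the trailing A is reused as the start of the next occurrence.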

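#include <cstdio>
#include <cstring>

// No "using namespace std;" here: a global array named next would clash
// with std::next on modern compilers.

// Array sizes are taken from the limits in the problem statement, plus slack.
char w[10005];    // the word W, |W| <= 10,000
char t[1000005];  // the text T, |T| <= 1,000,000
int next[10005];  // next[i]: longest proper prefix of w[0..i] that is also its suffix
int ans;

void kmp_1() // compute the next array
{
    next[0] = 0;
    int j = 0;
    for (int i = 1; w[i] != '\0'; i++) // testing w[i] != '\0' avoids measuring the length of w
    {
        if (w[i] == w[j])
        {
            j++;
        }
        else
        {
            while (j > 0 && w[i] != w[j])
                j = next[j - 1];
            if (w[i] == w[j]) // a matching fallback point was found, so the next value grows by 1
                j++;
        }
        next[i] = j;
    }
}

void kmp_2() // matching process
{
    int k, q;
    ans = 0;
    for (k = 0, q = 0; t[k] != '\0';)
    {
        if (t[k] == w[q])
        {
            q++;
            k++;
        }
        else
        {
            if (q == 0) k++;      // no fallback point, move on in the text
            else
                q = next[q - 1];  // fall back in the word, keep k
        }
        if (w[q] == '\0')         // the whole word has been matched
        {
            ans++;
            q = next[q - 1];      // fall back so overlapping occurrences are counted
        }
    }
}

int main()
{
    int tt;
    scanf("%d", &tt);
    while (tt--)
    {
        memset(next, 0, sizeof next);
        // The input contains no spaces, so %s reads a whole line;
        // gets() is unsafe and no longer part of standard C++.
        scanf("%s", w);
        scanf("%s", t);
        kmp_1();
        kmp_2();
        printf("%d\n", ans);
    }
    return 0;
}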
