POJ2774Long Long Message (后缀数组&后缀自动机)
问题:
The little cat lives in an unrich family, so he frequently comes to the mobile service center, to check how much money he has spent on SMS. Yesterday, the computer of service center was broken, and printed two very long messages. The brilliant little cat soon found out:
1. All characters in messages are lowercase Latin letters, without punctuations and spaces.
2. All SMS has been appended to each other – (i+1)-th SMS comes directly after the i-th one – that is why those two messages are quite long.
3. His own SMS has been appended together, but possibly a great many redundancy characters appear leftwards and rightwards due to the broken computer.
E.g: if his SMS is “motheriloveyou”, either long message printed by that machine, would possibly be one of “hahamotheriloveyou”, “motheriloveyoureally”, “motheriloveyouornot”, “bbbmotheriloveyouaaa”, etc.
4. For these broken issues, the little cat has printed his original text twice (so there appears two very long messages). Even though the original text remains the same in two printed messages, the redundancy characters on both sides would be possibly different.
You are given those two very long messages, and you have to output the length of the longest possible original text written by the little cat.
Background:
The SMS in Byterland mobile service are charging in dollars-per-byte. That is why the little cat is worrying about how long could the longest original text be.
Why ask you to write a program? There are four resions:
1. The little cat is so busy these days with physics lessons;
2. The little cat wants to keep what he said to his mother seceret;
3. POJ is such a great Online Judge;
4. The little cat wants to earn some money from POJ, and try to persuade his mother to see the doctor :(
Input
Output
Sample Input
yeshowmuchiloveyoumydearmotherreallyicannotbelieveit
yeaphowmuchiloveyoumydearmother
Sample Output
27
题意:
求两个字符串的最长的公共字串。
思路:
后缀自动机:可以直接匹配,然后又默写了一遍后缀自动机。
#include<cstdio>
#include<cstdlib>
#include<iostream>
#include<cstring>
#include<algorithm>
using namespace std;
const int maxn=;
char chr[maxn],str[maxn];
struct SAM
{
int ch[maxn][],fa[maxn],maxlen[maxn],Last,sz;
void init()
{
sz=Last=; fa[]=maxlen[]=;
memset(ch[],,sizeof(ch[]));
}
void add(int x)
{
int np=++sz,p=Last;Last=np;
memset(ch[np],,sizeof(ch[np]));
maxlen[np]=maxlen[p]+;
while(p&&!ch[p][x]) ch[p][x]=np,p=fa[p];
if(!p) fa[np]=;
else {
int q=ch[p][x];
if(maxlen[p]+==maxlen[q]) fa[np]=q;
else {
int nq=++sz;
memcpy(ch[nq],ch[q],sizeof(ch[q]));
maxlen[nq]=maxlen[p]+;
fa[nq]=fa[q];
fa[q]=fa[np]=nq;
while(p&&ch[p][x]==q) ch[p][x]=nq,p=fa[p];
}
}
}
void solve()
{
scanf("%s",chr);
int L=strlen(chr),x,tmp=,ans=;Last=;
for(int i=;i<L;i++){
x=chr[i]-'a';
if(ch[Last][x]) tmp++,Last=ch[Last][x];
else {
while(Last&&!ch[Last][x]) Last=fa[Last];
if(!Last) tmp=,Last=;
else tmp=maxlen[Last]+,Last=ch[Last][x];
}
ans=max(ans,tmp);
}
printf("%d\n",ans);
}
};
SAM Sam;
int main()
{ int T,i,L;
while(~scanf("%s",chr)){
Sam.init();
L=strlen(chr);
for(i=;i<L;i++) Sam.add(chr[i]-'a');
Sam.solve();
}
return ;
}
后缀数组:需要把两个串连接起来,之间加一个特殊符号,用来保证得到的结果符合两个串来自不同的母串。
效率对比:
后缀自动机94ms,后缀数组782ms。这种基本题型我还是愿意写后缀自动机,不过练一练总是有好处的。
#include<cstdio>
#include<cstdlib>
#include<cstring>
#include<iostream>
#include<algorithm>
using namespace std;
const int maxn=;
char str1[maxn],str2[maxn];
int L,ch[maxn];
struct SA
{
int cntA[maxn],cntB[maxn],A[maxn],B[maxn];
int rank[maxn],sa[maxn],tsa[maxn],ht[maxn];
void sort()
{
for (int i = ; i < ; i ++) cntA[i] = ;
for (int i = ; i <= L; i ++) cntA[ch[i]] ++;
for (int i = ; i < ; i ++) cntA[i] += cntA[i - ];
for (int i = L; i; i --) sa[cntA[ch[i]] --] = i;
rank[sa[]] = ;
for (int i = ; i <= L; i ++){
rank[sa[i]] = rank[sa[i - ]];
if (ch[sa[i]] != ch[sa[i - ]]) rank[sa[i]] ++;
}
for (int l = ; rank[sa[L]] < L; l <<= ){
for (int i = ; i <= L; i ++) cntA[i] = ;
for (int i = ; i <= L; i ++) cntB[i] = ;
for ( int i = ; i <= L; i ++){
cntA[A[i] = rank[i]] ++;
cntB[B[i] = (i + l <= L) ? rank[i + l] : ] ++;
}
for (int i = ; i <= L; i ++) cntB[i] += cntB[i - ];
for (int i = L; i; i --) tsa[cntB[B[i]] --] = i;
for (int i = ; i <= L; i ++) cntA[i] += cntA[i - ];
for (int i = L; i; i --) sa[cntA[A[tsa[i]]] --] = tsa[i];
rank[sa[]] = ;
for (int i = ; i <= L; i ++){
rank[sa[i]] = rank[sa[i - ]];
if (A[sa[i]] != A[sa[i - ]] || B[sa[i]] != B[sa[i - ]]) rank[sa[i]] ++;
}
}
}
void getht()
{
for (int i = , j = ; i <= L; i ++){
if (j) j --;
while (ch[i + j] == ch[sa[rank[i] - ] + j]) j ++;
ht[rank[i]] = j;
}
}
};
SA Sa;
int main()
{
scanf("%s",str1+);
scanf("%s",str2+);
int L1=strlen(str1+);
int L2=strlen(str2+);
for(int i=;i<=L1;i++) ch[i]=str1[i]-'a'+1;
ch[L1+]=;
for(int i=;i<=L2;i++) ch[i+L1+]=str2[i]-'a'+1;
L=L1+L2+;
Sa.sort();
Sa.getht();
int ans=;
for(int i = ; i <= L; i++)
{
if((Sa.sa[i]<=L1)!=(Sa.sa[i-]<=L1))
ans = max(ans, Sa.ht[i]);
}
printf("%d\n",ans);
return ;
}
POJ2774Long Long Message (后缀数组&后缀自动机)的更多相关文章
- 字符串的模板 Manacher kmp ac自动机 后缀数组 后缀自动机
为何scanf("%s", str)不需要&运算 经常忘掉的字符串知识点,最好不加&,不加&最标准,指针如果像scanf里一样加&是错的,大概是未定 ...
- 【整理】如何选取后缀数组&&后缀自动机
后缀家族已知成员 后缀树 后缀数组 后缀自动机 后缀仙人掌 后缀预言 后缀Splay ? 后缀树是后缀数 ...
- loj6173 Samjia和矩阵(后缀数组/后缀自动机)
题目: https://loj.ac/problem/6173 分析: 考虑枚举宽度w,然后把宽度压位集中,将它们哈希 (这是w=2的时候) 然后可以写一下string=“ac#bc” 然后就是求这个 ...
- POJ - 2774 Long Long Message (后缀数组/后缀自动机模板题)
后缀数组: #include<cstdio> #include<algorithm> #include<cstring> #include<vector> ...
- bzoj 3172 后缀数组|AC自动机
后缀数组或者AC自动机都可以,模板题. /************************************************************** Problem: 3172 Us ...
- [Luogu5161]WD与数列(后缀数组/后缀自动机+线段树合并)
https://blog.csdn.net/WAautomaton/article/details/85057257 解法一:后缀数组 显然将原数组差分后答案就是所有不相交不相邻重复子串个数+n*(n ...
- SPOJ694 DISUBSTR --- 后缀数组 / 后缀自动机
SPOJ694 DISUBSTR 题目描述: Given a string, we need to find the total number of its distinct substrings. ...
- POJ1743 Musical Theme (后缀数组 & 后缀自动机)最大不重叠相似子串
A musical melody is represented as a sequence of N (1<=N<=20000)notes that are integers in the ...
- SPOJ- Distinct Substrings(后缀数组&后缀自动机)
Given a string, we need to find the total number of its distinct substrings. Input T- number of test ...
随机推荐
- 解决Java工程URL路径中含有中文的情况
问题: 当Java工程路径中含有中文时,得不到正确的路径 *** 解决: 这其实是编码转换的问题.当我们使用ClassLoader的getResource方法获取路径时,获取到的路径被URLEncod ...
- iOS ARC也会有内存泄露
本文转载至 http://blog.csdn.net/allison162004/article/details/38753219 iOS提供了ARC功能,很大程度上简化了内存管理的代码. 但使用A ...
- IO多路复用的作用?并列举实现机制以及区别?
I/O多路复用是用于提升效率,单个进程可以同时监听多个网络连接IO. 举例:通过一种机制,可以监视多个文件描述符,一旦描述符就绪(读就绪和写就绪),能通知程序进行相应的读写操作,I/O多路复用避免阻塞 ...
- python+NLTK 自然语言学习处理五:词典资源
前面介绍了很多NLTK中携带的词典资源,这些词典资源对于我们处理文本是有大的作用的,比如实现这样一个功能,寻找由egivronl几个字母组成的单词.且组成的单词每个字母的次数不得超过egivronl中 ...
- centos7 Authentication failure
root@localhost ~]#su bash-4.2$ su Password: su: Authentication failure //这里切换的是系统用户,现在还不清楚为什么postgre ...
- 在网页中显示PDF文件及vue项目中弹出PDF
1.<embed width="800" height="600" src="test_pdf.pdf"> </embed ...
- 图片加载Picasso
https://github.com/square/picasso 基本用法 // 基本用法 // 普通加载图片 Picasso.with(PicassoActivity.this) .load(&q ...
- 【转】.net中快捷键的使用
当前行行首:Home 当前行行尾:End 当前文档首行:ctrl+Home 当前文档尾行:ctrl+End 选中当前行: ① 按Home(定位到行首)然后按Shift+Dnd(行尾) {从行首连选 ...
- socket通信——通过Udp传输方式,将一段文字数据发送出去
需求:通过Udp传输方式,将一段文字数据发送出去 定义一个Udp发送端 思路: 1.建立updsocket服务 2.提供数据,并将数据封装到数据包中. 3.通过socket服务的发送功能,将数据包发出 ...
- 某国际知名IT公司笔试
原文地址:http://blog.csdn.net/lazy_tiger/article/details/1790986 这段时间没怎么顾及自己的这个“一寸土地”, 实在惭愧.因为这些天小弟又经历了“ ...