SPOJ - PHRASES Relevant Phrases of Annihilation —— 后缀数组 出现于所有字符串中两次且不重叠的最长公共子串
题目链接:https://vjudge.net/problem/SPOJ-PHRASES
PHRASES - Relevant Phrases of Annihilation
You are the King of Byteland. Your agents have just intercepted a batch of encrypted enemy messages concerning the date of the planned attack on your island. You immedietaly send for the Bytelandian Cryptographer, but he is currently busy eating popcorn and claims that he may only decrypt the most important part of the text (since the rest would be a waste of his time). You decide to select the fragment of the text which the enemy has strongly emphasised, evidently regarding it as the most important. So, you are looking for a fragment of text which appears in all the messages disjointly at least twice. Since you are not overfond of the cryptographer, try to make this fragment as long as possible.
Input
The first line of input contains a single positive integer t<=10, the number of test cases. t test cases follow. Each test case begins with integer n (n<=10), the number of messages. The next n lines contain the messages, consisting only of between 2 and 10000 characters 'a'-'z', possibly with some additional trailing white space which should be ignored.
Output
For each test case output the length of longest string which appears disjointly at least twice in all of the messages.
Example
Input:
1
4
abbabba
dabddkababa
bacaba
baba Output:
2
(in the example above, the longest substring which fulfills the requirements is 'ba')
题意:
给出n个字符串,求出现于所有字符串中两次且不重叠的最长公共子串,输出长度。
题解:
1.将所有字符串拼接在一起,相邻两个之间用各异的分隔符隔开,得到新串。
2.求出新串的后缀数组,然后二分mid:mid将新串的后缀分成若干组,每一组对应着一个公共子串,且长度大于等于mid。在每一组中,统计公共子串出现于每个字符串中的最小和最大下标,如果最大下标-最小下标>=mid,即表明公共子串出现在该字符串内两次且不重叠。如果在同一组内,所有字符串都满足最大下标-最小下标>=mid,那么表明当前mid合法,否则不合法,因此根据此规则求出答案。
代码如下:
#include <iostream>
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <vector>
#include <cmath>
#include <queue>
#include <stack>
#include <map>
#include <string>
#include <set>
using namespace std;
typedef long long LL;
const int INF = 2e9;
const LL LNF = 9e18;
const int MOD = 1e9+;
const int MAXN = 2e5+; int id[MAXN];
int r[MAXN], sa[MAXN], Rank[MAXN], height[MAXN];
int t1[MAXN], t2[MAXN], c[MAXN]; bool cmp(int *r, int a, int b, int l)
{
return r[a]==r[b] && r[a+l]==r[b+l];
} void DA(int str[], int sa[], int Rank[], int height[], int n, int m)
{
n++;
int i, j, p, *x = t1, *y = t2;
for(i = ; i<m; i++) c[i] = ;
for(i = ; i<n; i++) c[x[i] = str[i]]++;
for(i = ; i<m; i++) c[i] += c[i-];
for(i = n-; i>=; i--) sa[--c[x[i]]] = i;
for(j = ; j<=n; j <<= )
{
p = ;
for(i = n-j; i<n; i++) y[p++] = i;
for(i = ; i<n; i++) if(sa[i]>=j) y[p++] = sa[i]-j; for(i = ; i<m; i++) c[i] = ;
for(i = ; i<n; i++) c[x[y[i]]]++;
for(i = ; i<m; i++) c[i] += c[i-];
for(i = n-; i>=; i--) sa[--c[x[y[i]]]] = y[i]; swap(x, y);
p = ; x[sa[]] = ;
for(i = ; i<n; i++)
x[sa[i]] = cmp(y, sa[i-], sa[i], j)?p-:p++; if(p>=n) break;
m = p;
} int k = ;
n--;
for(i = ; i<=n; i++) Rank[sa[i]] = i;
for(i = ; i<n; i++)
{
if(k) k--;
j = sa[Rank[i]-];
while(str[i+k]==str[j+k]) k++;
height[Rank[i]] = k;
}
} //pos用于记录在当前组中,字符串i的子串出现的最小下标以及最大下标
//vis用于记录在当前组中,字符串i是否已经出现了两个不重叠的子串
int pos[][], vis[];
bool test(int n, int len, int k)
{
int cnt = ;
memset(pos, -, sizeof(pos));
memset(vis, false, sizeof(vis));
for(int i = ; i<=len; i++)
{
if(height[i]<k)
{
cnt = ;
memset(pos, -, sizeof(pos));
memset(vis, false, sizeof(vis));
}
else
{
int b1 = id[sa[i-]], b2 = id[sa[i]];
pos[b1][] = pos[b1][]==-?sa[i-]:min(pos[b1][], sa[i-]); //最小下标
pos[b1][] = pos[b1][]==-?sa[i-]:max(pos[b1][], sa[i-]); //最大下标
pos[b2][] = pos[b2][]==-?sa[i]:min(pos[b2][], sa[i]);
pos[b2][] = pos[b2][]==-?sa[i]:max(pos[b2][], sa[i]); if(!vis[b1] && pos[b1][]!=- && pos[b1][]!=- && pos[b1][]-pos[b1][]>=k)
vis[b1] = true, cnt++;
if(!vis[b2] && pos[b2][]!=- && pos[b2][]!=- && pos[b2][]-pos[b2][]>=k)
vis[b2] = true, cnt++;
if(cnt==n) return true;
}
}
return false;
} char str[MAXN];
int main()
{
int T, n;
scanf("%d", &T);
while(T--)
{
scanf("%d", &n);
int len = ;
for(int i = ; i<n; i++)
{
scanf("%s", str);
int LEN = strlen(str);
for(int j = ; j<LEN; j++)
{
r[len] = str[j];
id[len++] = i;
}
r[len] = +i;
id[len++] = i;
}
r[len] = ;
DA(r,sa,Rank,height,len,); int l = , r = len;
while(l<=r)
{
int mid = (l+r)>>;
if(test(n,len,mid))
l = mid + ;
else
r = mid - ;
}
printf("%d\n", r);
}
}
SPOJ - PHRASES Relevant Phrases of Annihilation —— 后缀数组 出现于所有字符串中两次且不重叠的最长公共子串的更多相关文章
- SPOJ 220 Relevant Phrases of Annihilation(后缀数组+二分答案)
[题目链接] http://www.spoj.pl/problems/PHRASES/ [题目大意] 求在每个字符串中出现至少两次的最长的子串 [题解] 注意到这么几个关键点:最长,至少两次,每个字符 ...
- SPOJ 220 Relevant Phrases of Annihilation(后缀数组)
You are the King of Byteland. Your agents have just intercepted a batch of encrypted enemy messages ...
- SPOJ - PHRASES Relevant Phrases of Annihilation (后缀数组)
You are the King of Byteland. Your agents have just intercepted a batch of encrypted enemy messages ...
- SPOJ220 Relevant Phrases of Annihilation(后缀数组)
引用罗穗骞论文中的话: 先将n 个字符串连起来,中间用不相同的且没有出现在字符串中的字符隔开,求后缀数组.然后二分答案,再将后缀分组.判断的时候,要看是否有一组后缀在每个原来的字符串中至少出现两次,并 ...
- SPOJ PHRASES 每个字符串至少出现两次且不重叠的最长子串
Description You are the King of Byteland. Your agents have just intercepted a batch of encrypted ene ...
- SPOJ 1811 Longest Common Substring (后缀自动机第一题,求两个串的最长公共子串)
题目大意: 给出两个长度小于等于25W的字符串,求它们的最长公共子串. 题目链接:http://www.spoj.com/problems/LCS/ 算法讨论: 二分+哈希, 后缀数组, 后缀自动机. ...
- 后缀数组(模板题) - 求最长公共子串 - poj 2774 Long Long Message
Language: Default Long Long Message Time Limit: 4000MS Memory Limit: 131072K Total Submissions: 21 ...
- POJ 2217 (后缀数组+最长公共子串)
题目链接: http://poj.org/problem?id=2217 题目大意: 求两个串的最长公共子串,注意子串是连续的,而子序列可以不连续. 解题思路: 后缀数组解法是这类问题的模板解法. 对 ...
- HDU 1403 Longest Common Substring(后缀数组,最长公共子串)
hdu题目 poj题目 参考了 罗穗骞的论文<后缀数组——处理字符串的有力工具> 题意:求两个序列的最长公共子串 思路:后缀数组经典题目之一(模版题) //后缀数组sa:将s的n个后缀从小 ...
随机推荐
- Maven引入Hadoop依赖报错:Missing artifact jdk.tools:jdk.tools:jar:1.6
Maven引入Hadoop依赖报错:Missing artifact jdk.tools:jdk.tools:jar:1.6 原因是缺少tools.jar的依赖,tools.jar在jdk的安装目录中 ...
- 【Salvation】—— 项目策划&市场分析
写在前面:这个项目是2017年,我们评选校级创新基金项目的参加作品,小组4人,我为负责人,这个项目现在已经基本完成,目前处于后期收尾阶段. 一.项目的目标.内容及创新之处 1.研究目标 体现人类与自然 ...
- 2016.8.19 在dialog上增加一个button出现错误:failed to execute setAttribute on Element...
目标:想要在dialog上多加一个button. 语法来自: http://api.jqueryui.com/dialog/#option-buttons 可见新增在dialog上的button要 ...
- 给java类加static修饰编译器会说什么?
Illegal modifier for the class XXX;only public abstract & final are permitted.
- nginx apache防盗链
要实现防盗链,我们就必须先理解盗链的实现原理,提到防盗链的实现原理就不得不从HTTP协议说起,在HTTP协议中,有一个表头字段叫referer,采用URL的格式来表示从哪儿链接到当前的网页或文件.换句 ...
- 高速掌握Lua 5.3 —— Lua与C之间的交互概览
Q:什么是Lua的虚拟栈? A:C与Lua之间通信关键内容在于一个虚拟的栈.差点儿全部的调用都是对栈上的值进行操作,全部C与Lua之间的数据交换也都通过这个栈来完毕.另外,你也能够使用栈来保存暂时变量 ...
- UVA12096 - The SetStack Computer(set + map映射)
UVA12096 - The SetStack Computer(set + map映射) 题目链接 题目大意:有五个动作: push : 把一个空集合{}放到栈顶. dup : 把栈顶的集合取出来, ...
- Oracle 中for update和for update nowait的区别
http://www.cnblogs.com/quanweiru/archive/2012/11/09/2762223.html 1.for update 和 for update nowait 的区 ...
- HDU 1874 畅通project续 最短路径入门(dijkstra)
Problem Description 某省自从实行了非常多年的畅通project计划后,最终修建了非常多路.只是路多了也不好,每次要从一个城镇到还有一个城镇时,都有很多种道路方案能够选择,而某些方案 ...
- 移动app性能测试(待完善)
移动终端性能测试是测试手机终端是否符合特定性能指标的测试,包括有:内存.CPU.电量.流量.流畅度.时延等 测试准备:测试账号.测试需求.测试用例.待测手机.待测应用包.测试工具.测试电脑 1. 时 ...