POJ 3294 n个串中至少一半的串共享的最长公共子串
| Time Limit: 5000MS | Memory Limit: 65536K | |
| Total Submissions: 12484 | Accepted: 3502 |
Description
You may have wondered why most extraterrestrial life forms resemble humans, differing by superficial traits such as height, colour, wrinkles, ears, eyebrows and the like. A few bear no human resemblance; these typically have geometric or amorphous shapes like cubes, oil slicks or clouds of dust.
The answer is given in the 146th episode of Star Trek - The Next Generation, titled The Chase. It turns out that in the vast majority of the quadrant's life forms ended up with a large fragment of common DNA.
Given the DNA sequences of several life forms represented as strings of letters, you are to find the longest substring that is shared by more than half of them.
Input
Standard input contains several test cases. Each test case begins with 1 ≤ n ≤ 100, the number of life forms. n lines follow; each contains a string of lower case letters representing the DNA sequence of a life form. Each DNA sequence contains at least one and not more than 1000 letters. A line containing 0 follows the last test case.
Output
For each test case, output the longest string or strings shared by more than half of the life forms. If there are many, output all of them in alphabetical order. If there is no solution with at least one letter, output "?". Leave an empty line between test cases.
/*
POJ 3294 n个串中至少一半的串共享的最长公共子串 求的是最长公共子串,所以考虑 二分答案len+判断
因为要判断是否为x个串共享所以对height进行分组,即height数组中各个连续≥len
的集合,然后对每个组进行判断,看书否能找到x+1个不同的来源。
满足条件就记录 子串的起始位置和长度 1.串之间的间隔符号不能相同
2.因为有100个串,所以已经占据了0-99,所以字符串的信息转换成int的时候
必需是从100开始 hhh-2016-03-17 19:04:50
*/
#include <algorithm>
#include <cmath>
#include <queue>
#include <iostream>
#include <cstring>
#include <map>
#include <cstdio>
#include <vector>
#include <functional>
#define lson (i<<1)
#define rson ((i<<1)|1)
using namespace std;
typedef long long ll;
const int maxn = 101000; int t1[maxn],t2[maxn],c[maxn];
bool cmp(int *r,int a,int b,int l)
{
return r[a]==r[b] &&r[l+a] == r[l+b];
} void get_sa(int str[],int sa[],int Rank[],int height[],int n,int m)
{
n++;
int p,*x=t1,*y=t2;
for(int i = 0; i < m; i++) c[i] = 0;
for(int i = 0; i < n; i++) c[x[i] = str[i]]++;
for(int i = 1; i < m; i++) c[i] += c[i-1];
for(int i = n-1; i>=0; i--) sa[--c[x[i]]] = i;
for(int j = 1; j <= n; j <<= 1)
{
p = 0;
for(int i = n-j; i < n; i++) y[p++] = i;
for(int i = 0; i < n; i++) if(sa[i] >= j) y[p++] = sa[i]-j;
for(int i = 0; i < m; i++) c[i] = 0;
for(int i = 0; i < n; i++) c[x[y[i]]]++ ;
for(int i = 1; i < m; i++) c[i] += c[i-1];
for(int i = n-1; i >= 0; i--) sa[--c[x[y[i]]]] = y[i]; swap(x,y);
p = 1;
x[sa[0]] = 0;
for(int i = 1; i < n; i++)
x[sa[i]] = cmp(y,sa[i-1],sa[i],j)? p-1:p++;
if(p >= n) break;
m = p;
}
int k = 0;
n--;
for(int i = 0; i <= n; i++)
Rank[sa[i]] = i;
for(int i = 0; i < n; i++)
{
if(k) k--;
int j = sa[Rank[i]-1];
while(str[i+k] == str[j+k]) k++;
height[Rank[i]] = k;
}
} int Rank[maxn];
int sa[maxn];
int str[maxn],height[maxn];
char s[1010];
char allstr[maxn];
int anslen,anspos[maxn];
int ansnum,vis[110];
int id[maxn]; bool judge(int len,int k,int n,int l,int r)
{
int num = 0;
memset(vis,0,sizeof(vis));
for(int i = l; i <= r; i++)
{
if(height[i] >= len)
{
if(!vis[id[sa[i-1]]])
{
vis[id[sa[i-1]]] = 1;
num ++;
}
if(!vis[id[sa[i]]])
{
vis[id[sa[i]]] = 1;
num ++;
}
if(num > k)
return 1;
}
}
return 0;
} bool can(int len,int k,int n)
{
int l=2,r=2;
int flag = 0;
ansnum = 0;
for(int i = 2; i <= n; i++)
{
if(height[i]>=len)
r++;
else
{
if(judge(len,k,n,l,r))
{
anspos[ansnum++] = sa[l];
flag =1;
}
l = i,r = i;
}
}
if(judge(len,k,n,l,r) && l < r)
{
anspos[ansnum++] = sa[l];
flag =1;
}
return flag;
} int main()
{
int k,n;
while(scanf("%d",&n) != EOF && n)
{
int len=0;
int tot = 0;
for(int i = 0; i< n; i++)
{
scanf("%s",s);
for(int j = 0; s[j]!='\0'; j++)
{
id[tot] = i;
allstr[tot] = s[j];
str[tot++] = s[j]-'a'+100;
}
len=max(len,(int)strlen(s));
id[tot] = i,allstr[tot]='$';
str[tot++]=i;
}
str[tot] = 0;
get_sa(str,sa,Rank,height,tot,128);
int k = n/2;
int ans = 0;
int l=1,r=len;
while(l <= r)
{
int mid =(l+r)>>1;
if(can(mid,k,tot))
{
l = mid+1;
anslen = mid;
ans = ansnum;
}
else
r = mid-1;
} if(!ans)
printf("?\n");
else
{
//cout<<ans<<endl;
for(int i = 0; i < ans; i++)
{
for(int j = 0; j<anslen; j++)
printf("%c",allstr[anspos[i]+j]);
printf("\n");
}
}
printf("\n");
}
return 0;
}
POJ 3294 n个串中至少一半的串共享的最长公共子串的更多相关文章
- SPOJ - PHRASES Relevant Phrases of Annihilation —— 后缀数组 出现于所有字符串中两次且不重叠的最长公共子串
题目链接:https://vjudge.net/problem/SPOJ-PHRASES PHRASES - Relevant Phrases of Annihilation no tags You ...
- POJ 3294 Life Forms [最长公共子串加强版 后缀数组 && 二分]
题目:http://poj.org/problem?id=3294 Life Forms Time Limit: 5000MS Memory Limit: 65536K Total Submiss ...
- 字符串hash + 二分答案 - 求最长公共子串 --- poj 2774
Long Long Message Problem's Link:http://poj.org/problem?id=2774 Mean: 求两个字符串的最长公共子串的长度. analyse: 前面在 ...
- 后缀数组(模板题) - 求最长公共子串 - poj 2774 Long Long Message
Language: Default Long Long Message Time Limit: 4000MS Memory Limit: 131072K Total Submissions: 21 ...
- poj 2774 最长公共子串 后缀数组
Long Long Message Time Limit: 4000MS Memory Limit: 131072K Total Submissions: 25752 Accepted: 10 ...
- POJ 2774 Long Long Message [ 最长公共子串 后缀数组]
题目:http://poj.org/problem?id=2774 Long Long Message Time Limit: 4000MS Memory Limit: 131072K Total ...
- 「双串最长公共子串」SP1811 LCS - Longest Common Substring
知识点: SAM,SA,单调栈,Hash 原题面 Luogu 来自 poj 的双倍经验 简述 给定两字符串 \(S_1, S_2\),求它们的最长公共子串长度. \(|S_1|,|S_2|\le 2. ...
- POJ 2217 (后缀数组+最长公共子串)
题目链接: http://poj.org/problem?id=2217 题目大意: 求两个串的最长公共子串,注意子串是连续的,而子序列可以不连续. 解题思路: 后缀数组解法是这类问题的模板解法. 对 ...
- SPOJ 1811 Longest Common Substring (后缀自动机第一题,求两个串的最长公共子串)
题目大意: 给出两个长度小于等于25W的字符串,求它们的最长公共子串. 题目链接:http://www.spoj.com/problems/LCS/ 算法讨论: 二分+哈希, 后缀数组, 后缀自动机. ...
随机推荐
- Python 实现火车票查询工具
注意:由于 12306 的接口经常变化,课程内容可能很快过期,如果遇到接口问题,需要根据最新的接口对代码进行适当修改才可以完成实验. 一.实验简介 当你想查询一下火车票信息的时候,你还在上 12306 ...
- java的socket通信
本文讲解如何用java实现网络通信,是一个非常简单的例子,我比较喜欢能够立马看到结果,所以先上代码再讲解具体细节. 服务端: import java.io.BufferedReader; import ...
- MMA8451重力加速度计通过写内部校准寄存器进行校准
|版权声明:本文为博主原创文章,未经博主允许不得转载. AN4069应用笔记中提到MMA8451的三个轴重力校准有两种方法, 第一种方法是简易校准,将贴有MMA8451的设备整体,Z轴正面朝上放在校准 ...
- LeetCode & Q53-Maximum Subarray-Easy & 动态规划思路分析
Array DP Divide and Conquer Description: Find the contiguous subarray within an array (containing at ...
- Python内置函数(61)——eval
英文文档: eval(expression, globals=None, locals=None) The arguments are a string and optional globals an ...
- Python内置函数(60)——compile
英文文档: compile(source, filename, mode, flags=0, dont_inherit=False, optimize=-1) Compile the source i ...
- linux下安装 配置 redis数据库
通过终端命令安装(推荐): 1 确保更新源服务器能正常使用 如果没有更换更新源服务器,那么可能一直都下不了软件.欢迎参考我之前的博文来更换成国内的镜像服务器http://www.cnblogs.com ...
- mysql中的视图、事务和索引
视图: 对于一个sql查询,如果发生了修改,就需要修改sql语句. 我们可以通过定义视图来解决问题.改变需求之后就改变视图. 视图是对查询的封装 定义视图: create view 视图名称 as s ...
- CentOS7 安装eclipse
1. 首先将eclipse的压缩包文件解压到/opt目录下,要使用root权限.执行如下解压命令:tar -zxvf eclipse-jee-oxygen-1a-linux-gtk-x86_64.ta ...
- servlet2.3/2.5/3.0/3.1的xml名称空间备忘
The web.xml is a configuration file to describe how a web application should be deployed. Here’re 5 ...