POJ 3294: longest common substring shared by more than half of n strings
Time Limit: 5000MS | Memory Limit: 65536K
Total Submissions: 12484 | Accepted: 3502
Description
You may have wondered why most extraterrestrial life forms resemble humans, differing by superficial traits such as height, colour, wrinkles, ears, eyebrows and the like. A few bear no human resemblance; these typically have geometric or amorphous shapes like cubes, oil slicks or clouds of dust.
The answer is given in the 146th episode of Star Trek - The Next Generation, titled The Chase. It turns out that the vast majority of the quadrant's life forms ended up with a large fragment of common DNA.
Given the DNA sequences of several life forms represented as strings of letters, you are to find the longest substring that is shared by more than half of them.
Input
Standard input contains several test cases. Each test case begins with 1 ≤ n ≤ 100, the number of life forms. n lines follow; each contains a string of lower case letters representing the DNA sequence of a life form. Each DNA sequence contains at least one and not more than 1000 letters. A line containing 0 follows the last test case.
Output
For each test case, output the longest string or strings shared by more than half of the life forms. If there are many, output all of them in alphabetical order. If there is no solution with at least one letter, output "?". Leave an empty line between test cases.
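To make the required output concrete, here is a minimal brute-force sketch of the task as stated above. It is only a reference for small inputs and is far too slow for the real limits; the function name `bruteLifeForms` is hypothetical and does not come from the solution that follows. An empty result corresponds to printing "?".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Brute-force reference (hypothetical helper, for illustration only):
// returns every longest substring that occurs in more than half of `strs`,
// in alphabetical order. An empty vector corresponds to the "?" output.
std::vector<std::string> bruteLifeForms(const std::vector<std::string>& strs)
{
    assert(!strs.empty());
    const std::size_t n = strs.size();
    std::size_t maxLen = 0;
    for (const std::string& s : strs) maxLen = std::max(maxLen, s.size());

    // Try the longest candidate length first and stop at the first success.
    for (std::size_t len = maxLen; len >= 1; --len)
    {
        std::set<std::string> found; // std::set keeps results sorted and unique
        for (const std::string& s : strs)
        {
            for (std::size_t start = 0; start + len <= s.size(); ++start)
            {
                const std::string cand = s.substr(start, len);
                if (found.count(cand)) continue;
                // Count how many input strings contain the candidate.
                std::size_t owners = 0;
                for (const std::string& t : strs)
                    if (t.find(cand) != std::string::npos) ++owners;
                if (2 * owners > n) found.insert(cand); // strictly more than half
            }
        }
        if (!found.empty())
            return std::vector<std::string>(found.begin(), found.end());
    }
    return {};
}
```

For the three strings abcdefg, bcdefgh and cdefghi, this returns bcdefg and cdefgh, each of which appears in two of the three strings.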
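/*
POJ 3294: longest common substring shared by more than half of n strings.
We want the longest common substring, so binary-search the answer len and test each candidate.
To test whether a substring of length len is shared by more than x strings, split the
height array into groups, i.e. maximal runs of consecutive entries with height >= len,
and check for each group whether its suffixes come from at least x+1 different source strings.
If a group qualifies, record the start position of the substring (its length is len).
Notes:
1. The separators placed between the strings must all be different.
2. There can be 100 strings, so the separators occupy the codes 1-100 (0 is reserved for
   the terminator that get_sa needs to be unique), and the letters must therefore be
   mapped to integers starting from 101.
hhh-2016-03-17 19:04:50
*/
#include <algorithm>
#include <cstdio>
#include <cstring>
using namespace std;

const int maxn = 101000;
int t1[maxn],t2[maxn],c[maxn];

// true if the two suffixes have equal first and second keys in this doubling round
bool cmp(int *r,int a,int b,int l)
{
    return r[a]==r[b] && r[l+a] == r[l+b];
}

// Builds the suffix array by prefix doubling with radix sort, then the height array.
// str[0..n-1] holds the text, str[n] must be 0 and strictly smaller than every other
// value, and all values must be smaller than m.
void get_sa(int str[],int sa[],int Rank[],int height[],int n,int m)
{
    n++;
    int p,*x=t1,*y=t2;
    for(int i = 0; i < m; i++) c[i] = 0;
    for(int i = 0; i < n; i++) c[x[i] = str[i]]++;
    for(int i = 1; i < m; i++) c[i] += c[i-1];
    for(int i = n-1; i>=0; i--) sa[--c[x[i]]] = i;
    for(int j = 1; j <= n; j <<= 1)
    {
        // sort by the second key: suffixes with no second key come first
        p = 0;
        for(int i = n-j; i < n; i++) y[p++] = i;
        for(int i = 0; i < n; i++) if(sa[i] >= j) y[p++] = sa[i]-j;
        // stable counting sort by the first key
        for(int i = 0; i < m; i++) c[i] = 0;
        for(int i = 0; i < n; i++) c[x[y[i]]]++ ;
        for(int i = 1; i < m; i++) c[i] += c[i-1];
        for(int i = n-1; i >= 0; i--) sa[--c[x[y[i]]]] = y[i];
        // recompute the ranks for prefixes of length 2*j
        swap(x,y);
        p = 1;
        x[sa[0]] = 0;
        for(int i = 1; i < n; i++)
            x[sa[i]] = cmp(y,sa[i-1],sa[i],j)? p-1:p++;
        if(p >= n) break;
        m = p;
    }
    int k = 0;
    n--;
    for(int i = 0; i <= n; i++)
        Rank[sa[i]] = i;
    // height[i] = LCP of the suffixes ranked i-1 and i
    for(int i = 0; i < n; i++)
    {
        if(k) k--;
        int j = sa[Rank[i]-1];
        while(str[i+k] == str[j+k]) k++;
        height[Rank[i]] = k;
    }
}

int Rank[maxn];
int sa[maxn];
int str[maxn],height[maxn];
char s[1010];
char allstr[maxn];
int anslen,anspos[maxn];
int ansnum,vis[110];
int id[maxn];   // id[p] = index of the input string that position p belongs to

// Does the group of ranks l..r contain suffixes from more than k different strings?
bool judge(int len,int k,int n,int l,int r)
{
    int num = 0;
    memset(vis,0,sizeof(vis));
    for(int i = l; i <= r; i++)
    {
        if(height[i] >= len)
        {
            if(!vis[id[sa[i-1]]])
            {
                vis[id[sa[i-1]]] = 1;
                num ++;
            }
            if(!vis[id[sa[i]]])
            {
                vis[id[sa[i]]] = 1;
                num ++;
            }
            if(num > k)
                return 1;
        }
    }
    return 0;
}

// Is there a substring of length len shared by more than k strings?
// On success, anspos[0..ansnum-1] holds one start position per such substring,
// in alphabetical order because the groups are visited in suffix-array order.
bool can(int len,int k,int n)
{
    int l=2,r=2;
    int flag = 0;
    ansnum = 0;
    for(int i = 2; i <= n; i++)
    {
        if(height[i]>=len)
            r = i;              // extend the current group
        else
        {
            if(judge(len,k,n,l,r))
            {
                anspos[ansnum++] = sa[l];
                flag =1;
            }
            l = i,r = i;        // start a new group at rank i
        }
    }
    if(l < r && judge(len,k,n,l,r))
    {
        anspos[ansnum++] = sa[l];
        flag =1;
    }
    return flag;
}

int main()
{
    int n;
    while(scanf("%d",&n) != EOF && n)
    {
        int len=0;
        int tot = 0;
        for(int i = 0; i< n; i++)
        {
            scanf("%s",s);
            for(int j = 0; s[j]!='\0'; j++)
            {
                id[tot] = i;
                allstr[tot] = s[j];
                str[tot++] = s[j]-'a'+101;  // letters: 101..126
            }
            len=max(len,(int)strlen(s));
            id[tot] = i,allstr[tot]='$';
            str[tot++]=i+1;                 // separators: 1..n, all distinct
        }
        if(n == 1)
        {
            // a single string is shared by more than half of the inputs by itself;
            // the height groups below need at least two suffixes, so handle it here
            printf("%s\n\n",s);
            continue;
        }
        str[tot] = 0;                       // terminator, smaller than every separator
        get_sa(str,sa,Rank,height,tot,128);
        int k = n/2;
        int ans = 0;
        int l=1,r=len;
        while(l <= r)
        {
            int mid =(l+r)>>1;
            if(can(mid,k,tot))
            {
                l = mid+1;
                anslen = mid;
                ans = ansnum;
            }
            else
                r = mid-1;
        }
        if(!ans)
            printf("?\n");
        else
        {
            for(int i = 0; i < ans; i++)
            {
                for(int j = 0; j<anslen; j++)
                    printf("%c",allstr[anspos[i]+j]);
                printf("\n");
            }
        }
        printf("\n");
    }
    return 0;
}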
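The grouping step is the core of the check, so it may help to see it in isolation. The sketch below is not the code above; it is a hypothetical helper, `qualifyingGroups`, that takes an already computed height array plus an `owner` array (the source string of each ranked suffix, playing the role of `id[sa[i]]`) and returns the first rank of every group whose suffixes come from more than k strings. Groups of a single suffix are skipped, as in the solution, because only a height value guarantees that the shared prefix really has length len.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper illustrating the grouping step on precomputed arrays.
// height[i] = LCP of the suffixes ranked i-1 and i (height[0] is unused),
// owner[i]  = index of the input string in which the suffix ranked i starts.
// Returns the first rank of every group (maximal run with height >= len)
// that has at least two suffixes and more than k distinct owners.
std::vector<int> qualifyingGroups(const std::vector<int>& height,
                                  const std::vector<int>& owner,
                                  int len, int k)
{
    assert(height.size() == owner.size());
    std::vector<int> starts;
    const int n = static_cast<int>(height.size());
    int i = 0;
    while (i < n)
    {
        // Rank i opens a group; extend it while adjacent suffixes share len chars.
        std::set<int> sources;
        sources.insert(owner[i]);
        int j = i + 1;
        while (j < n && height[j] >= len)
        {
            sources.insert(owner[j]);
            ++j;
        }
        if (j - i >= 2 && static_cast<int>(sources.size()) > k)
            starts.push_back(i);
        i = j; // the next group starts where this one ended
    }
    return starts;
}
```

As a worked example, take the two strings aba and ab joined with distinct separators. Sorting the suffixes by hand gives height = {0,0,0,1,2,0,1} and owner = {0,1,0,1,0,1,0}. With len = 2 and k = 1 the only qualifying group starts at rank 3 (the substring ab), and with len = 1 the groups start at ranks 2 and 5 (the substrings a and b).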