poj3294 --Life Forms
| Time Limit: 5000MS | Memory Limit: 65536K | |
| Total Submissions: 12483 | Accepted: 3501 |
Description
You may have wondered why most extraterrestrial life forms resemble humans, differing by superficial traits such as height, colour, wrinkles, ears, eyebrows and the like. A few bear no human resemblance; these typically have geometric or amorphous shapes like cubes, oil slicks or clouds of dust.
The answer is given in the 146th episode of Star Trek - The Next Generation, titled The Chase. It turns out that in the vast majority of the quadrant's life forms ended up with a large fragment of common DNA.
Given the DNA sequences of several life forms represented as strings of letters, you are to find the longest substring that is shared by more than half of them.
Input
Standard input contains several test cases. Each test case begins with 1 ≤ n ≤ 100, the number of life forms. n lines follow; each contains a string of lower case letters representing the DNA sequence of a life form. Each DNA sequence contains at least one and not more than 1000 letters. A line containing 0 follows the last test case.
Output
For each test case, output the longest string or strings shared by more than half of the life forms. If there are many, output all of them in alphabetical order. If there is no solution with at least one letter, output "?". Leave an empty line between test cases.
Sample Input
3
abcdefg
bcdefgh
cdefghi
3
xxx
yyy
zzz
0
Sample Output
bcdefg
cdefgh ? 瘠薄
#include<cstdio>
#include<cstring>
#include<cmath>
#include<string>
#include<algorithm>
#include<iostream>
#define maxn 200005
int ws[maxn],wa[maxn],sa[maxn],num[maxn],n,wv[maxn],rank[maxn];
int h[maxn],wb[maxn],sum[maxn],m;
char str[][];
bool cmp(int *r,int a,int b,int l){
return r[a]==r[b]&&r[a+l]==r[b+l];
} void da(int *r,int *sa,int n,int m){
int *t,*x=wa,*y=wb,i,j,p;
for (i=;i<m;i++) ws[i]=;
for (i=;i<n;i++) x[i]=r[i];
for (i=;i<n;i++) ws[x[i]]++;
for (i=;i<m;i++) ws[i]+=ws[i-];
for (i=n-;i>=;i--) sa[--ws[x[i]]]=i;
for (j=,p=;p<n;j*=,m=p){
for (p=,i=n-j;i<n;i++) y[p++]=i;
for (i=;i<n;i++) if (sa[i]-j>=) y[p++]=sa[i]-j;
for (i=;i<m;i++) ws[i]=;
for (i=;i<n;i++) wv[i]=x[y[i]];
for (i=;i<n;i++) ws[wv[i]]++;
for (i=;i<m;i++) ws[i]+=ws[i-];
for (i=n-;i>=;i--) sa[--ws[wv[i]]]=y[i];
for (t=x,x=y,y=t,i=,p=,x[sa[]]=;i<n;i++)
x[sa[i]]=cmp(y,sa[i],sa[i-],j)?p-:p++;
}
}
void cal(int *r,int n){
int i,j,k=;
for (int i=;i<=n;i++) rank[sa[i]]=i;
for (int i=;i<n;h[rank[i++]]=k)
for (k?k--:,j=sa[rank[i]-];r[i+k]==r[j+k];k++);
}
int getid(int k){
int l=,r=n-,mid;
while (l<r){
mid=(l+r)/;
if (sum[mid]<k) {
l=mid+;
}
else
r=mid;
}
return l;
} bool check(int len,int out=){
int i=n+,j,k,id,cnt;
bool f[maxn];
while (){
while (i<=m&&h[i]<len) i++;
if (i>m) break;
memset(f,,sizeof f);
id=getid(sa[i-]);
f[id]=true;
cnt=;
while (i<=m&&h[i]>=len){
id=getid(sa[i]);
if (!f[id]){
f[id]=true;
cnt++;
}
i++;
}
if (out==){
if (*cnt>n) return true;
}
else
if (*cnt>n){
for (k=sa[i-],j=;j<len;k++,j++){
printf("%c",num[k]+'a'-);
}
printf("\n");
} }
return false;
}
int main(){
freopen("tx.in","r",stdin);
int i,j,k;
while (scanf("%d",&n)&&n!=){
scanf("%s",str[]);
if (n==){
printf("%s\n\n",str[]);
continue;
}
sum[]=strlen(str[]);
for (i=;i<n;i++){
scanf("%s",str[i]);
sum[i]=sum[i-]+strlen(str[i])+;
}
for (k=i=;i<n;i++){
for (j=;j<strlen(str[i]);j++){
num[k++]=str[i][j]-'a'+;
}
num[k++]=i;
}
m=k-;
da(num,sa,m+,);
cal(num,m);
int l=,r=m,mid;
while (l<r){
mid=(l+r+)/;
if (check(mid)) l=mid;
else r=mid-;
}
if (l==) printf("?\n\n");
else {
check(l,);
printf("\n");
}
}
}
后缀数组
poj3294 --Life Forms的更多相关文章
- POJ3294 Life Forms —— 后缀数组 最长公共子串
题目链接:https://vjudge.net/problem/POJ-3294 Life Forms Time Limit: 5000MS Memory Limit: 65536K Total ...
- POJ3294 Life Forms(后缀数组)
引用罗穗骞论文中的话: 将n 个字符串连起来,中间用不相同的且没有出现在字符串中的字符隔开,求后缀数组.然后二分答案,用和例3 同样的方法将后缀分成若干组,判断每组的后缀是否出现在不小于k 个的原串中 ...
- poj3294 Life Forms(后缀数组)
[题目链接] http://poj.org/problem?id=3294 [题意] 多个字符串求出现超过R次的最长公共子串. [思路] 二分+划分height,判定一个组中是否包含不小于R个不同字符 ...
- POJ-3294 Life Forms n个字符串中出现超过n/2次的最长子串(按字典序依次输出)
按照以前两个字符串找两者的最长公共子串的思路类似,可以把所有串拼接到一起,这里为了避免讨论LCP跨越多个串需需要特别处理的问题用不同的字符把所有串隔开(因为char只有128位,和可能不够用,更推荐设 ...
- 2018.11.28 poj3294 Life Forms(后缀数组+双指针)
传送门 后缀数组经典题目. 我们先把所有的字符串都接在一起. 然后求出hththt数组和sasasa数组. 然后对于sasasa数组跑双指针统计答案. 如果双指针包括进去的属于不同字符串的数量达到了题 ...
- POJ3294 Life Forms 【后缀数组】
生命形式 时间限制: 5000MS 内存限制: 65536K 提交总数: 16660 接受: 4910 描述 你可能想知道为什么大多数外星人的生命形式与人类相似,不同的是表面特征,如身高,肤色 ...
- POJ3294 Life Forms(二分+后缀数组)
给n个字符串,求最长的多于n/2个字符串的公共子串. 依然是二分判定+height分组. 把这n个字符串连接,中间用不同字符隔开,跑后缀数组计算出height: 二分要求的子串长度,判断是否满足:he ...
- 【POJ3294】 Life Forms (后缀数组+二分)
Life Forms Description You may have wondered why most extraterrestrial life forms resemble humans, d ...
- Life Forms (poj3294 后缀数组求 不小于k个字符串中的最长子串)
(累了,这题做了很久!) Life Forms Time Limit: 5000MS Memory Limit: 65536K Total Submissions: 8683 Accepted ...
随机推荐
- 开始使用Logstash
开始使用Logstash 本节将指导处理安装Logstash 和确认一切是运行正常的, 后来的章节处理增加负载的配置来处理选择的使用案例. 这个章节包含下面的主题: Installing Logsta ...
- 如何做Gibbs采样(how to do gibbs-sampling)
原文地址:<如何做Gibbs采样(how to do gibbs-sampling)> 随机模拟 随机模拟(或者统计模拟)方法最早有数学家乌拉姆提出,又称做蒙特卡洛方法.蒙特卡洛是一个著名 ...
- jtree(选择框)
jtree一般的用法是: 1. 展示电脑中文件的层次结构,如图所示. 具体的代码: package jtree; import java.io.File; import javax.swing.JTr ...
- 教程:30分钟学会Adobe Premiere
原文地址:http://tieba.baidu.com/p/2785313831 视频教程地址
- <转载>C++的链接错误LNK2005
转载http://bbs.csdn.net/topics/70346371 编程中经常能遇到LNK2005错误——重复定义错误,其实LNK2005错误并不是一个很难解决的错误.弄清楚它形成的原因,就可 ...
- 第26讲 对话框AlertDialog的自定义实现
第26讲对话框AlertDialog的自定义实现 比如我们在开发过长当中,要通过介绍系统发送的一个广播弹出一个dialog.但是dialog必需是基于activity才能呈现出来,如果没有activi ...
- poj 1236 Network of Schools(tarjan+缩点)
Network of Schools Description A number of schools are connected to a computer network. Agreements h ...
- python学习Processing
# -*- coding: utf-8 -*-__author__ = 'Administrator'import bisect#排序说明:http://en.wikipedia.org/wiki/i ...
- java自定义随机数(实例)
import java.util.Random; /** * * @author mengzw * @since 3.0 2014-5-22 */ public class RandomTest { ...
- RPG JS跨平台测试
RPG JS虽然说是跨平台的,但是在具体的测试中效果并不理想. 以官方提供的Demo为例 问题一 手机的屏幕太小,导致画面上的人物都很小,连点击都很不准确.在9寸的平板上才可以看得比较清楚. 问题二 ...