UVA11107 Life Forms (suffix array template)
| Time Limit: 5000MS | Memory Limit: 65536K |
| Total Submissions: 16827 | Accepted: 4943 |
Description
You may have wondered why most extraterrestrial life forms resemble humans, differing by superficial traits such as height, colour, wrinkles, ears, eyebrows and the like. A few bear no human resemblance; these typically have geometric or amorphous shapes like cubes, oil slicks or clouds of dust.
The answer is given in the 146th episode of Star Trek - The Next Generation, titled The Chase. It turns out that the vast majority of the quadrant's life forms ended up with a large fragment of common DNA.
Given the DNA sequences of several life forms represented as strings of letters, you are to find the longest substring that is shared by more than half of them.
Input
Standard input contains several test cases. Each test case begins with 1 ≤ n ≤ 100, the number of life forms. n lines follow; each contains a string of lower case letters representing the DNA sequence of a life form. Each DNA sequence contains at least one and not more than 1000 letters. A line containing 0 follows the last test case.
Output
For each test case, output the longest string or strings shared by more than half of the life forms. If there are many, output all of them in alphabetical order. If there is no solution with at least one letter, output "?". Leave an empty line between test cases.
Sample Input
3
abcdefg
bcdefgh
cdefghi
3
xxx
yyy
zzz
0
Sample Output
bcdefg
cdefgh

?
Source
[Solution]
UVA was down, so I submitted this on POJ.
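Concatenate all the strings and append a separator after each one. The separator marks where a string ends, and it also stops a suffix from running on into the following string and forming an LCP with other strings, which would corrupt the grouping.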
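The concatenation step can be sketched roughly as follows. This is a minimal illustration, not the code from this post: the `Joined` struct and the helpers `join_with_separators` and `inside_one_string` are hypothetical names, and the sketch is 0-indexed while the solution below is 1-indexed. Like the solution, it assumes a single separator character `'z' + 1` and an owner array that records which string each position belongs to.

```cpp
#include <string>
#include <vector>

// Hypothetical helper type (not from the original post).
struct Joined {
    std::string text;        // every input string followed by one separator
    std::vector<int> owner;  // owner[i]: index of the string that position i belongs to
};

const char SEP = 'z' + 1;  // '{', which sorts after every lowercase letter

Joined join_with_separators(const std::vector<std::string>& strs) {
    Joined j;
    for (int id = 0; id < (int)strs.size(); ++id) {
        j.text += strs[id];
        j.text += SEP;
        // New positions, including the separator, are attributed to string `id`.
        j.owner.resize(j.text.size(), id);
    }
    return j;
}

// True if text[pos .. pos+len-1] (len >= 1) contains no separator.
// The position just past the substring must still belong to the same
// string, where the separator counts as the last position of its string.
bool inside_one_string(const Joined& j, int pos, int len) {
    int end = pos + len;
    return end < (int)j.text.size() && j.owner[end] == j.owner[pos];
}
```

Because the same separator is reused for every string, two suffixes can share a prefix that runs across it. A check like `inside_one_string` is what rules those matches out.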
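Binary search on the answer length and split the suffix array into groups by height; it is enough to check whether some group contains suffixes from more than n/2 different strings.
There are many details. The key one is that the actual strings have to be printed, not just the length, and that takes some really ugly bookkeeping with all kinds of flag variables.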
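To make the binary search and grouping concrete, here is a small reference sketch. It is not the author's solution: `longest_shared` is a hypothetical name, the suffixes are sorted naively instead of with the doubling algorithm, and groups are found by comparing length-x prefixes directly rather than through a height array. Those choices make it far too slow for the real limits, but it can be handy for cross-checking the fast version on small inputs.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical reference solver, only meant for small inputs.
// Returns the answers in alphabetical order; an empty result means "?".
std::vector<std::string> longest_shared(const std::vector<std::string>& strs) {
    const char SEP = 'z' + 1;
    std::string text;
    std::vector<int> owner;  // owner[i]: which input string position i belongs to
    int longest = 0;
    for (int id = 0; id < (int)strs.size(); ++id) {
        text += strs[id];
        text += SEP;
        owner.resize(text.size(), id);
        longest = std::max(longest, (int)strs[id].size());
    }
    const int n = (int)text.size();

    // room[p]: number of letters from position p up to the next separator.
    std::vector<int> room(n + 1, 0);
    for (int p = n - 1; p >= 0; --p)
        room[p] = (text[p] == SEP) ? 0 : room[p + 1] + 1;

    // Naive suffix array: sort start positions by comparing whole suffixes.
    std::vector<int> sa(n);
    for (int i = 0; i < n; ++i) sa[i] = i;
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return text.compare(a, std::string::npos, text, b, std::string::npos) < 0;
    });

    const int need = (int)strs.size() / 2 + 1;  // "more than half"

    // Collects every length-x substring that occurs in at least `need` strings.
    auto check = [&](int x, std::vector<std::string>& out) {
        out.clear();
        std::vector<int> seen(strs.size(), 0);  // seen[s] == group: s already counted
        int group = 0, distinct = 0;
        bool reported = false;
        std::string key;
        for (int i = 0; i < n; ++i) {
            int p = sa[i];
            if (room[p] < x) continue;  // the first x characters would hit a separator
            std::string cur = text.substr(p, x);
            if (group == 0 || cur != key) {  // a new group starts here
                key = cur; ++group; distinct = 0; reported = false;
            }
            if (seen[owner[p]] != group) { seen[owner[p]] = group; ++distinct; }
            if (distinct >= need && !reported) { out.push_back(key); reported = true; }
        }
        return !out.empty();
    };

    // If a length works, every shorter length works too, so binary search applies.
    std::vector<std::string> best, cur;
    int lo = 1, hi = longest;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        if (check(mid, cur)) { best = cur; lo = mid + 1; }
        else hi = mid - 1;
    }
    return best;
}
```

Suffixes that share a separator-free prefix of length x sit next to each other in the suffix array, so one pass in sorted order sees each candidate exactly once, already in alphabetical order. On the first sample this sketch yields `bcdefg` and `cdefgh`, and on the second it returns an empty list.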
#include <iostream>
#include <cstdio>
#include <cstring>
#include <cstdlib>
#include <algorithm>
#include <queue>
#include <vector>
#include <cmath>
#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))
#define abs(a) ((a) < 0 ? (-1 * (a)) : (a))
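template <class T>
inline void swap(T& a, T& b)
{
    T tmp = a;a = b;b = tmp;
}
// Fast integer reader; c remembers the last non-digit so a leading '-' is honoured.
inline void read(int &x)
{
    x = 0;char ch = getchar(), c = ch;
    while(ch < '0' || ch > '9') c = ch, ch = getchar();
    while(ch <= '9' && ch >= '0') x = x * 10 + ch - '0', ch = getchar();
    if(c == '-') x = -x;
}
const int INF = 0x3f3f3f3f;
// Assumed bound: 100 strings * (1000 letters + 1 separator) = 100100 positions,
// plus slack for the num[sa + x] look-ahead in check().
const int MAXN = 200000 + 10;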
struct SuffixArray
{
    // s is 1-indexed; sa[i] is the start of the i-th smallest suffix;
    // height[i] is the LCP of suffixes sa[i - 1] and sa[i].
    char s[MAXN];int sa[MAXN], t1[MAXN], t2[MAXN], rank[MAXN], height[MAXN], c[MAXN], n;
    void clear(){n = 0;memset(sa, 0, sizeof(sa));}
    // Prefix doubling with radix sort; m is the largest character value.
    void build_sa(int m)
    {
        int i, *x = t1, *y = t2;
        for(i = 0;i <= m;++ i) c[i] = 0;
        for(i = 1;i <= n;++ i) ++ c[x[i] = s[i]];
        for(i = 1;i <= m;++ i) c[i] += c[i - 1];
        for(i = n;i >= 1;-- i) sa[c[x[i]] --] = i;
        for(int k = 1;k <= n;k <<= 1)
        {
            int p = 0;
            // Sort by second key: suffixes with no second half come first.
            for(i = n - k + 1;i <= n;++ i) y[++ p] = i;
            for(i = 1;i <= n;++ i) if(sa[i] > k) y[++ p] = sa[i] - k;
            // Stable counting sort by first key.
            for(i = 0;i <= m;++ i) c[i] = 0;
            for(i = 1;i <= n;++ i) ++ c[x[y[i]]];
            for(i = 1;i <= m;++ i) c[i] += c[i - 1];
            for(i = n;i >= 1;-- i) sa[c[x[y[i]]] --] = y[i];
            // Rebuild the ranks for length 2k.
            swap(x, y);p = 0,x[sa[1]] = ++ p;
            for(i = 2;i <= n;++ i) x[sa[i]] = sa[i] + k <= n && sa[i - 1] + k <= n && y[sa[i]] == y[sa[i - 1]] && y[sa[i] + k] == y[sa[i - 1] + k] ? p : ++ p;
            if(p >= n) break;m = p;
        }
    }
    // Kasai-style height computation.
    void build_height()
    {
        int i,j,k = 0;
        for(i = 1;i <= n;++ i) rank[sa[i]] = i;
        for(i = 1;i <= n;++ i)
        {
            if(k) -- k; if(rank[i] == 1) continue;
            j = sa[rank[i] - 1];
            while(i + k <= n && j + k <= n && s[i + k] == s[j + k]) ++ k;
            height[rank[i]] = k;
        }
    }
}A;
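// num[i]: index of the string that owns position i (separator included).
// vis[s] == tt means string s is already counted in the current group.
// tt is assumed to start at 1 so the zero-initialised vis[] never matches.
int cnt, n, tmp, ans, num[MAXN], vis[MAXN], tt = 1;
std::vector<int> node; // one suffix rank per answer, in lexicographic order
// Returns 1 if some substring of length x occurs in more than cnt/2 strings.
int check(int x)
{
    // num: distinct strings in the current group; t: group not started yet;
    // f: group not reported yet; flag: at least one group qualified.
    int num = 0, flag = 0, t = 1, f = 1;
    for(int i = 2;i <= n;++ i)
    {
        // Break the group if either suffix starts at a separator or its first x characters cross one.
        if(A.s[A.sa[i]] == 'z' + 1 || A.s[A.sa[i - 1]] == 'z' + 1 || ::num[A.sa[i] + x] != ::num[A.sa[i]] || ::num[A.sa[i - 1] + x] != ::num[A.sa[i - 1]])
        {
            num = 0, ++ tt, t = 1, f = 1;
            continue;
        }
        if(A.height[i] >= x)
        {
            if(t)
            {
                // First pair of the group: count suffix sa[i - 1] as well.
                t = 0;
                vis[::num[A.sa[i - 1]]] = tt;
                ++ num;
            }
            if(vis[::num[A.sa[i]]] == tt) continue;
            ++ num, vis[::num[A.sa[i]]] = tt;
        }
        else num = 0, ++ tt, t = 1, f = 1;
        if(num >= cnt/2 + 1 && f)
        {
            if(!flag) node.clear();
            node.push_back(i);
            f = 0;
            flag = 1;
        }
    }
    return flag;
}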
// Print the first ans characters of each recorded suffix.
void put()
{
    for(int i = 0;i < (int)node.size();++ i)
    {
        for(int j = A.sa[node[i]], k = 1;k <= ans;++ k, ++ j)
            printf("%c", A.s[j]);
        putchar('\n');
    }
}
int main()
{
    while(scanf("%d", &cnt) != EOF && cnt)
    {
        // With a single string, the string itself is the answer.
        if(cnt == 1)
        {
            scanf("%s", A.s + 1);
            printf("%s\n\n", A.s + 1);
            continue;
        }
        A.clear();n = 0;tmp = 0;
        for(int i = 1;i <= cnt;++ i)
        {
            scanf("%s", A.s + n + 1);
            int t = n + 1;
            tmp = max(tmp, strlen(A.s + n + 1)); // longest input string
            n += strlen(A.s + n + 1);
            A.s[++ n] = 'z' + 1;                 // separator after every string
            for(int j = t;j <= n;++ j) num[j] = i;
        }
        A.n = n;
        A.build_sa('z' + 1);
        A.build_height();
        // Binary search on the answer length.
        int l = 1, r = tmp, mid;ans = 0;
        while(l <= r)
        {
            mid = (l + r) >> 1;
            if(check(mid)) l = mid + 1, ans = mid;
            else r = mid - 1;
        }
        if(ans) put();
        else printf("?\n");
        putchar('\n');
    }
    return 0;
}