Description

The Genographic Project is a research partnership between IBM and The National Geographic Society that is analyzing DNA from hundreds of thousands of contributors to map how the Earth was populated.

As an IBM researcher, you have been tasked with writing a program that will find commonalities amongst given snippets of DNA that can be correlated with individual survey information to identify new genetic markers.

A DNA base sequence is noted by listing the nitrogen bases in the order in which they are found in the molecule. There are four bases: adenine (A), thymine (T), guanine (G), and cytosine (C). A 6-base DNA sequence could be represented as TAGACC.

Given a set of DNA base sequences, determine the longest series of bases that occurs in all of the sequences.

Input

Input to this problem will begin with a line containing a single integer n indicating the number of datasets. Each dataset consists of the following components:

  • A single positive integer m (2 <= m <= 10) indicating the number of base sequences in this dataset.
  • m lines each containing a single base sequence consisting of 60 bases.

Output

For each dataset in the input, output the longest base subsequence common to all of the given base sequences. If the longest common subsequence is less than three bases in length, display the string "no significant commonalities" instead. If multiple subsequences of the same longest length exist, output only the subsequence that comes first in alphabetical order.

Sample Input

3
2
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
3
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
GATACTAGATACTAGATACTAGATACTAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
GATACCAGATACCAGATACCAGATACCAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
3
CATCATCATCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
ACATCATCATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AACATCATCATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT

Sample Output

no significant commonalities
AGATAC
CATCATCAT

Source

#include<iostream>
#include<cstdio>
#include<algorithm>
#include<cstring>
#include<vector>
#include<string>
using namespace std;
typedef long long LL;
#define MAXN 63
/*
枚举所有子串,用KMP算法检查是否出现过,选择其中的最优解
*/
char s[MAXN][MAXN];
int next[MAXN];
void kmp_pre(char x[])
{
int i,j,m=strlen(x);
j = next[] = -;
i = ;
while(i<m)
{
if(j!=-&&x[i]!=x[j])
j = next[j];
next[++i] = ++j;
}
}
bool kmp(char x[],char y[])
{
int i,j,ans = ,m=strlen(x),n=strlen(y);
kmp_pre(x);
i=j=;
while(i<n)
{
while(j!=-&&y[i]!=x[j])
j = next[j];
i++;
j++;
if(j>=m)
return true;
}
return false;
}
int main()
{
int t;
scanf("%d",&t);
while(t--)
{
int m;
char ans[MAXN] = "Z";
scanf("%d",&m);
for(int i=;i<m;i++)
scanf("%s",s[i]);
for(int len=;len>=;len--)
for(int i=;i<=-len;i++)
{
char b[MAXN] = {};
strncpy(b,s[]+i,len);
//cout<<len<<":"<<b<<endl;
int j;
for(j=;j<m;j++)
{
if(!kmp(b,s[j]))
break;
}
if(j==m&&strcmp(ans,b)>)
{
strcpy(ans,b);
}
if(ans[]!='Z'&&i==-len)
{
i = ;
len = ;
}
}
if(ans[]=='Z')
printf("no significant commonalities\n");
else
printf("%s\n",ans);
}
return ;
}

Blue Jeans POJ 3080 寻找多个串的最长相同子串的更多相关文章

  1. (字符串 KMP)Blue Jeans -- POJ -- 3080:

    链接: http://poj.org/problem?id=3080 http://acm.hust.edu.cn/vjudge/contest/view.action?cid=88230#probl ...

  2. Blue Jeans - POJ 3080(多串的共同子串)

    题目大意:有M个串,每个串的长度都是60,查找这M个串的最长公共子串(连续的),长度不能小于3,如果同等长度的有多个输出字典序最小的那个.   分析:因为串不多,而且比较短,所致直接暴力枚举的第一个串 ...

  3. Match:Blue Jeans(POJ 3080)

    DNA序列 题目大意:给你m串字符串,要你找最长的相同的连续字串 这题暴力kmp即可,注意要按字典序排序,同时,是len<3才输出no significant commonalities #in ...

  4. Blue Jeans - poj 3080(后缀数组)

    大致题意: 给出n个长度为60的DNA基因(A腺嘌呤 G鸟嘌呤 T胸腺嘧啶 C胞嘧啶)序列,求出他们的最长公共子序列 使用后缀数组解决 #include<stdio.h> #include ...

  5. POJ 3294 出现在至少K个字符串中的子串

    在掌握POJ 2774(两个串求最长公共子串)以及对Height数组分组后,本题还是容易想出思路的. 首先用字符集外的不同字符连接所有串,这是为了防止两个后缀在比较时超过某个字符串的分界.二分子串的长 ...

  6. POJ 3080 Blue Jeans (求最长公共字符串)

    POJ 3080 Blue Jeans (求最长公共字符串) Description The Genographic Project is a research partnership between ...

  7. POJ 3080 Blue Jeans(后缀数组+二分答案)

    [题目链接] http://poj.org/problem?id=3080 [题目大意] 求k个串的最长公共子串,如果存在多个则输出字典序最小,如果长度小于3则判断查找失败. [题解] 将所有字符串通 ...

  8. POJ 3080 Blue Jeans(Java暴力)

    Blue Jeans [题目链接]Blue Jeans [题目类型]Java暴力 &题意: 就是求k个长度为60的字符串的最长连续公共子串,2<=k<=10 规定: 1. 最长公共 ...

  9. POJ 3080 Blue Jeans 找最长公共子串(暴力模拟+KMP匹配)

    Blue Jeans Time Limit: 1000MS   Memory Limit: 65536K Total Submissions: 20966   Accepted: 9279 Descr ...

随机推荐

  1. 必会!Linux文件的管理

    1.1 创建一个目录 /data [root@liuhao ~]# mkdir /data 1.2 查看目录是否创建成功 <可以找到data即为创建成功> [root@liuhao ~]# ...

  2. es6 import 与 export

    1.export 命令 export 命令用于规定模块的对外接口. 一个模块就是一个独立的文件.该文件内部所有的变量,外部无法获取.要想外部能够读取模块内部的某个变量,就必须使用 export 关键字 ...

  3. 【转】 [MySQL 查询语句]——分组查询group by

    group by (1) group by的含义:将查询结果按照1个或多个字段进行分组,字段值相同的为一组(2) group by可用于单个字段分组,也可用于多个字段分组 select * from ...

  4. 如何看待B站疑似源码泄漏的问题?

    今天突然看到关于B站源码泄漏事.网曝B站整个网站后台工程源码遭泄露,开源项目平台Github上疑似出现了Bilibili网站后台工程,内含部分用户名密码.目前官方还没对此事作出任何回应,所以还无法确定 ...

  5. python 学习笔记一 (数据结构和算法)

    2018年刚刚过完年,从今天起,做一个认真的技术人.开始进入记笔记阶段. python内置了很多数据结构,list , set,dictionary 1.将序列分解为单独的变量 1.1 通过赋值的方式 ...

  6. 配置Oracle数据库的开机自启动

    每当数据库服务器重启后,都要重新启动数据库的监听和实例,特别是在服务器断电重启.例行维护性的场景下.能否像Windows服务器一样,让实例和监听随着服务的启动而启动呢?答案当然是肯定的,我们可以利用O ...

  7. [转]asp.net MVC 常见安全问题及解决方案

    本文转自:http://www.cnblogs.com/Jessy/p/3539564.html asp.net MVC 常见安全问题及解决方案 一.CSRF (Cross-site request ...

  8. 6.11---swagger文件上传的写法【照着写就行了,主要是需要声明contentType未mutilpart---如果不设置这个,就无法识别文件的】

    MultipartFile 是直接接收前台传过来的文件,File是抽象出来的文件对象,用来表示文件,一般操作都是操作的File,所以需要将MultipartFile转为File controller写 ...

  9. PHP魔术法__set和__get

    __set: 在给不可访问属性赋值时,__set() 会被调用.语法如下: public void __set ( string $name , mixed $value ) __get: 读取不可访 ...

  10. Android开发初体验

    本文通过开发一个应用来学习Android基本概念及构成应用的UI组件. 开发的应用名叫GeoQuiz,它能给出一道道地理知识问题.用户点击true或false按钮回答问题,应用即时做出反馈 第一步请先 ...