Description

The Genographic Project is a research partnership between IBM and The National Geographic Society that is analyzing DNA from hundreds of thousands of contributors to map how the Earth was populated.

As an IBM researcher, you have been tasked with writing a program that will find commonalities amongst given snippets of DNA that can be correlated with individual survey information to identify new genetic markers.

A DNA base sequence is noted by listing the nitrogen bases in the order in which they are found in the molecule. There are four bases: adenine (A), thymine (T), guanine (G), and cytosine (C). A 6-base DNA sequence could be represented as TAGACC.

Given a set of DNA base sequences, determine the longest series of bases that occurs in all of the sequences.

Input

Input to this problem will begin with a line containing a single integer n indicating the number of datasets. Each dataset consists of the following components:

  • A single positive integer m (2 <= m <= 10) indicating the number of base sequences in this dataset.
  • m lines each containing a single base sequence consisting of 60 bases.

Output

For each dataset in the input, output the longest base subsequence common to all of the given base sequences. If the longest common subsequence is less than three bases in length, display the string "no significant commonalities" instead. If multiple subsequences of the same longest length exist, output only the subsequence that comes first in alphabetical order.

Sample Input

3
2
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
3
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
GATACTAGATACTAGATACTAGATACTAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
GATACCAGATACCAGATACCAGATACCAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
3
CATCATCATCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
ACATCATCATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AACATCATCATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT

Sample Output

no significant commonalities
AGATAC
CATCATCAT

Source

#include<iostream>
#include<cstdio>
#include<algorithm>
#include<cstring>
#include<vector>
#include<string>
using namespace std;
typedef long long LL;
#define MAXN 63
/*
枚举所有子串,用KMP算法检查是否出现过,选择其中的最优解
*/
char s[MAXN][MAXN];
int next[MAXN];
void kmp_pre(char x[])
{
int i,j,m=strlen(x);
j = next[] = -;
i = ;
while(i<m)
{
if(j!=-&&x[i]!=x[j])
j = next[j];
next[++i] = ++j;
}
}
bool kmp(char x[],char y[])
{
int i,j,ans = ,m=strlen(x),n=strlen(y);
kmp_pre(x);
i=j=;
while(i<n)
{
while(j!=-&&y[i]!=x[j])
j = next[j];
i++;
j++;
if(j>=m)
return true;
}
return false;
}
int main()
{
int t;
scanf("%d",&t);
while(t--)
{
int m;
char ans[MAXN] = "Z";
scanf("%d",&m);
for(int i=;i<m;i++)
scanf("%s",s[i]);
for(int len=;len>=;len--)
for(int i=;i<=-len;i++)
{
char b[MAXN] = {};
strncpy(b,s[]+i,len);
//cout<<len<<":"<<b<<endl;
int j;
for(j=;j<m;j++)
{
if(!kmp(b,s[j]))
break;
}
if(j==m&&strcmp(ans,b)>)
{
strcpy(ans,b);
}
if(ans[]!='Z'&&i==-len)
{
i = ;
len = ;
}
}
if(ans[]=='Z')
printf("no significant commonalities\n");
else
printf("%s\n",ans);
}
return ;
}

Blue Jeans POJ 3080 寻找多个串的最长相同子串的更多相关文章

  1. (字符串 KMP)Blue Jeans -- POJ -- 3080:

    链接: http://poj.org/problem?id=3080 http://acm.hust.edu.cn/vjudge/contest/view.action?cid=88230#probl ...

  2. Blue Jeans - POJ 3080(多串的共同子串)

    题目大意:有M个串,每个串的长度都是60,查找这M个串的最长公共子串(连续的),长度不能小于3,如果同等长度的有多个输出字典序最小的那个.   分析:因为串不多,而且比较短,所致直接暴力枚举的第一个串 ...

  3. Match:Blue Jeans(POJ 3080)

    DNA序列 题目大意:给你m串字符串,要你找最长的相同的连续字串 这题暴力kmp即可,注意要按字典序排序,同时,是len<3才输出no significant commonalities #in ...

  4. Blue Jeans - poj 3080(后缀数组)

    大致题意: 给出n个长度为60的DNA基因(A腺嘌呤 G鸟嘌呤 T胸腺嘧啶 C胞嘧啶)序列,求出他们的最长公共子序列 使用后缀数组解决 #include<stdio.h> #include ...

  5. POJ 3294 出现在至少K个字符串中的子串

    在掌握POJ 2774(两个串求最长公共子串)以及对Height数组分组后,本题还是容易想出思路的. 首先用字符集外的不同字符连接所有串,这是为了防止两个后缀在比较时超过某个字符串的分界.二分子串的长 ...

  6. POJ 3080 Blue Jeans (求最长公共字符串)

    POJ 3080 Blue Jeans (求最长公共字符串) Description The Genographic Project is a research partnership between ...

  7. POJ 3080 Blue Jeans(后缀数组+二分答案)

    [题目链接] http://poj.org/problem?id=3080 [题目大意] 求k个串的最长公共子串,如果存在多个则输出字典序最小,如果长度小于3则判断查找失败. [题解] 将所有字符串通 ...

  8. POJ 3080 Blue Jeans(Java暴力)

    Blue Jeans [题目链接]Blue Jeans [题目类型]Java暴力 &题意: 就是求k个长度为60的字符串的最长连续公共子串,2<=k<=10 规定: 1. 最长公共 ...

  9. POJ 3080 Blue Jeans 找最长公共子串(暴力模拟+KMP匹配)

    Blue Jeans Time Limit: 1000MS   Memory Limit: 65536K Total Submissions: 20966   Accepted: 9279 Descr ...

随机推荐

  1. explain 详解 (转)

    原文:http://blog.csdn.net/zhuxineli/article/details/14455029 explain显示了MySQL如何使用索引来处理select语句以及连接表.可以帮 ...

  2. codevs3304水果姐逛街(线段数)

    3304 水果姐逛水果街Ⅰ  时间限制: 2 s  空间限制: 256000 KB  题目等级 : 钻石 Diamond   题目描述 Description 水果姐今天心情不错,来到了水果街. 水果 ...

  3. codevs1026商务旅行

    1036 商务旅行  时间限制: 1 s  空间限制: 128000 KB  题目等级 : 钻石 Diamond 题解       题目描述 Description 某首都城市的商人要经常到各城镇去做 ...

  4. [jzoj NOIP2018模拟10.23]

    丢分主要是下面几个方面: 1.T2代码交错了,有个特判没写丢了10分 2.T1线段树加等差数列写错了(其实二维差分就可以,但我当时不会) 3.T3思考再三还是为了10分写上了主席树,还是写错了 总体评 ...

  5. akka设计模式系列-Chain模式

    链式调用在很多框架和系统中经常存在,算不得上是我自己总结的设计模式,此处只是简单介绍在Akka中的两种实现方式.我在这边博客中简化了链式调用的场景,简化后也更符合Akka的设计哲学. trait Ch ...

  6. CF 351A - Jeff and Rounding DP

    http://codeforces.com/problemset/problem/351/C 题意:有2*n个浮点数a1,a2,a3...a2*n,把他们分成n队,对于每对<A,B>,对A ...

  7. python值函数名的使用以及闭包,迭代器

    一.函数名的运用 函数名就是一个变量名,但它是一个特殊的变量名,是一个后面加括号可以执行函数的变量名. def func(): print("我是一个小小的函数") a = fun ...

  8. Sqoop 产生背景(一)

    Sqoop 的产生主要源于: 1.目前很多使用hadoop技术的企业,有大量的数据存储在传统关系型数据库中. 2.早期由于工具的缺乏,hadoop与传统数据库之间的数据传输非常困难. 1)传统数据库中 ...

  9. Ambari是啥?主要是干啥的?

    简单来说,Ambari是一个拥有集群自动化安装.中心化管理.集群监控.报警功能的一个工具(软件),使得安装集群从几天的时间缩短在几个小时内,运维人员从数十人降低到几人以内,极大的提高集群管理的效率. ...

  10. EF在应用程序配置文件中找不到名为“XXX”的连接字符串。

    现象: 在配置EF的时候需要如题所述的问题,仔细检查了在EF实体模型对应程序集下的APP.Config文件中的ConnectionString配置项有了XXX项的数据库字符串的配置: <conn ...