The Genographic Project is a research partnership between IBM and The National Geographic Society that is analyzing DNA from hundreds of thousands of contributors to map how the Earth was populated.

As an IBM researcher, you have been tasked with writing a
program that will find commonalities amongst given snippets of DNA that
can be correlated with individual survey information to identify new
genetic markers.

A DNA base sequence is noted by listing the nitrogen bases in
the order in which they are found in the molecule. There are four
bases: adenine (A), thymine (T), guanine (G), and cytosine (C). A 6-base
DNA sequence could be represented as TAGACC.

Given a set of DNA base sequences, determine the longest series of bases that occurs in all of the sequences.

Input

Input to this problem will begin with a line containing a single
integer n indicating the number of datasets. Each dataset consists of
the following components:

  • A single positive integer m (2 <= m <= 10) indicating the number of base sequences in this dataset.
  • m lines each containing a single base sequence consisting of 60 bases.

Output

For each dataset in the input, output the longest base
subsequence common to all of the given base sequences. If the longest
common subsequence is less than three bases in length, display the
string "no significant commonalities" instead. If multiple subsequences
of the same longest length exist, output only the subsequence that comes
first in alphabetical order.

Sample Input

3
2
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
3
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
GATACTAGATACTAGATACTAGATACTAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
GATACCAGATACCAGATACCAGATACCAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
3
CATCATCATCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
ACATCATCATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AACATCATCATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT

Sample Output

no significant commonalities
AGATAC
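CATCATCAT

Brute force looked feasible, but I did not write it at first. I wanted to use KMP but had no idea where to start, so I studied how others solved the problem.

Brute force: enumerate every substring of the first string and check whether it occurs in all of the other strings. One trick is to enumerate lengths from longest to shortest, so the first length that produces a match is already the answer.
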
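 #include <cstdio>
 #include <iostream>
 #include <set>
 #include <string>
 #include <vector>
 using namespace std;

 const int LEN = 60;      // every base sequence has exactly 60 bases
 const int MIN_LEN = 3;   // shorter matches are not significant

 vector<string> t;        // sequences of the current dataset
 set<string> ss;          // common substrings of the current length, kept sorted
 string s;
 int _, n;

 string fun() {
     ss.clear();
     string str = t[0];
     bool flag;
     // Try lengths from longest to shortest; the first length with a hit wins.
     for (int len = LEN; len >= MIN_LEN; len--) {
         for (int ix = 0; ix <= LEN - len; ix++) {
             string temp = str.substr(ix, len);
             flag = true;
             for (size_t k = 1; k < t.size(); k++) {
                 if (t[k].find(temp) == string::npos) {
                     flag = false;
                     break;
                 }
             }
             if (flag) ss.insert(temp);
         }
         // set<string> is ordered, so begin() is the alphabetically first match.
         if (ss.size()) return *ss.begin();
     }
     return "no significant commonalities";
 }

 int main() {
     // freopen("in","r",stdin);
     for (scanf("%d", &_); _; _--) {
         scanf("%d", &n);
         for (int i = 0; i < n; i++) {
             cin >> s;
             t.push_back(s);
         }
         cout << fun() << endl;
         t.clear();
     }
 }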

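KMP idea: there is no need to enumerate every substring of the first string. It is enough to enumerate each of its suffixes and match that suffix against the other strings. While matching, KMP tracks the longest prefix of the suffix matched so far, so one pass per suffix effectively tests every substring that starts at that position.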

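 #include <cstdio>
 #include <algorithm>
 #include <iostream>
 #include <string>
 #include <vector>
 using namespace std;

 const int LEN = 60;        // every base sequence has exactly 60 bases
 int _, n, Next[LEN + 5];   // failure table; needs at least LEN + 1 entries
 string s, strans;
 vector<string> t;

 // Build the (optimized) KMP failure table for pattern s.
 void prekmp(const string& s) {
     int len = s.size();
     int i, j;
     j = Next[0] = -1;
     i = 0;
     while (i < len) {
         while (j != -1 && s[i] != s[j]) j = Next[j];
         if (s[++i] == s[++j]) Next[i] = Next[j];
         else Next[i] = j;
     }
 }

 // Length of the longest prefix of p found anywhere in text.
 // Next must already hold the table for p.
 int kmp(const string& p, const string& text) {
     int len = text.size();
     int i = 0, j = 0, res = -1;
     while (i < len) {
         while (j != -1 && text[i] != p[j]) j = Next[j];
         ++i; ++j;
         res = max(res, j);
     }
     return res;
 }

 int main() {
     // freopen("in","r",stdin);
     for (scanf("%d", &_); _; _--) {
         scanf("%d", &n);
         for (int i = 0; i < n; i++) {
             cin >> s;
             t.push_back(s);
         }
         int ans = -1;          // -1 forces strans to be reset by the first suffix
         string str = t[0];
         for (int i = 0; i < LEN; i++) {
             string temp = str.substr(i, LEN - i);   // suffix starting at i
             prekmp(temp);
             int maxx = LEN;    // upper bound, lowered to the shortest match below
             for (size_t j = 1; j < t.size(); j++) {
                 maxx = min(maxx, kmp(temp, t[j]));
             }
             if (maxx > ans) {
                 strans = temp.substr(0, maxx);
                 ans = maxx;
             } else if (maxx == ans) {
                 // Same length: keep the alphabetically smaller candidate.
                 string anstemp = temp.substr(0, maxx);
                 if (anstemp < strans) strans = anstemp;
             }
         }
         if (strans.size() < 3) cout << "no significant commonalities" << '\n';
         else cout << strans << '\n';
         t.clear();
     }
 }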
