The Genographic Project is a research partnership between IBM and The National Geographic Society that is analyzing DNA from hundreds of thousands of contributors to map how the Earth was populated.

As an IBM researcher, you have been tasked with writing a
program that will find commonalities amongst given snippets of DNA that
can be correlated with individual survey information to identify new
genetic markers.

A DNA base sequence is noted by listing the nitrogen bases in
the order in which they are found in the molecule. There are four
bases: adenine (A), thymine (T), guanine (G), and cytosine (C). A 6-base
DNA sequence could be represented as TAGACC.

Given a set of DNA base sequences, determine the longest series of bases that occurs in all of the sequences.

Input

Input to this problem will begin with a line containing a single
integer n indicating the number of datasets. Each dataset consists of
the following components:

  • A single positive integer m (2 <= m <= 10) indicating the number of base sequences in this dataset.
  • m lines each containing a single base sequence consisting of 60 bases.

Output

For each dataset in the input, output the longest base
subsequence common to all of the given base sequences. If the longest
common subsequence is less than three bases in length, display the
string "no significant commonalities" instead. If multiple subsequences
of the same longest length exist, output only the subsequence that comes
first in alphabetical order.

Sample Input

3
2
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
3
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
GATACTAGATACTAGATACTAGATACTAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
GATACCAGATACCAGATACCAGATACCAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
3
CATCATCATCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
ACATCATCATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AACATCATCATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT

Sample Output

no significant commonalities
AGATAC
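CATCATCAT

Brute force looked feasible, but I did not write it at first. I wanted to use KMP but did not know where to start, so I studied how others approached the problem.

The brute-force idea: enumerate every substring of the first string, then check whether it occurs in each of the other strings. One trick is to enumerate lengths from longest to shortest, so the first length that yields a common substring is the answer.

Brute force: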
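#include <iostream>
#include <cstdio>
#include <string>
#include <set>
#include <vector>
using namespace std;

vector<string> t;   // base sequences of the current dataset
set<string> ss;     // common substrings of the current length, kept sorted
string s;
int _, n;

string fun() {
    ss.clear();
    string str = t[0];
    bool flag;
    // Every sequence has 60 bases; results shorter than 3 are not significant.
    for (int len = 60; len >= 3; len--) {
        for (int ix = 0; ix <= 60 - len; ix++) {
            string temp = str.substr(ix, len);
            flag = true;
            for (int k = 1; k < (int)t.size(); k++) {
                if (t[k].find(temp) == string::npos) {
                    flag = false;
                    break;
                }
            }
            if (flag) ss.insert(temp);
        }
        // The set is ordered, so begin() is the alphabetically first candidate.
        if (!ss.empty()) return *ss.begin();
    }
    return "no significant commonalities";
}

int main() {
    // freopen("in","r",stdin);
    for (scanf("%d", &_); _; _--) {
        scanf("%d", &n);
        for (int i = 0; i < n; i++) {
            cin >> s;
            t.push_back(s);
        }
        cout << fun() << endl;
        t.clear();
    }
}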

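KMP approach: there is no need to enumerate every substring of the first string. It is enough to enumerate each of its suffixes and match that suffix against the other strings. KMP matching yields the longest prefix of the suffix that appears in each text, and every substring is a prefix of some suffix, so this is equivalent to matching all substrings.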

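#include <cstdio>
#include <iostream>
#include <string>
#include <algorithm>
#include <vector>
using namespace std;

int _, n, Next[65];   // array size assumed; a 60-base pattern needs at least 61 entries
string s, strans;
vector<string> t;

// Build the (optimized) failure table for pattern s.
void prekmp(const string& s) {
    int len = s.size();
    int i, j;
    j = Next[0] = -1;
    i = 0;
    while (i < len) {
        while (j != -1 && s[i] != s[j]) j = Next[j];
        if (s[++i] == s[++j]) Next[i] = Next[j];
        else Next[i] = j;
    }
}

// Return the length of the longest prefix of p that occurs anywhere in t.
int kmp(const string& p, const string& t) {
    int len = t.size();
    int i = 0, j = 0, res = -1;
    while (i < len) {
        while (j != -1 && t[i] != p[j]) j = Next[j];
        ++i; ++j;
        res = max(res, j);
    }
    return res;
}

int main() {
    // freopen("in","r",stdin);
    for (scanf("%d", &_); _; _--) {
        scanf("%d", &n);
        for (int i = 0; i < n; i++) {
            cin >> s;
            t.push_back(s);
        }
        int ans = -1;
        string str = t[0];
        // Try every suffix of the first sequence (60 bases each).
        for (int i = 0; i < 60; i++) {
            string temp = str.substr(i, 60 - i);
            prekmp(temp);
            int maxx = 60;   // upper bound on any match length
            for (int j = 1; j < (int)t.size(); j++) {
                maxx = min(maxx, kmp(temp, t[j]));
            }
            if (maxx > ans) {
                strans = temp.substr(0, maxx);
                ans = maxx;
            } else if (maxx == ans) {
                // Same length: keep the alphabetically smaller candidate.
                string anstemp = temp.substr(0, maxx);
                if (anstemp < strans) strans = anstemp;
            }
        }
        if (strans.size() < 3) cout << "no significant commonalities" << '\n';
        else cout << strans << '\n';
        t.clear();
    }
}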
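
To see the core step in isolation, here is a minimal sketch of "longest prefix of a pattern found anywhere in a text". It is not the code above: it uses the textbook prefix function rather than the optimized Next table, and the names prefixFunction and longestPrefixIn are made up for this illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Textbook prefix function: pi[i] is the length of the longest proper
// prefix of p[0..i] that is also a suffix of p[0..i].
std::vector<int> prefixFunction(const std::string& p) {
    std::vector<int> pi(p.size(), 0);
    for (std::size_t i = 1; i < p.size(); ++i) {
        int k = pi[i - 1];
        while (k > 0 && p[i] != p[k]) k = pi[k - 1];
        if (p[i] == p[k]) ++k;
        pi[i] = k;
    }
    return pi;
}

// Length of the longest prefix of `pattern` that occurs somewhere in `text`.
// (Hypothetical helper name, used only for this illustration.)
int longestPrefixIn(const std::string& pattern, const std::string& text) {
    if (pattern.empty()) return 0;
    const std::vector<int> pi = prefixFunction(pattern);
    int k = 0;     // number of pattern characters currently matched
    int best = 0;  // largest k seen so far
    for (char c : text) {
        // After a full match, fall back before trying to extend again.
        if (k == static_cast<int>(pattern.size())) k = pi[k - 1];
        while (k > 0 && c != pattern[k]) k = pi[k - 1];
        if (c == pattern[k]) ++k;
        best = std::max(best, k);
    }
    return best;
}
```

For example, longestPrefixIn("AGATACC", "GATACTAGATACT") is 6, because the prefix AGATAC occurs in the text while the full pattern AGATACC does not. Taking the minimum of this value over all the other sequences gives the longest common substring that starts at the chosen suffix.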
