Human Gene Functions
Time Limit: 1000MS Memory Limit: 10000K
Total Submissions: 18053 Accepted: 10046
Description
It is well known that a human gene can be considered as a sequence, consisting of four nucleotides, which are simply denoted by four letters, A, C, G, and T. Biologists have been interested in identifying human genes and determining their functions, because these can be used to diagnose human diseases and to design new drugs for them.
A human gene can be identified through a series of time-consuming biological experiments, often with the help of computer programs. Once a sequence of a gene is obtained, the next job is to determine its function.
One of the methods for biologists to use in determining the function of a new gene sequence that they have just identified is to search a database with the new gene as a query. The database to be searched stores many gene sequences and their functions – many researchers have been submitting their genes and functions to the database and the database is freely accessible through the Internet.
A database search will return a list of gene sequences from the database that are similar to the query gene.
Biologists assume that sequence similarity often implies functional similarity. So, the function of the new gene might be one of the functions that the genes from the list have. To determine exactly which one is right, another series of biological experiments will be needed.
Your job is to make a program that compares two genes and determines their similarity as explained below. Your program may be used as a part of the database search if you can provide an efficient one.
Given two genes AGTGATG and GTTAG, how similar are they? One of the methods to measure the similarity of two genes is called alignment. In an alignment, spaces are inserted, if necessary, in appropriate positions of the genes to make them equally long, and the resulting genes are scored according to a scoring matrix.
For example, one space is inserted into AGTGATG to result in AGTGAT-G, and three spaces are inserted into GTTAG to result in -GT--TAG. A space is denoted by a minus sign (-). The two genes are now of equal length. These two strings are aligned:
AGTGAT-G
-GT--TAG
In this alignment, there are four matches, namely, G in the second position, T in the third, T in the sixth, and G in the eighth. Each pair of aligned characters is assigned a score according to the following scoring matrix.
      A    C    G    T    -
A     5   -1   -2   -1   -3
C    -1    5   -3   -2   -4
G    -2   -3    5   -2   -2
T    -1   -2   -2    5   -1
-    -3   -4   -2   -1    *
* denotes that a space-space match is not allowed. The score of the alignment above is (-3)+5+5+(-2)+(-3)+5+(-3)+5=9.
Of course, many other alignments are possible. One is shown below (a different number of spaces are inserted into different positions):
AGTGATG
-GTTA-G
This alignment gives a score of (-3)+5+5+(-2)+5+(-1)+5=14. So, this one is better than the previous one. As a matter of fact, this one is optimal since no other alignment can have a higher score. So, it is said that the similarity of the two genes is 14.
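The two worked scores above can be checked mechanically. The following is a minimal sketch, assuming the scoring matrix is indexed in the order A, C, G, T, space; the names kScore, symbolIndex and alignmentScore are illustrative and not part of the problem statement.

```cpp
#include <cassert>
#include <string>

// Scoring matrix; rows and columns are ordered A, C, G, T, '-'.
// The space-space cell is never used, so 0 is only a placeholder.
static const int kScore[5][5] = {
    { 5, -1, -2, -1, -3},
    {-1,  5, -3, -2, -4},
    {-2, -3,  5, -2, -2},
    {-1, -2, -2,  5, -1},
    {-3, -4, -2, -1,  0},
};

// Hypothetical helper: maps a symbol to its row/column index in kScore.
int symbolIndex(char ch) {
    switch (ch) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        case '-': return 4;
    }
    assert(false && "unexpected symbol");
    return -1;
}

// Hypothetical helper: sums the per-column scores of two strings that
// have already been padded with '-' to the same length.
int alignmentScore(const std::string& a, const std::string& b) {
    assert(a.size() == b.size());
    int total = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        assert(!(a[i] == '-' && b[i] == '-'));  // space-space is not allowed
        total += kScore[symbolIndex(a[i])][symbolIndex(b[i])];
    }
    return total;
}
```

Calling alignmentScore("AGTGAT-G", "-GT--TAG") reproduces the first total, and alignmentScore("AGTGATG", "-GTTA-G") the second. This only scores a given alignment; finding the best one is the actual task.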
Input
The input consists of T test cases. The number of test cases (T) is given in the first line of the input file. Each test case consists of two lines: each line contains an integer, the length of a gene, followed by a gene sequence. The length of each gene sequence is at least one and does not exceed 100.
Output
The output should print the similarity of each test case, one per line.
Sample Input
2
7 AGTGATG
5 GTTAG
7 AGCTATT
9 AGCTTTAAA
Sample Output
14
21
Source
Taejon 2001
The task is to compute the DNA match score; the approach is similar to longest common subsequence (LCS).
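Written out, the LCS-like recurrence looks roughly as follows. This is a sketch in assumed notation: D_{i,j} is the best score for aligning the first i characters of s with the first j characters of c, and w(x, y) is the scoring-matrix entry for the pair (x, y).

```latex
\begin{aligned}
D_{0,0} &= 0,\\
D_{i,0} &= D_{i-1,0} + w(s_i, \texttt{-}), \qquad 1 \le i \le |s|,\\
D_{0,j} &= D_{0,j-1} + w(\texttt{-}, c_j), \qquad 1 \le j \le |c|,\\
D_{i,j} &= \max\bigl\{\, D_{i-1,j-1} + w(s_i, c_j),\;
                         D_{i-1,j} + w(s_i, \texttt{-}),\;
                         D_{i,j-1} + w(\texttt{-}, c_j) \,\bigr\}.
\end{aligned}
```

The answer is D_{|s|,|c|}. With both lengths at most 100, the table has at most about 10^4 cells per test case.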
#include <cstdio>
#include <map>
#include <algorithm>
using namespace std;

// Scoring matrix; rows and columns are ordered A, C, G, T, '-'.
// The space-space entry is never used, so 0 is only a placeholder.
int value[][5] = {{ 5, -1, -2, -1, -3},
                  {-1,  5, -3, -2, -4},
                  {-2, -3,  5, -2, -2},
                  {-1, -2, -2,  5, -1},
                  {-3, -4, -2, -1,  0}};
map<char, int> Dir;  // symbol -> index into value
int lens, lenc;
int Dp[110][110];    // Dp[i][j]: best score for s[1..i] against c[1..j]

int main()
{
    int n;
    char s[110];
    char c[110];
    Dir['A'] = 0;
    Dir['C'] = 1;
    Dir['G'] = 2;
    Dir['T'] = 3;
    Dir['-'] = 4;
    while (~scanf("%d", &n))
    {
        while (n--)
        {
            // Read both genes 1-indexed so that Dp row/column 0 is the empty prefix.
            scanf("%d %s", &lens, s + 1);
            scanf("%d %s", &lenc, c + 1);
            Dp[0][0] = 0;
            for (int i = 1; i <= lens; i++)
            {
                // Nothing is matched: s[1..i] is aligned entirely against spaces.
                Dp[i][0] = Dp[i-1][0] + value[Dir[s[i]]][Dir['-']];
            }
            for (int i = 1; i <= lenc; i++)
            {
                Dp[0][i] = Dp[0][i-1] + value[Dir[c[i]]][Dir['-']];
            }
            for (int i = 1; i <= lens; i++)
            {
                for (int j = 1; j <= lenc; j++)
                {
                    // s[i] is aligned with c[j]: extend Dp[i-1][j-1].
                    Dp[i][j] = Dp[i-1][j-1] + value[Dir[s[i]]][Dir[c[j]]];
                    // c[j] is already used by earlier characters (in whatever way), so s[i] can only be aligned with '-'.
                    Dp[i][j] = max(Dp[i][j], Dp[i-1][j] + value[Dir[s[i]]][Dir['-']]);
                    // s[i] is already used by earlier characters, so c[j] is aligned with '-'.
                    Dp[i][j] = max(Dp[i][j], Dp[i][j-1] + value[Dir['-']][Dir[c[j]]]);
                }
            }
            printf("%d\n", Dp[lens][lenc]);
        }
    }
    return 0;
}
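For reuse outside the judge's input loop, the same recurrence can be wrapped in a function. The version below is a sketch rather than the solution above: it swaps the std::map lookup for a switch and keeps only two rows of the table, since each row depends only on the previous one. The names kScore, symbolIndex and similarity are illustrative assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Scoring matrix; rows and columns are ordered A, C, G, T, '-'.
// The space-space cell is never used, so 0 is only a placeholder.
static const int kScore[5][5] = {
    { 5, -1, -2, -1, -3},
    {-1,  5, -3, -2, -4},
    {-2, -3,  5, -2, -2},
    {-1, -2, -2,  5, -1},
    {-3, -4, -2, -1,  0},
};

// Hypothetical helper: maps a symbol to its row/column index in kScore.
int symbolIndex(char ch) {
    switch (ch) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        case '-': return 4;
    }
    assert(false && "unexpected symbol");
    return -1;
}

// Hypothetical function: best alignment score of s against t,
// using two rolling rows instead of the full table.
int similarity(const std::string& s, const std::string& t) {
    const int gap = symbolIndex('-');
    std::vector<int> prev(t.size() + 1, 0), cur(t.size() + 1, 0);
    // Row 0: the empty prefix of s against t[0..j), all spaces on the s side.
    for (std::size_t j = 1; j <= t.size(); ++j)
        prev[j] = prev[j - 1] + kScore[gap][symbolIndex(t[j - 1])];
    for (std::size_t i = 1; i <= s.size(); ++i) {
        const int si = symbolIndex(s[i - 1]);
        cur[0] = prev[0] + kScore[si][gap];  // s[0..i) against the empty prefix
        for (std::size_t j = 1; j <= t.size(); ++j) {
            const int tj = symbolIndex(t[j - 1]);
            cur[j] = std::max({prev[j - 1] + kScore[si][tj],   // pair the two characters
                               prev[j] + kScore[si][gap],      // s character against a space
                               cur[j - 1] + kScore[gap][tj]}); // t character against a space
        }
        std::swap(prev, cur);
    }
    return prev[t.size()];
}
```

On the two sample cases, similarity("AGTGATG", "GTTAG") and similarity("AGCTATT", "AGCTTTAAA") should give the sample outputs 14 and 21. At the problem's limits (length at most 100) the memory saving is irrelevant; the full table in the solution above is just as good.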