poj 1795 DNA Laboratory

DNA Laboratory

Time Limit: 5000MS		Memory Limit: 30000K
Total Submissions: 2892		Accepted: 516

Description

Background
Having started to build his own DNA lab just recently, the evil doctor Frankenstein is not quite up to date yet. He wants to extract his DNA, enhance it somewhat and clone himself. He has already figured out how to extract DNA from some of his blood cells, but unfortunately reading off the DNA sequence means breaking the DNA into a number of short pieces and analyzing those first. Frankenstein has not quite understood how to put the pieces together to recover the original sequence.
His pragmatic approach to the problem is to sneak into university and to kidnap a number of smart looking students. Not surprisingly, you are one of them, so you had better come up with a solution pretty fast.
Problem
You are given a list of strings over the alphabet A (for adenine), C (cytosine), G (guanine), and T (thymine), and your task is to find the shortest string (which is typically not listed) that contains all given strings as substrings.
If there are several such strings of shortest length, find the smallest in alphabetical/lexicographical order.

Input

The first line contains the number of scenarios.
For each scenario, the first line contains the number n of strings with 1 <= n <= 15. Then these strings with 1 <= length <= 100 follow, one on each line, and they consist of the letters "A", "C", "G", and "T" only.

Output

The output for every scenario begins with a line containing "Scenario #i:", where i is the number of the scenario starting at 1. Then print a single line containing the shortest (and smallest) string as described above. Terminate the output for the scenario with a blank line.

Sample Input

1
2
TGCACA
CAT

Sample Output

Scenario #1:
TGCACAT

Source

TUD Programming Contest 2004, Darmstadt, Germany

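Problem summary: given n strings over the letters A, G, C, T, find the shortest string that contains every one of the n given strings as a substring. Among all strings of that shortest length, output the lexicographically smallest.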

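Approach: some of the n strings may be contained in others, and such contained strings can simply be ignored. Next, precompute how much the total length grows when one string is placed in front of another, and store it in dist[i][j] (the length added when string i is prepended to string j). The problem then yields to bitmask DP: dp[state][i] is the length of the shortest string that is built from the strings in state and begins with string i.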
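The two preprocessing steps can be sketched in isolation as follows. This is a minimal sketch, not the code from this post: the names `overlap`, `prependCost`, and `dropContained` are hypothetical, and `prependCost(a, b)` plays the role of dist[a][b]. Unlike a plain containment test, `dropContained` keeps the first copy of identical strings, which is an assumption about how duplicates should be handled.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Length of the longest suffix of a that is also a prefix of b.
int overlap(const std::string& a, const std::string& b) {
    int maxLen = static_cast<int>(std::min(a.size(), b.size()));
    for (int k = maxLen; k > 0; --k) {
        if (a.compare(a.size() - k, k, b, 0, k) == 0) return k;
    }
    return 0;
}

// Characters added when a is placed directly in front of b.
int prependCost(const std::string& a, const std::string& b) {
    return static_cast<int>(a.size()) - overlap(a, b);
}

// Drop every string contained in another one.
// Of identical strings, only the first copy is kept.
std::vector<std::string> dropContained(const std::vector<std::string>& in) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        bool contained = false;
        for (std::size_t j = 0; j < in.size() && !contained; ++j) {
            if (i == j) continue;
            if (in[i] == in[j]) {
                contained = j < i;  // an earlier identical copy wins
                continue;
            }
            contained = in[j].find(in[i]) != std::string::npos;
        }
        if (!contained) out.push_back(in[i]);
    }
    return out;
}
```

For the sample input, the suffix "CA" of TGCACA matches the prefix of CAT, so placing TGCACA in front of CAT adds 4 characters to the 3 already there, giving the answer length 7.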

State transition (string j is not yet in state and is placed in front of the current head i): dp[state|1<<j][j]=min(dp[state|1<<j][j],dp[state][i]+dist[j][i]);
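The transition can be exercised on its own with a length-only version of the DP. The sketch below assumes the input is already free of contained strings and small enough for a 2^n table; `suffixPrefixOverlap` and `shortestLength` are hypothetical names, and the cost is recomputed inline instead of being read from a dist table.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Longest suffix of a that is also a prefix of b (hypothetical helper).
int suffixPrefixOverlap(const std::string& a, const std::string& b) {
    int k = static_cast<int>(std::min(a.size(), b.size()));
    for (; k > 0; --k) {
        if (a.compare(a.size() - k, k, b, 0, k) == 0) break;
    }
    return k;
}

// Minimum superstring length for containment-free strings v.
int shortestLength(const std::vector<std::string>& v) {
    const int n = static_cast<int>(v.size());
    if (n == 0) return 0;
    const int INF = 1 << 29;
    const int full = (1 << n) - 1;
    // dp[state][i]: shortest length using the strings in state, with v[i] first.
    std::vector<std::vector<int> > dp(full + 1, std::vector<int>(n, INF));
    for (int i = 0; i < n; ++i) dp[1 << i][i] = static_cast<int>(v[i].size());
    for (int state = 1; state <= full; ++state) {
        for (int i = 0; i < n; ++i) {
            if (dp[state][i] == INF) continue;
            for (int j = 0; j < n; ++j) {
                if (state >> j & 1) continue;  // v[j] is already used
                // Characters added when v[j] is put in front of the head v[i].
                int cost = static_cast<int>(v[j].size()) - suffixPrefixOverlap(v[j], v[i]);
                int& slot = dp[state | 1 << j][j];
                slot = std::min(slot, dp[state][i] + cost);
            }
        }
    }
    return *std::min_element(dp[full].begin(), dp[full].end());
}
```

Because new strings are always added at the front, the head of the finished string is simply the index i that minimizes dp[full][i].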

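The DP gives the minimum length and also tells us which string the answer must start with. The answer itself is then assembled recursively from head to tail, at each step choosing the lexicographically smallest continuation that is consistent with the DP values.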
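Since the head-to-tail reconstruction is easy to get subtly wrong, a brute-force reference can help when testing on small inputs. The sketch below is not part of the original solution: it tries every ordering, merges neighbours with their maximal overlap, and keeps the shortest and then lexicographically smallest result. It assumes the strings are already free of contained strings and that n is small (roughly 8 or fewer), and `overlapLen` and `bruteForceSuperstring` are hypothetical names.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Longest suffix of a that is also a prefix of b (hypothetical helper).
std::size_t overlapLen(const std::string& a, const std::string& b) {
    std::size_t k = std::min(a.size(), b.size());
    for (; k > 0; --k) {
        if (a.compare(a.size() - k, k, b, 0, k) == 0) break;
    }
    return k;
}

// Reference answer by exhaustive search over all orderings.
// v is taken by value because it is sorted and permuted in place.
std::string bruteForceSuperstring(std::vector<std::string> v) {
    std::sort(v.begin(), v.end());  // start from the first permutation
    std::string best;
    bool haveBest = false;
    do {
        std::string cur = v.empty() ? std::string() : v[0];
        for (std::size_t i = 1; i < v.size(); ++i) {
            // Append only the part of v[i] not covered by v[i - 1].
            cur += v[i].substr(overlapLen(v[i - 1], v[i]));
        }
        bool better = !haveBest || cur.size() < best.size() ||
                      (cur.size() == best.size() && cur < best);
        if (better) {
            best = cur;
            haveBest = true;
        }
    } while (std::next_permutation(v.begin(), v.end()));
    return best;
}
```

Comparing its output with the DP solution on random short strings is one way to gain confidence in the tie-breaking logic.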

Accepted code:

#define _CRT_SECURE_NO_DEPRECATE
#include <iostream>
#include <cstdio>
#include <vector>
#include <string>
#include <algorithm>
#include <cstring>
using namespace std;

#define N_MAX 16
#define INF 0x3f3f3f3f

string s[N_MAX];
int dp[1 << N_MAX][N_MAX]; // dp[i][j]: minimum total length for the set i when string j is at the head
int dist[N_MAX][N_MAX];    // dist[i][j]: length added when string i is put in front of string j
vector<string> vec;
int t, n;

void init() {
    memset(dp, INF, sizeof(dp)); // every byte becomes 0x3f, so each int equals INF
    memset(dist, 0, sizeof(dist));
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (i == j) continue;
            int sz = min(vec[i].size(), vec[j].size());
            // Find the longest suffix of vec[i] that is a prefix of vec[j].
            for (int k = sz; k >= 0; k--) {
                if (vec[i].substr(vec[i].size() - k) == vec[j].substr(0, k)) { // the overlapping part is not counted
                    dist[i][j] = vec[i].size() - k;
                    break;
                }
            }
        }
    }
}

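string res = "";

void dfs(int head, int state) { // state: the set of strings that have not been used yet
    if (state == 0) return;
    string min_s = "Z"; // larger than any string over A, C, G, T
    int min_head = -1;
    for (int i = 0; i < n; i++) {
        // String i may follow head only if that choice keeps the total length optimal.
        if ((state >> i & 1) && dp[state | 1 << head][head] == dp[state][i] + dist[head][i]) {
            int Len = vec[head].size() - dist[head][i]; // overlap between head and string i
            string tail = vec[i].substr(Len);           // part of string i that gets appended
            if (min_s > tail) { min_s = tail; min_head = i; }
        }
    }
    res += min_s;
    dfs(min_head, state ^ (1 << min_head));
}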

int main() {
    int t; scanf("%d", &t);
    for (int cs = 1; cs <= t; cs++) {
        scanf("%d", &n);
        printf("Scenario #%d:\n", cs);
        for (int i = 0; i < n; i++) cin >> s[i];
        vec.clear();
        for (int i = 0; i < n; i++) { // drop strings that are contained in another string
            bool flag = 1;
            for (int j = 0; j < n; j++) {
                if (i == j || s[i].size() > s[j].size()) continue;
                if (s[i] == s[j] && i < j) continue; // of identical strings, keep the first copy
                if (s[j].find(s[i]) != string::npos) { // s[i] is contained in s[j]
                    flag = 0; break;
                }
            }
            if (flag) vec.push_back(s[i]);
        }
        if (vec.size() == 0) { cout << s[0] << endl << endl; continue; }
        sort(vec.begin(), vec.end()); // sorted order makes ties resolve toward smaller strings
        n = vec.size();
        init();
        int allstates = 1 << n;
        for (int i = 0; i < n; i++) {
            dp[1 << i][i] = vec[i].size();
        }
        for (int state = 0; state < allstates; state++) {
            for (int i = 0; i < n; i++) {
                if (dp[state][i] == INF) continue;
                for (int j = 0; j < n; j++) {
                    if (!(state >> j & 1)) {
                        dp[state | 1 << j][j] = min(dp[state | 1 << j][j], dp[state][i] + dist[j][i]);
                    }
                }
            }
        }
        int head = 0;
        for (int i = 1; i < n; i++)
            if (dp[allstates - 1][i] < dp[allstates - 1][head]) {
                head = i;
            }
        res = vec[head];
        dfs(head, (allstates - 1) ^ (1 << head)); // the remaining set must exclude head
        cout << res << endl << endl;
    }
    return 0;
}
