poj2778 DNA Sequence (AC automaton + fast matrix exponentiation)

Description

It is well known that a DNA sequence consists only of A, C, T and G, and that analyzing segments of a DNA sequence is very useful. For example, if an animal's DNA sequence contains the segment ATC, it may mean that the animal has a genetic disease.
So far scientists have found several such segments; the problem is to count how many DNA sequences of a species do not contain any of those segments.

Suppose that the DNA sequences of a species consist of A, C, T and G, and that the length of the sequences is a given integer n.

Input

The first line contains two integers m (0 <= m <= 10) and n (1 <= n <= 2000000000). Here, m is the number of genetic disease segments, and n is the length of the sequences.

The next m lines each contain one genetic disease segment; the length of each segment is at most 10.

Output

An integer, the number of DNA sequences, mod 100000.

Sample Input

4 3
AT
AC
AG
AA

Sample Output

36

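Problem: you are given m strings, each of length at most 10 and made up only of the four characters 'A', 'T', 'C' and 'G'. Count the strings of length n over these four characters that do not contain any of the m given strings as a substring.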
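For small n the statement can be checked directly by enumeration. The following is only a testing sketch, not part of the solution: the helper name bruteForceCount and the cap n <= 12 are assumptions made here for illustration, and the real limits (n up to 2000000000) are far out of its reach.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical checker: counts the length-n strings over {A,C,T,G} that
// contain none of the given patterns, by enumerating all 4^n candidates.
long long bruteForceCount(const std::vector<std::string>& patterns, int n) {
    assert(n >= 0 && n <= 12);              // 4^12 candidates is still cheap
    const char alphabet[4] = {'A', 'C', 'T', 'G'};
    long long total = 1;
    for (int i = 0; i < n; i++) total *= 4;

    long long good = 0;
    std::string cur(n, 'A');
    for (long long code = 0; code < total; code++) {
        // Decode `code` as an n-digit base-4 number into a candidate string.
        long long x = code;
        for (int i = 0; i < n; i++) {
            cur[i] = alphabet[x % 4];
            x /= 4;
        }
        bool ok = true;
        for (size_t p = 0; p < patterns.size() && ok; p++) {
            if (cur.find(patterns[p]) != std::string::npos) ok = false;
        }
        if (ok) good++;
    }
    return good;
}
```

On the sample (segments AT, AC, AG, AA and n = 3) an A may only appear in the last position, which gives 3 * 3 * 4 = 36 strings, and the checker returns the same value.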

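Approach: first build the trie graph of the m strings (the AC automaton with its missing transitions filled in). Mark every node at which some string ends with val = 1 and every other node with val = 0. Each node then has exactly 4 outgoing edges. If we treat following an edge as taking one step, the problem becomes: walk n steps in this graph, starting from the root, without ever entering a dangerous node (a node at which some string ends, i.e. a node with val = 1). This suggests the adjacency matrix A, where a[i][j] is the number of edges from node i to node j, with edges into dangerous nodes left out. Entry (i, j) of A^n is then the number of n-step walks from i to j, and the answer is the sum of row 0 (the root) of A^n, taken mod 100000.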

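One point to watch when writing the code: if the T of AT is a dangerous node, then the T of CCATC in the trie is a dangerous node as well, because the prefix CCAT ends with AT, so it must also be marked val = 1. This is done during the BFS by adding the line: " if(val[fail[x]]) val[x]=1;"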
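The effect of that line can be seen in isolation with the two strings from the note, AT and CCATC. The struct below is a compact sketch written only for this illustration (the names Automaton, danger, insert and walk are assumptions, not the author's); it uses the same BFS as the solution but with std::vector and std::queue.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical compact automaton over {A,C,T,G}, for illustration only.
struct Automaton {
    std::vector<std::vector<int> > next;  // next[u][c], -1 while the edge is missing
    std::vector<int> fail;                // failure link of each node
    std::vector<int> danger;              // 1 if entering the node completes a pattern

    Automaton() { newNode(); }            // node 0 is the root

    int newNode() {
        next.push_back(std::vector<int>(4, -1));
        fail.push_back(0);
        danger.push_back(0);
        return (int)next.size() - 1;
    }

    static int idx(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'T': return 2;
            default:  return 3;           // 'G'
        }
    }

    void insert(const std::string& s) {
        int u = 0;
        for (size_t i = 0; i < s.size(); i++) {
            int c = idx(s[i]);
            if (next[u][c] == -1) {
                int v = newNode();        // allocate first: push_back may reallocate
                next[u][c] = v;
            }
            u = next[u][c];
        }
        danger[u] = 1;
    }

    // BFS that sets failure links, fills missing edges, and propagates danger.
    void build() {
        std::queue<int> q;
        for (int c = 0; c < 4; c++) {
            if (next[0][c] == -1) next[0][c] = 0;
            else { fail[next[0][c]] = 0; q.push(next[0][c]); }
        }
        while (!q.empty()) {
            int x = q.front();
            q.pop();
            if (danger[fail[x]]) danger[x] = 1;   // a suffix of x is a pattern
            for (int c = 0; c < 4; c++) {
                int f = next[fail[x]][c];
                if (next[x][c] == -1) next[x][c] = f;
                else { fail[next[x][c]] = f; q.push(next[x][c]); }
            }
        }
    }

    // Follows edges from the root and returns the node reached.
    int walk(const std::string& s) const {
        int u = 0;
        for (size_t i = 0; i < s.size(); i++) {
            u = next[u][idx(s[i])];
            assert(u >= 0);               // fails only on a missing edge before build()
        }
        return u;
    }
};
```

After inserting AT and CCATC, the node reached by CCAT is not flagged, since no string ends there. During build() its failure link points at the node for AT, so the propagation line flags it. Without that line, a walk through C, C, A, T would be counted as safe even though the string contains AT.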

#include<stdio.h>
#include<string.h>
typedef long long ll;
#define maxnode 105     // at most 10 patterns * 10 characters + the root = 101 nodes
#define MOD 100000
char s[100];            // input buffer for one pattern

struct trie{
    // val[u]==1 marks a dangerous node: a pattern ends at u,
    // or a pattern is a suffix of the prefix that u spells
    int sz,root,val[maxnode],next[maxnode][4],fail[maxnode];
    int q[maxnode];                     // BFS queue used by build(), indexed from 1
    void init(){
        int i;
        sz=root=0;
        val[0]=0;
        for(i=0;i<4;i++){
            next[root][i]=-1;
        }
    }
    int idx(char c){
        if(c=='A')return 0;
        if(c=='C')return 1;
        if(c=='T')return 2;
        return 3;                       // 'G'; the input contains only A, C, T and G
    }
    void charu(char *s){                // insert one pattern into the trie
        int i,j,u=0;
        int len=strlen(s);
        for(i=0;i<len;i++){
            int c=idx(s[i]);
            if(next[u][c]==-1){
                sz++;
                val[sz]=0;
                for(j=0;j<4;j++){
                    next[sz][j]=-1;
                }
                next[u][c]=sz;
            }
            u=next[u][c];
        }
        val[u]=1;
    }
    void build(){                       // BFS: set fail links and fill in missing edges
        int i;
        int front,rear;
        front=1;rear=0;
        for(i=0;i<4;i++){
            if(next[root][i]==-1){
                next[root][i]=root;
            }
            else{
                fail[next[root][i]]=root;
                rear++;
                q[rear]=next[root][i];
            }
        }
        while(front<=rear){
            int x=q[front];
            // Very important: if the node that fail[x] points to is dangerous,
            // then a suffix of the prefix ending at x is one of the patterns,
            // so x is a dangerous node as well.
            if(val[fail[x]])
                val[x]=1;
            front++;
            for(i=0;i<4;i++){
                if(next[x][i]==-1){
                    next[x][i]=next[fail[x]][i];
                }
                else{
                    fail[next[x][i]]=next[fail[x]][i];
                    rear++;
                    q[rear]=next[x][i];
                }
            }
        }
    }
}ac;

struct matrix{
    ll n,m;                     // number of rows and columns in use
    ll data[105][105];
    void init_danwei(){         // put 1 on the diagonal (identity if data was zeroed first)
        for(ll i=0;i<n;i++){
            data[i][i]=1;
        }
    }
};

// returns a*b with every entry reduced mod MOD
matrix multi(matrix &a,matrix &b){
    ll i,j,k;
    matrix temp;
    temp.n=a.n;
    temp.m=b.m;
    for(i=0;i<temp.n;i++){
        for(j=0;j<temp.m;j++){
            temp.data[i][j]=0;
        }
    }
    for(i=0;i<a.n;i++){
        for(k=0;k<a.m;k++){
            if(a.data[i][k]>0){         // skip zero entries
                for(j=0;j<b.m;j++){
                    temp.data[i][j]=(temp.data[i][j]+(a.data[i][k]*b.data[k][j])%MOD )%MOD;
                }
            }
        }
    }
    return temp;
}

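// returns a^n mod MOD by repeated squaring; note that a is overwritten in the process
matrix fast_mod(matrix &a,ll n){
    matrix ans;
    ans.n=a.n;
    ans.m=a.m;
    memset(ans.data,0,sizeof(ans.data));
    ans.init_danwei();                  // ans starts as the identity matrix
    while(n>0){
        if(n&1)ans=multi(ans,a);
        a=multi(a,a);
        n>>=1;
    }
    return ans;
}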

int main()
{
    ll n,m,i,j;
    while(scanf("%lld%lld",&m,&n)!=EOF)
    {
        ac.init();
        for(i=1;i<=m;i++){
            scanf("%s",s);
            ac.charu(s);
        }
        ac.build();
        matrix a;
        a.n=a.m=ac.sz+1;
        memset(a.data,0,sizeof(a.data));
        // a.data[i][j] = number of edges from node i to node j, keeping only edges into safe nodes
        for(i=0;i<=ac.sz;i++){
            for(j=0;j<4;j++){
                if(ac.val[ac.next[i][j] ]==0 ){
                    a.data[i][ac.next[i][j] ]++;
                }
            }
        }
        matrix cnt;
        cnt=fast_mod(a,n);
        ll sum=0;
        // sum row 0: walks of n steps that start at the root; valid columns are 0..cnt.n-1
        for(i=0;i<cnt.n;i++){
            sum=(sum+cnt.data[0][i])%MOD;
        }
        printf("%lld\n",sum);
    }
    return 0;
}
