bzoj 3473: handling substrings of multiple strings with a suffix automaton

This post shows how to use a suffix automaton for substring problems that involve several strings.

First, just as with a suffix array, join the strings with a separator character and build a single suffix automaton over the result.
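The joining step can be sketched as follows. This is a minimal illustration, assuming lowercase input mapped to symbols 1..26 with symbol 0 as the separator (the same encoding the full solution further down uses); `joinWithSeparator` is a hypothetical helper name, not part of the original code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: encodes 'a'..'z' as 1..26 and appends the
// separator symbol 0 after every input string.
std::vector<int> joinWithSeparator(const std::vector<std::string>& strs) {
    std::vector<int> out;
    for (const std::string& s : strs) {
        for (char ch : s) {
            assert(ch >= 'a' && ch <= 'z');  // assumption: lowercase letters only
            out.push_back(ch - 'a' + 1);
        }
        out.push_back(0);  // separator
    }
    return out;
}
```

Feeding the resulting symbols one by one to the automaton's append routine gives the same automaton as appending each string followed by a separator.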

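We define the strings represented by a node as the strings it originally represents, minus those that contain the separator. The number of strings each node represents is then computed with a DP that counts the paths from the root using only letter transitions (it can no longer be derived from the right set).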
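A small self-contained sketch of that DP follows. It is not the author's code: the `Sam` struct, `buildSam` and `countCleanStrings` are illustrative names, missing edges are stored as -1, and states are ordered by `len` instead of by an explicit topological sort (every transition goes to a state with a strictly larger `len`, so this is a valid processing order).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

struct Sam {
    struct Node {
        std::array<int, 27> next;  // symbol 0 = separator, 1..26 = 'a'..'z'
        int link;
        int len;
    };
    std::vector<Node> t;
    int last = 0;

    Sam() { t.push_back(fresh(0)); }

    static Node fresh(int len) {
        Node n;
        n.next.fill(-1);
        n.link = -1;
        n.len = len;
        return n;
    }

    void extend(int c) {
        assert(0 <= c && c < 27);
        int cur = static_cast<int>(t.size());
        t.push_back(fresh(t[last].len + 1));
        int p = last;
        while (p != -1 && t[p].next[c] == -1) {
            t[p].next[c] = cur;
            p = t[p].link;
        }
        if (p == -1) {
            t[cur].link = 0;
        } else {
            int q = t[p].next[c];
            if (t[q].len == t[p].len + 1) {
                t[cur].link = q;
            } else {
                int clone = static_cast<int>(t.size());
                Node copy = t[q];  // copy first: push_back may reallocate
                copy.len = t[p].len + 1;
                t.push_back(copy);
                while (p != -1 && t[p].next[c] == q) {
                    t[p].next[c] = clone;
                    p = t[p].link;
                }
                t[q].link = t[cur].link = clone;
            }
        }
        last = cur;
    }
};

Sam buildSam(const std::vector<std::string>& strs) {
    Sam sam;
    for (const std::string& s : strs) {
        for (char ch : s) sam.extend(ch - 'a' + 1);
        sam.extend(0);  // separator after every string
    }
    return sam;
}

// ways[v] = number of separator-free strings represented by state v.
std::vector<long long> countCleanStrings(const Sam& sam) {
    int m = static_cast<int>(sam.t.size());
    std::vector<int> order(m);
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return sam.t[a].len < sam.t[b].len; });
    std::vector<long long> ways(m, 0);
    ways[0] = 1;  // the empty path
    for (int u : order)
        for (int c = 1; c <= 26; ++c) {  // letter edges only, never the separator
            int v = sam.t[u].next[c];
            if (v != -1) ways[v] += ways[u];
        }
    ways[0] = 0;  // the empty string is not counted
    return ways;
}
```

Summing `ways` over all states gives the number of distinct separator-free substrings; for the inputs "ab" and "b" those are "a", "b" and "ab".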

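For any one of the original n strings, all of its prefixes can be obtained by running the string through the automaton from the root. For a given prefix, all of its suffixes lie in the node of that prefix and in that node's ancestors in the parent tree. This gives us a way to visit every substring of a string.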
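The two walks can be sketched on plain tables. This is an illustration under assumptions: the transition table stores -1 for a missing edge, state 0 is the root, and `prefixStates` and `suffixStates` are hypothetical helper names.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

using Transitions = std::vector<std::array<int, 27>>;  // -1 means "no edge"

// For each prefix of s, the state reached from the root (state 0).
std::vector<int> prefixStates(const Transitions& trans, const std::string& s) {
    std::vector<int> states;
    int u = 0;
    for (char ch : s) {
        u = trans[u][ch - 'a' + 1];
        assert(u != -1);  // s must be one of the strings the automaton was built from
        states.push_back(u);
    }
    return states;
}

// The state itself followed by its parent-tree ancestors, root excluded.
// Together these states hold every suffix of the corresponding prefix.
std::vector<int> suffixStates(const std::vector<int>& link, int u) {
    std::vector<int> chain;
    for (; u > 0; u = link[u]) chain.push_back(u);
    return chain;
}
```

For the automaton of "aab" (states 1 = {"a"}, 2 = {"aa"}, 3 = {"aab", "ab", "b"}), the prefix "aa" ends in state 2, and climbing the parent tree from there visits state 2 and then state 1, which together hold its suffixes "aa" and "a".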

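For this problem, first build the suffix automaton. Then, for each of the n strings, find all nodes that contain one of its substrings (every substring is guaranteed to appear in exactly one node) and add 1 to the counter of each such node, once per string. When this is done, every node knows how many of the n strings contain the strings it represents as substrings.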
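One simple way to do the per-string counting is sketched below. Note that this is a swapped-in technique, not the one used in the full solution: the solution sorts the prefix nodes by DFS order and applies +1/-1 differences at nodes and LCAs, whereas this sketch climbs the parent tree and stops at the first node already stamped by the current string. It is easier to read but may be slower in the worst case. `countOwners` is a hypothetical name.

```cpp
#include <cassert>
#include <vector>

// parent[u] is the parent-tree link of state u; the root is state 0.
// prefixStates[i] lists the states reached by the prefixes of string i.
// Returns, per state, how many strings have a substring in that state.
std::vector<int> countOwners(const std::vector<int>& parent,
                             const std::vector<std::vector<int>>& prefixStates) {
    const int m = static_cast<int>(parent.size());
    std::vector<int> owners(m, 0);
    std::vector<int> stamp(m, -1);  // last string that credited this state
    for (int i = 0; i < static_cast<int>(prefixStates.size()); ++i) {
        for (int u : prefixStates[i]) {
            assert(0 <= u && u < m);
            // Climb until the root or a state already credited to string i;
            // in the latter case all ancestors were credited as well.
            while (u > 0 && stamp[u] != i) {
                stamp[u] = i;
                ++owners[u];
                u = parent[u];
            }
        }
    }
    return owners;
}
```

On a tree with parents `{-1, 0, 1, 0}`, a first string reaching state 2 and a second string reaching states 3 and 1 give the counters `{0, 2, 1, 1}`: state 1 is touched by both strings, states 2 and 3 by one each.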

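Finally, for each string, go through all of its substrings (occurrences at different positions count separately) and accumulate the answer; in the code below a substring contributes when its node's counter is at least k.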

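If occurrences at different positions are not distinguished, the nodes we collect must not be counted twice. If they are distinguished, then for every prefix all nodes on its path to the root of the parent tree must be counted, even if they were counted before: the end positions differ, so they are separate occurrences.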
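The position-sensitive sum can be sketched with a top-down accumulation over the parent tree. This is an illustration under assumptions: `size[u]` is the number of separator-free strings in state u, `owners[u]` is the per-state counter from the previous step, and the function names are hypothetical.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// sum[u] = total size of all "good" states (owners >= k) on the path
// from u up to, but excluding, the root (state 0).
std::vector<long long> accumulateFromRoot(const std::vector<int>& parent,
                                          const std::vector<long long>& size,
                                          const std::vector<int>& owners, int k) {
    const int m = static_cast<int>(parent.size());
    assert(static_cast<int>(size.size()) == m);
    assert(static_cast<int>(owners.size()) == m);
    std::vector<std::vector<int>> children(m);
    for (int u = 1; u < m; ++u) children[parent[u]].push_back(u);
    std::vector<long long> sum(m, 0);
    std::queue<int> q;
    q.push(0);
    while (!q.empty()) {  // BFS guarantees a parent is finished before its children
        int u = q.front();
        q.pop();
        for (int v : children[u]) {
            sum[v] = sum[u] + (owners[v] >= k ? size[v] : 0);
            q.push(v);
        }
    }
    return sum;
}

// Answer for one string: every prefix contributes its whole path, and a
// state reached by several prefixes is deliberately counted each time.
long long answerFor(const std::vector<long long>& sum,
                    const std::vector<int>& prefixStates) {
    long long total = 0;
    for (int u : prefixStates) total += sum[u];
    return total;
}
```

Because each prefix corresponds to a distinct end position, no deduplication is applied in `answerFor`; a position-insensitive variant would have to visit each state at most once instead.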

 /**************************************************************
     Problem: 3473
     User: idy002
     Language: C++
     Result: Accepted
     Time:728 ms
     Memory:120928 kb
 ****************************************************************/
 #include <cstdio>
 #include <cstring>
 #include <cassert>
 #include <algorithm>
 #define N 200010
 #define S 500010
 #define P 18
 using namespace std;

 typedef long long dnt;

 int n, k;
 char buf[N], *shead[N];
 //  automaton: symbol 0 is the separator, 1..26 are 'a'..'z'; node 0 is the root
 int son[S][27], pnt[S], val[S], ntot, last;
 //  parent tree as adjacency lists (nxt/lg instead of next/log to avoid clashes with std names)
 int head[S], dest[S], nxt[S], etot;
 int dfn[S], dep[S], anc[S][P+1], idgr[S], stk[S], qu[S], top, bg, ed, idc;
 dnt dp[S], eff[S];
 int lg[S];

 void init() {
     ntot = last = 0;
     pnt[0] = -1;
 }
 //  standard suffix automaton extension by one symbol
 void append( int c ) {
     int p = last;
     int np = ++ntot;
     val[np] = val[p]+1;
     while( p!=-1 && !son[p][c] )
         son[p][c]=np, p=pnt[p];
     if( p==-1 ) {
         pnt[np] = 0;
     } else {
         int q=son[p][c];
         if( val[q]==val[p]+1 ) {
             pnt[np] = q;
         } else {
             int nq = ++ntot;
             memcpy( son[nq], son[q], sizeof(son[nq]) );
             val[nq] = val[p]+1;
             pnt[nq] = pnt[q];
             pnt[q] = pnt[np] = nq;
             while( p!=-1 && son[p][c]==q )
                 son[p][c]=nq, p=pnt[p];
         }
     }
     last = np;
 }
 //  topological order of the whole transition DAG (separator edges included,
 //  so that every node is eventually dequeued)
 void make_topo() {
     for( int u=0; u<=ntot; u++ ) {
         for( int c=0; c<=26; c++ ) {
             int v=son[u][c];
             if( !v ) continue;
             idgr[v]++;
         }
     }
     qu[bg=ed=1] = 0;
     while( bg<=ed ) {
         int u=qu[bg++];
         for( int c=0; c<=26; c++ ) {
             int v=son[u][c];
             if( !v ) continue;
             idgr[v]--;
             if( idgr[v]==0 )
                 qu[++ed] = v;
         }
     }
 }
 //  dp[u] = number of separator-free strings in node u,
 //  i.e. paths from the root that use letter edges only
 void dodp() {
     make_topo();
     dp[0] = 1;
     for( int i=1; i<=ed; i++ ) {
         int u=qu[i];
         for( int c=1; c<=26; c++ ) {
             int v=son[u][c];
             if( !v ) continue;
             dp[v] += dp[u];
         }
     }
 }
 void adde( int u, int v ) {
     etot++;
     dest[etot] = v;
     nxt[etot] = head[u];
     head[u] = etot;
 }
 //  build the parent tree
 void build() {
     for( int u=1; u<=ntot; u++ )
         adde( pnt[u], u );
 }
 //  dfs order, depth and binary-lifting ancestors on the parent tree
 void dfs( int u ) {
     dfn[u] = ++idc;
     for( int p=1; p<=P && anc[u][p-1]; p++ )
         anc[u][p] = anc[anc[u][p-1]][p-1];
     for( int t=head[u]; t; t=nxt[t] ) {
         int v=dest[t];
         dep[v] = dep[u]+1;
         anc[v][0] = u;
         dfs(v);
     }
 }
 int lca( int u, int v ) {
     if( dep[u]<dep[v] ) swap(u,v);
     int t=dep[u]-dep[v];
     for( int p=0; t; p++,t>>=1 )
         if( t&1 ) u=anc[u][p];
     if( u==v ) return u;
     for( int p=lg[dep[u]]; anc[u][0]!=anc[v][0]; p-- )
         if( anc[u][p]!=anc[v][p] ) u=anc[u][p],v=anc[v][p];
     return anc[u][0];
 }
 //  stk[1..top] = nodes reached by the prefixes of s
 void fetch( char *s ) {
     top = 0;
     int u = 0;
     for( int i=0; s[i]; i++ ) {
         int c=s[i]-'a'+1;
         u = son[u][c];
         assert(u!=0);
         stk[++top] = u;
     }
 }
 bool cmp( int u, int v ) {
     return dfn[u]<dfn[v];
 }
 //  add 1 to every node on the union of root paths of s's prefix nodes:
 //  +1 at each node, -1 at the lca of dfn-adjacent nodes, summed up later in bfs()
 void effort( char *s ) {
     fetch(s);
     sort( stk+1, stk+1+top, cmp );
     eff[stk[1]] += 1;
     for( int i=2; i<=top; i++ ) {
         int u=stk[i];
         int ca=lca(u,stk[i-1]);
         eff[u] += 1;
         eff[ca] -= 1;
     }
 }
 void bfs() {
     qu[bg=ed=1] = 0;
     while( bg<=ed ) {
         int u=qu[bg++];
         for( int t=head[u]; t; t=nxt[t] ) {
             int v=dest[t];
             qu[++ed] = v;
         }
     }
     //  subtree sums: eff[u] = number of strings having a substring in node u
     for( int i=ed; i>=1; i-- ) {
         int u=qu[i];
         for( int t=head[u]; t; t=nxt[t] ) {
             int v=dest[t];
             eff[u] += eff[v];
         }
     }
     //  dp[u] = total count of valid strings on the path from the root to u
     dp[0] = 0;
     for( int i=2; i<=ed; i++ ) {
         int u=qu[i];
         if( eff[u]>=k ) {
             dp[u] = dp[pnt[u]]+dp[u];
         } else {
             dp[u] = dp[pnt[u]];
         }
     }
 }
 void query( char *s ) {
     fetch(s);
     sort( stk+1, stk+1+top, cmp );
     dnt rt = 0;
     for( int i=1; i<=top; i++ ) {
         int u=stk[i];
         rt += dp[u];
     }
     printf( "%lld ", rt );
 }
 int main() {
     //  input and build sam
     scanf( "%d%d", &n, &k );
     init();
     char *buf_cur = buf;
     for( int i=1; i<=n; i++ ) {
         shead[i] = buf_cur;
         scanf( "%s", shead[i] );
         for( int j=0; shead[i][j]; j++ )
             append( shead[i][j]-'a'+1 );
         append( 0 );
         buf_cur += strlen(shead[i]) + 1;
     }
     lg[0] = -1;
     for( int i=1; i<=ntot; i++ ) lg[i] = lg[i>>1]+1;
     //  dodp to calc the number of real (separator-free) substrings per node
     dodp();
     //  build parent tree and its dfs order
     build();
     dfs(0);
     //  calc each string's contribution to the node counters
     for( int i=1; i<=n; i++ )
         effort( shead[i] );
     //  bfs to calc the sum of sub-answers along root paths
     bfs();
     //  query the answer of each string
     for( int i=1; i<=n; i++ )
         query( shead[i] );
     printf( "\n" );
 }
