Description

String analysis often arises in applications from biology and chemistry, such as the study of DNA and protein molecules. One interesting problem is to find how many substrings are repeated (at least twice) in a long string. In this problem, you will write a program to find the total number of repeated substrings in a string of at most 100 000 alphabetic characters. Any unique substring that occurs more than once is counted. As an example, if the string is “aabaab”, there are 5 repeated substrings: “a”, “aa”, “aab”, “ab”, “b”. If the string is “aaaaa”, the repeated substrings are “a”, “aa”, “aaa”, “aaaa”. Note that repeated occurrences of a substring may overlap (e.g. “aaaa” in the second case).

Input

The input consists of at most 10 cases. The first line contains a positive integer, specifying the number of
cases to follow. Each of the following line contains a nonempty string of up to 100 000 alphabetic characters.

Output

For each line of input, output one line containing the number of unique substrings that are repeated. You
may assume that the correct answer fits in a signed 32-bit integer.

Sample Input

3
aabaab
aaaaa
AaAaA

Sample Output

5
4
5 题目大意:统计字符串中重复出现的子串数目。
题目分析:sum(max(height(i)-height(i-1),0))即为答案。 代码如下:
//# define AC

# ifndef AC

# include<iostream>
# include<cstdio>
# include<cstring>
# include<vector>
# include<queue>
# include<list>
# include<cmath>
# include<set>
# include<map>
# include<string>
# include<cstdlib>
# include<algorithm>
using namespace std;
# define mid (l+(r-l)/2) typedef long long LL;
typedef unsigned long long ULL; const int N=100000;
const int mod=1e9+7;
const int INF=0x7fffffff;
const LL oo=0x7fffffffffffffff; int SA[N+5];
int tSA[N+5];
int cnt[N+5];
int rk[N+5];
int *x,*y;
int height[N+5]; int idx(char c)
{
if('a'<=c&&c<='z') return c-'a';
return c-'A'+26;
} bool same(int i,int j,int k,int n)
{
if(y[i]-y[j]) return false;
if(i+k<n&&j+k>=n) return false;
if(i+k>=n&&j+k<n) return false;
return y[i+k]==y[j+k];
} void buildSA(char *s)
{
int n=strlen(s);
int m=52;
x=rk,y=tSA;
for(int i=0;i<m;++i) cnt[i]=0;
for(int i=0;i<n;++i) ++cnt[x[i]=idx(s[i])];
for(int i=1;i<m;++i) cnt[i]+=cnt[i-1];
for(int i=n-1;i>=0;--i) SA[--cnt[x[i]]]=i; for(int k=1;k<=n;k<<=1){
int p=0;
for(int i=n-k;i<n;++i) y[p++]=i;
for(int i=0;i<n;++i) if(SA[i]>=k) y[p++]=SA[i]-k; for(int i=0;i<m;++i) cnt[i]=0;
for(int i=0;i<n;++i) ++cnt[x[y[i]]];
for(int i=1;i<m;++i) cnt[i]+=cnt[i-1];
for(int i=n-1;i>=0;--i) SA[--cnt[x[y[i]]]]=y[i]; p=1;
swap(x,y);
x[SA[0]]=0;
for(int i=1;i<n;++i)
x[SA[i]]=same(SA[i],SA[i-1],k,n)?p-1:p++;
if(p>=n) break;
m=p;
}
} void getHeight(char *s)
{
int n=strlen(s);
for(int i=0;i<n;++i) rk[SA[i]]=i;
int k=0;
for(int i=0;i<n;++i){
if(rk[i]==0){
height[rk[i]]=k=0;
}else{
if(k) --k;
int j=SA[rk[i]-1];
while(i+k<n&&j+k<n&&s[i+k]==s[j+k])
++k;
height[rk[i]]=k;
}
}
} char str[N+5]; void solve()
{
int n=strlen(str);
int ans=0;
for(int i=0;i<n;++i){
if(height[i]>height[i-1])
ans+=height[i]-height[i-1];
}
printf("%d\n",ans);
} int main()
{
int T;
scanf("%d",&T);
while(T--)
{
scanf("%s",str);
buildSA(str);
getHeight(str);
solve();
}
return 0;
} # endif

  

CSU-1632 Repeated Substrings (后缀数组)的更多相关文章

  1. UVALive - 6869 Repeated Substrings 后缀数组

    题目链接: http://acm.hust.edu.cn/vjudge/problem/113725 Repeated Substrings Time Limit: 3000MS 样例 sample ...

  2. CSU-1632 Repeated Substrings[后缀数组求重复出现的子串数目]

    评测地址:https://cn.vjudge.net/problem/CSU-1632 Description 求字符串中所有出现至少2次的子串个数 Input 第一行为一整数T(T<=10)表 ...

  3. csu 1305 Substring (后缀数组)

    http://acm.csu.edu.cn/OnlineJudge/problem.php?id=1305 1305: Substring Time Limit: 2 Sec  Memory Limi ...

  4. POJ3415 Common Substrings —— 后缀数组 + 单调栈 公共子串个数

    题目链接:https://vjudge.net/problem/POJ-3415 Common Substrings Time Limit: 5000MS   Memory Limit: 65536K ...

  5. POJ1226 Substrings ——后缀数组 or 暴力+strstr()函数 最长公共子串

    题目链接:https://vjudge.net/problem/POJ-1226 Substrings Time Limit: 1000MS   Memory Limit: 10000K Total ...

  6. SPOJ - SUBST1 New Distinct Substrings —— 后缀数组 单个字符串的子串个数

    题目链接:https://vjudge.net/problem/SPOJ-SUBST1 SUBST1 - New Distinct Substrings #suffix-array-8 Given a ...

  7. SPOJ- Distinct Substrings(后缀数组&后缀自动机)

    Given a string, we need to find the total number of its distinct substrings. Input T- number of test ...

  8. SPOJ - DISUBSTR Distinct Substrings (后缀数组)

    Given a string, we need to find the total number of its distinct substrings. Input T- number of test ...

  9. POJ 3415 Common Substrings 后缀数组+并查集

    后缀数组,看到网上很多题解都是单调栈,这里提供一个不是单调栈的做法, 首先将两个串 连接起来求height   求完之后按height值从大往小合并.  height值代表的是  sa[i]和sa[i ...

  10. POJ1226:Substrings(后缀数组)

    Description You are given a number of case-sensitive strings of alphabetic characters, find the larg ...

随机推荐

  1. 跨域资源共享(CORS)问题解决方案

    CORS:Cross-Origin Resource Sharing(跨域资源共享) CORS被浏览器支持的版本情况如下:Chrome 3+.IE 8+.Firefox 3.5+.Opera 12+. ...

  2. Extjs tree 更改图标

    去掉 树的叶子图标 .x-tree-node-icon { display: none; //不显示图标 } 更改图标  在后台返回的json中 有  添加  iconCls 属性 如    icon ...

  3. linux tar命令的使用

    tar格式,会打包成一个文件,可以对多个目录,或者多个文件进行打包 tar命令只是打包,不会压缩,打包前后大小是一样的 tar命令 -c    //打包 -x    //解压 -f    //指定文件 ...

  4. 【转】 Linux shell的&&和||

    http://www.2cto.com/os/201302/189655.html Linux shell的&&和||   shell 在执行某个命令的时候,会返回一个返回值,该返回值 ...

  5. C#读写文件的方法汇总_C#教程_脚本之家

    C#读写文件的方法汇总_C#教程_脚本之家 http://www.jb51.net/article/34936.htm

  6. em 和 px相互转换

    <!DOCTYPE html><html lang="en"><head> <meta charset="UTF-8" ...

  7. string和char*的相互转换

    原文地址: 点击打开链接

  8. difference between forward and sendredirect

    Difference between SendRedirect and forward is one of classical interview questions asked during jav ...

  9. node.js实用小模块

    1.浮点数操作 npm install float 2.MD5加密类 npm install MD5 3.xml解析类 1 npm install elementtree 4.转换字符串大小写 1 n ...

  10. centos单用户模式修改ROOT密码

    首先启动的时候的时候,需要进入单用户模式(进入单用户模式的前提是系统引导器能正常工作),单用户模式是不需要输入密码,并且(进入单用户模式,没有开启网络服务,不支持远程连接 )网上说可以通过GRUB ( ...