Description

String analysis often arises in applications from biology and chemistry, such as the study of DNA and protein molecules. One interesting problem is to find how many substrings are repeated (at least twice) in a long string. In this problem, you will write a program to find the total number of repeated substrings in a string of at most 100 000 alphabetic characters. Any unique substring that occurs more than once is counted. As an example, if the string is “aabaab”, there are 5 repeated substrings: “a”, “aa”, “aab”, “ab”, “b”. If the string is “aaaaa”, the repeated substrings are “a”, “aa”, “aaa”, “aaaa”. Note that repeated occurrences of a substring may overlap (e.g. “aaaa” in the second case).

Input

The input consists of at most 10 cases. The first line contains a positive integer, specifying the number of
cases to follow. Each of the following line contains a nonempty string of up to 100 000 alphabetic characters.

Output

For each line of input, output one line containing the number of unique substrings that are repeated. You
may assume that the correct answer fits in a signed 32-bit integer.

Sample Input

3
aabaab
aaaaa
AaAaA

Sample Output

5
4
5 题目大意:统计字符串中重复出现的子串数目。
题目分析:sum(max(height(i)-height(i-1),0))即为答案。 代码如下:
//# define AC

# ifndef AC

# include<iostream>
# include<cstdio>
# include<cstring>
# include<vector>
# include<queue>
# include<list>
# include<cmath>
# include<set>
# include<map>
# include<string>
# include<cstdlib>
# include<algorithm>
using namespace std;
# define mid (l+(r-l)/2) typedef long long LL;
typedef unsigned long long ULL; const int N=100000;
const int mod=1e9+7;
const int INF=0x7fffffff;
const LL oo=0x7fffffffffffffff; int SA[N+5];
int tSA[N+5];
int cnt[N+5];
int rk[N+5];
int *x,*y;
int height[N+5]; int idx(char c)
{
if('a'<=c&&c<='z') return c-'a';
return c-'A'+26;
} bool same(int i,int j,int k,int n)
{
if(y[i]-y[j]) return false;
if(i+k<n&&j+k>=n) return false;
if(i+k>=n&&j+k<n) return false;
return y[i+k]==y[j+k];
} void buildSA(char *s)
{
int n=strlen(s);
int m=52;
x=rk,y=tSA;
for(int i=0;i<m;++i) cnt[i]=0;
for(int i=0;i<n;++i) ++cnt[x[i]=idx(s[i])];
for(int i=1;i<m;++i) cnt[i]+=cnt[i-1];
for(int i=n-1;i>=0;--i) SA[--cnt[x[i]]]=i; for(int k=1;k<=n;k<<=1){
int p=0;
for(int i=n-k;i<n;++i) y[p++]=i;
for(int i=0;i<n;++i) if(SA[i]>=k) y[p++]=SA[i]-k; for(int i=0;i<m;++i) cnt[i]=0;
for(int i=0;i<n;++i) ++cnt[x[y[i]]];
for(int i=1;i<m;++i) cnt[i]+=cnt[i-1];
for(int i=n-1;i>=0;--i) SA[--cnt[x[y[i]]]]=y[i]; p=1;
swap(x,y);
x[SA[0]]=0;
for(int i=1;i<n;++i)
x[SA[i]]=same(SA[i],SA[i-1],k,n)?p-1:p++;
if(p>=n) break;
m=p;
}
} void getHeight(char *s)
{
int n=strlen(s);
for(int i=0;i<n;++i) rk[SA[i]]=i;
int k=0;
for(int i=0;i<n;++i){
if(rk[i]==0){
height[rk[i]]=k=0;
}else{
if(k) --k;
int j=SA[rk[i]-1];
while(i+k<n&&j+k<n&&s[i+k]==s[j+k])
++k;
height[rk[i]]=k;
}
}
} char str[N+5]; void solve()
{
int n=strlen(str);
int ans=0;
for(int i=0;i<n;++i){
if(height[i]>height[i-1])
ans+=height[i]-height[i-1];
}
printf("%d\n",ans);
} int main()
{
int T;
scanf("%d",&T);
while(T--)
{
scanf("%s",str);
buildSA(str);
getHeight(str);
solve();
}
return 0;
} # endif

  

CSU-1632 Repeated Substrings (后缀数组)的更多相关文章

  1. UVALive - 6869 Repeated Substrings 后缀数组

    题目链接: http://acm.hust.edu.cn/vjudge/problem/113725 Repeated Substrings Time Limit: 3000MS 样例 sample ...

  2. CSU-1632 Repeated Substrings[后缀数组求重复出现的子串数目]

    评测地址:https://cn.vjudge.net/problem/CSU-1632 Description 求字符串中所有出现至少2次的子串个数 Input 第一行为一整数T(T<=10)表 ...

  3. csu 1305 Substring (后缀数组)

    http://acm.csu.edu.cn/OnlineJudge/problem.php?id=1305 1305: Substring Time Limit: 2 Sec  Memory Limi ...

  4. POJ3415 Common Substrings —— 后缀数组 + 单调栈 公共子串个数

    题目链接:https://vjudge.net/problem/POJ-3415 Common Substrings Time Limit: 5000MS   Memory Limit: 65536K ...

  5. POJ1226 Substrings ——后缀数组 or 暴力+strstr()函数 最长公共子串

    题目链接:https://vjudge.net/problem/POJ-1226 Substrings Time Limit: 1000MS   Memory Limit: 10000K Total ...

  6. SPOJ - SUBST1 New Distinct Substrings —— 后缀数组 单个字符串的子串个数

    题目链接:https://vjudge.net/problem/SPOJ-SUBST1 SUBST1 - New Distinct Substrings #suffix-array-8 Given a ...

  7. SPOJ- Distinct Substrings(后缀数组&后缀自动机)

    Given a string, we need to find the total number of its distinct substrings. Input T- number of test ...

  8. SPOJ - DISUBSTR Distinct Substrings (后缀数组)

    Given a string, we need to find the total number of its distinct substrings. Input T- number of test ...

  9. POJ 3415 Common Substrings 后缀数组+并查集

    后缀数组,看到网上很多题解都是单调栈,这里提供一个不是单调栈的做法, 首先将两个串 连接起来求height   求完之后按height值从大往小合并.  height值代表的是  sa[i]和sa[i ...

  10. POJ1226:Substrings(后缀数组)

    Description You are given a number of case-sensitive strings of alphabetic characters, find the larg ...

随机推荐

  1. nginx 高并发配置参数(转载)

    声明:原文章来自http://blog.csdn.net/oonets334/article/details/7528558.如需知道更详细内容请访问. 一.一般来说nginx 配置文件中对优化比较有 ...

  2. oracle分析函数与over()(转)

    文章参考:http://blog.csdn.net/haiross/article/details/15336313 -- Oracle分析函数入门-- 分析函数是什么? 分析函数是Oracle专门用 ...

  3. TextWatcher 编辑框监听器

    TextWatcher tw = new TextWatcher() { @Override public void beforeTextChanged(CharSequence s, int sta ...

  4. HTTP2试用小记

    原文:https://www.clarencep.com/2016/11/17/upgrade-nginx-to-support-http2/ 这两天把公司的网站升级到了全站https. 顺便瞄到了H ...

  5. 2016-12-21(1)Git常用命令总结

    友情链接:http://www.cnblogs.com/mengdd/p/4153773.html

  6. .net中使用ODP.net访问Oracle数据库(无客户端部署方法)

      ODP.net是Oracle提供的数据库访问类库,其功能和效率上都有所保证,它还有一个非常方便特性:在客户端上,可以不用安装Oracle客户端,直接拷贝即可使用. 以下内容转载自:http://b ...

  7. easyUI中datetimebox和combobox的取值方法

    easyUi页面布局中,查询条件放在JS中,如下 <script type="text/javascript"> var columnList = [ [   {    ...

  8. Struts2框架下表单数据的流向以及映射关系

    本例框架很简单:默认页面为用户登录界面login.jsp,提交后由action类LoginAction.java来判断成功或失败,登录结果分别由success.jsp和failure.jsp呈现. 一 ...

  9. Guid与id区别

    一.产生Guid的方法 1.SqlServer中使用系统自带的NEWID函数: select NEWID() 2.C#中,使用Guid类型的NewGuid方法: Guid gid;           ...

  10. ArcGIS图层介绍

    什么是图层 图层是用来在 ArcGIS 产品套件中显示地理数据集的机制.每个图层代表一种数据集(可以是地图服务.图形或是矢量数据),并指定该数据集是如何描绘使用一组属性的. 包含一个地图控件的每个应用 ...