SWERC13 Trending Topic
map暴力。
。。
Imagine you are in the hiring process for a company whose principal activity is the analysis
of information in the Web. One of the tests consists in writing a program for maintaining up to
date a set of trending topics. You will be hired depending on the efficiency of your solution.
They provide you with text from the most active blogs. The text is organised daily and you
have to provide the sorted list of the N most frequent words during the last 7 days, when asked.
INPUT
Each input file contains one test case. The text corresponding to a day is delimited by tag
<text>. Queries of top N words can appear between texts corresponding to two different days.
A top N query appears as a tag like <top 10 />. In order to facilitate you the process of reading
from input, the number always will be delimited by white spaces, as in the sample.
Notes:
• All words are composed only of lowercase letters of size at most 20.
• The maximum number of different words that can appear is 20000.
• The maximum number of words per day is 20000.
• Words of length less than four characters are considered of no interest.
• The number of days will be at most 1000.
• 1 ≤ N ≤ 20
OUTPUT
The list of N most frequent words during the last 7 days must be shown given a query. Words
must appear in decreasing order of frequency and in alphabetical order when equal frequency.
There must be shown all words whose counter of appearances is equal to the word
at position N. Even if the amount of words to be shown exceeds N.
SAMPLE INPUT
<text>
imagine you are in the hiring process of a company whose
main business is analyzing the information that appears
in the web
</text>
<text>
a simple test consists in writing a program for
maintaining up to date a set of trending topics
</text>
<text>
you will be hired depending on the efficiency of your solution
</text>
<top 5 />
<text>
they provide you with a file containing the text
corresponding to a highly active blog
</text>
<text>
the text is organized daily and you have to provide the
sorted list of the n most frequent words during last week
when asked
</text>
<text>
each input file contains one test case the text corresponding
to a day is delimited by tag text
</text>
<text>
the query of top n words can appear between texts corresponding
to two different days
</text>
<top 3 />
<text>
blah blah blah blah blah blah blah blah blah
please please please
</text>
<top 3 />
2
Problem IProblem I
Trending Topic
SAMPLE OUTPUT
<top 5>
analyzing 1
appears 1
business 1
company 1
consists 1
date 1
depending 1
efficiency 1
hired 1
hiring 1
imagine 1
information 1
main 1
maintaining 1
process 1
program 1
simple 1
solution 1
test 1
that 1
topics 1
trending 1
whose 1
will 1
writing 1
your 1
</top>
<top 3>
text 4
corresponding 3
file 2
provide 2
test 2
words 2
</top>
<top 3>
blah 9
text 4
corresponding 3
please 3
</top>
#include <iostream>
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <string>
#include <map>
#include <vector> using namespace std; typedef pair<int,int> pII; map<string,int> Hash;
vector<int> dy[11];
string rHash[20200];
int day_sum[11][20200];
char cache[30];
int now=9,pre=0,id=1;
int arr[20020],na;
string rss[20020];
bool vis[20020]; void DEBUG(int x)
{
int sz=dy[x].size();
for(int i=0;i<sz;i++)
{
cout<<"ID: "<<dy[x][i]<<" : "<<rHash[dy[x][i]]<<endl;
cout<<"sum: "<<day_sum[x][dy[x][i]]<<endl;
}
} struct RSP
{
int times;
string word;
}rsp[20020]; bool cmpRSP(RSP a,RSP b)
{
if(a.times!=b.times)
return a.times>b.times;
else
return a.word<b.word;
} void get_top(int now,int k)
{
int sz=dy[now].size();
na=0;
int _7dayago=(now+3)%10;
memset(vis,false,sizeof(vis));
for(int i=0;i<sz;i++)
{
if(vis[dy[now][i]]==false)
{
arr[na++]=day_sum[now][dy[now][i]]-day_sum[_7dayago][dy[now][i]];
vis[dy[now][i]]=true;
}
}
sort(arr,arr+na);
int sig=arr[max(0,na-k)];
int rn=0;
memset(vis,false,sizeof(vis));
for(int i=0;i<sz;i++)
{
int times=day_sum[now][dy[now][i]]-day_sum[_7dayago][dy[now][i]];
if(times >= sig &&vis[dy[now][i]]==false)
{
rsp[rn++]=(RSP){times,rHash[dy[now][i]]};
vis[dy[now][i]]=true;
}
}
sort(rsp,rsp+rn,cmpRSP);
printf("<top %d>\n",k);
for(int i=0;i<rn;i++)
{
cout<<rsp[i].word<<" "<<rsp[i].times<<endl;
}
printf("</top>\n");
} int main()
{
while(scanf("%s",cache)!=EOF)
{
if(strcmp(cache,"<text>")==0)
{
///read cache
pre=now;
now=(now+1)%10;
dy[now]=dy[pre];
memcpy(day_sum[now],day_sum[pre],sizeof(day_sum[0]));
///7 day ago ....
while(scanf("%s",cache))
{
if(cache[0]=='<') break;
if(strlen(cache)<4) continue;
string word=cache;
if(Hash[word]==0)
{
rHash[id]=word;
Hash[word]=id++;
}
int ID=Hash[word];
if(day_sum[pre][ID]==0)
dy[now].push_back(ID);
day_sum[now][ID]++;
}
}
else if(strcmp(cache,"<top")==0)
{
int top;
scanf("%d",&top); scanf("%s",cache);
get_top(now,top);
}
}
return 0;
}
SWERC13 Trending Topic的更多相关文章
- UVA 12686 Trending Topic
Trending Topic Time limit: 1.000 seconds Imagine you are in the hiring process for a company whose p ...
- USER STORIES AND USE CASES - DON’T USE BOTH
We’re in Orlando for a working session as part of the Core Team building BABOK V3 and over dinner th ...
- [转载]Three Trending Computer Vision Research Areas, 从CVPR看接下来几年的CV的发展趋势
As I walked through the large poster-filled hall at CVPR 2013, I asked myself, “Quo vadis Computer V ...
- Kafka 如何读取offset topic内容 (__consumer_offsets)
众所周知,由于Zookeeper并不适合大批量的频繁写入操作,新版Kafka已推荐将consumer的位移信息保存在Kafka内部的topic中,即__consumer_offsets topic,并 ...
- Kafka如何创建topic?
Kafka创建topic命令很简单,一条命令足矣:bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-facto ...
- Kafka0.8.2.1删除topic逻辑
前提条件: 在启动broker时候开启删除topic的开关,即在server.properties中添加: delete.topic.enable=true 命令: bin/kafka-topics ...
- [bigdata] kafka基本命令 -- 迁移topic partition到指定的broker
版本 0.9.2 创建topic bin/kafka-topics.sh --create --topic topic_name --partition 6 --replication-factor ...
- Kafka vs RocketMQ——多Topic对性能稳定性的影响-转自阿里中间件
引言 上期我们对比了RocketMQ和Kafka在多Topic场景下,收发消息的对比测试,RocketMQ表现稳定,而Kafka的TPS在64个Topic时可以保持13万,到了128个Topic就跌至 ...
- Kafka vs RocketMQ—— Topic数量对单机性能的影响-转自阿里中间件
引言 上一期我们对比了三类消息产品(Kafka.RabbitMQ.RocketMQ)单纯发送小消息的性能,受到了程序猿们的广泛关注,其中大家对这种单纯的发送场景感到并不过瘾,因为没有任何一个网站的业务 ...
随机推荐
- 题解报告:hdu 4764 Stone(巴什博弈)
题目链接:http://acm.hdu.edu.cn/showproblem.php?pid=4764 Problem Description Tang and Jiang are good frie ...
- NPOI导出Excel自动计算公式问题
以前用过sheet.ForceFormulaRecalculation = true;当时能够自动计算出来. 今天把模板改了一下(没动公式,但是模板是老板改的,我也不知道他操作了什么),结果就不能自动 ...
- Appium Android 获取包名和 Activity 的几种方法 (转)
本文档主要记录“获取包名和 Activity 的方法”,用于自动化测试时启动APP.以下方法主要来源于网络和社区同学的贡献,特此感谢! 1. 方法一: pm list package查看包名 adb ...
- vue中怎样实现 路由拦截器
vue中怎样实现 路由拦截器(当用户没有登录的时候,跳转到登录页面,已经登录的时候,不能跳转到登录页,除非后台token失效) 在 我们需要实现这样 一个功能,登录拦截 其实就是 路由拦截,首先在定义 ...
- web开发——在网页中引用字体包(.ttf),即嵌入特殊字体
在写html时,有点时候需要显示一些特殊字体,不过这些特殊字体是系统一般不自带的,这时就需要我们自行加载要用的字体.方法如下: 1.首先在style里添加: @font-face { font-fam ...
- node mysql es6/es7改造
本文js代码采取了ES6/ES7的写法,而不是commonJs的写法.支持一波JS的新语法.node版本的mysql驱动,通过npm i mysql安装.官网地址:https://github.com ...
- 11.7 【Linq】在查询表达式和点标记之间作出选择
11.7.1 需要使用点标记的操作 最明显的必须使用点标记的情形是调用 Reverse . ToDictionary 这类没有相应的查询表达式语法的方法.然而即使查询表达式支持你要使用的查询操作符,也 ...
- HDU 5729 Rigid Frameworks (联通块计数问题)
题目传送门 通过看题解画图可以发现: 不论怎么转,一列里的横边/一行里的竖边始终平行 当我们加固一个格子时,会让它所在的这一行的竖边和这一列的横边保证垂直 而我们的目标是求所有竖边和横边都保证垂直的方 ...
- 使用VS Code断点调试PHP
vs code 使用一款杰出的轻量级代码编辑器,其中的插件工具不胜枚举而且还在不断增加.使用 vs code 调试 php 代码更是方便简洁,下面我们来一起看一下. 1. 安装 XDebug 扩展 调 ...
- SBC37x交叉编译平台QT+OPENCV【1】
在win7下安装Vbox虚拟机,然后安装Ubuntu10.04版本.上一篇说了根据厂商提供的编译器进行安装. 接下来要说的的环境准备.因为在Linux下对u盘的识别以及目录的共享,还有代码的编译传送运 ...