USACO 5.4 Character Recognition
Character Recognition
This problem requires you to write a program that performs character recognition.
Each ideal character image has 20 lines of 20 digits. Each digit is a '0' or a '1'. See Figure 1a below for the layout of character images in the file.
The file font.in contains representations of 27 ideal character images in this order:
_abcdefghijklmnopqrstuvwxyz
where _ represents the space character. Each ideal character is 20 lines long.
The input file contains one or more potentially corrupted character images. A character image might be corrupted in these ways:
- at most one line might be duplicated (and the duplicate immediately follows)
- at most one line might be missing
- some 0's might be changed to 1's
- some 1's might be changed to 0's.
No character image will have both a duplicated line and a missing line. No more than 30% of the 0's and 1's will be changed in any character image in the evaluation datasets.
In the case of a duplicated line, one or both of the resulting lines may have corruptions, and the corruptions may be different.
Write a program to recognize the sequence of one or more characters in the image provided in the input file using the font provided in file font.in.
Recognize a character image by choosing the font character image that requires the smallest overall number of changed 1's and 0's to be corrupted into the given input image, given the most favourable assumptions about duplicated or omitted lines. Count corruptions in only the least corrupted line in the case of a duplicated line. You must determine the sequence of characters that most closely matches the input sequence (the one that requires the least number of corruptions). There is a unique best solution for each evaluation dataset.
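To make the counting rule concrete, here is a minimal sketch of how the cost of a single row could be measured. The names rowDiff and duplicatedRowCost are hypothetical helpers introduced for illustration; they do not come from the problem statement or from the solution further down.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Number of positions at which two equally long digit rows differ.
int rowDiff(const std::string& a, const std::string& b) {
    assert(a.size() == b.size());
    int diff = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] != b[i]) ++diff;
    return diff;
}

// Cost of one font row when the input holds two copies of it:
// only the less corrupted copy is counted.
int duplicatedRowCost(const std::string& fontRow,
                      const std::string& first,
                      const std::string& second) {
    return std::min(rowDiff(fontRow, first), rowDiff(fontRow, second));
}
```

The cost of a whole character image is then the sum of these row costs over its 20 font rows.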
A correct solution will use precisely all of the data supplied in the input file.
PROGRAM NAME: charrec
INPUT FORMAT (both input files)
Both input files begin with an integer N (19 <= N < 1200) that specifies the number of lines that follow:
N
(digit1)(digit2)(digit3) ... (digit20)
(digit1)(digit2)(digit3) ... (digit20)
...
Each line of data is 20 digits wide. There are no spaces separating the zeros and ones.
The file font.in describes the font. It will always contain 541 lines. It may differ for each evaluation dataset.
SAMPLE INPUT (file charrec.in)
Figure 1a: font.in (incomplete sample)
540
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000011100000000000
00000111111011000000
00001111111001100000
00001110001100100000
00001100001100010000
00001100000100010000
00000100000100010000
00000010000000110000
00000001000001110000
00001111111111110000
00001111111111110000
00001111111111000000
00001000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
Figure 1b: charrec.in (sample)
19
00000000000000000000
00000000000000000000
00000000000000000000
00000011100000000000
00100111011011000000
00001111111001100000
00001110001100100000
00001100001100010000
00001100000100010000
00000100000100010000
00000010000000110000
00001111011111110000
00001111111111110000
00001111111111000000
00001000010000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
OUTPUT FORMAT
Your program must produce an output file that contains a single string of the characters recognized. Its format is a single line of ASCII text. The output should not contain any separator characters. If your program does not recognize a particular character, it must output a ? in the appropriate position.
SAMPLE OUTPUT (file charrec.out)
a
Note that the 'sample output' formerly displayed a blank followed by an 'a', but that seems wrong now.
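---------------------------------------- Solution
Even when you can't solve a problem yourself, don't lose your nerve: calmly talk through the solution anyway.
A straightforward simulation of this problem is certain to time out, but the problem has optimal substructure.
Let dp[m] (the array b in the code) be the minimum number of mismatched digits once the first m input lines have been matched. A second array c[j][k] stores the minimum cost of matching the k lines that follow line j against a single font character, and we also record which character that is.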
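The recurrence can be sketched as a bottom-up loop. This is an illustrative variant, not the post's own code, which uses memoized search instead. The function name segment and the cost callback are assumptions made for the sketch: cost(start, len) is assumed to return the cheapest match of the len lines following line start against any single font character.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

const int INF = 0x3f3f3f3f;

// best[m] = fewest corruptions needed to explain the first m input lines,
// or INF when m lines cannot be split into blocks of 19, 20 or 21 lines.
std::vector<int> segment(int n, const std::function<int(int, int)>& cost) {
    assert(n >= 0);
    std::vector<int> best(n + 1, INF);
    best[0] = 0;  // no lines consumed, no corruptions
    for (int m = 19; m <= n; ++m) {
        for (int len = 19; len <= 21; ++len) {
            if (m - len < 0 || best[m - len] >= INF) continue;  // unreachable prefix
            best[m] = std::min(best[m], best[m - len] + cost(m - len, len));
        }
    }
    return best;
}
```

With N < 1200 there are at most about 3600 (start, len) pairs, so almost all of the running time goes into evaluating cost.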
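A 20-line block is compared with the font character directly. For a 19-line block, enumerate which line of the character's ideal image was dropped; for a 21-line block, enumerate which of the 21 input lines to skip, then compare the rest.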
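The three cases can be sketched with one helper that skips a single row of the longer side. Note that this sketch swaps in running head and tail sums so that every skip position is evaluated in one pass, whereas the enumeration described above recompares all rows for each skipped line. Rows, rowDiff, skipOneRow and blockCost are hypothetical names, and the sketch assumes the two sides differ by at most one row.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

using Rows = std::vector<std::string>;

// Number of differing digits between two equally long rows.
int rowDiff(const std::string& a, const std::string& b) {
    int diff = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] != b[i]) ++diff;
    return diff;
}

// Cheapest alignment when exactly one row of `longer` is skipped.
int skipOneRow(const Rows& longer, const Rows& shorter) {
    int n = static_cast<int>(shorter.size());
    assert(static_cast<int>(longer.size()) == n + 1);
    // tail[k] = cost of shorter[k..n-1] against longer[k+1..n]
    std::vector<int> tail(n + 1, 0);
    for (int k = n - 1; k >= 0; --k)
        tail[k] = tail[k + 1] + rowDiff(shorter[k], longer[k + 1]);
    int best = tail[0];  // skip longer[0]
    int head = 0;        // cost of shorter[0..k-1] against longer[0..k-1]
    for (int k = 1; k <= n; ++k) {
        head += rowDiff(shorter[k - 1], longer[k - 1]);
        best = std::min(best, head + tail[k]);  // skip longer[k]
    }
    return best;
}

// Cost of reading `block` (19, 20 or 21 input rows) as the font character `glyph`.
int blockCost(const Rows& glyph, const Rows& block) {
    if (block.size() == glyph.size()) {  // 20 rows: compare directly
        int total = 0;
        for (std::size_t i = 0; i < glyph.size(); ++i)
            total += rowDiff(glyph[i], block[i]);
        return total;
    }
    if (block.size() + 1 == glyph.size())
        return skipOneRow(glyph, block);  // 19 rows: one font row is missing
    return skipOneRow(block, glyph);      // 21 rows: one input row is ignored
}
```

Skipping either copy of a duplicated pair is allowed, so taking the minimum over all skip positions also implements the rule that only the least corrupted copy counts.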
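The transitions are then evaluated with memoized search, because not every state is actually needed; even so, the last test case only just passed at 0.9 s.
[If you open font.in, paste it into an editor and look at the minimap, the glyphs turn out to be surprisingly readable... of course, if you are nearsighted you only need to take off your glasses...]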
/*
ID: ivorysi
LANG: C++
PROG: charrec
*/
#include <iostream>
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <queue>
#include <set>
#include <vector>
#include <string.h>
#include <cmath>
#include <stack>
#include <map>
#define siji(i,x,y) for(int i=(x);i<=(y);++i)
#define gongzi(j,x,y) for(int j=(x);j>=(y);--j)
#define xiaosiji(i,x,y) for(int i=(x);i<(y);++i)
#define sigongzi(j,x,y) for(int j=(x);j>(y);--j)
#define inf 0x3f3f3f3f
//#define ivorysi
#define mo 97797977
#define hash 974711
#define base 47
#define pss pair<string,string>
#define MAXN 5000
#define fi first
#define se second
#define pii pair<int,int>
#define esp 1e-8
typedef long long ll;
using namespace std;
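const char *os=" abcdefghijklmnopqrstuvwxyz?";// index 27 is '?' (character not recognized)
char ch[1205][25],word[1205][25];// font rows and input rows, both 1-indexed
int n1,n2;// number of lines in font.in and in the input
int c[1205][22],b[1205],pr[1205][22];// c: block cost, b: dp value, pr: best font character of a block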
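void init() {
    FILE *fin=fopen("font.in","r");
    fscanf(fin,"%d",&n1);
    siji(i,1,n1) {
        fscanf(fin,"%s",ch[i]+1);
    }
    fclose(fin);
    scanf("%d",&n2);
    siji(i,1,n2) {
        scanf("%s",word[i]+1);
    }
    siji(i,0,n2) {
        b[i]=-1;// -1 marks a state that has not been computed yet
        siji(j,19,21) {
            c[i][j]=-1;
        }
    }
    b[0]=0;// no lines consumed, no corruptions
}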
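int check(int d,int l) {
    // cheapest match of input lines d+1..d+l (l is 19, 20 or 21) against one font character
    if(c[d][l]!=-1) return c[d][l];
    c[d][l]=inf;
    int t=0,t1;
    int id1=0,id2=0;
    int ti=max(20,l);
    siji(i,0,26) {
        t=0;
        if((i+1)*20>n1) break;
        // direct comparison; for l==19 this drops font row 20, for l==21 it drops input line 21
        siji(j,1,min(20,l)) {
            siji(z,1,20) {
                if(ch[i*20+j][z]!=word[d+j][z])
                    ++t;
            }
        }
        if(l==21) id1=-1,id2=0;// duplicated line: past the skipped input line k the font row lags by one
        if(l==19) id1=0,id2=-1;// missing line: past the skipped font row k the input line lags by one
        siji(k,1,20) {// k is the row that gets skipped
            if(l==20) break;
            t1=0;
            siji(j,1,k-1) {
                siji(z,1,20) {
                    if(ch[i*20+j][z]!=word[d+j][z]) ++t1;
                }
            }
            siji(j,k+1,ti) {
                siji(z,1,20) {
                    if(ch[i*20+j+id1][z]!=word[d+j+id2][z])
                        ++t1;
                }
            }
            t=min(t,t1);
        }
        if(c[d][l]>t) {
            c[d][l]=t;
            pr[d][l]=i;
        }
    }
    if(c[d][l]>120) {pr[d][l]=27;}// corruption rate above 30% of the 20x20 digits: output '?'
    return c[d][l];
}
int dfs(int m) {
    // b[m]: fewest corruptions over all ways to split the first m input lines into characters
    if(b[m]!=-1) return b[m];
    b[m]=inf;
    siji(i,19,21) {
        if(m-i<0) break;
        if(dfs(m-i)>=inf) continue;
        b[m]=min(b[m],dfs(m-i)+check(m-i,i));
    }
    return b[m];
}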
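void pra(int m) {
    // print the recognized characters by walking the dp back from line m
    if(m<=0) return;
    int cw=27;// '?' if no transition explains b[m]
    siji(i,19,21) {
        if(m-i<0) break;
        if(b[m-i]<inf && b[m]==b[m-i]+c[m-i][i]) {
            pra(m-i);
            cw=pr[m-i][i];
            break;
        }
    }
    printf("%c",os[cw]);
}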
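void solve() {
    init();
    dfs(n2);
    pra(n2);
    puts("");
}
int main(int argc, char const *argv[])
{
    // define ivorysi (see the commented-out #define above) to use the judge's file names;
    // otherwise the program reads the local test file f1.in
#ifdef ivorysi
    freopen("charrec.in","r",stdin);
    freopen("charrec.out","w",stdout);
#else
    freopen("f1.in","r",stdin);
#endif
    solve();
    return 0;
}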