Character Recognition

This problem requires you to write a program that performs character recognition.

Each ideal character image has 20 lines of 20 digits. Each digit is a `0' or a `1'. See Figure 1a (way below) for the layout of character images in the file.

The file font.in contains representations of 27 ideal character images in this order:

_abcdefghijklmnopqrstuvwxyz

where _ represents the space character. Each ideal character is 20 lines long.

The input file contains one or more potentially corrupted character images. A character image might be corrupted in these ways:

  • at most one line might be duplicated (and the duplicate immediately follows)
  • at most one line might be missing
  • some 0's might be changed to 1's
  • some 1's might be changed to 0's.

No character image will have both a duplicated line and a missing line. No more than 30% of the 0's and 1's will be changed in any character image in the evaluation datasets.

In the case of a duplicated line, one or both of the resulting lines may have corruptions, and the corruptions may be different.

Write a program to recognize the sequence of one or more characters in the image provided in the input file using the font provided in file font.in.

Recognize a character image by choosing the font character image that requires the smallest number of changed 1's and 0's to be corrupted into the given image, given the most favourable assumptions about duplicated or omitted lines. Count corruptions in only the least corrupted line in the case of a duplicated line. You must determine the sequence of characters that most closely matches the input sequence (the one that requires the fewest corruptions). There is a unique best solution for each evaluation dataset.
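The counting rule above can be sketched in a few lines. This is only an illustration under assumptions: the helper names `rowDiff` and `duplicatedRowCost` are made up here and do not come from the problem statement or the solution below.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of positions at which two equally long rows of '0'/'1' differ.
int rowDiff(const std::string& a, const std::string& b) {
    assert(a.size() == b.size());
    int diff = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] != b[i]) ++diff;
    return diff;
}

// Cost charged for a font row that shows up twice in the image:
// only the less corrupted of the two copies is counted.
int duplicatedRowCost(const std::string& fontRow,
                      const std::string& firstCopy,
                      const std::string& secondCopy) {
    int a = rowDiff(fontRow, firstCopy);
    int b = rowDiff(fontRow, secondCopy);
    return a < b ? a : b;
}
```

The cost of a whole character image is then the sum of `rowDiff` over the aligned rows, with the duplicated row (if any) charged through `duplicatedRowCost`.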

A correct solution will use precisely all of the data supplied in the input file.

PROGRAM NAME: charrec

INPUT FORMAT (both input files)

Both input files begin with an integer N (19 <= N < 1200) that specifies the number of lines that follow:

N
(digit1)(digit2)(digit3) ... (digit20)
(digit1)(digit2)(digit3) ... (digit20)
...

Each line of data is 20 digits wide. There are no spaces separating the zeros and ones.
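A minimal sketch of reading this format, assuming the stream holds the count N followed by N rows of 20 digits. The function names `readImage` and `readImageFromText` are hypothetical and are not part of the solution further down.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads the leading line count N and then up to N rows of digits.
// Rows contain no spaces, so formatted extraction with >> is enough.
std::vector<std::string> readImage(std::istream& in) {
    int n = 0;
    in >> n;
    std::vector<std::string> rows;
    std::string row;
    for (int i = 0; i < n && (in >> row); ++i) {
        assert(row.size() == 20);  // every data line is 20 digits wide
        rows.push_back(row);
    }
    return rows;
}

// Convenience wrapper for trying the parser on an in-memory string.
std::vector<std::string> readImageFromText(const std::string& text) {
    std::istringstream in(text);
    return readImage(in);
}
```

The same function serves both font.in and charrec.in, since the two files share one layout.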

The file font.in describes the font. It will always contain 541 lines. It may differ for each evaluation dataset.

SAMPLE INPUT (file charrec.in)

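Figure 1a (font.in): incomplete sample showing the beginning of font.in (space and a).

Figure 1b (charrec.in): sample charrec.in, showing a corrupted a.

The font.in excerpt is listed first (starting with the line 540), followed by charrec.in (starting with the line 19).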
540
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000011100000000000
00000111111011000000
00001111111001100000
00001110001100100000
00001100001100010000
00001100000100010000
00000100000100010000
00000010000000110000
00000001000001110000
00001111111111110000
00001111111111110000
00001111111111000000
00001000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
00000000000000000000
19
00000000000000000000
00000000000000000000
00000000000000000000
00000011100000000000
00100111011011000000
00001111111001100000
00001110001100100000
00001100001100010000
00001100000100010000
00000100000100010000
00000010000000110000
00001111011111110000
00001111111111110000
00001111111111000000
00001000010000000000
00000000000000000000
00000000000001000000
00000000000000000000
00000000000000000000
Figure 1a: font.in (excerpt). Figure 1b: charrec.in.

OUTPUT FORMAT

Your program must produce an output file that contains a single string of the characters recognized. Its format is a single line of ASCII text. The output should not contain any separator characters. If your program does not recognize a particular character, it must output a ? in the appropriate position.

SAMPLE OUTPUT (file charrec.out)

a

Note that the 'sample output' formerly displayed a blank followed by an 'a', but that seems wrong now.

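SOLUTION

Even when you could not solve a problem yourself, never flinch: present the solution with perfect composure.

A brute-force simulation is bound to time out on this problem, but it turns out to have optimal substructure.

Let dp[m] be the minimum number of mismatched digits once the first m input lines have been matched. A second array c[j][k] holds the minimum cost of matching the k lines that follow line j against a single font character, and we also record which character achieves that cost.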
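The recurrence described above can be sketched as follows. This version fills the table bottom-up, whereas the author's program further down uses memoized search; `segmentDp` and the `segCost` callback are hypothetical names introduced only for this illustration, with `segCost(start, len)` standing in for c[start][len].

```cpp
#include <cassert>
#include <functional>
#include <vector>

const int INF = 0x3f3f3f3f;

// dp[m] = minimum total number of corruptions when the first m input lines
// are split into character images of 19, 20 or 21 lines each.
// segCost(start, len) is assumed to return the best cost of reading lines
// start+1 .. start+len as one character.
std::vector<int> segmentDp(int n, const std::function<int(int, int)>& segCost) {
    std::vector<int> dp(n + 1, INF);
    dp[0] = 0;
    for (int m = 19; m <= n; ++m) {
        for (int len = 19; len <= 21; ++len) {
            if (m - len < 0 || dp[m - len] >= INF) continue;  // unreachable prefix
            int cand = dp[m - len] + segCost(m - len, len);
            if (cand < dp[m]) dp[m] = cand;
        }
    }
    return dp;
}
```

dp[n] is the answer for an n-line input. A value of INF means that n lines cannot be split into images of 19 to 21 lines at all.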

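A 20-line image is matched directly. For a 19-line image, enumerate which line of the character's ideal representation is skipped. For a 21-line image, enumerate which of the 21 input lines is skipped, then match the rest.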
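The three cases can be written as one enumeration over the skipped row. This is a simplified sketch, not the author's code: `matchCost` and `rowDiff` are hypothetical names, and the font height is taken from the argument so the function can be tried on tiny inputs (it is 20 in the real problem).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

static int rowDiff(const std::string& a, const std::string& b) {
    int diff = 0;
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i)
        if (a[i] != b[i]) ++diff;
    return diff;
}

// font: the ideal rows of one character; img: one row fewer, the same
// number, or one row more. Returns the minimum number of changed digits.
int matchCost(const std::vector<std::string>& font,
              const std::vector<std::string>& img) {
    const int F = static_cast<int>(font.size());
    const int L = static_cast<int>(img.size());
    assert(L >= F - 1 && L <= F + 1);
    if (L == F) {                          // same height: compare row by row
        int cost = 0;
        for (int j = 0; j < F; ++j) cost += rowDiff(font[j], img[j]);
        return cost;
    }
    const bool skipFont = (L == F - 1);    // missing line: drop one font row
    const int longer = std::max(F, L);     // otherwise drop one image row
    int best = 0x3f3f3f3f;
    for (int skip = 0; skip < longer; ++skip) {
        int cost = 0;
        for (int j = 0; j < longer; ++j) {
            if (j == skip) continue;
            int other = j < skip ? j : j - 1;  // index in the shorter sequence
            cost += skipFont ? rowDiff(font[j], img[other])
                             : rowDiff(font[other], img[j]);
        }
        best = std::min(best, cost);
    }
    return best;
}
```

Dropping one of the two copies of a duplicated line and keeping the cheaper result is the same as counting only the less corrupted copy, which is what the problem statement asks for.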

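The transitions are then done with memoized search, because not every state is actually needed. Even so, the last test case only just passed, at 0.9s.

[If you open font.in, paste it into an editor and look at the minimap, you will find the font remarkably considerate... and if you are nearsighted, just take off your glasses...]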

/*
ID: ivorysi
LANG: C++
PROG: charrec
*/
#include <iostream>
#include <cstdio>
#include <cstring>
#include <algorithm>
#define siji(i,x,y) for(int i=(x);i<=(y);++i)
#define inf 0x3f3f3f3f
//#define ivorysi
using namespace std;
// Constants follow the problem statement: 27 font characters of 20 lines
// each, 20 digits per line, and character images of 19, 20 or 21 lines.
// Array sizes are chosen to cover 540 font lines and fewer than 1200 input lines.
const char *os=" abcdefghijklmnopqrstuvwxyz?";
char ch[605][25],word[1205][25];   // font rows and input rows, both 1-based
int n1,n2;
// c[d][l]: best cost of matching the l lines after line d to one character
// b[m]:    best total cost for the first m lines (-1 = not computed yet)
// pr[d][l]: index into os of the character that achieves c[d][l]
int c[1205][22],b[1205],pr[1205][22];
void init() {
    FILE *fin=fopen("font.in","r");
    if(!fin) return;
    fscanf(fin,"%d",&n1);
    siji(i,1,n1) {
        fscanf(fin,"%s",ch[i]+1);
    }
    fclose(fin);
    scanf("%d",&n2);
    siji(i,1,n2) {
        scanf("%s",word[i]+1);
    }
    siji(i,0,n2) {
        b[i]=-1;
        siji(j,19,21) {
            c[i][j]=-1;
        }
    }
    b[0]=0;
}
int check(int d,int l) {
    if(c[d][l]!=-1) return c[d][l];
    c[d][l]=inf;
    int t,t1;
    int id1=0,id2=0;
    int ti=max(20,l);
    siji(i,0,26) {
        if((i+1)*20>n1) break;
        t=inf;
        if(l==20) {
            // same height: compare row by row
            t=0;
            siji(j,1,20) {
                siji(z,1,20) {
                    if(ch[i*20+j][z]!=word[d+j][z])
                        ++t;
                }
            }
        }
        if(l==21) id1=-1,id2=0;   // duplicated line: skip input line k
        if(l==19) id1=0,id2=-1;   // missing line: skip font line k
        siji(k,1,ti) {
            if(l==20) break;
            t1=0;
            // rows before the skipped one line up directly
            siji(j,1,k-1) {
                siji(z,1,20) {
                    if(ch[i*20+j][z]!=word[d+j][z]) ++t1;
                }
            }
            // rows after the skipped one are shifted by one
            siji(j,k+1,ti) {
                siji(z,1,20) {
                    if(ch[i*20+j+id1][z]!=word[d+j+id2][z])
                        ++t1;
                }
            }
            t=min(t,t1);
        }
        if(c[d][l]>t) {
            c[d][l]=t;
            pr[d][l]=i;
        }
    }
    // if more than 30% is corrupted (assumed threshold: 120 = 30% of 20*20 digits)
    if(c[d][l]>120) {pr[d][l]=27;}
    return c[d][l];
}
int dfs(int m) {
    if(b[m]!=-1) return b[m];
    b[m]=inf;
    siji(i,19,21) {
        if(m-i<0) break;
        if(dfs(m-i)>=inf) continue;
        b[m]=min(b[m],dfs(m-i)+check(m-i,i));
    }
    return b[m];
}
// Walk the optimal split backwards and print the characters in order.
void pra(int m) {
    if(m<=0) return;
    int cw=27;   // defaults to '?'
    siji(i,19,21) {
        if(m-i<0) break;
        if(b[m-i]<0||b[m-i]>=inf||c[m-i][i]<0) continue;
        if(b[m]==b[m-i]+c[m-i][i]) {
            pra(m-i);
            cw=pr[m-i][i];
            break;
        }
    }
    printf("%c",os[cw]);
}
void solve() {
    init();
    dfs(n2);
    pra(n2);
    puts("");
}
int main(int argc, char const *argv[])
{
#ifdef ivorysi
    freopen("charrec.in","r",stdin);
    freopen("charrec.out","w",stdout);
#else
    freopen("f1.in","r",stdin);
#endif
    solve();
    return 0;
}
