About

Git repository https://github.com/PaddlePaddle/PaddleOCR
Online demo https://www.paddlepaddle.org.cn/hub/scene/ocr
Installation Docs https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/quickstart.md

Install

python -m pip install paddlepaddle==2.3.2 -i https://pypi.tuna.tsinghua.edu.cn/simple

python -m pip install "paddleocr>=2.0.1"
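To confirm that both installs landed in the active environment, a quick check like the following may help. It is a minimal sketch: the helper name `missing_packages` is made up for illustration, and it only tests whether the modules can be located (`paddlepaddle` is imported as `paddle`), not whether they actually work.

```python
import importlib.util


def missing_packages(names):
    """Return the module names from `names` that cannot be located in this environment."""
    # find_spec returns None when a top-level module is not installed
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical check for the two packages installed above
missing = missing_packages(["paddle", "paddleocr"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("paddle and paddleocr are both available")
```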

If the following error occurs during installation:

 error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/

Download the Build Tools installer from https://visualstudio.microsoft.com/visual-cpp-build-tools/ and install the "Desktop development with C++" workload. This downloads several GiB of files and takes a long time.

Usage

paddleocr --image_dir ./fp05b.jpg --use_angle_cls true --use_gpu false

On the first run, it downloads the model files.
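To run the same command over several pictures, the call can be wrapped in a small script. This is only a sketch: `build_command` and `run_on_folder` are hypothetical helpers, the folder layout and the `.jpg` extension are assumptions, and the only flags used are the ones shown in the command above.

```python
import subprocess
from pathlib import Path


def build_command(image_path, use_gpu=False):
    """Build the argument list for one paddleocr CLI call, using the flags shown above."""
    return [
        "paddleocr",
        "--image_dir", str(image_path),
        "--use_angle_cls", "true",
        "--use_gpu", "true" if use_gpu else "false",
    ]


def run_on_folder(folder):
    """Run the CLI once per .jpg file in `folder` (assumed layout) and stop on the first failure."""
    for image_path in sorted(Path(folder).glob("*.jpg")):
        subprocess.run(build_command(image_path), check=True)


# Example (not executed here): run_on_folder("./photos")
```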

Invoke in Python

from paddleocr import PaddleOCR, draw_ocr
from PIL import Image

# PaddleOCR supports multiple languages; switch between them with the lang parameter,
# for example `ch`, `en`, `fr`, `german`, `korean`, `japan`
ocr = PaddleOCR(use_angle_cls=True, lang="ch")  # need to run only once to download and load model into memory
img_path = './photos/fp04b.jpg'
result = ocr.ocr(img_path, cls=True)

# Print every recognized line of every input image
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# Display the result: draw boxes, texts and scores for the first image and save it
result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='./msyh.ttc')
im_show = Image.fromarray(im_show)
im_show.save('fp04b_result.jpg')
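The indexing in the script above (`line[0]`, `line[1][0]`, `line[1][1]`) implies that each recognized line is a box followed by a (text, score) pair. Under that assumption, a small helper can keep only the confident lines. The name `extract_text`, the 0.8 threshold, and the sample data are all made up for illustration.

```python
def extract_text(lines, min_score=0.8):
    """Return (text, score) pairs whose score is at least min_score.

    `lines` is assumed to have the shape used above: [box, (text, score)] per line.
    An empty or missing result yields an empty list.
    """
    if not lines:
        return []
    return [(text, score) for _box, (text, score) in lines if score >= min_score]


# Synthetic data in the assumed shape, not real OCR output
sample = [
    [[[0, 0], [10, 0], [10, 5], [0, 5]], ("hello", 0.95)],
    [[[0, 6], [10, 6], [10, 11], [0, 11]], ("wrld", 0.42)],
]
print(extract_text(sample))
```

With the script above, this would be called as `extract_text(result)` after `result = result[0]`.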

Performance

On CPU, parsing one picture takes around 10 seconds.
The accuracy is much better than that of Tesseract and EasyOCR, even on cellphone photos.
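To measure the per-picture time on your own machine, a timing wrapper along these lines could be used. It is a rough sketch: `time_call` is a hypothetical helper, and three repeats is an arbitrary choice. It may also be worth excluding the first run from any comparison, since that run downloads the model files.

```python
import time


def time_call(fn, *args, repeats=3, **kwargs):
    """Call fn(*args, **kwargs) `repeats` times; return (last_result, durations_in_seconds)."""
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        durations.append(time.perf_counter() - start)
    return result, durations


# Example with the objects from the Python section (not executed here):
# result, seconds = time_call(ocr.ocr, img_path, cls=True)
# print(f"fastest run: {min(seconds):.2f} s")
```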

OCR 03: PaddleOCR