Machine vision - tesseract (CAPTCHAs)

Installation

Ubuntu

sudo apt-get install tesseract-ocr

Windows

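Download and run the installer

Add the Tesseract install directory to the Path environment variable (search Windows for "environment variables" to find the setting)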

Testing

In a terminal: tesseract xx.jpg <output name> (the recognized text is written to <output name>.txt)
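The same check can be scripted from Python. The sketch below is a minimal example that assumes the `tesseract` binary is already on the PATH; `build_command` and `run_tesseract` are illustrative helper names, not part of any library.

```python
import subprocess
from pathlib import Path


def build_command(image_path, output_base):
    """Build the argument list for `tesseract <image> <output base>`."""
    return ["tesseract", str(image_path), str(output_base)]


def run_tesseract(image_path, output_base="result"):
    """Run the CLI and return the recognized text (hypothetical helper)."""
    # check=True raises CalledProcessError if tesseract exits with an error
    subprocess.run(build_command(image_path, output_base), check=True)
    # tesseract appends ".txt" to the output base name
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

Calling `run_tesseract("xx.jpg")` would run the command from the test step and read back `result.txt`.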

pytesseract

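Recognition accuracy depends on the quality of your tessdata.

The bundled data performs poorly on CAPTCHAs, so out of the box it is of little practical use.

Installation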

sudo pip3 install pytesseract
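pytesseract calls the `tesseract` executable, so it has to be able to find it. When editing Path on Windows is not an option, the executable's location can be assigned to `pytesseract.pytesseract.tesseract_cmd` instead. The following is a minimal sketch: `resolve_binary` and `configure_pytesseract` are hypothetical helpers, and the Windows path is only an assumed install location that should be adjusted to your machine.

```python
import os
import shutil


def resolve_binary(name, fallbacks=()):
    """Return the path to `name` from PATH, else the first existing fallback, else None."""
    found = shutil.which(name)
    if found:
        return found
    for candidate in fallbacks:
        if os.path.isfile(candidate):
            return candidate
    return None


def configure_pytesseract():
    """Point pytesseract at the tesseract executable, if one can be located."""
    import pytesseract  # imported here so resolve_binary stays usable without it

    # Assumed install location; not guaranteed to match your system
    path = resolve_binary("tesseract", [r"C:\Program Files\Tesseract-OCR\tesseract.exe"])
    if path is None:
        raise FileNotFoundError("tesseract executable not found")
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```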

Basic usage

import pytesseract
# Pillow (PIL), the de facto standard image-processing library for Python
from PIL import Image

# Open the image
img = Image.open('yzm1.jpg')
# Convert the image to a string
r = pytesseract.image_to_string(img)
print(r)
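Raw `image_to_string` output often carries spaces, newlines, or stray punctuation, which matters when the CAPTCHA is known to be alphanumeric. The sketch below is one possible approach rather than a guaranteed fix: it passes Tesseract's `--psm 7` option (treat the image as a single line of text) through pytesseract's `config` parameter, then strips everything except letters and digits. `clean_captcha_text` and `read_captcha` are hypothetical helper names.

```python
import re


def clean_captcha_text(raw):
    """Keep only ASCII letters and digits from the OCR output."""
    return re.sub(r"[^0-9A-Za-z]", "", raw)


def read_captcha(path):
    """OCR a single-line CAPTCHA image and return the cleaned text."""
    # Imported lazily so clean_captcha_text can be used without OCR installed
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        # --psm 7 asks Tesseract to treat the image as one line of text
        raw = pytesseract.image_to_string(img, config="--psm 7")
    return clean_captcha_text(raw)
```

Cleaning does not improve what Tesseract recognizes; it only removes noise around characters it did get right.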

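Online CAPTCHA-solving platforms

tesseract-ocr's recognition rate on CAPTCHAs is poor, so it is not very practical.

An online CAPTCHA-solving service tends to offer better value.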

Online CAPTCHA solving: Yundama (云打码)

Website: http://www.yundama.com

Official documentation example:

import json
import time

import requests

######################################################################

class YDMHttp:
    apiurl = 'http://api.yundama.com/api.php'
    username = ''
    password = ''
    appid = ''
    appkey = ''

    def __init__(self, username, password, appid, appkey):
        self.username = username
        self.password = password
        self.appid = str(appid)
        self.appkey = appkey

    def request(self, fields, files=None):
        # POST the form fields (and optional files), then parse the JSON reply
        response = self.post_url(self.apiurl, fields, files or {})
        return json.loads(response)

    def balance(self):
        # Query the account balance; a negative return value is an error code
        data = {'method': 'balance', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey}
        response = self.request(data)
        if response:
            if response['ret'] and response['ret'] < 0:
                return response['ret']
            return response['balance']
        # Returned when the response is empty
        return -9001

    def login(self):
        # Log in and return the user ID; a negative return value is an error code
        data = {'method': 'login', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey}
        response = self.request(data)
        if response:
            if response['ret'] and response['ret'] < 0:
                return response['ret']
            return response['uid']
        # Returned when the response is empty
        return -9001

    def upload(self, filename, codetype, timeout):
        # Upload the image and return the CAPTCHA ID (cid); a negative value is an error code
        data = {'method': 'upload', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey, 'codetype': str(codetype), 'timeout': str(timeout)}
        file = {'file': filename}
        response = self.request(data, file)
        if response:
            if response['ret'] and response['ret'] < 0:
                return response['ret']
            return response['cid']
        # Returned when the response is empty
        return -9001

    def result(self, cid):
        # Fetch the recognized text for a cid; returns '' while no text is available
        data = {'method': 'result', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey, 'cid': str(cid)}
        response = self.request(data)
        return (response['text'] or '') if response else ''

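    def decode(self, filename, codetype, timeout):
        # Upload the image, then poll once per second, up to `timeout` times
        cid = self.upload(filename, codetype, timeout)
        if cid > 0:
            for _ in range(timeout):
                result = self.result(cid)
                if result != '':
                    return cid, result
                time.sleep(1)
            # Returned when no result arrives before the polling loop ends
            return -3003, ''
        # Upload failed: cid holds the error code
        return cid, ''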

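    def report(self, cid):
        # Send a report for the given cid and return the API's 'ret' value
        data = {'method': 'report', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey, 'cid': str(cid), 'flag': ''}
        response = self.request(data)
        if response:
            return response['ret']
        # Returned when the response is empty
        return -9001

    def post_url(self, url, fields, files=None):
        # Open each file path in binary mode and close the handles once the upload finishes
        handles = {key: open(path, 'rb') for key, path in (files or {}).items()}
        try:
            res = requests.post(url, files=handles, data=fields)
        finally:
            for handle in handles.values():
                handle.close()
        return res.text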

######################################################################

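# Username
username = 'username'
# Password
password = 'password'
# Software ID, required for developer revenue sharing. Get it from "My Software" in the developer console.
appid = 1
# Software key, required for developer revenue sharing. Get it from "My Software" in the developer console.
appkey = '22cc5376925e9387a23cf797cb9ba745'
# Image file
filename = 'getimage.jpg'
# CAPTCHA type, e.g. 1004 means 4 alphanumeric characters. Pricing differs by type.
# Set it accurately or the recognition rate suffers. All types are listed at http://www.yundama.com/price.html
codetype = 1004
# Timeout in seconds
timeout = 60

# Sanity check
if username == 'username':
    print('Set the parameters above before testing')
else:
    # Initialize the client
    yundama = YDMHttp(username, password, appid, appkey)
    # Log in to Yundama
    uid = yundama.login()
    print('uid: %s' % uid)
    # Query the balance
    balance = yundama.balance()
    print('balance: %s' % balance)
    # Start recognition: image path, CAPTCHA type ID, timeout (seconds); returns the cid and the recognized text
    cid, result = yundama.decode(filename, codetype, timeout)
    print('cid: %s, result: %s' % (cid, result))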

######################################################################
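The `decode` method in the client polls `result` once per second until text arrives or the attempts run out. The same pattern can be written as a standalone helper with a real deadline. This is only a sketch: `poll_until`, `fetch`, and `interval` are illustrative names and not part of the Yundama API.

```python
import time


def poll_until(fetch, timeout, interval=1.0):
    """Call fetch() until it returns a non-empty string or `timeout` seconds pass.

    Returns the string, or '' if the deadline is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        text = fetch()
        if text:
            return text
        # Stop if another sleep would carry us past the deadline
        if time.monotonic() + interval > deadline:
            return ""
        time.sleep(interval)
```

With the client from the example, a call could look like `poll_until(lambda: yundama.result(cid), timeout)`. Unlike a fixed number of loop iterations, the deadline also accounts for the time each HTTP request takes.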
