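Python + YOLOv8 + ONNX: Real-Time Defect Detection

The previous post (Windows 10 + Python + YOLOv8 + ONNX image defect recognition) detected defects in a single image and marked them on the original picture. As before, if you already have an ONNX model, no further configuration or training is needed.

Compared with that version, the program logic has been streamlined, the running time reduced, and real-time detection from a camera added.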

1. Model conversion

Training produces a .pt file, which has to be converted to an .onnx file:

from ultralytics import YOLO

# Load the trained model
model = YOLO("models\\best.pt")

# Export it to ONNX
model.export(format="onnx")

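2. Inspecting the model structure

The structure of the ONNX model can be viewed by opening best.onnx at netron.app.

The viewer shows that the model expects an input image of shape 3*640*640 and outputs a float32 array of n*8400 values (with a leading batch dimension of 1). Here n = 4 + the number of defect classes in the dataset: 4 box values followed by one score per class. The model used in this post has 5 classes, so n = 9.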
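The 8400 in the output shape can be checked by hand. The following is a minimal sketch, assuming the usual YOLOv8 detection head with strides 8, 16 and 32 and one prediction per grid cell; the function name output_shape is made up for this example and is not part of any library.

```python
def output_shape(num_classes, imgsz=640, strides=(8, 16, 32)):
    """Expected ONNX output shape (batch, rows, candidates) for one image.

    Assumption: one prediction per grid cell at each stride, which is how
    the standard YOLOv8 detection head is laid out.
    """
    candidates = sum((imgsz // s) ** 2 for s in strides)  # 80*80 + 40*40 + 20*20
    rows = 4 + num_classes  # x, y, w, h + one score per class
    return (1, rows, candidates)


print(output_shape(num_classes=5))  # (1, 9, 8400)
```

If Netron reports a different shape for your model, the export settings (image size, number of classes) probably differ from the ones assumed here.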

3. Resizing the input image

To avoid distorting the image, it is scaled with its aspect ratio preserved and then centered on a 640*640 background:

import onnxruntime
import numpy as np
import tkinter
from tkinter import filedialog
import random
import cv2

# Open a file dialog so the user can choose an image
filepath = tkinter.filedialog.askopenfilename()

# If the user selected a file, load and display it
if filepath != '':
    # Read the image
    image = cv2.imread(filepath)
    # Get the image size
    h, w = image.shape[:2]
    # Convert BGR to RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Scale the longer side to 640, keeping the aspect ratio
    if h > w:
        img = cv2.resize(image, (int(w / h * 640), 640))
    else:
        img = cv2.resize(image, (640, int(h / w * 640)))
    # Create a solid-color background
    background = np.zeros((640, 640, 3), np.uint8)
    background[:] = (255, 0, 0)
    # Center the resized image on the background
    x_offset = (640 - img.shape[1]) // 2
    y_offset = (640 - img.shape[0]) // 2
    background[y_offset:y_offset+img.shape[0], x_offset:x_offset+img.shape[1]] = img
    # Show the result (cv2.imshow expects BGR, so convert back for display only)
    cv2.imshow('Result', cv2.cvtColor(background, cv2.COLOR_RGB2BGR))
    cv2.waitKey(0)
    cv2.destroyAllWindows()
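The resize and centering arithmetic can be tried without loading an image. This is a small sketch that repeats the same calculation on plain numbers; letterbox_geometry is a hypothetical helper name introduced only for illustration.

```python
def letterbox_geometry(h, w, size=640):
    """Return (new_w, new_h, x_offset, y_offset) for an h*w image.

    Mirrors the arithmetic above: the longer side becomes `size`, the other
    side is scaled by the same factor, and the result is centered.
    """
    if h > w:
        new_w, new_h = int(w / h * size), size
    else:
        new_w, new_h = size, int(h / w * size)
    x_offset = (size - new_w) // 2
    y_offset = (size - new_h) // 2
    return new_w, new_h, x_offset, y_offset


print(letterbox_geometry(480, 640))   # (640, 480, 0, 80)
print(letterbox_geometry(1080, 720))  # (426, 640, 107, 0)
```

A 640*480 landscape frame therefore gets 80 rows of padding above and below, and step 7 has to subtract that padding again when mapping boxes back to the original image.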

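4. Normalizing the image data

Before inference, the padded image has to be converted to the layout the model expects: float32 values in the range 0~1, in NCHW order.

# Convert pixel values to float and normalize them to 0~1
# (use the 640*640 padded image from step 3, not the original image)
img = background.astype(np.float32) / 255.0
# HWC -> CHW
img = np.transpose(img, (2, 0, 1))
# CHW -> NCHW with a batch size of 1
img = np.expand_dims(img, axis=0)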
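Passing the wrong array to the model (for example the raw uint8 image instead of the normalized tensor) is an easy mistake at this point. The sketch below is one possible sanity check, not something the original program does; check_model_input is a hypothetical name and the 640 default is taken from the input size given in step 2.

```python
import numpy as np


def check_model_input(tensor, size=640):
    """Return a list of problems found in a tensor about to be fed to the model."""
    problems = []
    if tensor.dtype != np.float32:
        problems.append("dtype is {}, expected float32".format(tensor.dtype))
    if tensor.shape != (1, 3, size, size):
        problems.append("shape is {}, expected (1, 3, {}, {})".format(tensor.shape, size, size))
    if tensor.size and (tensor.min() < 0 or tensor.max() > 1):
        problems.append("values fall outside the range 0~1")
    return problems


# A raw camera-style frame fails all three checks
raw = np.full((480, 640, 3), 255, dtype=np.uint8)
for problem in check_model_input(raw):
    print(problem)

# A correctly prepared tensor passes
ready = np.zeros((1, 3, 640, 640), dtype=np.float32)
print(check_model_input(ready))  # []
```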

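5. Model inference

The prepared image data is passed to ONNX Runtime, which returns an n*8400 result array, where n = 4 + the number of defect classes (9 for this model).

# Run the ONNX model
onnx_model_path = "models\\best.onnx"
session = onnxruntime.InferenceSession(onnx_model_path)
# Feed the normalized NCHW array from step 4
inputs = {session.get_inputs()[0].name: img}
logits = session.run(None, inputs)[0]
# Reshape (1, 9, 8400) to (9, 8400)
output = logits.reshape((9, -1))
# Transpose to (8400, 9): one row per candidate box
output = output.transpose((1, 0))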
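The reshape above hard-codes 9 rows, which only fits a model with 5 classes. A sketch of a shape-independent variant follows, tested here on a fake array rather than a real model output; to_rows is a hypothetical helper name.

```python
import numpy as np


def to_rows(logits):
    """(1, 4 + nc, N) model output -> (N, 4 + nc), one row per candidate box."""
    if logits.ndim != 3 or logits.shape[0] != 1:
        raise ValueError("expected shape (1, C, N), got {}".format(logits.shape))
    # Drop the batch dimension, then swap rows and columns
    return logits[0].T


# Fake output with the same shape as the model in this post
fake = np.arange(1 * 9 * 8400, dtype=np.float32).reshape(1, 9, 8400)
rows = to_rows(fake)
print(rows.shape)  # (8400, 9)
```

For a (1, 9, 8400) input this gives the same values as the reshape and transpose pair above, and it keeps working if the number of classes changes.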

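6. Filtering the inference results

The 9*8400 array was transposed to 8400*9 to make it easier to process. The 9 columns of each row are the box center x, center y, width and height, followed by the confidence score of each of the 5 defect classes.

Only boxes whose best class score reaches the threshold are kept. In the code below, num is the 8400*9 array from step 5 and threshold is the confidence threshold (0.4 in the complete program).

# Rows that pass the threshold (box position and class scores)
selected = np.zeros((0, 9))
# Confidence of the best class for each kept row
Thresh = np.zeros((0, 1))
# Defect class of each kept row
typ = np.zeros((0, 1), dtype=int)
i = 0
# Loop over every row and keep those that reach the threshold
for n in range(num.shape[0]):
    # If any of columns 4~8 reaches the threshold
    if np.any(num[n, 4:] >= threshold):
        # Append this row to selected
        selected = np.vstack((selected, num[n]))
        # If column 4 holds the highest score, the defect class is 0
        if selected[i, 4] == max(selected[i, 4:]):
            typ = np.vstack((typ, 0))
            # Store the class-0 score as this row's confidence
            Thresh = np.vstack((Thresh, selected[i, 4]))
        elif selected[i, 5] == max(selected[i, 4:]):
            typ = np.vstack((typ, 1))
            Thresh = np.vstack((Thresh, selected[i, 5]))
        elif selected[i, 6] == max(selected[i, 4:]):
            typ = np.vstack((typ, 2))
            Thresh = np.vstack((Thresh, selected[i, 6]))
        elif selected[i, 7] == max(selected[i, 4:]):
            typ = np.vstack((typ, 3))
            Thresh = np.vstack((Thresh, selected[i, 7]))
        elif selected[i, 8] == max(selected[i, 4:]):
            typ = np.vstack((typ, 4))
            Thresh = np.vstack((Thresh, selected[i, 8]))
        i = i + 1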
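The same filtering can be written without a Python loop, which matters when it has to run on 8400 rows for every camera frame. This is a sketch of an equivalent NumPy formulation, not the author's code; select_vectorized is a hypothetical name, and the demo array uses only 2 classes to stay readable.

```python
import numpy as np


def select_vectorized(num, threshold):
    """Keep rows whose best class score is >= threshold.

    Returns (kept rows, class index per row, confidence per row).
    Works for any number of classes: columns 4 onwards are treated as scores.
    """
    scores = num[:, 4:]
    best_class = scores.argmax(axis=1)  # first column holding the maximum
    best_score = scores.max(axis=1)
    keep = best_score >= threshold
    return num[keep], best_class[keep], best_score[keep]


# cx, cy, w, h, score of class 0, score of class 1
demo = np.array([
    [100, 100, 20, 20, 0.10, 0.80],
    [200, 200, 30, 30, 0.05, 0.20],
    [300, 300, 40, 40, 0.90, 0.30],
], dtype=np.float32)

rows, classes, scores = select_vectorized(demo, 0.4)
print(classes)  # [1 0]
```

Like the if/elif chain, argmax picks the first class when two scores tie, so the results should match; it also avoids growing three arrays with np.vstack one row at a time.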

7. Restoring pixel coordinates

The kept boxes are mapped from the 640*640 padded image back to pixel coordinates in the original image.

# Columns 0~3 of selected are the box center x, center y, width and height
x_center = selected[:, 0]
y_center = selected[:, 1]
width = selected[:, 2]
height = selected[:, 3]
# Compute the top-left corner
x_min = x_center - width / 2
y_min = y_center - height / 2
# Create the bbox array and store the top-left corner, width and height
bbox = np.zeros((selected.shape[0], 6))
bbox[:, 0] = x_min
bbox[:, 1] = y_min
bbox[:, 2] = width
bbox[:, 3] = height
# Store the defect class and the confidence in columns 4 and 5
bbox[:, 4] = typ.flatten()
bbox[:, 5] = Thresh.flatten()
# Undo the scaling and the padding from step 3
if h > w:
    bbox[:, :4] *= (h/640)
    bbox[:, 0] -= (h/2-w/2)
else:
    bbox[:, :4] *= (w/640)
    bbox[:, 1] -= (w/2-h/2)
# Convert the 2-D array to a list of lists
my_list = [list(row) for row in bbox]
# Columns 0~4 become int, column 5 becomes float
for i in range(len(my_list)):
    for j in range(len(my_list[i])):
        if j < 5:
            my_list[i][j] = int(my_list[i][j])
        else:
            my_list[i][j] = float(my_list[i][j])
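A worked example makes the coordinate restoration easier to follow. The sketch below applies the same scale-then-shift arithmetic to a single box; restore_box is a hypothetical helper name introduced for illustration.

```python
def restore_box(cx, cy, bw, bh, h, w, size=640):
    """Map one center-format box from the size*size padded image to a
    top-left-format (x, y, w, h) box in the original h*w image."""
    # The longer side of the original image was scaled to `size`
    scale = max(h, w) / size
    x = (cx - bw / 2) * scale
    y = (cy - bh / 2) * scale
    # Remove the padding that was added along the shorter side
    if h > w:
        x -= (h - w) / 2
    else:
        y -= (w - h) / 2
    return int(x), int(y), int(bw * scale), int(bh * scale)


# 640*480 landscape frame: 80 rows of padding above the picture
print(restore_box(320, 320, 64, 64, h=480, w=640))   # (288, 208, 64, 64)
# 640*1280 portrait image: scale factor 2, padding left and right
print(restore_box(320, 320, 64, 64, h=1280, w=640))  # (256, 576, 128, 128)
```

In the first case the scale factor is 1, so only the 80 rows of top padding are subtracted from y.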

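8. Removing overlapping boxes

Duplicate boxes of the same defect class are removed when they overlap too much. In the code below, threshold is the overlap threshold (nms_threshold = 0.4 in the complete program): when 1 - IoU of two neighboring boxes falls below it, only the box with the higher confidence is kept.

i = 0
# my_list is the list of [x, y, w, h, class, confidence] boxes from step 7.
# The boxes are sorted by height and only neighbors in that order are compared.
bbox = sorted(my_list, key=lambda x: x[3])
while i < (len(bbox) - 1):
    if bbox[i][4] == bbox[i + 1][4]:
        # Intersection of the two boxes (clamped to 0 when they do not overlap)
        x1 = max(bbox[i][0], bbox[i + 1][0])
        y1 = max(bbox[i][1], bbox[i + 1][1])
        x2 = min(bbox[i][0] + bbox[i][2], bbox[i + 1][0] + bbox[i + 1][2])
        y2 = min(bbox[i][1] + bbox[i][3], bbox[i + 1][1] + bbox[i + 1][3])
        intersection = max(0, x2 - x1) * max(0, y2 - y1)
        area1 = bbox[i][2] * bbox[i][3]
        area2 = bbox[i + 1][2] * bbox[i + 1][3]
        union = area1 + area2 - intersection
        # nms is 1 - IoU: a small value means heavy overlap
        nms = 1 - intersection / union if union > 0 else 1
        # print(nms)
        # Drop the box with the lower confidence
        if nms < threshold and bbox[i][5] >= bbox[i + 1][5]:
            del bbox[i + 1]
        elif nms < threshold:
            del bbox[i]
        else:
            i = i + 1
    else:
        i = i + 1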
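The loop above only compares boxes that are neighbors after sorting by height, so two duplicates that are not adjacent can both survive. A common alternative is greedy non-maximum suppression, sketched below under the same [x, y, w, h, class, confidence] box layout. This is a swapped-in technique, not the author's method, and the names iou and greedy_nms are made up for this example. Note that it thresholds IoU directly, so an IoU threshold of 0.6 corresponds to the 1 - IoU threshold of 0.4 used in this post.

```python
def iou(a, b):
    """IoU of two boxes given as [x, y, w, h, ...] with (x, y) the top-left corner."""
    x1 = max(a[0], b[0])
    y1 = max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, iou_threshold):
    """Keep the most confident box, drop same-class boxes overlapping it, repeat."""
    remaining = sorted(boxes, key=lambda b: b[5], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [b for b in remaining
                     if b[4] != best[4] or iou(best, b) <= iou_threshold]
    return kept


boxes = [
    [10, 10, 100, 100, 0, 0.90],
    [12, 12, 100, 100, 0, 0.75],  # near-duplicate of the first box: removed
    [12, 12, 100, 100, 1, 0.60],  # same place, different class: kept
    [300, 300, 50, 50, 0, 0.80],  # far away: kept
]
print(len(greedy_nms(boxes, iou_threshold=0.6)))  # 3
```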

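9. Marking the defects

Using the processed position data, each defect is marked with a rectangle. Here img is the original image and bbox_list is the list of boxes left after step 8; colors and labels are the global variables set up in step 10.

# Loop over every box in the list
for bbox in bbox_list:
    # Top-left corner, width and height of the box
    x, y, w, h = bbox[:4]
    # Defect class and confidence, written at the top-left corner of the box
    defect_type = bbox[4]
    confidence = bbox[5]
    # Draw the rectangle
    cv2.rectangle(img, (x, y), (x + w, y + h), colors[defect_type], 2)
    str_confidence = "{:.3f}".format(confidence)
    cv2.putText(img, labels[defect_type] + ' ' + str_confidence, (x, y - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 2, colors[defect_type], 3)
# Show the image once all boxes are drawn (also when there are none)
cv2.imshow("result", img)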
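With the label drawn at (x, y - 5), the text of a box close to the top edge ends up outside the image. One way around this is to move the label inside the box when there is no room above it. The sketch below only computes the text and its position; the function name, the class name "scratch" and the text_height of 40 pixels are all assumptions for illustration (the real height depends on the font scale and could be measured with cv2.getTextSize).

```python
def label_text_and_origin(name, confidence, x, y, margin=5, text_height=40):
    """Return the label string and the (x, y) origin to pass to cv2.putText.

    text_height is a rough guess at the pixel height of the rendered text.
    """
    text = "{} {:.3f}".format(name, confidence)
    if y - margin >= text_height:
        text_y = y - margin                # enough room: draw above the box
    else:
        text_y = y + text_height + margin  # too close to the top: draw inside
    return text, (max(x, 0), text_y)


print(label_text_and_origin("scratch", 0.91234, 50, 200))  # ('scratch 0.912', (50, 195))
print(label_text_and_origin("scratch", 0.91234, 50, 10))   # ('scratch 0.912', (50, 55))
```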

10. Global variables

# Initialize the global variables
colors = []
# type.names holds one defect class name per line
with open('type.names', 'r') as f:
    labels = f.read().splitlines()
# Generate one random color per defect class
for _ in range(len(labels)):
    color = (random.randint(0, 255), random.randint(0, 255), random.randint(0, 255))
    colors.append(color)
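Because the colors are drawn from the global random generator, each defect class gets a different color every time the program starts. If stable colors are preferred, a seeded generator can be used instead. This is an optional sketch, not part of the original program; make_colors and the seed value are hypothetical.

```python
import random


def make_colors(num_classes, seed=0):
    """One (B, G, R) tuple per class, identical on every run for a given seed."""
    rng = random.Random(seed)  # private generator, leaves the global one untouched
    return [(rng.randint(0, 255), rng.randint(0, 255), rng.randint(0, 255))
            for _ in range(num_classes)]


colors = make_colors(5)
print(len(colors))  # 5
```

In the program above, make_colors(len(labels)) would replace the loop that fills colors.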

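11. Reading frames from the camera

# Open the camera: cv2.VideoCapture(device index)
cap = cv2.VideoCapture(0)
while True:
    # Grab one frame; cap.read() returns (success flag, image)
    ret_flag, Vshow = cap.read()
    # Stop if no frame could be read
    if not ret_flag:
        break
    ...
    # For continuous capture the waitKey delay must be 1 or more; Esc (key code 27) quits
    if cv2.waitKey(1) == 27:
        break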
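Since the point of this version is real-time detection, it helps to measure how many frames per second the loop actually reaches. The class below is a small sketch that uses only the standard library; FpsMeter is a hypothetical helper and is not part of the original program.

```python
import time
from collections import deque


class FpsMeter:
    """Average frames per second over the last `window` frames."""

    def __init__(self, window=30):
        self.stamps = deque(maxlen=window)

    def tick(self, now=None):
        """Call once per frame; returns the FPS estimate (0.0 until two frames are seen).

        `now` can be passed explicitly for testing; by default the current
        time from time.perf_counter() is used.
        """
        self.stamps.append(time.perf_counter() if now is None else now)
        if len(self.stamps) < 2:
            return 0.0
        elapsed = self.stamps[-1] - self.stamps[0]
        return (len(self.stamps) - 1) / elapsed if elapsed > 0 else 0.0


meter = FpsMeter()
meter.tick(now=0.0)
meter.tick(now=0.1)
print(meter.tick(now=0.2))  # 10.0
```

Inside the camera loop, calling meter.tick() once per frame (without the now argument) gives a running estimate that can be printed or drawn on the frame with cv2.putText.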

12. Complete code

import onnxruntime
import numpy as np
import tkinter
from tkinter import filedialog
import random
import cv2

def resize_image(image, h, w):
    # Convert BGR to RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Scale the longer side to 640, keeping the aspect ratio
    if h > w:
        img = cv2.resize(image, (int(w / h * 640), 640))
    else:
        img = cv2.resize(image, (640, int(h / w * 640)))
    # Create a solid-color background
    background = np.zeros((640, 640, 3), np.uint8)
    background[:] = (255, 0, 0)
    # Center the resized image on the background
    x_offset = (640 - img.shape[1]) // 2
    y_offset = (640 - img.shape[0]) // 2
    background[y_offset:y_offset+img.shape[0], x_offset:x_offset+img.shape[1]] = img
    return background

def nchw_image(image):
    # Convert pixel values to float and normalize them to 0~1
    img = image.astype(np.float32) / 255.0
    # HWC -> CHW
    img = np.transpose(img, (2, 0, 1))
    # CHW -> NCHW with a batch size of 1
    img = np.expand_dims(img, axis=0)
    return img

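# Loaded sessions, keyed by model path, so the model is loaded once
# instead of on every camera frame
_sessions = {}

def onnx(image, onnx_model_path):
    # Create the ONNX Runtime session on first use and reuse it afterwards
    if onnx_model_path not in _sessions:
        _sessions[onnx_model_path] = onnxruntime.InferenceSession(onnx_model_path)
    session = _sessions[onnx_model_path]
    inputs = {session.get_inputs()[0].name: image}
    logits = session.run(None, inputs)[0]
    # Reshape (1, 9, 8400) to (9, 8400)
    output = logits.reshape((9, -1))
    # Transpose to (8400, 9): one row per candidate box
    output = output.transpose((1, 0))
    return output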

def select(num, threshold):
    # Rows that pass the threshold (box position and class scores)
    selected = np.zeros((0, 9))
    # Confidence of the best class for each kept row
    Thresh = np.zeros((0, 1))
    # Defect class of each kept row
    typ = np.zeros((0, 1), dtype=int)
    i = 0
    # Loop over every row and keep those that reach the threshold
    for n in range(num.shape[0]):
        # If any of columns 4~8 reaches the threshold
        if np.any(num[n, 4:] >= threshold):
            # Append this row to selected
            selected = np.vstack((selected, num[n]))
            # If column 4 holds the highest score, the defect class is 0
            if selected[i, 4] == max(selected[i, 4:]):
                typ = np.vstack((typ, 0))
                # Store the class-0 score as this row's confidence
                Thresh = np.vstack((Thresh, selected[i, 4]))
            elif selected[i, 5] == max(selected[i, 4:]):
                typ = np.vstack((typ, 1))
                Thresh = np.vstack((Thresh, selected[i, 5]))
            elif selected[i, 6] == max(selected[i, 4:]):
                typ = np.vstack((typ, 2))
                Thresh = np.vstack((Thresh, selected[i, 6]))
            elif selected[i, 7] == max(selected[i, 4:]):
                typ = np.vstack((typ, 3))
                Thresh = np.vstack((Thresh, selected[i, 7]))
            elif selected[i, 8] == max(selected[i, 4:]):
                typ = np.vstack((typ, 4))
                Thresh = np.vstack((Thresh, selected[i, 8]))
            i = i + 1
    # Flatten to 1-D so the values can be assigned to bbox columns later
    typ = typ.flatten()
    Thresh = Thresh.flatten()
    return selected, typ, Thresh

def back(selected, typ, thresh, h, w):
    # Columns 0~3 of selected are the box center x, center y, width and height
    x_center = selected[:, 0]
    y_center = selected[:, 1]
    width = selected[:, 2]
    height = selected[:, 3]
    # Compute the top-left corner
    x_min = x_center - width / 2
    y_min = y_center - height / 2
    # Create the bbox array and store the top-left corner, width and height
    bbox = np.zeros((selected.shape[0], 6))
    bbox[:, 0] = x_min
    bbox[:, 1] = y_min
    bbox[:, 2] = width
    bbox[:, 3] = height
    # Store the defect class and the confidence in columns 4 and 5
    bbox[:, 4] = typ
    bbox[:, 5] = thresh
    # Undo the scaling and the padding applied in resize_image
    if h > w:
        bbox[:, :4] *= (h/640)
        bbox[:, 0] -= (h/2-w/2)
    else:
        bbox[:, :4] *= (w/640)
        bbox[:, 1] -= (w/2-h/2)
    # Convert the 2-D array to a list of lists
    my_list = [list(row) for row in bbox]
    # Columns 0~4 become int, column 5 becomes float
    for i in range(len(my_list)):
        for j in range(len(my_list[i])):
            if j < 5:
                my_list[i][j] = int(my_list[i][j])
            else:
                my_list[i][j] = float(my_list[i][j])
    return my_list

def nms_box(bbox, threshold):
    i = 0
    # Sort by box height; only neighbors in this order are compared
    bbox = sorted(bbox, key=lambda x: x[3])
    while i < (len(bbox) - 1):
        if bbox[i][4] == bbox[i + 1][4]:
            # Intersection of the two boxes (clamped to 0 when they do not overlap)
            x1 = max(bbox[i][0], bbox[i + 1][0])
            y1 = max(bbox[i][1], bbox[i + 1][1])
            x2 = min(bbox[i][0] + bbox[i][2], bbox[i + 1][0] + bbox[i + 1][2])
            y2 = min(bbox[i][1] + bbox[i][3], bbox[i + 1][1] + bbox[i + 1][3])
            intersection = max(0, x2 - x1) * max(0, y2 - y1)
            area1 = bbox[i][2] * bbox[i][3]
            area2 = bbox[i + 1][2] * bbox[i + 1][3]
            union = area1 + area2 - intersection
            # nms is 1 - IoU: a small value means heavy overlap
            nms = 1 - intersection / union if union > 0 else 1
            # print(nms)
            # Drop the box with the lower confidence
            if nms < threshold and bbox[i][5] >= bbox[i + 1][5]:
                del bbox[i + 1]
            elif nms < threshold:
                del bbox[i]
            else:
                i = i + 1
        else:
            i = i + 1
    return bbox

def draw_bbox(img, bbox_list):
    global colors
    global labels
    # Loop over every box in the list
    for bbox in bbox_list:
        # Top-left corner, width and height of the box
        x, y, w, h = bbox[:4]
        # Defect class and confidence, written at the top-left corner of the box
        defect_type = bbox[4]
        confidence = bbox[5]
        # Draw the rectangle
        cv2.rectangle(img, (x, y), (x + w, y + h), colors[defect_type], 2)
        str_confidence = "{:.3f}".format(confidence)
        cv2.putText(img, labels[defect_type] + ' ' + str_confidence, (x, y - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 2, colors[defect_type], 3)
    # Show the frame once all boxes are drawn, so the window also updates
    # when no defect was found
    cv2.imshow("result", img)

# Initialize the global variables
colors = []
# type.names holds one defect class name per line
with open('type.names', 'r') as f:
    labels = f.read().splitlines()
# Generate one random color per defect class
for _ in range(len(labels)):
    color = (random.randint(0, 255), random.randint(0, 255), random.randint(0, 255))
    colors.append(color)

if __name__ == "__main__":
    # Open the camera: cv2.VideoCapture(device index)
    cap = cv2.VideoCapture(0)
    while True:
        # Grab one frame; cap.read() returns (success flag, image)
        ret_flag, Vshow = cap.read()
        # Stop if no frame could be read
        if not ret_flag:
            break
        # Get the frame size
        y, x = Vshow.shape[:2]
        # Resize with the aspect ratio preserved
        image0 = resize_image(Vshow, y, x)
        # Normalize
        image1 = nchw_image(image0)
        # Run inference
        onnx_model_path = "models\\best.onnx"
        result0 = onnx(image1, onnx_model_path)
        # Confidence threshold
        threshold = 0.4
        # Filter the results: kept rows, class per row and confidence per row
        # (the three outputs correspond one to one)
        select1, typ, thresh = select(result0, threshold)
        # Map the box positions back to the original frame
        result1 = back(select1, typ, thresh, y, x)
        # Remove overlapping boxes
        nms_threshold = 0.4
        result2 = nms_box(result1, nms_threshold)
        # print(result2)
        # Draw the defect boxes
        draw_bbox(Vshow, result2)
        # For continuous capture the waitKey delay must be 1 or more; Esc (key code 27) quits
        if cv2.waitKey(1) == 27:
            break
    # Close the display window
    cv2.destroyAllWindows()
    # Release the camera
    cap.release()
