1、 the YOLO model (YOLO ,you only look once)

(1)We will use 5 anchor boxes. So you can think of the YOLO architecture as the following: IMAGE (m, 608, 608, 3) -> DEEP CNN -> ENCODING (m, 19, 19, 5, 85).

(2)Training a YOLO model takes a very long time and requires a fairly large dataset of labelled bounding boxes for a large range of target classes. We are going to load an existing pretrained Keras YOLO model stored in "yolo.h5". (These weights come from the official YOLO website, and were converted using a function written by Allan Zelener.)

(3)model summry (模型信息)

Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
input_1 (InputLayer) (None, 608, 608, 3) 0
____________________________________________________________________________________________________
conv2d_1 (Conv2D) (None, 608, 608, 32) 864 input_1[0][0]
____________________________________________________________________________________________________
batch_normalization_1 (BatchNorm (None, 608, 608, 32) 128 conv2d_1[0][0]
____________________________________________________________________________________________________
leaky_re_lu_1 (LeakyReLU) (None, 608, 608, 32) 0 batch_normalization_1[0][0]
____________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D) (None, 304, 304, 32) 0 leaky_re_lu_1[0][0]
____________________________________________________________________________________________________
conv2d_2 (Conv2D) (None, 304, 304, 64) 18432 max_pooling2d_1[0][0]
____________________________________________________________________________________________________
batch_normalization_2 (BatchNorm (None, 304, 304, 64) 256 conv2d_2[0][0]
____________________________________________________________________________________________________
leaky_re_lu_2 (LeakyReLU) (None, 304, 304, 64) 0 batch_normalization_2[0][0]
____________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D) (None, 152, 152, 64) 0 leaky_re_lu_2[0][0]
____________________________________________________________________________________________________
conv2d_3 (Conv2D) (None, 152, 152, 128) 73728 max_pooling2d_2[0][0]
____________________________________________________________________________________________________
batch_normalization_3 (BatchNorm (None, 152, 152, 128) 512 conv2d_3[0][0]
____________________________________________________________________________________________________
leaky_re_lu_3 (LeakyReLU) (None, 152, 152, 128) 0 batch_normalization_3[0][0]
____________________________________________________________________________________________________
conv2d_4 (Conv2D) (None, 152, 152, 64) 8192 leaky_re_lu_3[0][0]
____________________________________________________________________________________________________
batch_normalization_4 (BatchNorm (None, 152, 152, 64) 256 conv2d_4[0][0]
____________________________________________________________________________________________________
leaky_re_lu_4 (LeakyReLU) (None, 152, 152, 64) 0 batch_normalization_4[0][0]
____________________________________________________________________________________________________
conv2d_5 (Conv2D) (None, 152, 152, 128) 73728 leaky_re_lu_4[0][0]
____________________________________________________________________________________________________
batch_normalization_5 (BatchNorm (None, 152, 152, 128) 512 conv2d_5[0][0]
____________________________________________________________________________________________________
leaky_re_lu_5 (LeakyReLU) (None, 152, 152, 128) 0 batch_normalization_5[0][0]
____________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D) (None, 76, 76, 128) 0 leaky_re_lu_5[0][0]
____________________________________________________________________________________________________
conv2d_6 (Conv2D) (None, 76, 76, 256) 294912 max_pooling2d_3[0][0]
____________________________________________________________________________________________________
batch_normalization_6 (BatchNorm (None, 76, 76, 256) 1024 conv2d_6[0][0]
____________________________________________________________________________________________________
leaky_re_lu_6 (LeakyReLU) (None, 76, 76, 256) 0 batch_normalization_6[0][0]
____________________________________________________________________________________________________
conv2d_7 (Conv2D) (None, 76, 76, 128) 32768 leaky_re_lu_6[0][0]
____________________________________________________________________________________________________
batch_normalization_7 (BatchNorm (None, 76, 76, 128) 512 conv2d_7[0][0]
____________________________________________________________________________________________________
leaky_re_lu_7 (LeakyReLU) (None, 76, 76, 128) 0 batch_normalization_7[0][0]
____________________________________________________________________________________________________
conv2d_8 (Conv2D) (None, 76, 76, 256) 294912 leaky_re_lu_7[0][0]
____________________________________________________________________________________________________
batch_normalization_8 (BatchNorm (None, 76, 76, 256) 1024 conv2d_8[0][0]
____________________________________________________________________________________________________
leaky_re_lu_8 (LeakyReLU) (None, 76, 76, 256) 0 batch_normalization_8[0][0]
____________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D) (None, 38, 38, 256) 0 leaky_re_lu_8[0][0]
____________________________________________________________________________________________________
conv2d_9 (Conv2D) (None, 38, 38, 512) 1179648 max_pooling2d_4[0][0]
____________________________________________________________________________________________________
batch_normalization_9 (BatchNorm (None, 38, 38, 512) 2048 conv2d_9[0][0]
____________________________________________________________________________________________________
leaky_re_lu_9 (LeakyReLU) (None, 38, 38, 512) 0 batch_normalization_9[0][0]
____________________________________________________________________________________________________
conv2d_10 (Conv2D) (None, 38, 38, 256) 131072 leaky_re_lu_9[0][0]
____________________________________________________________________________________________________
batch_normalization_10 (BatchNor (None, 38, 38, 256) 1024 conv2d_10[0][0]
____________________________________________________________________________________________________
leaky_re_lu_10 (LeakyReLU) (None, 38, 38, 256) 0 batch_normalization_10[0][0]
____________________________________________________________________________________________________
conv2d_11 (Conv2D) (None, 38, 38, 512) 1179648 leaky_re_lu_10[0][0]
____________________________________________________________________________________________________
batch_normalization_11 (BatchNor (None, 38, 38, 512) 2048 conv2d_11[0][0]
____________________________________________________________________________________________________
leaky_re_lu_11 (LeakyReLU) (None, 38, 38, 512) 0 batch_normalization_11[0][0]
____________________________________________________________________________________________________
conv2d_12 (Conv2D) (None, 38, 38, 256) 131072 leaky_re_lu_11[0][0]
____________________________________________________________________________________________________
batch_normalization_12 (BatchNor (None, 38, 38, 256) 1024 conv2d_12[0][0]
____________________________________________________________________________________________________
leaky_re_lu_12 (LeakyReLU) (None, 38, 38, 256) 0 batch_normalization_12[0][0]
____________________________________________________________________________________________________
conv2d_13 (Conv2D) (None, 38, 38, 512) 1179648 leaky_re_lu_12[0][0]
____________________________________________________________________________________________________
batch_normalization_13 (BatchNor (None, 38, 38, 512) 2048 conv2d_13[0][0]
____________________________________________________________________________________________________
leaky_re_lu_13 (LeakyReLU) (None, 38, 38, 512) 0 batch_normalization_13[0][0]
____________________________________________________________________________________________________
max_pooling2d_5 (MaxPooling2D) (None, 19, 19, 512) 0 leaky_re_lu_13[0][0]
____________________________________________________________________________________________________
conv2d_14 (Conv2D) (None, 19, 19, 1024) 4718592 max_pooling2d_5[0][0]
____________________________________________________________________________________________________
batch_normalization_14 (BatchNor (None, 19, 19, 1024) 4096 conv2d_14[0][0]
____________________________________________________________________________________________________
leaky_re_lu_14 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_14[0][0]
____________________________________________________________________________________________________
conv2d_15 (Conv2D) (None, 19, 19, 512) 524288 leaky_re_lu_14[0][0]
____________________________________________________________________________________________________
batch_normalization_15 (BatchNor (None, 19, 19, 512) 2048 conv2d_15[0][0]
____________________________________________________________________________________________________
leaky_re_lu_15 (LeakyReLU) (None, 19, 19, 512) 0 batch_normalization_15[0][0]
____________________________________________________________________________________________________
conv2d_16 (Conv2D) (None, 19, 19, 1024) 4718592 leaky_re_lu_15[0][0]
____________________________________________________________________________________________________
batch_normalization_16 (BatchNor (None, 19, 19, 1024) 4096 conv2d_16[0][0]
____________________________________________________________________________________________________
leaky_re_lu_16 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_16[0][0]
____________________________________________________________________________________________________
conv2d_17 (Conv2D) (None, 19, 19, 512) 524288 leaky_re_lu_16[0][0]
____________________________________________________________________________________________________
batch_normalization_17 (BatchNor (None, 19, 19, 512) 2048 conv2d_17[0][0]
____________________________________________________________________________________________________
leaky_re_lu_17 (LeakyReLU) (None, 19, 19, 512) 0 batch_normalization_17[0][0]
____________________________________________________________________________________________________
conv2d_18 (Conv2D) (None, 19, 19, 1024) 4718592 leaky_re_lu_17[0][0]
____________________________________________________________________________________________________
batch_normalization_18 (BatchNor (None, 19, 19, 1024) 4096 conv2d_18[0][0]
____________________________________________________________________________________________________
leaky_re_lu_18 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_18[0][0]
____________________________________________________________________________________________________
conv2d_19 (Conv2D) (None, 19, 19, 1024) 9437184 leaky_re_lu_18[0][0]
____________________________________________________________________________________________________
batch_normalization_19 (BatchNor (None, 19, 19, 1024) 4096 conv2d_19[0][0]
____________________________________________________________________________________________________
conv2d_21 (Conv2D) (None, 38, 38, 64) 32768 leaky_re_lu_13[0][0]
____________________________________________________________________________________________________
leaky_re_lu_19 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_19[0][0]
____________________________________________________________________________________________________
batch_normalization_21 (BatchNor (None, 38, 38, 64) 256 conv2d_21[0][0]
____________________________________________________________________________________________________
conv2d_20 (Conv2D) (None, 19, 19, 1024) 9437184 leaky_re_lu_19[0][0]
____________________________________________________________________________________________________
leaky_re_lu_21 (LeakyReLU) (None, 38, 38, 64) 0 batch_normalization_21[0][0]
____________________________________________________________________________________________________
batch_normalization_20 (BatchNor (None, 19, 19, 1024) 4096 conv2d_20[0][0]
____________________________________________________________________________________________________
space_to_depth_x2 (Lambda) (None, 19, 19, 256) 0 leaky_re_lu_21[0][0]
____________________________________________________________________________________________________
leaky_re_lu_20 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_20[0][0]
____________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 19, 19, 1280) 0 space_to_depth_x2[0][0]
leaky_re_lu_20[0][0]
____________________________________________________________________________________________________
conv2d_22 (Conv2D) (None, 19, 19, 1024) 11796480 concatenate_1[0][0]
____________________________________________________________________________________________________
batch_normalization_22 (BatchNor (None, 19, 19, 1024) 4096 conv2d_22[0][0]
____________________________________________________________________________________________________
leaky_re_lu_22 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_22[0][0]
____________________________________________________________________________________________________
conv2d_23 (Conv2D) (None, 19, 19, 425) 435625 leaky_re_lu_22[0][0]
====================================================================================================
Total params: 50,983,561
Trainable params: 50,962,889
Non-trainable params: 20,672

  

2、输入输出数据类型

(1)输入数据:(m, 608, 608, 3)

(2)输出数据:(m, 19, 19, 5, 85)

3、检测过程

(1)Score-thresholding:throw away boxes that have detected a class with a score less than the threshold(0.4)

(2)Non-max suppression: Compute the Intersection over Union and avoid selecting overlapping boxes

  • Select the box that has the highest score.
  • Compute its overlap with all other boxes, and remove boxes that overlap it more than iou_threshold.
  • Go back to step 1 and iterate until there's no more boxes with a lower score than the current selected box.

This will remove all boxes that have a large overlap with the selected boxes. Only the "best" boxes remain.

4、总结

    • YOLO is a state-of-the-art object detection model that is fast and accurate
    • It runs an input image through a CNN which outputs a 19x19x5x85 dimensional volume.
    • The encoding can be seen as a grid where each of the 19x19 cells contains information about 5 boxes.
    • You filter through all the boxes using non-max suppression. Specifically:
      • Score thresholding on the probability of detecting a class to keep only accurate (high probability) boxes
      • Intersection over Union (IoU) thresholding to eliminate overlapping boxes
    • Because training a YOLO model from randomly initialized weights is non-trivial and requires a large dataset as well as lot of computation, we used previously trained model parameters in this exercise. If you wish, you can also try fine-tuning the YOLO model with your own dataset, though this would be a fairly non-trivial exercise.

项目总结三:目标检测项目(Car detection with YOLOv2)的更多相关文章

  1. 明火烟雾目标检测项目部署(YoloV5+Flask)

    明火烟雾目标检测项目部署 目录 明火烟雾目标检测项目部署 1. 拉取Docker PyToch镜像 2. 配置项目环境 2.1 更换软件源 2.2 下载vim 2.3 解决vim中文乱码问题 3. 运 ...

  2. 多尺度目标检测 Multiscale Object Detection

    多尺度目标检测 Multiscale Object Detection 我们在输入图像的每个像素上生成多个锚框.这些定位框用于对输入图像的不同区域进行采样.但是,如果锚定框是以图像的每个像素为中心生成 ...

  3. 目标检测--Scalable Object Detection using Deep Neural Networks(CVPR 2014)

    Scalable Object Detection using Deep Neural Networks 作者: Dumitru Erhan, Christian Szegedy, Alexander ...

  4. 吴恩达《深度学习》第四门课(3)目标检测(Object detection)

    3.1目标定位 (1)案例1:在构建自动驾驶时,需要定位出照片中的行人.汽车.摩托车和背景,即四个类别.可以设置这样的输出,首先第一个元素pc=1表示有要定位的物体,那么用另外四个输出元素表示定位框的 ...

  5. 目标检测 - Tensorflow Object Detection API

    一. 找到最好的工具 "工欲善其事,必先利其器",如果你想找一个深度学习框架来解决深度学习问题,TensorFlow 就是你的不二之选,究其原因,也不必过多解释,看过其优雅的代码架 ...

  6. 基于深度学习的目标检测(object detection)—— rcnn、fast-rcnn、faster-rcnn

    模型和方法: 在深度学习求解目标检测问题之前的主流 detection 方法是,DPM(Deformable parts models), 度量与评价: mAP:mean Average Precis ...

  7. spring boot快速入门 1 :创建项目、 三种启动项目方式

    准备工作: (转载)IDEA新建项目时,没有Spring Initializr选项 最近开始使用IDEA作为开发工具,然后也是打算开始学习使用spring boot. 看着博客来进行操作上手sprin ...

  8. 目标检测(一)RCNN--Rich feature hierarchies for accurate object detection and semantic segmentation(v5)

    作者:Ross Girshick,Jeff Donahue,Trevor Darrell,Jitendra Malik 该论文提出了一种简单且可扩展的检测算法,在VOC2012数据集上取得的mAP比当 ...

  9. 关于目标检测 Object detection

    NO1.目标检测 (分类+定位) 目标检测(Object Detection)是图像分类的延伸,除了分类任务,还要给定多个检测目标的坐标位置.      NO2.目标检测的发展 R-CNN是最早基于C ...

随机推荐

  1. LoadRunner录制脚本时没有响应——无法启动浏览器问题总结

    1.ie浏览器去掉启用第三方浏览器扩展 2.loadrunner11 键盘F4,在browser Emulation点击change,在弹出的提示框中Browser version 选择8.0,pla ...

  2. call和apply(学习笔记)

    call() call() 方法调用一个函数, 其具有一个指定的this值和分别地提供的参数(参数的列表). 语法: function.call(thisArg, arg1, arg2, ...) 参 ...

  3. Vue下载页面显示内容

    摘抄自 https://www.cnblogs.com/zhangtianqi520/p/9323873.html 先下载所需依赖 npm install --save html2canvas npm ...

  4. java中的接口与继承,接口的例子讲解

    extends 继承类:implements 实现接口. 简单说: 1.extends是继承父类,只要那个类不是声明为final或者那个类定义为abstract的就能继承, 2.JAVA中不支持多重继 ...

  5. 记录做一个类似于探探的卡片式布局的Recycleview有数据一直不显示

    使用了别人的项目 https://github.com/JerryChan123/ReSwipeCard/blob/master/README_zh.md 之前找recycleview有数据不显示的原 ...

  6. Python Flask学习笔记之Hello World

    Python Flask学习笔记之Hello World 安装virtualenv,配置Flask开发环境 virtualenv 虚拟环境是Python解释器的一个私有副本,在这个环境中可以安装私有包 ...

  7. Prometheus Alert Rules with Some Metrics

    Using Prometheus as a monitor system, it is quite efficent. The most important one is that alert tem ...

  8. post推送消息时的乱码

    URL realUrl = new URL(url); // 打开和URL之间的连接 URLConnection conn = realUrl.openConnection(); // 设置通用的请求 ...

  9. scrapy 中 xpath 用string方法提取带有空格符解决方法

    注释掉的是刚开始的代码,匹配的全是带空格的,replace替换不了空格 后面加上了normalize-space()  匹配到的文本内容变成了可replace 问题解决

  10. Git .gitignore文件说明

    参见:https://book.git-scm.com/book/zh/v2/Git-%E5%9F%BA%E7%A1%80-%E8%AE%B0%E5%BD%95%E6%AF%8F%E6%AC%A1%E ...