1、 the YOLO model (YOLO ,you only look once)

(1)We will use 5 anchor boxes. So you can think of the YOLO architecture as the following: IMAGE (m, 608, 608, 3) -> DEEP CNN -> ENCODING (m, 19, 19, 5, 85).

(2)Training a YOLO model takes a very long time and requires a fairly large dataset of labelled bounding boxes for a large range of target classes. We are going to load an existing pretrained Keras YOLO model stored in "yolo.h5". (These weights come from the official YOLO website, and were converted using a function written by Allan Zelener.)

(3)model summry (模型信息)

Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
input_1 (InputLayer) (None, 608, 608, 3) 0
____________________________________________________________________________________________________
conv2d_1 (Conv2D) (None, 608, 608, 32) 864 input_1[0][0]
____________________________________________________________________________________________________
batch_normalization_1 (BatchNorm (None, 608, 608, 32) 128 conv2d_1[0][0]
____________________________________________________________________________________________________
leaky_re_lu_1 (LeakyReLU) (None, 608, 608, 32) 0 batch_normalization_1[0][0]
____________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D) (None, 304, 304, 32) 0 leaky_re_lu_1[0][0]
____________________________________________________________________________________________________
conv2d_2 (Conv2D) (None, 304, 304, 64) 18432 max_pooling2d_1[0][0]
____________________________________________________________________________________________________
batch_normalization_2 (BatchNorm (None, 304, 304, 64) 256 conv2d_2[0][0]
____________________________________________________________________________________________________
leaky_re_lu_2 (LeakyReLU) (None, 304, 304, 64) 0 batch_normalization_2[0][0]
____________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D) (None, 152, 152, 64) 0 leaky_re_lu_2[0][0]
____________________________________________________________________________________________________
conv2d_3 (Conv2D) (None, 152, 152, 128) 73728 max_pooling2d_2[0][0]
____________________________________________________________________________________________________
batch_normalization_3 (BatchNorm (None, 152, 152, 128) 512 conv2d_3[0][0]
____________________________________________________________________________________________________
leaky_re_lu_3 (LeakyReLU) (None, 152, 152, 128) 0 batch_normalization_3[0][0]
____________________________________________________________________________________________________
conv2d_4 (Conv2D) (None, 152, 152, 64) 8192 leaky_re_lu_3[0][0]
____________________________________________________________________________________________________
batch_normalization_4 (BatchNorm (None, 152, 152, 64) 256 conv2d_4[0][0]
____________________________________________________________________________________________________
leaky_re_lu_4 (LeakyReLU) (None, 152, 152, 64) 0 batch_normalization_4[0][0]
____________________________________________________________________________________________________
conv2d_5 (Conv2D) (None, 152, 152, 128) 73728 leaky_re_lu_4[0][0]
____________________________________________________________________________________________________
batch_normalization_5 (BatchNorm (None, 152, 152, 128) 512 conv2d_5[0][0]
____________________________________________________________________________________________________
leaky_re_lu_5 (LeakyReLU) (None, 152, 152, 128) 0 batch_normalization_5[0][0]
____________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D) (None, 76, 76, 128) 0 leaky_re_lu_5[0][0]
____________________________________________________________________________________________________
conv2d_6 (Conv2D) (None, 76, 76, 256) 294912 max_pooling2d_3[0][0]
____________________________________________________________________________________________________
batch_normalization_6 (BatchNorm (None, 76, 76, 256) 1024 conv2d_6[0][0]
____________________________________________________________________________________________________
leaky_re_lu_6 (LeakyReLU) (None, 76, 76, 256) 0 batch_normalization_6[0][0]
____________________________________________________________________________________________________
conv2d_7 (Conv2D) (None, 76, 76, 128) 32768 leaky_re_lu_6[0][0]
____________________________________________________________________________________________________
batch_normalization_7 (BatchNorm (None, 76, 76, 128) 512 conv2d_7[0][0]
____________________________________________________________________________________________________
leaky_re_lu_7 (LeakyReLU) (None, 76, 76, 128) 0 batch_normalization_7[0][0]
____________________________________________________________________________________________________
conv2d_8 (Conv2D) (None, 76, 76, 256) 294912 leaky_re_lu_7[0][0]
____________________________________________________________________________________________________
batch_normalization_8 (BatchNorm (None, 76, 76, 256) 1024 conv2d_8[0][0]
____________________________________________________________________________________________________
leaky_re_lu_8 (LeakyReLU) (None, 76, 76, 256) 0 batch_normalization_8[0][0]
____________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D) (None, 38, 38, 256) 0 leaky_re_lu_8[0][0]
____________________________________________________________________________________________________
conv2d_9 (Conv2D) (None, 38, 38, 512) 1179648 max_pooling2d_4[0][0]
____________________________________________________________________________________________________
batch_normalization_9 (BatchNorm (None, 38, 38, 512) 2048 conv2d_9[0][0]
____________________________________________________________________________________________________
leaky_re_lu_9 (LeakyReLU) (None, 38, 38, 512) 0 batch_normalization_9[0][0]
____________________________________________________________________________________________________
conv2d_10 (Conv2D) (None, 38, 38, 256) 131072 leaky_re_lu_9[0][0]
____________________________________________________________________________________________________
batch_normalization_10 (BatchNor (None, 38, 38, 256) 1024 conv2d_10[0][0]
____________________________________________________________________________________________________
leaky_re_lu_10 (LeakyReLU) (None, 38, 38, 256) 0 batch_normalization_10[0][0]
____________________________________________________________________________________________________
conv2d_11 (Conv2D) (None, 38, 38, 512) 1179648 leaky_re_lu_10[0][0]
____________________________________________________________________________________________________
batch_normalization_11 (BatchNor (None, 38, 38, 512) 2048 conv2d_11[0][0]
____________________________________________________________________________________________________
leaky_re_lu_11 (LeakyReLU) (None, 38, 38, 512) 0 batch_normalization_11[0][0]
____________________________________________________________________________________________________
conv2d_12 (Conv2D) (None, 38, 38, 256) 131072 leaky_re_lu_11[0][0]
____________________________________________________________________________________________________
batch_normalization_12 (BatchNor (None, 38, 38, 256) 1024 conv2d_12[0][0]
____________________________________________________________________________________________________
leaky_re_lu_12 (LeakyReLU) (None, 38, 38, 256) 0 batch_normalization_12[0][0]
____________________________________________________________________________________________________
conv2d_13 (Conv2D) (None, 38, 38, 512) 1179648 leaky_re_lu_12[0][0]
____________________________________________________________________________________________________
batch_normalization_13 (BatchNor (None, 38, 38, 512) 2048 conv2d_13[0][0]
____________________________________________________________________________________________________
leaky_re_lu_13 (LeakyReLU) (None, 38, 38, 512) 0 batch_normalization_13[0][0]
____________________________________________________________________________________________________
max_pooling2d_5 (MaxPooling2D) (None, 19, 19, 512) 0 leaky_re_lu_13[0][0]
____________________________________________________________________________________________________
conv2d_14 (Conv2D) (None, 19, 19, 1024) 4718592 max_pooling2d_5[0][0]
____________________________________________________________________________________________________
batch_normalization_14 (BatchNor (None, 19, 19, 1024) 4096 conv2d_14[0][0]
____________________________________________________________________________________________________
leaky_re_lu_14 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_14[0][0]
____________________________________________________________________________________________________
conv2d_15 (Conv2D) (None, 19, 19, 512) 524288 leaky_re_lu_14[0][0]
____________________________________________________________________________________________________
batch_normalization_15 (BatchNor (None, 19, 19, 512) 2048 conv2d_15[0][0]
____________________________________________________________________________________________________
leaky_re_lu_15 (LeakyReLU) (None, 19, 19, 512) 0 batch_normalization_15[0][0]
____________________________________________________________________________________________________
conv2d_16 (Conv2D) (None, 19, 19, 1024) 4718592 leaky_re_lu_15[0][0]
____________________________________________________________________________________________________
batch_normalization_16 (BatchNor (None, 19, 19, 1024) 4096 conv2d_16[0][0]
____________________________________________________________________________________________________
leaky_re_lu_16 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_16[0][0]
____________________________________________________________________________________________________
conv2d_17 (Conv2D) (None, 19, 19, 512) 524288 leaky_re_lu_16[0][0]
____________________________________________________________________________________________________
batch_normalization_17 (BatchNor (None, 19, 19, 512) 2048 conv2d_17[0][0]
____________________________________________________________________________________________________
leaky_re_lu_17 (LeakyReLU) (None, 19, 19, 512) 0 batch_normalization_17[0][0]
____________________________________________________________________________________________________
conv2d_18 (Conv2D) (None, 19, 19, 1024) 4718592 leaky_re_lu_17[0][0]
____________________________________________________________________________________________________
batch_normalization_18 (BatchNor (None, 19, 19, 1024) 4096 conv2d_18[0][0]
____________________________________________________________________________________________________
leaky_re_lu_18 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_18[0][0]
____________________________________________________________________________________________________
conv2d_19 (Conv2D) (None, 19, 19, 1024) 9437184 leaky_re_lu_18[0][0]
____________________________________________________________________________________________________
batch_normalization_19 (BatchNor (None, 19, 19, 1024) 4096 conv2d_19[0][0]
____________________________________________________________________________________________________
conv2d_21 (Conv2D) (None, 38, 38, 64) 32768 leaky_re_lu_13[0][0]
____________________________________________________________________________________________________
leaky_re_lu_19 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_19[0][0]
____________________________________________________________________________________________________
batch_normalization_21 (BatchNor (None, 38, 38, 64) 256 conv2d_21[0][0]
____________________________________________________________________________________________________
conv2d_20 (Conv2D) (None, 19, 19, 1024) 9437184 leaky_re_lu_19[0][0]
____________________________________________________________________________________________________
leaky_re_lu_21 (LeakyReLU) (None, 38, 38, 64) 0 batch_normalization_21[0][0]
____________________________________________________________________________________________________
batch_normalization_20 (BatchNor (None, 19, 19, 1024) 4096 conv2d_20[0][0]
____________________________________________________________________________________________________
space_to_depth_x2 (Lambda) (None, 19, 19, 256) 0 leaky_re_lu_21[0][0]
____________________________________________________________________________________________________
leaky_re_lu_20 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_20[0][0]
____________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 19, 19, 1280) 0 space_to_depth_x2[0][0]
leaky_re_lu_20[0][0]
____________________________________________________________________________________________________
conv2d_22 (Conv2D) (None, 19, 19, 1024) 11796480 concatenate_1[0][0]
____________________________________________________________________________________________________
batch_normalization_22 (BatchNor (None, 19, 19, 1024) 4096 conv2d_22[0][0]
____________________________________________________________________________________________________
leaky_re_lu_22 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_22[0][0]
____________________________________________________________________________________________________
conv2d_23 (Conv2D) (None, 19, 19, 425) 435625 leaky_re_lu_22[0][0]
====================================================================================================
Total params: 50,983,561
Trainable params: 50,962,889
Non-trainable params: 20,672

  

2、输入输出数据类型

(1)输入数据:(m, 608, 608, 3)

(2)输出数据:(m, 19, 19, 5, 85)

3、检测过程

(1)Score-thresholding:throw away boxes that have detected a class with a score less than the threshold(0.4)

(2)Non-max suppression: Compute the Intersection over Union and avoid selecting overlapping boxes

  • Select the box that has the highest score.
  • Compute its overlap with all other boxes, and remove boxes that overlap it more than iou_threshold.
  • Go back to step 1 and iterate until there's no more boxes with a lower score than the current selected box.

This will remove all boxes that have a large overlap with the selected boxes. Only the "best" boxes remain.

4、总结

    • YOLO is a state-of-the-art object detection model that is fast and accurate
    • It runs an input image through a CNN which outputs a 19x19x5x85 dimensional volume.
    • The encoding can be seen as a grid where each of the 19x19 cells contains information about 5 boxes.
    • You filter through all the boxes using non-max suppression. Specifically:
      • Score thresholding on the probability of detecting a class to keep only accurate (high probability) boxes
      • Intersection over Union (IoU) thresholding to eliminate overlapping boxes
    • Because training a YOLO model from randomly initialized weights is non-trivial and requires a large dataset as well as lot of computation, we used previously trained model parameters in this exercise. If you wish, you can also try fine-tuning the YOLO model with your own dataset, though this would be a fairly non-trivial exercise.

项目总结三:目标检测项目(Car detection with YOLOv2)的更多相关文章

  1. 明火烟雾目标检测项目部署(YoloV5+Flask)

    明火烟雾目标检测项目部署 目录 明火烟雾目标检测项目部署 1. 拉取Docker PyToch镜像 2. 配置项目环境 2.1 更换软件源 2.2 下载vim 2.3 解决vim中文乱码问题 3. 运 ...

  2. 多尺度目标检测 Multiscale Object Detection

    多尺度目标检测 Multiscale Object Detection 我们在输入图像的每个像素上生成多个锚框.这些定位框用于对输入图像的不同区域进行采样.但是,如果锚定框是以图像的每个像素为中心生成 ...

  3. 目标检测--Scalable Object Detection using Deep Neural Networks(CVPR 2014)

    Scalable Object Detection using Deep Neural Networks 作者: Dumitru Erhan, Christian Szegedy, Alexander ...

  4. 吴恩达《深度学习》第四门课(3)目标检测(Object detection)

    3.1目标定位 (1)案例1:在构建自动驾驶时,需要定位出照片中的行人.汽车.摩托车和背景,即四个类别.可以设置这样的输出,首先第一个元素pc=1表示有要定位的物体,那么用另外四个输出元素表示定位框的 ...

  5. 目标检测 - Tensorflow Object Detection API

    一. 找到最好的工具 "工欲善其事,必先利其器",如果你想找一个深度学习框架来解决深度学习问题,TensorFlow 就是你的不二之选,究其原因,也不必过多解释,看过其优雅的代码架 ...

  6. 基于深度学习的目标检测(object detection)—— rcnn、fast-rcnn、faster-rcnn

    模型和方法: 在深度学习求解目标检测问题之前的主流 detection 方法是,DPM(Deformable parts models), 度量与评价: mAP:mean Average Precis ...

  7. spring boot快速入门 1 :创建项目、 三种启动项目方式

    准备工作: (转载)IDEA新建项目时,没有Spring Initializr选项 最近开始使用IDEA作为开发工具,然后也是打算开始学习使用spring boot. 看着博客来进行操作上手sprin ...

  8. 目标检测(一)RCNN--Rich feature hierarchies for accurate object detection and semantic segmentation(v5)

    作者:Ross Girshick,Jeff Donahue,Trevor Darrell,Jitendra Malik 该论文提出了一种简单且可扩展的检测算法,在VOC2012数据集上取得的mAP比当 ...

  9. 关于目标检测 Object detection

    NO1.目标检测 (分类+定位) 目标检测(Object Detection)是图像分类的延伸,除了分类任务,还要给定多个检测目标的坐标位置.      NO2.目标检测的发展 R-CNN是最早基于C ...

随机推荐

  1. C#下载Url文件到本地

    protected void Page_Load(object sender, EventArgs e) { string filePath = Request.Params["FilePa ...

  2. IIS Express(7.0) HTTP 错误 500.22 - Internal Server Error(vs2013)

    1.错误如下: HTTP 错误 500.22 - Internal Server Error 检测到在集成的托管管道模式下不适用的 ASP.NET 设置. 解决的方法: 首先,找到本地appcmd.x ...

  3. bind()方法

    当点击鼠标时,隐藏或显示 p 元素: $("button").bind("click",function(){ $("p").slideTo ...

  4. PostgreSQL时间段查询

    1.今日 select * from "表名" where to_date("时间字段"::text,'yyyy-mm-dd')=current_date 2. ...

  5. 模板引擎,中间件,spring AOP原理

    1. 主流模板引擎有哪些 https://blog.csdn.net/wangmx1993328/article/details/81054474 2. 解释模板引擎是个什么东西 https://ww ...

  6. Vue 获取登录用户名

    本来是打算登录的时候把用户名传过去,试了几次都没成功,然后改成用cookie保存用户名,然后在读取就行了, 登录时候设置cookie setCookie(c_name,c_pwd,exdays) { ...

  7. Acoustic modelling from the signal domain using CNNs

    3. Neural network architecture 此处描述了在本文当中所使用的网络结构,和所提取的关键特征(key features).首先,描述了两个新型的网络结构:the networ ...

  8. Spring资源加载器抽象和缺省实现 -- ResourceLoader + DefaultResourceLoader(摘)

    概述 对于每一个底层资源,比如文件系统中的一个文件,classpath上的一个文件,或者一个以URL形式表示的网络资源,Spring 统一使用 Resource 接口进行了建模抽象,相应地,对于这些资 ...

  9. mvc 路由伪静态实现

    很多网站都采用伪静态,例如以html.shtml等结尾的url,mvc的路由可以轻松实现. 配置路由 默认路由配置 添加伪静态路由 mvc的路由原理是从上往下匹配的,所以只需要在后面添加自己配置的路由 ...

  10. JavaScript实现页面刷新滚动条位置不变(利用cookie)

    实验环境:vs2015 asp.net(C#) 主要原理: 1.在页面滚动时或点击按钮时将当前滚动条位置记录到cookie[pos], 2.页面刷新或重载时查询cookie[pos]中的值是否存在,若 ...