Project Summary 3: Object Detection Project (Car Detection with YOLOv2)

1. The YOLO model (YOLO: You Only Look Once)

（1）We will use 5 anchor boxes. So you can think of the YOLO architecture as the following: IMAGE (m, 608, 608, 3) -> DEEP CNN -> ENCODING (m, 19, 19, 5, 85).

（2）Training a YOLO model takes a very long time and requires a fairly large dataset of labelled bounding boxes for a large range of target classes. We are going to load an existing pretrained Keras YOLO model stored in "yolo.h5". (These weights come from the official YOLO website, and were converted using a function written by Allan Zelener.)

（3）Model summary

Layer (type)                     Output Shape          Param #     Connected to

====================================================================================================

input_1 (InputLayer)             (None, 608, 608, 3)   0

____________________________________________________________________________________________________

conv2d_1 (Conv2D)                (None, 608, 608, 32)  864         input_1[0][0]

____________________________________________________________________________________________________

batch_normalization_1 (BatchNorm (None, 608, 608, 32)  128         conv2d_1[0][0]

____________________________________________________________________________________________________

leaky_re_lu_1 (LeakyReLU)        (None, 608, 608, 32)  0           batch_normalization_1[0][0]

____________________________________________________________________________________________________

max_pooling2d_1 (MaxPooling2D)   (None, 304, 304, 32)  0           leaky_re_lu_1[0][0]

____________________________________________________________________________________________________

conv2d_2 (Conv2D)                (None, 304, 304, 64)  18432       max_pooling2d_1[0][0]

____________________________________________________________________________________________________

batch_normalization_2 (BatchNorm (None, 304, 304, 64)  256         conv2d_2[0][0]

____________________________________________________________________________________________________

leaky_re_lu_2 (LeakyReLU)        (None, 304, 304, 64)  0           batch_normalization_2[0][0]

____________________________________________________________________________________________________

max_pooling2d_2 (MaxPooling2D)   (None, 152, 152, 64)  0           leaky_re_lu_2[0][0]

____________________________________________________________________________________________________

conv2d_3 (Conv2D)                (None, 152, 152, 128) 73728       max_pooling2d_2[0][0]

____________________________________________________________________________________________________

batch_normalization_3 (BatchNorm (None, 152, 152, 128) 512         conv2d_3[0][0]

____________________________________________________________________________________________________

leaky_re_lu_3 (LeakyReLU)        (None, 152, 152, 128) 0           batch_normalization_3[0][0]

____________________________________________________________________________________________________

conv2d_4 (Conv2D)                (None, 152, 152, 64)  8192        leaky_re_lu_3[0][0]

____________________________________________________________________________________________________

batch_normalization_4 (BatchNorm (None, 152, 152, 64)  256         conv2d_4[0][0]

____________________________________________________________________________________________________

leaky_re_lu_4 (LeakyReLU)        (None, 152, 152, 64)  0           batch_normalization_4[0][0]

____________________________________________________________________________________________________

conv2d_5 (Conv2D)                (None, 152, 152, 128) 73728       leaky_re_lu_4[0][0]

____________________________________________________________________________________________________

batch_normalization_5 (BatchNorm (None, 152, 152, 128) 512         conv2d_5[0][0]

____________________________________________________________________________________________________

leaky_re_lu_5 (LeakyReLU)        (None, 152, 152, 128) 0           batch_normalization_5[0][0]

____________________________________________________________________________________________________

max_pooling2d_3 (MaxPooling2D)   (None, 76, 76, 128)   0           leaky_re_lu_5[0][0]

____________________________________________________________________________________________________

conv2d_6 (Conv2D)                (None, 76, 76, 256)   294912      max_pooling2d_3[0][0]

____________________________________________________________________________________________________

batch_normalization_6 (BatchNorm (None, 76, 76, 256)   1024        conv2d_6[0][0]

____________________________________________________________________________________________________

leaky_re_lu_6 (LeakyReLU)        (None, 76, 76, 256)   0           batch_normalization_6[0][0]

____________________________________________________________________________________________________

conv2d_7 (Conv2D)                (None, 76, 76, 128)   32768       leaky_re_lu_6[0][0]

____________________________________________________________________________________________________

batch_normalization_7 (BatchNorm (None, 76, 76, 128)   512         conv2d_7[0][0]

____________________________________________________________________________________________________

leaky_re_lu_7 (LeakyReLU)        (None, 76, 76, 128)   0           batch_normalization_7[0][0]

____________________________________________________________________________________________________

conv2d_8 (Conv2D)                (None, 76, 76, 256)   294912      leaky_re_lu_7[0][0]

____________________________________________________________________________________________________

batch_normalization_8 (BatchNorm (None, 76, 76, 256)   1024        conv2d_8[0][0]

____________________________________________________________________________________________________

leaky_re_lu_8 (LeakyReLU)        (None, 76, 76, 256)   0           batch_normalization_8[0][0]

____________________________________________________________________________________________________

max_pooling2d_4 (MaxPooling2D)   (None, 38, 38, 256)   0           leaky_re_lu_8[0][0]

____________________________________________________________________________________________________

conv2d_9 (Conv2D)                (None, 38, 38, 512)   1179648     max_pooling2d_4[0][0]

____________________________________________________________________________________________________

batch_normalization_9 (BatchNorm (None, 38, 38, 512)   2048        conv2d_9[0][0]

____________________________________________________________________________________________________

leaky_re_lu_9 (LeakyReLU)        (None, 38, 38, 512)   0           batch_normalization_9[0][0]

____________________________________________________________________________________________________

conv2d_10 (Conv2D)               (None, 38, 38, 256)   131072      leaky_re_lu_9[0][0]

____________________________________________________________________________________________________

batch_normalization_10 (BatchNor (None, 38, 38, 256)   1024        conv2d_10[0][0]

____________________________________________________________________________________________________

leaky_re_lu_10 (LeakyReLU)       (None, 38, 38, 256)   0           batch_normalization_10[0][0]

____________________________________________________________________________________________________

conv2d_11 (Conv2D)               (None, 38, 38, 512)   1179648     leaky_re_lu_10[0][0]

____________________________________________________________________________________________________

batch_normalization_11 (BatchNor (None, 38, 38, 512)   2048        conv2d_11[0][0]

____________________________________________________________________________________________________

leaky_re_lu_11 (LeakyReLU)       (None, 38, 38, 512)   0           batch_normalization_11[0][0]

____________________________________________________________________________________________________

conv2d_12 (Conv2D)               (None, 38, 38, 256)   131072      leaky_re_lu_11[0][0]

____________________________________________________________________________________________________

batch_normalization_12 (BatchNor (None, 38, 38, 256)   1024        conv2d_12[0][0]

____________________________________________________________________________________________________

leaky_re_lu_12 (LeakyReLU)       (None, 38, 38, 256)   0           batch_normalization_12[0][0]

____________________________________________________________________________________________________

conv2d_13 (Conv2D)               (None, 38, 38, 512)   1179648     leaky_re_lu_12[0][0]

____________________________________________________________________________________________________

batch_normalization_13 (BatchNor (None, 38, 38, 512)   2048        conv2d_13[0][0]

____________________________________________________________________________________________________

leaky_re_lu_13 (LeakyReLU)       (None, 38, 38, 512)   0           batch_normalization_13[0][0]

____________________________________________________________________________________________________

max_pooling2d_5 (MaxPooling2D)   (None, 19, 19, 512)   0           leaky_re_lu_13[0][0]

____________________________________________________________________________________________________

conv2d_14 (Conv2D)               (None, 19, 19, 1024)  4718592     max_pooling2d_5[0][0]

____________________________________________________________________________________________________

batch_normalization_14 (BatchNor (None, 19, 19, 1024)  4096        conv2d_14[0][0]

____________________________________________________________________________________________________

leaky_re_lu_14 (LeakyReLU)       (None, 19, 19, 1024)  0           batch_normalization_14[0][0]

____________________________________________________________________________________________________

conv2d_15 (Conv2D)               (None, 19, 19, 512)   524288      leaky_re_lu_14[0][0]

____________________________________________________________________________________________________

batch_normalization_15 (BatchNor (None, 19, 19, 512)   2048        conv2d_15[0][0]

____________________________________________________________________________________________________

leaky_re_lu_15 (LeakyReLU)       (None, 19, 19, 512)   0           batch_normalization_15[0][0]

____________________________________________________________________________________________________

conv2d_16 (Conv2D)               (None, 19, 19, 1024)  4718592     leaky_re_lu_15[0][0]

____________________________________________________________________________________________________

batch_normalization_16 (BatchNor (None, 19, 19, 1024)  4096        conv2d_16[0][0]

____________________________________________________________________________________________________

leaky_re_lu_16 (LeakyReLU)       (None, 19, 19, 1024)  0           batch_normalization_16[0][0]

____________________________________________________________________________________________________

conv2d_17 (Conv2D)               (None, 19, 19, 512)   524288      leaky_re_lu_16[0][0]

____________________________________________________________________________________________________

batch_normalization_17 (BatchNor (None, 19, 19, 512)   2048        conv2d_17[0][0]

____________________________________________________________________________________________________

leaky_re_lu_17 (LeakyReLU)       (None, 19, 19, 512)   0           batch_normalization_17[0][0]

____________________________________________________________________________________________________

conv2d_18 (Conv2D)               (None, 19, 19, 1024)  4718592     leaky_re_lu_17[0][0]

____________________________________________________________________________________________________

batch_normalization_18 (BatchNor (None, 19, 19, 1024)  4096        conv2d_18[0][0]

____________________________________________________________________________________________________

leaky_re_lu_18 (LeakyReLU)       (None, 19, 19, 1024)  0           batch_normalization_18[0][0]

____________________________________________________________________________________________________

conv2d_19 (Conv2D)               (None, 19, 19, 1024)  9437184     leaky_re_lu_18[0][0]

____________________________________________________________________________________________________

batch_normalization_19 (BatchNor (None, 19, 19, 1024)  4096        conv2d_19[0][0]

____________________________________________________________________________________________________

conv2d_21 (Conv2D)               (None, 38, 38, 64)    32768       leaky_re_lu_13[0][0]

____________________________________________________________________________________________________

leaky_re_lu_19 (LeakyReLU)       (None, 19, 19, 1024)  0           batch_normalization_19[0][0]

____________________________________________________________________________________________________

batch_normalization_21 (BatchNor (None, 38, 38, 64)    256         conv2d_21[0][0]

____________________________________________________________________________________________________

conv2d_20 (Conv2D)               (None, 19, 19, 1024)  9437184     leaky_re_lu_19[0][0]

____________________________________________________________________________________________________

leaky_re_lu_21 (LeakyReLU)       (None, 38, 38, 64)    0           batch_normalization_21[0][0]

____________________________________________________________________________________________________

batch_normalization_20 (BatchNor (None, 19, 19, 1024)  4096        conv2d_20[0][0]

____________________________________________________________________________________________________

space_to_depth_x2 (Lambda)       (None, 19, 19, 256)   0           leaky_re_lu_21[0][0]

____________________________________________________________________________________________________

leaky_re_lu_20 (LeakyReLU)       (None, 19, 19, 1024)  0           batch_normalization_20[0][0]

____________________________________________________________________________________________________

concatenate_1 (Concatenate)      (None, 19, 19, 1280)  0           space_to_depth_x2[0][0]

                                                                   leaky_re_lu_20[0][0]

____________________________________________________________________________________________________

conv2d_22 (Conv2D)               (None, 19, 19, 1024)  11796480    concatenate_1[0][0]

____________________________________________________________________________________________________

batch_normalization_22 (BatchNor (None, 19, 19, 1024)  4096        conv2d_22[0][0]

____________________________________________________________________________________________________

leaky_re_lu_22 (LeakyReLU)       (None, 19, 19, 1024)  0           batch_normalization_22[0][0]

____________________________________________________________________________________________________

conv2d_23 (Conv2D)               (None, 19, 19, 425)   435625      leaky_re_lu_22[0][0]

====================================================================================================

Total params: 50,983,561

Trainable params: 50,962,889

Non-trainable params: 20,672
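
The parameter counts in the summary can be reproduced by hand, which is a useful sanity check when reading such a table. The sketch below is not part of the original project; the kernel sizes (3x3 for conv2d_1, 1x1 for conv2d_23) are not listed in the table and are inferred here from the counts themselves, and the helper names are made up for illustration.

```python
def conv2d_params(kernel_size, in_channels, out_channels, use_bias=False):
    """Weights of a square-kernel Conv2D layer, plus optional biases."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if use_bias else 0)


def batchnorm_params(channels):
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * channels


# Output channels of batch_normalization_1 .. batch_normalization_22,
# read from the "Output Shape" column of the summary.
BN_CHANNELS = [32, 64, 128, 64, 128, 256, 128, 256, 512, 256, 512, 256, 512,
               1024, 512, 1024, 512, 1024, 1024, 1024, 64, 1024]

# Each BatchNorm layer contributes 2 non-trainable values per channel.
non_trainable = sum(2 * c for c in BN_CHANNELS)

print(conv2d_params(3, 3, 32))                      # conv2d_1 (assumed 3x3, no bias): 864
print(batchnorm_params(32))                         # batch_normalization_1: 128
print(conv2d_params(1, 1024, 425, use_bias=True))   # conv2d_23 (assumed 1x1, with bias): 435625
print(non_trainable)                                # matches "Non-trainable params": 20672
```

The last convolution has 425 output channels, which is 5 anchor boxes times 85 values per box, matching the (m, 19, 19, 5, 85) encoding.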

2. Input and output data shapes

（1）Input: (m, 608, 608, 3)

（2）Output: (m, 19, 19, 5, 85)
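
To make the output shape concrete, here is a minimal NumPy sketch of how a (m, 19, 19, 425) tensor can be viewed as (m, 19, 19, 5, 85) and split into its parts. It only illustrates the layout. The split of 85 into 5 box values plus 80 class scores, and the order (p_c, b_x, b_y, b_h, b_w, classes), are assumptions not stated in these notes, and the real model also applies activations before these values are usable. `split_encoding` is a hypothetical helper name.

```python
import numpy as np

NUM_ANCHORS = 5
BOX_VALUES = 5     # assumed order: p_c, b_x, b_y, b_h, b_w
NUM_CLASSES = 80   # assumption: 85 = 5 box values + 80 class scores


def split_encoding(raw):
    """View (m, 19, 19, 425) as (m, 19, 19, 5, 85) and split it into parts."""
    m, grid_h, grid_w, _ = raw.shape
    encoding = raw.reshape(m, grid_h, grid_w, NUM_ANCHORS, BOX_VALUES + NUM_CLASSES)
    box_confidence = encoding[..., 0:1]   # (m, 19, 19, 5, 1)
    boxes = encoding[..., 1:5]            # (m, 19, 19, 5, 4)
    class_probs = encoding[..., 5:]       # (m, 19, 19, 5, 80)
    return encoding, box_confidence, boxes, class_probs


# Zeros stand in for the output of the last convolution (conv2d_23).
raw = np.zeros((2, 19, 19, 425), dtype=np.float32)
encoding, box_confidence, boxes, class_probs = split_encoding(raw)
print(encoding.shape, box_confidence.shape, boxes.shape, class_probs.shape)
```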

3. Detection process

（1）Score thresholding: discard every box whose best class score is below the threshold (0.4).
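
A minimal NumPy sketch of this filtering step follows. It assumes the score of a box for a class is the box confidence multiplied by the class probability, which these notes do not spell out; `filter_boxes` and the tiny hand-made inputs are illustrative only, not the project's code.

```python
import numpy as np


def filter_boxes(box_confidence, boxes, class_probs, threshold=0.4):
    """Keep only boxes whose best class score reaches the threshold.

    box_confidence: (..., 1), boxes: (..., 4), class_probs: (..., num_classes)
    """
    box_scores = box_confidence * class_probs        # score per class
    box_classes = np.argmax(box_scores, axis=-1)     # best class per box
    box_class_scores = np.max(box_scores, axis=-1)   # score of that class
    keep = box_class_scores >= threshold             # boolean mask over boxes
    return box_class_scores[keep], boxes[keep], box_classes[keep]


# One grid cell with two anchor boxes and three classes (made-up numbers).
box_confidence = np.array([[[[0.9], [0.3]]]])                  # (1, 1, 2, 1)
class_probs = np.array([[[[0.1, 0.8, 0.1], [0.5, 0.3, 0.2]]]])  # (1, 1, 2, 3)
boxes = np.array([[[[0.0, 0.0, 1.0, 1.0], [2.0, 2.0, 3.0, 3.0]]]])  # (1, 1, 2, 4)

scores, kept_boxes, classes = filter_boxes(box_confidence, boxes, class_probs)
print(scores, kept_boxes, classes)
```

In this example only the first box survives: its best score is 0.9 * 0.8 = 0.72 for class 1, while the second box peaks at 0.3 * 0.5 = 0.15.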

（2）Non-max suppression: compute the Intersection over Union (IoU) between boxes and avoid selecting boxes that overlap.

Step 1: Select the box that has the highest score.
Step 2: Compute its overlap with all other boxes, and remove boxes whose IoU with it exceeds iou_threshold.
Step 3: Go back to step 1 and iterate until there are no more boxes with a lower score than the currently selected box.

This removes all boxes that have a large overlap with the selected boxes. Only the "best" boxes remain.
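
The procedure can be sketched in plain Python as below. This is an illustration under stated assumptions rather than the project's implementation: boxes are taken to be (x1, y1, x2, y2) corner coordinates, and the default iou_threshold of 0.5 is a placeholder, since these notes do not give the value used.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes (assumed format)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the kept boxes, highest score first."""
    # Candidate indices sorted by descending score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    selected = []
    while order:
        best = order.pop(0)   # highest-scoring remaining box
        selected.append(best)
        # Drop every remaining box that overlaps the selected one too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return selected


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))  # [0, 2]
```

The first two boxes overlap with an IoU of 81 / 119, about 0.68, so the lower-scoring one is removed; the third box does not overlap either and is kept.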

4. Summary

- YOLO is a state-of-the-art object detection model that is fast and accurate.
- It runs an input image through a CNN, which outputs a 19x19x5x85 dimensional volume.
- The encoding can be seen as a grid where each of the 19x19 cells contains information about 5 boxes.
- You filter through all the boxes using non-max suppression. Specifically:
  - Score thresholding on the probability of detecting a class, to keep only accurate (high-probability) boxes
  - Intersection over Union (IoU) thresholding, to eliminate overlapping boxes
- Because training a YOLO model from randomly initialized weights is non-trivial and requires a large dataset as well as a lot of computation, we used previously trained model parameters in this exercise. If you wish, you can also try fine-tuning the YOLO model with your own dataset, though this would be a fairly non-trivial exercise.
