The proposal_layer uses the trained RPN to generate region proposals for Fast R-CNN to consume.

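The whole proposal_layer pipeline:

1. Generate all anchors and apply the four predicted coordinate transformations to turn them into proposals. As usual, anchors are first generated by sliding over every pixel of the last feature map. The feature-map positions are multiplied by 16 (the feat_stride) to map them back onto the original image, which gives the initial region proposals. Bounding-box regression is then applied to each one to produce better coordinates, and these are the final region proposal coordinates.
2. Clip every proposal whose coordinates extend beyond the image boundary (clip_boxes trims such boxes to the image; it does not discard them).
3. Remove every proposal whose width or height is smaller than min_size.
4. Sort all proposals by score, from highest to lowest.
5. Keep the pre_nms_topN highest-scoring proposals. This is the selection made before NMS.
6. Apply NMS.
7. Keep the post_nms_topN highest-scoring proposals. This is the selection made after NMS.

The result is the set of region proposals passed to the Fast R-CNN network.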
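Steps 3 to 7 can be sketched in plain NumPy. This is a minimal illustration, not the layer's actual code: `nms_simple` and `select_proposals` are hypothetical names, and the greedy NMS here stands in for the `nms` function that the real layer imports from `fast_rcnn.nms_wrapper`. The toy boxes and scores are made up for the example.

```python
import numpy as np

def nms_simple(dets, thresh):
    """Greedy NMS on rows of (x1, y1, x2, y2, score); returns kept row indices."""
    x1, y1, x2, y2, scores = dets.T
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]          # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # intersection of box i with every remaining box
        w = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + 1)
        h = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= thresh]         # drop boxes that overlap box i too much
    return keep

def select_proposals(proposals, scores, min_size, pre_nms_topN, post_nms_topN, nms_thresh):
    """Hypothetical helper covering steps 3-7; scores is a 1-D array."""
    # 3. remove boxes with width or height below min_size
    ws = proposals[:, 2] - proposals[:, 0] + 1
    hs = proposals[:, 3] - proposals[:, 1] + 1
    keep = np.where((ws >= min_size) & (hs >= min_size))[0]
    proposals, scores = proposals[keep], scores[keep]
    # 4-5. sort by score and keep the best pre_nms_topN
    order = scores.argsort()[::-1]
    if pre_nms_topN > 0:
        order = order[:pre_nms_topN]
    proposals, scores = proposals[order], scores[order]
    # 6. NMS
    keep = nms_simple(np.hstack((proposals, scores[:, None])), nms_thresh)
    # 7. keep the best post_nms_topN
    if post_nms_topN > 0:
        keep = keep[:post_nms_topN]
    return proposals[keep], scores[keep]

# Toy input: box 1 overlaps box 0 heavily, box 3 is too small.
proposals = np.array([[0, 0, 99, 99],
                      [5, 5, 104, 104],
                      [200, 200, 299, 299],
                      [10, 10, 12, 12]], dtype=np.float64)
scores = np.array([0.9, 0.8, 0.7, 0.95])
rois, kept_scores = select_proposals(proposals, scores, min_size=16,
                                     pre_nms_topN=6000, post_nms_topN=300,
                                     nms_thresh=0.7)
print(rois)
print(kept_scores)
```

In this toy run the tiny box is dropped by the size filter even though it has the highest score, and the second box is suppressed by NMS because its IoU with the first box exceeds 0.7.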

# --------------------------------------------------------
# Faster R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick and Sean Bell
# --------------------------------------------------------

import caffe
import numpy as np
import yaml
from fast_rcnn.config import cfg
from generate_anchors import generate_anchors
from fast_rcnn.bbox_transform import bbox_transform_inv, clip_boxes
from fast_rcnn.nms_wrapper import nms

DEBUG = False

class ProposalLayer(caffe.Layer):
    """
    Outputs object detection proposals by applying estimated bounding-box
    transformations to a set of regular boxes (called "anchors").
    """

    def setup(self, bottom, top):
        # parse the layer parameter string, which must be valid YAML
        layer_params = yaml.load(self.param_str_)

        self._feat_stride = layer_params['feat_stride']
        anchor_scales = layer_params.get('scales', (8, 16, 32))
        self._anchors = generate_anchors(scales=np.array(anchor_scales))
        self._num_anchors = self._anchors.shape[0]

        if DEBUG:
            print 'feat_stride: {}'.format(self._feat_stride)
            print 'anchors:'
            print self._anchors

        # rois blob: holds R regions of interest, each is a 5-tuple
        # (n, x1, y1, x2, y2) specifying an image batch index n and a
        # rectangle (x1, y1, x2, y2)
        top[0].reshape(1, 5)

        # scores blob: holds scores for R regions of interest
        if len(top) > 1:
            top[1].reshape(1, 1, 1, 1)

    def forward(self, bottom, top):
        # Algorithm:
        #
        # for each (H, W) location i
        #   generate A anchor boxes centered on cell i
        #   apply predicted bbox deltas at cell i to each of the A anchors
        # clip predicted boxes to image
        # remove predicted boxes with either height or width < threshold
        # sort all (proposal, score) pairs by score from highest to lowest
        # take top pre_nms_topN proposals before NMS
        # apply NMS with threshold 0.7 to remaining proposals
        # take after_nms_topN proposals after NMS
        # return the top proposals (-> RoIs top, scores top)

        assert bottom[0].data.shape[0] == 1, \
            'Only single item batches are supported'

        cfg_key = str(self.phase) # either 'TRAIN' or 'TEST'
        # number of top-scoring proposals kept before NMS
        pre_nms_topN = cfg[cfg_key].RPN_PRE_NMS_TOP_N
        # number of top-scoring proposals kept after NMS
        post_nms_topN = cfg[cfg_key].RPN_POST_NMS_TOP_N
        nms_thresh = cfg[cfg_key].RPN_NMS_THRESH
        min_size = cfg[cfg_key].RPN_MIN_SIZE

        # the first set of _num_anchors channels are bg probs
        # the second set are the fg probs, which we want
        scores = bottom[0].data[:, self._num_anchors:, :, :]
        # the 4 regression deltas per anchor predicted by the RPN (rpn_bbox_pred)
        bbox_deltas = bottom[1].data
        im_info = bottom[2].data[0, :]

        if DEBUG:
            print 'im_size: ({}, {})'.format(im_info[0], im_info[1])
            print 'scale: {}'.format(im_info[2])

        # 1. Generate proposals from bbox deltas and shifted anchors
        # as in anchor_target_layer, the height and width of the last
        # feature map are read from the shape of the score blob
        height, width = scores.shape[-2:]

        if DEBUG:
            print 'score map size: {}'.format(scores.shape)

        # Enumerate all shifts
        shift_x = np.arange(0, width) * self._feat_stride
        shift_y = np.arange(0, height) * self._feat_stride
        shift_x, shift_y = np.meshgrid(shift_x, shift_y)
        shifts = np.vstack((shift_x.ravel(), shift_y.ravel(),
                            shift_x.ravel(), shift_y.ravel())).transpose()

        # Enumerate all shifted anchors:
        #
        # add A anchors (1, A, 4) to
        # cell K shifts (K, 1, 4) to get
        # shift anchors (K, A, 4)
        # reshape to (K*A, 4) shifted anchors
        A = self._num_anchors
        K = shifts.shape[0]
        anchors = self._anchors.reshape((1, A, 4)) + \
                  shifts.reshape((1, K, 4)).transpose((1, 0, 2))
        # as in anchor_target_layer: all anchor coordinates, shape (K*A, 4)
        anchors = anchors.reshape((K * A, 4))

        # Transpose and reshape predicted bbox transformations to get them
        # into the same order as the anchors:
        #
        # bbox deltas will be (1, 4 * A, H, W) format
        # transpose to (1, H, W, 4 * A)
        # reshape to (1 * H * W * A, 4) where rows are ordered by (h, w, a)
        # in slowest to fastest order
        # (this gives bbox_deltas the same (N, 4) shape as anchors)
        bbox_deltas = bbox_deltas.transpose((0, 2, 3, 1)).reshape((-1, 4))

        # Same story for the scores:
        #
        # scores are (1, A, H, W) format
        # transpose to (1, H, W, A)
        # reshape to (1 * H * W * A, 1) where rows are ordered by (h, w, a)
        # (scores becomes a single column with the same row order as anchors)
        scores = scores.transpose((0, 2, 3, 1)).reshape((-1, 1))

        # Convert anchors into proposals via bbox transformations
        proposals = bbox_transform_inv(anchors, bbox_deltas)

        # 2. clip predicted boxes to image
        proposals = clip_boxes(proposals, im_info[:2])

        # 3. remove predicted boxes with either height or width < threshold
        # (NOTE: convert min_size to input image scale stored in im_info[2])
        keep = _filter_boxes(proposals, min_size * im_info[2])
        proposals = proposals[keep, :]
        scores = scores[keep]

        # 4. sort all (proposal, score) pairs by score from highest to lowest
        # 5. take top pre_nms_topN (e.g. 6000)
        order = scores.ravel().argsort()[::-1]
        if pre_nms_topN > 0:
            order = order[:pre_nms_topN]
        proposals = proposals[order, :]
        scores = scores[order]

        # 6. apply nms (e.g. threshold = 0.7)
        # 7. take after_nms_topN (e.g. 300)
        # 8. return the top proposals (-> RoIs top)
        keep = nms(np.hstack((proposals, scores)), nms_thresh)
        if post_nms_topN > 0:
            keep = keep[:post_nms_topN]
        proposals = proposals[keep, :]
        scores = scores[keep]

        # Output rois blob
        # Our RPN implementation only supports a single input image, so all
        # batch inds are 0
        batch_inds = np.zeros((proposals.shape[0], 1), dtype=np.float32)
        blob = np.hstack((batch_inds, proposals.astype(np.float32, copy=False)))
        top[0].reshape(*(blob.shape))
        top[0].data[...] = blob

        # [Optional] output scores blob
        if len(top) > 1:
            top[1].reshape(*(scores.shape))
            top[1].data[...] = scores

    def backward(self, top, propagate_down, bottom):
        """This layer does not propagate gradients."""
        pass

    def reshape(self, bottom, top):
        """Reshaping happens during the call to forward."""
        pass

def _filter_boxes(boxes, min_size):
    """Remove all boxes with any side smaller than min_size."""
    ws = boxes[:, 2] - boxes[:, 0] + 1
    hs = boxes[:, 3] - boxes[:, 1] + 1
    keep = np.where((ws >= min_size) & (hs >= min_size))[0]
    return keep
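The anchor enumeration in forward relies on NumPy broadcasting, which is easy to check on a tiny grid. The sketch below repeats that part of the listing outside Caffe. The two base anchors are made-up values for illustration (the real ones come from generate_anchors), and the feature map size of 2 x 3 is an assumption.

```python
import numpy as np

feat_stride = 16
height, width = 2, 3                      # assumed tiny feature map
# Hypothetical base anchors (A = 2), not the output of generate_anchors
base_anchors = np.array([[-8, -8, 23, 23],
                         [-24, -24, 39, 39]])

# One (x, y, x, y) shift per feature-map cell, in row-major (h, w) order
shift_x = np.arange(0, width) * feat_stride
shift_y = np.arange(0, height) * feat_stride
shift_x, shift_y = np.meshgrid(shift_x, shift_y)
shifts = np.vstack((shift_x.ravel(), shift_y.ravel(),
                    shift_x.ravel(), shift_y.ravel())).transpose()

A = base_anchors.shape[0]
K = shifts.shape[0]
# (1, A, 4) + (K, 1, 4) broadcasts to (K, A, 4): every anchor at every cell
anchors = base_anchors.reshape((1, A, 4)) + shifts.reshape((K, 1, 4))
anchors = anchors.reshape((K * A, 4))

print(anchors.shape)   # K * A rows, 4 columns
print(anchors[:4])     # both anchors at cell (0, 0), then both at cell (0, 1)
```

Here `shifts.reshape((K, 1, 4))` is equivalent to the listing's `shifts.reshape((1, K, 4)).transpose((1, 0, 2))`; it is written directly only to keep the sketch short.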

This is the prototxt for this layer:

layer {
  name: 'proposal'
  type: 'Python'
  bottom: 'rpn_cls_prob_reshape'
  bottom: 'rpn_bbox_pred'
  bottom: 'im_info'
  top: 'rois'
  top: 'scores'
  python_param {
    module: 'rpn.proposal_layer'
    layer: 'ProposalLayer'
    param_str: "'feat_stride': 16"
  }
}

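As you can see, bottom[1] is rpn_bbox_pred.

So bbox_deltas = bottom[1].data in the code above holds the four coordinate deltas produced by training. An RPN is trained to predict these four deltas for each anchor, not the four box coordinates directly.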
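To make the role of the four deltas concrete, here is a minimal sketch of how deltas (dx, dy, dw, dh) can be turned into box coordinates and then clipped. It follows the usual Faster R-CNN parameterization (center shifts scaled by the anchor size, log-space width and height) and the same "+ 1" pixel convention as _filter_boxes. `decode_boxes` and `clip_to_image` are hypothetical names; the real layer calls bbox_transform_inv and clip_boxes from fast_rcnn.bbox_transform, whose details may differ.

```python
import numpy as np

def decode_boxes(anchors, deltas):
    """Apply (dx, dy, dw, dh) deltas to (x1, y1, x2, y2) anchors."""
    widths = anchors[:, 2] - anchors[:, 0] + 1.0
    heights = anchors[:, 3] - anchors[:, 1] + 1.0
    ctr_x = anchors[:, 0] + 0.5 * widths
    ctr_y = anchors[:, 1] + 0.5 * heights

    dx, dy, dw, dh = deltas.T
    pred_ctr_x = dx * widths + ctr_x      # shift the center by a fraction of the size
    pred_ctr_y = dy * heights + ctr_y
    pred_w = np.exp(dw) * widths          # scale width and height in log space
    pred_h = np.exp(dh) * heights

    return np.stack((pred_ctr_x - 0.5 * pred_w,
                     pred_ctr_y - 0.5 * pred_h,
                     pred_ctr_x + 0.5 * pred_w,
                     pred_ctr_y + 0.5 * pred_h), axis=1)

def clip_to_image(boxes, im_shape):
    """Trim boxes to lie inside an image of shape (height, width)."""
    im_h, im_w = im_shape
    boxes = boxes.copy()
    boxes[:, 0::2] = np.clip(boxes[:, 0::2], 0, im_w - 1)   # x1, x2
    boxes[:, 1::2] = np.clip(boxes[:, 1::2], 0, im_h - 1)   # y1, y2
    return boxes

anchors = np.array([[0.0, 0.0, 15.0, 15.0]])      # one 16 x 16 anchor
deltas = np.array([[0.5, 0.0, 0.0, 0.0]])         # move right by half the width
decoded = decode_boxes(anchors, deltas)
clipped = clip_to_image(decoded, (20, 20))        # assumed 20 x 20 image
print(decoded)
print(clipped)
```

Note that clipping only trims the box; the proposal itself is kept, which matches step 2 of the pipeline.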

 # bbox deltas will be (1, 4 * A, H, W) format
# transpose to (1, H, W, 4 * A)
# reshape to (1 * H * W * A, 4) where rows are ordered by (h, w, a)
# in slowest to fastest order
bbox_deltas = bbox_deltas.transpose((0, 2, 3, 1)).reshape((-1, 4))

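This part of the code deserves a closer look. bbox_deltas holds the four transformation values to be learned. They are learned during training and produced by a convolution: they come from the rpn_bbox_pred layer, whose output is a feature map of shape (4 × number of anchors, H, W) per image, i.e. (1, 4 * A, H, W) including the batch dimension. To combine this feature map with the generated anchors, the two must first have matching shapes before they can be added or otherwise operated on element by element. So what the code does here is reshape bbox_deltas into the same layout as anchors, many rows by 4 columns, where the 4 columns are the deltas (dx, dy, dw, dh) for the box center x, center y, width, and height.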
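A small check of this transpose-and-reshape, using a fake rpn_bbox_pred output filled with consecutive numbers. The sizes A, H, W and the probed cell are arbitrary assumptions chosen for illustration.

```python
import numpy as np

A, H, W = 3, 2, 4                         # assumed small sizes
# Fake rpn_bbox_pred output with shape (1, 4 * A, H, W)
bbox_deltas = np.arange(1 * 4 * A * H * W, dtype=np.float32).reshape((1, 4 * A, H, W))

# Same operation as in the layer: channels last, then 4 values per row
flat = bbox_deltas.transpose((0, 2, 3, 1)).reshape((-1, 4))

# The row for anchor a at cell (h, w) holds channels 4a .. 4a+3 of that cell
h, w, a = 1, 2, 1
row = (h * W + w) * A + a
expected = bbox_deltas[0, 4 * a:4 * a + 4, h, w]

print(flat.shape)
print(flat[row], expected)
```

The transpose matters: calling reshape((-1, 4)) directly on the (1, 4 * A, H, W) array would group four neighboring pixels of one channel into a row instead of the four deltas of one anchor.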

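Note: anchors and bbox_deltas both have shape (H * W * A, 4), and scores has shape (H * W * A, 1). All three use the same row order, (h, w, a) from slowest to fastest varying. The first row is (h, w, a) = (0, 0, 0), the second is (0, 0, 1), and so on: a runs through all A anchors of a cell first, then w advances, and h advances last.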
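The row order can be verified by filling a fake score blob so that each value encodes its own (h, w, a) position and then flattening it the way the layer does. The sizes and the encoding 100 * h + 10 * w + a are assumptions made only for this illustration.

```python
import numpy as np

A, H, W = 2, 2, 3                         # assumed small sizes

# Fake fg scores of shape (1, A, H, W); each value encodes its position
scores = np.zeros((1, A, H, W))
for a in range(A):
    for h in range(H):
        for w in range(W):
            scores[0, a, h, w] = 100 * h + 10 * w + a

# Same flattening as in the layer
flat = scores.transpose((0, 2, 3, 1)).reshape((-1, 1))

# Decode every row back into (h, w, a)
decoded = [(int(v) // 100, int(v) // 10 % 10, int(v) % 10) for v in flat.ravel()]

# Row index -> (h, w, a) for a C-ordered (H, W, A) layout
hs, ws, as_ = np.unravel_index(np.arange(H * W * A), (H, W, A))
order = list(zip(hs.tolist(), ws.tolist(), as_.tolist()))

print(decoded[:4])
```

Because anchors, bbox_deltas and scores all share this layout, row i of each array refers to the same anchor at the same feature-map cell.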
