object detection[NMS]

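Non-maximum suppression (NMS) is one of the most frequently used post-processing steps in object detection. A detector often draws many boxes around a single object region, as in the figure below:

[Figure: many overlapping candidate boxes drawn around a single object. Image credit: external source, link not preserved.]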

The goal is to pick the most suitable box among these candidates. There are several ways to do this:

1 nms

2 soft-nms

3 softer-nms

1. nms

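NMS works iteratively: it repeatedly takes the highest-scoring remaining box, computes its IoU with every other remaining box, and filters out the boxes whose IoU is large (that is, boxes that overlap it heavily).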

IoU is also a Tanimoto measure [see Theodoridis and Koutroumbas, Pattern Recognition, p. 609].
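As a quick illustration of the IoU measure, here is a minimal sketch for a single pair of boxes. The helper name `box_iou` and the `(xmin, ymin, xmax, ymax)` tuple layout are assumptions made for this example. The intersection rectangle takes the larger of the two minimum corners and the smaller of the two maximum corners, and IoU is the intersection area divided by the union area, which has the same form as the Tanimoto ratio |A∩B| / (|A| + |B| - |A∩B|).

```python
def box_iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax). Hypothetical helper."""
    # Intersection rectangle: max of the min corners, min of the max corners.
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so that disjoint boxes get an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


# Two 2x2 boxes overlapping in a 1x1 square: intersection 1, union 4 + 4 - 1 = 7.
print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Identical boxes give an IoU of 1 and disjoint boxes give 0, so a threshold between the two controls how much overlap is tolerated.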

Following the MATLAB code of R-CNN on GitHub, here is a Python port:



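import numpy as np

def iou(xminNp,yminNp,xmaxNp,ymaxNp,areas,lastInd,beforeInd,threshold):
    # Compare the box at lastInd with every surviving box in beforeInd and
    # compute the corner coordinates of each intersection rectangle.
    # np.maximum([3,1,4,2],3) gives array([3,3,4,3])
    xminNpTmp = np.maximum(xminNp[lastInd], xminNp[beforeInd])
    yminNpTmp = np.maximum(yminNp[lastInd], yminNp[beforeInd])
    # The max corner of an intersection is the smaller of the two max corners,
    # so np.minimum is needed here (np.maximum would give the enclosing box).
    xmaxNpTmp = np.minimum(xmaxNp[lastInd], xmaxNp[beforeInd])
    ymaxNpTmp = np.minimum(ymaxNp[lastInd], ymaxNp[beforeInd])
    # Width and height of every intersection (0 when the boxes do not overlap)
    w = np.maximum(0.0,xmaxNpTmp-xminNpTmp)
    h = np.maximum(0.0,ymaxNpTmp-yminNpTmp)
    # Intersection area between each surviving box and the lastInd box
    # array([1,2,3,4]) * array([1,2,3,4]) gives array([1,4,9,16])
    inter = w*h
    iouValue = inter/(areas[beforeInd]+areas[lastInd]-inter)
    # Keep only the indices whose IoU does not exceed the threshold
    indexOutput = [item[0] for item in zip(beforeInd,iouValue) if item[1] <= threshold ]
    return indexOutput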

def nms(boxes,threshold):
    '''
    boxes: an n-by-5 matrix, where n is the number of boxes and each row is
    [xmin,ymin,xmax,ymax,score]
    Returns the indices of the kept boxes, highest score first.
    '''
    assert isinstance(boxes,np.ndarray),'boxes must be a numpy array'
    assert boxes.shape[1] == 5,'the column dimension should be 5'
    xminNp = boxes[:,0]
    yminNp = boxes[:,1]
    xmaxNp = boxes[:,2]
    ymaxNp = boxes[:,3]
    scores = boxes[:,4]
    # Area of every box
    areas = (xmaxNp-xminNp)*(ymaxNp-yminNp)
    # Sort the boxes by score in ascending order
    scoresSorted = sorted(list(enumerate(scores)),key = lambda item:item[1])
    # Original indices of the sorted boxes
    index = [ item[0] for item in scoresSorted ]
    pick = []
    while index:
        # The last entry of index is the highest-scoring survivor: add it to pick
        lastInd = index[-1]
        pick.append(lastInd)
        # Compute its IoU with all remaining boxes and keep those below the threshold
        index = iou(xminNp,yminNp,xmaxNp,ymaxNp,areas,lastInd,index[:-1],threshold)
    return pick

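if __name__ == '__main__':
    # Illustrative sample data (not from the original post);
    # each row is [xmin,ymin,xmax,ymax,score]
    boxes = np.array([[0,0,10,10,0.9],[1,1,11,11,0.8],[20,20,30,30,0.7]])
    threshold = 0.5
    print(nms(boxes,threshold)) # prints [0, 2]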
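For comparison, the same greedy procedure can be written in a more compact, fully vectorized form. This is only a sketch under stated assumptions, not the R-CNN code itself: the name `greedy_nms`, the use of `np.argsort`, and the sample boxes are all choices made for this example.

```python
import numpy as np


def greedy_nms(boxes, iou_threshold):
    """Greedy NMS on an (n, 5) array of [xmin, ymin, xmax, ymax, score] rows.

    Returns the indices of the kept boxes, highest score first.
    """
    x1, y1, x2, y2, scores = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    # Indices sorted by descending score.
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Intersection of the best box with every remaining box.
        w = np.maximum(0.0, np.minimum(x2[best], x2[rest]) - np.maximum(x1[best], x1[rest]))
        h = np.maximum(0.0, np.minimum(y2[best], y2[rest]) - np.maximum(y1[best], y1[rest]))
        inter = w * h
        iou = inter / (areas[best] + areas[rest] - inter)
        # Survivors are the boxes that do not overlap the best box too much.
        order = rest[iou <= iou_threshold]
    return keep


# Made-up sample: boxes 0 and 1 overlap heavily (IoU = 81 / 119), box 2 is far away.
sample = np.array([
    [0, 0, 10, 10, 0.9],
    [1, 1, 11, 11, 0.8],
    [20, 20, 30, 30, 0.7],
], dtype=float)
print(greedy_nms(sample, 0.5))
```

With a threshold of 0.5, box 1 is suppressed by box 0 and the far-away box 2 survives. Raising the threshold above 81/119 (about 0.68) keeps all three.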

2. soft-nms

import copy
import numpy as np

def iou(xminNp,yminNp,xmaxNp,ymaxNp,scores,areas,remainInds,maxGlobalInd,Nt,sigma,threshold, method):
    remainInds = np.array(remainInds)
    # Compare the box at maxGlobalInd with all remaining boxes and
    # compute the corner coordinates of each intersection rectangle.
    # np.maximum([3,1,4,2],3) gives array([3,3,4,3])
    xminNpTmp = np.maximum(xminNp[maxGlobalInd], xminNp[remainInds])
    yminNpTmp = np.maximum(yminNp[maxGlobalInd], yminNp[remainInds])
    # The max corner of an intersection is the smaller of the two max corners
    xmaxNpTmp = np.minimum(xmaxNp[maxGlobalInd], xmaxNp[remainInds])
    ymaxNpTmp = np.minimum(ymaxNp[maxGlobalInd], ymaxNp[remainInds])
    # Width and height of every intersection
    w = np.maximum(0.0,xmaxNpTmp-xminNpTmp)
    h = np.maximum(0.0,ymaxNpTmp-yminNpTmp)
    # Compute the IoU
    # array([1,2,3,4]) * array([1,2,3,4]) gives array([1,4,9,16])
    inter = w*h
    iouValue = inter/(areas[remainInds]+areas[maxGlobalInd]-inter)
    # Compute the score weights according to the chosen method
    weight = np.ones_like(iouValue)
    if method == 'linear': # linear
        # weight = 1 - iou where iou > Nt, otherwise 1
        weight = weight - iouValue
        weight[iouValue <= Nt] = 1
    elif method == 'gaussian':
        weight = np.exp(-(iouValue*iouValue)/sigma)
    else: # original NMS
        weight[iouValue > Nt] = 0
    # Update the scores (modifies the scores array in place)
    scores[remainInds] = weight*scores[remainInds]
    # Drop the boxes whose score is not above the threshold
    remainInds = remainInds[scores[remainInds] > threshold]
    return remainInds.tolist(),scores

def soft_nms(boxes, threshold, sigma, Nt, method):
    '''
    boxes: an n-by-5 matrix, where n is the number of boxes and each row is
    [xmin,ymin,xmax,ymax,score]
    # 1 - find the highest-scoring box and put it into the result set;
    # 2 - compare it with the remaining boxes and re-weight their scores;
    # 3 - drop the boxes whose score falls below the minimum threshold;
    # 4 - find the highest-scoring box among the rest and repeat;
    # 5 - return the result set
    '''
    assert isinstance(boxes,np.ndarray),'boxes must be a numpy array'
    assert boxes.shape[1] == 5,'the column dimension should be 5'
    pick = []
    xminNp = boxes[:,0]
    yminNp = boxes[:,1]
    xmaxNp = boxes[:,2]
    ymaxNp = boxes[:,3]
    scores = copy.deepcopy(boxes[:,4]) # these scores are updated repeatedly
    remainInds = list(range(len(scores))) # entries move into the result set or get dropped
    # Area of every box
    areas = (xmaxNp-xminNp)*(ymaxNp-yminNp)
    while remainInds:
        # 1 - find the highest-scoring box and put it into the result set
        maxLocalInd = np.argmax(scores[remainInds])
        maxGlobalInd = remainInds[maxLocalInd]
        pick.append(maxGlobalInd)
        # 2 - remove that box from the remaining indices
        remainInds.pop(maxLocalInd)
        if not remainInds: break
        # 3 - update scores and remainInds
        remainInds,scores = iou(xminNp,yminNp,xmaxNp,ymaxNp,scores,areas,remainInds,maxGlobalInd,Nt,sigma,threshold, method)
    return pick

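if __name__ == '__main__':
    # Illustrative sample data (not from the original post);
    # each row is [xmin,ymin,xmax,ymax,score]
    boxes = np.array([[0,0,10,10,0.9],[1,1,11,11,0.8],[20,20,30,30,0.7]])
    # threshold=0.001, sigma=0.5, Nt=0.3
    print(soft_nms(boxes, 0.001, 0.5, 0.3, 'linear'))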
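To see how the three re-weighting rules in the code above differ, the sketch below isolates just the weight computation. The function name `rescoring_weight` and its default values `Nt=0.3` and `sigma=0.5` are assumptions for this example (the defaults mirror the values passed in the call above). Hard NMS zeroes the score of any box with IoU above `Nt`, the linear rule scales it by `1 - IoU`, and the Gaussian rule decays every score smoothly by `exp(-IoU^2 / sigma)`.

```python
import numpy as np


def rescoring_weight(iou, method, Nt=0.3, sigma=0.5):
    """Multiplicative score weight for a neighbor with the given IoU (hypothetical helper)."""
    iou = np.asarray(iou, dtype=float)
    if method == 'linear':
        # Scores are scaled by (1 - IoU) only when the overlap exceeds Nt.
        return np.where(iou > Nt, 1.0 - iou, 1.0)
    if method == 'gaussian':
        # Smooth decay that applies to every neighbor, with no hard cut-off.
        return np.exp(-(iou ** 2) / sigma)
    # Original NMS: remove any box that overlaps more than Nt.
    return np.where(iou > Nt, 0.0, 1.0)


ious = np.array([0.1, 0.5, 0.9])
for method in ('hard', 'linear', 'gaussian'):
    print(method, rescoring_weight(ious, method))
```

A heavily overlapping box keeps a small but nonzero score under the linear and Gaussian rules, so it is removed only if that score drops below the final score threshold.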

3. softer-nms

References:

Non-maximum suppression
[first proposal of NMS] Rosenfeld A, Thurston M. Edge and curve detection for visual scene analysis[J]. IEEE Transactions on Computers, 1971 (5): 562-569.
Theodoridis S, Koutroumbas K. Pattern Recognition, 4th ed. Academic Press, 2009.
[soft-nms] Bodla N, Singh B, Chellappa R, et al. Soft-NMS -- improving object detection with one line of code[C]//Computer Vision (ICCV), 2017 IEEE International Conference on. IEEE, 2017: 5562-5570. [code]
[fitness nms] Tychsen-Smith L, Petersson L. Improving Object Localization with Fitness NMS and Bounded IoU Loss[J]. arXiv preprint arXiv:1711.00164, 2017.
[learning NMS] Hosang J H, Benenson R, Schiele B. Learning non-maximum suppression. In CVPR, pages 6469-6477, 2017.
[softer-nms] He Y, Zhang X, Savvides M, et al. Softer-NMS: Rethinking Bounding Box Regression for Accurate Object Detection[J]. arXiv preprint arXiv:1809.08545, 2018.
