[Original] Faster R-CNN Paper Translation
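Faster R-CNN is the crowning collaboration of a group of friendly rivals who had finished trading blows in their earlier papers. Only a small part of the paper is translated here, mainly because it is a simple improvement on the earlier papers in this series: the method is clear and the idea is natural. What idea? Replacing selective search, the region proposal algorithm that obviously should have been swapped out long ago but that the authors kept putting off, like squeezing toothpaste out a little at a time. Building on Fast R-CNN, Faster R-CNN replaces the region proposal step with a neural network, and that network shares its convolutional layers with the Fast R-CNN detection network, which greatly cuts computation time. mAP also goes up another notch. As I said before, they must have been squeezing the toothpaste.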
Faster R-CNN: Towards Real-Time Object
Detection with Region Proposal Networks
Abstract
1. Introduction

2 Related Work
3 FASTER R-CNN

3.1 Region Proposal Networks

3.1.1 Anchors
Translation-Invariant Anchors
Multi-Scale Anchors as Regression References
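As a rough illustration of the anchor idea named in these headings, here is a minimal sketch of anchor generation. It uses the default settings reported in the original paper (3 scales with box areas of 128², 256², and 512² pixels, 3 aspect ratios of 1:2, 1:1, and 2:1, and a feature stride of 16). The function names, the height/width convention for the ratio, and the choice to center each anchor on the middle of a cell are assumptions made for this sketch, not details taken from the authors' code.

```python
import math


def make_anchors(center_x, center_y,
                 scales=(128, 256, 512),
                 ratios=(0.5, 1.0, 2.0)):
    """Return k = len(scales) * len(ratios) anchor boxes at one position.

    Each box is (x1, y1, x2, y2). An anchor with scale s has area s * s;
    ratio means height / width (a convention chosen for this sketch).
    """
    anchors = []
    for s in scales:
        area = float(s * s)
        for r in ratios:
            w = math.sqrt(area / r)  # width shrinks as the box gets taller
            h = w * r                # so that w * h stays equal to area
            anchors.append((center_x - w / 2, center_y - h / 2,
                            center_x + w / 2, center_y + h / 2))
    return anchors


def anchors_for_feature_map(feat_h, feat_w, stride=16):
    """Repeat the same anchor set at every feature-map cell.

    Using an identical set at every position is what makes the anchors
    translation invariant.
    """
    all_anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Map the cell back to image coordinates. Centering on the
            # middle of the cell is an assumption of this sketch.
            cx = col * stride + stride / 2
            cy = row * stride + stride / 2
            all_anchors.extend(make_anchors(cx, cy))
    return all_anchors
```

With these defaults, `anchors_for_feature_map(40, 60)` yields 40 × 60 × 9 = 21,600 boxes, in line with the roughly 20,000 anchors the paper quotes for a 1000 × 600 image.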
3.1.2 Loss Function
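For reference, the multi-task loss that the original paper defines for an RPN minibatch can be written as below. The notation is restated from the paper rather than from this page: $i$ indexes the anchors in a minibatch, $p_i$ is the predicted probability that anchor $i$ is an object, $p_i^*$ is its ground-truth label (1 for a positive anchor, 0 for a negative one), and $t_i$, $t_i^*$ are the parameterized coordinates of the predicted box and of the ground-truth box assigned to a positive anchor. $L_{cls}$ is the log loss over object versus not object, and $L_{reg}$ is the smooth L1 loss from Fast R-CNN.

```latex
L(\{p_i\}, \{t_i\}) =
    \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*)
  + \lambda \, \frac{1}{N_{reg}} \sum_i p_i^* \, L_{reg}(t_i, t_i^*)

% Box parameterization relative to an anchor (x_a, y_a, w_a, h_a):
t_x = (x - x_a) / w_a, \qquad t_y = (y - y_a) / h_a,
t_w = \log(w / w_a),   \qquad t_h = \log(h / h_a)

% The targets t^* use the ground-truth box (x^*, y^*, w^*, h^*) in place of (x, y, w, h).
```

The factor $p_i^*$ switches the regression term on only for positive anchors, so negative anchors contribute to classification alone.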




3.1.3 Training RPNs
3.2 Sharing Features Between RPN and Fast R-CNN
3.3 Implementation Details
4 EXPERIMENTS
5 CONCLUSION