XiangBai_CVPR2018_Rotation-Sensitive Regression for Oriented Scene Text Detection

Authors and code

Keywords

Text detection, multi-oriented, SSD, $$xywh\theta$$, one-stage, open source

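Method highlights

The core idea is that classification is insensitive to rotation whereas regression is sensitive to it, so the two tasks should not use the same features. The authors therefore build on a rotation-aware CNN: the features are computed at several rotation angles and used for box regression, while the classification branch applies oriented response pooling across the orientation axis, which makes it insensitive to rotation.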

Text coordinates are sensitive to text orientation. Therefore, the regression of coordinate offsets should be performed on rotation-sensitive features.

In contrast to regression, the classification of text presence should be rotation-invariant, i.e., text regions of arbitrary orientations should be classified as positive.

Figure 1: Visualization of feature maps and results of baseline and RRD. Red numbers are the classification scores. (b): the shared feature map for both regression and classification; (c): the result of shared feature; (d) and (e): the regression feature map and classification feature map of RRD; (f): the result of RRD.

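First use of Oriented Response Convolution for text detection.

Method overview

The method is a modification of SSD. Besides changing the output to predict the coordinate offsets of four points so that slanted text can be detected, it uses ORN to extract rotation-sensitive text features, and then adds a max pooling in the classification branch to obtain features that are rotation-insensitive for classification.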

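Method details

Network architecture

The architecture is adapted from SSD. The difference is that the multi-layer side connections, which are plain convolutions in the original, are replaced here by RSR. Each RSR has two parts. The first replaces the convolution with oriented convolutions at several orientations. The second performs prediction and consists of two branches, regression and classification. The classification branch differs in that it has an additional oriented response pooling.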

Figure 2: Architecture of RRD. (a) The rotation-sensitive backbone follows the main architecture of SSD while changing its convolution into oriented response convolution. (b) The outputs of rotation-sensitive backbone are rotation-sensitive feature maps, followed by two branches: one for regression and another for classification based on oriented response pooling. Note that the inception block is optional.

ORN (Oriented Response Networks)

Purpose: extract rotation-sensitive convolutional features by using active rotating filters (ARF).

Source of the method: Y. Zhou, Q. Ye, Q. Qiu, and J. Jiao. Oriented response networks. In CVPR, 2017.

GitHub link: https://github.com/ZhouYanzhao/ORN

Main idea:
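
As a rough illustration of how rotated copies of one filter give orientation-specific responses, the sketch below rotates a single 2D kernel in 90° steps and correlates each copy with an image. It is a simplification for illustration only, not the ORN implementation: the function names are hypothetical, only four orientations are used so that `np.rot90` is exact, and the multiple orientation channels that real ARFs carry are omitted.

```python
import numpy as np


def rotated_copies(kernel, n_orient=4):
    """Stack n_orient copies of a square kernel, each rotated a further 90 degrees."""
    return np.stack([np.rot90(kernel, k) for k in range(n_orient)])


def correlate_valid(image, kernel):
    """Plain 'valid' cross-correlation of a 2D image with a 2D kernel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def oriented_responses(image, kernel, n_orient=4):
    """Return one response map per rotated copy of the kernel: (n_orient, H', W')."""
    return np.stack([correlate_valid(image, k)
                     for k in rotated_copies(kernel, n_orient)])
```

Rotating the image and the kernel together rotates the response map, so each slice of the stacked output responds to a different orientation of the same pattern.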

Rotation-Invariant Classification

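Put simply, the results for all orientations are reduced to a pixel-wise maximum. If the text has a particular orientation, the response for that orientation should be large, so the maximum picks out the features of that orientation. The original feature contains several orientations, but given the feature alone we do not know which orientation is the relevant one and cannot select it individually; taking the max extracts it whatever the orientation is.

The rotation-sensitive feature maps are pooled along their depth axis.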
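
A minimal sketch of this pooling in NumPy follows. The function name and the channel layout (the `n_orient` orientation channels of each filter stored contiguously along the depth axis) are assumptions made for illustration, not details taken from the paper's code.

```python
import numpy as np


def oriented_response_pooling(features, n_orient):
    """Max-pool over orientation channels.

    features: array of shape (C * n_orient, H, W); the n_orient orientation
              responses of each of the C filters are assumed to be adjacent.
    returns:  array of shape (C, H, W), the pixel-wise max over orientations.
    """
    depth, height, width = features.shape
    if depth % n_orient != 0:
        raise ValueError("depth must be a multiple of n_orient")
    grouped = features.reshape(depth // n_orient, n_orient, height, width)
    return grouped.max(axis=1)
```

Because the maximum ignores which orientation channel fired, permuting the orientation channels within a filter (which is roughly what rotating the input does) leaves the pooled feature unchanged.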

Default Boxes

Boxes are represented as quadrilaterals with four vertices. The final prediction is the offsets of the four vertex coordinates.
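
One way to parameterize such offsets is sketched below. It assumes a horizontal default box `(cx, cy, w, h)` whose four corners serve as reference points, with offsets normalized by the box width and height; this normalization and all function names are assumptions in the style of SSD-like detectors and are not confirmed by these notes.

```python
import numpy as np


def rect_to_quad(cx, cy, w, h):
    """Corners of a horizontal box, clockwise from top-left, shape (4, 2)."""
    return np.array([[cx - w / 2, cy - h / 2],
                     [cx + w / 2, cy - h / 2],
                     [cx + w / 2, cy + h / 2],
                     [cx - w / 2, cy + h / 2]])


def encode_quad(gt_quad, default_box):
    """Offsets of the ground-truth vertices from the default-box corners."""
    cx, cy, w, h = default_box
    scale = np.array([w, h], dtype=float)  # assumed normalization
    return (np.asarray(gt_quad, dtype=float) - rect_to_quad(cx, cy, w, h)) / scale


def decode_quad(offsets, default_box):
    """Inverse of encode_quad: recover the quadrilateral vertices."""
    cx, cy, w, h = default_box
    scale = np.array([w, h], dtype=float)
    return rect_to_quad(cx, cy, w, h) + np.asarray(offsets, dtype=float) * scale
```

Since each offset is tied to a specific corner, the order of the ground-truth vertices has to be consistent with the order of the default-box corners.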

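Training

The authors argue that the choice of the first point is important; the paper uses the method provided by TextBoxes++ to determine the first point.
To simplify the IoU computation, the IoU of the enclosing bounding rectangles is used directly.
Loss function = classification (2-class softmax loss) + regression (smooth L1 loss)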
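
The two computations above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the bounding rectangles are taken to be axis-aligned, and the loss simply sums the two terms for one box, leaving out any weighting or normalization the paper may apply.

```python
import numpy as np


def bounding_rect(quad):
    """Axis-aligned rectangle (x0, y0, x1, y1) enclosing a (4, 2) quadrilateral."""
    quad = np.asarray(quad, dtype=float)
    x0, y0 = quad.min(axis=0)
    x1, y1 = quad.max(axis=0)
    return x0, y0, x1, y1


def rect_iou(a, b):
    """IoU of two axis-aligned rectangles given as (x0, y0, x1, y1)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def smooth_l1(x):
    """Element-wise smooth L1: 0.5 * x^2 if |x| < 1, else |x| - 0.5."""
    x = np.abs(np.asarray(x, dtype=float))
    return np.where(x < 1.0, 0.5 * x ** 2, x - 0.5)


def softmax_cross_entropy(logits, label):
    """Softmax loss for one box; label is 0 (background) or 1 (text)."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max()  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]


def box_loss(logits, label, pred_offsets, target_offsets):
    """Classification loss plus, for positive boxes only, regression loss."""
    loss = softmax_cross_entropy(logits, label)
    if label == 1:
        diff = np.asarray(pred_offsets, dtype=float) - np.asarray(target_offsets, dtype=float)
        loss += smooth_l1(diff).sum()
    return loss
```

For example, two 2×2 squares shifted by one unit horizontally overlap in a 1×2 strip, so `rect_iou(bounding_rect(a), bounding_rect(b))` gives 2 / 6 = 1/3.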

Experimental results

Ablation experiments

Baseline: architecture without inception block, using shared conventional feature maps for both regression and classification;

Baseline+inc: baseline architecture using inception blocks;

Baseline+inc+rs: architecture with inception block, using rotation-sensitive features for both regression and classification;

Baseline+inc+rs+rotInvar: the proposed RRD. Note that for word-based datasets, inception block is not applied and we also name it RRD.

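Results on RCTW-17, ICDAR2015, and MSRA-TD500

Results at different IoU thresholds

Results on ICDAR2013

Results on another detection dataset (ships, HRSC2016)

Common ambiguities in current text detection results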

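Summary and takeaways

The key idea of this paper is somewhat similar to R-FCN: detection is sensitive to translation and rotation, whereas classification is not. The paper therefore uses a max pooling to remove the rotation sensitivity of the classification features. It is also the first paper to bring oriented response networks into OCR text detection.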
