Pan He_ICCV2017_Single Shot Text Detector With Regional Attention

Authors and Code

Caffe code

Keywords

Text detection, multi-oriented, SSD, $xywh\theta$, one-stage, open source

Highlights

Attention mechanism to strengthen text features: Text Attentional Module (TAM)
Inception introduced to make the detector more robust to text size: Hierarchical Inception Module (HIM)

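Method Overview

The method improves on SSD by adding an angle term to each box, so that it can detect multi-oriented text. Its main contributions are an attention mechanism and Inception modules, both aimed at making the text features more robust.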
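To make the $xywh\theta$ box representation concrete, here is a minimal sketch of how a box given as center, size, and angle can be converted to four corner points. The function name `rbox_to_corners` and the angle convention (radians, standard 2D rotation about the box center) are assumptions for illustration; the released Caffe code may encode the angle differently.

```python
import math

def rbox_to_corners(cx, cy, w, h, theta):
    """Corners of a rotated box (hypothetical helper, not from the SSTD code).

    theta is in radians and rotates the box about its center (cx, cy)
    with the standard 2D rotation matrix.
    """
    c, s = math.cos(theta), math.sin(theta)
    # Corner offsets of the axis-aligned box, relative to its center.
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset, then translate back to the center.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]

# With theta = 0 the result is the ordinary axis-aligned box.
axis_aligned = rbox_to_corners(10.0, 5.0, 4.0, 2.0, 0.0)
# With theta = pi / 2 the width and height swap roles.
rotated = rbox_to_corners(10.0, 5.0, 4.0, 2.0, math.pi / 2)
```

With `theta = 0` the helper reduces to the usual SSD box, which is why adding one extra regression target is enough to extend the detector to oriented text.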

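Method Details

Network Architecture

The feature fusion layers of SSD are modified: a Text Attentional Module and a Hierarchical Inception Module are added, and Aggregated Inception Features (AIFs) are used for feature fusion.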

Aggregated Inception Features (AIFs)

Text Attentional Module

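The idea behind attention: the original features may describe the whole image globally, but by strengthening the features in text regions (adding supervision that re-weights the features of text areas), the text features become more pronounced, which helps both the classification and the regression task. Put simply, the network previously had to look at the whole image to make a decision; now it only needs to look more closely at the text regions.

Judging from the results, attention brings two benefits: stronger robustness to noise, and better separation of touching text instances.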

Figure 3: Text attention module. It computes a text attention map from Aggregated Inception Features (AIFs). The attention map indicates rough text regions and is further encoded into the AIFs. The attention module is trained by using a pixel-wise binary mask of text.

Figure 4: We compare detection results of the baseline model and the model with our text attention module (TAM), which enables the detector with stronger capability for identifying extremely challenging text with a higher word-level accuracy.
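The re-weighting described above can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: it assumes the attention branch outputs per-pixel background and text scores at the same resolution as the features, turns them into a text probability with a two-class softmax, and multiplies every feature channel by that probability. The names `text_probability` and `apply_text_attention` are hypothetical.

```python
import math

def text_probability(bg_logit, text_logit):
    """Two-class softmax, returning the probability of the text class."""
    m = max(bg_logit, text_logit)  # subtract the max for numerical stability
    e_bg = math.exp(bg_logit - m)
    e_text = math.exp(text_logit - m)
    return e_text / (e_bg + e_text)

def apply_text_attention(features, bg_logits, text_logits):
    """Weight each feature channel by the per-pixel text probability.

    features: list of C channels, each an H x W nested list.
    bg_logits, text_logits: H x W nested lists (assumed same resolution).
    """
    attention = [[text_probability(b, t) for b, t in zip(b_row, t_row)]
                 for b_row, t_row in zip(bg_logits, text_logits)]
    weighted = [[[v * a for v, a in zip(f_row, a_row)]
                 for f_row, a_row in zip(channel, attention)]
                for channel in features]
    return weighted, attention

# Toy input: 2 channels, 1 x 2 pixels; the left pixel looks like text.
features = [[[1.0, 1.0]], [[2.0, 2.0]]]
bg_logits = [[0.0, 5.0]]
text_logits = [[5.0, 0.0]]
weighted, attention = apply_text_attention(features, bg_logits, text_logits)
```

In this toy input the left pixel keeps almost all of its feature magnitude while the right pixel is suppressed, which is the "look more closely at the text regions" effect. Training the attention branch against a pixel-wise binary text mask, as the Figure 3 caption states, is what makes the probability map meaningful.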

Hierarchical Inception Module

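Inception fuses features computed with several different receptive fields, which makes the detector more robust to variations in text size.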

Figure 5: Inception module. The convolutional maps are processed through four different convolutional operations, with Dilated convolutions [34] applied.

Figure 6: Comparisons of baseline model and Hierarchical Inception Module (HIM) model. The HIM allows the detector to handle extremely challenging text, and also improves word-level detection accuracy.
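Why parallel branches with dilated convolutions cover several text sizes can be seen from the effective kernel size: a $k \times k$ kernel with dilation $d$ spans $d(k-1)+1$ input positions per side. The branch settings below are hypothetical values chosen for illustration, not the exact configuration from the paper.

```python
def effective_kernel_size(kernel_size, dilation):
    """Side length of the input window covered by a dilated convolution."""
    return dilation * (kernel_size - 1) + 1

# Hypothetical Inception-style branches: name -> (kernel size, dilation).
branches = {
    "1x1": (1, 1),
    "3x3": (3, 1),
    "3x3, dilation 3": (3, 3),
    "5x5": (5, 1),
}

# Window side covered by each branch on the shared input feature map.
coverage = {name: effective_kernel_size(k, d) for name, (k, d) in branches.items()}
```

Concatenating branches that cover different window sizes gives each output location access to both fine and coarse context, and dilation enlarges the window without adding weights.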

Other Details

The default box aspect ratios are changed from 1, 2, 3, 5, 7 to 1, 2, 3, 5, $\frac{1}{2}$, $\frac{1}{3}$, $\frac{1}{5}$
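As a rough illustration of what this ratio set means, the sketch below generates default box sizes with the usual SSD convention, width $= s\sqrt{a}$ and height $= s/\sqrt{a}$ for scale $s$ and aspect ratio $a$. That SSTD follows exactly this convention is an assumption, and `default_box_sizes` is a hypothetical helper.

```python
import math

# Aspect ratios (width / height) listed in the notes above.
ASPECT_RATIOS = [1, 2, 3, 5, 1 / 2, 1 / 3, 1 / 5]

def default_box_sizes(scale, aspect_ratios=ASPECT_RATIOS):
    """Return (width, height) pairs, SSD-style: every box has area scale**2."""
    return [(scale * math.sqrt(a), scale / math.sqrt(a)) for a in aspect_ratios]

# Box sizes for one feature-map scale, in the same units as `scale`.
sizes = default_box_sizes(0.2)
```

The reciprocal ratios add tall, narrow boxes, which helps match vertical or strongly rotated text that wide boxes alone would cover poorly.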

Experimental Results

Effect of TAM (+3), HIM (+2), and TAM+HIM (+5), validated on the ICDAR13 dataset

ICDAR2013 and ICDAR2015

COCO-Text

Speed
- TITAN X, Caffe, 0.13 s/image (about 7.7 FPS)

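Summary and Takeaways

The method in this paper mainly modifies the network architecture, adding attention and Inception modules to make the features more robust. The same idea could be applied to the feature fusion layers of other object detection frameworks.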
