An intriguing failing of convolutional neural networks and the CoordConv solution

NeurIPS 2018

2019-10-10 15:01:48

Paper: https://arxiv.org/pdf/1807.03247.pdf

Official TensorFlow Code: https://github.com/uber-research/coordconv

Unofficial PyTorch Code: https://github.com/walsvid/CoordConv

 

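Synced (机器之心): Convolutional neural networks fall short, and CoordConv fills the gap: https://zhuanlan.zhihu.com/p/39665894

Uber proposes CoordConv: solving the coordinate transform problem of ordinary CNNs: https://zhuanlan.zhihu.com/p/39919038

CoordConv, meant to rescue CNNs, gets mocked: do you really need training just to translate a coordinate? https://zhuanlan.zhihu.com/p/39841356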

1. Given a feature map and a coordinate (x, y), how do we generate the corresponding relative coordinate map?

The following code is from: [ICCV19] AdaptIS: Adaptive Instance Selection Network [Github]

    import mxnet as mx
    from mxnet import gluon

    # Excerpt: a method of the model class (the class header is not shown here).
    def get_instances_maps(self, F, points, adaptive_input, controller_input):
        if isinstance(points, mx.nd.NDArray):
            self.num_points = points.shape[1]

        if getattr(self.controller_net, 'return_map', False):
            w = self.eqf(controller_input, points)
        else:
            w = self.eqf(controller_input, points)
            w = self.controller_net(w)

        points = F.reshape(points, shape=(-1, 2))
        x = F.repeat(adaptive_input, self.num_points, axis=0)
        # Prepend the relative coordinate channels to the features.
        x = self.add_coord_features(x, points)

        x = self.block0(x)
        x = self.adain(x, w)
        x = self.block1(x)

        return x


    class AppendCoordFeatures(gluon.HybridBlock):
        def __init__(self, norm_radius, append_dist=True, spatial_scale=1.0):
            super(AppendCoordFeatures, self).__init__()
            self.xs = None
            self.spatial_scale = spatial_scale
            self.norm_radius = norm_radius
            self.append_dist = append_dist

        def _ctx_kwarg(self, x):
            if isinstance(x, mx.nd.NDArray):
                return {"ctx": x.context}
            return {}

        def get_coord_features(self, F, points, rows, cols, batch_size, **ctx_kwarg):
            row_array = F.arange(start=0, stop=rows, step=1, **ctx_kwarg)
            col_array = F.arange(start=0, stop=cols, step=1, **ctx_kwarg)
            coord_rows = F.repeat(F.reshape(row_array, (1, 1, rows, 1)), repeats=cols, axis=3)
            coord_cols = F.repeat(F.reshape(col_array, (1, 1, 1, cols)), repeats=rows, axis=2)

            coord_rows = F.repeat(coord_rows, repeats=batch_size, axis=0)
            coord_cols = F.repeat(coord_cols, repeats=batch_size, axis=0)

            coords = F.concat(coord_rows, coord_cols, dim=1)

            add_xy = F.reshape(points * self.spatial_scale, shape=(0, 0, 1))
            add_xy = F.reshape(F.repeat(add_xy, rows * cols, axis=2),
                               shape=(0, 0, rows, cols))

            coords = (coords - add_xy) / (self.norm_radius * self.spatial_scale)
            if self.append_dist:
                dist = F.sqrt(F.sum(F.square(coords), axis=1, keepdims=1))
                coord_features = F.concat(coords, dist, dim=1)
            else:
                coord_features = coords

            coord_features = F.clip(coord_features, a_min=-1, a_max=1)
            return coord_features

        def hybrid_forward(self, F, x, coords):
            if isinstance(x, mx.nd.NDArray):
                self.xs = x.shape

            batch_size, rows, cols = self.xs[0], self.xs[2], self.xs[3]
            coord_features = self.get_coord_features(F, coords, rows, cols, batch_size, **self._ctx_kwarg(x))

            return F.concat(coord_features, x, dim=1)
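
In formula form, `get_coord_features` computes the following for each pixel (i, j). The symbols are shorthand introduced here for illustration and do not come from the AdaptIS code: (p_r, p_c) is the point, s is `spatial_scale`, and R is `norm_radius`. The distance channel d is only present when `append_dist` is true.

```latex
\begin{aligned}
\Delta_r(i, j) &= \frac{i - s\,p_r}{R\,s}, \qquad
\Delta_c(i, j) = \frac{j - s\,p_c}{R\,s}, \\
d(i, j) &= \sqrt{\Delta_r(i, j)^2 + \Delta_c(i, j)^2}, \\
\mathrm{features}(i, j) &= \operatorname{clip}\bigl([\Delta_r,\ \Delta_c,\ d],\ -1,\ 1\bigr).
\end{aligned}
```

For example, with the point (61, 71), s = 1 and R = 42, the pixel (95, 95) gives Δ_r = 34/42 ≈ 0.8095, Δ_c = 24/42 ≈ 0.5714 and d ≈ 0.9909.
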

The same function again, annotated with values printed from a pdb session (point (61, 71), a 96 x 96 feature map, batch size 1, norm_radius 42):

    def get_coord_features(self, F, points, rows, cols, batch_size, **ctx_kwarg):

        # (Pdb) points, rows, cols, batch_size
        # ([[61. 71.]] <NDArray 1x2 @gpu(0)>, 96, 96, 1)

        # row_array and col_array:
        # [ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15. 16. 17.
        #  18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35.
        #  36. 37. 38. 39. 40. 41. 42. 43. 44. 45. 46. 47. 48. 49. 50. 51. 52. 53.
        #  54. 55. 56. 57. 58. 59. 60. 61. 62. 63. 64. 65. 66. 67. 68. 69. 70. 71.
        #  72. 73. 74. 75. 76. 77. 78. 79. 80. 81. 82. 83. 84. 85. 86. 87. 88. 89.
        #  90. 91. 92. 93. 94. 95.]
        # <NDArray 96 @gpu(0)>

        # (Pdb) coord_rows
        # [[[[ 0.  0.  0. ...  0.  0.  0.]
        #    [ 1.  1.  1. ...  1.  1.  1.]
        #    [ 2.  2.  2. ...  2.  2.  2.]
        #    ...
        #    [93. 93. 93. ... 93. 93. 93.]
        #    [94. 94. 94. ... 94. 94. 94.]
        #    [95. 95. 95. ... 95. 95. 95.]]]]
        # <NDArray 1x1x96x96 @gpu(0)>

        # (Pdb) coord_cols
        # [[[[ 0.  1.  2. ... 93. 94. 95.]
        #    [ 0.  1.  2. ... 93. 94. 95.]
        #    [ 0.  1.  2. ... 93. 94. 95.]
        #    ...
        #    [ 0.  1.  2. ... 93. 94. 95.]
        #    [ 0.  1.  2. ... 93. 94. 95.]
        #    [ 0.  1.  2. ... 93. 94. 95.]]]]
        # <NDArray 1x1x96x96 @gpu(0)>

        # (Pdb) add_xy
        # [[[[61. 61. 61. ... 61. 61. 61.]
        #    [61. 61. 61. ... 61. 61. 61.]
        #    [61. 61. 61. ... 61. 61. 61.]
        #    ...
        #    [61. 61. 61. ... 61. 61. 61.]
        #    [61. 61. 61. ... 61. 61. 61.]
        #    [61. 61. 61. ... 61. 61. 61.]]
        #
        #   [[71. 71. 71. ... 71. 71. 71.]
        #    [71. 71. 71. ... 71. 71. 71.]
        #    [71. 71. 71. ... 71. 71. 71.]
        #    ...
        #    [71. 71. 71. ... 71. 71. 71.]
        #    [71. 71. 71. ... 71. 71. 71.]
        #    [71. 71. 71. ... 71. 71. 71.]]]]
        # <NDArray 1x2x96x96 @gpu(0)>

        # (Pdb) if self.append_dist, then coord_features is:
        # [[[[-1.         -1.         -1.         ... -1.         -1.         -1.        ]
        #    [-1.         -1.         -1.         ... -1.         -1.         -1.        ]
        #    [-1.         -1.         -1.         ... -1.         -1.         -1.        ]
        #    ...
        #    [ 0.7619048   0.7619048   0.7619048  ...  0.7619048   0.7619048   0.7619048 ]
        #    [ 0.78571427  0.78571427  0.78571427 ...  0.78571427  0.78571427  0.78571427]
        #    [ 0.8095238   0.8095238   0.8095238  ...  0.8095238   0.8095238   0.8095238 ]]
        #
        #   [[-1.         -1.         -1.         ...  0.52380955  0.54761904  0.5714286 ]
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904  0.5714286 ]
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904  0.5714286 ]
        #    ...
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904  0.5714286 ]
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904  0.5714286 ]
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904  0.5714286 ]]
        #
        #   [[ 1.          1.          1.         ...  1.          1.          1.        ]
        #    [ 1.          1.          1.         ...  1.          1.          1.        ]
        #    [ 1.          1.          1.         ...  1.          1.          1.        ]
        #    ...
        #    [ 1.          1.          1.         ...  0.9245947   0.9382886   0.95238096]
        #    [ 1.          1.          1.         ...  0.944311    0.9577231   0.9715336 ]
        #    [ 1.          1.          1.         ...  0.96421224  0.97735125  0.99088824]]]]
        # <NDArray 1x3x96x96 @gpu(0)>

        pdb.set_trace()
        row_array = F.arange(start=0, stop=rows, step=1, **ctx_kwarg)  ## (96,)
        col_array = F.arange(start=0, stop=cols, step=1, **ctx_kwarg)  ## (96,)
        coord_rows = F.repeat(F.reshape(row_array, (1, 1, rows, 1)), repeats=cols, axis=3)
        coord_cols = F.repeat(F.reshape(col_array, (1, 1, 1, cols)), repeats=rows, axis=2)

        coord_rows = F.repeat(coord_rows, repeats=batch_size, axis=0)
        coord_cols = F.repeat(coord_cols, repeats=batch_size, axis=0)

        coords = F.concat(coord_rows, coord_cols, dim=1)  ## (1, 2, 96, 96)

        add_xy = F.reshape(points * self.spatial_scale, shape=(0, 0, 1))  ## [[[61.] [71.]]] <NDArray 1x2x1 @gpu(0)>
        add_xy = F.reshape(F.repeat(add_xy, rows * cols, axis=2), shape=(0, 0, rows, cols))

        ## self.norm_radius: 42
        coords = (coords - add_xy) / (self.norm_radius * self.spatial_scale)  ## <NDArray 1x2x96x96 @gpu(0)>
        if self.append_dist:
            dist = F.sqrt(F.sum(F.square(coords), axis=1, keepdims=1))  ## <NDArray 1x1x96x96 @gpu(0)>
            coord_features = F.concat(coords, dist, dim=1)
        else:
            coord_features = coords

        coord_features = F.clip(coord_features, a_min=-1, a_max=1)
        return coord_features
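
To check the numbers in the pdb output without a GPU or MXNet, the same computation can be sketched in NumPy with broadcasting. This is an illustrative re-implementation, not the AdaptIS code: the function name `relative_coord_features_np` is made up here, and the point is assumed to be given as (row, col), which is what the printed `add_xy` and `coord_features` values imply.

```python
import numpy as np


def relative_coord_features_np(point, rows, cols, norm_radius,
                               spatial_scale=1.0, append_dist=True):
    """Return a (C, rows, cols) array of normalized offsets to `point`.

    C is 3 with the distance channel, otherwise 2. Values are clipped
    to [-1, 1], mirroring F.clip in the MXNet code.
    """
    p_row, p_col = point
    scale = norm_radius * spatial_scale

    # Column vector of row indices and row vector of column indices.
    row_idx = np.arange(rows, dtype=np.float32).reshape(rows, 1)
    col_idx = np.arange(cols, dtype=np.float32).reshape(1, cols)

    # Offsets to the point, normalized by the radius, expanded to (rows, cols).
    d_row = np.broadcast_to((row_idx - p_row * spatial_scale) / scale, (rows, cols))
    d_col = np.broadcast_to((col_idx - p_col * spatial_scale) / scale, (rows, cols))

    channels = [d_row, d_col]
    if append_dist:
        # Euclidean distance to the point, computed before clipping.
        channels.append(np.sqrt(d_row ** 2 + d_col ** 2))

    return np.clip(np.stack(channels, axis=0), -1.0, 1.0)


# Same setting as the pdb session: point (61, 71), 96 x 96 map, norm_radius 42.
features = relative_coord_features_np((61, 71), rows=96, cols=96, norm_radius=42)
print(features.shape)        # (3, 96, 96)
print(features[0, 93, 0])    # about 0.7619, i.e. (93 - 61) / 42
print(features[1, 0, 95])    # about 0.5714, i.e. (95 - 71) / 42
print(features[2, 95, 95])   # about 0.9909
```

The printed values agree with the last rows and columns of the `coord_features` dump above. Broadcasting replaces the explicit `F.repeat` calls, so no full-size index grids need to be materialized before the subtraction.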

I also wrote a PyTorch version based on the MXNet version:

    import torch
    import torch.nn as nn


    class AddCoords(nn.Module):
        # Note: unlike the MXNet version, the offsets are not divided by
        # norm_radius and no distance channel is appended.

        def __init__(self):
            super().__init__()

        def forward(self, input_tensor, points):
            # input_tensor: (C, x_dim, y_dim); points: 2 values, ordered like the two spatial dims
            _, x_dim, y_dim = input_tensor.size()
            batch_size = 1
            device = input_tensor.device

            xx_channel = torch.arange(x_dim, device=device).repeat(1, y_dim, 1)                  ## torch.Size([1, 9, 9])
            yy_channel = torch.arange(y_dim, device=device).repeat(1, x_dim, 1).transpose(1, 2)  ## torch.Size([1, 9, 9])

            xx_channel = xx_channel.repeat(batch_size, 1, 1, 1).transpose(2, 3)
            yy_channel = yy_channel.repeat(batch_size, 1, 1, 1).transpose(2, 3)

            coords = torch.cat((xx_channel, yy_channel), dim=1).float()  ## torch.Size([1, 2, 9, 9])

            add_xy = torch.reshape(points, (1, 2, 1)).float().to(device)  ## torch.Size([1, 2, 1])
            add_xy_ = add_xy.repeat(1, 1, x_dim * y_dim)                  ## torch.Size([1, 2, 81])
            add_xy_ = torch.reshape(add_xy_, (1, 2, x_dim, y_dim))        ## torch.Size([1, 2, 9, 9])

            coords = coords - add_xy_                    ## torch.Size([1, 2, 9, 9])
            # torch.clamp avoids the NumPy round trip and keeps the result on the input's device.
            coord_features = torch.clamp(coords, -1, 1)  ## torch.Size([1, 2, 9, 9])
            return coord_features
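
A batched PyTorch variant that also normalizes by a radius and can append the distance channel, as the MXNet block does, might look like the sketch below. It is a hedged illustration rather than a port of anyone's code: the function name `relative_coord_features_torch`, its defaults, and the (row, col) ordering of `points` are assumptions made for this example.

```python
import torch


def relative_coord_features_torch(feature_map, points, norm_radius=1.0,
                                  append_dist=False):
    """feature_map: (B, C, H, W); points: (B, 2) given as (row, col).

    Returns (B, 2, H, W), or (B, 3, H, W) with the distance channel,
    clamped to [-1, 1].
    """
    b, _, h, w = feature_map.shape
    device = feature_map.device

    rows = torch.arange(h, device=device, dtype=torch.float32).reshape(1, h, 1)
    cols = torch.arange(w, device=device, dtype=torch.float32).reshape(1, 1, w)
    points = points.to(device=device, dtype=torch.float32)

    # Broadcast the per-sample point against the index vectors.
    d_row = ((rows - points[:, 0].reshape(b, 1, 1)) / norm_radius).expand(b, h, w)
    d_col = ((cols - points[:, 1].reshape(b, 1, 1)) / norm_radius).expand(b, h, w)

    channels = [d_row, d_col]
    if append_dist:
        # Distance to the point, computed before clamping.
        channels.append(torch.sqrt(d_row ** 2 + d_col ** 2))

    return torch.stack(channels, dim=1).clamp(-1.0, 1.0)


# Usage: two samples, 8 feature channels, 9 x 9 maps, one point per sample.
feature_map = torch.zeros(2, 8, 9, 9)
points = torch.tensor([[4.0, 4.0], [0.0, 8.0]])
coords = relative_coord_features_torch(feature_map, points,
                                       norm_radius=4.0, append_dist=True)
out = torch.cat((coords, feature_map), dim=1)  # coordinate channels first
print(out.shape)  # torch.Size([2, 11, 9, 9])
```

Concatenating the coordinate channels in front of the features follows the order used by `hybrid_forward` in the MXNet block.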

 
