An intriguing failing of convolutional neural networks and the CoordConv solution

NeurIPS 2018

2019-10-10 15:01:48

Paper: https://arxiv.org/pdf/1807.03247.pdf

Official TensorFlow Code: https://github.com/uber-research/coordconv

Unofficial PyTorch Code: https://github.com/walsvid/CoordConv

 

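Synced (机器之心): Where convolutional neural networks fall short, CoordConv fills the gap: https://zhuanlan.zhihu.com/p/39665894

Uber proposes CoordConv, solving the coordinate transform problem of ordinary CNNs: https://zhuanlan.zhihu.com/p/39919038

CoordConv, meant to rescue CNNs, gets mocked: do you really need training just to translate a coordinate? https://zhuanlan.zhihu.com/p/39841356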

1. Given a feature map and a coordinate (x, y), how do we generate the corresponding relative coordinate map?

The following code is from: [ICCV19] AdaptIS: Adaptive Instance Selection Network [Github]

    import mxnet as mx
    from mxnet import gluon

    # Method of the AdaptIS model class (excerpt).
    def get_instances_maps(self, F, points, adaptive_input, controller_input):
        if isinstance(points, mx.nd.NDArray):
            self.num_points = points.shape[1]

        if getattr(self.controller_net, 'return_map', False):
            w = self.eqf(controller_input, points)
        else:
            w = self.eqf(controller_input, points)
            w = self.controller_net(w)

        points = F.reshape(points, shape=(-1, 2))
        x = F.repeat(adaptive_input, self.num_points, axis=0)
        x = self.add_coord_features(x, points)

        x = self.block0(x)
        x = self.adain(x, w)
        x = self.block1(x)

        return x

    class AppendCoordFeatures(gluon.HybridBlock):
        def __init__(self, norm_radius, append_dist=True, spatial_scale=1.0):
            super(AppendCoordFeatures, self).__init__()
            self.xs = None
            self.spatial_scale = spatial_scale
            self.norm_radius = norm_radius
            self.append_dist = append_dist

        def _ctx_kwarg(self, x):
            if isinstance(x, mx.nd.NDArray):
                return {"ctx": x.context}
            return {}

        def get_coord_features(self, F, points, rows, cols, batch_size, **ctx_kwarg):
            row_array = F.arange(start=0, stop=rows, step=1, **ctx_kwarg)
            col_array = F.arange(start=0, stop=cols, step=1, **ctx_kwarg)
            coord_rows = F.repeat(F.reshape(row_array, (1, 1, rows, 1)), repeats=cols, axis=3)
            coord_cols = F.repeat(F.reshape(col_array, (1, 1, 1, cols)), repeats=rows, axis=2)

            coord_rows = F.repeat(coord_rows, repeats=batch_size, axis=0)
            coord_cols = F.repeat(coord_cols, repeats=batch_size, axis=0)

            coords = F.concat(coord_rows, coord_cols, dim=1)

            add_xy = F.reshape(points * self.spatial_scale, shape=(0, 0, 1))
            add_xy = F.reshape(F.repeat(add_xy, rows * cols, axis=2),
                               shape=(0, 0, rows, cols))

            coords = (coords - add_xy) / (self.norm_radius * self.spatial_scale)
            if self.append_dist:
                dist = F.sqrt(F.sum(F.square(coords), axis=1, keepdims=1))
                coord_features = F.concat(coords, dist, dim=1)
            else:
                coord_features = coords

            coord_features = F.clip(coord_features, a_min=-1, a_max=1)
            return coord_features

        def hybrid_forward(self, F, x, coords):
            if isinstance(x, mx.nd.NDArray):
                self.xs = x.shape

            batch_size, rows, cols = self.xs[0], self.xs[2], self.xs[3]
            coord_features = self.get_coord_features(F, coords, rows, cols, batch_size, **self._ctx_kwarg(x))

            return F.concat(coord_features, x, dim=1)

The same get_coord_features, annotated with the values observed in pdb:

    def get_coord_features(self, F, points, rows, cols, batch_size, **ctx_kwarg):

        # (Pdb) points, rows, cols, batch_size
        # ([[61. 71.]] <NDArray 1x2 @gpu(0)>, 96, 96, 1)

        # row_array and col_array:
        # [ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17.
        # 18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35.
        # 36. 37. 38. 39. 40. 41. 42. 43. 44. 45. 46. 47. 48. 49. 50. 51. 52. 53.
        # 54. 55. 56. 57. 58. 59. 60. 61. 62. 63. 64. 65. 66. 67. 68. 69. 70. 71.
        # 72. 73. 74. 75. 76. 77. 78. 79. 80. 81. 82. 83. 84. 85. 86. 87. 88. 89.
        # 90. 91. 92. 93. 94. 95.]
        # <NDArray 96 @gpu(0)>

        # (Pdb) coord_rows
        # [[[[ 0. 0. 0. ... 0. 0. 0.]
        # [ 1. 1. 1. ... 1. 1. 1.]
        # [ 2. 2. 2. ... 2. 2. 2.]
        # ...
        # [93. 93. 93. ... 93. 93. 93.]
        # [94. 94. 94. ... 94. 94. 94.]
        # [95. 95. 95. ... 95. 95. 95.]]]]
        # <NDArray 1x1x96x96 @gpu(0)>

        # (Pdb) coord_cols
        # [[[[ 0. 1. 2. ... 93. 94. 95.]
        # [ 0. 1. 2. ... 93. 94. 95.]
        # [ 0. 1. 2. ... 93. 94. 95.]
        # ...
        # [ 0. 1. 2. ... 93. 94. 95.]
        # [ 0. 1. 2. ... 93. 94. 95.]
        # [ 0. 1. 2. ... 93. 94. 95.]]]]
        # <NDArray 1x1x96x96 @gpu(0)>

        # (Pdb) add_xy
        # [[[[61. 61. 61. ... 61. 61. 61.]
        # [61. 61. 61. ... 61. 61. 61.]
        # [61. 61. 61. ... 61. 61. 61.]
        # ...
        # [61. 61. 61. ... 61. 61. 61.]
        # [61. 61. 61. ... 61. 61. 61.]
        # [61. 61. 61. ... 61. 61. 61.]]
        #
        # [[71. 71. 71. ... 71. 71. 71.]
        # [71. 71. 71. ... 71. 71. 71.]
        # [71. 71. 71. ... 71. 71. 71.]
        # ...
        # [71. 71. 71. ... 71. 71. 71.]
        # [71. 71. 71. ... 71. 71. 71.]
        # [71. 71. 71. ... 71. 71. 71.]]]]
        # <NDArray 1x2x96x96 @gpu(0)>

        # (Pdb) if self.append_dist, then coord_features is:
        # [[[[-1. -1. -1. ... -1. -1.
        # -1. ]
        # [-1. -1. -1. ... -1. -1.
        # -1. ]
        # [-1. -1. -1. ... -1. -1.
        # -1. ]
        # ...
        # [ 0.7619048 0.7619048 0.7619048 ... 0.7619048 0.7619048
        # 0.7619048 ]
        # [ 0.78571427 0.78571427 0.78571427 ... 0.78571427 0.78571427
        # 0.78571427]
        # [ 0.8095238 0.8095238 0.8095238 ... 0.8095238 0.8095238
        # 0.8095238 ]]
        #
        # [[-1. -1. -1. ... 0.52380955 0.54761904
        # 0.5714286 ]
        # [-1. -1. -1. ... 0.52380955 0.54761904
        # 0.5714286 ]
        # [-1. -1. -1. ... 0.52380955 0.54761904
        # 0.5714286 ]
        # ...
        # [-1. -1. -1. ... 0.52380955 0.54761904
        # 0.5714286 ]
        # [-1. -1. -1. ... 0.52380955 0.54761904
        # 0.5714286 ]
        # [-1. -1. -1. ... 0.52380955 0.54761904
        # 0.5714286 ]]
        #
        # [[ 1. 1. 1. ... 1. 1.
        # 1. ]
        # [ 1. 1. 1. ... 1. 1.
        # 1. ]
        # [ 1. 1. 1. ... 1. 1.
        # 1. ]
        # ...
        # [ 1. 1. 1. ... 0.9245947 0.9382886
        # 0.95238096]
        # [ 1. 1. 1. ... 0.944311 0.9577231
        # 0.9715336 ]
        # [ 1. 1. 1. ... 0.96421224 0.97735125
        # 0.99088824]]]]
        # <NDArray 1x3x96x96 @gpu(0)>

        pdb.set_trace()
        row_array = F.arange(start=0, stop=rows, step=1, **ctx_kwarg)  ## (96,)
        col_array = F.arange(start=0, stop=cols, step=1, **ctx_kwarg)  ## (96,)
        coord_rows = F.repeat(F.reshape(row_array, (1, 1, rows, 1)), repeats=cols, axis=3)
        coord_cols = F.repeat(F.reshape(col_array, (1, 1, 1, cols)), repeats=rows, axis=2)

        coord_rows = F.repeat(coord_rows, repeats=batch_size, axis=0)
        coord_cols = F.repeat(coord_cols, repeats=batch_size, axis=0)

        coords = F.concat(coord_rows, coord_cols, dim=1)  ## (1, 2, 96, 96)

        add_xy = F.reshape(points * self.spatial_scale, shape=(0, 0, 1))  ## [[[61.] [71.]]] <NDArray 1x2x1 @gpu(0)>
        add_xy = F.reshape(F.repeat(add_xy, rows * cols, axis=2), shape=(0, 0, rows, cols))

        ## self.norm_radius: 42
        coords = (coords - add_xy) / (self.norm_radius * self.spatial_scale)  ## <NDArray 1x2x96x96 @gpu(0)>
        if self.append_dist:
            dist = F.sqrt(F.sum(F.square(coords), axis=1, keepdims=1))  ## <NDArray 1x1x96x96 @gpu(0)>
            coord_features = F.concat(coords, dist, dim=1)
        else:
            coord_features = coords

        coord_features = F.clip(coord_features, a_min=-1, a_max=1)
        return coord_features
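
To check my reading of the dump above, here is a minimal NumPy sketch of the same computation using broadcasting instead of explicit repeats. It is not the AdaptIS implementation: the function name relative_coord_features is my own, and I assume that each point is stored as (row, col), which is what the dump suggests (channel 0 is measured from 61 along the rows, channel 1 from 71 along the columns).

```python
import numpy as np


def relative_coord_features(points, rows, cols, norm_radius,
                            spatial_scale=1.0, append_dist=True):
    """Offsets of every pixel from each point, divided by the radius, clipped to [-1, 1].

    points: (N, 2) array of (row, col) pairs (assumed ordering).
    Returns (N, 3, rows, cols), or (N, 2, rows, cols) if append_dist is False.
    """
    points = np.asarray(points, dtype=np.float32) * spatial_scale
    row_idx = np.arange(rows, dtype=np.float32).reshape(1, rows, 1)
    col_idx = np.arange(cols, dtype=np.float32).reshape(1, 1, cols)

    scale = norm_radius * spatial_scale
    d_rows = (row_idx - points[:, 0].reshape(-1, 1, 1)) / scale  # (N, rows, 1)
    d_cols = (col_idx - points[:, 1].reshape(-1, 1, 1)) / scale  # (N, 1, cols)

    # Broadcast both offsets to (N, rows, cols) and stack them as channels.
    d_rows, d_cols = np.broadcast_arrays(d_rows, d_cols)
    feats = [d_rows, d_cols]
    if append_dist:
        # Distance is computed from the unclipped offsets, as in the MXNet code.
        feats.append(np.sqrt(d_rows ** 2 + d_cols ** 2))
    return np.clip(np.stack(feats, axis=1), -1.0, 1.0)


# Same inputs as the pdb session: point (61, 71), 96x96 map, norm_radius 42.
features = relative_coord_features([[61, 71]], rows=96, cols=96, norm_radius=42)
print(features.shape)          # (1, 3, 96, 96)
print(features[0, :, 95, 95])  # roughly 0.8095, 0.5714, 0.9909
```

The bottom-right values agree with the last entries of each channel in the dump: (95 - 61) / 42 = 0.8095, (95 - 71) / 42 = 0.5714, and their Euclidean norm 0.9909.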

I also wrote a PyTorch version based on the MXNet version:

    import torch
    import torch.nn as nn

    class AddCoords(nn.Module):

        def __init__(self):
            super().__init__()

        def forward(self, input_tensor, points):
            # input_tensor: (C, x_dim, y_dim); points: a single point with 2 elements.
            _, x_dim, y_dim = input_tensor.size()
            batch_size = 1

            xx_channel = torch.arange(x_dim).repeat(1, y_dim, 1)                  ## torch.Size([1, 9, 9])
            yy_channel = torch.arange(y_dim).repeat(1, x_dim, 1).transpose(1, 2)  ## torch.Size([1, 9, 9])

            xx_channel = xx_channel.repeat(batch_size, 1, 1, 1).transpose(2, 3)
            yy_channel = yy_channel.repeat(batch_size, 1, 1, 1).transpose(2, 3)

            coords = torch.cat((xx_channel, yy_channel), dim=1)        ## torch.Size([1, 2, 9, 9])
            coords = coords.type(torch.FloatTensor)

            add_xy = torch.reshape(points, (1, 2, 1))                  ## torch.Size([1, 2, 1])
            add_xy_ = add_xy.repeat(1, 1, x_dim * y_dim)               ## torch.Size([1, 2, 81])
            add_xy_ = torch.reshape(add_xy_, (1, 2, x_dim, y_dim))     ## torch.Size([1, 2, 9, 9])
            add_xy_ = add_xy_.type(torch.FloatTensor)

            coords = coords - add_xy_                                  ## torch.Size([1, 2, 9, 9])
            # Clamp in torch (no NumPy round trip) and return on the input's device.
            coord_features = torch.clamp(coords, -1, 1)
            return coord_features.to(input_tensor.device)
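
One difference from the MXNet version is worth keeping in mind: the port above clips the raw pixel offsets without first dividing by a norm_radius. The small NumPy check below is only a sketch with made-up values (a 5x5 map, a point at row 2 and column 1, and a radius of 4); it suggests that without the division the clipped map keeps only the sign of each offset, while dividing by a radius preserves graded values inside that radius.

```python
import numpy as np

rows, cols = 5, 5
point = np.array([2.0, 1.0])  # hypothetical (row, col) position

# Pixel index grids, shape (2, rows, cols): channel 0 = row index, channel 1 = column index.
grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")).astype(np.float32)
offsets = grid - point.reshape(2, 1, 1)

unnormalized = np.clip(offsets, -1, 1)      # as in the port above: no radius
normalized = np.clip(offsets / 4.0, -1, 1)  # hypothetical norm_radius = 4

print(np.unique(unnormalized))  # only -1, 0 and 1 remain
print(normalized[0, :, 0])      # row offsets stay graded: -0.5, -0.25, 0, 0.25, 0.5
```

If the graded behaviour of the MXNet version is wanted, dividing coords by a radius before the clamp would be the corresponding change in the port.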

 
