An intriguing failing of convolutional neural networks and the CoordConv solution

NeurIPS 2018

2019-10-10 15:01:48

Paper: https://arxiv.org/pdf/1807.03247.pdf

Official TensorFlow Code: https://github.com/uber-research/coordconv

Unofficial PyTorch Code: https://github.com/walsvid/CoordConv

Synced (机器之心): Convolutional neural networks fall short, and CoordConv fills the gap: https://zhuanlan.zhihu.com/p/39665894

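Uber proposes CoordConv to solve the coordinate transform problem of plain CNNs: https://zhuanlan.zhihu.com/p/39919038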

CoordConv, billed as the rescue of CNNs, draws ridicule: does translating a coordinate really need training? https://zhuanlan.zhihu.com/p/39841356
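
For orientation, the core step of the paper is to concatenate coordinate channels to a layer's input before the ordinary convolution is applied. A minimal NumPy sketch of that step follows, assuming an NCHW layout and the linear scaling of coordinates to [-1, 1] described in the paper. The function name `add_coord_channels` is my own and is not taken from either repository linked above.

```python
import numpy as np

def add_coord_channels(x):
    """Append a row-index and a column-index channel to an NCHW array.

    Both channels are scaled linearly to [-1, 1]. The function name and the
    NCHW layout are assumptions made for this illustration.
    """
    n, _, h, w = x.shape
    # linspace yields h (or w) evenly spaced values from -1 to 1 inclusive.
    i = np.linspace(-1.0, 1.0, h, dtype=x.dtype).reshape(1, 1, h, 1)
    j = np.linspace(-1.0, 1.0, w, dtype=x.dtype).reshape(1, 1, 1, w)
    i = np.broadcast_to(i, (n, 1, h, w))
    j = np.broadcast_to(j, (n, 1, h, w))
    return np.concatenate([x, i, j], axis=1)

x = np.zeros((2, 3, 4, 5), dtype=np.float32)
out = add_coord_channels(x)
print(out.shape)          # (2, 5, 4, 5): 3 input channels plus 2 coordinate channels
print(out[0, 3, :, 0])    # row channel, runs from -1 to 1 down the rows
print(out[0, 4, 0, :])    # column channel, runs from -1 to 1 across the columns
```

A convolution applied to `out` can then read the position of every pixel directly from the two extra channels.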

1. Given a feature map and a coordinate (x, y), how do we generate the corresponding relative coordinate map?

The following code is from [ICCV19] AdaptIS: Adaptive Instance Selection Network (GitHub):

    def get_instances_maps(self, F, points, adaptive_input, controller_input):
        if isinstance(points, mx.nd.NDArray):
            self.num_points = points.shape[1]

        if getattr(self.controller_net, 'return_map', False):
            w = self.eqf(controller_input, points)
        else:
            w = self.eqf(controller_input, points)
            w = self.controller_net(w)

        points = F.reshape(points, shape=(-1, 2))
        x = F.repeat(adaptive_input, self.num_points, axis=0)
        x = self.add_coord_features(x, points)

        x = self.block0(x)
        x = self.adain(x, w)
        x = self.block1(x)
        return x
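
The reshape/repeat pair in `get_instances_maps` is what lines up each point with a copy of its own image's feature map. Below is a small NumPy sketch of that alignment, assuming `points` has shape `(batch, num_points, 2)` as `points.shape[1]` above suggests. `np.repeat` is used as a stand-in for `F.repeat`, and the array contents are made up for illustration.

```python
import numpy as np

batch, num_points, channels, h, w = 2, 3, 1, 2, 2

# Feature map of image b is filled with the value b, so copies are easy to trace.
features = np.arange(batch, dtype=np.float32).reshape(batch, 1, 1, 1)
features = np.broadcast_to(features, (batch, channels, h, w)).copy()

# Hypothetical points: the first coordinate encodes the image index times 10.
points = np.array([[[0, 1], [0, 2], [0, 3]],
                   [[10, 1], [10, 2], [10, 3]]], dtype=np.float32)

flat_points = points.reshape(-1, 2)                   # (batch * num_points, 2)
repeated = np.repeat(features, num_points, axis=0)    # (batch * num_points, C, H, W)

# Entry k of both arrays belongs to image k // num_points.
for k in range(batch * num_points):
    image_index = int(repeated[k, 0, 0, 0])
    print(k, image_index, flat_points[k])
```

Because the repeat is element-wise along axis 0 (each image repeated `num_points` times in a row), the k-th feature copy and the k-th flattened point always come from the same image.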

import mxnet as mx
from mxnet import gluon

class AppendCoordFeatures(gluon.HybridBlock):
    def __init__(self, norm_radius, append_dist=True, spatial_scale=1.0):
        super(AppendCoordFeatures, self).__init__()
        self.xs = None
        self.spatial_scale = spatial_scale
        self.norm_radius = norm_radius
        self.append_dist = append_dist

    def _ctx_kwarg(self, x):
        if isinstance(x, mx.nd.NDArray):
            return {"ctx": x.context}
        return {}

    def get_coord_features(self, F, points, rows, cols, batch_size, **ctx_kwarg):
        row_array = F.arange(start=0, stop=rows, step=1, **ctx_kwarg)
        col_array = F.arange(start=0, stop=cols, step=1, **ctx_kwarg)
        coord_rows = F.repeat(F.reshape(row_array, (1, 1, rows, 1)), repeats=cols, axis=3)
        coord_cols = F.repeat(F.reshape(col_array, (1, 1, 1, cols)), repeats=rows, axis=2)

        coord_rows = F.repeat(coord_rows, repeats=batch_size, axis=0)
        coord_cols = F.repeat(coord_cols, repeats=batch_size, axis=0)
        coords = F.concat(coord_rows, coord_cols, dim=1)

        add_xy = F.reshape(points * self.spatial_scale, shape=(0, 0, 1))
        add_xy = F.reshape(F.repeat(add_xy, rows * cols, axis=2),
                           shape=(0, 0, rows, cols))

        coords = (coords - add_xy) / (self.norm_radius * self.spatial_scale)
        if self.append_dist:
            dist = F.sqrt(F.sum(F.square(coords), axis=1, keepdims=1))
            coord_features = F.concat(coords, dist, dim=1)
        else:
            coord_features = coords

        coord_features = F.clip(coord_features, a_min=-1, a_max=1)
        return coord_features

    def hybrid_forward(self, F, x, coords):
        if isinstance(x, mx.nd.NDArray):
            self.xs = x.shape

        batch_size, rows, cols = self.xs[0], self.xs[2], self.xs[3]
        coord_features = self.get_coord_features(F, coords, rows, cols, batch_size, **self._ctx_kwarg(x))
        return F.concat(coord_features, x, dim=1)

The same method, annotated with the pdb output for a single point (61, 71) on a 96 x 96 feature map:

    def get_coord_features(self, F, points, rows, cols, batch_size, **ctx_kwarg):
        # (Pdb) points, rows, cols, batch_size
        # ([[61. 71.]] <NDArray 1x2 @gpu(0)>, 96, 96, 1)
        # row_array and col_array:
        # [ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15. 16. 17.
        #  18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35.
        #  36. 37. 38. 39. 40. 41. 42. 43. 44. 45. 46. 47. 48. 49. 50. 51. 52. 53.
        #  54. 55. 56. 57. 58. 59. 60. 61. 62. 63. 64. 65. 66. 67. 68. 69. 70. 71.
        #  72. 73. 74. 75. 76. 77. 78. 79. 80. 81. 82. 83. 84. 85. 86. 87. 88. 89.
        #  90. 91. 92. 93. 94. 95.]
        # <NDArray 96 @gpu(0)>
        # (Pdb) coord_rows
        # [[[[ 0.  0.  0. ...  0.  0.  0.]
        #    [ 1.  1.  1. ...  1.  1.  1.]
        #    [ 2.  2.  2. ...  2.  2.  2.]
        #    ...
        #    [93. 93. 93. ... 93. 93. 93.]
        #    [94. 94. 94. ... 94. 94. 94.]
        #    [95. 95. 95. ... 95. 95. 95.]]]]
        # <NDArray 1x1x96x96 @gpu(0)>
        # (Pdb) coord_cols
        # [[[[ 0.  1.  2. ... 93. 94. 95.]
        #    [ 0.  1.  2. ... 93. 94. 95.]
        #    [ 0.  1.  2. ... 93. 94. 95.]
        #    ...
        #    [ 0.  1.  2. ... 93. 94. 95.]
        #    [ 0.  1.  2. ... 93. 94. 95.]
        #    [ 0.  1.  2. ... 93. 94. 95.]]]]
        # <NDArray 1x1x96x96 @gpu(0)>
        # (Pdb) add_xy
        # [[[[61. 61. 61. ... 61. 61. 61.]
        #    [61. 61. 61. ... 61. 61. 61.]
        #    [61. 61. 61. ... 61. 61. 61.]
        #    ...
        #    [61. 61. 61. ... 61. 61. 61.]
        #    [61. 61. 61. ... 61. 61. 61.]
        #    [61. 61. 61. ... 61. 61. 61.]]
        #   [[71. 71. 71. ... 71. 71. 71.]
        #    [71. 71. 71. ... 71. 71. 71.]
        #    [71. 71. 71. ... 71. 71. 71.]
        #    ...
        #    [71. 71. 71. ... 71. 71. 71.]
        #    [71. 71. 71. ... 71. 71. 71.]
        #    [71. 71. 71. ... 71. 71. 71.]]]]
        # <NDArray 1x2x96x96 @gpu(0)>
        # (Pdb) if self.append_dist, then coord_features is:
        # [[[[-1.         -1.         -1.         ... -1.         -1.
        #     -1.        ]
        #    [-1.         -1.         -1.         ... -1.         -1.
        #     -1.        ]
        #    [-1.         -1.         -1.         ... -1.         -1.
        #     -1.        ]
        #    ...
        #    [ 0.7619048   0.7619048   0.7619048  ...  0.7619048   0.7619048
        #      0.7619048 ]
        #    [ 0.78571427  0.78571427  0.78571427 ...  0.78571427  0.78571427
        #      0.78571427]
        #    [ 0.8095238   0.8095238   0.8095238  ...  0.8095238   0.8095238
        #      0.8095238 ]]
        #   [[-1.         -1.         -1.         ...  0.52380955  0.54761904
        #      0.5714286 ]
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904
        #      0.5714286 ]
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904
        #      0.5714286 ]
        #    ...
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904
        #      0.5714286 ]
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904
        #      0.5714286 ]
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904
        #      0.5714286 ]]
        #   [[ 1.          1.          1.         ...  1.          1.
        #      1.        ]
        #    [ 1.          1.          1.         ...  1.          1.
        #      1.        ]
        #    [ 1.          1.          1.         ...  1.          1.
        #      1.        ]
        #    ...
        #    [ 1.          1.          1.         ...  0.9245947   0.9382886
        #      0.95238096]
        #    [ 1.          1.          1.         ...  0.944311    0.9577231
        #      0.9715336 ]
        #    [ 1.          1.          1.         ...  0.96421224  0.97735125
        #      0.99088824]]]]
        # <NDArray 1x3x96x96 @gpu(0)>
        pdb.set_trace()

        row_array = F.arange(start=0, stop=rows, step=1, **ctx_kwarg)   ## (96,)
        col_array = F.arange(start=0, stop=cols, step=1, **ctx_kwarg)   ## (96,)
        coord_rows = F.repeat(F.reshape(row_array, (1, 1, rows, 1)), repeats=cols, axis=3)
        coord_cols = F.repeat(F.reshape(col_array, (1, 1, 1, cols)), repeats=rows, axis=2)

        coord_rows = F.repeat(coord_rows, repeats=batch_size, axis=0)
        coord_cols = F.repeat(coord_cols, repeats=batch_size, axis=0)
        coords = F.concat(coord_rows, coord_cols, dim=1)    ## (1, 2, 96, 96)

        add_xy = F.reshape(points * self.spatial_scale, shape=(0, 0, 1))    ## [[[61.] [71.]]] <NDArray 1x2x1 @gpu(0)>
        add_xy = F.reshape(F.repeat(add_xy, rows * cols, axis=2), shape=(0, 0, rows, cols))

        ## self.norm_radius: 42
        coords = (coords - add_xy) / (self.norm_radius * self.spatial_scale)    ## <NDArray 1x2x96x96 @gpu(0)>
        if self.append_dist:
            dist = F.sqrt(F.sum(F.square(coords), axis=1, keepdims=1))  ## <NDArray 1x1x96x96 @gpu(0)>
            coord_features = F.concat(coords, dist, dim=1)
        else:
            coord_features = coords

        coord_features = F.clip(coord_features, a_min=-1, a_max=1)
        return coord_features
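
The numbers in the pdb dump can be checked by hand: each entry is (index - point) / norm_radius, clipped to [-1, 1], with the Euclidean distance of the two offsets as an optional third channel. Here is a NumPy sketch of the same arithmetic, assuming a single point, `spatial_scale = 1`, and `norm_radius = 42` as in the dump. The function name `relative_coord_features_np` is hypothetical and not part of AdaptIS.

```python
import numpy as np

def relative_coord_features_np(point, rows, cols, norm_radius, append_dist=True):
    """NumPy re-statement of get_coord_features for one (row, col) point."""
    row_idx = np.arange(rows, dtype=np.float32).reshape(rows, 1)
    col_idx = np.arange(cols, dtype=np.float32).reshape(1, cols)
    d_row = np.broadcast_to((row_idx - point[0]) / norm_radius, (rows, cols))
    d_col = np.broadcast_to((col_idx - point[1]) / norm_radius, (rows, cols))
    channels = [d_row, d_col]
    if append_dist:
        # The distance is computed from the unclipped offsets, as in the MXNet code.
        channels.append(np.sqrt(d_row ** 2 + d_col ** 2))
    return np.clip(np.stack(channels, axis=0), -1.0, 1.0)

feats = relative_coord_features_np((61.0, 71.0), 96, 96, norm_radius=42.0)
print(feats.shape)        # (3, 96, 96)
print(feats[0, 95, 0])    # (95 - 61) / 42, about 0.8095
print(feats[1, 0, 95])    # (95 - 71) / 42, about 0.5714
print(feats[2, 95, 95])   # sqrt(0.8095^2 + 0.5714^2), about 0.9909
print(feats[0, 0, 0])     # (0 - 61) / 42 is about -1.45, so it is clipped to -1.0
```

These match the last row of each channel in the dump (0.8095238, 0.5714286, 0.99088824). Everything farther than `norm_radius` pixels from the point saturates at -1 or 1.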

I also wrote a PyTorch version based on the MXNet code:

import torch
import torch.nn as nn

class AddCoords(nn.Module):
    """Relative coordinate map for one (C, H, W) feature map and one point."""

    def __init__(self, norm_radius=1.0):
        super().__init__()
        # The MXNet code divides the offsets by norm_radius before clipping;
        # the default of 1.0 keeps the raw pixel offsets.
        self.norm_radius = norm_radius

    def forward(self, input_tensor, points):
        _, x_dim, y_dim = input_tensor.size()   ## (C, rows, cols), e.g. (C, 9, 9)
        device = input_tensor.device

        rows = torch.arange(x_dim, dtype=torch.float32, device=device)
        cols = torch.arange(y_dim, dtype=torch.float32, device=device)
        xx_channel = rows.view(1, 1, x_dim, 1).expand(1, 1, x_dim, y_dim)   ## row index
        yy_channel = cols.view(1, 1, 1, y_dim).expand(1, 1, x_dim, y_dim)   ## column index
        coords = torch.cat((xx_channel, yy_channel), dim=1)     ## torch.Size([1, 2, 9, 9])

        ## points holds one (row, col) pair; reshape it so it broadcasts over rows and cols
        add_xy = points.to(device=device, dtype=torch.float32).reshape(1, 2, 1, 1)
        coords = (coords - add_xy) / self.norm_radius   ## torch.Size([1, 2, 9, 9])

        ## torch.clamp stays on the input's device and avoids the NumPy round trip
        coord_features = torch.clamp(coords, -1, 1)
        return coord_features
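
For a full batch, the same idea can be written with broadcasting instead of explicit repeats. The sketch below is one possible batched variant, not the AdaptIS implementation: it assumes one (row, col) point per sample, takes `norm_radius` as an argument, and optionally appends the distance channel as the MXNet code does. The name `relative_coord_features_torch` is hypothetical.

```python
import torch

def relative_coord_features_torch(feature_map, points, norm_radius, append_dist=True):
    """Batched relative coordinate map.

    feature_map: (N, C, H, W) tensor, used only for its shape, dtype and device.
    points:      (N, 2) tensor of (row, col) positions, one per sample.
    """
    n, _, h, w = feature_map.shape
    kwargs = {"dtype": feature_map.dtype, "device": feature_map.device}
    rows = torch.arange(h, **kwargs).view(1, 1, h, 1).expand(n, 1, h, w)
    cols = torch.arange(w, **kwargs).view(1, 1, 1, w).expand(n, 1, h, w)
    coords = torch.cat((rows, cols), dim=1)                 # (N, 2, H, W)
    offset = points.to(**kwargs).reshape(n, 2, 1, 1)        # broadcasts over H and W
    coords = (coords - offset) / norm_radius
    if append_dist:
        dist = coords.pow(2).sum(dim=1, keepdim=True).sqrt()
        coords = torch.cat((coords, dist), dim=1)           # (N, 3, H, W)
    return coords.clamp(-1.0, 1.0)

x = torch.zeros(2, 8, 96, 96)
pts = torch.tensor([[61.0, 71.0], [10.0, 20.0]])
feats = relative_coord_features_torch(x, pts, norm_radius=42.0)
print(feats.shape)                         # torch.Size([2, 3, 96, 96])
print(feats[0, 0, 95, 0].item())           # (95 - 61) / 42, about 0.8095
print(torch.cat((feats, x), dim=1).shape)  # torch.Size([2, 11, 96, 96])
```

The final concatenation mirrors `hybrid_forward` in the MXNet class, which places the coordinate features in front of the input channels.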
