Large-Scale Face Classification: The allgather Operation (1)

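In PyTorch, the all_gather operation does not propagate gradients. If the computation graph passes through all_gather and gradients still have to flow back to the corresponding pre-gather variable in each process, you need to write your own subclass of torch.autograd.Function.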

torch.autograd.Function is introduced at https://pytorch.org/docs/stable/autograd.html

Examples of how to implement such a subclass are given at https://pytorch.org/docs/stable/notes/extending.html#extending-torch-autograd
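As a minimal illustration of that subclassing pattern (a sketch, not taken from the linked pages): `Square` below is a hypothetical name chosen for this example. It implements the element-wise square with a hand-written backward, and runs on CPU in a single process.

```python
import torch


class Square(torch.autograd.Function):
    """Element-wise square with a hand-written backward (illustrative only)."""

    @staticmethod
    def forward(ctx, x):
        # Save the input; backward needs it to compute d(x*x)/dx = 2x.
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Chain rule: incoming gradient times the local derivative.
        return grad_output * 2 * x


x = torch.full((2, 2), 3.0, requires_grad=True)
out = Square.apply(x).mean()
out.backward()
print(x.grad)  # every entry is 0.25 * 2 * 3 = 1.5
```

The same two pieces, a `forward` that does the work and a `backward` that maps output gradients to input gradients, are what a gradient-aware all_gather has to provide.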

The example that follows illustrates how all_gather behaves and how to make gradients flow through it.

$x,y,z$ are all 2x2 matrices, related by $y=x+2$ and $z=y*y$ (element-wise).

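Next, data has to be exchanged between processes (MPI-style communication; the code below uses torch.distributed with the NCCL backend): z is gathered onto every process, which is the all_gather operation. The gathered matrices are then multiplied element-wise, and the mean of the product is taken.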

Writing $r$ for the resulting scalar mean (called `out` in the code), $r$ and its derivative with respect to $y$ on gpu0 are:

$r=0.25({}_{g_0}y_{11}^2 \cdot {}_{g_1}y_{11}^2+{}_{g_0}y_{12}^2 \cdot {}_{g_1}y_{12}^2+
{}_{g_0}y_{21}^2 \cdot {}_{g_1}y_{21}^2+
{}_{g_0}y_{22}^2 \cdot {}_{g_1}y_{22}^2)$

$\frac{dr}{d{}_{g_0}y}=
\begin{Bmatrix}
0.5\,{}_{g_0}y_{11} \cdot {}_{g_1}y_{11}^2 & 0.5\,{}_{g_0}y_{12} \cdot {}_{g_1}y_{12}^2 \\
0.5\,{}_{g_0}y_{21} \cdot {}_{g_1}y_{21}^2 & 0.5\,{}_{g_0}y_{22} \cdot {}_{g_1}y_{22}^2
\end{Bmatrix}$

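On gpu0, x is $\begin{Bmatrix} 1 & 1 \\1 & 1 \end{Bmatrix}$; on gpu1, x is $\begin{Bmatrix} 0 & 0 \\0 & 0 \end{Bmatrix}$. Every entry of y is therefore 3 on gpu0 and 2 on gpu1. Plugging these into the formula (and its counterpart with the roles of gpu0 and gpu1 swapped), the derivative of r with respect to y on gpu0 is $\begin{Bmatrix}6 & 6 \\6 & 6\end{Bmatrix}$, and the derivative of r with respect to y on gpu1 is $\begin{Bmatrix}9 & 9 \\9 & 9\end{Bmatrix}$.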
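These values can be checked without any GPU. The sketch below uses plain Python, with each 2x2 matrix flattened to a list of four numbers; `loss`, `analytic_grads` and `numeric_grad` are illustrative helper names, not part of the original code. It evaluates the closed-form derivative and compares it against central finite differences.

```python
def loss(y0, y1):
    """r = mean over entries of y0_ij^2 * y1_ij^2 (matrices flattened to lists)."""
    return sum((a * a) * (b * b) for a, b in zip(y0, y1)) / len(y0)


def analytic_grads(y0, y1):
    """Closed-form dr/dy0 and dr/dy1 for the loss above."""
    n = len(y0)
    g0 = [2 * a * b * b / n for a, b in zip(y0, y1)]
    g1 = [2 * a * a * b / n for a, b in zip(y0, y1)]
    return g0, g1


def numeric_grad(f, values, eps=1e-5):
    """Central finite differences of f with respect to each entry of values."""
    grads = []
    for i in range(len(values)):
        up = list(values)
        down = list(values)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads


y0 = [x + 2.0 for x in [1.0] * 4]  # gpu0: x is all ones, so y is all 3
y1 = [x + 2.0 for x in [0.0] * 4]  # gpu1: x is all zeros, so y is all 2

g0, g1 = analytic_grads(y0, y1)
n0 = numeric_grad(lambda v: loss(v, y1), y0)
n1 = numeric_grad(lambda v: loss(y0, v), y1)

print(g0)  # [6.0, 6.0, 6.0, 6.0]
print(g1)  # [9.0, 9.0, 9.0, 9.0]
```

The finite-difference results `n0` and `n1` agree with `g0` and `g1` up to rounding error.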

import os
import sys

import torch
import torch.distributed as dist

sys.path.append('./')
from utils import GatherLayer  # custom autograd.Function, listed further below


def test():
    # torch.manual_seed(0)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = True

    # One process per GPU; rank and world size are read from the environment.
    dist.init_process_group(backend="nccl", init_method="env://")
    rank = dist.get_rank()
    local_rank = int(os.environ.get('LOCAL_RANK', 0))
    world_size = dist.get_world_size()
    torch.cuda.set_device(local_rank)
    print('world_size: {}, rank: {}, local_rank: {}'.format(world_size, rank, local_rank))

    # x is all ones on GPU 0 and all zeros on the other GPU.
    if local_rank == 0:
        x = torch.ones(2, 2, device='cuda', requires_grad=True)
    else:
        x = torch.zeros(2, 2, device='cuda', requires_grad=True)

    y = x + 2
    y.retain_grad()  # y is not a leaf tensor, so its gradient must be retained explicitly
    z = y * y

    # Plain all_gather: the gathered tensors are not connected to the autograd graph.
    z_gather = [torch.zeros_like(z) for _ in range(world_size)]
    dist.all_gather(z_gather, z)
    # Gradient-aware alternative:
    # z_gather = GatherLayer.apply(z)

    r = z_gather[0] * z_gather[1]
    out = r.mean()
    out.backward()

    print('rank:{}'.format(local_rank), y.grad)


if __name__ == '__main__':
    test()

(1) As written, the code above uses the all_gather provided by PyTorch, and running it fails with the following error:

Traceback (most recent call last):
  File "test/test_all_gather.py", line 46, in <module>
    test()
  File "test/test_all_gather.py", line 36, in test
    out.backward()
  File "/usr/local/lib/python3.6/dist-packages/torch/tensor.py", line 185, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py", line 127, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
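The same error can be reproduced in a single process on CPU. The sketch below is only an analogy, under the assumption that all_gather behaves like a non-differentiable copy into pre-allocated buffers: a buffer created with `torch.zeros_like` and filled under `torch.no_grad()` stands in for one entry of `z_gather`. No distributed call is made here.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y

# Stand-in for one all_gather output buffer (an assumption for illustration):
# freshly allocated, then filled with the values of z without autograd tracking.
buf = torch.zeros_like(z)
with torch.no_grad():
    buf.copy_(z)

print(z.requires_grad, buf.requires_grad)  # True False

try:
    (buf * buf).mean().backward()
    error_message = None
except RuntimeError as err:
    error_message = str(err)

print(error_message)
```

Because `buf` carries the values of `z` but no `grad_fn`, nothing computed from it is connected to `x` or `y`, so `backward()` has no graph to walk.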

(2) The GatherLayer class at https://github.com/Spijkervet/SimCLR/blob/master/simclr/modules/gather.py subclasses torch.autograd.Function so that gradients can still propagate back after all_gather.

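Enabling z_gather = GatherLayer.apply(z) in the code above, in place of the dist.all_gather call, makes gradient propagation work. The printed gradients with respect to y are: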

world_size: 2, rank: 0, local_rank: 0
world_size: 2, rank: 1, local_rank: 1
rank:0 tensor([[6., 6.],
        [6., 6.]], device='cuda:0')
rank:1 tensor([[9., 9.],
        [9., 9.]], device='cuda:1')

The GatherLayer class is implemented as follows:

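import torch
import torch.distributed as dist


class GatherLayer(torch.autograd.Function):
    """Gather tensors from all processes, supporting backward propagation."""

    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        # One receive buffer per process; all_gather fills them in rank order.
        output = [torch.zeros_like(input) for _ in range(dist.get_world_size())]
        dist.all_gather(output, input)
        return tuple(output)

    @staticmethod
    def backward(ctx, *grads):
        # grads holds one gradient per gathered tensor; only the slot that
        # came from this process maps back to the local input.
        (input,) = ctx.saved_tensors
        grad_out = torch.zeros_like(input)
        grad_out[:] = grads[dist.get_rank()]
        return grad_out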
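To see why selecting `grads[dist.get_rank()]` yields 6 and 9 in this example, the chain rule can be written out. This is a sketch that assumes every process computes the same scalar $r$ from the gathered tensors, as the test code does; ${}_{g_k}z$ denotes the copy of z contributed by rank $k$, with $k \in \{0, 1\}$.

```latex
\frac{\partial r}{\partial\,{}_{g_k}z_{ij}} = 0.25\,{}_{g_{1-k}}z_{ij},
\qquad
\frac{\partial\,{}_{g_k}z_{ij}}{\partial\,{}_{g_k}y_{ij}} = 2\,{}_{g_k}y_{ij}

\text{rank 0: } 0.25 \cdot 2^2 \cdot (2 \cdot 3) = 6,
\qquad
\text{rank 1: } 0.25 \cdot 3^2 \cdot (2 \cdot 2) = 9
```

The first factor is what autograd hands to `backward` in `grads[k]`; the second is applied by the ordinary graph for $z = y * y$ once `backward` returns.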

The following thread discusses gradient propagation through all_gather:

https://discuss.pytorch.org/t/will-dist-all-gather-break-the-auto-gradient-graph/47350
