Paper notes: Semi-supervised Classification with Graph Convolutional Networks

Semi-supervised Classification with Graph Convolutional Networks

2018-01-16 22:33:36

1. Main idea of the paper:

2. Code implementation (PyTorch): https://github.com/tkipf/pygcn

【Introduction】：

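The paper addresses semi-supervised classification of nodes in a graph. A common approach to this problem smooths label information over the graph through explicit graph-based regularization, for example by adding a graph Laplacian regularization term to the loss function:

$$\mathcal{L} = \mathcal{L}_0 + \lambda \mathcal{L}_{reg}, \quad \mathcal{L}_{reg} = \sum_{i,j} A_{ij} \| f(X_i) - f(X_j) \|^2 = f(X)^\top \Delta f(X) \tag{1}$$

Here $\mathcal{L}_0$ is the supervised loss on the labeled part of the graph, $f(\cdot)$ can be a neural-network-like differentiable function, $\lambda$ is a weighting factor, and $X$ is the matrix of node feature vectors $X_i$. $\Delta = D - A$ denotes the unnormalized graph Laplacian of an undirected graph $\mathcal{G}$ with adjacency matrix $A$ and degree matrix $D_{ii} = \sum_{j} A_{ij}$. Eq. (1) relies on the assumption that connected nodes in the graph are likely to share the same label. This assumption can restrict the modeling capacity, because graph edges need not encode node similarity and may instead carry additional information.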
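The graph Laplacian regularizer described above can be checked numerically. The following is a minimal sketch, assuming a toy 4-node path graph and random model outputs (both invented here for illustration). It compares the pairwise smoothness penalty with the quadratic form in $\Delta$. When the sum runs over all ordered pairs $(i, j)$, each undirected edge is counted twice, so the two quantities differ by a constant factor of 2 that can be absorbed into $\lambda$.

```python
import numpy as np

# Toy undirected graph on 4 nodes (a path 0-1-2-3); A is symmetric.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))      # degree matrix, D_ii = sum_j A_ij
Delta = D - A                   # unnormalized graph Laplacian

# Hypothetical model outputs f(X): one 2-dimensional row per node.
rng = np.random.default_rng(0)
F = rng.normal(size=(4, 2))

# Pairwise form: sum over ordered pairs (i, j) of A_ij * ||f_i - f_j||^2.
diff = F[:, None, :] - F[None, :, :]               # shape (4, 4, 2)
pairwise = float((A * (diff ** 2).sum(axis=-1)).sum())

# Quadratic form: tr(f(X)^T Delta f(X)).
quadratic = float(np.trace(F.T @ Delta @ F))
```

Both numbers become small exactly when neighboring nodes receive similar outputs, which is the smoothness assumption that the regularizer encodes.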

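In this work, the graph structure is encoded directly with a neural network model $f(X, A)$, which is trained on the supervised target $\mathcal{L}_0$ for all nodes with labels. This avoids an explicit graph-based regularization term in the loss function. Conditioning $f(\cdot)$ on the adjacency matrix of the graph allows the model to distribute gradient information from the supervised loss $\mathcal{L}_0$, and enables it to learn representations of nodes both with and without labels.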

The paper makes two main contributions:

1. we introduce a localized and well-behaved propagation rule for graph convolutional neural networks, and show it can be motivated from a first-order approximation of spectral convolutions on graphs.

2. we show how this form of a graph convolutional neural network can be used for fast and scalable semi-supervised classification of nodes in a graph.

【Fast Approximate Convolutions on Graphs】:

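We build a multi-layer Graph Convolutional Network (GCN) with the following layer-wise propagation rule:

$$H^{(l+1)} = \sigma\left( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)} \right) \tag{2}$$

Here $\tilde{A} = A + I_N$ is the adjacency matrix of the undirected graph $\mathcal{G}$ with added self-connections, $I_N$ is the identity matrix, $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$, and $W^{(l)}$ is a layer-specific trainable weight matrix. $\sigma(\cdot)$ denotes an activation function such as $\mathrm{ReLU}(\cdot)$. $H^{(l)}$ is the matrix of activations in the $l$-th layer, with $H^{(0)} = X$.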
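The propagation rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code from the linked repository; the function names `normalize_adjacency` and `gcn_layer`, the 3-node path graph, and the random features and weights are all assumptions made for the example.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2} for a symmetric adjacency matrix A."""
    A_tilde = A + np.eye(A.shape[0])          # add self-connections
    d = A_tilde.sum(axis=1)                   # D~_ii = sum_j A~_ij (always >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_hat, H, W):
    """One propagation step with a ReLU activation: ReLU(A_hat @ H @ W)."""
    return np.maximum(A_hat @ H @ W, 0.0)

# Toy graph: path 0-1-2, with 2 input features per node and 4 hidden units.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 2))                   # H^(0) = X
W = rng.normal(size=(2, 4))                   # layer weights W^(0)

A_hat = normalize_adjacency(A)
H1 = gcn_layer(A_hat, X, W)                   # activations of the next layer
```

Because of the added self-connections every row sum of $\tilde{A}$ is at least 1, so the inverse square root is always defined, even for isolated nodes.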

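Next, we show that this form of propagation rule can be motivated by a first-order approximation of localized spectral filters on graphs. A spectral convolution on a graph is defined as the multiplication of a signal $x \in \mathbb{R}^N$ (one scalar per node) with a filter $g_{\theta} = diag(\theta)$, parameterized by $\theta \in \mathbb{R}^N$, in the Fourier domain:

$$g_{\theta} \star x = U g_{\theta} U^\top x \tag{3}$$

Here $U$ is the matrix of eigenvectors of the normalized graph Laplacian $L = I_N - D^{-1/2} A D^{-1/2} = U \Lambda U^\top$, with a diagonal matrix of its eigenvalues $\Lambda$, and $U^\top x$ is the graph Fourier transform of $x$. We can regard $g_{\theta}$ as a function of the eigenvalues of $L$, i.e. $g_{\theta}(\Lambda)$. Evaluating Eq. (3) is computationally expensive, because multiplication with the eigenvector matrix $U$ is $O(N^2)$. In addition, computing the eigendecomposition of $L$ in the first place can be prohibitively expensive for large graphs. To get around this, Hammond et al. (2011) suggested that $g_{\theta}(\Lambda)$ can be well approximated by a truncated expansion in terms of Chebyshev polynomials $T_k(x)$ up to $K$-th order:

$$g_{\theta'}(\Lambda) \approx \sum_{k=0}^{K} \theta'_k T_k(\tilde{\Lambda}) \tag{4}$$

where $\tilde{\Lambda} = \frac{2}{\lambda_{max}} \Lambda - I_N$ is a rescaled eigenvalue matrix, $\lambda_{max}$ denotes the largest eigenvalue of $L$, and $\theta' \in \mathbb{R}^{K+1}$ is now a vector of Chebyshev coefficients. The Chebyshev polynomials are defined recursively as $T_k(x) = 2xT_{k-1}(x) - T_{k-2}(x)$ with $T_0(x) = 1$ and $T_1(x) = x$. The following two papers give more background on this approximation: 【1】【2】.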

【1】Hammond, David K, Vandergheynst, Pierre, and Gribonval, Remi. Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis, 30(2):129–150, 2011

【2】Defferrard, Michael, Bresson, Xavier, and Vandergheynst, Pierre. Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in Neural Information Processing Systems, 2016
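The Chebyshev recurrence used in the truncated expansion is easy to evaluate iteratively. The sketch below is an illustration only (the function name `chebyshev` is an assumption). It computes $T_k(x)$ from the recurrence and can be checked against the standard identity $T_k(\cos t) = \cos(kt)$, which also shows why the eigenvalues are rescaled into $[-1, 1]$: on that interval the polynomials stay bounded by 1 in absolute value.

```python
import math

def chebyshev(k, x):
    """Evaluate T_k(x) with the recurrence T_k = 2x T_{k-1} - T_{k-2}."""
    if k == 0:
        return 1.0
    t_prev, t_curr = 1.0, x           # T_0(x), T_1(x)
    for _ in range(2, k + 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr

# T_2(x) = 2x^2 - 1 and T_3(x) = 4x^3 - 3x, evaluated at x = 0.5.
t2 = chebyshev(2, 0.5)
t3 = chebyshev(3, 0.5)
```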

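Returning to the definition of the convolution of a signal $x$ with a filter $g_{\theta'}$, we now have:

$$g_{\theta'} \star x \approx \sum_{k=0}^{K} \theta'_k T_k(\tilde{L}) x \tag{5}$$

where $\tilde{L} = \frac{2}{\lambda_{max}} L - I_N$. This follows from the easily verified identity $(U \Lambda U^\top)^k = U \Lambda^k U^\top$. Note that this expression is now $K$-localized, since it is a $K$-th order polynomial in the Laplacian, i.e. it depends only on nodes that are at most $K$ steps away from the central node ($K$-th order neighborhood). The complexity of evaluating Eq. (5) is $O(|\mathcal{E}|)$, i.e. linear in the number of edges. Defferrard et al. 【2】 use this $K$-localized convolution to define a convolutional neural network on graphs.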
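The sketch below illustrates the Chebyshev filter on a toy graph. It assumes a 6-node path graph and arbitrary coefficients `theta` with $K = 2$, all chosen for the example. It applies the filter with the recurrence on $\tilde{L}$ alone, and separately through the eigendecomposition as a reference, so the two results can be compared. It uses dense matrices for brevity; the cost linear in the number of edges only materializes with a sparse representation of $\tilde{L}$.

```python
import numpy as np

def chebyshev_filter(L, x, theta, lam_max):
    """Approximate g_theta * x by sum_k theta[k] * T_k(L_tilde) @ x (len(theta) >= 1)."""
    N = L.shape[0]
    L_tilde = (2.0 / lam_max) * L - np.eye(N)     # rescale the spectrum to [-1, 1]
    t_prev = x                                    # T_0(L_tilde) x
    out = theta[0] * t_prev
    if len(theta) > 1:
        t_curr = L_tilde @ x                      # T_1(L_tilde) x
        out = out + theta[1] * t_curr
        for k in range(2, len(theta)):
            t_prev, t_curr = t_curr, 2.0 * (L_tilde @ t_curr) - t_prev
            out = out + theta[k] * t_curr
    return out

# Path graph on 6 nodes: 0-1-2-3-4-5.
N = 6
A = np.zeros((N, N))
for i in range(N - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
d = A.sum(axis=1)
L = np.eye(N) - A / np.sqrt(np.outer(d, d))       # normalized graph Laplacian

lam, U = np.linalg.eigh(L)                        # L = U diag(lam) U^T
theta = np.array([0.5, -0.3, 0.2])                # arbitrary coefficients, K = 2

x = np.zeros(N)
x[0] = 1.0                                        # impulse at node 0
y = chebyshev_filter(L, x, theta, lam.max())

# Reference: apply the same polynomial to the eigenvalues (spectral form).
lam_tilde = 2.0 * lam / lam.max() - 1.0
g = theta[0] + theta[1] * lam_tilde + theta[2] * (2.0 * lam_tilde ** 2 - 1.0)
y_spectral = U @ (g * (U.T @ x))
```

With $K = 2$ and an impulse at node 0, the filtered signal is nonzero only on nodes 0, 1 and 2, which is the $K$-localization property in action.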

In this work, we propose keeping only terms up to order $K = 1$ in the expansion above. The reasoning is as follows: as we intend to stack multiple layers of parameterized graph convolutions followed by non-linearities, we expect that a per-layer convolution operation that is linear with respect to the adjacency matrix increases modeling capacity while keeping the computational complexity comparable to a single graph convolution with $K > 1$. We further approximate $\lambda_{max} \approx 2$, as we can expect that neural network parameters will adapt to this change in scale during training.

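Under these approximations, we obtain:

$$g_{\theta'} \star x \approx \theta'_0 x + \theta'_1 (L - I_N) x = \theta'_0 x - \theta'_1 D^{-1/2} A D^{-1/2} x \tag{6}$$

with two free parameters $\theta'_0$ and $\theta'_1$. Eq. (6) can be read as a local convolution with a parameterized filter that acts only on the immediate (first-order) neighbors of a node. The filter parameters are shared over the whole graph. Applying filters of this form successively then effectively convolves the $k$-th order neighborhood of a node, where $k$ is the number of successive filtering operations or convolutional layers in the neural network model.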

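In practice, constraining the number of parameters further helps against overfitting and reduces the number of operations (such as matrix multiplications) per layer. This leaves us with:

$$g_{\theta} \star x \approx \theta \left( I_N + D^{-1/2} A D^{-1/2} \right) x \tag{7}$$

with a single parameter $\theta = \theta'_0 = -\theta'_1$. Note that $I_N + D^{-1/2} A D^{-1/2}$ now has eigenvalues in the range $[0, 2]$. Applying this operator repeatedly in a deep neural network model can therefore lead to numerical instabilities and exploding or vanishing gradients. To alleviate this problem, we introduce the following renormalization trick:

$$I_N + D^{-1/2} A D^{-1/2} \rightarrow \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}, \quad \tilde{A} = A + I_N, \quad \tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$$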
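The effect of the renormalization trick on the spectrum can be inspected directly. The following sketch assumes a 4-cycle as the example graph (chosen because it is bipartite, so the upper bound of 2 is attained). It computes the eigenvalues of the operator before and after renormalization; the helper name `sym_normalize` is an assumption for illustration.

```python
import numpy as np

# Small undirected graph: a 4-cycle 0-1-2-3-0.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
N = A.shape[0]

def sym_normalize(M):
    """Return D^{-1/2} M D^{-1/2}, with D the diagonal matrix of row sums of M."""
    d_inv_sqrt = 1.0 / np.sqrt(M.sum(axis=1))
    return M * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

before = np.eye(N) + sym_normalize(A)         # I_N + D^{-1/2} A D^{-1/2}
after = sym_normalize(A + np.eye(N))          # D~^{-1/2} A~ D~^{-1/2}

eig_before = np.linalg.eigvalsh(before)       # eigenvalues in ascending order
eig_after = np.linalg.eigvalsh(after)
```

For this graph the largest eigenvalue drops from 2 to 1, so repeated application of the renormalized operator no longer amplifies the signal from layer to layer.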

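We generalize this definition to a signal $X \in \mathbb{R}^{N \times C}$ with $C$ input channels (i.e. a $C$-dimensional feature vector for every node) and $F$ filters or feature maps as follows:

$$Z = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} X \Theta \tag{8}$$

where $\Theta \in \mathbb{R}^{C \times F}$ is now a matrix of filter parameters and $Z \in \mathbb{R}^{N \times F}$ is the convolved signal matrix. This filtering operation has complexity $O(|\mathcal{E}|FC)$, because $\tilde{A} X$ can be implemented efficiently as the product of a sparse matrix with a dense matrix.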
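To make the sparse-times-dense argument concrete, the sketch below stores the normalized adjacency matrix as coordinate triplets (one entry per nonzero) and multiplies it with a dense matrix by looping, in vectorized form, over those nonzeros only. The helper `sparse_dense_matmul`, the 3-node graph, and the random `X` and `Theta` are assumptions made for illustration; a real implementation would more likely rely on a sparse matrix library.

```python
import numpy as np

def sparse_dense_matmul(rows, cols, vals, X, num_nodes):
    """Compute A_hat @ X where A_hat is given as COO triplets (rows, cols, vals)."""
    out = np.zeros((num_nodes, X.shape[1]))
    # One scaled row copy per nonzero entry: the cost grows with the number of nonzeros.
    np.add.at(out, rows, vals[:, None] * X[cols])
    return out

# Path graph 0-1-2 with self-connections, renormalized as A_hat.
A_tilde = np.array([[1, 1, 0],
                    [1, 1, 1],
                    [0, 1, 1]], dtype=float)
d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
A_hat = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

rows, cols = np.nonzero(A_hat)                 # COO indices of the nonzeros
vals = A_hat[rows, cols]

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))                    # N = 3 nodes, C = 4 input channels
Theta = rng.normal(size=(4, 2))                # C x F filter matrix, F = 2

# Matrix multiplication is associative, so A_hat @ (X @ Theta) equals (A_hat @ X) @ Theta.
Z = sparse_dense_matmul(rows, cols, vals, X @ Theta, num_nodes=3)
```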

【Semi-supervised Node Classification 】

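With the flexible model $f(X, A)$ above for efficient information propagation on graphs, we can return to the problem of semi-supervised node classification. As outlined in the introduction, we can relax certain assumptions typically made in graph-based semi-supervised learning by conditioning our model $f(X, A)$ both on the data $X$ and on the adjacency matrix $A$ of the underlying graph structure. We expect this setting to be especially powerful in scenarios where the adjacency matrix contains information not present in the data $X$. The overall model, a multi-layer GCN for semi-supervised learning, is depicted in Figure 1.

3.1 Example: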

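We consider a two-layer GCN for semi-supervised node classification on a graph with a symmetric adjacency matrix $A$ (binary or weighted). We first compute $\hat{A} = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$ in a preprocessing step. The forward model then takes the simple form:

$$Z = f(X, A) = \mathrm{softmax}\left( \hat{A} \, \mathrm{ReLU}\left( \hat{A} X W^{(0)} \right) W^{(1)} \right) \tag{9}$$

Here $W^{(0)} \in \mathbb{R}^{C \times H}$ is an input-to-hidden weight matrix for a hidden layer with $H$ feature maps, and $W^{(1)} \in \mathbb{R}^{H \times F}$ is a hidden-to-output weight matrix. The softmax is applied row-wise. For semi-supervised multi-class classification, we evaluate the cross-entropy error over all labeled examples:

$$\mathcal{L} = -\sum_{l \in \mathcal{Y}_L} \sum_{f=1}^{F} Y_{lf} \ln Z_{lf} \tag{10}$$

where $\mathcal{Y}_L$ is the set of node indices that have labels.

The neural network weights $W^{(0)}$ and $W^{(1)}$ are trained with gradient descent. In this work, batch gradient descent is performed using the full dataset for every training iteration.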
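The two-layer forward pass and the loss over labeled nodes can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the code from the linked repository: the function names, the 4-node path graph, the random features and weights, and the choice of labeled nodes are all invented for the example, and no training step is shown.

```python
import numpy as np

def softmax_rows(S):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    S = S - S.max(axis=1, keepdims=True)
    E = np.exp(S)
    return E / E.sum(axis=1, keepdims=True)

def gcn_forward(A_hat, X, W0, W1):
    """Z = softmax(A_hat ReLU(A_hat X W0) W1)."""
    hidden = np.maximum(A_hat @ X @ W0, 0.0)
    return softmax_rows(A_hat @ hidden @ W1)

def masked_cross_entropy(Z, Y, labeled_idx):
    """Cross-entropy summed over the labeled nodes only; Y is one-hot."""
    return float(-(Y[labeled_idx] * np.log(Z[labeled_idx])).sum())

# Toy problem: 4 nodes on a path, 3 input features, 2 classes, 2 labeled nodes.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A_tilde = A + np.eye(4)
d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
A_hat = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # preprocessing step

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
W0 = rng.normal(scale=0.1, size=(3, 5))     # input-to-hidden, H = 5
W1 = rng.normal(scale=0.1, size=(5, 2))     # hidden-to-output, F = 2

Y = np.zeros((4, 2))
Y[0, 0] = 1.0                               # node 0 is labeled with class 0
Y[3, 1] = 1.0                               # node 3 is labeled with class 1
labeled_idx = np.array([0, 3])

Z = gcn_forward(A_hat, X, W0, W1)
loss = masked_cross_entropy(Z, Y, labeled_idx)
```

Although the loss only sums over nodes 0 and 3, the predictions for those nodes depend on the features of their neighbors through $\hat{A}$, so unlabeled nodes still influence the gradients.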

PyTorch implementation:

1. train.py:

Data loading

2. Layer definition:
