论文信息

论文标题：DUCK: Rumour Detection on Social Media by Modelling User and Comment Propagation Networks
论文作者：Lin Tian, Xiuzhen Zhang, Jey Han Lau
论文来源：2022，NAACL
论文地址：download
论文代码：download

1 Introduction

　　本文的模型研究了如何充分利用用户和评论信息，对比之前的方法，有以下不同：

　　(1) we model comments both as a:

　　　　(i) stream to capture the temporal nature of evolving comments;

　　　　(ii) network by following the conversational structure (see Figure 1 for an illustration);

　　(2) our comment network uses sequence model to encode a pair of comments before feeding them to a graph network, allowing our model to capture the nuanced charac- teristics (e.g. agreement or rebuttal) exhibited by a reply;

　　(3) when modelling the users who engage with a story via graph networks, we initialise the user nodes with encodings learned from their profiles and characteristics of their “friends” based on their social networks.

2 Problem Statement

3 Methodology

　　总体框架：

　　包括如下几个部分：

　　(1) comment tree: models the comment network by following the reply-to structure using a combination of BERT and graph attentional networks;
　　(2) comment chain: models the comments as a stream using transformer-based sequence models;
　　(3) user tree: incorporates social relations to model the user network using graph attentional networks;
　　(4) rumour classifier: combines the output from comment tree, comment chain and user tree to classify the source post.

　　请注意，user tree 的网络结构不同于 comment tree 的网络结构，因为前者同时捕获 comment 和 reposts/retweets，但后者只考虑 comment（Figure 1）。

3.1 Comment Tree

　　基于 GNN 的建模 comment 之间的关系的模型通常使用的是简单的文本特征（bag-of-words），忽略了 comment 之间的微妙关系（"stance" or "deny"）关系。

　　所以，本文采用预训练语言模型 BERT 和 GAT 去建模 comment tree ，具体参见 Figure 2：

　　首先，使用 BERT 去处理一对 parent-child posts ，然后使用 GAT 去建模整个 conversational strucure 。（ self-attention 在 parent-child 之间的词产生细粒度的分析）

　　以 Figure 2 中的 comment tree 为例，这意味着我们将首先使用 BERT 处理以下几对 comments {(0, 0),(0, 1),(0, 2),(2, 6),(2, 7),(6, 9)}：

　　　　$h_{p+q}=\mathrm{BERT}\left(\mathrm{emb}\left([C L S], c_{p},[S E P], c_{q}\right)\right)$

　　其中，$c$ 表示 text，$emb()$ 表示 embedding function，$h$ 表示由 BERT 产生的 [CLS] 标记的上下文表示。

　　为了模拟 conversational network structure ，本文使用图注意网络 GAT。为了计算 $h_{i}^{(l+1)}$，在迭代 $l+1$ 次时对节点 $i$ 的编码：

　　　　$\begin{array}{l}e_{i j}^{(l)} &=&\operatorname{LR}\left(a^{(l)^{T}}\left(W^{(l)} h_{i}^{(l)} \oplus W^{(l)} h_{j}^{(l)}\right)\right) \\h_{i}^{(l+1)} &=&\sigma\left(\sum\limits _{j \in \mathcal{N}(i)} \operatorname{softmax}\left(e_{i j}^{(l)}\right) z_{j}^{(l)}\right)\end{array}$

　　为了聚合节点编码以得到一个图表示（$\left(z_{c t}\right)$），探索了四种方法：

　　root：Uses the root encoding to represent the graph as the source post

　　　　$z_{c t}=h_{0}^{L}$

　　$\neg root$: Mean-pooling over all nodes except the root:

　　　　$z_{c t}=\frac{1}{m} \sum_{i=1}^{m} h_{i}^{L}$

　　　　where $m$ is the number of replies/comments.

　　$\Delta$ : Mean-pooling of the root node and its immediate neighbours:

　　　　$z_{c t}=\frac{1}{|\mathcal{N}(0)|} \sum_{i \in \mathcal{N}(0)} h_{i}^{L}$

　　all: Mean-pooling of all nodes:

　　　　$z_{c t}=\frac{1}{m+1} \sum_{i=0}^{m} h_{i}^{L}$

3.2 Comment Chain

　　本文按照它们发布的顺序将这些帖子建模为一个流结构，而不是一个树结构，处理 comment chain 考虑了三种模型：

　　(1) one-tier transformer
　　(2) longformer
　　(3) two-tier transformer

3.2.1 One-tier transformer

　　给定一个源帖子 $\left(c_{0}\right)$ 和 comment $\left(\left\{c_{1}, \ldots, c_{m}\right\}\right)$，我们可以简单地将它们连接成一个长字符串，并将其提供给 BERT：

　　　　$z_{c c}=\operatorname{BERT}\left(\mathrm{emb}\left([C L S], c_{0},[S E P], c_{1}, \ldots, c_{m^{\prime}}\right)\right)$

　　其中，$m^{\prime}(<m)$ 是我们可以合并的不超过 BERT 的最大序列长度的 comment（实验中是384个）。

3.2.2 Longformer

　　为规避序列长度的限制，实验使用了一个 Longformer，它可以处理多达4096个子词，允许使用大部分 comment，如果不是所有的评论。

　　Longformer 具有与 one-tier transformer 类似的架构，但使用更稀疏的注意模式来更有效地处理更长的序列。我们使用一个预先训练过的 Longformer，并遵循与之前相同的方法来建模 comment chain：

　　　　$z_{c c}=\mathrm{LF}\left(\operatorname{emb}\left([C L S], c_{0},[S E P], c_{1}, \ldots, c_{m^{\prime \prime}}\right)\right)$

　　其中，$m^{\prime \prime} \approx m$

3.2.3 Two-tier transformer

　　解决序列长度限制的另一种方法是使用 two tiers of transformers 对 comment chain 进行建模：一层用于独立处理帖子，另一种用于使用来自第一个 transformer 的表示来处理帖子序列。

　　　　$\begin{array}{l}h_{i} &=&\operatorname{BERT}\left(\mathrm{emb}_{1}\left([C L S], c_{i}\right)\right) \\z_{c c} &=&\operatorname{transformer}\left(\operatorname{emb}_{2}([C L S]), h_{0}, h_{1}, \ldots, h_{m}\right)\end{array}$

　　其中，BERT 和 transformer 分别表示 first-tier transformers 和 second-tier transformers。econd-tier transformers 具有与 BERT 类似的架构，但只有 2 层，其参数是随机初始化的。

3.3 User Tree

　　我们探索了三种都是基于 GAT 建模 user network 的方法，并通过 mean-pooling 所有节点来聚合节点编码，以生成图表示：

　　　　$z_{u t}=\frac{1}{m+1} \sum\limits_{i=0}^{m} h_{i}^{L}$

　　这三种方法之间的主要区别在于它们如何初始化用户节点 $\left(h_{i}^{(0)}\right)$：

　　第一种 $\mathbf{G A T_{\text {rnd }}}$ ：用随机向量初始化用户节点。

　　　　$h_{i}^{0}=\operatorname{random}\left[v_{1}, v_{2}, \ldots, v_{d}\right]$

　　第二种 $\mathbf{GAT _{\text {prf: }}}$ : 来自他们的 user profiles ：username, user screen name, user description, user account age 等。因此，static user node $h_{i}^{0}$ 由 $v_{i} \in \mathbb{R}^{k}$ 给出

　　　　$h_{i}^{0}=\left[v_{1}, v_{2}, \ldots, v_{k}\right]$

　　第三种 $\mathbf{GAT_{\text {prf }+\text { rel : }}}$：该方法基于用户特征（user profiles）及其社会关系（基于“follow”关系）通过变分图自动编码器 GAE 初始化用户节点的表示。

　　前者捕捉使用源帖子的用户，而后者是互相关注的用户网络。

　　给定基于训练数据构造的 social graph $G_{s}$，我们可以推导出一个邻接矩阵 $\mathrm{A} \in \mathbb{R}^{n \times n}$，其中 $\mathrm{n} $ 为用户数。设 $X=\left[x_{1}, x_{2}, \ldots, x_{n}\right], x_{i} \in \mathbb{R}^{k}$，$x_{i} \in \mathbb{R}^{k}$ 为输入节点特征。我们的目标是学习一个变换矩阵 $\mathrm{Z} \in \mathbb{R}^{n \times d}$，它将用户转换为一个维数为 $d$ 的潜在空间。我们使用一个两层的 GCN 作为编码器。它以邻接矩阵 $\mathrm{A}$ 和特征矩阵 $\mathrm{X}$ 作为输入，并生成潜在变量 $Z$ 作为输出。解码器由潜在变量 $\mathrm{Z}$ 之间的内积定义。我们的解码器的输出是一个重构的邻接矩阵 $ \hat{A}$。从形式上讲：

　　　　$\begin{array}{l}Z &=\operatorname{enc}(\mathbf{X}, \mathbf{A}) =\operatorname{GCN}\left(f\left(\operatorname{GCN}\left(\mathbf{A}, \mathbf{X} ; \theta_{1}\right)\right) ; \theta_{2}\right) \\\hat{A} &=\operatorname{dec}\left(Z, Z^{\top}\right)=\sigma\left(Z Z^{\top}\right)\end{array}$

　　$h_{i}^{(0)} \in \mathbb{R}^{d}$ 通过下述方法计算：

　　　　$h_{i}^{(0)}=\left\{\begin{array}{ll}\operatorname{ReLU}\left(W \cdot\left[v_{1}, \ldots, v_{k}\right]\right), & \text { if } \operatorname{user}_{i} \notin G_{s} \\Z_{i}, & \text { if } \operatorname{user}_{i} \in G_{s}\end{array}\right.$

　　其中，$W_{i}$ 是全连接参数，$v_{i} \in \mathbb{R}^{k}$ 是 user profiles。

3.4 Rumour Classifier

　　使用 comment tree、comment chain、user tree 分别生成的图表示 $z_{c t}$、$z_{c c}$、$z_{u t}$ 进行谣言分类：

　　　　$\begin{array}{l}z=z_{c t} \oplus z_{c c} \oplus z_{u t} \\\hat{y}=\operatorname{softmax}\left(W_{c} z+b_{c}\right) \\\mathcal{L}=-\sum\limits _{i=1}^{n} y_{i} \log \left(\hat{y_{i}}\right)\end{array}$

　　其中，$n$ 表示训练实例数。

4 Experiments and Results

4.1 Datasets

　　数据集统计如下：

　　we report the average performance based on 5-fold cross-validation.

　　we reserve 20% data as test and split the rest in a ratio of 4:1 for training and development partitions and report the average test performance over 5 runs (initialised with different random seeds).

4.2 Results

　　本文实验主要回答如下问题：

Q1 [Comment tree]: Does incorporating BERT to analyse the relation between parent and child posts help modelling the comment network, and what is the best way to aggregate comment-pair encodings to represent the comment graph?
Q2 [Comment chain]: Does incorporating more comments help rumour detection when modelling them as a stream of posts?
Q3 [User tree]: Can social relations help modelling the user network?
Q4 [Overall performance]: Do the three different components complement each other and how does a combined approach compared to existing rumour detection systems?

4.2.1 Comment Tree

　　为了理解使用BERT处理一对 parent-child posts 的影响，我们提出了另一种替代方法（“unpaired”），即使用 BERT 独立处理每个帖子，然后将其 [CLS] 表示提供给GAT。

　　　　$h_{p}=\operatorname{BERT}\left(\operatorname{emb}\left([C L S], c_{p}\right)\right)$

　　其中，$h$ 将用作 GAT 中的初始节点表示（$h^{(0)}$）。这里报告了这个替代模型（“unpaired”）及不同的聚合方法（“root”、“¬root”、“$\bigtriangleup $” 和 “all”）的性能。

　　Comparing the aggregation methods, "all" performs the best, followed by "$\boldsymbol{\Delta}$ " and "root" (0.88 vs . 0.87 vs. 0.86 in Twitter16; 0.87 vs. 0.86 vs. 0.85 in CoAID in terms of Macro-F1). We can see that the root and its immediate neighbours contain most of the information, and not including the root node impacts the performance severely (both Twitter16 and CoAID drops to 0.80 with $\neg$ root).

　　Does processing the parent-child posts together with BERT help? The answer is evidently yes, as we see a substantial drop in performance when we process the posts independently: "unpaired" produces a macro-F1 of only 0.83 in both Twitter16 and CoAID. Given these results, our full model (DUCK) will be using "all"' as the aggregation method for computing the comment graph representation.

4.2.2 Comment Chain

　　Fig. 3 绘制了我们改变所包含的评论数量来回答 Q2 的结果：

4.2.3 User Tree

4.2.4 Overall Rumour Detection Performance

谣言检测（DUCK）《DUCK: Rumour Detection on Social Media by Modelling User and Comment Propagation Networks》的更多相关文章

谣言检测（PSIN）——《Divide-and-Conquer: Post-User Interaction Network for Fake News Detection on Social Media》
论文信息论文标题:Divide-and-Conquer: Post-User Interaction Network for Fake News Detection on Social Media论 ...
谣言检测（GACL）《Rumor Detection on Social Media with Graph Adversarial Contrastive Learning》
论文信息论文标题:Rumor Detection on Social Media with Graph AdversarialContrastive Learning论文作者:Tiening Sun ...
谣言检测（RDEA）《Rumor Detection on Social Media with Event Augmentations》
论文信息论文标题:Rumor Detection on Social Media with Event Augmentations论文作者:Zhenyu He, Ce Li, Fan Zhou, Y ...
谣言检测——(GCAN)《GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Media》
论文信息论文标题:GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Medi ...
谣言检测(RDCL)——《Towards Robust False Information Detection on Social Networks with Contrastive Learning》
论文信息论文标题:Towards Robust False Information Detection on Social Networks with Contrastive Learning论文作 ...
谣言检测——《MFAN: Multi-modal Feature-enhanced Attention Networks for Rumor Detection》
论文信息论文标题:MFAN: Multi-modal Feature-enhanced Attention Networks for Rumor Detection论文作者:Jiaqi Zheng, ...
谣言检测（PLAN）——《Interpretable Rumor Detection in Microblogs by Attending to User Interactions》
论文信息论文标题:Interpretable Rumor Detection in Microblogs by Attending to User Interactions论文作者:Ling Min ...
谣言检测——（PSA）《Probing Spurious Correlations in Popular Event-Based Rumor Detection Benchmarks》
论文信息论文标题:Probing Spurious Correlations in Popular Event-Based Rumor Detection Benchmarks论文作者:Jiayin ...
谣言检测（ClaHi-GAT）《Rumor Detection on Twitter with Claim-Guided Hierarchical Graph Attention Networks》
论文信息论文标题:Rumor Detection on Twitter with Claim-Guided Hierarchical Graph Attention Networks论文作者:Erx ...

随机推荐

centos 8及以上安装mysql 8.0
本文适用于centos 8及以上安装mysql 8.0,整体耗时20分钟内,不需要FQ 1.环境先搞好 systemctl stop firewalld //关闭防火墙 systemctl disab ...
中高级Java程序员,挑战20k+,知识点汇总(一),Java修饰符
1 前言工作久了就会发现,基础知识忘得差不多了.为了复习下基础的知识,同时为以后找工作做准备,这里简单总结一些常见的可能会被问到的问题. 2 自我介绍自己根据实际情况发挥就行 3 Java SE ...
odoo 14 一些常见问题集
1 # 当你往tree或者form视图中增加action的时候 2 # 记住!千万别重名 3 # 一旦重名,Export.Delete.Archive.Unarchive都会消失不见 4 # tree ...
项目应用丨Modbus转Profinet网关连接ABB变频器的现场应用记录
本案例客户需求是将ABB变频器接入到Profinet网络中,使用设备为西门子1200PLC,ABB变频器以及小疆智控Modbus转profinet网关.1.首先打开西门子组态软件,新建一个项目. 2. ...
新一代大数据任务调度系统 - Apache DolphinScheduler 1.3.4 发布，推荐下载
| 本文编辑:朱桐新一代大数据任务调度 - Apache DolphinScheduler(incubator) 在经过社区 30 多位小伙伴的贡献与努力下于发布了 1.3.4 版本,1.3.4 作 ...
DS队列----银行单队列多窗口模拟
题目描述假设银行有K个窗口提供服务,窗口前设一条黄线,所有顾客按到达时间在黄线后排成一条长龙.当有窗口空闲时,下一位顾客即去该窗口处理事务.当有多个窗口可选择时,假设顾客总是选择编号最小的窗口. 本 ...
文件分享工具ShareLocalFile不需要云盘的实时上传下载文件的云盘工具可以搜索整个网络的文件
工具的下载地址:https://comm.zhaimaojun.cn/AllSources/ToolDetail/?tid=9693 这是一个未来的项目,可以分享我们的文件,目前由于个人的技术水平限制 ...
KingbaseES 中 JSON 介绍
KingbaseES支持JSON和JSONB.这两种类型在使用上几乎完全一致,主要区别是 JSON类型把输入的数据原封不动的存放到数据库中.JSONB类型在存放时把JSON解析成二进制格式. JSON ...
三分钟，带你了解PLM
PLM应用于单一地点或者多个地点的企业内部.以及在产品研发领域具有协作关系的企业之间的.支持产品全生命周期的信息的创建.管理.分发和应用的综合性的应用解决方案,能够集成与产品相关的流程.应用系统和信息 ...
批量修改DNS记录的TTL值
最近有个需求,需要修改Windows DNS服务器上区域下所有A记录的TTL值.原先默认的TTL是1小时.也就是说,其它DNS服务器会缓存查询到的记录1个小时.对于近期需要大量修改记录的情况来说这样生 ...

谣言检测（DUCK）《DUCK: Rumour Detection on Social Media by Modelling User and Comment Propagation Networks》