DUCK rumour detection: "DUCK: Rumour Detection on Social Media by Modelling User and Comment Propagation Networks" (Part 3)


$h_{i}^{(0)} \in \mathbb{R}^{d}$ is computed as follows:
$h_{i}^{(0)}=\left\{\begin{array}{ll}\operatorname{ReLU}\left(W \cdot\left[v_{1}, \ldots, v_{k}\right]\right), & \text { if } \operatorname{user}_{i} \notin G_{s} \\Z_{i}, & \text { if } \operatorname{user}_{i} \in G_{s}\end{array}\right.$
where $W$ is the weight matrix of a fully connected layer and $v_{i} \in \mathbb{R}^{k}$ are the user profiles.
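The case split above can be illustrated with a minimal pure-Python sketch. It assumes that $\left[v_{1}, \ldots, v_{k}\right]$ is a flat list of $k$ profile features, that $W$ is a $d \times k$ matrix, and that the embeddings $Z_{i}$ of users in $G_{s}$ are available in a dictionary; the names `initial_user_state` and `social_embeddings` are hypothetical and do not come from the paper.

```python
def relu(x):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, value) for value in x]


def linear(W, x):
    """Multiply a d x k matrix (list of rows) by a length-k vector."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def initial_user_state(user_id, profile, social_embeddings, W):
    """Return h_i^(0) for one user.

    social_embeddings: dict mapping user id -> Z_i for users inside G_s
                       (hypothetical container, assumed for illustration).
    profile: list of k profile features, used only when the user is not in G_s.
    """
    if user_id in social_embeddings:
        # user_i is in G_s: reuse the embedding Z_i directly.
        return list(social_embeddings[user_id])
    # user_i is not in G_s: project the profile features and apply ReLU.
    return relu(linear(W, profile))


# Toy example with d = 2 and k = 3.
W = [[1.0, -1.0, 0.5],
     [-2.0, 0.0, 1.0]]
social_embeddings = {"alice": [0.3, 0.7]}

h_alice = initial_user_state("alice", [9.0, 9.0, 9.0], social_embeddings, W)
h_bob = initial_user_state("bob", [1.0, 2.0, 4.0], social_embeddings, W)
```

Both branches return a vector of the same dimension $d$, which is what allows users inside and outside $G_{s}$ to be mixed in the same graph.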
3.4 Rumour Classifier
The graph representations $z_{c t}$, $z_{c c}$ and $z_{u t}$, produced by the comment tree, the comment chain and the user tree respectively, are used for rumour classification:
$\begin{array}{l}z=z_{c t} \oplus z_{c c} \oplus z_{u t} \\\hat{y}=\operatorname{softmax}\left(W_{c} z+b_{c}\right) \\\mathcal{L}=-\sum\limits _{i=1}^{n} y_{i} \log \left(\hat{y}_{i}\right)\end{array}$
where $n$ is the number of training instances.
4 Experiments and Results
4.1 Datasets
The dataset statistics are as follows:

[Figure: dataset statistics]
We report the average performance based on 5-fold cross-validation.
We reserve 20% of the data as the test set, split the rest in a ratio of 4:1 into training and development partitions, and report the average test performance over 5 runs (initialised with different random seeds).
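The hold-out protocol can be sketched as follows. This is only an illustration under stated assumptions: the split is drawn once with a fixed `split_seed`, the five seeds affect only the model runs, and `run_fn` is a hypothetical callback standing in for training and evaluating a model. None of these names come from the paper.

```python
import random


def holdout_split(examples, split_seed=0):
    """Reserve 20% as test, then split the remainder 4:1 into train and dev."""
    rng = random.Random(split_seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)

    n_test = round(0.2 * len(shuffled))
    test, rest = shuffled[:n_test], shuffled[n_test:]

    n_dev = round(len(rest) / 5)  # 4:1 train:dev means dev is one fifth of the rest
    dev, train = rest[:n_dev], rest[n_dev:]
    return train, dev, test


def average_test_score(train, dev, test, run_fn, seeds=(0, 1, 2, 3, 4)):
    """Average the test score over several runs with different random seeds.

    run_fn(train, dev, test, seed) is a hypothetical callback that trains a
    model initialised with `seed` and returns its test score.
    """
    scores = [run_fn(train, dev, test, seed) for seed in seeds]
    return sum(scores) / len(scores)


# Toy usage with 100 dummy examples and a stub in place of real training.
train, dev, test = holdout_split(range(100))


def dummy_run(train, dev, test, seed):
    return 0.80 + 0.01 * seed  # placeholder score, not a real result


mean_score = average_test_score(train, dev, test, dummy_run)
```

With 100 examples this yields 64 training, 16 development and 20 test instances.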
4.2 Results
The experiments are designed to answer the following questions:
  • Q1 [Comment tree]: Does incorporating BERT to analyse the relation between parent and child posts help modelling the comment network, and what is the best way to aggregate comment-pair encodings to represent the comment graph?
  • Q2 [Comment chain]: Does incorporating more comments help rumour detection when modelling them as a stream of posts?
  • Q3 [User tree]: Can social relations help modelling the user network?
  • Q4 [Overall performance]: Do the three different components complement each other, and how does a combined approach compare to existing rumour detection systems?
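4.2.1 Comment Tree
To understand the effect of using BERT to process a pair of parent-child posts, the authors introduce an alternative approach ("unpaired") in which BERT processes each post independently and its [CLS] representation is then fed to the GAT.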
$h_{p}=\operatorname{BERT}\left(\operatorname{emb}\left([C L S], c_{p}\right)\right)$
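where $h_{p}$ is used as the initial node representation ($h^{(0)}$) in the GAT. The performance of this alternative model ("unpaired") and of the different aggregation methods ("root", "$\neg$root", "$\bigtriangleup$" and "all") is reported here.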
[Figure: performance of the "unpaired" model and the aggregation methods]
Comparing the aggregation methods, "all" performs the best, followed by "$\bigtriangleup$" and "root" (0.88 vs. 0.87 vs. 0.86 in Twitter16; 0.87 vs. 0.86 vs. 0.85 in CoAID in terms of Macro-F1). We can see that the root and its immediate neighbours contain most of the information, and that not including the root node impacts the performance severely (both Twitter16 and CoAID drop to 0.80 with "$\neg$root").
Does processing the parent-child posts together with BERT help? The answer is evidently yes, as we see a substantial drop in performance when we process the posts independently: "unpaired" produces a Macro-F1 of only 0.83 in both Twitter16 and CoAID. Given these results, the full model (DUCK) uses "all" as the aggregation method for computing the comment graph representation.
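To make the comparison of aggregation methods more concrete, here is a small sketch of how the four node selections might differ. This part of the article does not define them, so the readings below are assumptions: "root" keeps only the root node, "$\neg$root" keeps every node except the root, "$\bigtriangleup$" keeps the root and its immediate neighbours, and "all" keeps every node. Mean pooling over the selected nodes is also assumed purely for illustration.

```python
def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors, dimension by dimension."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def aggregate(node_states, root_neighbours, method, root=0):
    """Pool node states into one graph vector.

    node_states: list of per-node vectors (e.g. GAT outputs).
    root_neighbours: indices of nodes directly connected to the root.
    method: "root", "not_root", "triangle" or "all" (assumed readings of the
            paper's "root", "¬root", "△" and "all").
    """
    if method == "root":
        selected = [root]
    elif method == "not_root":
        selected = [i for i in range(len(node_states)) if i != root]
    elif method == "triangle":
        selected = [root] + list(root_neighbours)
    elif method == "all":
        selected = list(range(len(node_states)))
    else:
        raise ValueError(f"unknown aggregation method: {method}")
    return mean_pool([node_states[i] for i in selected])


# Toy tree: node 0 is the root, nodes 1 and 2 reply to it, node 3 replies to node 1.
states = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]
pooled = {m: aggregate(states, [1, 2], m)
          for m in ("root", "not_root", "triangle", "all")}
```

Under these assumptions, "not_root" is the only variant that never sees the source post, which is consistent with it being the weakest option in the reported results.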
4.2.2 Comment Chain
Fig. 3 plots the results of varying the number of included comments, which answers Q2:
[Fig. 3: performance as the number of included comments varies]
4.2.3 User Tree
[Figure: user tree results]
4.2.4 Overall Rumour Detection Performance
[Figure: overall rumour detection performance]
