RDEA Rumor Detection: "Rumor Detection on Social Media with Event Augmentations" (Part 2)


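After contrastive pre-training on the event graphs, we obtain the pre-trained vector $H\left(G_{i}\right)$ of the input event graph $G_{i}$. Then, for an event $C_{i}=\left[r_{i}, x_{1}^{i}, x_{2}^{i}, \cdots, x_{\left|\mathcal{V}_{i}\right|-1}^{i}, G_{i}\right]$, we obtain the textual graph vector $o_{i}$ by averaging the raw features of the source post and all of its reply posts: $o_{i}=\frac{1}{n_{i}}\left(\sum_{j=1}^{\left|\mathcal{V}_{i}\right|-1} x_{j}^{i}+r_{i}\right)$, where $n_{i}$ is the number of posts in the event. To emphasize the source post, the contrastive vector, the textual graph vector, and the source post features are concatenated: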
$\mathbf{S}_{i}=\operatorname{CONCAT}\left(H\left(G_{i}\right), o_{i}, r_{i}\right)$
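The fusion step can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' code: the function name `fuse_event_representation`, the feature dimensions, and the random inputs are all assumptions made for the example.

```python
import numpy as np

def fuse_event_representation(h_g, reply_feats, root_feat):
    """Build S_i = CONCAT(H(G_i), o_i, r_i) for a single event.

    h_g         : contrastive graph embedding H(G_i), shape (d_h,)
    reply_feats : raw features of the reply posts, shape (|V_i| - 1, d_x)
    root_feat   : raw features r_i of the source post, shape (d_x,)
    """
    # o_i: mean of the raw features over all posts (replies plus source post)
    all_posts = np.vstack([reply_feats, root_feat[None, :]])
    o_i = all_posts.mean(axis=0)
    # Concatenating r_i again gives the source post extra weight
    return np.concatenate([h_g, o_i, root_feat])

# Toy inputs (hypothetical sizes: 4-dim graph embedding, 5-dim post features)
rng = np.random.default_rng(0)
h_g = rng.normal(size=4)
replies = rng.normal(size=(3, 5))
root = rng.normal(size=5)

s_i = fuse_event_representation(h_g, replies, root)
print(s_i.shape)  # (14,)
```

Note that the source post contributes twice: once inside the average $o_{i}$ and once as the explicit $r_{i}$ block.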
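2.3 Fine-tuning
Pre-training uses the text features to obtain a pre-trained event representation, which is then combined with the raw features and the source post information. In the fine-tuning stage, the model parameters are initialized with the pre-trained parameters, and the model is trained with labels:
The $\mathbf{S}_{i}$ produced above is passed through a fully connected layer for classification: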
$\hat{\mathbf{y}}_{i}=\operatorname{softmax}\left(F C\left(\mathbf{S}_{i}\right)\right)$
Finally, the cross-entropy loss is used:
$\mathcal{L}(Y, \hat{Y})=-\sum_{i=1}^{|C|} \mathbf{y}_{i} \log \hat{\mathbf{y}}_{i}+\lambda\|\Theta\|_{2}^{2}$
where $\|\Theta\|_{2}^{2}$ is the $L_{2}$ regularization term, $\Theta$ denotes the model parameters, and $\lambda$ is the trade-off coefficient.
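As a rough illustration of the classification head and objective, the NumPy sketch below applies a single fully connected layer followed by softmax, then computes cross-entropy in its conventional form (with a leading minus sign) plus the $L_{2}$ penalty. The names `classify` and `loss_fn`, the 4-class setting, the 14-dimensional input, and the value of `lam` are assumptions for the example, not values taken from the paper.

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max for numerical stability
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def classify(S, W, b):
    """y_hat = softmax(FC(S)) with one fully connected layer."""
    return softmax(S @ W + b)

def loss_fn(Y, Y_hat, params, lam):
    """Cross-entropy over all events plus lam * ||Theta||_2^2."""
    cross_entropy = -np.sum(Y * np.log(Y_hat + 1e-12))
    l2_penalty = sum(np.sum(p ** 2) for p in params)
    return cross_entropy + lam * l2_penalty

# Toy usage: 2 events, 14-dim fused vectors, 4 classes (all hypothetical)
rng = np.random.default_rng(0)
S = rng.normal(size=(2, 14))
W = np.zeros((14, 4))          # zero weights give uniform predictions
b = np.zeros(4)
Y = np.eye(4)[[0, 2]]          # one-hot labels for the two events

Y_hat = classify(S, W, b)
loss = loss_fn(Y, Y_hat, [W, b], lam=1e-4)
print(Y_hat[0], loss)
```

With zero weights every class receives probability 0.25, so the loss reduces to $2 \log 4$, which is a convenient sanity check before training.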
3 Experiments
3.1 Baselines

    • DTC [3]: A rumor detection approach that applies a decision tree to tweet features to assess information credibility.
    • SVM-TS [10]: A linear SVM-based time-series model that leverages handcrafted features to make predictions.
    • RvNN [11]: A recursive tree-structured model with GRU units that learn rumor representations via the tree structure.
    • PPC_RNN+CNN [8]: A rumor detection model combining RNN and CNN for early-stage rumor detection, which learns the rumor representations by modeling user and source tweets.
    • Bi-GCN [2]: A GCN-based model on directed graphs that learns rumor representations through a bi-directional propagation structure.
3.2 Performance Comparison
[Figure: performance comparison results]
3.3 Ablation study
[Figure: ablation study results]
    • -R denotes our model without root feature enhancement.
    • -T denotes our model without the textual graph.
    • -A denotes our model without event augmentation.
    • -M denotes our model without mutual information.
3.4 Limited labeled data
Figure 3 shows the performance as the fraction of labeled data varies:
[Figure 3: performance under varying fractions of labeled data]
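We observe that RDEA is more label-sensitive than Bi-GCN on both datasets. Moreover, the fewer the labels, the larger the improvement, which indicates the robustness and data efficiency of RDEA.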
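One simple way to set up such a limited-label experiment is to keep only a random fraction of the labeled training events for fine-tuning. The sketch below is an assumption about how this could be done, not the sampling procedure reported in the paper; `subsample_labels` and its parameters are hypothetical names.

```python
import numpy as np

def subsample_labels(train_idx, fraction, seed=0):
    """Return a random subset of training indices whose labels are kept."""
    rng = np.random.default_rng(seed)
    train_idx = np.asarray(train_idx)
    # Keep at least one labeled event even for very small fractions
    k = max(1, int(round(fraction * len(train_idx))))
    return rng.choice(train_idx, size=k, replace=False)

# Keep labels for 10% of 100 hypothetical training events
kept = subsample_labels(range(100), fraction=0.1)
print(len(kept))  # 10
```

Fixing the seed keeps the labeled subset identical across the models being compared, so differences in accuracy are not caused by different label draws.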
3.5 Early Rumor Detection
[Figure: early rumor detection results]
