RDEA Rumor Detection: "Rumor Detection on Social Media with Event Augmentations" (Part 2)

After contrastive pre-training on the event graphs, we obtain the pre-trained vector $H\left(G_{i}\right)$ of the input event graph $G_{i}$. Then, for an event $C_{i}=\left[r_{i}, x_{1}^{i}, x_{2}^{i}, \cdots, x_{\left|\mathcal{V}_{i}\right|-1}^{i}, G_{i}\right]$, we average the raw features of the source post and all of its reply posts, $o_{i}=\frac{1}{n_{i}}\left(\sum_{j=1}^{\left|\mathcal{V}_{i}\right|-1} x_{j}^{i}+r_{i}\right)$, to obtain the textual graph vector $o_{i}$. To emphasize the source post, the contrastive vector, the textual graph vector, and the source post features are concatenated as:
$\mathbf{S}_{i}=\operatorname{CONCAT}\left(H\left(G_{i}\right), o_{i}, r_{i}\right)$
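The averaging and concatenation steps can be illustrated with a minimal sketch in plain Python. The function names `mean_textual_vector` and `fuse_event` and the toy values are hypothetical, and the sketch assumes that $n_{i}$ is the total number of posts in the event (the source post plus its replies). It is not the authors' implementation.

```python
from typing import List

Vector = List[float]


def mean_textual_vector(source: Vector, replies: List[Vector]) -> Vector:
    """Compute o_i: the element-wise mean of the source post r_i and all replies x_j^i.

    Assumption: n_i counts every post in the event, including the source post.
    """
    posts = [source] + replies
    n = len(posts)
    return [sum(post[d] for post in posts) / n for d in range(len(source))]


def fuse_event(h_graph: Vector, source: Vector, replies: List[Vector]) -> Vector:
    """Build S_i = CONCAT(H(G_i), o_i, r_i) as one flat feature vector."""
    o = mean_textual_vector(source, replies)
    # List concatenation keeps the order: contrastive vector, textual vector, source post.
    return h_graph + o + source


# Toy values (hypothetical): a 2-d contrastive vector and 3-d post features.
h = [0.5, -0.5]                           # stands in for H(G_i)
r = [1.0, 0.0, 2.0]                       # source post features r_i
xs = [[0.0, 1.0, 2.0], [2.0, 2.0, 2.0]]   # reply features x_1^i, x_2^i

s = fuse_event(h, r, xs)
print(len(s))  # dimension of S_i = len(h) + 2 * len(r)
```

Because $r_{i}$ appears both inside the average $o_{i}$ and as its own block at the end of $\mathbf{S}_{i}$, the source post contributes twice, which is how the concatenation emphasizes it.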
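2.3 Fine tuning

Pre-training uses the text features to produce a pre-trained event representation, which is then combined with the raw features and the source post information. In the fine-tuning stage, the model parameters are initialized from the pre-trained parameters, and the model is trained with labels.
The vector $\mathbf{S}_{i}$ built above is passed through a fully connected layer for classification: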
$\hat{\mathbf{y}}_{i}=\operatorname{softmax}\left(F C\left(\mathbf{S}_{i}\right)\right)$
Finally, the cross-entropy loss is used:
$\mathcal{L}(Y, \hat{Y})=-\sum_{i=1}^{|C|} \mathbf{y}_{i} \log \hat{\mathbf{y}}_{i}+\lambda\|\Theta\|_{2}^{2}$
where $\|\Theta\|_{2}^{2}$ is the $L_{2}$ regularization term, $\Theta$ denotes the model parameters, and $\lambda$ is the trade-off coefficient.
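The classification step and the regularized loss can be sketched in plain Python as follows. This is an illustrative sketch only: it uses the standard negative log-likelihood form of cross-entropy, treats $\Theta$ as just the weights and bias of a single fully connected layer, and the names `fully_connected`, `softmax`, and `regularized_loss` are hypothetical rather than taken from the authors' code.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def fully_connected(s: Vector, weights: Matrix, bias: Vector) -> Vector:
    """One linear layer: one logit per class (each row of weights is one class)."""
    return [sum(w * x for w, x in zip(row, s)) + b for row, b in zip(weights, bias)]


def softmax(logits: Vector) -> Vector:
    """Numerically stable softmax: subtract the max logit before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def regularized_loss(batch_s: List[Vector], batch_y: List[Vector],
                     weights: Matrix, bias: Vector, lam: float) -> float:
    """Cross-entropy summed over events, plus lam * ||Theta||_2^2.

    Assumption: Theta is only this layer's weights and bias.
    """
    cross_entropy = 0.0
    for s, y in zip(batch_s, batch_y):
        y_hat = softmax(fully_connected(s, weights, bias))
        # Skip zero entries of the one-hot label so log() is only taken where needed.
        cross_entropy -= sum(t * math.log(p) for t, p in zip(y, y_hat) if t > 0)
    l2 = sum(w * w for row in weights for w in row) + sum(b * b for b in bias)
    return cross_entropy + lam * l2


# Toy batch (hypothetical): two events, two classes, 2-d fused vectors S_i.
batch_s = [[1.0, 2.0], [3.0, 4.0]]
batch_y = [[1.0, 0.0], [0.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]
bias = [0.0, 0.0]
print(regularized_loss(batch_s, batch_y, weights, bias, lam=0.5))
```

In practice a deep learning framework would compute the gradients, and weight decay in the optimizer commonly plays the role of the $\lambda\|\Theta\|_{2}^{2}$ term.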
3 Experiments

3.1 Baselines

- DTC [3]: A rumor detection approach that applies a decision tree to tweet features to assess information credibility.
- SVM-TS [10]: A linear SVM-based time-series model that leverages handcrafted features to make predictions.
- RvNN [11]: A recursive tree-structured model with GRU units that learns rumor representations via the tree structure.
- PPC_RNN+CNN [8]: A rumor detection model combining RNN and CNN for early-stage rumor detection, which learns rumor representations by modeling user and source tweets.
- Bi-GCN [2]: A GCN-based model that uses directed graphs and learns rumor representations through a bi-directional propagation structure.

3.2 Performance Comparison

[Figure: performance comparison]
3.3 Ablation study

[Figure: ablation study results]

- -R: our model without root feature enhancement.
- -T: our model without the textual graph.
- -A: our model without event augmentation.
- -M: our model without mutual information.
3.4 Limited labeled data

Figure 3 shows how performance changes as the fraction of labeled data varies: