PSIN Rumor Detection: "Divide-and-Conquer: Post-User Interaction Network for Fake News Detection on Social Media"
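Paper information
Title: Divide-and-Conquer: Post-User Interaction Network for Fake News Detection on Social Media
Authors: Erxue Min, Yu Rong, Yatao Bian, Tingyang Xu, Peilin Zhao, Junzhou Huang, Sophia Ananiadou
Venue: WWW 2022
Background
Challenges:
(1) Rumor detection involves many types of entities and relations, so a method is needed to model this heterogeneity.
(2) Topics on social media undergo distribution shift, which significantly degrades fake news detection performance.
(3) Existing fake news datasets usually lack large scale, topic diversity, and users' social relations.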
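Text-based rumor detection methods have the following two problems:
(1) First, the information in the social context of news is complex and heterogeneous.
(2) Second, there is the distribution shift problem: the training distribution differs from the test distribution.
Distribution shift example: a fake news classifier is trained on labeled data covering common topics such as politics, sports, and entertainment, but new topics, such as "black swan" events, appear in the test set.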
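The difference between a split that hides this shift and one that exposes it can be illustrated with a small sketch. It is not the paper's data pipeline: the event records, the `topic` field, the helper names `in_topic_split` and `out_of_topic_split`, and the held-out topic "health" are all assumptions made for illustration.

```python
from collections import defaultdict

def out_of_topic_split(events, held_out_topics):
    """Events whose topic is held out go to the test set only,
    so the test topics are never seen during training."""
    train, test = [], []
    for event in events:
        (test if event["topic"] in held_out_topics else train).append(event)
    return train, test

def in_topic_split(events, test_ratio=0.5):
    """Every topic contributes to both train and test
    (deterministic, no shuffling, to keep the sketch simple)."""
    by_topic = defaultdict(list)
    for event in events:
        by_topic[event["topic"]].append(event)
    train, test = [], []
    for topic_events in by_topic.values():
        n_test = max(1, int(len(topic_events) * test_ratio))
        test.extend(topic_events[:n_test])
        train.extend(topic_events[n_test:])
    return train, test

# Hypothetical toy events; labels are arbitrary.
events = [
    {"id": 0, "topic": "politics", "label": 1},
    {"id": 1, "topic": "politics", "label": 0},
    {"id": 2, "topic": "politics", "label": 0},
    {"id": 3, "topic": "sports", "label": 0},
    {"id": 4, "topic": "sports", "label": 1},
    {"id": 5, "topic": "entertainment", "label": 0},
    {"id": 6, "topic": "entertainment", "label": 1},
    {"id": 7, "topic": "health", "label": 1},
    {"id": 8, "topic": "health", "label": 0},
]

ood_train, ood_test = out_of_topic_split(events, {"health"})
in_train, in_test = in_topic_split(events)

print(sorted({e["topic"] for e in ood_train}))  # training topics
print(sorted({e["topic"] for e in ood_test}))   # unseen test topic
```

With the out-of-topic split, a classifier is evaluated only on a topic it never saw, which is the situation the distribution shift example describes.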
Contributions:
- We construct and publicize a new fake news dataset with social context named MC-Fake, which contains 27,155 news events in 5 topics, and their social context composed of 5 million posts, 2 million users and an induced social graph with 0.2 billion edges.
- We propose a novel Post-User Interaction Network (PSIN), which applies a divide-and-conquer strategy to model the heterogeneous relations. Specifically, we integrate the post-post, user-user and post-user subgraphs with three variants of Graph Attention Networks based on their intrinsic characteristics. We also employ an adversarial topic discriminator to learn topic-agnostic features for veracity classification.
- We evaluate our proposed model on the curated dataset in two settings: in-topic split and out-of-topic split. The superior results of our model in both settings reveal the effectiveness of the proposed method.
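The "divide" step of the divide-and-conquer strategy can be pictured as partitioning one heterogeneous edge list into the three subgraphs named above. The sketch below shows only that partition, not the Graph Attention Network variants applied afterwards; the node-id prefixes (`p:` for posts, `u:` for users), the helper names, and the toy edges are assumptions for illustration and do not come from the paper's code.

```python
from collections import defaultdict

def node_type(node_id):
    # Hypothetical convention: "p:" marks a post, anything else is a user.
    return "post" if node_id.startswith("p:") else "user"

def divide_edges(edges):
    """Group edges of a heterogeneous graph by the types of their endpoints."""
    subgraphs = defaultdict(list)
    for src, dst in edges:
        types = {node_type(src), node_type(dst)}
        if types == {"post"}:
            key = "post-post"    # e.g. a reply to another post
        elif types == {"user"}:
            key = "user-user"    # e.g. a follow relation
        else:
            key = "post-user"    # e.g. a user authoring a post
        subgraphs[key].append((src, dst))
    return dict(subgraphs)

# Hypothetical toy graph.
edges = [
    ("p:1", "p:0"),  # post 1 replies to post 0
    ("p:2", "p:0"),  # post 2 replies to post 0
    ("u:a", "u:b"),  # user a follows user b
    ("u:a", "p:0"),  # user a wrote post 0
    ("u:b", "p:1"),  # user b wrote post 1
]

subgraphs = divide_edges(edges)
for name, sub_edges in sorted(subgraphs.items()):
    print(name, len(sub_edges))
```

Each resulting edge list could then be handed to a separate encoder, which is the sense in which the relations are modeled separately before being integrated.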
Existing fake news datasets:
- BuzzFeedNews specializes in political news published on Facebook during the 2016 U.S. Presidential Election.
- LIAR collects 12.8K short statements with manual labels from a political fact-checking website.
- FA-KES consists of 804 articles about the Syrian war.
- CREDBANK contains about 1000 news events and 60 million tweets, labeled by Amazon Mechanical Turk.
- Twitter15 contains 778 reported events between March 2015 and December 2015, with 1 million posts from 500k users.
- FakeNewsNet is a data repository with news content and related posts, containing political news and entertainment news which are checked by PolitiFact and GossipCop.
- FakeHealth is collected from the healthcare information review website Health News Review; it contains over 2000 news articles, 500k posts and 27k user profiles, along with user networks.
- CoAID collects 1,896 news articles, 183,654 related user engagements, 516 social platform posts about COVID-19, and ground truth labels.
- FakeCovid is a multilingual cross-domain dataset of 5,182 fact-checked news articles for COVID-19 from 92 different fact-checking websites.
- MM-COVID is a multilingual and multidimensional COVID-19 fake news data repository, containing 3,981 pieces of fake news content and 7,192 pieces of trustworthy information in 6 different languages.
Existing modeling approaches:
- Sequential Modeling [20, 24, 30, 52]
- Explicit responding path modeling [4, 19, 26, 47]
- Implicit attention modeling
Further reading
- PLAN Rumor Detection: "Interpretable Rumor Detection in Microblogs by Attending to User Interactions"
- Rumor Detection: "Debunking Rumors on Twitter with Tree Transformer"