<Conference Paper>
Domain Bias in Fake News Datasets Consisting of Fake and Real News Pairs

Creator
Language
Publisher
Date of Issue
Source Title
Start Page
End Page
Conference Information
Publication Type
Access Rights
Related DOI
Abstract News that intentionally contains false information, known as "fake news", is common on the Internet and often causes social disruption. To address this problem, automatic fake news detection using supervised learning has been actively studied. Although accuracy is improving, a major challenge for practical application remains: models do not work well on news from unknown fields (domains) because of domain bias. The goal of this study is to mitigate domain bias and improve the accuracy of cross-domain fake news detection, in which models are tested on news from unknown domains. We first try to mitigate the bias by masking noun phrases, which are considered a major source of domain bias. However, masking did not improve accuracy. We then point out that the dataset used in this study always contains pairs of fake and real news on exactly the same topic. In this paper, we focus on this property of the dataset and examine how it may affect domain bias and accuracy. Comparative experiments show that accuracy is higher when models are trained on a dataset with this property. We suggest that a fake news dataset consisting of paired news could be effective for cross-domain detection.

Full-Text File

pdf 6779689 pdf 223 KB 29  

Details

PISSN
Record ID
Subject
Date Registered 2023.04.05
Date Updated 2024.07.01
