Journal of Advanced Research in Data Structures Innovations and Computer Science
https://adrjournalshouse.com/index.php/datastructures
A Comprehensive Review of Large-Scale Ground-Truth Datasets for Real and Fake Content Detection on Social Media
https://adrjournalshouse.com/index.php/datastructures/article/view/2760
<p>With the rapid growth of social networking platforms, there has been an alarming spread of misinformation, fake news, and manipulated content, posing substantial threats to public trust, political stability, and information integrity. Addressing these problems requires a data resource with several qualities: it must be verifiable by hand yet large in scale, and it must be annotated in ways that support the development and evaluation of robust detection models. This is what the TruthSeeker initiative aims to achieve: to build a large-scale, ground-truth social media dataset for rigorously scrutinising real and fake texts, and thereby to provide a reliable resource for researchers in NLP, machine learning, and social computing. This paper presents a descriptive review of the structure, data collection methodology, labelling strategies, and implications of the TruthSeeker dataset for misinformation research. The TruthSeeker dataset combines content from multiple social media sources with verified ground-truth labels that distinguish real information from false information. The scale and diversity of the dataset facilitate the training of data-hungry deep-learning models and support interdisciplinary, cross-domain, and cross-platform studies. Compared with other datasets, TruthSeeker offers marked improvements in size, consistency of class labelling, and contextual richness, and it more closely reflects real-world misinformation cases. The review also examines the many research applications the dataset supports, such as fake news detection, stance analysis, credibility evaluation, and explainable AI for misinformation classification. It also looks critically at limiting factors, such as annotation periods, language coverage, and the changing face of misinformation.
By contrasting TruthSeeker with benchmark datasets, we give a clear picture of its strengths and indicate areas where future improvements may be required. TruthSeeker is an important contribution to the field of social media misinformation research. Its holistic coverage and ground-truth reliability position it as a foundational dataset for advancing automated real-versus-fake detection systems and for promoting reproducible, scalable, and impactful research on controlling misinformation.</p> <p><strong>How to cite this article:</strong><br>Kumar A, Rai A K. A Comprehensive Review of Large-Scale Ground-Truth Datasets for Real and Fake Content Detection on Social Media. J Adv Res Data Struct Innov Comput Sci 2026; 2(2): 1-9.</p> Anuj Kumar, Arun Kumar Rai
Copyright (c) 2026 Journal of Advanced Research in Data Structures Innovations and Computer Science
2026-06-20