Bilingual Fake News Detection with Natural Language Processing
DOI:
https://doi.org/10.70309/ticom.v14i2.201
Keywords:
Artificial Intelligence, Natural Language Processing, Naive Bayes, Bilingual Detection, Fake News
Abstract
Detecting fake news is a major challenge in today's digital era, particularly in a bilingual setting such as Indonesian and English. This project aims to build an accurate bilingual fake news detection algorithm based on natural language processing (NLP). The technique used is Naive Bayes, a simple and effective method for text classification. The research approach comprises collecting a dataset of news articles in Indonesian and English and preprocessing the text through normalization, tokenization, and vectorization with TF-IDF. Cross-validation is then used to run and evaluate the Naive Bayes model, assessing classification accuracy, precision, and recall. The experimental results show that the model can effectively detect fake news in both languages with competitive accuracy, while benefiting substantially from the NLP components that improve performance. In this study, we show that Naive Bayes is a lightweight and reliable method for multilingual fake news detection applications.
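The pipeline summarized in the abstract (normalization, tokenization, TF-IDF vectorization, Naive Bayes, cross-validation) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the toy headlines, the `fake`/`real` label names, the single vocabulary shared across both languages, the smoothing constant `alpha`, the smoothed IDF formula, and all function names are assumptions made for illustration only.

```python
import math
import re
from collections import Counter

# Invented toy headlines (Indonesian and English); NOT the study's dataset.
DATA = [
    ("Minum air garam menyembuhkan semua penyakit dalam sehari", "fake"),
    ("Pemerintah mengumumkan jadwal vaksinasi baru minggu ini", "real"),
    ("Aliens secretly control the national election results", "fake"),
    ("City council approves budget for new public library", "real"),
    ("Bawang putih ajaib menyembuhkan semua penyakit tanpa obat", "fake"),
    ("Bank sentral mengumumkan suku bunga baru bulan ini", "real"),
    ("Miracle pill secretly cures every disease overnight", "fake"),
    ("Ministry publishes new schedule for school exams", "real"),
    ("Air ajaib dari gunung menyembuhkan penyakit dalam semalam", "fake"),
    ("Pemerintah kota membuka perpustakaan umum baru", "real"),
    ("Scientists admit the moon landing results were secretly faked", "fake"),
    ("Council announces new budget for public transport", "real"),
]
TEXTS = [text for text, _ in DATA]
LABELS = [label for _, label in DATA]

def tokenize(text):
    """Normalize (lowercase, drop punctuation) and split into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def fit_idf(token_lists):
    """Smoothed inverse document frequency for each term in the training docs."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}

def tfidf(tokens, idf):
    """TF-IDF weights for one document; terms unseen in training are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}

def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes on TF-IDF weights with additive smoothing."""
    log_prior, log_like = {}, {}
    for c in sorted(set(labels)):
        rows = [v for v, y in zip(vectors, labels) if y == c]
        log_prior[c] = math.log(len(rows) / len(labels))
        totals = Counter()
        for v in rows:
            totals.update(v)  # accumulate per-term weight for this class
        denom = sum(totals.values()) + alpha * len(vocab)
        log_like[c] = {t: math.log((totals[t] + alpha) / denom) for t in vocab}
    return log_prior, log_like

def predict(vector, model):
    """Return the class with the highest posterior log-score."""
    log_prior, log_like = model
    scores = {c: log_prior[c] + sum(w * log_like[c][t] for t, w in vector.items())
              for c in log_prior}
    return max(scores, key=scores.get)

def cross_validate(texts, labels, k=3, positive="fake"):
    """k-fold cross-validation; returns mean accuracy, precision, and recall."""
    metrics = []
    for fold in range(k):
        test_idx = list(range(fold, len(texts), k))  # interleaved folds
        held_out = set(test_idx)
        train_idx = [i for i in range(len(texts)) if i not in held_out]
        train_tokens = [tokenize(texts[i]) for i in train_idx]
        idf = fit_idf(train_tokens)  # fitted on training folds only
        model = train_nb([tfidf(t, idf) for t in train_tokens],
                         [labels[i] for i in train_idx], list(idf))
        pairs = [(predict(tfidf(tokenize(texts[i]), idf), model), labels[i])
                 for i in test_idx]
        tp = sum(1 for p, g in pairs if p == positive and g == positive)
        fp = sum(1 for p, g in pairs if p == positive and g != positive)
        fn = sum(1 for p, g in pairs if p != positive and g == positive)
        metrics.append({
            "accuracy": sum(1 for p, g in pairs if p == g) / len(pairs),
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
        })
    return {name: sum(m[name] for m in metrics) / k
            for name in ("accuracy", "precision", "recall")}

if __name__ == "__main__":
    print(cross_validate(TEXTS, LABELS, k=3))
```

Fitting the IDF weights inside each fold, rather than once on the full corpus, keeps information from the held-out articles out of training. Scores on a toy set of this size say nothing about real-world performance; they only show how the three reported metrics are computed.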
License
Copyright (c) 2026 Irfan Ramadhan, Eni Heni Hermaliani, Zico Pratama Putra

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.