Hate Speech and Emotion Detection on Twitter Using LSTM Model
Deteksi Ujaran Kebencian dan Emosi di Twitter Menggunakan Model LSTM
Abstract
This research aims to develop a classification model that detects hate speech and emotions on the Twitter platform using the Long Short-Term Memory (LSTM) method. With the increasing volume of data on social media, especially Twitter, automatic identification of negative content is crucial for maintaining a healthy digital ecosystem. The dataset used in this study consists of tweets labeled for hate speech and for several emotion categories. Preprocessing cleans and prepares the data through steps such as punctuation removal, tokenization, and text normalization. After preprocessing, the dataset is split into training and test sets at a ratio of 60:40, so that the model is evaluated on tweets it was not trained on. The experimental results show that the LSTM model achieves an accuracy of 89% in hate speech classification and 71% in emotion classification. These results demonstrate the potential of the LSTM method for text analysis tasks and can serve as a basis for developing automatic detection systems on social media platforms.
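The preprocessing and 60:40 split described in the abstract can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the function names `preprocess` and `split_60_40`, the choice of lowercasing as the normalization step, the removal of URLs and @-mentions, and the fixed shuffle seed are all assumptions made for the example.

```python
import random
import re
import string


def preprocess(tweet):
    """Normalize, strip punctuation, and tokenize one tweet (illustrative only)."""
    text = tweet.lower()  # assumed normalization step: lowercasing
    # Assumed cleaning step: drop URLs and @-mentions before removing punctuation.
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    # Punctuation removal over ASCII punctuation characters.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Whitespace tokenization.
    return text.split()


def split_60_40(samples, seed=42):
    """Shuffle a copy of the samples and split it 60:40 into train and test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seed is an assumption, for repeatability
    cut = len(shuffled) * 6 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    tokens = preprocess("Halo @user, cek https://t.co/abc SEKARANG!!!")
    print(tokens)
    train, test = split_60_40(range(10))
    print(len(train), len(test))
```

A real pipeline for Indonesian tweets would likely add slang normalization and a labeled, stratified split; neither is specified in the abstract, so both are left out here.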
Highlights:
- LSTM achieves 89% accuracy in detecting hate speech and 71% in emotion classification.
- The model processes Indonesian-language tweets to identify hate speech and emotional tone.
- Preprocessing steps such as tokenization and stopword removal are crucial for accurate classification.
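For readers unfamiliar with the LSTM named in the highlights, a single time step of a standard LSTM cell can be sketched in plain Python as below. The paper's actual architecture, layer sizes, and framework are not stated here, so this shows only the textbook gate equations; the names `lstm_step`, `affine`, and the `params` layout are hypothetical and chosen for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W @ v + b, where W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step.

    params maps each gate name to (W, b), where W acts on the
    concatenation [h_prev; x]. This layout is an assumption of the sketch.
    """
    z = h_prev + x  # list concatenation of previous hidden state and input

    def gate(name, activation):
        W, b = params[name]
        return [activation(a) for a in affine(W, z, b)]

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much of the candidate to write
    g = gate("candidate", math.tanh)   # proposed new cell content
    o = gate("output", sigmoid)        # how much of the cell state to expose

    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


if __name__ == "__main__":
    hidden, inputs = 2, 3
    zeros = ([[0.0] * (hidden + inputs) for _ in range(hidden)], [0.0] * hidden)
    params = {name: zeros for name in ("forget", "input", "candidate", "output")}
    h, c = lstm_step([1.0, 0.5, -0.5], [0.0, 0.0], [1.0, -2.0], params)
    print(h, c)
```

In a tweet classifier, a step like this would be applied to each token embedding in turn, and the final hidden state would feed a dense output layer; in practice a deep learning library provides the trained layer rather than hand-written code.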
Keywords: Hate Speech, LSTM, Twitter
Copyright (c) 2023 Nanda Yunania, Yulian Findawati
This work is licensed under a Creative Commons Attribution 4.0 International License.