Analysis of the Effect of Feature Extraction on Sentiment Analysis using BiLSTM: Monkeypox Case Study on X/Twitter
The monkeypox outbreak has again become a global concern as it spreads
across multiple countries. Information about the disease is widely shared on
social media, especially Twitter, which is a major source of public opinion.
However, the complexity of language and the diverse viewpoints of users
often make it difficult to analyze sentiment accurately.
Therefore, sentiment analysis of tweets about monkeypox is important to
understand public perception and its impact on the dissemination of health
information. This research contributes to identifying the most effective word
embedding-based feature extraction method for sentiment analysis of health
issues on social media. This study compares the performance of three word
embedding methods, namely Word2Vec, GloVe, and FastText, in sentiment
analysis of tweets about monkeypox using a BiLSTM model. A total of 1511
tweets were collected by crawling with the Twitter API. The collected tweets
were then manually labeled with one of three sentiment categories: positive,
negative, or neutral. Next, the data were preprocessed through data cleaning,
case folding, tokenization, stopword removal, and stemming. The evaluation
results show that FastText with BiLSTM produces
the highest accuracy of 90%, followed by Word2Vec at 89%, and GloVe at
87%. FastText proved more effective at reducing classification errors,
especially in distinguishing negative from positive sentiment, owing to its
ability to capture subword information and broader context. These findings
suggest that FastText can improve the accuracy of sentiment analysis,
especially for health issues that develop on social media, and can thereby
support data-driven decision-making by the parties responsible for managing
the dissemination of health information.
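
The preprocessing stages named in the abstract (data cleaning, case folding, tokenization, stopword removal, and stemming) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the tweet language, the cleaning rules, the stopword list, or the stemmer, so the regular expressions, the `STOPWORDS` set, and the naive suffix-stripping `stem` function below are all hypothetical stand-ins.

```python
import re

# Hypothetical, deliberately tiny stopword list (assumption, for illustration only).
STOPWORDS = {"the", "is", "a", "an", "and", "of", "to", "in", "are", "about"}

# Naive suffix list standing in for a real stemmer (assumption).
SUFFIXES = ("ing", "ed", "s")


def clean(text):
    """Data cleaning: drop URLs, mentions, hashtags, and non-letter characters."""
    text = re.sub(r"https?://\S+", " ", text)  # URLs
    text = re.sub(r"[@#]\w+", " ", text)       # @mentions and #hashtags
    text = re.sub(r"[^A-Za-z\s]", " ", text)   # digits, punctuation, emoji
    return text


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet):
    """Apply cleaning, case folding, tokenization, stopword removal, stemming."""
    tokens = clean(tweet).lower().split()               # case folding + tokenization
    tokens = [t for t in tokens if t not in STOPWORDS]  # stopword removal
    return [stem(t) for t in tokens]                    # stemming


if __name__ == "__main__":
    print(preprocess("Monkeypox cases are RISING in 3 countries! https://t.co/abc @WHO #mpox"))
```

The suffix stripper is intentionally crude and will produce non-word stems; in practice a language-appropriate stemmer and stopword list would take its place. The resulting token lists are the kind of input that an embedding layer (Word2Vec, GloVe, or FastText vectors) would consume before the BiLSTM.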
Copyright (c) 2025 Noryasminda, Triando Hamonangan Saragih, Rudy Herteno, Mohammad Reza Faisal, Andi Farmadi (Author)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).





