﻿<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD with MathML3 v1.2 20190208//EN" "https://jats.nlm.nih.gov/archiving/1.2/JATS-archivearticle1-mathml3.dtd">
<article
    xmlns:mml="http://www.w3.org/1998/Math/MathML"
    xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="1.2" xml:lang="en" article-type="article">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">JAIBD</journal-id>
      <journal-title-group>
        <journal-title>Journal of Artificial Intelligence and Big Data</journal-title>
      </journal-title-group>
      <issn pub-type="epub">2771-2389</issn>
      <issn pub-type="ppub"></issn>
      <publisher>
        <publisher-name>Science Publications</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.31586/jaibd.2026.6162</article-id>
      <article-id pub-id-type="publisher-id">JAIBD-6162</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Article</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>
          Predictive Modeling of Public Sentiment Using Social Media Data and Natural Language Processing Techniques
        </article-title>
      </title-group>
      <contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Farinola</surname>
<given-names>Lawrence A.</given-names>
</name>
<xref rid="af1" ref-type="aff">1</xref>
<xref rid="af2" ref-type="aff">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Assogba</surname>
<given-names>Jean-Eudes</given-names>
</name>
<xref rid="af1" ref-type="aff">1</xref>
<xref rid="af2" ref-type="aff">2</xref>
</contrib>
      </contrib-group>
<aff id="af1"><label>1</label> Department of Software Engineering, Faculty of Architecture and Engineering, Rauf Denktas University, Mersin 10 via Turkey</aff>
<aff id="af2"><label>2</label> Center of Excellence for Interdisciplinary AI and Data Science Research, Rauf Denktas University, Mersin 10 via Turkey</aff>
      <pub-date pub-type="epub">
        <day>06</day>
        <month>02</month>
        <year>2026</year>
      </pub-date>
      <volume>6</volume>
      <issue>1</issue>
      <history>
        <date date-type="received">
          <day>24</day>
          <month>07</month>
          <year>2025</year>
        </date>
        <date date-type="rev-recd">
          <day>30</day>
          <month>10</month>
          <year>2025</year>
        </date>
        <date date-type="accepted">
          <day>02</day>
          <month>02</month>
          <year>2026</year>
        </date>
        <date date-type="pub">
          <day>06</day>
          <month>02</month>
          <year>2026</year>
        </date>
      </history>
      <permissions>
        <copyright-statement>&#xa9; Copyright 2026 by authors and Trend Research Publishing Inc. </copyright-statement>
        <copyright-year>2026</copyright-year>
        <license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
          <license-p>This work is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/</license-p>
        </license>
      </permissions>
      <abstract>
        <p>Social media platforms like X (formerly Twitter) generate vast volumes of user-generated content that provide real-time insights into public sentiment. Despite the widespread use of traditional machine learning methods, their limitations in capturing contextual nuances in noisy social media text remain a challenge. This study leverages the Sentiment140 dataset, comprising 1.6 million labeled tweets, and develops predictive models for binary sentiment classification using Naive Bayes, Logistic Regression, and the transformer-based BERT model. Experiments were conducted on a balanced subset of 12,000 tweets after comprehensive NLP preprocessing. Evaluation using accuracy, F1-score, and confusion matrices revealed that BERT significantly outperforms traditional models, achieving an accuracy of 89.5% and an F1-score of 0.89 by effectively modeling contextual and semantic nuances. In contrast, Naive Bayes and Logistic Regression demonstrated reasonable but consistently lower performance. To support practical deployment, we introduce SentiFeel, an interactive tool enabling real-time sentiment analysis. While resource constraints limited the dataset size and training epochs, future work will explore full corpus utilization and the inclusion of neutral sentiment classes. These findings underscore the potential of transformer models for enhanced public opinion monitoring, marketing analytics, and policy forecasting.</p>
      </abstract>
      <kwd-group>
        <kwd>Sentiment Analysis</kwd>
        <kwd>Social Media Mining</kwd>
        <kwd>Public Opinion Prediction</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>BERT Transformer Model</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec1">
<title>Introduction</title><p>The proliferation of social media platforms has fundamentally transformed how public opinion is formed, shared, and analyzed. Among these platforms, X (formerly known as Twitter) stands out as a microblogging service capturing real-time expressions of public sentiment across diverse topics, including consumer behavior, political discourse, and global events. With over 500 million tweets posted daily, X provides a rich and dynamic data source for public opinion mining through sentiment analysis [
<xref ref-type="bibr" rid="R1">1</xref>]. Recent studies further emphasize the growing role of data-driven machine learning models in supporting public policy formulation and socioeconomic forecasting, particularly in contexts involving large-scale societal change and technological transformation [
<xref ref-type="bibr" rid="R2">2</xref>].</p>
<p>Sentiment analysis, also known as opinion mining, is a subfield of Natural Language Processing (NLP) and computational linguistics focused on detecting, extracting, and classifying subjective information in text [
<xref ref-type="bibr" rid="R3">3</xref>]. This technique enables automatic categorization of textual content, such as tweets, into predefined sentiment classes: positive, negative, or neutral [
<xref ref-type="bibr" rid="R4">4</xref>]. NLP methods transform raw human language into structured data suitable for machine learning models [
<xref ref-type="bibr" rid="R5">5</xref>]. Similarly, large-scale data-driven modeling approaches have been applied to population-level societal and public health challenges, illustrating how predictive analytics can inform public awareness and policy planning [
<xref ref-type="bibr" rid="R6">6</xref>].</p>
<p>Recent advances in machine learning, especially deep learning architectures, have significantly improved sentiment classification accuracy. However, a key question remains: to what extent do these advanced models outperform traditional machine learning techniques in real-world sentiment analysis tasks?</p>
<p>This study benchmarks the performance of classical machine learning algorithms against transformer-based deep learning models using the publicly available Sentiment140 dataset [
<xref ref-type="bibr" rid="R4">4</xref>], which contains 1.6 million labeled tweets. We compare three sentiment classification approaches: Naive Bayes, Logistic Regression (a variant of the Maximum Entropy model), and BERT (Bidirectional Encoder Representations from Transformers) [
<xref ref-type="bibr" rid="R7">7</xref>]. The study focuses on preprocessing raw tweet text into model-compatible features, training and optimizing each classifier on the same dataset, evaluating performance using accuracy, F1-score, and confusion matrices, and identifying the most accurate and robust approach for public sentiment prediction.</p>
<p>Unlike prior studies focusing either on traditional machine learning or deep learning in isolation, our work provides a comprehensive and systematic comparison under consistent experimental conditions. This research offers practical implications for political forecasting, brand monitoring, and real-time crisis detection, where accurate and timely understanding of public sentiment can inform decision-making. Beyond sentiment analysis, artificial intelligence and machine learning techniques have been successfully applied to real-world decision-support systems, such as academic scheduling and institutional resource optimization, demonstrating the versatility of AI models across application domains [
<xref ref-type="bibr" rid="R8">8</xref>]. Additionally, we introduce an interactive tool (SentiFeel) for real-time sentiment classification to bridge academic research and real-world applications.</p>
<p>Research questions addressed include: Which model yields the highest classification accuracy on general-purpose X sentiment analysis? How do traditional models like Naive Bayes and Logistic Regression compare to state-of-the-art deep learning approaches such as BERT?</p>
<p>To answer these questions, this study applies consistent preprocessing, training, and evaluation procedures across all models using the Sentiment140 dataset. By maintaining uniform experimental conditions, we aim to ensure a fair and reliable comparison of performance metrics such as accuracy, precision, recall, F1-score, and ROC-AUC.</p>
<p>This work contributes to advancing natural language processing and machine learning by offering empirical insights into the strengths and limitations of each modeling approach. Ultimately, it supports the development of reliable social media analytics tools that can capture and interpret public sentiment in real-time.</p>
</sec><sec id="sec2">
<title>Literature Review</title><p>Early research in X-based sentiment analysis laid the foundation for contemporary studies by introducing critical datasets and baseline methodologies. Go et al. [
<xref ref-type="bibr" rid="R9">9</xref>] created the Sentiment140 dataset using distant supervision, automatically labeling tweets based on emoticons and applying traditional classifiers like Naive Bayes, Logistic Regression, and Support Vector Machines (SVM), achieving accuracy levels exceeding 80%. Pak and Paroubek [
<xref ref-type="bibr" rid="R1">1</xref>] also validated X as a viable source for sentiment classification, developing a three-way classifier (positive, negative, neutral) using supervised models trained on automatically collected tweets.</p>
<p>Lexicon-based approaches predate machine learning methods and use predefined sentiment dictionaries to assign scores to words or phrases. For example, SentiWordNet assigns sentiment values to WordNet synsets [
<xref ref-type="bibr" rid="R12">12</xref>], while VADER&#x2014;a rule-based tool designed specifically for social media&#x2014;considers features like capitalization and punctuation to enhance sentiment detection [
<xref ref-type="bibr" rid="R7">7</xref>].</p>
<p>While these approaches are efficient for identifying general sentiment trends, they often fall short when dealing with complex language features such as negation, slang, sarcasm, or irony. This limitation reduces their effectiveness in the nuanced and informal text typically found on platforms like X [
<xref ref-type="bibr" rid="R12">12</xref>].</p>
<p>Traditional machine learning methods such as Naive Bayes, SVM, and Logistic Regression have been widely used for sentiment classification across domains. Pang and Lee demonstrated their effectiveness on movie reviews [
<xref ref-type="bibr" rid="R10">10</xref>,<xref ref-type="bibr" rid="R11">11</xref>]. Applied to X data, studies including Go et al. [
<xref ref-type="bibr" rid="R9">9</xref>] and Mohammad et al. [
<xref ref-type="bibr" rid="R13">13</xref>] utilized unigram/bigram models, TF-IDF weighting, and emoticon features, achieving 70&#x2013;80% accuracy on large corpora. However, these models depend heavily on manual feature engineering, limiting scalability and generalizability.</p>
<p>Recent advances in deep learning have enhanced sentiment analysis performance. Recurrent Neural Networks (RNNs), particularly Long Short-Term Memory (LSTM) networks, capture sequential dependencies in text. Mollah [
<xref ref-type="bibr" rid="R12">12</xref>] developed an LSTM-based X sentiment classifier with notable accuracy gains. Convolutional Neural Networks (CNNs) have also been applied to short-form content. However, the most transformative progress comes from transformer-based architectures. BERT, introduced by Devlin et al. [
<xref ref-type="bibr" rid="R10">10</xref>], leverages self-attention to capture bidirectional context, enabling nuanced language understanding. Bello et al. [
<xref ref-type="bibr" rid="R14">14</xref>] designed a BERT-based framework for tweet sentiment classification, reporting superior performance over traditional and earlier deep learning models.</p>
<p>While lexicon-based and traditional ML approaches remain valuable baselines, they fall short in capturing semantic and pragmatic nuances. Transformer-based models such as BERT provide context-aware representations suited to social media&#x2019;s informal language. However, few studies have directly compared traditional models and BERT on consistent datasets with both quantitative evaluation and qualitative visualizations. This gap is critical for balancing model interpretability, efficiency, and performance.</p>
<p>This study addresses this gap by systematically evaluating Naive Bayes, Logistic Regression, and BERT on the Sentiment140 dataset. Beyond classification metrics, we include exploratory data analyses to provide empirical benchmarks and practical insights into the strengths and limitations of each approach in social media sentiment classification.</p>
<p>Emerging challenges and ethical considerations include demographic biases, echo chambers, and bot influence in social media data that may distort sentiment signals [
<xref ref-type="bibr" rid="R15">15</xref>]. Moreover, distant supervision labels based on emoticons may inadequately capture complex emotions or cultural subtleties, affecting generalizability [
<xref ref-type="bibr" rid="R16">16</xref>]. Recent meta-analyses highlight dataset quality, annotation strategies, and cross-domain adaptability as critical to improving sentiment analysis [
<xref ref-type="bibr" rid="R17">17</xref>]. Additionally, multilingual sentiment analysis and transfer learning are promising directions for robust, ethical sentiment mining [
<xref ref-type="bibr" rid="R18">18</xref>].</p>
</sec><sec id="sec3">
<title>Research Methodology</title><p>This study employs a quantitative, comparative approach to evaluate traditional and deep learning algorithms for sentiment analysis using the Sentiment140 dataset developed by Go et al. [
<xref ref-type="bibr" rid="R9">9</xref>]. The dataset contains 1.6 million tweets labeled via distant supervision&#x2014;800,000 positive and 800,000 negative tweets. For binary classification, neutral tweets were excluded, and sentiment labels were binarized as 0 (negative) and 1 (positive). From this filtered set, a balanced subset of 12,000 tweets was randomly sampled, with 10,000 used for training and 2,000 for testing to ensure balanced representation.</p>
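<p>The balanced sampling and split described above can be sketched in a few lines. This is a pure-Python illustration: the function name and toy data are ours, while the equal positive/negative balance and the 10,000/2,000 split follow the text.</p>

```python
import random

def balanced_split(pos, neg, n_total=12000, n_train=10000, seed=42):
    """Sample an equal number of positive and negative tweets,
    shuffle, and split into train/test sets (illustrative sketch)."""
    rng = random.Random(seed)
    half = n_total // 2
    sample = [(t, 1) for t in rng.sample(pos, half)]   # label 1 = positive
    sample += [(t, 0) for t in rng.sample(neg, half)]  # label 0 = negative
    rng.shuffle(sample)
    return sample[:n_train], sample[n_train:]

# Toy usage with synthetic "tweets" standing in for Sentiment140 rows:
pos = [f"pos tweet {i}" for i in range(8000)]
neg = [f"neg tweet {i}" for i in range(8000)]
train, test = balanced_split(pos, neg)
```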
<p>The preprocessing pipeline cleaned noisy social media text by removing URLs, mentions, hashtags, and punctuation. All text was lowercased, tokenized by whitespace, and filtered with NLTK stopwords [
<xref ref-type="bibr" rid="R15">15</xref>]. Tokens were then lemmatized and recombined into clean text features. Traditional models (Naive Bayes and Logistic Regression) used TF-IDF vectorization with max_features=5000, ngram_range=(1,2), min_df=5, and English stopwords excluded, producing a sparse matrix (10,000 &#x00d7; 5,000 for training and 2,000 &#x00d7; 5,000 for testing).</p>
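<p>The cleaning steps can be illustrated with a minimal sketch. The regex patterns and the tiny stop-word set below are illustrative stand-ins rather than the exact NLTK pipeline, and lemmatization is omitted for brevity.</p>

```python
import re

# Illustrative subset of English stop words (the study used the full NLTK list).
STOPWORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "i", "it"}

def clean_tweet(text):
    """Remove URLs, @mentions, #hashtags, and punctuation; lowercase;
    whitespace-tokenize; drop stop words; return the cleaned string."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # strip URLs
    text = re.sub(r"[@#]\w+", " ", text)                 # strip mentions/hashtags
    text = re.sub(r"[^a-zA-Z\s]", " ", text)             # strip punctuation/digits
    tokens = text.lower().split()
    return " ".join(t for t in tokens if t not in STOPWORDS)

print(clean_tweet("I LOVE this!! @user check https://t.co/xyz #happy"))
# prints "love this check"
```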
<p>For the BERT-based model, we used the bert-base-uncased transformer from Hugging Face&#x2019;s Transformers library [
<xref ref-type="bibr" rid="R19">19</xref>]. Tweets were tokenized with the WordPiece tokenizer, padded and truncated to 128 tokens. The datasets were converted into Hugging Face Dataset objects for efficient training via the Trainer API. Fine-tuning ran for three epochs with a learning rate of 2e<sup>-5</sup>, weight decay 0.01, batch size 16, and 500 warm-up steps. Evaluation occurred at each epoch&#x2019;s end.</p>
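<p>The fine-tuning setup can be sketched as a configuration fragment for the Transformers Trainer API. The hyperparameters come from the text; the dataset objects are placeholders assumed to carry "text" and "label" columns, and an installed transformers library is assumed.</p>

```python
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

def tokenize(batch):
    # Pad/truncate every tweet to 128 WordPiece tokens, as in the study.
    return tokenizer(batch["text"], padding="max_length",
                     truncation=True, max_length=128)

args = TrainingArguments(
    output_dir="bert-sentiment",
    num_train_epochs=3,
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=16,
    warmup_steps=500,
    eval_strategy="epoch",  # named "evaluation_strategy" in older versions
)

# train_ds / test_ds are placeholder Hugging Face Dataset objects,
# mapped through tokenize() before training:
# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_ds, eval_dataset=test_ds)
# trainer.train()
```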
<p>Hyperparameters were optimized via 5-fold cross-validation on the training set: Naive Bayes smoothing parameter &#x3b1; = 1.0; Logistic Regression with L2 regularization tested across C &#x2208; {0.01, 0.1, 1.0, 10.0}, with C = 1.0 yielding best results. Classifiers were evaluated on the test set using accuracy, precision, recall, F1-score, confusion matrix, and ROC-AUC metrics [
<xref ref-type="bibr" rid="R20">20</xref>]. Comparable machine learning benchmarking and optimization frameworks have been successfully employed in other applied domains, reinforcing the generalizability of model comparison strategies across industrial and social data contexts [
<xref ref-type="bibr" rid="R21">21</xref>].</p>
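<p>The 5-fold search logic is simple to sketch without any ML library. The function names and the scoring hook are ours; in the study, the score would be the validation accuracy of the classifier refit on each training fold.</p>

```python
def kfold_indices(n, k=5):
    """Yield (train_idx, val_idx) pairs partitioning range(n) into k folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size

def grid_search(score_fn, grid, n, k=5):
    """Return the grid value with the best mean cross-validation score.
    score_fn(c, train_idx, val_idx) stands in for fit-then-evaluate."""
    best_c, best = None, float("-inf")
    for c in grid:
        scores = [score_fn(c, tr, va) for tr, va in kfold_indices(n, k)]
        mean = sum(scores) / len(scores)
        if mean > best:
            best_c, best = c, mean
    return best_c
```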
<p>Since the dataset is public and anonymized, ethical concerns were minimal. All procedures complied with responsible data science standards.</p>
<p>Future improvements could include using richer datasets like TweetEval [
<xref ref-type="bibr" rid="R22">22</xref>] and SemEval [
<xref ref-type="bibr" rid="R23">23</xref>], advanced preprocessing with spaCy and domain-specific embeddings like GloVe-X [
<xref ref-type="bibr" rid="R24">24</xref>], data augmentation techniques [
<xref ref-type="bibr" rid="R15">15</xref>], advanced transformers like RoBERTa [
<xref ref-type="bibr" rid="R25">25</xref>], and ensemble methods combining traditional and deep learning classifiers to capture diverse sentiment patterns.</p>
</sec><sec id="sec4">
<title>Results and Discussion</title><p>This section presents the end-to-end predictive pipeline encompassing data preprocessing, model training and evaluation, and the comparative analysis of three sentiment classification models: Naive Bayes, Logistic Regression, and BERT. The overarching goal is to assess and contrast the performance of traditional machine learning algorithms against a state-of-the-art transformer-based deep learning model in classifying sentiments expressed in tweets.</p>
<title>4.1. Data Preparation and Experimental Setup</title><p>The study employed the Sentiment140 dataset, a widely recognized corpus consisting of 1.6 million tweets labeled using distant supervision techniques based on emoticons [
<xref ref-type="bibr" rid="R12">12</xref>]. To streamline the analysis for binary sentiment classification, neutral tweets were excluded, and a balanced subset comprising 12,000 tweets was randomly selected&#x2014;10,000 tweets for training and 2,000 for testing.</p>
<p>Preprocessing involved rigorous steps to clean and standardize the textual data. This included the removal of URLs, user mentions (@), hashtags (#), punctuation, and stop words, followed by tokenization and conversion to lowercase. These procedures were essential for reducing noise and achieving uniform text representation [
<xref ref-type="bibr" rid="R24">24</xref>]. For traditional machine learning classifiers, TF-IDF (Term Frequency&#x2013;Inverse Document Frequency) vectorization was employed to convert text data into numerical feature representations. In contrast, BERT utilized the bert-base-uncased tokenizer and embedding pipeline from Hugging Face&#x2019;s Transformers library [
<xref ref-type="bibr" rid="R16">16</xref>], preserving contextual dependencies.</p>
<title>4.2. Model Performance</title><p>The Naive Bayes classifier was trained on a TF-IDF matrix comprising 5,000 features. The model achieved an accuracy of 74.6% and an F1-score of 0.74, corroborating its known effectiveness in handling high-dimensional and sparse text data [
<xref ref-type="bibr" rid="R25">25</xref>] (Table 1).</p>
<table-wrap id="tab1">
<label>Table 1</label>
<caption>
<p>Na&#x000ef;ve Bayes Classification Performance</p>
</caption>
<table>
<thead>
<tr>
<th align="center"><bold>Class</bold></th>
<th align="center"><bold>Precision</bold></th>
<th align="center"><bold>Recall</bold></th>
<th align="center"><bold>F1-Score</bold></th>
<th align="center"><bold>Support</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">0</td>
<td align="center">0.68</td>
<td align="center">0.76</td>
<td align="center">0.72</td>
<td align="center">994</td>
</tr>
<tr>
<td align="center">1</td>
<td align="center">0.73</td>
<td align="center">0.65</td>
<td align="center">0.68</td>
<td align="center">1007</td>
</tr>
<tr>
<td align="center"><bold>Macro Average</bold></td>
<td align="center"><bold>0.70</bold></td>
<td align="center"><bold>0.70</bold></td>
<td align="center"><bold>0.70</bold></td>
<td align="center"><bold>2001</bold></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Logistic Regression marginally outperformed Naive Bayes, attaining an accuracy of 78.1% and an F1-score of 0.77 (Table 2). Its ability to model linear feature correlations without assuming independence accounts for this superior performance [
<xref ref-type="bibr" rid="R26">26</xref>].</p>
<table-wrap id="tab2">
<label>Table 2</label>
<caption>
<p>Logistic Regression Classification Performance</p>
</caption>
<table>
<thead>
<tr>
<th align="center"><bold>Class</bold></th>
<th align="center"><bold>Precision</bold></th>
<th align="center"><bold>Recall</bold></th>
<th align="center"><bold>F1-Score</bold></th>
<th align="center"><bold>Support</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">0</td>
<td align="center">0.71</td>
<td align="center">0.70</td>
<td align="center">0.71</td>
<td align="center">994</td>
</tr>
<tr>
<td align="center">1</td>
<td align="center">0.71</td>
<td align="center">0.72</td>
<td align="center">0.72</td>
<td align="center">1007</td>
</tr>
<tr>
<td align="center"><bold>Macro Average</bold></td>
<td align="center"><bold>0.71</bold></td>
<td align="center"><bold>0.71</bold></td>
<td align="center"><bold>0.71</bold></td>
<td align="center"><bold>2001</bold></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Fine-tuning the bert-base-uncased model over three epochs, with a learning rate of 2e<sup>-5</sup>, resulted in a significant performance boost. BERT achieved an accuracy of 89.5% and an F1-score of 0.89, demonstrating its superior contextual language understanding and robustness against noisy, informal tweets [
<xref ref-type="bibr" rid="R13">13</xref>] (Table 3).</p>
<table-wrap id="tab3">
<label>Table 3</label>
<caption>
<p>BERT Classification Report</p>
</caption>
<table>
<thead>
<tr>
<th align="center"><bold>Class</bold></th>
<th align="center"><bold>Precision</bold></th>
<th align="center"><bold>Recall</bold></th>
<th align="center"><bold>F1-Score</bold></th>
<th align="center"><bold>Support</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">0</td>
<td align="center">0.87</td>
<td align="center">0.91</td>
<td align="center">0.89</td>
<td align="center">994</td>
</tr>
<tr>
<td align="center">1</td>
<td align="center">0.92</td>
<td align="center">0.88</td>
<td align="center">0.90</td>
<td align="center">1007</td>
</tr>
<tr>
<td align="center"><bold>Macro Average</bold></td>
<td align="center"><bold>0.90</bold></td>
<td align="center"><bold>0.90</bold></td>
<td align="center"><bold>0.89</bold></td>
<td align="center"><bold>2001</bold></td>
</tr>
</tbody>
</table>
</table-wrap>
<title>4.3. Visualization and Error Analysis</title><p>Figure 1 illustrates a comparative bar chart of model accuracy scores for Naive Bayes, Logistic Regression, and BERT on the binary sentiment classification task. The chart clearly shows that BERT significantly outperforms the two traditional models, achieving the highest accuracy of approximately 89.5%, followed by Logistic Regression and Naive Bayes with marginal differences around 70%&#x2013;71%.</p>
<p>This visual reinforces the quantitative findings presented in Tables 1&#x2013;3 and highlights BERT&#x2019;s superior capability in modeling nuanced sentiment features within informal social media text. The relatively close performance between Naive Bayes and Logistic Regression reflects the limitations of traditional machine learning models when faced with noisy, context-dependent data like tweets. Figure 1 also presents a side-by-side comparison of key metrics (precision, recall, F1-score, and accuracy) across all three models.</p>
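<p>All of the reported per-class and macro-averaged metrics derive directly from confusion-matrix counts; a minimal sketch follows (the counts below are synthetic, not the study's).</p>

```python
def prf(tp, fp, fn):
    """Precision, recall, F1 from confusion-matrix counts for one class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Synthetic 2x2 confusion matrix [[tn, fp], [fn, tp]]:
tn, fp, fn_, tp = 80, 20, 10, 90
accuracy = (tp + tn) / (tp + tn + fp + fn_)
p_pos, r_pos, f1_pos = prf(tp, fp, fn_)   # positive class
p_neg, r_neg, f1_neg = prf(tn, fn_, fp)   # negative class (FP/FN roles swap)
macro_f1 = (f1_pos + f1_neg) / 2
```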
<p>Confusion matrices in Figure <xref ref-type="fig" rid="fig2">2</xref> indicate that Naive Bayes and Logistic Regression frequently misclassify negative tweets as positive, whereas BERT demonstrates a significantly lower false-positive rate.</p>
<fig id="fig1">
<label>Figure 1</label>
<caption>
<p>Accuracy comparison of Na&#x000ef;ve Bayes, Logistic Regression, and BERT models</p>
</caption>
<graphic xlink:href="6162.fig.001" />
</fig><fig id="fig2">
<label>Figure 2</label>
<caption>
<p>Confusion matrices for Na&#x000ef;ve Bayes, Logistic Regression, and the BERT model</p>
</caption>
<graphic xlink:href="6162.fig.002" />
</fig><p>Receiver Operating Characteristic (ROC) analysis further confirms BERT&#x2019;s dominance, with an AUC of 0.95, outperforming Logistic Regression (0.86) and Naive Bayes (0.82) (Figure 3). These results align with existing literature emphasizing the high discriminative power of transformer models [
<xref ref-type="bibr" rid="R22">22</xref>].</p>
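<p>AUC has a simple pairwise-ranking interpretation: the probability that a randomly chosen positive tweet is scored above a randomly chosen negative one. A minimal rank-based sketch (the scores are synthetic):</p>

```python
def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half (Mann-Whitney U formulation)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

print(auc([0.9, 0.8], [0.2, 0.1]))   # perfect separation -> 1.0
print(auc([0.8, 0.3], [0.5, 0.1]))   # one inverted pair  -> 0.75
```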
<fig id="fig3">
<label>Figure 3</label>
<caption>
<p>ROC curves and AUC for Logistic Regression and BERT. A higher AUC for BERT indicates stronger discrimination ability.</p>
</caption>
<graphic xlink:href="6162.fig.003" />
</fig><p>To explore the representational quality of model features, PCA was applied to TF-IDF vectors and visualized in two dimensions (Figure 4). The resulting distribution lacked clear class separation, reflecting the limits of traditional vectorization. In contrast, t-SNE applied to BERT&#x2019;s [CLS] embeddings revealed tight, well-separated clusters (Figure 5), highlighting its superior encoding of contextual information.</p>
<fig id="fig4">
<label>Figure 4</label>
<caption>
<p>PCA projection of training and test data (colored by true label).</p>
</caption>
<graphic xlink:href="6162.fig.004" />
</fig><fig id="fig5">
<label>Figure 5</label>
<caption>
<p>t-SNE of BERT embeddings with predicted labels</p>
</caption>
<graphic xlink:href="6162.fig.005" />
</fig><title>4.4. Linguistic and Text Complexity Analysis</title><p>Word frequency analysis (Figure 6) revealed strong polarity indicators. Positive tweets frequently contained terms like "love," "good," and "happy," while negative tweets featured "hate," "bad," and "sad." These lexical trends helped all classifiers, especially Naive Bayes, which is highly sensitive to term frequency.</p>
<p>Further, tweet length distribution (Figure 7) showed that most tweets contained between 5 and 15 words. Such brevity hinders bag-of-words models but poses less challenge to pretrained models like BERT, which are trained on extensive corpora and excel in encoding meaning even in limited text contexts [
<xref ref-type="bibr" rid="R27">27</xref>].</p>
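<p>The word-frequency analysis reduces to counting tokens per sentiment class; a minimal sketch with collections.Counter (the six-tweet corpus below is illustrative):</p>

```python
from collections import Counter

positive = ["love this so much", "good vibes happy day", "love good music"]
negative = ["hate this bad day", "so sad and bad", "hate mondays"]

def top_words(tweets, k=3):
    """Return the k most frequent whitespace tokens across the tweets."""
    counts = Counter(w for t in tweets for w in t.split())
    return counts.most_common(k)

print(top_words(positive))
print(top_words(negative))
```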
<fig id="fig6">
<label>Figure 6</label>
<caption>
<p>Bar charts of the top 20 words in positive and negative tweets (distinct plots).</p>
</caption>
<graphic xlink:href="6162.fig.006" />
</fig><fig id="fig7">
<label>Figure 7</label>
<caption>
<p>Histogram of tweet lengths (in words) for sampled data.</p>
</caption>
<graphic xlink:href="6162.fig.007" />
</fig></sec><sec id="sec5">
<title>Summary, Conclusion, and Recommendation</title><title>5.1. Summary</title><p>This study presents a comprehensive comparative evaluation of three sentiment classification models&#x2014;Naive Bayes, Logistic Regression, and BERT&#x2014;using the Sentiment140 dataset. The results illustrate a clear performance advantage of transformer-based models over traditional machine learning approaches. Logistic Regression modestly outperformed Naive Bayes, owing to its ability to weight correlated features discriminatively rather than assuming feature independence, with L2 regularization curbing overfitting. However, both traditional models are inherently limited by their reliance on bag-of-words representations, which struggle to capture complex semantic phenomena such as negation, sarcasm, and idiomatic expressions [
<xref ref-type="bibr" rid="R28">28</xref>].</p>
<p>In contrast, BERT demonstrated substantial improvements in all performance metrics, achieving an accuracy of approximately 89.5% and an F1-score of 0.89. Its contextual embeddings effectively captured nuanced sentiment cues in short, informal tweets. This was confirmed through error analysis, which revealed BERT&#x2019;s reduced rate of false positives and negatives, and through t-SNE visualizations that highlighted well-separated sentiment clusters compared to the overlapping class distributions from PCA on TF-IDF features [
<xref ref-type="bibr" rid="R27">27</xref>].</p>
<p>Overall, the study establishes the superiority of BERT in handling sentiment classification tasks involving noisy, context-sensitive data like tweets. While traditional models remain useful as interpretable and computationally efficient baselines, their limitations in linguistic comprehension make them less suitable for complex real-world data.</p>
<title>5.2. Conclusion</title><p>This research confirms that transformer-based architectures, particularly BERT, provide significant enhancements over traditional classifiers for sentiment analysis of short-form social media content. BERT&#x2019;s ability to generate deep, context-aware embeddings makes it especially well suited to understanding informal and ambiguous language on platforms like X.</p>
<p>Despite their efficiency, Naive Bayes and Logistic Regression lack the representational depth necessary to handle subtle linguistic variations. Therefore, for critical applications demanding high accuracy and nuanced sentiment interpretation&#x2014;such as market analysis, public health monitoring, or political sentiment tracking&#x2014;BERT or similar transformer models are the recommended solution [
<xref ref-type="bibr" rid="R29">29</xref>].</p>
<p>The study contributes empirical evidence to the growing body of literature supporting the adoption of deep learning in natural language processing. The results further advocate for integrating interpretability tools alongside high-performing models to bridge performance with trust in model predictions.</p>
<p>Unlike existing studies that focus solely on BERT or benchmark traditional models independently, this study offers a head-to-head comparison of conventional machine learning models and BERT under uniform conditions using the same dataset, preprocessing pipeline, and evaluation metrics. Furthermore, the deployment of a real-time sentiment tool (SentiFeel) bridges academic research with practical implementation.</p>
<title>5.3. Limitations and Recommendations</title><p>Several limitations emerged during this study. First, computational constraints restricted training to a sample of 12,000 tweets, preventing use of the full corpus of 1.6 million tweets. This may have limited generalizability across diverse user expressions and sentiment styles. Second, the binary classification framework omitted neutral sentiments, simplifying the problem space and potentially overlooking more ambiguous cases in public opinion [
<xref ref-type="bibr" rid="R30">30</xref>].</p>
<p>Additionally, fine-tuning BERT for only one to three epochs in some trials likely capped its performance. Future experiments should consider longer training with broader hyperparameter sweeps.</p>
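<p>A broader sweep of the kind suggested above can be organized as a simple grid search. In this hedged sketch, <monospace>evaluate</monospace> is a hypothetical placeholder for a full fine-tune-and-validate run (e.g., with the Hugging Face Trainer); here it only encodes a toy preference so the loop structure can be shown:</p>

```python
# Sketch of a broader hyperparameter sweep than the 1-3 epoch trials used
# in the study. `evaluate` is a hypothetical stand-in for fine-tuning BERT
# and returning a validation score; its toy formula merely favors more
# epochs, a moderate learning rate, and a smaller batch.
from itertools import product

epochs_grid = [2, 3, 4, 5]
lr_grid = [2e-5, 3e-5, 5e-5]
batch_grid = [16, 32]

def evaluate(epochs, lr, batch_size):
    # Placeholder: a real run would fine-tune BERT and return validation F1.
    return epochs * 0.1 - abs(lr - 3e-5) * 1e4 - batch_size * 0.001

best_config, best_score = None, float("-inf")
for epochs, lr, batch in product(epochs_grid, lr_grid, batch_grid):
    score = evaluate(epochs, lr, batch)
    if score > best_score:
        best_config, best_score = (epochs, lr, batch), score

print(best_config)  # with this toy scorer: (5, 3e-05, 16)
```

<p>In practice the grid would be pruned or searched stochastically, since each real evaluation is an expensive GPU run.</p>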
<p>From these insights, the following recommendations are proposed:</p>
<p>Invest in transformer-based models like BERT for sentiment classification tasks, especially in domains involving short, informal text, as their performance justifies the higher computational cost.</p>
<p>Utilize the full Sentiment140 dataset or incorporate live tweet streams to improve robustness and coverage.</p>
<p>Extend future studies to multi-class sentiment classification, including neutral and mixed-emotion labels, to better reflect the complexity of real-world sentiment.</p>
<p>Integrate interpretability tools (e.g., LIME, SHAP) for more transparent model predictions.</p>
<p>Deploy real-time tools, such as the proposed <italic>SentiFeel</italic> web application, to make advanced sentiment analysis accessible to non-specialists and applicable in operational settings [
<xref ref-type="bibr" rid="R31">31</xref>].</p>
<p><bold>Supplementary Materials:</bold> To make the sentiment analysis pipeline accessible and user-friendly, we developed an interactive web-based tool called SentiFeel, available at: https://jeaneudes-dev.github.io/sentifeel/ </p>
<p><bold>Author Contributions:</bold> &#x0201c;Conceptualization, L.A.F. and J.E.A.; methodology, L.A.F. and J.E.A.; software, J.E.A.; validation, L.A.F. and J.E.A.; formal analysis, J.E.A.; investigation, L.A.F.; resources, L.A.F.; data curation, J.E.A.; writing&#x02014;original draft preparation, J.E.A.; writing&#x02014;review and editing, L.A.F.; visualization, J.E.A.; supervision, L.A.F.; project administration, L.A.F.; funding acquisition, L.A.F. and J.E.A. All authors have read and agreed to the published version of the manuscript.&#x0201d;</p>
<p><bold>Funding:</bold> This research received no external funding.</p>
<p><bold>Data Availability Statement: </bold>The Sentiment140 dataset used in this study is publicly accessible at http://help.sentiment140.com/for-students. </p>
<p><bold>Acknowledgments:</bold> The authors would like to thank the developers of the Sentiment140 dataset and the open-source communities behind tools such as NLTK, scikit-learn, and Hugging Face Transformers, which significantly facilitated the completion of this study.</p>
<p><bold>Conflicts of Interest:</bold> The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.</p>
</sec>
  </body>
  <back>
    <ref-list>
      <title>References</title>
      
<ref id="R1">
<label>[1]</label>
<mixed-citation publication-type="other">Pak, A., &#x00026; Paroubek, P. (2010). Twitter as a corpus for sentiment analysis and opinion mining. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC). https://www.aclweb.org/anthology/L10-1141/
</mixed-citation>
</ref>
<ref id="R2">
<label>[2]</label>
<mixed-citation publication-type="other">Farinola, L. A., Assogba, J.-E., &#x00026; Assogba, M. B. M. (2025). Data-driven policy: Forecasting the socioeconomic impact of industrial automation using machine learning. In Proceedings of the Hasan Karacan Conference (p. 83). TRNC. https://eclss.org/publicationsfordoi/Abstracts_Kyrn2025_ESSARUCD.pdf#page=102
</mixed-citation>
</ref>
<ref id="R3">
<label>[3]</label>
<mixed-citation publication-type="other">Liu, B. (2012). Sentiment analysis and opinion mining. Morgan &#x00026; Claypool Publishers. https://doi.org/10.2200/S00416ED1V01Y201204HLT016
</mixed-citation>
</ref>
<ref id="R4">
<label>[4]</label>
<mixed-citation publication-type="other">Pang, B., Lee, L., &#x00026; Vaithyanathan, S. (2002). Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of EMNLP. https://www.aclweb.org/anthology/W02-1011/
</mixed-citation>
</ref>
<ref id="R5">
<label>[5]</label>
<mixed-citation publication-type="other">Zhang, J., Zhang, M., &#x00026; Hou, X. (2021). Text sentiment analysis based on deep learning: A survey. Complexity, 2021, Article 5597294. https://doi.org/10.1155/2021/5597294
</mixed-citation>
</ref>
<ref id="R6">
<label>[6]</label>
<mixed-citation publication-type="other">Farinola, L. A., &#x00026; Ayodeji, I. T. (2025). Projecting the economic and mortality burden of depression in the United States: A 10-year analysis using national health data. International Journal of Population Data Science, 10(1).
</mixed-citation>
</ref>
<ref id="R0">
<label>[0]</label>
<mixed-citation publication-type="other">
</mixed-citation>
</ref>
<ref id="R7">
<label>[7]</label>
<mixed-citation publication-type="other">Gupta, R., &#x00026; Joshi, A. (2020). Multilingual sentiment analysis: State of the art and independent comparison of techniques. Information Processing &#x00026; Management, 57(5), 102309. https://doi.org/10.1016/j.ipm.2020.102309
</mixed-citation>
</ref>
<ref id="R8">
<label>[8]</label>
<mixed-citation publication-type="other">Farinola, L. A., &#x00026; Assogba, M. (2025). Explicit artificial intelligence timetable generator for colleges and universities. Open Journal of Applied Sciences, 15(8), 2277-2290. https://doi.org/10.4236/ojapps.2025.158151
</mixed-citation>
</ref>
<ref id="R9">
<label>[9]</label>
<mixed-citation publication-type="other">Go, A., Bhayani, R., &#x00026; Huang, L. (2009). Twitter sentiment classification using distant supervision (Technical report). Stanford University. https://cs.stanford.edu/people/alecmgo/papers/TwitterDistantSupervision09.pdf
</mixed-citation>
</ref>
<ref id="R10">
<label>[10]</label>
<mixed-citation publication-type="other">Devlin, J., Chang, M.-W., Lee, K., &#x00026; Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT. https://aclanthology.org/N19-1423/
</mixed-citation>
</ref>
<ref id="R11">
<label>[11]</label>
<mixed-citation publication-type="other">Olteanu, S., Castillo, C., Diaz, F., &#x00026; Kiciman, E. (2019). Social data: Biases, methodological pitfalls, and ethical boundaries. Frontiers in Big Data, 2, 13. https://doi.org/10.3389/fdata.2019.00013
</mixed-citation>
</ref>
<ref id="R12">
<label>[12]</label>
<mixed-citation publication-type="other">Mollah, M. (2020). Deep learning-based sentiment analysis on Twitter data. International Journal of Innovative Science and Research Technology, 5(3). https://ijisrt.com/deep-learning-based-sentiment-analysis-on-twitter-data
</mixed-citation>
</ref>
<ref id="R13">
<label>[13]</label>
<mixed-citation publication-type="other">Mohammad, S., Kiritchenko, S., &#x00026; Zhu, X. (2013). NRC-Canada: Building the state-of-the-art in sentiment analysis of tweets. In Proceedings of SemEval. https://aclanthology.org/S13-2059/
</mixed-citation>
</ref>
<ref id="R14">
<label>[14]</label>
<mixed-citation publication-type="other">Bello, A., Adeyanju, M., &#x00026; Usman, R. (2022). Twitter sentiment classification using BERT and deep learning. Journal of Big Data, 9(1). https://doi.org/10.1186/s40537-022-00586-6
</mixed-citation>
</ref>
<ref id="R15">
<label>[15]</label>
<mixed-citation publication-type="other">NLTK Project. (2025). Natural Language Toolkit documentation. https://www.nltk.org/
</mixed-citation>
</ref>
<ref id="R16">
<label>[16]</label>
<mixed-citation publication-type="other">Hugging Face. (2025). Transformers library. https://huggingface.co/transformers/.
</mixed-citation>
</ref>
<ref id="R17">
<label>[17]</label>
<mixed-citation publication-type="other">Sokolova, S., &#x00026; Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing &#x00026; Management, 45(4), 427-437. https://doi.org/10.1016/j.ipm.2009.03.002
</mixed-citation>
</ref>
<ref id="R18">
<label>[18]</label>
<mixed-citation publication-type="other">Barbieri, F., Camacho-Collados, J., Espinosa-Anke, L., &#x00026; Neves, L. (2020). TweetEval: Unified benchmark and comparative evaluation for tweet classification. In Findings of EMNLP 2020.
</mixed-citation>
</ref>
<ref id="R19">
<label>[19]</label>
<mixed-citation publication-type="other">https://doi.org/10.18653/v1/2020.findings-emnlp.148.
</mixed-citation>
</ref>
<ref id="R20">
<label>[20]</label>
<mixed-citation publication-type="other">Pennington, J., Socher, R., &#x00026; Manning, C. D. (2014). GloVe: Global vectors for word representation. In Proceedings of EMNLP. https://aclanthology.org/D14-1162/.
</mixed-citation>
</ref>
<ref id="R21">
<label>[21]</label>
<mixed-citation publication-type="other">Wei, J., &#x00026; Zou, K. (2019). EDA: Easy data augmentation techniques for boosting performance on text classification tasks. https://arxiv.org/abs/1901.11196.
</mixed-citation>
</ref>
<ref id="R22">
<label>[22]</label>
<mixed-citation publication-type="other">Farinola, L. A., &#x00026; Bazarkhan, D. (2025). Optimization of complex spray drying operations in manufacturing using machine learning. Open Journal of Applied Sciences, 15(9), 2662-2691.
</mixed-citation>
</ref>
<ref id="R23">
<label>[23]</label>
<mixed-citation publication-type="other">https://doi.org/10.4236/ojapps.2025.159179
</mixed-citation>
</ref>
<ref id="R24">
<label>[24]</label>
<mixed-citation publication-type="other">Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Stoyanov, V. (2019). RoBERTa: A robustly optimized BERT pretraining approach. https://arxiv.org/abs/1907.11692
</mixed-citation>
</ref>
<ref id="R25">
<label>[25]</label>
<mixed-citation publication-type="other">SemEval Task Organizers. (2025). International workshop on semantic evaluation. https://semeval.github.io/
</mixed-citation>
</ref>
<ref id="R26">
<label>[26]</label>
<mixed-citation publication-type="other">Eisenstein, J. (2019). Introduction to natural language processing. MIT Press.
</mixed-citation>
</ref>
<ref id="R27">
<label>[27]</label>
<mixed-citation publication-type="other">https://mitpress.mit.edu/9780262042840/.
</mixed-citation>
</ref>
<ref id="R28">
<label>[28]</label>
<mixed-citation publication-type="other">Manning, C. D., Raghavan, P., &#x00026; Sch&#x000fc;tze, H. (2008). Introduction to information retrieval. Cambridge University Press. https://nlp.stanford.edu/IR-book/
</mixed-citation>
</ref>
<ref id="R29">
<label>[29]</label>
<mixed-citation publication-type="other">Hastie, T., Tibshirani, R., &#x00026; Friedman, J. (2009). The elements of statistical learning (2nd ed.). Springer. https://doi.org/10.1007/978-0-387-84858-7
</mixed-citation>
</ref>
<ref id="R30">
<label>[30]</label>
<mixed-citation publication-type="other">van der Maaten, L., &#x00026; Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9, 2579-2605. https://www.jmlr.org/papers/v9/vandermaaten08a.html
</mixed-citation>
</ref>
<ref id="R31">
<label>[31]</label>
<mixed-citation publication-type="other">Cambria, E., Hussain, A., &#x00026; Schuller, B. (2020). Sentiment analysis: The state of the art and emerging challenges. IEEE Intelligent Systems, 35(5), 63-70. https://doi.org/10.1109/MIS.2020.2984777
</mixed-citation>
</ref>
<ref id="R32">
<label>[32]</label>
<mixed-citation publication-type="other">Liu, D., Jiang, M., &#x00026; He, J. (2021). A comparative study of transformer-based models for sentiment classification. Expert Systems with Applications, 185, 115693. https://doi.org/10.1016/j.eswa.2021.115693
</mixed-citation>
</ref>
<ref id="R33">
<label>[33]</label>
<mixed-citation publication-type="other">Balahur, A. (2013). Sentiment analysis in social media texts. In Proceedings of the Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. https://aclanthology.org/W13-1609/
</mixed-citation>
</ref>
<ref id="R34">
<label>[34]</label>
<mixed-citation publication-type="other">Naseem, A., Razzak, M., Musial, M. A., &#x00026; Imran, M. (2022). Transformer-based deep intelligent contextual embedding for Twitter sentiment analysis. Future Generation Computer Systems, 128, 389-406. https://doi.org/10.1016/j.future.2021.10.010.
</mixed-citation>
</ref>
<ref id="R1">
<label>[1]</label>
<mixed-citation publication-type="other">Pak, A., &#x00026; Paroubek, P. (2010). Twitter as a corpus for sentiment analysis and opinion mining. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC). https://www.aclweb.org/anthology/L10-1141/
</mixed-citation>
</ref>
<ref id="R2">
<label>[2]</label>
<mixed-citation publication-type="other">Farinola, L. A., Assogba, J.-E., &#x00026; Assogba, M. B. M. (2025). Data-driven policy: Forecasting the socioeconomic impact of industrial automation using machine learning. In Proceedings of the Hasan Karacan Conference (p. 83). TRNC. https://eclss.org/publicationsfordoi/Abstracts_Kyrn2025_ESSARUCD.pdf#page=102
</mixed-citation>
</ref>
<ref id="R3">
<label>[3]</label>
<mixed-citation publication-type="other">Liu, B. (2012). Sentiment analysis and opinion mining. Morgan &#x00026; Claypool Publishers. https://doi.org/10.2200/S00416ED1V01Y201204HLT016
</mixed-citation>
</ref>
<ref id="R4">
<label>[4]</label>
<mixed-citation publication-type="other">Pang, B., Lee, L., &#x00026; Vaithyanathan, S. (2002). Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of EMNLP. https://www.aclweb.org/anthology/W02-1011/
</mixed-citation>
</ref>
<ref id="R5">
<label>[5]</label>
<mixed-citation publication-type="other">Zhang, J., Zhang, M., &#x00026; Hou, X. (2021). Text sentiment analysis based on deep learning: A survey. Complexity, 2021, Article 5597294. https://doi.org/10.1155/2021/5597294
</mixed-citation>
</ref>
<ref id="R6">
<label>[6]</label>
<mixed-citation publication-type="other">Farinola, L. A., &#x00026; Ayodeji, I. T. (2025). Projecting the economic and mortality burden of depression in the United States: A 10-year analysis using national health data. International Journal of Population Data Science, 10(1).
</mixed-citation>
</ref>
<ref id="R0">
<label>[0]</label>
<mixed-citation publication-type="other">
</mixed-citation>
</ref>
<ref id="R7">
<label>[7]</label>
<mixed-citation publication-type="other">Gupta, R., &#x00026; Joshi, A. (2020). Multilingual sentiment analysis: State of the art and independent comparison of techniques. Information Processing &#x00026; Management, 57(5), 102309. https://doi.org/10.1016/j.ipm.2020.102309
</mixed-citation>
</ref>
<ref id="R8">
<label>[8]</label>
<mixed-citation publication-type="other">Farinola, L. A., &#x00026; Assogba, M. (2025). Explicit artificial intelligence timetable generator for colleges and universities. Open Journal of Applied Sciences, 15(8), 2277-2290. https://doi.org/10.4236/ojapps.2025.158151
</mixed-citation>
</ref>
<ref id="R9">
<label>[9]</label>
<mixed-citation publication-type="other">Go, A., Bhayani, R., &#x00026; Huang, L. (2009). Twitter sentiment classification using distant supervision (Technical report). Stanford University. https://cs.stanford.edu/people/alecmgo/papers/TwitterDistantSupervision09.pdf
</mixed-citation>
</ref>
<ref id="R10">
<label>[10]</label>
<mixed-citation publication-type="other">Devlin, J., Chang, M.-W., Lee, K., &#x00026; Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT. https://aclanthology.org/N19-1423/
</mixed-citation>
</ref>
<ref id="R11">
<label>[11]</label>
<mixed-citation publication-type="other">Olteanu, S., Castillo, C., Diaz, F., &#x00026; Kiciman, E. (2019). Social data: Biases, methodological pitfalls, and ethical boundaries. Frontiers in Big Data, 2, 13. https://doi.org/10.3389/fdata.2019.00013
</mixed-citation>
</ref>
<ref id="R12">
<label>[12]</label>
<mixed-citation publication-type="other">Mollah, M. (2020). Deep learning-based sentiment analysis on Twitter data. International Journal of Innovative Science and Research Technology, 5(3). https://ijisrt.com/deep-learning-based-sentiment-analysis-on-twitter-data
</mixed-citation>
</ref>
<ref id="R13">
<label>[13]</label>
<mixed-citation publication-type="other">Mohammad, S., Kiritchenko, S., &#x00026; Zhu, X. (2013). NRC-Canada: Building the state-of-the-art in sentiment analysis of tweets. In Proceedings of SemEval. https://aclanthology.org/S13-2059/
</mixed-citation>
</ref>
<ref id="R14">
<label>[14]</label>
<mixed-citation publication-type="other">Bello, A., Adeyanju, M., &#x00026; Usman, R. (2022). Twitter sentiment classification using BERT and deep learning. Journal of Big Data, 9(1). https://doi.org/10.1186/s40537-022-00586-6
</mixed-citation>
</ref>
<ref id="R15">
<label>[15]</label>
<mixed-citation publication-type="other">NLTK Project. (2025). Natural Language Toolkit documentation. https://www.nltk.org/
</mixed-citation>
</ref>
<ref id="R16">
<label>[16]</label>
<mixed-citation publication-type="other">Hugging Face. (2025). Transformers library. https://huggingface.co/transformers/.
</mixed-citation>
</ref>
<ref id="R17">
<label>[17]</label>
<mixed-citation publication-type="other">Sokolova, S., &#x00026; Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing &#x00026; Management, 45(4), 427-437. https://doi.org/10.1016/j.ipm.2009.03.002
</mixed-citation>
</ref>
<ref id="R18">
<label>[18]</label>
<mixed-citation publication-type="other">Barbieri, F., Camacho-Collados, J., Espinosa-Anke, L., &#x00026; Neves, L. (2020). TweetEval: Unified benchmark and comparative evaluation for tweet classification. In Findings of EMNLP 2020.
</mixed-citation>
</ref>
<ref id="R19">
<label>[19]</label>
<mixed-citation publication-type="other">https://doi.org/10.18653/v1/2020.findings-emnlp.148.
</mixed-citation>
</ref>
<ref id="R20">
<label>[20]</label>
<mixed-citation publication-type="other">Pennington, J., Socher, R., &#x00026; Manning, C. D. (2014). GloVe: Global vectors for word representation. In Proceedings of EMNLP. https://aclanthology.org/D14-1162/.
</mixed-citation>
</ref>
<ref id="R21">
<label>[21]</label>
<mixed-citation publication-type="other">Wei, J., &#x00026; Zou, K. (2019). EDA: Easy data augmentation techniques for boosting performance on text classification tasks. https://arxiv.org/abs/1901.11196.
</mixed-citation>
</ref>
<ref id="R22">
<label>[22]</label>
<mixed-citation publication-type="other">Farinola, L. A., &#x00026; Bazarkhan, D. (2025). Optimization of complex spray drying operations in manufacturing using machine learning. Open Journal of Applied Sciences, 15(9), 2662-2691.
</mixed-citation>
</ref>
<ref id="R23">
<label>[23]</label>
<mixed-citation publication-type="other">https://doi.org/10.4236/ojapps.2025.159179
</mixed-citation>
</ref>
<ref id="R24">
<label>[24]</label>
<mixed-citation publication-type="other">Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Stoyanov, V. (2019). RoBERTa: A robustly optimized BERT pretraining approach. https://arxiv.org/abs/1907.11692
</mixed-citation>
</ref>
<ref id="R25">
<label>[25]</label>
<mixed-citation publication-type="other">SemEval Task Organizers. (2025). International workshop on semantic evaluation. https://semeval.github.io/
</mixed-citation>
</ref>
<ref id="R26">
<label>[26]</label>
<mixed-citation publication-type="other">Eisenstein, J. (2019). Introduction to natural language processing. MIT Press.
</mixed-citation>
</ref>
<ref id="R27">
<label>[27]</label>
<mixed-citation publication-type="other">https://mitpress.mit.edu/9780262042840/.
</mixed-citation>
</ref>
<ref id="R28">
<label>[28]</label>
<mixed-citation publication-type="other">Manning, C. D., Raghavan, P., &#x00026; Sch&#x000fc;tze, H. (2008). Introduction to information retrieval. Cambridge University Press. https://nlp.stanford.edu/IR-book/
</mixed-citation>
</ref>
<ref id="R29">
<label>[29]</label>
<mixed-citation publication-type="other">Hastie, T., Tibshirani, R., &#x00026; Friedman, J. (2009). The elements of statistical learning (2nd ed.). Springer. https://doi.org/10.1007/978-0-387-84858-7
</mixed-citation>
</ref>
<ref id="R30">
<label>[30]</label>
<mixed-citation publication-type="other">van der Maaten, L., &#x00026; Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9, 2579-2605. https://www.jmlr.org/papers/v9/vandermaaten08a.html
</mixed-citation>
</ref>
<ref id="R31">
<label>[31]</label>
<mixed-citation publication-type="other">Cambria, E., Hussain, A., &#x00026; Schuller, B. (2020). Sentiment analysis: The state of the art and emerging challenges. IEEE Intelligent Systems, 35(5), 63-70. https://doi.org/10.1109/MIS.2020.2984777
</mixed-citation>
</ref>
<ref id="R32">
<label>[32]</label>
<mixed-citation publication-type="other">Liu, D., Jiang, M., &#x00026; He, J. (2021). A comparative study of transformer-based models for sentiment classification. Expert Systems with Applications, 185, 115693. https://doi.org/10.1016/j.eswa.2021.115693
</mixed-citation>
</ref>
<ref id="R33">
<label>[33]</label>
<mixed-citation publication-type="other">Balahur, A. (2013). Sentiment analysis in social media texts. In Proceedings of the Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. https://aclanthology.org/W13-1609/
</mixed-citation>
</ref>
<ref id="R34">
<label>[34]</label>
<mixed-citation publication-type="other">Naseem, A., Razzak, M., Musial, M. A., &#x00026; Imran, M. (2022). Transformer-based deep intelligent contextual embedding for Twitter sentiment analysis. Future Generation Computer Systems, 128, 389-406. https://doi.org/10.1016/j.future.2021.10.010.
</mixed-citation>
</ref>
<ref id="R1">
<label>[1]</label>
<mixed-citation publication-type="other">Pak, A., &#x00026; Paroubek, P. (2010). Twitter as a corpus for sentiment analysis and opinion mining. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC). https://www.aclweb.org/anthology/L10-1141/
</mixed-citation>
</ref>
<ref id="R2">
<label>[2]</label>
<mixed-citation publication-type="other">Farinola, L. A., Assogba, J.-E., &#x00026; Assogba, M. B. M. (2025). Data-driven policy: Forecasting the socioeconomic impact of industrial automation using machine learning. In Proceedings of the Hasan Karacan Conference (p. 83). TRNC. https://eclss.org/publicationsfordoi/Abstracts_Kyrn2025_ESSARUCD.pdf#page=102
</mixed-citation>
</ref>
<ref id="R3">
<label>[3]</label>
<mixed-citation publication-type="other">Liu, B. (2012). Sentiment analysis and opinion mining. Morgan &#x00026; Claypool Publishers. https://doi.org/10.2200/S00416ED1V01Y201204HLT016
</mixed-citation>
</ref>
<ref id="R4">
<label>[4]</label>
<mixed-citation publication-type="other">Pang, B., Lee, L., &#x00026; Vaithyanathan, S. (2002). Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of EMNLP. https://www.aclweb.org/anthology/W02-1011/
</mixed-citation>
</ref>
<ref id="R5">
<label>[5]</label>
<mixed-citation publication-type="other">Zhang, J., Zhang, M., &#x00026; Hou, X. (2021). Text sentiment analysis based on deep learning: A survey. Complexity, 2021, Article 5597294. https://doi.org/10.1155/2021/5597294
</mixed-citation>
</ref>
<ref id="R6">
<label>[6]</label>
<mixed-citation publication-type="other">Farinola, L. A., &#x00026; Ayodeji, I. T. (2025). Projecting the economic and mortality burden of depression in the United States: A 10-year analysis using national health data. International Journal of Population Data Science, 10(1).
</mixed-citation>
</ref>
<ref id="R0">
<label>[0]</label>
<mixed-citation publication-type="other">
</mixed-citation>
</ref>
<ref id="R7">
<label>[7]</label>
<mixed-citation publication-type="other">Gupta, R., &#x00026; Joshi, A. (2020). Multilingual sentiment analysis: State of the art and independent comparison of techniques. Information Processing &#x00026; Management, 57(5), 102309. https://doi.org/10.1016/j.ipm.2020.102309
</mixed-citation>
</ref>
<ref id="R8">
<label>[8]</label>
<mixed-citation publication-type="other">Farinola, L. A., &#x00026; Assogba, M. (2025). Explicit artificial intelligence timetable generator for colleges and universities. Open Journal of Applied Sciences, 15(8), 2277-2290. https://doi.org/10.4236/ojapps.2025.158151
</mixed-citation>
</ref>
<ref id="R9">
<label>[9]</label>
<mixed-citation publication-type="other">Go, A., Bhayani, R., &#x00026; Huang, L. (2009). Twitter sentiment classification using distant supervision (Technical report). Stanford University. https://cs.stanford.edu/people/alecmgo/papers/TwitterDistantSupervision09.pdf
</mixed-citation>
</ref>
<ref id="R10">
<label>[10]</label>
<mixed-citation publication-type="other">Devlin, J., Chang, M.-W., Lee, K., &#x00026; Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT. https://aclanthology.org/N19-1423/
</mixed-citation>
</ref>
<ref id="R11">
<label>[11]</label>
<mixed-citation publication-type="other">Olteanu, S., Castillo, C., Diaz, F., &#x00026; Kiciman, E. (2019). Social data: Biases, methodological pitfalls, and ethical boundaries. Frontiers in Big Data, 2, 13. https://doi.org/10.3389/fdata.2019.00013
</mixed-citation>
</ref>
<ref id="R12">
<label>[12]</label>
<mixed-citation publication-type="other">Mollah, M. (2020). Deep learning-based sentiment analysis on Twitter data. International Journal of Innovative Science and Research Technology, 5(3). https://ijisrt.com/deep-learning-based-sentiment-analysis-on-twitter-data
</mixed-citation>
</ref>
<ref id="R13">
<label>[13]</label>
<mixed-citation publication-type="other">Mohammad, S., Kiritchenko, S., &#x00026; Zhu, X. (2013). NRC-Canada: Building the state-of-the-art in sentiment analysis of tweets. In Proceedings of SemEval. https://aclanthology.org/S13-2059/
</mixed-citation>
</ref>
<ref id="R14">
<label>[14]</label>
<mixed-citation publication-type="other">Bello, A., Adeyanju, M., &#x00026; Usman, R. (2022). Twitter sentiment classification using BERT and deep learning. Journal of Big Data, 9(1). https://doi.org/10.1186/s40537-022-00586-6
</mixed-citation>
</ref>
<ref id="R15">
<label>[15]</label>
<mixed-citation publication-type="other">NLTK Project. (2025). Natural Language Toolkit documentation. https://www.nltk.org/
</mixed-citation>
</ref>
<ref id="R16">
<label>[16]</label>
<mixed-citation publication-type="other">Hugging Face. (2025). Transformers library. https://huggingface.co/transformers/.
</mixed-citation>
</ref>
<ref id="R17">
<label>[17]</label>
<mixed-citation publication-type="other">Sokolova, S., &#x00026; Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing &#x00026; Management, 45(4), 427-437. https://doi.org/10.1016/j.ipm.2009.03.002
</mixed-citation>
</ref>
<ref id="R18">
<label>[18]</label>
<mixed-citation publication-type="other">Barbieri, F., Camacho-Collados, J., Espinosa-Anke, L., &#x00026; Neves, L. (2020). TweetEval: Unified benchmark and comparative evaluation for tweet classification. In Findings of EMNLP 2020.
</mixed-citation>
</ref>
<ref id="R19">
<label>[19]</label>
<mixed-citation publication-type="other">https://doi.org/10.18653/v1/2020.findings-emnlp.148.
</mixed-citation>
</ref>
<ref id="R20">
<label>[20]</label>
<mixed-citation publication-type="other">Pennington, J., Socher, R., &#x00026; Manning, C. D. (2014). GloVe: Global vectors for word representation. In Proceedings of EMNLP. https://aclanthology.org/D14-1162/.
</mixed-citation>
</ref>
<ref id="R21">
<label>[21]</label>
<mixed-citation publication-type="other">Wei, J., &#x00026; Zou, K. (2019). EDA: Easy data augmentation techniques for boosting performance on text classification tasks. https://arxiv.org/abs/1901.11196.
</mixed-citation>
</ref>
<ref id="R22">
<label>[22]</label>
<mixed-citation publication-type="other">Farinola, L. A., &#x00026; Bazarkhan, D. (2025). Optimization of complex spray drying operations in manufacturing using machine learning. Open Journal of Applied Sciences, 15(9), 2662-2691.
</mixed-citation>
</ref>
<ref id="R23">
<label>[23]</label>
<mixed-citation publication-type="other">https://doi.org/10.4236/ojapps.2025.159179
</mixed-citation>
</ref>
<ref id="R24">
<label>[24]</label>
<mixed-citation publication-type="other">Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Stoyanov, V. (2019). RoBERTa: A robustly optimized BERT pretraining approach. https://arxiv.org/abs/1907.11692
</mixed-citation>
</ref>
<ref id="R25">
<label>[25]</label>
<mixed-citation publication-type="other">SemEval Task Organizers. (2025). International workshop on semantic evaluation. https://semeval.github.io/
</mixed-citation>
</ref>
<ref id="R26">
<label>[26]</label>
<mixed-citation publication-type="other">Eisenstein, J. (2019). Introduction to natural language processing. MIT Press.
</mixed-citation>
</ref>
<ref id="R27">
<label>[27]</label>
<mixed-citation publication-type="other">https://mitpress.mit.edu/9780262042840/.
</mixed-citation>
</ref>
<ref id="R28">
<label>[28]</label>
<mixed-citation publication-type="other">Manning, C. D., Raghavan, P., &#x00026; Sch&#x000fc;tze, H. (2008). Introduction to information retrieval. Cambridge University Press. https://nlp.stanford.edu/IR-book/
</mixed-citation>
</ref>
<ref id="R29">
<label>[29]</label>
<mixed-citation publication-type="other">Hastie, T., Tibshirani, R., &#x00026; Friedman, J. (2009). The elements of statistical learning (2nd ed.). Springer. https://doi.org/10.1007/978-0-387-84858-7
</mixed-citation>
</ref>
<ref id="R30">
<label>[30]</label>
<mixed-citation publication-type="other">van der Maaten, L., &#x00026; Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9, 2579-2605. https://www.jmlr.org/papers/v9/vandermaaten08a.html
</mixed-citation>
</ref>
<ref id="R31">
<label>[31]</label>
<mixed-citation publication-type="other">Cambria, E., Hussain, A., &#x00026; Schuller, B. (2020). Sentiment analysis: The state of the art and emerging challenges. IEEE Intelligent Systems, 35(5), 63-70. https://doi.org/10.1109/MIS.2020.2984777
</mixed-citation>
</ref>
<ref id="R32">
<label>[32]</label>
<mixed-citation publication-type="other">Liu, D., Jiang, M., &#x00026; He, J. (2021). A comparative study of transformer-based models for sentiment classification. Expert Systems with Applications, 185, 115693. https://doi.org/10.1016/j.eswa.2021.115693
</mixed-citation>
</ref>
<ref id="R33">
<label>[33]</label>
<mixed-citation publication-type="other">Balahur, A. (2013). Sentiment analysis in social media texts. In Proceedings of the Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. https://aclanthology.org/W13-1609/
</mixed-citation>
</ref>
<ref id="R34">
<label>[34]</label>
<mixed-citation publication-type="other">Naseem, A., Razzak, M., Musial, M. A., &#x00026; Imran, M. (2022). Transformer-based deep intelligent contextual embedding for Twitter sentiment analysis. Future Generation Computer Systems, 128, 389-406. https://doi.org/10.1016/j.future.2021.10.010.
</mixed-citation>
</ref>
    </ref-list>
  </back>
</article>