Search - Scientific Publications

Open Access February 06, 2026

Predictive Modeling of Public Sentiment Using Social Media Data and Natural Language Processing Techniques

Lawrence A. Farinola, Jean-Eudes Assogba

Journal of Artificial Intelligence and Big Data 2026, 6(1), 1-12. DOI: 10.31586/jaibd.2026.6162

Social media platforms like X (formerly Twitter) generate vast volumes of user-generated content that provide real-time insights into public sentiment. Despite the widespread use of traditional machine learning methods, their limitations in capturing contextual nuances in noisy social media text remain a challenge. This study leverages the Sentiment140 dataset, comprising 1.6 million labeled tweets, and develops predictive models for binary sentiment classification using Naive Bayes, Logistic Regression, and the transformer-based BERT model. Experiments were conducted on a balanced subset of 12,000 tweets after comprehensive NLP preprocessing. Evaluation using accuracy, F1-score, and confusion matrices revealed that BERT significantly outperforms traditional models, achieving an accuracy of 89.5% and an F1-score of 0.89 by effectively modeling contextual and semantic nuances. In contrast, Naive Bayes and Logistic Regression demonstrated reasonable but consistently lower performance. To support practical deployment, we introduce SentiFeel, an interactive tool enabling real-time sentiment analysis. While resource constraints limited the dataset size and training epochs, future work will explore full corpus utilization and the inclusion of neutral sentiment classes. These findings underscore the potential of transformer models for enhanced public opinion monitoring, marketing analytics, and policy forecasting.
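The evaluation metrics named in this abstract (accuracy, F1-score, confusion matrix) can be illustrated with a minimal sketch. The labels below are toy values chosen for illustration, not data from the study, and the function names are hypothetical.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and the F1-score of the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Toy labels (1 = positive sentiment, 0 = negative); illustrative only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
```

On a balanced test set, as described in the abstract, accuracy and F1 tend to sit close together; on imbalanced data they can diverge, which is one reason to report both.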

Article

Open Access November 29, 2022

The Application of Machine Learning in the Corona Era, With an Emphasis on Economic Concepts and Sustainable Development Goals

Milad Shahvaroughi Farahani, Amirhossein Esfahani, Fardin Alipoor

International Journal of Mathematical, Engineering, Biological and Applied Computing 2022, 1(2), 95-149. DOI: 10.31586/ijmebac.2022.519

The aim of this article is to examine the impacts of Coronavirus Disease 2019 (Covid-19) vaccines on economic conditions and the sustainable development goals. In other words, we study economic conditions during Covid-19. We examine the economic costs of the pandemic; the benefits in terms of gross domestic product (GDP), public finances, and employment; investment in vaccines around the world and its progress; the overall economic impacts of vaccines; and the impact of emerging markets (EM) on achieving the sustainable development goals (SDGs), including no poverty, good health and well-being, zero hunger, and reduced inequality. The importance of emerging economies in reducing the harmful effects of Covid-19 is also noted. As a case study, we forecast daily new death cases in Iran from February 2020 to August 2021 using an Artificial Neural Network (ANN) and the Beetle Antennae Search (BAS) algorithm, alongside econometric models and regression analysis. The findings show that Covid-19 has had devastating economic and health effects on the world, and that vaccines can be very helpful in eliminating these effects, especially in the long term. We observed inequality in the distribution of Covid-19 vaccines between rich and poor countries, a gap that EM can help narrow. The results show that both approaches (i.e., artificial intelligence (AI) and econometric models) give almost the same results, but AI optimization models can make the model and its predictions more robust. The main contribution of this article is that we survey the impacts of vaccination from a socio-economic viewpoint rather than merely reporting facts. We also survey the impacts of vaccines on the SDGs and the role of EM in achieving them. In addition to the theoretical framework, we use quantitative and empirical results that have rarely been seen in other articles.

Article

Open Access August 31, 2022

Extended Rule of Five and Prediction of Biological Activity of peptidic HIV-1-PR Inhibitors

Vishnu Kumar Sahu, Rajesh Kumar Singh, Pashupati Prasad Singh

Universal Journal of Pharmacy and Pharmacology 2022, 1(1), 20-42. DOI: 10.31586/ujpp.2022.403

In this research work, we have applied “Lipinski’s RO5” for a pharmacokinetics (PK) study and to predict the activity of peptidic HIV-1 protease inhibitors. Peptidic HIV-1-PRIs have been taken from the literature with their observed biological activities (OBAs) in terms of IC50. The logarithms of the inverse of IC50 have been used as the biological end point o(log1/C) in the study. For calculation of physicochemical parameters, the molecular modeling and geometry optimization of all the derivatives have been carried out with CAChe Pro software using the semiempirical PM3 method. Prediction of the biological activity of the inhibitors has shown that the best QSAR model is constructed from pharmacokinetic properties, molecular weight and hydrogen bond acceptor. This also proved that these properties play an important role in describing the PKs of the drugs. On the basis of the derived models, one can build up a theoretical basis to assess the biological activity of compounds of the same series.
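As a rough illustration of the two quantities this abstract relies on, the sketch below counts violations of the classic Lipinski rule of five and converts an IC50 value to log(1/C). The thresholds are the standard published ones; the abstract does not list the criteria of the "extended" rule, so they are not modeled here. The function names, the descriptor values, and the assumption that IC50 is expressed in mol/L are all illustrative.

```python
import math


def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count violations of the classic rule of five.

    Thresholds: molecular weight <= 500 Da, logP <= 5,
    hydrogen bond donors <= 5, hydrogen bond acceptors <= 10.
    """
    checks = [
        mol_weight > 500,
        log_p > 5,
        h_donors > 5,
        h_acceptors > 10,
    ]
    return sum(checks)


def log_inverse_ic50(ic50_molar):
    """Return log10(1/IC50), assuming IC50 is given in mol/L."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_molar)


# Hypothetical descriptor values for a peptidic inhibitor (not from the paper).
violations = lipinski_violations(
    mol_weight=670.8, log_p=4.2, h_donors=4, h_acceptors=9
)
activity = log_inverse_ic50(1e-9)  # a 1 nM inhibitor
```

Peptidic inhibitors often exceed the 500 Da weight limit, so counting violations rather than returning a single pass/fail flag keeps more information for a later regression model.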

Article

Open Access August 21, 2021

Global Analysis of Potential COVID 19 Transmission and Enabling Factors

Hesham Magd, Henry Karyamsetty

Global Journal of Epidemiology and Infectious Disease 2021, 1(1), 46-61. DOI: 10.31586/gjeid.2021.010103

Background: Coronavirus disease has caused global turmoil, with a huge impact on human life all over the world. Current reports state that more than 3 million people have lost their lives and more than 160 million people are suspected to have been infected with SARS-CoV-2. Transmission and disease incidence rates are indicators of the seriousness of the COVID-19 pandemic, and studies of the factors that drive them are vital to curbing the disease. Methods: The study uses correlation and multiple linear regression analysis to examine the relationships among the variables population density, temperature, relative humidity, and active time of the virus, and to identify the parameters that predict the cases reported per million population in 83 countries. Results: The analysis indicates that the active time of the virus in days is positively associated with COVID-19 cases across the countries (r = .604, p < .01). Active time of the virus shows a strong negative correlation with temperature (r = -.930, p < .01), indicating that a rise in temperature reduces virus activity in the population. Together, these variables account for 36.2% of the variance in cases per million population, with no single factor emerging as a significant predictor. Conclusion: The study outcomes indicate that population density alone is insufficient to estimate the extent of influence on COVID-19 cases, because the number of persons living per sq. km of land is a dynamic quantity that fluctuates over time and space due to migration. Taken together with previous studies on the environmental and climatic factors influencing reported cases, population dynamics do not show much significance for disease spread and incidence.
Contribution: The rise in confirmed cases and the high incidence rate reported in countries can be attributed to the active time (life expectancy) of the virus, as a positive correlation was observed between reported COVID-19 cases and the virus active time in the examined countries. Environmental and climatic factors also play a role in modulating the infection and transmission rate, whereas population density has a less significant influence on COVID-19.
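The correlation coefficients quoted in this abstract are Pearson's r. A minimal sketch of that computation follows; the temperature and active-time values are invented for illustration and are not the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: mean temperature (deg C) and virus active time (days).
temperature = [5.0, 10.0, 15.0, 20.0, 25.0]
active_days = [9.0, 7.5, 6.5, 4.0, 3.0]
r = pearson_r(temperature, active_days)
shared_variance = r * r  # share of variance explained by a one-predictor linear fit
```

For a single predictor, r squared equals the R² of the simple linear regression. With several predictors, as in the study's multiple regression, R² has to be computed from the fitted model instead.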

Article

Open Access May 20, 2021

Bioconcentration Factor of Polychlorinated Biphenyls and Its Correlation with UV- and IR-Spectroscopic data: A DFT based Study

Sangeeta Sahu, Vishnu Kumar Sahu, Anil Kumar Soni

Online Journal of Chemistry 2021, 1(1), 1-10. DOI: 10.31586/ojc.2021.010101

Polychlorinated biphenyls (PCBs) are an important class of persistent organic pollutants that were used as components of paints, especially in printing; as plasticizers in plastics; as insulating materials in transformers and capacitors; as heat transfer fluids; and as additives in hydraulic fluids for vacuum and turbine pumps. There is always a need to establish reliable procedures for predicting the bioconcentration potential of chemicals from knowledge of their molecular structure, or from readily measurable properties of the substance. Hence, correlation and prediction of bioconcentration factors (BCFs) based on λ_max and the vibration frequencies of various bonds, viz. υ(C-H) and υ(C=C), of biphenyl and its fifty-seven derivatives have been made. For the study, the molecular modeling and geometry optimization of the PCBs have been performed in the workspace program of the CAChe Pro 5.04 software of Fujitsu using the DFT method. UV-visible spectra for each compound were generated from electron transitions between molecular orbitals, which occur as the molecule absorbs electromagnetic radiation in the visible and ultraviolet (UV-visible) region. The energies of excited electronic states were computed quantum mechanically. IR spectra for each compound were generated from coordinated motions of the atoms, which occur as the molecule absorbs electromagnetic radiation in the infrared region. The force necessary to distort the molecule from its equilibrium geometry was computed quantum mechanically, and from it the frequency of vibrational transitions was predicted. The Project Leader program associated with CAChe has been used for multiple linear regression (MLR) analysis, using the above spectroscopic data as independent variables and the BCFs of the PCBs as dependent variables. The reliability of the correlation and the predictive ability of the MLR equations (models) are judged by R², R²_adj, se, q²_L10O and F values.
This study clearly showed that UV and IR spectroscopic data can be used to predict the BCFs of a large number of related compounds within a limited time and without difficulty.
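The goodness-of-fit statistics listed in this abstract can be sketched as follows. For brevity the example fits a single predictor by ordinary least squares, whereas the study uses multiple linear regression. The cross-validated q² is computed here in its leave-one-out form, which is an assumption about what the abstract's subscript denotes. All names and data values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Coefficient of determination of the fitted line."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalize R^2 for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


def q_squared_loo(xs, ys):
    """Leave-one-out q^2: 1 - PRESS / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit without sample i, then predict the held-out point.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and log BCF values (not from the paper).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
log_bcf = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
r2 = r_squared(descriptor, log_bcf)
r2_adj = adjusted_r_squared(r2, n_samples=len(descriptor), n_predictors=1)
q2 = q_squared_loo(descriptor, log_bcf)
```

A q² well below R² suggests the model fits the training compounds better than it predicts unseen ones, which is why QSAR studies usually report both.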

Editorial Article

Open Access September 28, 2025

Gut-Brain Axis in Autism Spectrum Disorder: A Bibliometric and Microbial-Metabolite-Neural Pathway Analysis

Avam Arora

Open Journal of Neuroscience 2025, 3(1), 47-51. DOI: 10.31586/ojn.2025.6169

The gut-brain axis (GBA) has emerged as a central focus in the study of neurodevelopmental disorders, particularly autism spectrum disorder (ASD). Research suggests that microbial composition and its metabolic byproducts influence neural development, synaptic plasticity, and behavior [1,2,3]. A structured bibliometric analysis of Scopus and Web of Science records was performed using Bibliometrix and VOSviewer to trace trends and thematic evolution of GBA–ASD literature [7,8]. In parallel, a data-driven pathway modeling approach maps microbial metabolites (e.g., short-chain fatty acids, tryptophan catabolites) to host signaling pathways including vagal stimulation, immune cytokine modulation, and blood–brain barrier (BBB) permeability [4,5]. Simulations implemented in Python’s NetworkX illustrate how perturbations in metabolite flux may influence CNS outcomes. The findings reveal growing emphasis on butyrate, serotonin, microglial priming, and maternal immune activation in ASD-related GBA studies, and highlight the need for rigorous empirical validation of computational predictions [9,10,11].

Brief Report

Open Access June 28, 2025

Development of a Hemodialysis Data Collection and Clinical Information System and Establishment of an Intradialytic Blood Pressure/Pulse Rate Predictive Model

I-Hsuan Peng, Chen-Kang Tien, Pei-Chun Lee

Journal of Artificial Intelligence and Big Data 2025, 5(2), 1-23. DOI: 10.31586/jaibd.2025.6029

This research is a collaboration among a university team, a partnering corporation, and a hemodialysis clinic. It is a cross-disciplinary initiative in the field of Artificial Intelligence of Things (AIoT) within the medical informatics domain. The research has two objectives: (1) the development of an Internet of Things (IoT)-based information system customized for the hemodialysis machines at the clinic, including transmission bridges, a dedicated web/app interface for clinical personnel, and a backend server. The system has been deployed at the clinic and is now officially operational; (2) the use of de-identified, anonymous data (collected by the officially operational system) to train, evaluate, and compare Deep Learning-based Intradialytic Blood Pressure (BP)/Pulse Rate (PR) Predictive Models, with subsequent suggestions provided. Both objectives were executed under the supervision of the Institutional Review Board (IRB) at Mackay Memorial Hospital in Taiwan. The system completed for objective one has introduced three significant services to the clinic: automated hemodialysis data collection, digitized data storage, and an information-rich human-machine interface with graphical data displays. Together these replace traditional paper-based clinical administrative operations, thereby enhancing healthcare efficiency. The graphical data presented through the web and app interfaces aids real-time, intuitive comprehension of patients’ conditions during hemodialysis. Moreover, the data stored in the backend database is available for physicians to conduct relevant analyses, gain insights into medical practices, and provide precise medical care for individual patients.
The training and evaluation of the predictive models for objective two, along with related comparisons, analyses, and recommendations, suggest that in situations with limited computational resources and data, an Artificial Neural Network (ANN) model with six hidden layers, the SELU activation function, and a focus on artery-related features can be employed for hourly intradialytic BP/PR prediction tasks. It is believed that this contributes to the collaborating clinic and relevant research communities.
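To make the recommended architecture concrete, the sketch below implements the SELU activation and the forward pass of a fully connected network with six hidden layers. Only the depth and the activation come from the abstract. The layer width, the number of inputs and outputs, the weight initialization, and all names are assumptions for illustration, and the weights are random rather than trained.

```python
import math
import random

# Standard SELU constants (Klambauer et al., 2017).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """Scaled exponential linear unit."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))


def dense(inputs, weights, biases, activation=None):
    """Apply one fully connected layer to a list of input values."""
    outputs = []
    for row, bias in zip(weights, biases):
        total = sum(w * v for w, v in zip(row, inputs)) + bias
        outputs.append(activation(total) if activation else total)
    return outputs


def build_network(n_inputs, hidden_width, n_hidden, n_outputs, seed=0):
    """Create random (untrained) weights; widths are illustrative."""
    rng = random.Random(seed)
    sizes = [n_inputs] + [hidden_width] * n_hidden + [n_outputs]
    layers = []
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        std = math.sqrt(1.0 / fan_in)  # LeCun-style scaling, often paired with SELU
        weights = [[rng.gauss(0.0, std) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        layers.append((weights, [0.0] * fan_out))
    return layers


def forward(layers, features):
    """SELU on every hidden layer, linear output layer for regression."""
    values = features
    for weights, biases in layers[:-1]:
        values = dense(values, weights, biases, activation=selu)
    weights, biases = layers[-1]
    return dense(values, weights, biases)


# Hypothetical setup: 4 input features, 3 outputs (e.g. two BP values and PR).
network = build_network(n_inputs=4, hidden_width=8, n_hidden=6, n_outputs=3)
prediction = forward(network, [0.2, -0.5, 1.0, 0.3])
```

SELU is designed to keep activations close to zero mean and unit variance across layers, which can help a moderately deep network train without batch normalization. That may be one reason it suits a setting with limited data and compute.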

Maintenance of large-scale software is difficult due to the large size and high complexity of the code. 80% of software development effort is spent on maintenance, and 60% on trying to understand the code. The severity of code smells must be measured, along with the fairness of those measurements, because this helps developers, especially in large-scale source code projects. A code smell is not a bug in the system, as it does not prevent the program from functioning, but it may increase the risk of software failure or performance slowdown. Therefore, this paper seeks to help developers with early prediction of the severity of code smells and to test the level of fairness of the predictions, especially in large-scale source code projects. Data is the collection of facts and observations in terms of events; it is continuously growing, getting denser and more varied by the minute across different disciplines and fields. Hence, Big Data emerged and is evolving rapidly. The types of data being processed are huge, but little thought has been given to where this data resides. We observe that this data resides in software, and that software codebases are growing as well, in the size of modules, functionalities, classes, and so on. Since data is growing so rapidly, the codebases of software are also growing. Therefore, this paper seeks to discuss the 5V’s of big data in the context of software code and how to optimize or manage big code. When we talk of "Big Code for Big Software," we are referring to the specific challenges and considerations involved in developing, managing, and maintaining code in large-scale software systems.

Achieving Maintainability, Readability & Understandability of Software Projects using Code Smell Prediction

Predictive Failure Analytics in Critical Automotive Applications: Enhancing Reliability and Safety through Advanced AI Techniques
