An Analysis of Crime Prediction and Classification Using Data Mining Techniques

Journal of Artificial Intelligence and Big Data | Vol 1, Issue 1

Table 1. Summary of Crime Prediction and Classification Using Data Mining

| Authors | Methods | Dataset | Key Findings | Limitations and Future Work |
| --- | --- | --- | --- | --- |
| Almaw & Kadam (2018) | Random Tree, 1-ensemble, 3-ensemble, Statistical Analysis | Crime dataset | Random Tree: 82.02% accuracy. | Explore more ensemble techniques and extend crime trend analysis. |
| Feng et al. | Stateful LSTM (Keras) and Prophet model | Crime data (3 years of training data) | The Prophet model and the Keras LSTM produced better predictions than conventional neural network models, which helped law enforcement allocate resources. | Further optimize training dataset size and explore hybrid deep learning methods. |
| "Crimes prediction using spatiotemporal data and kernel density estimation" | Gradient Boosting Machine (GBM) | Spatiotemporal and zoning datasets | KDE with zoning district characteristics and smoothing improves model performance; the model achieved a multiclass logarithmic loss of 2.356104 on the validation set and 2.35443 on the test set. | Expand to real-time prediction applications and evaluate generalizability across cities and regions. |
| Kim et al. | Boosted Decision Tree and K-Nearest Neighbour (KNN) | Vancouver crime data (15 years) | Prediction accuracy of the KNN and Boosted Decision Tree models ranged from 39% to 44%. | Improve accuracy through advanced preprocessing, feature engineering, and contextual external data. |
| Almaw and Kadam | Naive Bayes, J48, and Random Tree | Experimental dataset | Random Tree outperformed the others with 82.0227% accuracy. Ensemble models reached 81.6073% (1-ensemble) and 79.2353% (3-ensemble). | Limited focus on computational efficiency; ensemble models with novel classifiers remain to be explored. |
| Sivaranjani, Sivakumari and Aasha | K-Means and K-Nearest Neighbor (KNN) | Crime data visualized on Google Maps | K-Means clusters visualized on Google Maps improve usability; KNN is used for prediction and evaluated with precision, recall, and F-measure. | Refine spatial accuracy and investigate more advanced algorithms for geospatial clustering and prediction. |
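
The studies in Table 1 report their results with different metrics (accuracy, multiclass logarithmic loss, and precision, recall, and F-measure), which makes the rows hard to compare directly. The following is a minimal sketch of how these metrics are commonly computed. It is not taken from any of the cited papers: the function names and the toy labels and probabilities are illustrative assumptions only.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log of the probability assigned to the true class.

    y_true: list of integer class indices.
    probs:  list of per-sample probability lists, one entry per class.
    """
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip to avoid log(0) when a model assigns zero probability.
        p = min(max(row[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F-measure (F1) for one class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (hypothetical): three crime categories encoded as 0, 1, 2.
y_true = [0, 1, 2, 1]
y_pred = [0, 1, 1, 1]
uniform_probs = [[1 / 3, 1 / 3, 1 / 3] for _ in y_true]

acc = accuracy(y_true, y_pred)
baseline_loss = multiclass_log_loss(y_true, uniform_probs)
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive=1)
```

A model that guesses uniformly over K classes has a log loss of ln K, so a log loss is best read against that baseline. The value of about 2.356 reported in Table 1 equals the uniform baseline for roughly 10.5 classes; since the table does not state how many crime categories were predicted, it cannot be judged from the table alone. Likewise, accuracy figures such as 39% to 44% or 82% are only comparable when the number of classes and the class balance are similar.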