| Author | Proposed Work | Dataset | Key Findings | Challenges/Gaps |
| --- | --- | --- | --- | --- |
| Hasan, Balbahaith, and Tarique (2019) | Developed a heuristic ML-based algorithm and GUI app using the top 5 of 23 classifiers | 616 SQL statements | Achieved 93.8% accuracy in detecting SQLi attacks | Small dataset size; scalability to real-world scenarios not validated |
| Noor et al. (2019) | Proposed a semantic ML-based framework that links threats to TTPs via probabilistic networks | TTP taxonomy dataset (133 TTPs, 45 threat families) | Detected threats with 92% accuracy; low false positives; 0.15 s average detection time | Specific to TTP-based threats; generalization to SQLi-specific detection not tested |
| Zhang (2019) | Designed ML classifiers (CNN, MLP) to detect SQLi vulnerabilities in PHP code using code-level features | PHP source code files | CNN achieved 95.4% precision; MLP achieved 63.7% recall and F-measure of 0.746 | Limited to PHP; varying performance across different classifiers |
| Ul Islam et al. (2019) | Created a supervised learning tool for NoSQL injection detection using a novel dataset | Custom-designed NoSQL injection dataset | Achieved 0.93 F2-score; outperformed Sqreen by 36.25%; database-agnostic | Limited availability of NoSQL datasets; manual feature engineering required |
| McWhirter et al. (2018) | Used a gap-weighted string subsequence kernel with an SVM to classify SQL query strings | Amnesia testbed datasets | Achieved 97.07% (Select) and 92.48% (Insert) accuracy; adapted to unseen threats | Lower accuracy with unsanitized quotation marks; sensitive to input anomalies |
| Chattopadhyay et al. (2018) | Examined the difficulties of applying ML methods to malware detection | Multiple datasets (unspecified) | Compared various ML techniques across datasets; summarized performance based on different metrics; identified optimal techniques for evolving patterns | Lack of clarity in dataset specifics; issues in defining and generalizing ML approaches to dynamic, real-world intrusion patterns; scalability concerns |