Adaptive hybrid SMS spam detection system with user feedback-based self-learning
Abstract
This study presented a comprehensive approach to SMS spam detection based on a hybrid architecture that integrated local message processing algorithms with high-performance cloud-based deep learning models. This approach enabled a balance between classification accuracy and the privacy of processed messages. The objective of this study was to develop an intelligent hybrid SMS spam detection system capable of delivering high classification accuracy, maintaining up-to-date knowledge, enabling user personalisation, and adapting to new attack patterns. To achieve this objective, a detailed review of the scientific literature on SMS spam detection (covering machine learning, neural networks, and hybrid methods) was combined with empirical analysis. The classic machine learning models (Naïve Bayes, Logistic Regression, Random Forest) were implemented with standard machine learning libraries, and the deep learning models were implemented with frameworks that support recurrent neural networks, in particular Long Short-Term Memory, and transformer architectures. The system was tested on the open SMS Spam Collection dataset and evaluated with accuracy (up to 0.98), F1-score (up to 0.95), and ROC-AUC (up to 0.98). Moreover, a mechanism was developed to dynamically update the system’s knowledge based on user feedback, alongside a weighted framework designed to evaluate the trustworthiness of that feedback. During the study, a multi-level system was developed that performed initial classification on the user’s device and could delegate processing to a cloud module in cases of uncertainty. Compared to basic approaches, the hybrid architecture demonstrated improved classification accuracy, fewer false positives and false negatives, and greater adaptability to changes in the structure of spam messages. Aggregation of suspicious messages in the cloud enabled effective retraining of the models in cases of concept drift.
The practical value of the results lies in the potential integration of the developed system into mobile platforms, as well as corporate information security tools, for the purpose of filtering SMS content and protecting end-users from social engineering.
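The multi-level routing described in the abstract (classify on the device, delegate to the cloud only when the local result is uncertain) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `classify` function, the `Verdict` type, the uncertainty band `low`/`high`, and the two toy scorers are hypothetical stand-ins, not the models or thresholds used in the study.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function mapping message text to P(spam) in [0, 1].
SpamScorer = Callable[[str], float]


@dataclass
class Verdict:
    label: str     # "spam" or "ham"
    p_spam: float  # probability used for the final decision
    source: str    # "device" or "cloud"


def classify(message: str, local: SpamScorer, cloud: SpamScorer,
             low: float = 0.2, high: float = 0.8) -> Verdict:
    """Decide on-device when the local score is confident, else delegate."""
    p = local(message)
    source = "device"
    if low < p < high:
        # Uncertain band (assumed thresholds): only these messages
        # are sent to the cloud model.
        p = cloud(message)
        source = "cloud"
    return Verdict("spam" if p >= 0.5 else "ham", p, source)


# Stand-in scorers for illustration only; not the models from the study.
def toy_local(message: str) -> float:
    hits = sum(w in message.lower() for w in ("free", "win", "prize"))
    return min(1.0, 0.05 + 0.45 * hits)


def toy_cloud(message: str) -> float:
    return 0.9 if "http" in message.lower() else 0.1


if __name__ == "__main__":
    for text in ("See you at lunch",
                 "WIN a FREE prize now",
                 "Free parking at http://example.com"):
        print(text, "->", classify(text, toy_local, toy_cloud))
```

In this sketch, the width of the uncertainty band is the knob that trades privacy against accuracy: a narrower band keeps more messages on the device, while a wider band sends more of them to the stronger cloud model.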
Keywords
natural language processing; long short-term memory; architecture of spam; messages; metric