Journal: Volume 29, No. 4, 2024
Pages: 21 – 31
DOI: https://doi.org/10.62660/bcstu/4.2024.21

Optimisation of intelligent system algorithms for poorly structured data analysis

Mykola Demchyna, Taras Styslo, Serhii Vashchyshak
Received 16.07.2024
Revised 11.10.2024
Accepted 16.12.2024

Abstract

Integrating heterogeneous types of medical data with modern deep learning methods can improve the accuracy and efficiency of diagnosing complex conditions such as cardiovascular diseases, which is relevant for personalised medicine and for reducing the risk of medical errors. The study aimed to develop a decision support system that improves the diagnosis of cardiovascular diseases by integrating heterogeneous types of medical data. The knowledge base was built from real clinical scenarios, and the data underwent cleaning, standardisation, and semantic analysis using specialised medical dictionaries. The system demonstrated high efficiency owing to its ability to integrate text, image, and signal data into a single analysis process. Efficiency was evaluated using accuracy, recall (completeness), F1-score, and positive and negative predictive values. The introduction of transformers yielded a 15% increase in diagnostic accuracy compared with traditional methods, and the use of a hybrid computing approach reduced model training time by 30% and enabled the processing of up to 1 TB of data per day. Integrating heterogeneous types of medical data also improved the personalisation of diagnostics by accounting for individual patient characteristics such as medical history, genetic factors, and comorbidities. Transformer attention mechanisms improved resistance to noise and data gaps, supporting reliable results even with incomplete or inaccurate information. Optimisation of the models reduced delays in data processing, which is critical for prompt clinical decision-making. In addition, transformers showed the ability to scale dynamically to new types of data without losing efficiency, opening opportunities 
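
The evaluation metrics named in the abstract (accuracy, recall, F1-score, and positive and negative predictive values) can all be derived from the four cells of a binary confusion matrix. The following is a minimal illustrative sketch, not the authors' implementation: the names ConfusionCounts and diagnostic_metrics and the example counts are hypothetical, and the abstract does not state how the metrics were actually computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for binary confusion-matrix counts."""
    tp: int  # diseased patients correctly flagged
    fp: int  # healthy patients incorrectly flagged
    tn: int  # healthy patients correctly cleared
    fn: int  # diseased patients missed


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Compute the five metrics listed in the abstract from raw counts."""
    ppv = _ratio(c.tp, c.tp + c.fp)      # positive predictive value (precision)
    npv = _ratio(c.tn, c.tn + c.fn)      # negative predictive value
    recall = _ratio(c.tp, c.tp + c.fn)   # recall (completeness, sensitivity)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    f1 = _ratio(2 * ppv * recall, ppv + recall)  # harmonic mean of PPV and recall
    return {
        "accuracy": accuracy,
        "recall": recall,
        "f1": f1,
        "ppv": ppv,
        "npv": npv,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; not results from the study.
    example = ConfusionCounts(tp=40, fp=10, tn=45, fn=5)
    for name, value in diagnostic_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Reporting the positive and negative predictive values alongside recall is a common choice in diagnostic settings, because they describe how far a positive or negative system output can be trusted for an individual patient.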

Suggested citation

Demchyna, M., Styslo, T., & Vashchyshak, S. (2024). Optimisation of intelligent system algorithms for poorly structured data analysis. Bulletin of Cherkasy State Technological University, 29(4), 21-31. https://doi.org/10.62660/bcstu/4.2024.21