Bulletin of Cherkasy State Technological University

ISSN 2306-4412
E-ISSN 2708-6070

https://doi.org/10.62660/bcstu/4.2025.128

Volume 30, No. 4, 2025

128-142

A comparative analysis of CodeBERT and CodeLlama models: Architecture, functionality and application in software coding tasks

    Oleksandr Deineha, Olena Arshava, Irina Zhovtonizhko

    Received 30.05.2025, Revised 19.10.2025, Accepted 15.12.2025

    Abstract

    The relevance of the research stems from the need to compare the large language models CodeBERT and CodeLlama, which are actively used to automate code generation and analysis with the aim of improving software efficiency and quality. The aim of the study was a comprehensive comparison of the architectural and functional characteristics of the CodeBERT and CodeLlama language models. Interpretative, comparative, systemic, and structural-categorical analyses were used to examine the architectures, tasks, and relevance of the models. A comprehensive comparative analysis of CodeBERT and CodeLlama was carried out according to key parameters: model architecture (the RoBERTa encoder architecture in CodeBERT versus the Llama 2 decoder architecture in CodeLlama), the scale and sources of training data, the range of supported tasks, performance on benchmark datasets, advantages and limitations, typical areas of application, and conditions of accessibility and licensing. The results showed that differences in architecture and training data significantly affected the effectiveness of the models across different types of tasks and determined their practical capabilities and limitations. Particular attention was paid to deploying the models in practical scenarios, taking into account hardware resources and licensing policy. The results showed that CodeLlama required substantially greater computational resources for effective operation, whereas CodeBERT was easier to deploy on standard equipment. It was also established that the licensing conditions of CodeLlama were more restrictive, which could complicate its use in commercial projects, in contrast to CodeBERT with its open licence. It was concluded that the two models performed predominantly complementary functions: CodeBERT was an effective tool for code-understanding tasks, whereas CodeLlama demonstrated high results in generation tasks.
    The conclusions outlined the challenges and prospects for the development of next-generation models with multitasking and multimodality. The practical value of the study lies in assisting developers and researchers in choosing the optimal tool, taking into account both technical and licensing aspects.
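
    The architectural contrast at the heart of the comparison can be illustrated with a minimal NumPy sketch (the function name is hypothetical, chosen for illustration): an encoder such as CodeBERT's RoBERTa backbone attends bidirectionally, so every token sees the whole sequence, whereas a decoder such as CodeLlama's Llama 2 backbone applies a causal mask, so each token sees only itself and earlier tokens.

    ```python
    import numpy as np

    def attention_mask(seq_len: int, causal: bool) -> np.ndarray:
        """Boolean attention mask: True means the position may be attended to.

        causal=False -> full bidirectional mask (encoder-style, e.g. RoBERTa).
        causal=True  -> lower-triangular mask (decoder-style, e.g. Llama 2),
                        so token i attends only to tokens 0..i.
        """
        if causal:
            return np.tril(np.ones((seq_len, seq_len), dtype=bool))
        return np.ones((seq_len, seq_len), dtype=bool)

    # Encoder mask: every row is all True (each token sees all positions).
    encoder_mask = attention_mask(4, causal=False)
    # Decoder mask: row i has i+1 True entries (no access to future tokens).
    decoder_mask = attention_mask(4, causal=True)
    ```

    This masking difference is one reason the two models complement each other: bidirectional context favours code-understanding tasks, while causal masking is what makes left-to-right code generation possible.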

    Keywords:

    large language models; encoder transformer architecture; decoder architecture; systemic and functional analysis; optimisation model; statistical analysis; natural language processing

    Suggested citation
    Deineha, O., Arshava, O., & Zhovtonizhko, I. (2025). A comparative analysis of CodeBERT and CodeLlama models: Architecture, functionality and application in software coding tasks. Bulletin of Cherkasy State Technological University, 30(4), 128-142. https://doi.org/10.62660/bcstu/4.2025.128


    18006, Ukraine, Cherkasy, 460, Shevchenko Blvd.

    info@bulletin-chstu.com.ua


    © 2026 Bulletin of Cherkasy State Technological University