AI-Driven Strategic Modernization of Legacy ETL Workflows on Serverless Cloud Platforms

Authors

  • Thananjayan Kasi

DOI:

https://doi.org/10.22399/ijcesen.4754

Keywords:

Serverless Computing, ETL Modernization, Event-Driven Architecture, Distributed Processing, Cloud-Native Transformation

Abstract

Legacy ETL systems impose considerable operational overhead through manual capacity management, rigid scheduling models, and weak cloud integration. This article proposes a systematic framework for migrating legacy ETL processes to serverless cloud environments. It addresses these shortfalls through AI-aided workflow classification, natural language processing-assisted code translation, and event-driven orchestration. Unlike earlier lift-and-shift or containerized approaches, which retain operational inflexibility, the framework relies on fully managed serverless services integrated with machine learning functions. The reported benefits are considerable cost savings, performance gains from distributed processing engines, and improved data freshness. The framework comprises automated inventory documentation via metadata-driven models, AI-aided complexity stratification across heterogeneous data, Apache Spark-based code refactoring using Business Process Model and Notation (BPMN) templates, and fault-tolerant, trigger-activated orchestration. The model improves reliability, scalability, and standards compliance while removing infrastructure-related operational burdens. Challenges including metadata completeness gaps, semantic drift during translation, AI-related bias, and operational constraints are addressed through complete provenance tracking and parallel validation. The proposed framework outperforms current serverless cloud-migration solutions through intelligent automation and pattern-driven optimization. In an anonymized enterprise case study, the framework reduced end-to-end batch processing time and infrastructure cost while improving workflow success rates and migration throughput relative to the legacy environment. A formal evaluation section details workflow classification quality, translation accuracy, and the operational improvements observed during the migration program.
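
The abstract names parallel validation as one safeguard against semantic drift in translated workflows but does not say how it is carried out. The following is a minimal sketch of one possible reading, assuming that the legacy job and its migrated counterpart are run on the same input and that their outputs are compared row by row. The names `row_fingerprint`, `ValidationReport`, and `parallel_validate` are hypothetical and do not come from the article.

```python
import hashlib
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Mapping


def row_fingerprint(row: Mapping[str, object]) -> str:
    """Hash one output row so that column order does not affect the result."""
    canonical = "|".join(f"{key}={row[key]!r}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class ValidationReport:
    """Summary of a comparison between legacy and migrated outputs (hypothetical)."""
    legacy_rows: int
    migrated_rows: int
    missing_in_migrated: int      # rows the legacy job produced but the new job did not
    unexpected_in_migrated: int   # rows the new job produced but the legacy job did not

    @property
    def passed(self) -> bool:
        return self.missing_in_migrated == 0 and self.unexpected_in_migrated == 0


def parallel_validate(
    legacy: Iterable[Mapping[str, object]],
    migrated: Iterable[Mapping[str, object]],
) -> ValidationReport:
    """Compare two result sets as multisets of row fingerprints.

    Assumption: both jobs ran on the same input, so any difference in the
    multisets indicates a behavioral difference in the migrated workflow.
    Row order is ignored; duplicate rows are counted.
    """
    legacy_counts = Counter(row_fingerprint(row) for row in legacy)
    migrated_counts = Counter(row_fingerprint(row) for row in migrated)
    missing = legacy_counts - migrated_counts      # multiset difference
    unexpected = migrated_counts - legacy_counts
    return ValidationReport(
        legacy_rows=sum(legacy_counts.values()),
        migrated_rows=sum(migrated_counts.values()),
        missing_in_migrated=sum(missing.values()),
        unexpected_in_migrated=sum(unexpected.values()),
    )
```

Comparing multisets of fingerprints rather than ordered lists is a deliberate choice in this sketch, since distributed engines generally do not guarantee output order. A production check would also need a tolerance for floating-point values and a strategy for outputs too large to hold in memory, neither of which is covered here.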

Published

2026-01-13

How to Cite

Thananjayan Kasi. (2026). AI-Driven Strategic Modernization of Legacy ETL Workflows on Serverless Cloud Platforms. International Journal of Computational and Experimental Science and Engineering, 12(1). https://doi.org/10.22399/ijcesen.4754

Section

Research Article