Energy-Aware Training and Deployment of Large-Scale Machine Learning Models: A Review of Distributed Graph Data Science and Multi-Objective Resource Optimization
DOI: https://doi.org/10.22399/ijcesen.5204

Keywords: Machine Learning, Energy-Aware Training, Resource Optimization

Abstract
The exponential growth of large-scale Machine Learning (ML) models, particularly Transformers and Graph Neural Networks (GNNs), has catalyzed advances across many domains, yet it imposes substantial environmental and computational costs. This review investigates energy-aware strategies for the training and deployment of large-scale ML systems through the lens of Graph Data Science and Distributed Computing. Synthesizing recent literature on GNNs, Deep Reinforcement Learning (DRL), and bio-inspired optimization, we examine methods for minimizing resource usage, including frameworks that optimize computation graphs, implement dynamic cache scheduling, and apply game-theoretic Nash equilibria to task offloading in edge-cloud environments. We identify critical research gaps in dynamic graph partitioning and energy profiling, and propose a multi-objective framework that balances accuracy, latency, and energy efficiency. Finally, we recommend a two-phase research trajectory toward sustainable "Green AI," paving the way for scalable, environmentally responsible artificial intelligence.
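The multi-objective balance the abstract refers to is commonly formalized either as a weighted scalarization of the competing objectives or as a Pareto front over accuracy, latency, and energy. The Python sketch below illustrates both views; the configuration names, weights, and reference budgets are hypothetical assumptions chosen for illustration, not values or an implementation from the paper.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    """A hypothetical training/deployment configuration and its measured profile."""
    name: str
    accuracy: float    # task accuracy in [0, 1]
    latency_ms: float  # end-to-end latency in milliseconds
    energy_j: float    # energy per step or inference in joules

def scalarized_score(c: Candidate, w_acc: float = 0.5, w_lat: float = 0.25,
                     w_energy: float = 0.25, lat_ref: float = 100.0,
                     energy_ref: float = 50.0) -> float:
    """Weighted-sum scalarization of the accuracy/latency/energy trade-off.

    Latency and energy are normalized against reference budgets and subtracted,
    so higher scores favor accurate, fast, frugal candidates. The weights and
    budgets here are illustrative assumptions, not values from the paper.
    """
    return (w_acc * c.accuracy
            - w_lat * (c.latency_ms / lat_ref)
            - w_energy * (c.energy_j / energy_ref))

def pareto_front(cands: list[Candidate]) -> list[Candidate]:
    """Return candidates not dominated on (accuracy, -latency, -energy)."""
    def dominates(a: Candidate, b: Candidate) -> bool:
        no_worse = (a.accuracy >= b.accuracy and a.latency_ms <= b.latency_ms
                    and a.energy_j <= b.energy_j)
        strictly_better = (a.accuracy > b.accuracy or a.latency_ms < b.latency_ms
                           or a.energy_j < b.energy_j)
        return no_worse and strictly_better
    return [c for c in cands if not any(dominates(o, c) for o in cands)]

if __name__ == "__main__":
    # Hypothetical profiles for three deployment configurations.
    configs = [
        Candidate("dense-fp32",   accuracy=0.91, latency_ms=120.0, energy_j=80.0),
        Candidate("pruned-int8",  accuracy=0.89, latency_ms=45.0,  energy_j=22.0),
        Candidate("edge-offload", accuracy=0.90, latency_ms=70.0,  energy_j=30.0),
    ]
    for c in sorted(configs, key=scalarized_score, reverse=True):
        print(f"{c.name:12s} score={scalarized_score(c):+.3f}")
    print("Pareto-optimal:", [c.name for c in pareto_front(configs)])
```

Scalarization yields a single ranking once weights are fixed, while the Pareto view exposes the full set of non-dominated trade-offs; a framework of the kind the abstract proposes would typically report the latter and let deployment constraints select a point on the front.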
License
Copyright (c) 2025 International Journal of Computational and Experimental Science and Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.