Ensemble Time Series Modeling for High Precision Cloud Memory Usage Prediction

Authors

  • Aditi M Jain
  • Ayush Jain
  • Rushilkumar Patel

DOI:

https://doi.org/10.22399/ijcesen.2133

Keywords:

Cloud computing, Machine learning, Time series analysis, ARIMA, SARIMAX, Forecasting

Abstract

This study presents a novel approach to predicting cloud resource usage, focusing on memory allocation in a Google cluster environment. By combining traditional time series analysis with advanced machine learning techniques, we developed a highly accurate predictive model that significantly outperforms existing methods. Our research began with ARIMA and SARIMAX models, providing insights into temporal patterns, and progressed to more sophisticated Prophet and Random Forest models, which greatly improved predictive accuracy. The key contribution of this study is the development of an ensemble model that combines predictions from Prophet and Random Forest. This innovative approach consistently outperformed individual models, achieving a remarkable mean squared error (MSE) of 1.678e-05 and a mean absolute error (MAE) of 0.002418. These results represent a 3.5-fold improvement over the best individual model, with predictions deviating on average by less than 0.25% from actual values. Our research demonstrates the potential to revolutionize cloud resource management through highly accurate predictions. The ensemble model’s exceptional performance suggests strong potential for real-world applications, potentially leading to significant enhancements in capacity planning, resource allocation, and overall efficiency in cloud computing environments. Furthermore, the methodological framework developed in this study, which combines statistical rigor with machine learning flexibility, offers a promising approach for addressing complex forecasting challenges across various domains in cloud computing and beyond.
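The abstract does not state how the Prophet and Random Forest predictions are combined, so the following is only a minimal sketch that assumes a simple weighted average. The function names, the weight parameter, and the toy numbers are hypothetical and are not taken from the study. It shows how two forecasts whose errors point in opposite directions can yield a lower MSE and MAE once averaged.

```python
from typing import List, Sequence


def combine_forecasts(pred_a: Sequence[float],
                      pred_b: Sequence[float],
                      weight_a: float = 0.5) -> List[float]:
    """Weighted average of two equal-length forecasts (assumed combination rule)."""
    if len(pred_a) != len(pred_b):
        raise ValueError("forecasts must have the same length")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of squared differences between actual and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of absolute differences between actual and predicted values."""
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)


# Toy data (hypothetical): normalized memory usage and two model forecasts.
actual = [0.50, 0.52, 0.55, 0.53]
model_a = [0.52, 0.54, 0.57, 0.55]  # stands in for one model, over-predicts by 0.02
model_b = [0.48, 0.50, 0.53, 0.51]  # stands in for the other, under-predicts by 0.02

ensemble = combine_forecasts(model_a, model_b, weight_a=0.5)

for name, pred in [("model_a", model_a), ("model_b", model_b), ("ensemble", ensemble)]:
    print(f"{name}: MSE={mean_squared_error(actual, pred):.3e} "
          f"MAE={mean_absolute_error(actual, pred):.3e}")
```

In this toy case the two models' errors cancel, so the averaged forecast has a lower error than either input. Real forecasts would only partly cancel, and the weight would typically be tuned on held-out data.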

Published

2025-05-13

How to Cite

Jain, A. M., Jain, A., & Patel, R. (2025). Ensemble Time Series Modeling for High Precision Cloud Memory Usage Prediction. International Journal of Computational and Experimental Science and Engineering, 11(2). https://doi.org/10.22399/ijcesen.2133

Section

Research Article