Ensemble Time Series Modeling for High Precision Cloud Memory Usage Prediction
DOI:
https://doi.org/10.22399/ijcesen.2133
Keywords:
Cloud computing, Machine learning, Time series analysis, ARIMA, SARIMAX, Forecasting
Abstract
This study presents a novel approach to predicting cloud resource usage, focusing on memory allocation in a Google cluster environment. By combining traditional time series analysis with advanced machine learning techniques, we developed a highly accurate predictive model that significantly outperforms existing methods. Our research began with ARIMA and SARIMAX models, providing insights into temporal patterns, and progressed to more sophisticated Prophet and Random Forest models, which greatly improved predictive accuracy. The key contribution of this study is the development of an ensemble model that combines predictions from Prophet and Random Forest. This innovative approach consistently outperformed individual models, achieving a remarkable mean squared error (MSE) of 1.678e-05 and a mean absolute error (MAE) of 0.002418. These results represent a 3.5-fold improvement over the best individual model, with predictions deviating on average by less than 0.25% from actual values. Our research demonstrates the potential to revolutionize cloud resource management through highly accurate predictions. The ensemble model’s exceptional performance suggests strong potential for real-world applications, potentially leading to significant enhancements in capacity planning, resource allocation, and overall efficiency in cloud computing environments. Furthermore, the methodological framework developed in this study, which combines statistical rigor with machine learning flexibility, offers a promising approach for addressing complex forecasting challenges across various domains in cloud computing and beyond.
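As a rough illustration of the ensemble idea described in the abstract, the sketch below blends two forecasts and scores them with MSE and MAE. The abstract does not state how the Prophet and Random Forest predictions are combined, so the equal-weight average here is an assumption, and the names `combine`, `forecast_a`, and `forecast_b` are hypothetical. The numbers are toy values chosen for illustration, not data or results from the study.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def combine(pred_a, pred_b, weight_a=0.5):
    """Weighted average of two forecasts.

    Assumption: the paper's combination rule is not given in the abstract;
    a convex combination is used here only as a common baseline.
    """
    pred_a = np.asarray(pred_a, dtype=float)
    pred_b = np.asarray(pred_b, dtype=float)
    return weight_a * pred_a + (1.0 - weight_a) * pred_b


# Toy values only (hypothetical normalized memory usage).
actual = np.array([0.50, 0.52, 0.51, 0.53])
forecast_a = np.array([0.52, 0.51, 0.52, 0.51])  # stand-in for a Prophet forecast
forecast_b = np.array([0.49, 0.54, 0.49, 0.54])  # stand-in for a Random Forest forecast

ensemble = combine(forecast_a, forecast_b)

for name, pred in [("A", forecast_a), ("B", forecast_b), ("ensemble", ensemble)]:
    print(f"{name}: MSE={mse(actual, pred):.6f}  MAE={mae(actual, pred):.6f}")
```

In this toy case the two forecasts err in partly opposite directions, so averaging cancels some of the error. That cancellation is the usual motivation for ensembling, though the size of any gain depends on how correlated the individual models' errors are.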
Copyright (c) 2025 International Journal of Computational and Experimental Science and Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.