Benchmarking Predictive Autoscaling vs. Horizontal Scaling for Large-Scale Data Pipelines: A Real-Time FinOps Evaluation

Deepika Annam

doi:10.22399/ijcesen.4997

Keywords:

Predictive Autoscaling, Horizontal Scaling, Data Pipelines, FinOps, Cloud Resource Optimization

Abstract

Autoscaling is essential to balancing performance stability and cost on cloud-native data platforms, yet most empirical studies target stateless web services and microservice architectures rather than data engineering workloads. Common data pipelines, such as ETL flows, streaming analytics, batch processing, and Spark-based distributed computations, exhibit stateful execution, bursty demand patterns, DAG-based scheduling, and extreme sensitivity to scaling latency. Machine-learning-based predictive autoscaling methods such as ARIMA, Prophet, LSTM, and workload archetype classifiers have the potential to provision resources proactively by forecasting demand, thereby reducing SLA breaches and tail latency. However, these predictive strategies have not been rigorously benchmarked against conventional reactive horizontal autoscaling for data-centric workloads under real-world variability. At the same time, FinOps frameworks promote cloud cost visibility and efficiency, but no standardized metric exists for measuring the cost-performance trade-offs of predictive and reactive scaling schemes across different pipeline structures. This work presents a full-system evaluation framework that combines historical workload traces with synthetic replay workloads replicating enterprise data pipeline behavior, including bursty ingestion, periodic batch cycles, and multi-stage DAG executions. Predictive, reactive, and hybrid autoscaling policies are tested on Kubernetes and Spark clusters with an extensive metric suite that combines engineering performance indicators with cost-related dimensions and FinOps-oriented metrics. The resulting framework gives an objective basis for autoscaling decisions and provides practical guidelines and evaluation artifacts for organizations that want to reduce cloud spending without sacrificing pipeline stability or operational robustness.
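
To make the reactive-versus-predictive comparison concrete, the following is a minimal sketch and not the paper's benchmark harness. It replays a synthetic periodic trace, sizes replicas with a lagged reactive policy and with a seasonal-naive forecast (a deliberately simple stand-in for ARIMA, Prophet, or LSTM), and reports one performance metric alongside one cost metric. Every name and constant here (CAPACITY_PER_REPLICA, PRICE_PER_REPLICA_TICK, SCALE_LAG, PERIOD, the trace shape) is a hypothetical assumption chosen for illustration.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
CAPACITY_PER_REPLICA = 100     # work units one replica can process per tick
PRICE_PER_REPLICA_TICK = 0.05  # hypothetical cost of one replica for one tick
SCALE_LAG = 2                  # ticks between a scaling decision and usable capacity
PERIOD = 12                    # length of the repeating batch cycle, in ticks


def synthetic_trace(cycles=4):
    """Periodic demand: a low baseline with a three-tick burst in every cycle."""
    one_cycle = [80] * 5 + [400, 450, 400] + [80] * 4
    return one_cycle * cycles


def replicas_for(demand):
    """Smallest replica count (at least 1) whose capacity covers the demand."""
    return max(1, math.ceil(demand / CAPACITY_PER_REPLICA))


def reactive(trace):
    """Reactive policy: capacity at tick t reflects demand seen SCALE_LAG ticks ago."""
    return [replicas_for(trace[max(0, t - SCALE_LAG)]) for t in range(len(trace))]


def predictive(trace):
    """Seasonal-naive policy: expect the demand observed one PERIOD earlier.

    Falls back to the reactive rule during the first cycle, when no history exists.
    """
    plan = []
    for t in range(len(trace)):
        if t >= PERIOD:
            plan.append(replicas_for(trace[t - PERIOD]))
        else:
            plan.append(replicas_for(trace[max(0, t - SCALE_LAG)]))
    return plan


def evaluate(trace, plan):
    """Pair a performance indicator (violation ticks) with a cost indicator."""
    violations = sum(1 for d, r in zip(trace, plan) if d > r * CAPACITY_PER_REPLICA)
    replica_ticks = sum(plan)
    return {
        "violation_ticks": violations,
        "replica_ticks": replica_ticks,
        "cost": round(replica_ticks * PRICE_PER_REPLICA_TICK, 2),
    }


if __name__ == "__main__":
    trace = synthetic_trace()
    for name, policy in (("reactive", reactive), ("predictive", predictive)):
        print(name, evaluate(trace, policy(trace)))
```

Because the synthetic trace repeats exactly, the seasonal-naive forecast is perfect after the first cycle, so the predictive plan under-provisions on fewer ticks while consuming the same number of replica-ticks. Real traces are noisier, and forecast error can raise cost or violations, which is the kind of trade-off a combined performance and FinOps metric suite is meant to expose.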
