A theory of auto-scaling for resource reservation in cloud services
From MaRDI portal
Publication:5046013
Abstract: We consider a distributed server system consisting of a large number of servers, each with limited capacity on multiple resources (CPU, memory, disk, etc.). Jobs with different rewards arrive over time and require certain amounts of resources for the duration of their service. When a job arrives, the system must decide whether to admit it or reject it, and if admitted, in which server to schedule the job. The objective is to maximize the expected total reward received by the system. This problem is motivated by control of cloud computing clusters, in which, jobs are requests for Virtual Machines or Containers that reserve resources for various services, and rewards represent service priority of requests or price paid per time unit of service by clients. We study this problem in an asymptotic regime where the number of servers and jobs' arrival rates scale by a factor , as becomes large. We propose a resource reservation policy that asymptotically achieves at least , and under certain monotone property on jobs' rewards and resources, at least of the optimal expected reward. The policy automatically scales the number of VM slots for each job type as the demand changes, and decides in which servers the slots should be created in advance, without the knowledge of traffic rates. It effectively tracks a low-complexity greedy packing of existing jobs in the system while maintaining only a small number, , of reserved VM slots for high priority jobs that pack well.
Recommendations
- Resource allocation and routing in parallel multi-server queues with abandonments for cloud profit maximization
- Dynamic resource allocation in the cloud with near-optimal efficiency
- Choosing among heterogeneous server clouds
- Resource Allocation in Cloud Computing Via Optimal Control to Queuing Systems
- Computational Science and Its Applications – ICCSA 2004
Cites work
- scientific article; zbMATH DE number 3950178 (Why is no real title available?)
- scientific article; zbMATH DE number 2086630 (Why is no real title available?)
- scientific article; zbMATH DE number 2107164 (Why is no real title available?)
- scientific article; zbMATH DE number 2107836 (Why is no real title available?)
- A formal proof in Coq of Lasalle's invariance principle
- Adaptive Resource Provisioning for the Cloud Using Online Bin Packing
- An exact algorithm for large unbounded knapsack problems
- An infinite server system with general packing constraints
- Asymptotic analysis of single resource loss systems in heavy traffic, with applications to integrated networks
- Asymptotic optimality of a greedy randomized algorithm in a large-scale service system with general packing constraints
- Choosing among heterogeneous server clouds
- Fast Approximation Algorithms for the Knapsack and Sum of Subset Problems
- Large loss networks
- Large-scale heterogeneous service systems with general packing constraints
- Loss networks
- Online knapsack revisited
- Optimal Control and Trunk Reservation in Loss Networks
- Optimization via trunk reservation in single resource loss systems under heavy traffic
- Queueing Networks and Markov Chains
- Stochastic on-line knapsack problems
- Unbounded knapsack problem: Dynamic programming revisited
Cited in
(4)- Resource allocation and routing in parallel multi-server queues with abandonments for cloud profit maximization
- \texttt{DEPAS}: a decentralized probabilistic algorithm for auto-scaling
- SLA based resource allocation policies in autonomic environments
- Auto-scaling system in Apache spark cluster using model-based deep reinforcement learning
This page was built for publication: A theory of auto-scaling for resource reservation in cloud services
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q5046013)