Failure-aware resource management for high-availability computing clusters with distributed virtual machines
Publication:666083
DOI: 10.1016/j.jpdc.2010.01.002
zbMath: 1233.68055
OpenAlex: W1963853421
MaRDI QID: Q666083
Publication date: 7 March 2012
Published in: Journal of Parallel and Distributed Computing
Full work available at URL: https://doi.org/10.1016/j.jpdc.2010.01.002
resource management ⋮ cluster computing ⋮ system availability ⋮ component failures ⋮ distributed virtual machines ⋮ system reconfiguration
Related Items (4)
Software rejuvenation policies for cluster system ⋮ DEFT: dynamic fault-tolerant elastic scheduling for tasks with uncertain runtime in cloud ⋮ Quantifying event correlations for proactive failure management in networked computing systems ⋮ Scalable, Adaptable, and Fast Estimation of Transient Downtime in Virtual Infrastructures Using Convex Decomposition and Sample Path Randomization
Cites Work
- Implementing unreliable failure detectors with unknown membership
- The customizable fault/error model for dependable distributed systems
- A dynamic and reliability-driven scheduling algorithm for parallel real-time jobs executing on heterogeneous clusters
- Performance characteristics of the multi-zone NAS parallel benchmarks
- On the quality of service of failure detectors
- Critical path scheduling parallel programs on an unbounded number of processors