Reliability Metrics and Key Performance Indicators for Cloud-Based Virtualization Applications  
Author Xuemei Zhang


Co-Author(s) Carolyn Johnson


Abstract Cloud computing is an evolving technology that lets complex systems share computing resources. Cloud computing solutions rely on virtualized compute, memory, storage and networking resources to provide services to end users. These solutions differ from traditional ones built on dedicated hardware and software architectures. This paper discusses the challenges associated with telecommunication network virtualization. Methodologies for reliability and availability assessment of cloud-based telecom applications are proposed. Metrics such as the equivalent virtualization solution-level Mean Time Between Failures (vMTBF) are introduced. This metric can be compared directly with the field outage rate at the solution level. Field outages due to hardware failures, virtual machine failures, application software failures, database failures, network failures, and similar causes will need to be tracked. The overall outage rate can then be compared with the solution-level failure rate, which is the reciprocal of vMTBF. These failures are major factors that reduce service retainability and accessibility, two critical key performance indicators. In this paper, we illustrate how Markov models can be used to estimate the solution-level vMTBF. An example case study is used to illustrate the method. Service accessibility and retainability due to system failure are analyzed as functions of the solution-level failure rate.
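
As a rough illustration of the approach the abstract describes, the sketch below computes a solution-level vMTBF as the mean time to absorption of a small absorbing-state Markov model, then takes its reciprocal as the solution-level failure rate. The two-instance redundant configuration, the rates `lam` and `mu`, the session length, and the exponential form used for retainability are all assumptions made for illustration; none of them is taken from the paper's case study.

```python
import math


def mean_time_to_absorption(q_transient):
    """Solve (-Q_T) t = 1 by Gauss-Jordan elimination.

    q_transient is the generator matrix restricted to the transient
    (service-up) states; t[i] is the expected time to reach the
    absorbing (solution outage) state when starting from state i.
    """
    n = len(q_transient)
    # Augmented matrix [-Q_T | 1]
    a = [[-q_transient[i][j] for j in range(n)] + [1.0] for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


# Hypothetical rates per hour (assumed, not from the paper).
lam = 1e-4  # failure rate of one virtualized instance
mu = 0.5    # recovery rate of a failed instance

# Transient states: 0 = both instances up, 1 = one instance up.
# Leaving state 1 through a second failure enters the absorbing outage state.
q_t = [
    [-2 * lam, 2 * lam],
    [mu, -(mu + lam)],
]

vmtbf_hours = mean_time_to_absorption(q_t)[0]  # start with both instances up
failure_rate = 1.0 / vmtbf_hours               # solution-level failure rate

# Assumed retainability model: an established session of the given length
# survives if no solution-level failure occurs, with a constant failure rate.
session_hours = 0.05
retainability = math.exp(-failure_rate * session_hours)

print(f"vMTBF = {vmtbf_hours:.0f} h, failure rate = {failure_rate:.3e} per h")
print(f"retainability over {session_hours} h = {retainability:.12f}")
```

For this two-state model the result can be checked against the closed form (3·lam + mu) / (2·lam²); larger models with hardware, virtual machine, software, database and network failure modes would only change the matrix `q_t`.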


Keywords Virtual Function Reliability, Cloud Based Solutions, Virtualization Solution-level MTBF, Absorbing-state Markov models, Service Retainability, Service Accessibility
Article #: 2101
Proceedings of the 21st ISSAT International Conference on Reliability and Quality in Design
August 6-8, 2015 - Philadelphia, Pennsylvania, U.S.A.