Virtualization is fast emerging as a game-changing technology in the enterprise computing space. What was once viewed as a technology useful for testing and development is going mainstream and is affecting the entire data-center ecosystem. Over the course of the next few weeks, PuneTech is going to run a series of articles on virtualization from experts in the industry. This article, the first in the series, gives an introduction to server virtualization, and has been written by Anurag Agarwal and Anand Mitra, founders of KQ Infotech.
What is Virtualization?
Virtualization is essentially an abstraction of computing resources, and such abstractions come in many forms. Files abstract disk blocks into a linear space. Storage virtualization products, such as logical volume managers, present multiple storage devices as a single device, or a single device as several.
Processes are also a form of virtualization. A process gives the programmer the illusion that she has the entire address space at her disposal and exclusive control of the hardware resources. The OS multiplexes these resources among all the processes on the system, transparently to each process. This concept has been universally adopted.
All multi-programming operating systems execute instructions in at least two privilege levels: unprivileged for user programs, and privileged for the operating system. User programs use “system calls” to request that the operating system perform privileged operations on their behalf. The interface consisting of the unprivileged instruction set and the set of system calls defines an “extended machine”, which is easier to program than the bare machine and makes user programs more portable.
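The split between the two privilege levels, with the system call as the only gate between them, can be illustrated with a toy model. This is a minimal sketch and not how any real kernel is built: the names `ToyKernel`, `syscall`, `PrivilegeError` and the `write_block` call are all hypothetical, chosen only for illustration.

```python
class PrivilegeError(Exception):
    """Raised when unprivileged code attempts a privileged operation."""


class ToyKernel:
    """Toy model of a kernel guarding a privileged device (hypothetical)."""

    def __init__(self):
        self._privileged = False   # current privilege level of the "CPU"
        self._disk = {}            # stands in for hardware only the kernel may touch

    def _write_block(self, block_no, data):
        # A "privileged instruction": only legal while in privileged mode.
        if not self._privileged:
            raise PrivilegeError("privileged operation attempted in user mode")
        self._disk[block_no] = data

    def syscall(self, name, *args):
        # The system call is the only gate from user mode into privileged mode.
        handlers = {"write_block": self._write_block}
        if name not in handlers:
            raise ValueError(f"unknown system call: {name}")
        self._privileged = True
        try:
            return handlers[name](*args)
        finally:
            self._privileged = False   # always drop back to user mode


kernel = ToyKernel()
kernel.syscall("write_block", 7, b"hello")   # allowed: goes through the kernel

try:
    kernel._write_block(8, b"oops")          # direct access from "user mode"
    direct_access_allowed = True
except PrivilegeError:
    direct_access_allowed = False
```

In this model the user program sees only `syscall`, which plays the role of the "extended machine" interface; the direct write is refused because it bypasses the gate.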
Having the kernel wrap completely around the hardware, without exposing it to the upper layers, has its advantages. But in this model, only one operating system can run at a given time, and one cannot perform any activity that would disrupt the running system (for example, an upgrade, migration, or system debugging).
A virtual machine provides an abstraction of a complete physical machine. This is also known as server virtualization. The basic idea is to run more than one operating system on a single server at the same time.
The History of Server Virtualization
In 1964, IBM developed a Virtual Machine Monitor (CP) to run its various OSes on its mainframes. Hardware was too expensive to leave underutilized. IBM addressed many of the performance challenges inherent in virtualization by designing hardware amenable to virtualization. However, with the advent of cheap computing resources and the proliferation of commodity hardware, virtualization fell out of favor and came to be viewed as an artifact of an era in which computing resources were scarce. This was reflected in the design of the x86 architecture, which did not provide enough support to implement virtualization efficiently.
With the cost of hardware going down and the complexity of software increasing, a large number of administrators started putting one application per server. This gave them isolation, where one application does not interfere with another. Over time, however, it resulted in a problem called server sprawl: there are too many underutilized servers in data centers. Most Windows servers have average utilization between 5% and 15%. This utilization rate will go down further as dual-core and quad-core processors become common. In addition to the cost of the hardware, there are also power and cooling requirements for all these servers. The earlier problem of underutilized hardware resources has started surfacing again.
Ironically, the very reason that led to the demise of virtualization in the mainstream was also the cause of its resurrection. The features which made the OSes attractive also made them more fragile. This renewed interest in virtualization resulted in VMware providing a server virtualization solution for x86 machines in 1999. Server consolidation has increased server utilization to the 60% to 80% level, which has resulted in a 5- to 15-fold reduction in the number of servers.
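The size of that reduction follows from simple arithmetic on the utilization figures quoted above. The sketch below is a back-of-the-envelope estimate only: it assumes identical servers, workloads whose utilization simply adds up, and no virtualization overhead. The function name `consolidation_ratio` is made up for this illustration.

```python
def consolidation_ratio(current_pct, target_pct):
    """Estimate how many lightly loaded servers fit on one consolidated host.

    Both arguments are average utilization percentages. This is a naive
    estimate that ignores virtualization overhead and peak loads.
    """
    if not 0 < current_pct <= target_pct <= 100:
        raise ValueError("expected 0 < current_pct <= target_pct <= 100")
    return target_pct / current_pct


# Combine the extremes of the quoted ranges: 5%-15% before, 60%-80% after.
for current in (5, 15):
    for target in (60, 80):
        ratio = consolidation_ratio(current, target)
        print(f"{current}% -> {target}%: roughly {ratio:.1f}x fewer servers")
```

Under these assumptions the estimate ranges from 4x (15% to 60%) to 16x (5% to 80%), which is in line with the reduction reported above.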
Virtual machines have introduced a whole new way of looking at operating systems. Traditionally, operating systems were coupled with physical machines and needed to know all the peculiarities of the hardware. Once the hardware became obsolete, the operating system became obsolete too. Virtual machines have changed that notion: they decouple the operating system from the hardware by introducing a virtualization layer called the virtual machine monitor (VMM).
Types of Virtualization Architectures
There are many VMM architectures.
Full emulation: This is the oldest virtualization technique in use. An emulator is a software layer that tracks the memory and CPU state of the machine being emulated and interprets each instruction, applying the effect it would have to the virtual state of the machine it has constructed. In a regular server, machine instructions are executed directly by the CPU and memory is manipulated directly. In full emulation, the instructions are handed over to the emulator, which converts them into a (possibly different) set of instructions to be executed on the underlying physical machine. Full emulation is routinely used during the development of software for new hardware that might not be available yet. Virtualization can be considered a special case of emulation in which the emulated machine and the host are similar, which allows unprivileged instructions to be executed natively. QEMU falls in this category.
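The core of an emulator is a loop that fetches an instruction, decodes it, and applies its effect to the virtual machine state. The following is a minimal sketch for a made-up two-register machine; the instruction set (`LOADI`, `ADD`, `ADDI`, `STORE`, `JNZ`, `HALT`) and the function name `emulate` are invented for illustration and do not correspond to any real architecture or to how QEMU is implemented.

```python
def emulate(program, max_steps=10_000):
    """Interpret `program` on a made-up two-register machine; return final state."""
    # The emulator owns the entire virtual state: program counter, registers, memory.
    state = {"pc": 0, "regs": {"r0": 0, "r1": 0}, "mem": [0] * 16}
    for _ in range(max_steps):
        op, *args = program[state["pc"]]    # fetch and decode
        state["pc"] += 1
        if op == "LOADI":                   # LOADI reg, constant
            state["regs"][args[0]] = args[1]
        elif op == "ADD":                   # ADD dst, src   (dst += src)
            state["regs"][args[0]] += state["regs"][args[1]]
        elif op == "ADDI":                  # ADDI reg, constant
            state["regs"][args[0]] += args[1]
        elif op == "STORE":                 # STORE reg, address
            state["mem"][args[1]] = state["regs"][args[0]]
        elif op == "JNZ":                   # JNZ reg, target: jump if reg != 0
            if state["regs"][args[0]] != 0:
                state["pc"] = args[1]
        elif op == "HALT":
            return state
        else:
            raise ValueError(f"unknown instruction: {op}")
    raise RuntimeError("step limit reached without HALT")


# Guest program: compute 3 + 2 + 1 by counting r1 down to zero.
program = [
    ("LOADI", "r0", 0),     # 0: accumulator
    ("LOADI", "r1", 3),     # 1: counter
    ("ADD", "r0", "r1"),    # 2: r0 += r1
    ("ADDI", "r1", -1),     # 3: r1 -= 1
    ("JNZ", "r1", 2),       # 4: loop while r1 != 0
    ("STORE", "r0", 0),     # 5: mem[0] = r0
    ("HALT",),              # 6
]
final = emulate(program)
```

Every guest instruction here costs several host operations, which is why pure interpretation is slow and why executing unprivileged instructions natively matters when guest and host are similar.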
Hosted: In this approach, a traditional operating system (Windows or Linux) runs directly on the hardware. This is called the host OS. The VMM is installed as a service in the host OS, and it creates and manages multiple virtual machines as processes. Each virtual machine process has a full operating system inside it, called a guest OS. This approach greatly simplifies the design of the VMM, as it can directly use the services provided by the host operating system. VMware Server, VMware Workstation, VirtualBox, and KVM fall in this category.
Hypervisor based: Hosted VMM solutions have a high overhead, as the VMM does not directly control the hardware. In the hypervisor approach the VMM is directly installed on the hardware. The VMM provides virtual hardware abstractions to create and manage multiple virtual machines. Performance overhead in this approach is very small.
Another way to classify virtual machines is on the basis of how privileged instructions are handled. Modern processors have a privileged mode of execution that the OS kernel executes in, and a non-privileged mode that user programs execute in. This can cause a problem for virtual machines because, although the host OS (or the hypervisor) runs in privileged mode, the entire guest OS runs in non-privileged mode. Most of today’s OSes are specifically designed to run in privileged mode, and hence their binaries contain some instructions that must be run in privileged mode. (For example, there are 17 such instructions in the Intel IA-32 architecture.) There are two major approaches to handling this problem.
Para-virtualization: In this approach, the binary of the OS is rewritten statically to replace the use of privileged instructions with appropriate calls into the hypervisor. In other words, the operating system is ported to the virtual hardware abstraction provided by the VMM, which requires changes in the operating system code. This approach has the least performance penalty. This is the approach taken by Xen.
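The idea of porting a kernel so that it calls the hypervisor instead of issuing privileged instructions can be sketched as follows. This is a toy model under stated assumptions: the classes `Hypervisor` and `ParavirtKernel`, the `set_page_table` hypercall, and the 4096-byte alignment check are all hypothetical and are not Xen's actual hypercall interface.

```python
class Hypervisor:
    """Toy hypervisor exposing a hypercall interface (hypothetical names)."""

    def __init__(self):
        self.page_tables = {}   # per-guest page table base, owned by the VMM
        self.hypercalls = 0     # how many times guests called in

    def hypercall(self, guest_id, name, *args):
        self.hypercalls += 1
        if name == "set_page_table":
            (base,) = args
            # The hypervisor can validate a request before acting on it.
            if base % 4096 != 0:
                raise ValueError("page table base must be page aligned")
            self.page_tables[guest_id] = base
            return
        raise ValueError(f"unknown hypercall: {name}")


class ParavirtKernel:
    """Guest kernel whose source has been ported to the hypervisor interface."""

    def __init__(self, guest_id, hypervisor):
        self.guest_id = guest_id
        self.hypervisor = hypervisor

    def switch_address_space(self, base):
        # A native kernel would load the page table base register here with a
        # privileged instruction. The ported kernel asks the hypervisor instead.
        self.hypervisor.hypercall(self.guest_id, "set_page_table", base)


hv = Hypervisor()
guest = ParavirtKernel("guest-1", hv)
guest.switch_address_space(0x2000)
```

Because the guest calls the hypervisor explicitly, nothing has to be detected or intercepted at run time, which is where the low overhead of this approach comes from. The cost is that the guest's source code must be changed.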
Full virtualization: In this approach, no change is made in the operating system code. There are two ways of supporting this.
- Using runtime emulation of the privileged instructions. The VMM monitors program execution at runtime and takes over control of execution whenever a privileged instruction arises in the guest OS. This approach is called binary translation. VMware uses this technology.
- Hardware assisted virtualization: Both Intel and AMD have come up with extensions to their hardware to support virtualization. Intel calls this VT technology and AMD calls it SVM technology. These extensions provide an extra privilege level for the VMM to run in, along with a number of additional features, like nested page tables and IOMMU, to make virtualization more efficient.
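The first of the two approaches above, where the guest is left unmodified and the VMM handles its privileged instructions in software, can be sketched as a small translation pass. This is an illustrative toy, not VMware's implementation: the instruction names (`ADD`, `CLI`, `STI`), the `VirtualCPU` class, and the helper functions are all assumptions made for this example.

```python
PRIVILEGED = {"CLI", "STI"}   # assumed subset: disable / enable interrupts


class VirtualCPU:
    """Per-guest virtual state maintained by the VMM (hypothetical)."""

    def __init__(self):
        self.acc = 0
        self.interrupts_enabled = True
        self.emulated = 0   # privileged instructions handled by the VMM


def _emulate_privileged(op):
    def run(vcpu):
        # The VMM updates the guest's virtual state, never the real hardware.
        vcpu.interrupts_enabled = (op == "STI")
        vcpu.emulated += 1
    return run


def _native_add(value):
    def run(vcpu):
        vcpu.acc += value   # stands in for running the instruction natively
    return run


def translate(guest_code):
    """Translate unmodified guest instructions into steps the VMM can run."""
    translated = []
    for op, *args in guest_code:
        if op in PRIVILEGED:
            translated.append(_emulate_privileged(op))   # replaced by a VMM routine
        elif op == "ADD":
            translated.append(_native_add(args[0]))      # passed through unchanged
        else:
            raise ValueError(f"unknown instruction: {op}")
    return translated


def run_guest(guest_code):
    vcpu = VirtualCPU()
    for step in translate(guest_code):
        step(vcpu)
    return vcpu


guest_code = [("ADD", 2), ("CLI",), ("ADD", 3), ("STI",), ("CLI",)]
vcpu = run_guest(guest_code)
```

The guest code is never edited by hand. Only the privileged instructions are redirected to the VMM, and they act on the guest's virtual interrupt flag rather than on the real one.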
VMware: VMware has a suite of products in this area. There are two hosted products, called VMware Workstation and VMware Server. Their hypervisor product is called VMware ESX. One version of ESX, called VMware ESXi, comes burned into the BIOS. They have VirtualCenter as a management product to manage the complete virtual machine infrastructure in the data center. All of these products are based on dynamic binary translation technology. They support various flavors of Windows and Linux.
Xen: It is an open source project, based on para-virtualization and hypervisor technologies. Linux is modified to support para-virtualization. Xen now also supports Windows through hardware assisted virtualization. There are a number of products based on Xen: Citrix, which bought XenSource, has a couple of Xen-based products, Sun has xVM, and Oracle has Oracle VM. Red Hat and SUSE have been shipping Xen as part of their Linux distributions for some time.
Hyper-V: This is Microsoft’s entry in this space. It is similar to the Xen architecture. It also requires hardware assistance. It comes bundled with Windows Server 2008 and supports running Windows and Linux guest operating systems in the virtual machines.
Advantages of Virtualization
Virtualization has also provided some new capabilities. Server provisioning becomes very easy: it is just a matter of creating and managing a virtual machine. This has transformed the way testing and development are done. Another interesting feature is VMotion, or live migration, where a running virtual machine can be moved from one physical machine to another. Execution of the virtual machine is briefly suspended, and the entire image of the virtual machine is moved to a different machine. Execution can then be resumed, and the guest OS continues from exactly the point where it was suspended. This eliminates the need for downtime, even for things like hardware maintenance. It also enables dynamic resource management, or utility computing.
Adoption of server virtualization has been phenomenal. There are already hundreds of thousands of servers running virtual machines. Initial adoption of virtual machines was restricted to test and development, but the technology has now matured enough to become quite popular in production too.
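The suspend, move, and resume sequence described above can be sketched with a toy virtual machine whose whole image is a program counter plus some memory. This is a simplified illustration only: the `VirtualMachine` and `Host` classes and the `migrate` function are hypothetical. Real implementations generally copy most of the memory while the guest is still running and pause only for the final transfer, and that refinement is omitted here.

```python
import copy


class VirtualMachine:
    """Toy guest: its whole image is a program counter plus memory."""

    def __init__(self, name):
        self.name = name
        self.pc = 0
        self.memory = []
        self.running = True

    def step(self):
        if not self.running:
            raise RuntimeError("VM is suspended")
        self.memory.append(self.pc * self.pc)   # some guest work
        self.pc += 1


class Host:
    """Toy physical machine holding the VMs it currently runs."""

    def __init__(self, name):
        self.name = name
        self.vms = {}


def migrate(vm_name, source, target):
    """Suspend the VM, copy its image to the target host, and resume it there."""
    vm = source.vms[vm_name]
    vm.running = False              # 1. briefly suspend execution
    image = copy.deepcopy(vm)       # 2. copy the entire VM image
    target.vms[vm_name] = image
    del source.vms[vm_name]         # 3. the source host gives up the VM
    image.running = True            # 4. resume exactly where it stopped
    return image


host_a, host_b = Host("A"), Host("B")
vm = VirtualMachine("web")
host_a.vms["web"] = vm
for _ in range(3):
    vm.step()

moved = migrate("web", host_a, host_b)
moved.step()   # the guest carries on from its saved program counter
```

The guest's state after the move is exactly what it was at the moment of suspension, so from the guest's point of view nothing happened except a short pause.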
About the Authors
Anurag Agarwal has more than 11 years of industry experience in both India and the US. Prior to founding KQInfotech, he was a technical director at Symantec India. Anurag designed and developed various products at Symantec (earlier Veritas). During 2006-2007, Anurag conceived the idea of Software Fault Tolerance for Xen at Symantec. He was awarded the highest technical award, that of outstanding innovator, in 2006 for this invention. Anurag built and led a team of ten people in India to take it from the idea stage to a product.
During the same time, Anurag started working with the College of Engineering, Pune, where he and his friends offered a full-semester course on the Linux kernel. Anurag was also involved in mentoring a large number of students from various engineering colleges. This involvement in teaching and mentoring students resulted in the formation of KQInfotech, with a focus on training and mentoring. Prior to this, Anurag architected a scalable transaction system for the cluster file system at Symantec in the USA. This architecture improved the scalability of the cluster file system from three nodes to sixteen nodes and beyond. He was awarded a star award for this work. He has filed half a dozen patents at Symantec. Anurag has extensive knowledge of Solaris, the Linux kernel, file systems, storage technologies and virtualization. He has an ME from the Indian Institute of Science, Bangalore and a BE from MBM Engineering College, Jodhpur.
After completing his post-graduation (IITB) in 2001, Anand worked with Symantec India (formerly Veritas). Prior to founding KQInfotech, he was a Principal Software Engineer at Symantec, chartered with the task of scoping and designing support for Windows on Xen-based Fault Tolerance. He worked for 6.5 years on the clustered file system products VxFS and CFS. He architected the online upgrade for the Veritas File System and designed the write fastpath, which improved the performance of the file system. He also designed the integration of the POWER6 (PowerPC) CPU feature of storage keys into the Veritas storage stack. He co-maintained technical relations with IBM for special proprietary kernel interfaces within AIX and designed a file system pre-allocation API for the IBM DB2 database.