The risks of virtualisation in public and private clouds

Server virtualisation is one of the cornerstone technologies of the cloud. The ability to create multiple virtual servers that can run any operating system on a single physical server has a lot of advantages.

These advantages apply to public Infrastructure as a Service (IaaS) clouds as well as to private clouds.

However, virtualisation also brings some risks.

Virtualisation reduces the number of physical servers required for a given workload, which brings cost benefits. It also allows for more flexible sizing of compute resources such as CPU and memory. This in turn tends to speed up development projects, even without automatic provisioning. Virtualisation can even increase the security of IT, because it makes it easier to set up the right network access controls between machines.

So in order to get the real benefits, you need to steer clear of the risks. A fairly extensive overview of these risks was written by the US National Institute of Standards and Technology (NIST) and published as Special Publication 800-125. This article is partly based on that publication.

Let us first get some of the important concepts straight. The host is the machine on which the hypervisor runs. The hypervisor is the piece of software that does the actual virtualisation. The guests, then, are the virtual machines on top of the hypervisor, each of which runs its own operating system.

The hypervisors are controlled through what is called the ‘management plane’, which is a web console or similar that can remotely manage the hypervisors. This is a good deal more convenient than walking up to the individual servers. It is also a lot more convenient for remote hackers. So it is good practice to control access to the management plane. That might involve using two-factor authentication (such as smart cards), and only giving individual administrators the access that is needed for their task.
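As a rough illustration of those two controls, the sketch below refuses any management action unless the session has passed a second factor and one of the administrator's roles grants that specific action. The role names, action names and the Session type are hypothetical; real management consoles define their own.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission mapping, invented for this example.
ROLE_PERMISSIONS = {
    "vm-operator": {"vm.start", "vm.stop"},
    "image-admin": {"image.create", "image.delete"},
    "host-admin": {"host.patch", "vm.start", "vm.stop", "vm.migrate"},
}


@dataclass
class Session:
    user: str
    roles: set = field(default_factory=set)
    second_factor_verified: bool = False


def is_allowed(session: Session, action: str) -> bool:
    """Allow an action only after two-factor login and only if a role grants it."""
    if not session.second_factor_verified:
        return False
    return any(
        action in ROLE_PERMISSIONS.get(role, set())
        for role in session.roles
    )
```

With this layout, an operator who can start and stop guests cannot patch a host, and a host administrator who skipped the second factor cannot do anything at all.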

An often-mentioned risk of virtualisation is the so-called ‘guest escape’, where one virtual machine accesses or breaks into its neighbour on the same physical host. This could happen through a buggy hypervisor or insecure network connections on the host machine. The hypervisor is a piece of software like any other software. In fact, it is often based on a scaled-down version of Linux, and any Linux vulnerability could affect the hypervisor. Moreover, if you control the hypervisor, you control not just one system; you can control the entire cloud system. So it is of the highest importance that you are absolutely certain that you run the right version of the hypervisor. You should be very sure of where it came from (its provenance), and you should be able to patch or update it immediately.
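One simple provenance check is to compare the checksum of the hypervisor installer you downloaded with the checksum the supplier publishes. The sketch below assumes the supplier publishes a SHA-256 value in hexadecimal; the function names are made up for this example, and a checksum alone does not replace verifying a vendor signature where one is offered.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the published one (assumed SHA-256 hex)."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

If the function returns False, the installer should not go anywhere near a host until the mismatch is explained.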

Network Design

Related to this is the need for good network design. The network should allow real isolation of every guest, so that no guest is able to see traffic from other guests or traffic to the host.

An intrinsic risk of server virtualisation is so-called ‘resource abuse’, where one guest (or tenant) overuses the physical resources, thereby starving the other guests of the resources required to run their workloads. This is also called the ‘noisy neighbour’ problem. Addressing it can require a number of things. The hypervisor might be able to limit overuse by a guest, but in the end, somebody should be thinking about how to avoid putting too many guests on a single host. That is a tough balance to strike: too few guests means you are not saving enough money, too many guests means you risk performance issues.
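That balance can be made explicit as a placement rule. The sketch below checks whether one more guest fits on a host without exceeding chosen overcommit ratios for CPU and memory. The Guest type, the function name and the default ratios (4 virtual CPUs per physical core, no memory overcommit) are illustrative assumptions, not recommendations; suitable values depend on the workloads.

```python
from dataclasses import dataclass


@dataclass
class Guest:
    name: str
    vcpus: int
    memory_gb: int


def can_place(host_cores, host_memory_gb, guests, candidate,
              max_cpu_overcommit=4.0, max_memory_overcommit=1.0):
    """Return True if adding the candidate keeps the host within both ratios."""
    total_vcpus = sum(g.vcpus for g in guests) + candidate.vcpus
    total_memory = sum(g.memory_gb for g in guests) + candidate.memory_gb
    cpu_ok = total_vcpus <= host_cores * max_cpu_overcommit
    memory_ok = total_memory <= host_memory_gb * max_memory_overcommit
    return cpu_ok and memory_ok
```

Raising the ratios saves money, lowering them protects performance; the point is that the trade-off becomes a number somebody has decided on rather than an accident.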

In the real world, there are typically a lot of virtual servers that are identical. They run from the same ‘image’, and each virtual server is then called an instance of that image, or instance for short.

With virtual servers it also becomes easy to clone, snapshot, replicate, start and stop images. This has advantages, but it also creates new risks. It can lead to an enormous sprawl or proliferation of server images that need to be stored somewhere. This can become hard to manage and represents a security risk. For example, when a dormant image is restarted after a long time, how do you know that it is still up to date and patched? I heard a firsthand story of an image that got rootkitted by a hacker.

So the least you should do is run your anti-malware, virus and version checks on images that are not in use as well. Even when you work with a public IaaS provider, you are still responsible for patching the guest images.
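A small inventory check along these lines can flag images that have gone too long without patching, whether or not they are running. The ImageRecord type, its fields and the 30-day threshold are assumptions made for this sketch; a real inventory would come from your own image store or your provider's tooling.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ImageRecord:
    name: str
    last_patched: date
    in_use: bool


def stale_images(images, today, max_age=timedelta(days=30)):
    """Return names of images patched longer ago than max_age.

    Dormant images are deliberately checked the same way as running ones.
    """
    return [img.name for img in images if today - img.last_patched > max_age]
```

Running it on a schedule gives you a list of images to patch or retire before anyone restarts them.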

In summary, server virtualisation brings new power to IT professionals. But as the saying goes, with great power comes great responsibility.
