
Cloud cost management: Maximising cloud spend during Covid-19

(Image credit: Melpomene / Shutterstock)

Enterprises around the world are set to spend a whopping $266 billion on cloud computing infrastructure services in 2020, according to a report from Gartner. This is not surprising given that public cloud usage has seen a massive increase in adoption over the last few years. However, IT teams are having a difficult time predicting costs and often don’t have any internal policies around cloud spending. Cost management is typically the first issue organisations want to tackle when they begin a cloud transformation journey. However, cloud cost optimisation is not a one-time fix. The most effective cloud cost management strategy requires vigilance and a commitment to continual review as things change quickly and constantly. In this article, we’ll discuss how to manage cloud expenses, and take a look at the different cloud service types to gain a better understanding of all the expenses there are to manage with each particular cloud offering.

Cost considerations with the private cloud

Day-two operations are a particular concern with a private cloud, especially if it is cobbled together from a lot of different components from different vendors. Keeping all of that running smoothly can be quite a challenge, and the cost of downtime must be considered.

When building a private cloud, there are two main considerations. The first consideration is what the true cost will be to get things up and running. When starting from scratch, there are professional services involved, architectural discussions, tools to be evaluated, open source technologies to be considered, scale issues, performance issues, etc. A lot of the initial implementation of a private cloud can take a significant amount of time and resources.

The second consideration is operations. With operations, there needs to be a high level of governance because of utilisation issues, and the cost of ensuring that you have the right team in place to keep things running smoothly in an on-prem cloud environment is very high.

A private cloud is a shared platform, but in large enterprises, people tend to carve out their own resources and keep them to themselves. In the old days, this type of mentality made sense because it was very hard to retrieve these resources once they were given up. Unfortunately, that kind of attitude often persists today, which is something that enterprises need to keep an eye out for.

Everything that a public cloud does behind the scenes, all of the heavy lifting of day-two operations and so on, an organisation is now responsible for with the private cloud. This means having to build an experienced team that knows how to run the cloud, maintain SLAs, and maintain an experience for the end-users that is seamless. Essentially, making sure there is the same level of service and experience that they would normally receive with a public cloud.

The main cost in the private cloud is hiring the right people to manage it at scale. Highly skilled people who work with cloud technologies are hard to come by right now, and they are even harder to retain, because the market gives them the flexibility to pursue the opportunities that interest them the most or offer the best incentives.

Cost considerations with the hybrid cloud (private and public)

A classic example of a hybrid cloud use case is an e-commerce company. When the holidays come around, they have a huge surge in site traffic and users. In fact, many e-commerce companies make around 80 per cent of their revenues during the holiday season. They have huge volumes that they have to deal with at a time when all of their systems need to be running flawlessly. However, many companies don’t have the infrastructure capable of handling this increase in volume.

Let's say an e-commerce company is using an on-prem data centre. From February to October, the demand is 20 per cent, and then the peak is at 80 per cent from November to December, but they only have infrastructure that deals with 20 per cent. What happens when the peak loads come in?

What usually happens is they end up buying a lot of extra capacity and keep it lying around idle for 80 per cent of the year. Then when the holiday season comes, they'll activate that extra infrastructure so they can service the demand that's coming in, which is a highly inefficient solution.

This is where the public cloud comes into play. For 80 per cent of the time, they can be using the private cloud, and then when the holiday season hits, they can turn on the public cloud temporarily for two months in order to service that extra load. Once the holidays are over, they shut everything down in the public cloud and move back to their private cloud.
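To make the comparison concrete, here is a rough sketch of the arithmetic in Python. The unit prices (100 per on-prem capacity unit per month, 150 per public cloud unit per month) and the function names are illustrative assumptions, not real vendor figures; only the 20 and 80 capacity levels and the two-month peak come from the example above.

```python
def owned_peak_cost(peak_units: int, on_prem_unit_month: int) -> int:
    """Yearly cost of owning enough on-prem capacity to cover the peak."""
    return peak_units * on_prem_unit_month * 12


def hybrid_cost(base_units: int, peak_units: int, on_prem_unit_month: int,
                public_unit_month: int, peak_months: int) -> int:
    """Yearly cost of owning base capacity and renting the rest for the peak."""
    base = base_units * on_prem_unit_month * 12
    burst = (peak_units - base_units) * public_unit_month * peak_months
    return base + burst


# Hypothetical prices: 100 per on-prem unit per month, 150 per public cloud unit.
owned = owned_peak_cost(peak_units=80, on_prem_unit_month=100)
hybrid = hybrid_cost(base_units=20, peak_units=80, on_prem_unit_month=100,
                     public_unit_month=150, peak_months=2)
print(owned, hybrid)  # 96000 42000
```

Even with the public cloud unit priced 50 per cent higher than on-prem in this made-up scenario, paying for it only during the two peak months costs far less than keeping peak capacity idle for the rest of the year.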

What about a company that might have demand that's unpredictable? Sometimes they may have peaks that happen towards the end of the month, and then the rest of the time there's not much going on. In all of these cases, it’s useful to have a combination of public and private clouds.

Now, how are costs optimised? Capacity needs to be on-demand, meaning there needs to be an auto-scaling strategy and a bursting strategy out to a public cloud. This in turn implies that the application is designed to be stateless and highly fault-tolerant.

Whenever demand exceeds capacity, the workloads automatically burst out to the public cloud and run there. Once that demand is over, the automated mechanisms shut everything down and bring the workloads back on-prem. Trying to do this manually will result in all the public cloud problems of ungoverned, underutilised resources. As long as the VM is running, the clock is ticking, and the bills are going up. So, the faster it is to shut things off and on, the more cost-efficient an organisation will be.
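The bursting rule itself is simple enough to sketch in a few lines of Python. This is only an illustration of the decision, assuming demand and capacity are measured in the same abstract "workload" units; the names `Placement`, `place_workloads` and `burst_bill` are hypothetical, and a real setup would drive an auto-scaler rather than a list of numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    on_prem: int        # workloads kept in the private cloud
    public_cloud: int   # workloads burst out to the public cloud


def place_workloads(demand: int, on_prem_capacity: int) -> Placement:
    """Fill on-prem capacity first and burst only the overflow."""
    if demand < 0 or on_prem_capacity < 0:
        raise ValueError("demand and capacity must be non-negative")
    on_prem = min(demand, on_prem_capacity)
    return Placement(on_prem=on_prem, public_cloud=demand - on_prem)


def burst_bill(placements: list[Placement], hourly_rate: float) -> float:
    """Public cloud bill, assuming each placement lasts exactly one hour."""
    return sum(p.public_cloud for p in placements) * hourly_rate


# Hypothetical hourly demand against an on-prem capacity of 20 workloads.
hourly_demand = [15, 20, 50, 80, 30, 10]
plan = [place_workloads(d, on_prem_capacity=20) for d in hourly_demand]
print([p.public_cloud for p in plan])  # [0, 0, 30, 60, 10, 0]
```

Because the placement is recomputed every hour, the public cloud share drops back to zero as soon as demand falls, which is exactly the behaviour that keeps the bill from running on.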

Then there's the concept of spot instances, which can reduce costs in the public cloud. Spot instances can be 70 to 80 per cent cheaper than regular instances, so non-critical workloads, such as a job that runs overnight, can use them to reduce costs significantly. The trade-off is that, while cloud providers guarantee performance and reliability, they don't guarantee capacity or availability as they would with other instances. In other words, you may not get spot instances when you ask for them or they may not stay up once you get them.

Using spot instances is a supply-demand situation, almost like the stock market. The instance prices vary significantly over time, so it pays to optimise when to use them. A hybrid cloud makes this possible: when it makes sense, burst out to the public cloud on spot instances, which can be a huge cost saver.
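One way to picture the "when it makes sense" decision is a simple price threshold. The sketch below is an assumption-laden illustration rather than any provider's API: the function names, the prices and the 50 per cent threshold are all hypothetical, and it ignores the cost of rerunning work if a spot instance is reclaimed.

```python
def choose_instance(spot_price: float, on_demand_price: float,
                    critical: bool, max_spot_ratio: float = 0.5) -> str:
    """Pick 'spot' only for non-critical work when spot is cheap enough."""
    if critical:
        # Spot capacity can disappear, so critical workloads stay on-demand.
        return "on-demand"
    if spot_price <= max_spot_ratio * on_demand_price:
        return "spot"
    return "on-demand"


def spot_saving_percent(spot_price: float, on_demand_price: float) -> float:
    """Saving of spot over on-demand, as a percentage of the on-demand price."""
    return (1 - spot_price / on_demand_price) * 100


# Hypothetical hourly prices for the same instance size.
print(choose_instance(spot_price=0.03, on_demand_price=0.10, critical=False))
print(choose_instance(spot_price=0.03, on_demand_price=0.10, critical=True))
print(choose_instance(spot_price=0.09, on_demand_price=0.10, critical=False))
```

With these made-up prices, the first call picks spot, while the other two fall back to on-demand: one because the workload is critical, the other because the spot price has risen too close to the regular price.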

Cost considerations with multiple public clouds

Upstream container standards like Docker or Kubernetes (which are designed to be portable from the ground up) make it possible to package up an application and run it on a public cloud. If that doesn’t work, or if it's too expensive, it’s easy to take that container and run it on another public cloud because of the portability. Using containers and open-source standards like Kubernetes helps to provide a single fabric that allows organisations to be agnostic to the underlying public cloud and to take advantage of multi-cloud scenarios.

When using multiple clouds, each of the cloud vendors has its own mechanisms, dashboards, visibility tools, and governance tools. The challenge here is trying to collate all that information into a single pane of glass.

When in a multi-cloud scenario, it is important to have a central management approach that gives the costs, capacity, and utilisation of resources in each cloud at a high level. Teams need to be able to manage costs and cost per unit across multiple clouds. Try using spot instances to cut costs in any single cloud or move workloads between different clouds.
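As a rough illustration of what such a central view computes, the Python sketch below rolls hypothetical usage records up into cost, utilisation and cost per used unit for each cloud. The record fields, the cloud names and the figures are assumptions made for the example; in practice the data would come from each vendor's billing export.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class UsageRecord:
    cloud: str
    cost: float
    provisioned_units: float
    used_units: float


@dataclass
class CloudSummary:
    cost: float
    utilisation: float                   # used / provisioned, 0.0 to 1.0
    cost_per_used_unit: Optional[float]  # None when nothing was used


def summarise(records: list[UsageRecord]) -> dict[str, CloudSummary]:
    """Aggregate cost and usage per cloud into one comparable view."""
    totals = defaultdict(lambda: [0.0, 0.0, 0.0])  # cost, provisioned, used
    for r in records:
        t = totals[r.cloud]
        t[0] += r.cost
        t[1] += r.provisioned_units
        t[2] += r.used_units
    summary = {}
    for cloud, (cost, provisioned, used) in totals.items():
        summary[cloud] = CloudSummary(
            cost=cost,
            utilisation=used / provisioned if provisioned else 0.0,
            cost_per_used_unit=cost / used if used else None,
        )
    return summary


# Hypothetical monthly records from two clouds.
records = [
    UsageRecord("cloud-a", cost=600, provisioned_units=60, used_units=30),
    UsageRecord("cloud-a", cost=400, provisioned_units=40, used_units=20),
    UsageRecord("cloud-b", cost=600, provisioned_units=40, used_units=30),
]
for cloud, s in summarise(records).items():
    print(cloud, s.cost, s.utilisation, s.cost_per_used_unit)
```

In this invented data set both clouds cost the same per used unit, but cloud-a runs at half the utilisation of what it has provisioned, which is the kind of difference a single pane of glass is meant to surface.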

Moving workloads between clouds is an extremely hard thing to do, however. Typically, once an application is built, organisations are "locked-in" to a cloud provider because they are likely using many proprietary services that are not portable. This makes it a very labour-intensive and error-prone process to take a workload out of a cloud and move it into another public cloud.

Something else to keep in mind when working in the public cloud is to build portable applications and avoid proprietary capabilities that tie you to a specific cloud provider. If an organisation is using very high-level capabilities that do not exist anywhere else, they are going to be locked in to that provider. One way to build portability into applications is to use something like containers.

Cost considerations with the public cloud

The best way to optimise costs when using a public cloud involves governance and monitoring the resources that teams are using. With the public cloud, it is very easy to spin up new resources because it is just a simple login and a credit card swipe. So, whenever teams need more resources, they spin them up and don't govern their usage or utilisation in a way that allows them to optimise and run things efficiently.

Whether teams use 5 per cent of a virtual machine or 90 per cent, they’re still paying the full price for each resource. Therefore, one of the most effective ways to optimise costs in the public cloud is to encourage careful governance, utilisation, and visibility.
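A quick calculation shows how much low utilisation inflates the real price of a resource. This is a back-of-the-envelope sketch in Python; the hourly price of 0.10 and the function names are hypothetical, while the 5 and 90 per cent utilisation figures come from the text above.

```python
def effective_hourly_cost(hourly_price: float, utilisation: float) -> float:
    """Price paid per hour of capacity that is actually used."""
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in the range (0, 1]")
    return hourly_price / utilisation


def wasted_spend(hourly_price: float, utilisation: float, hours: float) -> float:
    """Money spent on the unused share of a resource over a period."""
    return hourly_price * (1 - utilisation) * hours


# Hypothetical VM billed at 0.10 per hour.
for utilisation in (0.05, 0.90):
    print(utilisation,
          round(effective_hourly_cost(0.10, utilisation), 3),
          round(wasted_spend(0.10, utilisation, hours=720), 2))
```

The bill is identical in both cases, but at 5 per cent utilisation the organisation is effectively paying around twenty times the list price for the capacity it actually uses.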

Optimising costs in the public cloud also comes down to discipline and quotas. Organisations need quotas around what end-users can use and for how long they can use it, enforced by policies around the expiry of virtual machines. Similarly, if certain machines are not being fully utilised, go back and consolidate them or right-size them to fit the needs of the workload. Automatically cleaning up unused resources is a good practice to ensure there is no unused capacity being paid for unnecessarily. Another best practice is to turn off resources during low-utilisation periods such as weekends and late hours.
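These policies lend themselves to automation. The Python sketch below shows one possible shape for the checks, under stated assumptions: the `VM` record, the 10 per cent CPU threshold, the 14-day lifetime and the 07:00 to 20:00 weekday window are all hypothetical values chosen for illustration, and a real implementation would read inventory and metrics from the cloud provider.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class VM:
    name: str
    created_at: datetime
    ttl: timedelta    # how long the VM is allowed to live
    avg_cpu: float    # average CPU utilisation, 0.0 to 1.0


def review(vm: VM, now: datetime, low_cpu: float = 0.10) -> str:
    """Return the action the policy suggests for one VM."""
    if now - vm.created_at >= vm.ttl:
        return "expire"       # past its agreed lifetime
    if vm.avg_cpu < low_cpu:
        return "right-size"   # running, but mostly idle
    return "keep"


def in_working_hours(now: datetime, start_hour: int = 7, end_hour: int = 20) -> bool:
    """True on weekdays between start_hour (inclusive) and end_hour (exclusive)."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start_hour <= now.hour < end_hour


now = datetime(2020, 6, 1, 12, 0)  # a Monday at noon
fleet = [
    VM("old-test-box", datetime(2020, 5, 1), timedelta(days=14), avg_cpu=0.40),
    VM("idle-worker", datetime(2020, 5, 30), timedelta(days=14), avg_cpu=0.03),
    VM("busy-api", datetime(2020, 5, 30), timedelta(days=14), avg_cpu=0.60),
]
for vm in fleet:
    print(vm.name, review(vm, now))
```

A scheduler could run `review` daily and use `in_working_hours` to decide whether non-production machines should be powered on at all.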

In a nutshell, optimising costs in the public cloud is all about management. And, of course, if teams have long-term running workloads that are not necessarily using those resources efficiently, there are other options, like moving to a private cloud or using a hosting provider. There are also ways to obtain discounts from public cloud vendors in return for committed usage, such as making reservations in AWS. If you have guaranteed usage, you can save by negotiating volume discounts with vendors. The market is extremely competitive in terms of costs and all of the public cloud vendors are fighting fiercely for customers.
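Whether a reservation pays off depends on how much of the year the workload actually runs. The Python sketch below works through that break-even under simplifying assumptions: a reservation is billed for every hour of the year whether used or not, on-demand is billed only for hours run, and both hourly prices are hypothetical rather than taken from any vendor's price list.

```python
HOURS_PER_YEAR = 8760


def break_even_fraction(on_demand_hourly: float, reserved_hourly: float) -> float:
    """Fraction of the year a workload must run for a reservation to pay off."""
    return reserved_hourly / on_demand_hourly


def cheaper_option(on_demand_hourly: float, reserved_hourly: float,
                   running_fraction: float) -> str:
    """Compare yearly cost of on-demand (pay per hour run) with a reservation."""
    on_demand_cost = on_demand_hourly * HOURS_PER_YEAR * running_fraction
    reserved_cost = reserved_hourly * HOURS_PER_YEAR  # paid whether used or not
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


# Hypothetical prices: 0.10 per hour on demand, 0.06 per hour when reserved.
print(round(break_even_fraction(0.10, 0.06), 2))
print(cheaper_option(0.10, 0.06, running_fraction=0.9))
print(cheaper_option(0.10, 0.06, running_fraction=0.3))
```

With these invented prices the reservation only wins for workloads that run more than about 60 per cent of the year, which is why guaranteed, steady usage is the case where commitments make sense.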

Simply put, the cloud is a utility and it needs to be managed as such: cloud costs need to be reported and allocated appropriately, cloud services need to be optimised, and in order to reap the benefits of the cloud these cost control actions need to be automated. Whether cloud expense management is a full-time or a secondary responsibility, it is important to build it into any cloud management strategy from day one. It will take time, but the return is increased optimisation and validation of cloud services and costs, ensuring maximised ROI.

Kamesh Pemmaraju, Platform 9