Monitoring toolbox for the evolving IT landscape

(Image credit: The Digital Artist / Pixabay)

These days, companies increasingly rely on microservices-based architectures to deliver software faster and more safely. The advent of microservices naturally paved the way for container and multi-cloud technology, empowering us to rethink how we build and deploy our applications. By 2020, more than 50 per cent of companies will have transitioned to container technology, up from 20 per cent only two years ago. Enterprises are also increasingly moving their workloads to multiple cloud providers to balance risk and take advantage of what the various cloud platforms have to offer. In fact, according to a Virtustream and Forrester Consulting study, 86 per cent of enterprises have incorporated a multi-cloud strategy.

To stay ahead of the curve, companies need solutions that are built to keep up with the changing IT landscape. To fully reap the benefits of a multi-cloud strategy — avoiding expensive downtime while maintaining complete visibility into your entire infrastructure — a future-proof, multi-cloud monitoring solution must be part of your IT strategy.

Challenges to prepare for

While a multi-cloud strategy offers virtually endless benefits and flexibility, there are more moving parts to keep track of when it comes to maintaining visibility into an organisation’s infrastructure. Containerised technology — including Kubernetes, an open source container-orchestration system for automating deployment, scaling, and management of containerised applications — requires we change our traditional approach to monitoring.

Kubernetes makes it a lot easier for teams to manage containers, allowing them to schedule and provision them while maintaining a desired state, automatically. It has served as a common platform to deploy applications wherever they run, whether that’s AWS, GCP, Azure, or bare metal. With all that power and automation come challenges, especially when it comes to keeping an eye on performance.

One key challenge in this new, dynamic era is that applications are constantly on the move. On top of having many more moving pieces to keep track of, availability is critical, and downtime is not only expensive but also damaging to business reputation. No matter the size of your deployment, you still need to know how many available resources you have in that deployment, as well as the health of your deployed applications and containers.

Along with constantly moving applications, organisations are now dealing with smaller pieces to monitor. And while improved operational visibility through monitoring is often cited as a top priority among chief information officers (CIOs) and senior operations leadership, too often monitoring is an afterthought. Organisations without a proactive approach to monitoring and observability encounter avoidable downtime and endless “day two” (that is, ongoing) operational challenges, such as maintaining visibility.

Finding a future-proof monitoring solution

Ephemeral infrastructure is the new normal, and digital transformation, cloud migration, DevOps, containerisation and other initiatives are compelling movements in the modern enterprise. Although they vary in scope and overlap or intersect in practice, they are unified in purpose: to deliver increased organisational velocity, empowering businesses to ship more changes faster.

Here are key considerations when searching for a monitoring tool that keeps up with the ever-evolving IT landscape and the new multi-cloud reality:

  • Go beyond alerts. Problem detection is only the first step in an effective monitoring strategy. Once performance degradations or system failures are detected, action must be taken immediately. Operators should ensure that every alert is actionable, meaning it genuinely requires their involvement; for everything else, they should have the tools they need to automate their monitoring workflows (such as auto-remediation). Reserving alerts for tasks that operators can’t automate is what truly streamlines the monitoring process.
  • Don’t forget about the backend. Your solution’s backend should process monitoring data in a horizontally scalable way. It should process data via event handlers, which can route metrics to your preferred data store (e.g. InfluxDB or Elasticsearch), trigger automated remediation actions, or create and resolve tickets in PagerDuty or ServiceNow. As previously mentioned, it should extend far past alerts and give a thorough overview of the IT infrastructure.
  • Use secure transport. Your monitoring solution should use standard cryptography for all communication. Look for a model that allows a single agent to collect and transmit data securely over complex networks without having to compromise firewalls.
  • Maintain total visibility into your systems. Service checks are another key feature operators should look for when adopting a monitoring solution. Service checks allow operators to monitor services and measure resources (for example, to know exactly how much disk space is left), which is crucial to gaining visibility into server resources, services, and application health, as well as to collecting and analysing metrics.
  • Find a flexible solution that allows multiple ways for data input. With the ubiquity of multi-cloud and multi-generational infrastructure (i.e., bare metal alongside the latest cloud-native technology), monitoring should be agile and able to collect monitoring data in a variety of ways, as opposed to being constrained by a single method of collection. Having a flexible, agent-based monitoring solution offers organisations customisable workflows tailored to their specific infrastructure. Finding a flexible solution also means you’ll have a monitoring tool that fits within multiple monitoring instrumentation ecosystems, taking advantage of their instrumentation libraries while allowing you to process and store data in the database of your choosing, like InfluxDB, Elastic, or Splunk.
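
To make the considerations above a little more concrete, here is a minimal Python sketch of the workflow they describe: a service check produces an event, the event's metrics are always stored, and an operator is only alerted when no automated remediation succeeds. Every name in it (Event, disk_check, process, the thresholds) is hypothetical and chosen for illustration; it is not the API of any particular monitoring product. The only borrowed convention is the Nagios-style status code (0 for OK, 1 for warning, 2 for critical).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Exit-status convention used by Nagios-style checks.
OK, WARNING, CRITICAL = 0, 1, 2


@dataclass
class Event:
    """Result of one check run (hypothetical structure)."""
    check: str
    status: int
    metrics: Dict[str, float] = field(default_factory=dict)


def disk_check(used_pct: float, warn: float = 80.0, crit: float = 90.0) -> Event:
    """Hypothetical service check: classify disk usage against thresholds."""
    if used_pct >= crit:
        status = CRITICAL
    elif used_pct >= warn:
        status = WARNING
    else:
        status = OK
    return Event("disk_usage", status, {"disk.used_pct": used_pct})


def process(
    event: Event,
    store: List[Dict[str, float]],
    remediations: Dict[str, Callable[[Event], bool]],
    alert: Callable[[Event], None],
) -> str:
    """Route one event: store metrics, try to remediate, alert as a last resort."""
    store.append(dict(event.metrics))  # stand-in for a time-series database
    if event.status == OK:
        return "ok"
    fix = remediations.get(event.check)
    if fix is not None and fix(event):
        return "remediated"  # handled automatically, nobody is paged
    alert(event)  # only reached when automation could not help
    return "alerted"


if __name__ == "__main__":
    store: List[Dict[str, float]] = []
    pages: List[Event] = []
    # Pretend the automated clean-up only succeeds for warnings.
    remediations = {"disk_usage": lambda e: e.status == WARNING}
    for pct in (50.0, 85.0, 95.0):
        print(pct, process(disk_check(pct), store, remediations, pages.append))
```

In a real deployment the usage figure would come from the host itself (in Python, shutil.disk_usage reports total, used and free bytes for a path), and the store and alert callables would be replaced by a real data store and paging integration.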

Essentially, it’s important to find a monitoring solution that lets you monitor your entire infrastructure. As opposed to a tool whose model focuses purely on telemetry data, a flexible monitoring solution will work with industry-standard technologies and formats (like Nagios and StatsD), giving you additional context from events and health checks (i.e., not just metrics), the ability to process monitoring data as a workflow, and secure transport.
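
As an illustration of what working with those formats can involve, the following Python sketch reads a StatsD metric line (name:value|type) and the performance data that a Nagios-style check prints after the “|” in its output. The function names are hypothetical and the parsing is deliberately simplified: it ignores StatsD sample rates and does not handle quoted perfdata labels, so treat it as a rough sketch rather than a complete parser for either format.

```python
import re
from typing import Dict, Tuple

_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def parse_statsd(line: str) -> Tuple[str, float, str]:
    """Parse 'name:value|type'; any trailing '|@rate' field is ignored."""
    name, rest = line.strip().split(":", 1)
    fields = rest.split("|")
    return name, float(fields[0]), fields[1]


def parse_nagios_output(output: str) -> Tuple[str, Dict[str, float]]:
    """Split check output into its human-readable text and perfdata values."""
    text, _, perfdata = output.partition("|")
    metrics: Dict[str, float] = {}
    for item in perfdata.split():
        label, _, value_part = item.partition("=")
        raw = value_part.split(";", 1)[0]  # drop warn/crit/min/max fields
        match = _NUMBER.match(raw)         # strip a unit suffix such as % or MB
        if match:
            metrics[label] = float(match.group())
    return text.strip(), metrics


if __name__ == "__main__":
    print(parse_statsd("api.requests:1|c"))
    print(parse_nagios_output("DISK WARNING - 85% used | used_pct=85%;80;90;0;100"))
```

Normalising both inputs into a plain name-and-value form is one way a single pipeline could treat metrics from older checks and newer instrumentation libraries alike.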

Even if the forecasted 50 per cent of enterprises move to containerised and multi-cloud infrastructure by 2020, the change won’t happen completely or overnight: they’ll still be contending with decades of older technologies and applications. Not only do companies need a flexible approach to monitoring multi-cloud and ephemeral infrastructure, they also require a solution that’s extremely effective at monitoring the long tail of their multi-generational infrastructure. As companies continue to innovate and adopt the latest technologies, it’s more important than ever that they have a future-proof solution for maintaining visibility over their entire infrastructure, both old and new.

Sean Porter, co-founder and CTO, Sensu