How do you debate the nature of art? For some, art exists in the daubings of their favourite painter. For others, it lies in the level design of an especially enjoyable video game. For many IT professionals, art is seen in the elaborate construction of their IT architectures. Beauty, it seems, is very much in the eye of the beholder.
In the world of IT, you'll often come across IT professionals describing their architecture as though it were the Sistine Chapel—a unique, almost otherworldly achievement that you're unlikely to witness anywhere else.
This perception is especially strong when it comes to monitoring, with organisations claiming they require a monitoring solution that is as unique as their IT environment. The solution is perceived as a masterfully crafted tapestry of special APIs and context-sensitive command sets, perfectly poised to deliver results, but only for this one, beautiful snowflake of an architecture.
This, I'm afraid, is nonsense. Monitoring is simple. That's not to say it requires little effort—good monitoring, which allows you to collect the statistics you need without injecting observer bias, requires hard graft. At its heart, however, monitoring is uncomplicated, and one particularly simple aspect of monitoring is in great demand, even if it may puncture IT professionals' illusions of artistic grandeur.
The aspect in question is automation, and it may be the answer for those tired of the artist's brush.
Time to automate
As an IT professional with a perfectly happy relationship with your monitoring system, you may be asking why automation technology is important. The reason is pretty simple: it will make your life easier.
Most seasoned IT professionals would happily admit that responding to tickets, alerts, and emails can be tedious and time-consuming—hours better spent doing something more valuable. If, for example, an IT professional receives an alert from a monitoring system, what would be the next step? Clear a queue, perhaps, or restart a service.
What may seem a mildly irksome diversion will grow into something far more inconvenient should these alerts occur many times throughout a day. These actions, however, could easily be undertaken by automation tools, removing the need for human intervention.
Of course, if life were always that simple, we'd be able to clock off early and take comfort in knowing that our site architectures were in safe, machine-driven hands. As any IT professional will know, while many of these minor issues require the same actions every time, there will always be a case that seems impervious to your charms and requires greater ingenuity to solve.
This is why IT professionals should be sure to adopt sophisticated monitoring systems, which allow you to create an alert that triggers an immediate action, waits a specified amount of time and, if the problem persists, triggers a second action. This can be repeated several times, with different actions, to ensure the issue is dealt with automatically, without requiring you to leave your seat, so to speak.
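As a rough illustration of that act, wait, re-check, escalate pattern, here is a minimal Python sketch. It is not the API of any particular monitoring product: the function name run_escalation, the problem_persists check and the list of actions are all hypothetical stand-ins for whatever your own tool provides.

```python
import time
from typing import Callable, Sequence


def run_escalation(
    problem_persists: Callable[[], bool],
    actions: Sequence[Callable[[], None]],
    wait_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try each action in order and return True once the problem clears.

    Before every action the check is run again, so later (heavier)
    actions only fire if the earlier ones did not help. A return value
    of False means every action was tried and a human should step in.
    """
    for action in actions:
        if not problem_persists():
            return True
        action()              # e.g. clear a queue, restart a service
        sleep(wait_seconds)   # give the action time to take effect
    return not problem_persists()
```

In use, you would pass the gentlest fix first (clearing a queue, say) and the most disruptive last (restarting a service), with wait_seconds set to however long the system realistically needs to recover. The sleep parameter exists only so the logic can be exercised in a test without real delays.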
Sometimes, however, no clear, definite action can be taken to solve a problem. How do you automate something that requires, say, checking the last 10 lines of a log file, reviewing them against something else and then running a test query? Again, it's pretty simple: have the monitoring system carry out the steps you would take to investigate the issue and insert the results into the alert message.
This gives technicians a far more detailed briefing than simply “This system is down.” The alert offers insight into the condition of the architecture at the moment the failure took place, rather than after the fact, when the IT professional finally gets around to looking at it. This simplifies and streamlines the troubleshooting process, giving IT professionals their valuable time back.
The beauty of this approach is that even if the information provided isn't always relevant to fixing the problem at hand, most of the time it will be. Essentially, this positions your monitoring tool as a Level One diagnostician, only one that never sleeps or takes breaks: the dream for most organisations.
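To make the idea concrete, here is a small Python sketch of an alert that carries its own diagnostics. It covers only the "last 10 lines of a log file" step, and the helper names tail and build_alert, the message layout and the log path you pass in are all assumptions made for illustration rather than features of any specific tool.

```python
from collections import deque
from datetime import datetime, timezone
from pathlib import Path


def tail(path: Path, n: int = 10) -> list[str]:
    """Return the last n lines of a text file, or [] if it cannot be read."""
    try:
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            # A bounded deque keeps only the final n lines in memory.
            return [line.rstrip("\n") for line in deque(handle, maxlen=n)]
    except OSError:
        return []


def build_alert(summary: str, log_path: Path, lines: int = 10) -> str:
    """Compose an alert body with diagnostics captured at failure time."""
    captured_at = datetime.now(timezone.utc).isoformat(timespec="seconds")
    log_lines = tail(log_path, lines)
    body = [
        f"ALERT: {summary}",
        f"Captured at: {captured_at}",
        f"Last {lines} lines of {log_path}:",
    ]
    body.extend(f"  {line}" for line in (log_lines or ["(no log lines available)"]))
    return "\n".join(body)
```

The point of the design is timing: the log excerpt is gathered when the alert fires, so whoever picks up the ticket sees the system as it was at the moment of failure, not as it is an hour later.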
So, how do you get started?
Plan of attack
The prospect of automation may seem daunting, but IT professionals can comfort themselves with the fact that this is not an all-or-nothing approach. A softly-softly-catchee-monkey approach would work best for most IT professionals looking to adopt automation technology, and whichever way you look to pull this off, you can relax knowing that you're not risking an oh-lord-my-systems-are-offline type of scenario.
As with any IT adoption, planning is vital to the success of automation within your organisation. Here are some tips to help you get your ducks in a row:
• Know your test machines: Be sure to set up alerts so that they only trigger for the machines you actually want. A quantity-over-quality approach will only waste more time, and could nullify the positive impact of IT automation.
• Learn to reset: It's worth knowing how your monitoring tool resets an alert so that it can trigger again, because you're going to be using that capability. A lot.
• Make friends with reverse thresholds: Your main alert will keep tabs on whether CPU > 90 per cent, yet you probably don't want to repeatedly spike test machines. To avoid this, turn that angle bracket around to CPU < 90 per cent while testing, which, one hopes, will trigger much more reliably.
• Learn to log: Verbose logging is an important component in understanding exactly what is happening, and when. If your monitoring tool already has logging, then fire it up and regularly insert messages logging your steps. It may sound like a pain, but you'll thank me later.
• Taste your own medicine: Testing alerts is an important part of the automation project. So important, in fact, that you should avoid sending test alerts to anybody other than yourself. It may seem a bit masochistic, but it will help ensure that you are best informed as to the workings of your environment.
These five steps can help you get started when looking to embrace automation technologies, and help ensure that early attempts at automated responses are successful. Bear in mind, though, that you're not finished there: keep learning and refining your tool to ensure that you get exactly what you want from it, when you need it.
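Three of the tips above (resetting, reverse thresholds and logging) can be rehearsed together in a few lines of Python. This is only a sketch under stated assumptions: the ThresholdAlert class, its field names and the 90 per cent figure reused from the example above are illustrative, and a real monitoring tool will expose these ideas through its own configuration rather than through code like this.

```python
import logging
import operator
from dataclasses import dataclass
from typing import Callable

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log = logging.getLogger("alert-test")


@dataclass
class ThresholdAlert:
    """Fires once when compare(value, threshold) holds; re-arms when it stops holding."""

    name: str
    threshold: float
    compare: Callable[[float, float], bool] = operator.gt
    active: bool = False

    def evaluate(self, value: float) -> bool:
        """Return True only on the sample that newly triggers the alert."""
        breached = self.compare(value, self.threshold)
        # Verbose logging: record every evaluation, not just the triggers.
        log.debug("%s: value=%s breached=%s active=%s",
                  self.name, value, breached, self.active)
        if breached and not self.active:
            self.active = True
            log.info("%s triggered", self.name)
            return True
        if not breached and self.active:
            self.active = False  # reset, so the alert can trigger again
            log.info("%s reset", self.name)
        return False


# Production rule: CPU > 90. Test rule: the same alert with the bracket reversed.
production = ThresholdAlert("cpu-high", threshold=90)
rehearsal = ThresholdAlert("cpu-high-test", threshold=90, compare=operator.lt)
```

Feeding an ordinary idle reading such as 35 into rehearsal.evaluate triggers the test alert straight away, while production.evaluate stays quiet for the same reading, so the automated response can be exercised without loading a machine.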
Automation, for all its benefits, is not art. Its role isn't to take your breath away, but to make your life easier. Trust me, you'll be grateful for such modest aspirations in the long run.
Leon Adato, Head Geek, SolarWinds
Image source: Shutterstock/Vasin Lee