5 database monitoring issues that need your attention now

As the old management adage goes, “You can’t manage what you don’t measure.” In the fast-paced world of IT, this more accurately translates to, “You can’t improve what you don’t measure.”

When it comes to your database, a continuous monitoring system provides the foundation for progress: metrics. Without a monitoring system, it’s impossible to determine whether changes — server configurations, software releases, new business processes, etc. — actually impact the system’s performance, availability, and functionality.

Think of it like a science experiment: You wouldn’t trust research that didn’t measure the results. So why would you settle for faulty database metrics?

How Comprehensive Database Monitoring Fuels Innovation

Monitoring systems are a shared source of truth: their data is the key to incident response and diagnostics, as well as to speeding up daily work. Ultimately, a poor or nonexistent monitoring system can lead to an ossified, slow-moving IT department that can’t innovate rapidly. And you don’t want dinosaurs in the tech department; we all know what happened to them.

In the absence of a good automated monitoring system, customers become the monitoring system, notifying the company of problems it didn’t know existed. Not only is this embarrassing, but it also creates an incredibly slow feedback loop.

With today’s rapid pace of innovation, operating at a snail’s pace won’t cut it. I’ve seen many modern IT leaders struggle to deliver quality projects on time and under budget because of inadequate monitoring systems.

If your database monitoring system shows any of the following symptoms, it’s probably time for an upgrade — before innovation comes to a screeching halt:

1. You Lack Monitoring Tools

First things first: If your database lacks a monitoring system, then, Houston, we have a problem. Databases are too often monitored only superficially — or not at all — because high-quality monitoring tools are few and far between.

One of the big reasons is that companies fall into the trap of believing a generic tool can monitor all of their systems. Just like people, databases are complex. Monitoring tools designed for standard metrics or yes/no answers won’t work. Databases need monitoring for specific concepts and components, not a one-size-fits-all approach.

Solution: Once you recognize the intricacy of your database, you can appreciate the need for database-specific monitoring and then implement it. Database monitoring must be a top priority unless you want to experience downtime and performance issues.
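
To make the distinction concrete, here is a minimal sketch in Python of the difference between a generic check and a database-specific one. It assumes a MySQL server and the PyMySQL client library; the hostname and credentials are placeholders.

```python
import socket

import pymysql  # assumed client library; any MySQL driver would do

HOST, PORT = "db1.example.com", 3306  # placeholder host


def port_is_open() -> bool:
    """Generic check: answers only 'is something listening on the port?'"""
    try:
        socket.create_connection((HOST, PORT), timeout=3).close()
        return True
    except OSError:
        return False


def threads_running(conn) -> int:
    """Database-specific check: asks a question only the database can answer."""
    with conn.cursor() as cur:
        cur.execute("SHOW GLOBAL STATUS LIKE 'Threads_running'")
        _, value = cur.fetchone()
        return int(value)


if __name__ == "__main__":
    print("port open:", port_is_open())
    conn = pymysql.connect(host=HOST, user="monitor", password="...")
    print("threads running:", threads_running(conn))
```

A generic tool stops at the first check; only the second tells you anything about how the database is actually behaving.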

2. False Alerts Run Wild

Ask any database administrator whether they receive quality alerts from their monitoring systems, and the answer will be an emphatic no. Most DBAs receive a barrage of emails and alerts about everyday occurrences that aren’t problems at all. Meanwhile, real issues go unreported.

This arises, once again, from the overuse of simplistic, generic monitoring tools that only track and alert on vanity metrics, such as cache-hit ratios. To make matters worse, many DBAs believe the reason they don’t get alerted in time to prevent downtime is that they don’t have alerts on enough indicators, so they configure alarms on every metric they can dream up in hopes of catching transient issues such as system stalls and lockups. However, receiving too many alerts only distracts DBAs from real problems.

Most DBAs have also been taught to place thresholds on basic, bread-and-butter metrics. But those metrics change constantly with customer usage, time of day, and day of week, so there is never a single correct threshold. Tools that operate on static thresholds just don’t work.
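
For illustration, here is one way a threshold can adapt instead: judge each new sample against its own recent history. This is a minimal sketch in Python; the window size and the three-sigma cutoff are arbitrary assumptions, not recommendations.

```python
from collections import deque
from statistics import mean, stdev


class AdaptiveThreshold:
    """Flag a sample only when it deviates sharply from recent history,
    rather than comparing it against a fixed number."""

    def __init__(self, window: int = 60, sigmas: float = 3.0):
        self.history = deque(maxlen=window)  # e.g., the last 60 samples
        self.sigmas = sigmas

    def is_anomalous(self, sample: float) -> bool:
        anomalous = False
        if len(self.history) >= 10:  # need some history before judging
            mu, sd = mean(self.history), stdev(self.history)
            anomalous = sd > 0 and abs(sample - mu) > self.sigmas * sd
        self.history.append(sample)
        return anomalous


# Usage: feed it one sample per polling interval.
checker = AdaptiveThreshold()
for latency_ms in [12, 14, 11, 13, 12, 15, 13, 12, 14, 13, 95]:
    if checker.is_anomalous(latency_ms):
        print(f"latency spike: {latency_ms} ms")
```

The baseline moves with the workload, so a Monday-morning rush doesn’t page anyone, while a genuine spike still does.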

Solution: Rather than measure the superfluous vanity metrics that many generic monitoring systems default to, trim your monitoring down to the basics. Decrease noise and monitor the fewest possible indicators of reliability problems. Focus on concurrency, latency, and limits on fixed-size resources, such as the maximum number of permitted connections. Then ask yourself whether your basic KPIs are acceptable: Is query latency within bounds? Is replication working and keeping up with changes?
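
As a sketch of what monitoring those few indicators might look like against MySQL, the snippet below checks connection headroom and replication health. It assumes the PyMySQL library and placeholder credentials; the 85 percent and 30-second cutoffs are arbitrary examples.

```python
import pymysql
from pymysql.cursors import DictCursor

# Placeholder connection details.
conn = pymysql.connect(host="db1.example.com", user="monitor",
                       password="...", cursorclass=DictCursor)


def scalar_status(cur, name):
    cur.execute("SHOW GLOBAL STATUS LIKE %s", (name,))
    return int(cur.fetchone()["Value"])


with conn.cursor() as cur:
    # Fixed-size resource: how close are we to the connection limit?
    cur.execute("SHOW GLOBAL VARIABLES LIKE 'max_connections'")
    max_conn = int(cur.fetchone()["Value"])
    used = scalar_status(cur, "Threads_connected")
    if used / max_conn > 0.85:
        print(f"connections nearing limit: {used}/{max_conn}")

    # Replication: is it running, and is it keeping up?
    cur.execute("SHOW SLAVE STATUS")  # SHOW REPLICA STATUS on MySQL >= 8.0.22
    replica = cur.fetchone()
    if replica:
        lag = replica["Seconds_Behind_Master"]
        if lag is None or lag > 30:
            print(f"replication broken or lagging: {lag}")
```

A handful of checks like these catch the failures that actually cause downtime, without the noise of a hundred vanity-metric alarms.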

3. Your System Can’t Effectively Monitor Queries

The single most important thing to monitor in a database is query activity, but many databases don’t expose this information in detail. As a result, the monitoring system relies on surrogate metrics, which have no real connection to query performance, application performance, or the user experience. Worse still, many monitoring systems have no concept of query monitoring, and even those that do can’t cope with analyzing and storing the volume of query performance metrics that a busy database generates.

Solution: Although there is no easy answer, it’s almost always possible to produce a daily slow query report if you invest the time to do so. This report shows the top queries by total accumulated time and frequency. There are also resources available for query optimization that can help you build a strong foundation for query insight.
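
On MySQL, for instance, the statement digest table in the performance_schema can drive such a report. A minimal sketch, again assuming PyMySQL and placeholder credentials (timers in that table are measured in picoseconds, hence the division):

```python
import pymysql
from pymysql.cursors import DictCursor

conn = pymysql.connect(host="db1.example.com", user="monitor",
                       password="...", cursorclass=DictCursor)

# Top queries by total accumulated time, with call counts, from MySQL's
# statement digest table (requires the performance_schema to be enabled).
QUERY = """
SELECT DIGEST_TEXT                     AS query,
       COUNT_STAR                      AS calls,
       ROUND(SUM_TIMER_WAIT / 1e12, 2) AS total_secs
FROM   performance_schema.events_statements_summary_by_digest
ORDER  BY SUM_TIMER_WAIT DESC
LIMIT  10
"""

with conn.cursor() as cur:
    cur.execute(QUERY)
    for row in cur.fetchall():
        print(f"{row['total_secs']:>10}s  {row['calls']:>8}x  "
              f"{(row['query'] or '')[:80]}")
```

Run from cron once a day, even a crude report like this shows where the database actually spends its time, which is the foundation for everything else.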

4. Your In-House Staff Built the Monitoring Software

Monitoring software, especially the open-source variety, is often a pain to install and configure. And because it requires ongoing upkeep and is inefficient at scale, maintenance and hardware costs can escalate quickly. Most DBAs report that when they use open-source monitoring tools or build their own, they spend much of the week administering the database monitoring system instead of the actual database.

Solution: Buy, don’t build, your monitoring software. Building software is a highly technical, labor-intensive process that takes dedicated full-time staff to do well, and most DBAs already have full plates. Use a quality software-as-a-service solution so you benefit from economies of scale and high-level insight. Plus, you won’t have to host your own servers for the monitoring tool.

5. Your System Doesn’t Provide a Holistic View

Today’s companies rarely build on a single database. Instead, they employ a variety of databases specialised for distinct purposes — all deployed in distributed clusters with multiple servers working together.

With so many disparate servers, companies need to see the forest rather than the trees in order to measure and optimize resource allocation and identify performance problems. Unfortunately, most monitoring systems can’t provide that level of insight, forcing IT to rely either on generic tools that offer little insight or on vendor-specific solutions that lack cross-database functionality.

Solution: Choose mature technologies that provide robust metrics and performance statistics, especially for query performance. You should also use cross-platform monitoring tools to reduce the number of dashboards. Most importantly, a technician must be able to drill down quickly from a high-level view to individual servers. Monitoring tools that support aggregating servers into services give you both macro and micro views.
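
As a toy illustration of that aggregation and drill-down, the sketch below rolls per-server latency samples up into service-level numbers, then flags the outlier server within a service. The service names, hostnames, and figures are all hypothetical.

```python
from statistics import mean

# Hypothetical p99 latency samples (ms) per server, keyed by the
# service each server backs.
services = {
    "orders-db":  {"db1": 12.0, "db2": 14.5, "db3": 98.0},
    "catalog-db": {"db4": 9.0,  "db5": 10.5},
}

for service, hosts in services.items():
    avg = mean(hosts.values())
    # Macro view: one number per service, so problems surface at a glance.
    print(f"{service}: avg {avg:.1f} ms across {len(hosts)} servers")
    # Micro view: drill down to the server dragging the service down.
    worst = max(hosts, key=hosts.get)
    if hosts[worst] > 2 * avg:
        print(f"  outlier: {worst} at {hosts[worst]} ms")
```

The same rollup works for any metric; the point is that one screen shows the forest, and one click shows the tree.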

Don’t let your company’s inefficient or nonexistent database monitoring system bottleneck your innovation. Systems are constantly changing, and with each update or new release of your application, you need to fully understand the impact on your databases so you can release new, improved systems with confidence. Don’t risk becoming extinct. Invest the time and resources in developing a comprehensive monitoring system, and you’ll gain high-level insight that will pay dividends immediately and over the long run.

Baron Schwartz, founder of VividCortex, is one of the world’s leading experts on MySQL, and he has helped build and scale some of the largest web, social, gaming, and mobile properties.