Every network is different, but they all share similar challenges. For the engineers and IT managers tasked with monitoring, analysing and troubleshooting corporate networks, a multitude of alerts and alarms can be triggered by abnormal activity.
In the case of organisations with large, distributed networks and multiple local branches or smaller offices, it can be even more difficult to isolate and remedy problems from a remote location. Sending dedicated staff to troubleshoot at those offices is both time-consuming and costly.
With tools like Insight and even free options like Spiceworks in the mix, there are more options than ever for engineers to intelligently monitor and prioritise the alerts that they receive. But do all alerts deserve the same level of attention? Are they all created equal? Actually, no. When dealing with edge networks, specifically, the most obvious and arguably most important alerts are related to network utilisation.
Let’s take a look at the three kinds of network problems that engineers should be monitoring and dealing with most actively.
No Traffic
No news is generally considered to be good news, but in the world of networks it usually means that something has gone terribly wrong. If there is no traffic on the network it could be because the ISP connection is down, the cable modem might be malfunctioning, or something may simply have been unplugged. There are a million other possibilities, too.
It is rare that a network used by even a single machine doesn’t send some traffic every couple of seconds. But if traffic is not guaranteed, then it’s best to set up a periodic ping so that there is always a minimum of expected traffic. If the connection is down, a network monitoring appliance like Savvius Insight will not be able to send an alert either, but it can still generate event data that can be analysed during a post-capture forensics investigation. In other words, no traffic is bad news, and the more information you have, the better prepared you will be to resolve it quickly next time.
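To make the idea concrete, here is a minimal Python sketch of how silent periods could be picked out of recorded packet timestamps after the fact. The function name `find_silent_gaps` and its parameters are hypothetical, and the approach is only an illustration under the assumption that timestamps are available in seconds; it is not how Insight or any other product works internally.

```python
from typing import Iterable, List, Tuple


def find_silent_gaps(
    packet_times: Iterable[float],
    max_silence: float,
    window_start: float,
    window_end: float,
) -> List[Tuple[float, float]]:
    """Return (start, end) periods longer than max_silence with no packets.

    All values are in seconds. The names and the threshold are
    illustrative assumptions, not part of any vendor API.
    """
    gaps = []
    previous = window_start
    for t in sorted(packet_times):
        # Ignore packets recorded outside the window being examined.
        if t < window_start or t > window_end:
            continue
        if t - previous > max_silence:
            gaps.append((previous, t))
        previous = t
    # A link that went quiet and never recovered shows up as a trailing gap.
    if window_end - previous > max_silence:
        gaps.append((previous, window_end))
    return gaps


if __name__ == "__main__":
    # Traffic stops after second 3, resumes at 20, then stops again.
    print(find_silent_gaps([1, 2, 3, 20, 21], max_silence=5,
                           window_start=0, window_end=30))
```

With a periodic ping in place, the `max_silence` value could reasonably be set a little above the ping interval, so that any reported gap points to a genuine outage rather than an idle office.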
High Utilisation
High utilisation itself is not necessarily a problem, unless it results in other issues such as slow response time. An alert on high utilisation is more of an indicator that the activity on the network is getting close to its limit, and it may be time to increase capacity. If the ISP is charging based on usage, high utilisation may be an indicator to decrease usage in order to manage costs. Either way, knowing when high utilisation occurs is helpful when managing a network.
The question then becomes one of utilisation threshold. How long should utilisation stay high before an alert is sent? Spikes can happen, and you might not want to get an alert for every one, but then again maybe you do. To customise the alert threshold, as well as the conditions that reset the alert before another one can go off, you will need a configurable solution like Insight.
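The threshold, duration and reset conditions described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the class name `UtilisationAlert`, the 90% trigger level, the 60-second sustain period and the 70% reset level are hypothetical defaults, not settings taken from Insight or any other tool.

```python
from dataclasses import dataclass
from typing import Optional


def utilisation(bytes_transferred: int, interval_s: float, link_bps: float) -> float:
    """Fraction of link capacity used during one sampling interval."""
    return (bytes_transferred * 8) / (interval_s * link_bps)


@dataclass
class UtilisationAlert:
    trigger_level: float = 0.9  # assumed: high means 90% of capacity or more
    sustain_s: float = 60.0     # assumed: must stay high this long to alert
    reset_level: float = 0.7    # assumed: re-arm only after dropping below 70%
    _high_since: Optional[float] = None
    _armed: bool = True

    def update(self, now: float, level: float) -> bool:
        """Feed one sample; return True if an alert should be sent now."""
        if level < self.reset_level:
            self._armed = True  # reset condition met, next alert is allowed
        if level >= self.trigger_level:
            if self._high_since is None:
                self._high_since = now  # start timing this high period
            if self._armed and now - self._high_since >= self.sustain_s:
                self._armed = False  # stay quiet until the reset condition
                return True
        else:
            self._high_since = None  # a short spike ended before alerting
        return False


if __name__ == "__main__":
    alert = UtilisationAlert()
    for second in range(0, 121, 30):
        print(second, alert.update(second, 0.95))
```

Setting `sustain_s` to zero would alert on every spike, while a larger gap between `trigger_level` and `reset_level` makes repeated alerts for the same busy period less likely.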
Slow Response Time
Finally, we have a different scenario. The network is up, there is traffic and no problem with high utilisation. But response time is slow.
Response time refers to the time it takes for a request from a client to receive a reply. Considering how many of our interactions take place through web browsers today, the fault may lie with a slow cloud-based application. Of course, it could also be the network. The bottom line is that slow response times mean unhappy and unproductive users. And the problem here is that, unlike a dead network, users will often grudgingly tolerate slow networks without reporting them.
This is where an appliance like Insight becomes so valuable, because it monitors the network for slow response times and generates alerts when they occur. Once detected, a troubleshooting and analytics suite like Omnipeek, or even Wireshark, will have some tools to help fix the problem.
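As a rough illustration of the response-time definition given earlier, the Python sketch below pairs each request with its reply and checks the median delay against a threshold. The event format, the function names `response_times` and `is_slow`, and the use of the median are all assumptions chosen for the example; they do not describe how Insight, Omnipeek or Wireshark measure response time.

```python
from statistics import median
from typing import Dict, List, Tuple

# Each event is (timestamp in seconds, "request" or "reply", request id).
Event = Tuple[float, str, str]


def response_times(events: List[Event]) -> List[float]:
    """Pair each request with its reply by id and return elapsed seconds."""
    pending: Dict[str, float] = {}
    elapsed = []
    for timestamp, kind, request_id in events:
        if kind == "request":
            pending[request_id] = timestamp
        elif kind == "reply" and request_id in pending:
            elapsed.append(timestamp - pending.pop(request_id))
    return elapsed


def is_slow(samples: List[float], threshold_s: float) -> bool:
    """True when the median response time exceeds the threshold."""
    return bool(samples) and median(samples) > threshold_s


if __name__ == "__main__":
    events = [
        (0.0, "request", "a"), (0.25, "reply", "a"),
        (0.5, "request", "b"), (2.5, "reply", "b"),
        (3.0, "request", "c"), (4.0, "reply", "c"),
    ]
    times = response_times(events)
    print(times, is_slow(times, threshold_s=0.5))
```

The median is used here so that one unusually slow reply does not raise an alert on its own; a stricter check could look at the slowest replies instead.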
Chris Bloom, Technology Evangelist, Savvius, Inc.
Photo Credit: Palto/Shutterstock