Wednesday, May 25, 2011

Too little bandwidth and too little time to figure out who is using it all

The Problem

Have you ever been using the Internet at your office or home when the connection suddenly became SUPER slow? It isn't just affecting your computer, though; everyone around you is experiencing the same issue. I am talking about when you start to get messages in Gmail that say "taking a long time to load...still trying". Or maybe the online radio you are listening to sounds like a scratched-up CD where every other word is skipped.

This is a problem that I was tasked with identifying and solving. The basic business-class Internet service provided by Time Warner is 7 Mbps download and 1 Mbps upload. While this is ample in most cases, when only services like basic HTTP(S), email and instant messaging are in use, it can easily get out of hand when you introduce specialized applications into the environment that use large amounts of bandwidth for extended periods of time.

For example, let's say you are using some type of synchronization solution (such as a backup utility) to sync your work computer's files with a home computer (or other off-site server). By default, these programs are usually set to use all of the available bandwidth so that the synchronization finishes as soon as possible. It is easy to identify the source of the problem when you are the one causing it, because the network slows down the moment you turn on the synchronization. However, how do you figure it out when you are just another computer on the network and the problem seemingly comes out of nowhere?
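To put rough numbers on this, here is a minimal Python sketch of how long a sync would keep the upload link saturated. The 2 GB backup size is a made-up figure for illustration, and the function name `transfer_seconds` is my own; only the 1 Mbps upload speed comes from the connection described above. It also ignores protocol overhead, so real transfers would take somewhat longer.

```python
def transfer_seconds(size_bytes: int, link_mbps: float) -> float:
    """Seconds needed to move size_bytes over a link rated at link_mbps megabits per second."""
    bits = size_bytes * 8
    return bits / (link_mbps * 1_000_000)


# Hypothetical 2 GB backup (assumed size) over the 1 Mbps upload link.
backup_bytes = 2 * 1000**3
seconds = transfer_seconds(backup_bytes, 1.0)
print(f"{seconds / 3600:.1f} hours")  # prints "4.4 hours"
```

In other words, a single unthrottled backup of that size could tie up the entire upload link for most of an afternoon.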

The Solution

I always use Google as a baseline for my connectivity tests. PING tests showed high latency when connecting to Google. During normal network operations, ping times are < 10 milliseconds (ms), but while the problem was occurring, ping times climbed to 500 ms and beyond. On top of that, overall network performance went down the drain, with the Google website taking, on average, a literal 5 - 10 seconds to load.
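If you want to script this kind of baseline check, the following is a rough sketch rather than what I actually ran. Real ICMP ping needs raw sockets (and usually admin rights), so this swaps in the time to open a TCP connection as a stand-in for ping time; note that it also includes DNS lookup time. The function names and the 100 ms threshold are assumptions chosen for illustration.

```python
import socket
import time


def tcp_connect_ms(host: str, port: int = 443, timeout: float = 5.0) -> float:
    """Time, in milliseconds, to open a TCP connection to host:port.

    This is a proxy for ping latency, not a true ICMP echo, and it
    includes the DNS lookup for host.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection opened successfully; close it right away
    return (time.perf_counter() - start) * 1000.0


def classify(latency_ms: float, threshold_ms: float = 100.0) -> str:
    """Label a latency sample; the 100 ms default threshold is an assumed cutoff."""
    return "degraded" if latency_ms > threshold_ms else "normal"
```

Calling `classify(tcp_connect_ms("www.google.com"))` every minute or so and logging the result would give you a timeline of when the slowdowns start and stop.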

The Tools

Spare server
Managed Switch

Enter "nTop" stage right

Luckily, I had access to a spare server and a managed switch that allowed me to set up a SPAN (tap) port, which sends a copy of the network's traffic to the server. Combine these two things with the free, open-source program "nTop" and you have all of the tools you need to easily identify the users running the offending applications on your network. Once everything is set up correctly, you can open the nTop web page and see, on a per-IP basis, how much bandwidth is being used, which servers are being connected to, and how much traffic has been transferred, along with other useful statistics.
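To give a feel for what a per-IP view boils down to, here is a small Python sketch of the general idea: take the packets seen on the SPAN port and add up the bytes for each local address. This is not nTop's code or API, just an illustration. The subnet, the sample packets and the function names are all made up.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network

LOCAL_NET = ip_network("192.168.1.0/24")  # hypothetical office subnet


def bytes_per_host(packets, local_net=LOCAL_NET):
    """Sum bytes sent and received for each local IP.

    packets is an iterable of (source_ip, destination_ip, size_in_bytes).
    """
    usage = defaultdict(lambda: {"sent": 0, "received": 0})
    for src, dst, size in packets:
        if ip_address(src) in local_net:
            usage[src]["sent"] += size
        if ip_address(dst) in local_net:
            usage[dst]["received"] += size
    return dict(usage)


def top_talker(usage):
    """Return the local IP with the most combined traffic."""
    return max(usage, key=lambda ip: usage[ip]["sent"] + usage[ip]["received"])


# Made-up sample of mirrored packets: (source, destination, bytes).
sample = [
    ("192.168.1.10", "203.0.113.5", 1500),
    ("192.168.1.10", "203.0.113.5", 1500),
    ("203.0.113.5", "192.168.1.20", 400),
    ("192.168.1.20", "198.51.100.7", 200),
]
usage = bytes_per_host(sample)
print(top_talker(usage))  # prints "192.168.1.10"
```

A real monitoring tool does far more than this (protocol breakdowns, history, remote hosts), but the "who is using it all" question comes down to a tally like this one.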

Now, whenever the network gets slow, I pop into nTop, identify the user who is using all of the available bandwidth and notify them. Once they quit the offending application, everything goes back to normal and everyone is happy again.
