Summit 09 Bonus: Licensed NetScanTools Pro - All Summit 09 attendees will receive a full licensed copy of NetScanTools Pro – a $249 value.
I recognized the tone in the voice that day - the panicked sound of someone dealing with a non-functional network. In this case, the network was a critical one (I can't disclose the specific type of network or customer).
At approximately 3:34am, their critical network came crashing down - no connectivity for any hosts on the network. They'd placed Wireshark on the network and it too crashed within moments.
Ok... so there was something definitely cruising along the network wreaking havoc. I had to see those packets!With over 2,000 miles separating us, it would be a 'walk through the capture' process.
The first step - dump the GUI!
Wireshark comes with Tshark for command-line capture. The syntax used was:
tshark -c 100 -w gen1.pcap
The -c parameter indicates the number of packets to capture. The -w parameter is used to define the name of the trace file to create. Why only 100 packets? What? Well... if there is a catastrophic issue on the network that could kill systems that connect to it that quickly, it shouldn't take many packets to characterize that traffic.
Immediately upon capturing these 100 packets, I instructed the customer to disconnect from the network. You don't need network access to analyze captured traffic - trace files are processed through the Wiretap Library - directly off the disk.
The 100 packets told the story - an insane looping packet storming through the network at a blazingly rapid packets per second rate. When facing a traffic issue like this it is important to look at the IP header's Identification value. You need to differentiate between a looping packet or a series of individual packets sent from a 'killer host' (and I mean killer as in "network killer").
If the IP Identification field value is the same for all the packets, then the packet is looping somehow. If the packet has a different IP Identification field then the packets have each passed through the IP protocol separately from a host. It's an amazingly simple differentiation - and an important one.If the packets had unique IP Identification field values, we'd be looking at a single host causing the problems. We'd be delving into the MAC header of the packet to identify the location of the lousy host. (Having a master list of MAC addresses for all hosts on the network is imperative in that situation. Mark that down as something to do this week!)
In this case, all the IP Identification field contained the same value - this was a looping packet. We had an infrastructure issue. On this heavily switched network it seemed spanning tree was not doing its job. Poor spanning tree - no one really pays attention to it until it has a problem.
Being remote to the customer location, I could not look over their shoulder as they walked the network and shut down one switch at a time. It was in their hands now. I sat waiting for their call - waiting to hear if they'd found the culprit. I didn't wait long.
I waited 30 looooong minutes for the call even though hit had taken the client less than 5 minutes to find a switch that was looping traffic back through the network. They spent the other 25 minutes starting up hosts on the network to ensure all was well. The switch was configured properly - so it would be replaced with another switch while they played with the problem switch in the lab (someday... someday).This offsite analysis hits a key point in troubleshooting - the devastating failures are typically easier to spot. They scream at us. They stomp their feet and throw things. All they want is to have someone listen.
Enjoy life one bit at a time...
Join us at Summit 09 on December 7-9th! You'll get a copy of NetScanTools Pro and 3 full days of hands-on individual and group labs focused on troubleshooting and security. Download the Summit Information Guide. All Access Pass members receive a 50% discount to Chappell Summit 09. Don't miss it!