War Stories: Cursed VLANs
This article is Part 7 in a 12-Part Series.
- Part 1 - War Stories: Loops that Permanently Broke the Network
- Part 2 - War Stories: Switches Lying about Duplex Mismatches
- Part 3 - War Stories: Check Point Meltdown
- Part 4 - War Stories: Dual-Vendor Firewall Strategy
- Part 5 - War Stories: Proxy ARP Auto-Configuration
- Part 6 - War Stories: Gratuitous ARP and VRRP
- Part 7 - This Article
- Part 8 - War Stories: Unix Security
- Part 9 - War Stories: ITIL Process vs Practice
- Part 10 - War Stories: Closing out Projects
- Part 11 - War Stories: Backup NICs, DNS and AD
- Part 12 - War Stories: Always Check Your Inputs
I’ve written before about switch ports being permanently disabled. This time it’s something new to me: VLANs that refuse to forward frames.
A Simple Network
The network was pretty straightforward: a pair of firewalls connected through a pair of switches to a pair of routers.
Sub-interfaces were used on the routers and firewalls, with trunks to the switches. VLAN 100 was used for 100.100.100.0/24, and VLAN 200 was used for 200.200.200.0/24. The switches were configured to pass VLANs 100 & 200.
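If you haven't looked at what a sub-interface actually puts on the wire, here is a minimal Python sketch of the 4-byte 802.1Q tag that a device inserts into each frame it sends on a trunk. The function names (`build_vlan_tag`, `parse_vlan_id`) are made up for illustration and are not taken from any of the devices in this story. The tag layout itself (TPID 0x8100, then 3 bits of priority, 1 DEI bit and a 12-bit VLAN ID) is standard 802.1Q.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def build_vlan_tag(vlan_id: int, priority: int = 0, dei: int = 0) -> bytes:
    """Return the 4-byte 802.1Q tag for a VLAN ID (hypothetical helper)."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be in the range 1..4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in the range 0..7")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    # Tag Control Information: 3 bits priority, 1 bit DEI, 12 bits VLAN ID
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_vlan_id(tag: bytes) -> int:
    """Extract the VLAN ID from a 4-byte 802.1Q tag (hypothetical helper)."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci & 0x0FFF  # the low 12 bits carry the VLAN ID


# The two VLANs used in this network
print(build_vlan_tag(100).hex())
print(build_vlan_tag(200).hex())
```

A trunk port that is configured to pass VLANs 100 & 200 should accept frames carrying either tag and drop anything else.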
All was working as expected. All devices could see each other on all VLANs.
Until it stopped
We received reports that we’d lost reachability to Router-A’s VLAN 200 sub-interface. After some investigation, we could see that Firewall-A could no longer see Router-A’s MAC address on G0.200. But everything else was fine: the VLAN 100 interface worked perfectly, so we knew it couldn’t be a physical interface issue.
Hmmm. What’s going on? First instinct: check the switch port configuration. Has anything changed? Nope. VLAN 200 still there, configured as expected. The router & firewall were still tagging frames with VLAN 200. But they couldn’t see each other, and the switch wasn’t learning any MAC addresses on that VLAN.
Spanning-tree? Nope, all ports in forwarding state. VLAN still there in the VLAN database? Yep, looks OK. What about Firewall-B and Router-B? They can see each other, but they can’t see Firewall-A or Router-A. Switch-2 shows MAC addresses for Firewall-B and Router-B, but nothing on the link to Switch-1.
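To make the "switch wasn’t learning any MAC addresses" symptom concrete, here is a toy Python model of how a VLAN-aware switch is expected to learn source MACs on trunk ports. It is only a sketch under simplifying assumptions: the `TrunkSwitch` class, the port names and the MAC addresses are all invented for illustration, and it ignores spanning-tree, ageing and everything else a real switch does.

```python
from dataclasses import dataclass, field


@dataclass
class TrunkSwitch:
    """Toy model of per-VLAN MAC learning (hypothetical, not a vendor API)."""

    allowed_vlans: dict[str, set[int]]  # port name -> VLANs allowed on the trunk
    mac_table: dict[tuple[int, str], str] = field(default_factory=dict)

    def receive(self, port: str, vlan: int, src_mac: str) -> bool:
        """Learn the source MAC if the VLAN is allowed on the ingress port."""
        if vlan not in self.allowed_vlans.get(port, set()):
            return False  # frame dropped, nothing learned
        self.mac_table[(vlan, src_mac)] = port
        return True

    def macs_on_vlan(self, vlan: int) -> dict[str, str]:
        """Return {mac: port} for everything learned on one VLAN."""
        return {mac: port for (v, mac), port in self.mac_table.items() if v == vlan}


# Made-up port names and MAC addresses, loosely mirroring Switch-1
switch_1 = TrunkSwitch({"to-firewall-a": {100, 200}, "to-router-a": {100, 200}})
switch_1.receive("to-firewall-a", 200, "00:00:5e:00:53:01")
switch_1.receive("to-router-a", 200, "00:00:5e:00:53:02")
print(switch_1.macs_on_vlan(200))
```

In this model, once both devices send tagged frames on an allowed VLAN, both MACs show up in the table. That is what we expected to see on the real switch. Instead, the table for VLAN 200 stayed empty, even though the configuration said the VLAN was allowed.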
Why didn’t it work?
Workaround: Move to a new VLAN
We re-configured the Firewall & Router sub-interfaces to use VLAN 300 instead. We added that VLAN to the associated trunk ports, and everything started working. Full connectivity restored. But VLAN 200? It seems to be cursed.
I still haven’t figured out what happened here. Anyone ever seen anything like this?