War Stories: Cursed VLANs

I’ve written before about switch ports being permanently disabled. This time it’s something new to me: VLANs that refuse to forward frames.

A Simple Network

The network was pretty straightforward. A pair of firewalls connecting through a pair of switches to a pair of routers:

[Diagram: Cursed VLAN]

Sub-interfaces were used on the routers and firewalls, with trunks to the switches. Two VLANs were in use, 100 and 200, and the switches were configured to pass both.

All was working as expected. All devices could see each other on all VLANs.

Until it stopped

We received reports that we’d lost reachability to Router-A’s VLAN 200 sub-interface. After some investigation, we could see that Firewall-A could no longer see Router-A’s MAC address on G0.200. But everything else was fine: the VLAN 100 sub-interface worked perfectly. Since the sub-interfaces share a physical link, we knew it couldn’t be a physical interface issue.

Hmmm. What’s going on? First instinct: check the switch port configuration. Has anything changed? Nope. VLAN 200 still there, configured as expected. The router & firewall were still tagging frames with VLAN 200. But they couldn’t see each other, and the switch wasn’t learning any MAC addresses on that VLAN.

Spanning-tree? Nope, all ports in forwarding state. VLAN still there in the VLAN database? Yep, looks OK. What about Firewall-B and Router-B? They can see each other, but they can’t see Firewall-A or Router-A. Switch-2 shows MAC addresses for Firewall-B and Router-B, but nothing on the link to Switch-1.
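The MAC-table comparison described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not a tool we ran: the switch names, port labels and MAC addresses are all made up, and the snapshot format is an assumption rather than any vendor's real output. The idea is to list, per switch and VLAN, the ports where a MAC should have been learned but wasn't.

```python
from collections import defaultdict

# Hypothetical MAC-table snapshot: (switch, vlan, mac, port).
# Every value here is invented for illustration only.
MAC_ENTRIES = [
    # VLAN 100 is healthy: each switch learns local and remote MACs.
    ("switch-1", 100, "02:00:00:00:01:0a", "fw-a"),
    ("switch-1", 100, "02:00:00:00:01:0b", "rtr-a"),
    ("switch-1", 100, "02:00:00:00:02:0a", "uplink"),
    ("switch-2", 100, "02:00:00:00:02:0a", "fw-b"),
    ("switch-2", 100, "02:00:00:00:02:0b", "rtr-b"),
    ("switch-2", 100, "02:00:00:00:01:0a", "uplink"),
    # VLAN 200: Switch-1 learns nothing, Switch-2 only learns locally.
    ("switch-2", 200, "02:00:00:00:02:0c", "fw-b"),
    ("switch-2", 200, "02:00:00:00:02:0d", "rtr-b"),
]

# Ports on which we expect to learn at least one MAC per VLAN (assumed).
EXPECTED_PORTS = {
    "switch-1": {"fw-a", "rtr-a", "uplink"},
    "switch-2": {"fw-b", "rtr-b", "uplink"},
}

VLANS = (100, 200)


def silent_ports(entries, expected_ports, vlans):
    """Return {(switch, vlan): sorted ports with no learned MAC}."""
    learned = defaultdict(set)
    for switch, vlan, _mac, port in entries:
        learned[(switch, vlan)].add(port)

    report = {}
    for switch, ports in expected_ports.items():
        for vlan in vlans:
            missing = sorted(ports - learned[(switch, vlan)])
            if missing:
                report[(switch, vlan)] = missing
    return report


if __name__ == "__main__":
    for (switch, vlan), ports in silent_ports(MAC_ENTRIES, EXPECTED_PORTS, VLANS).items():
        print(f"{switch} VLAN {vlan}: no MACs learned on {', '.join(ports)}")
```

With this made-up data the report matches what we saw: VLAN 100 is clean, Switch-1 has learned nothing at all on VLAN 200, and Switch-2 is only missing the link towards Switch-1. That points at Switch-1's handling of VLAN 200 rather than at any single port.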

Why didn’t it work?

Workaround: Move to a new VLAN

We re-configured the Firewall & Router sub-interfaces to use VLAN 300 instead. We added that VLAN to the associated trunk ports, and everything started working. Full connectivity restored. But VLAN 200? It seems to be cursed.
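When moving to a new VLAN like this, the easy mistake is to miss one trunk along the path. A minimal sketch of that sanity check, assuming you have already collected each trunk's allowed-VLAN list into a dictionary (the port names and lists below are hypothetical, not our real configuration):

```python
# Hypothetical allowed-VLAN lists per trunk port, keyed "switch:port".
# In this invented example one trunk was missed when VLAN 300 was added.
TRUNK_ALLOWED = {
    "switch-1:fw-a": {100, 200, 300},
    "switch-1:rtr-a": {100, 200, 300},
    "switch-1:uplink": {100, 200},
    "switch-2:fw-b": {100, 200, 300},
    "switch-2:rtr-b": {100, 200, 300},
    "switch-2:uplink": {100, 200, 300},
}


def trunks_missing_vlan(trunk_allowed, vlan):
    """Return the sorted trunk ports whose allowed list lacks the VLAN."""
    return sorted(
        port for port, allowed in trunk_allowed.items() if vlan not in allowed
    )


if __name__ == "__main__":
    for port in trunks_missing_vlan(TRUNK_ALLOWED, 300):
        print(f"VLAN 300 is not allowed on {port}")
```

An empty result for the new VLAN means every trunk in the dictionary carries it. It says nothing about whether the switch actually forwards frames on that VLAN, which is exactly the part that failed for VLAN 200.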

I still haven’t figured out what happened here. Anyone ever seen anything like this?