Friday 27 March 2015

NAT in a VRF on a 7200 (in GNS3)

An interesting aside here — I've been updating my GNS3 model of the University of Cambridge network to include a second connection to Janet (the UK Education and Research network).  We want to run the links and NAT boxes (which are ASA5580-20s) in active-active mode, so I've had to add those to the model too.

I can't be bothered trying to wrestle with a pair of ASAs in multi-context mode (which you need for active-active), so I looked at using NAT under IOS on a 7200 (the platform we use in GNS3 as the equivalent of our Catalyst 6500-Es).  I don't need stateful switchover, but I do want the basic service to fail over in the simulation.

This turns out to be very simple as the 7200s support NAT in VRFs and we can use HSRP to do the active-active balancing.

Background

[Figure: CUDN border and NAT from GNS3]
Our NAT service works by putting an ASA on a stick, attached to the border routers.  Some PBR (Policy-Based Routing) redirects traffic that comes from University-wide private addresses (RFC 1918 addresses we route internally and NAT when they leave) and is destined for the internet (via Janet) to the ASA.

This arrangement means that only traffic to be NATed needs to go through the ASAs: public IPv4 and all IPv6 traffic flows straight through, so we don't have to worry about handling it there.  This not only reduces load on the ASAs but also means we don't have to work out how to get things like multicast through them.

There are two ASAs operating as a pair, each handling roughly half of the private addresses.  This is done by putting half of the private addresses through one context, normally active on one box, and the other half through a second context, normally active on the other.  If either box fails, the surviving ASA takes over all the load.

The inside of the ASAs is on a /29 link subnet with static routing: the ASAs provide a redundant first-hop address for the router to redirect traffic to.  The outside is a /24 block providing a pool of public IP addresses to NAT behind; that subnet is effectively a regular client subnet.

NATed traffic coming back in goes to the public range, gets de-NATed by the ASAs and sent back into the network on the inside /29.

The router provides redundant first-hop gateways on both the inside and outside networks, although in practice there is only one router at present.

There are separate inside /29s and outside /24s for each half of the private addresses to be NATed.
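
For reference, here's a summary of the addressing used in the simulation (all taken from the configuration below):

  VLAN  purpose        subnet
  1981  nat-1 outside  131.111.184.0/24
  1982  nat-1 inside   193.60.92.32/29
  1983  nat-2 outside  131.111.185.0/24
  1984  nat-2 inside   193.60.92.40/29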

In the GNS3 simulation, we're going to replace the ASAs with IOS routers doing NAT.

Router configuration

The configuration of the routers doing the PBR is identical to the real production ones.

First, we create the interfaces linking to the NAT boxes.  This is the configuration for the first router, which is going to handle the outside range 131.111.184.0/24 by default (the second router handles 131.111.185.0/24, to balance the load - see the BGP configuration later):

interface Ethernet1/2.1981
 description nat-1-outside
 encapsulation dot1Q 1981
 ip address 131.111.184.253 255.255.255.0
 no ip proxy-arp
 standby version 2
 standby 81 ip 131.111.184.254
 standby 81 timers 1 3
 standby 81 priority 200
 standby 81 preempt
 standby 81 track 30 decrement 50
!
interface Ethernet1/2.1982
 description nat-1-inside
 encapsulation dot1Q 1982
 ip address 193.60.92.34 255.255.255.248
 no ip proxy-arp
 standby version 2
 standby 82 ip 193.60.92.33
 standby 82 timers 1 3
 standby 82 priority 200
 standby 82 preempt
 standby 82 track 30 decrement 50
!
interface Ethernet1/2.1983
 description nat-2-outside
 encapsulation dot1Q 1983
 ip address 131.111.185.253 255.255.255.0
 no ip proxy-arp
 standby version 2
 standby 83 ip 131.111.185.254
 standby 83 timers 1 3
 standby 83 priority 190
 standby 83 preempt
 standby 83 track 30 decrement 50
!
interface Ethernet1/2.1984
 description nat-2-inside
 encapsulation dot1Q 1984
 ip address 193.60.92.42 255.255.255.248
 no ip proxy-arp
 standby version 2
 standby 84 ip 193.60.92.41
 standby 84 timers 1 3
 standby 84 priority 190
 standby 84 preempt
 standby 84 track 30 decrement 50
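
The second router's interface configuration is the same, apart from its own addresses and the HSRP priorities, which are flipped so that each router is normally active for one half.  For example, its nat-2-outside interface would look something like this (a sketch - the second router's real interface address isn't shown in this post, so the .252 here is an assumption):

interface Ethernet1/2.1983
 description nat-2-outside
 encapsulation dot1Q 1983
 ip address 131.111.185.252 255.255.255.0
 no ip proxy-arp
 standby version 2
 standby 83 ip 131.111.185.254
 standby 83 timers 1 3
 standby 83 priority 200
 standby 83 preempt
 standby 83 track 30 decrement 50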

Then we create access lists to match the private addresses to be NATed - we use all of 172.16.0.0/12, except for 172.31.0.0/16 (don't ask!).  The first list matches 172.16.0.0/13 (172.16-172.23), the second covers 172.24-172.30, and the deny entries stop traffic destined for our own public ranges being redirected:

ip access-list extended nat-1_clients
 deny   ip any 128.232.0.0 0.0.255.255
 deny   ip any 129.169.0.0 0.0.255.255
 deny   ip any 131.111.0.0 0.0.255.255
 deny   ip any 192.18.195.0 0.0.0.255
 deny   ip any 193.60.80.0 0.0.15.255
 deny   ip any 193.63.252.0 0.0.1.255
 permit ip 172.16.0.0 0.7.255.255 any
 deny   ip any any
!
ip access-list extended nat-2_clients
 deny   ip any 128.232.0.0 0.0.255.255
 deny   ip any 129.169.0.0 0.0.255.255
 deny   ip any 131.111.0.0 0.0.255.255
 deny   ip any 192.18.195.0 0.0.0.255
 deny   ip any 193.60.80.0 0.0.15.255
 deny   ip any 193.63.252.0 0.0.1.255
 permit ip 172.24.0.0 0.3.255.255 any
 permit ip 172.28.0.0 0.1.255.255 any
 permit ip 172.30.0.0 0.0.255.255 any
 deny   ip any any

We then create the route-map to do PBR and redirect traffic across the /29 'inside' links - the next hops are the HSRP virtual addresses presented by the NAT routers on each inside subnet (see the NAT configuration below):

route-map nat_redirect permit 110
 match ip address nat-1_clients
 set ip next-hop 193.60.92.38
!
route-map nat_redirect permit 120
 match ip address nat-2_clients
 set ip next-hop 193.60.92.46

... and attach it to the inside interfaces (linking to the core routers):

interface Ethernet1/0
 description CORE-CENT
 ip policy route-map nat_redirect
!
interface Ethernet1/1
 description CORE-MILL
 ip policy route-map nat_redirect

I mentioned we were going to use this router to handle the outside range 131.111.184.0/24, with the other router handling 131.111.185.0/24.  To steer inbound traffic via this router, we want to advertise that prefix to Janet explicitly, so we need to add this to our BGP configuration (the /24 is directly connected on the nat-1-outside interface, so the network statement will match it):

router bgp 64602
 address-family ipv4 unicast
  network 131.111.184.0 mask 255.255.255.0
 exit-address-family
 !
 address-family ipv4 multicast
  network 131.111.184.0 mask 255.255.255.0
 exit-address-family

We also need to add that range to the outbound prefix list.  I've created a new prefix list under a router-specific name: I think it's best to keep lists identical across routers if their names are the same, and these are not identical, so I've given it a different name on each router:

ip prefix-list janetc-out_prefixes seq 5 permit 128.232.0.0/16
ip prefix-list janetc-out_prefixes seq 10 permit 129.169.0.0/16
ip prefix-list janetc-out_prefixes seq 15 permit 131.111.0.0/16
ip prefix-list janetc-out_prefixes seq 20 permit 192.18.195.0/24
ip prefix-list janetc-out_prefixes seq 25 permit 192.84.5.0/24
ip prefix-list janetc-out_prefixes seq 30 permit 192.153.213.0/24
ip prefix-list janetc-out_prefixes seq 35 permit 193.60.80.0/20
ip prefix-list janetc-out_prefixes seq 40 permit 193.63.252.0/23
ip prefix-list janetc-out_prefixes seq 45 permit 131.111.184.0/24
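
The list then gets applied outbound on the Janet peering in the usual way - something like the following, where the neighbour address is just a documentation placeholder (the real peering configuration isn't shown here):

router bgp 64602
 address-family ipv4 unicast
  neighbor 192.0.2.1 prefix-list janetc-out_prefixes out
 exit-address-family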

Finally, you may have spotted the track directive in the HSRP interface configuration above.  This causes HSRP to lower its priority if this router loses direct connectivity to Janet.  The 7200 can't track the metric of a BGP route (unlike a Catalyst 6500), so I've just made it track the interface connecting to Janet:

track 30 interface Ethernet1/3 ip routing

This causes the gateways on the outside and inside interfaces to be handled by the router which has the best connectivity to Janet.
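
Checking the tracking and HSRP state is just the usual show commands (output omitted here):

show track 30
show standby brief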

NAT configuration

The 7200 supports NAT in VRFs.  While we don't strictly need to use them, they are a nice way of keeping the routing tables for the two NAT contexts separate (otherwise it wouldn't be clear which outside gateway was going to be used as the default route back to the router to go on to Janet).

First, create the VRFs:

vrf definition nat-1_vrf
 rd 64602:1981
 address-family ipv4
 exit-address-family
!
vrf definition nat-2_vrf
 rd 64602:1983
 address-family ipv4
 exit-address-family

Then some access lists to identify the clients to be NATed:

ip access-list standard nat-1-clients
 permit 172.16.0.0 0.7.255.255
!
ip access-list standard nat-2-clients
 permit 172.24.0.0 0.3.255.255
 permit 172.28.0.0 0.1.255.255
 permit 172.30.0.0 0.0.255.255

Create the two NAT pools and mappings (note the 'vrf' option and the 'match-in-vrf' keyword, which is essential when NATing within a VRF).  Each pool is a single address with 'overload', i.e. PAT behind one public address per half:

ip nat pool nat-1c-pool 131.111.184.1 131.111.184.1 netmask 255.255.255.0
ip nat inside source list nat-1-clients pool nat-1c-pool vrf nat-1_vrf match-in-vrf overload
!
ip nat pool nat-2c-pool 131.111.185.1 131.111.185.1 netmask 255.255.255.0
ip nat inside source list nat-2-clients pool nat-2c-pool vrf nat-2_vrf match-in-vrf overload

Then configure the outside and inside interfaces, putting them in the appropriate VRFs and configuring NAT.  Note that we need HSRP (with different priorities across the two contexts, to balance load) on the inside, as that address is used as a static route destination by the connecting router.  However, we don't need it on the outside, as we're using different outside addresses on each router (131.111.184.2 and 131.111.185.2, respectively):

interface Ethernet1/0.1981
 description nat-1-outside
 vrf forwarding nat-1_vrf
 encapsulation dot1Q 1981
 ip address 131.111.184.250 255.255.255.0
 no ip proxy-arp
 ip nat outside
!
interface Ethernet1/0.1982
 description nat-1-inside
 vrf forwarding nat-1_vrf
 encapsulation dot1Q 1982
 ip address 193.60.92.37 255.255.255.248
 no ip proxy-arp
 ip nat inside
 standby version 2
 standby 162 ip 193.60.92.38
 standby 162 timers 1 3
 standby 162 priority 200
 standby 162 preempt
!
interface Ethernet1/0.1983
 description nat-2-outside
 vrf forwarding nat-2_vrf
 encapsulation dot1Q 1983
 ip address 131.111.185.250 255.255.255.0
 no ip proxy-arp
 ip nat outside
!
interface Ethernet1/0.1984
 description nat-2-inside
 vrf forwarding nat-2_vrf
 encapsulation dot1Q 1984
 ip address 193.60.92.45 255.255.255.248
 no ip proxy-arp
 ip nat inside
 standby version 2
 standby 164 ip 193.60.92.46
 standby 164 timers 1 3
 standby 164 priority 190
 standby 164 preempt

Finally, a bit of static routing for the inside and outside destinations, across both VRFs - the default routes point at the outside HSRP addresses on the border routers, and the private ranges route back via the inside HSRP addresses:

ip route vrf nat-1_vrf 0.0.0.0 0.0.0.0 Ethernet1/0.1981 131.111.184.254
ip route vrf nat-1_vrf 172.16.0.0 255.248.0.0 Ethernet1/0.1982 193.60.92.33
ip route vrf nat-1_vrf 172.24.0.0 255.252.0.0 Ethernet1/0.1982 193.60.92.33
ip route vrf nat-1_vrf 172.28.0.0 255.254.0.0 Ethernet1/0.1982 193.60.92.33
ip route vrf nat-1_vrf 172.30.0.0 255.255.0.0 Ethernet1/0.1982 193.60.92.33
!
ip route vrf nat-2_vrf 0.0.0.0 0.0.0.0 Ethernet1/0.1983 131.111.185.254
ip route vrf nat-2_vrf 172.16.0.0 255.248.0.0 Ethernet1/0.1984 193.60.92.41
ip route vrf nat-2_vrf 172.24.0.0 255.252.0.0 Ethernet1/0.1984 193.60.92.41
ip route vrf nat-2_vrf 172.28.0.0 255.254.0.0 Ethernet1/0.1984 193.60.92.41
ip route vrf nat-2_vrf 172.30.0.0 255.255.0.0 Ethernet1/0.1984 193.60.92.41

And that's all there is to it!

This doesn't give stateful failover (existing translations aren't preserved), but my simulation only needs to pass traceroutes and pings, so that seems a lot of unnecessary work - especially as it's done completely differently on the ASAs anyway.
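
To check the translations are actually happening, something like the following on the NAT router does the job, e.g. after pinging out from a client in 172.16.0.0/13 (commands only - output omitted):

show ip nat translations vrf nat-1_vrf
show ip route vrf nat-1_vrf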

Friday 6 March 2015

Nexus management port not sending IGMP Membership Reports

OK - I've spent a day getting annoyed by this!  I was trying to get two Nexus 56128Ps (running NX-OS 7.0(3)) to synchronise configuration with switch-profiles, using CFS across their management interfaces.

I had the Mgmt0 interfaces connected to a Cisco 2960 as access ports, with no other connections in that VLAN and everything worked fine.

However, when I tried to connect an uplink from the 2960 to our main network (on a test VLAN), synchronisation broke, with show switch-profile status reporting that the peer is unreachable.  Disconnecting the cable immediately fixed the problem again.

The problem

After a lot of mucking about, this turned out to be an IGMP issue - the Management0 ports on the Nexus switches advertise their presence to each other using multicast messages to a group (239.255.70.83).  However, they weren't sending IGMP Membership Report messages to indicate that they themselves want to join the group, preventing the announcements from reaching each other.

When the switch was not connected to the rest of the network, there was no IGMP querier, so the 2960 resorted to flooding multicast traffic.  However, when connected to the main network, the IGMP Membership Query messages from the router started reaching the 2960 and it began to limit flooding to ports that had sent Reports.

Pulling the uplink cable from the 2960 immediately aged out the querier and flooding recommenced.  However, if the VLAN was severed in a way not visible to the 2960 (e.g. removing the VLAN from the upstream switch), the querier took 3 minutes to expire (as expected) before things began to work again.

After some poking about and fiddling with the configuration of the router, it appears that the management interface supports IGMPv2 but not IGMPv3 (which is our default).
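
Incidentally, the 2960 makes this fairly easy to watch while testing - the snooping state can be inspected with the following (the VLAN number is just an example):

show ip igmp snooping querier vlan 100
show ip igmp snooping groups vlan 100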

The fix

This could be fixed in one of three ways:
  • Disabling multicast routing on the VLAN,
  • Changing the IGMP version on the router back to 2 (instead of 3), if it has been raised, or
  • Disabling IGMP Snooping on the switches on the management VLAN (e.g. no ip igmp snooping vlan XXX)
I can't find mention of this in the Cisco documentation, nor a way of changing the IGMP version on the Nexus 56128Ps.
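
For reference, the second and third options look something like this in IOS (VLAN 100 is a placeholder for the management VLAN; the first option would just mean removing PIM from the SVI):

! on the querying router - drop the SVI back to IGMPv2
interface Vlan100
 ip igmp version 2
!
! ... or on the 2960 - stop snooping on the management VLAN
no ip igmp snooping vlan 100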

Wednesday 4 March 2015

FEXs on a Nexus 7010

I've been considering how to deal with connecting our Aruba wireless controllers when we switch over to the Nexus stuff - they're currently linked directly to our existing Catalyst 6509-Es in the data centre, so we need to do something, even in the interim.  One solution that looks sensible is FEXs attached directly to the Nexus 7010s, so I thought I'd give that a quick try, as so far I've only done this on the 56128Ps.

The first difference is that the FEX functionality must be installed before it can be activated as a feature - similar to MPLS.  The install must be done in the admin VDC; the feature set can then be activated in the local VDC:

n7k-top# conf t
Enter configuration commands, one per line.  End with CNTL/Z.
n7k-top(config)# install feature-set fex
n7k-top(config)# end
n7k-top# switchto vdc srv
...
n7k-top-srv# conf t
Enter configuration commands, one per line.  End with CNTL/Z.
n7k-top-srv(config)# feature-set fex

From then on, things work pretty much as they do on the 56128Ps.
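
For example, once a fabric port has been associated (as below), the FEX host interfaces appear and can be configured like any other port - a quick sketch, with a hypothetical access VLAN:

n7k-top-srv(config)# interface Ethernet102/1/1
n7k-top-srv(config-if)# switchport access vlan 100
n7k-top-srv(config-if)# no shutdown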

However, one important difference is that it doesn't seem to be possible to use vPC on a FEX fabric port, unlike on a 56128P, at least on NX-OS 6.2:

n7k-top-srv(config)# int e1/24
n7k-top-srv(config-if)# channel-group 102

n7k-top-srv(config)# int po102
n7k-top-srv(config-if)# switchport mode fex-fabric 
n7k-top-srv(config-if)# fex associate 102
n7k-top-srv(config-if)# vpc 102 
ERROR: Operation failed: [FEX interface is not capable of acting as vPC] 

This means it's not possible to dual-attach a FEX to two 7010s.  However, it is possible to create a vPC across two FEXs attached to 7Ks which are themselves acting as a vPC pair: any devices wanting redundancy will need to attach to two FEXs.
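
A sketch of what that host-side vPC looks like, assuming FEX 101 single-homed to one 7010 and FEX 102 to the other (VLAN and channel numbers are made up) - the same configuration goes on each 7010, referencing its own FEX number:

interface Ethernet101/1/1
 description dual-attached server
 channel-group 200 mode active
!
interface port-channel200
 switchport
 switchport access vlan 100
 vpc 200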