Cisco
ASA Upgrade to 9.0(2)
“In theory there is no difference between theory and practice. In practice there is.” Yogi Berra
Usually, no matter how much planning, testing, thinking about, or stalling you build into an upgrade project, the bill comes due at the end and isn’t what you expected. Maybe someone ordered the lobster, or multiple drams of Johnnie Walker Blue, but either way you have a situation to deal with… right now. How you deal with it determines many things about your skills and your character, but that’s for another post as I’m going to try my best to keep this one short and to the point.
I recently had the wonderful experience of upgrading an Active/Passive failover pair of ASA units to the newest of the new code, 9.0(2) from 8.4.2(8). After the 8.3 kerfuffle (NAT changes automagically, anyone?) I was particularly keen to not miss any possible gotchas in this upgrade. I also scheduled a larger than usual maintenance window–even though we didn’t expect any downtime–just in case.
I should interject here, for those of you aghast at the thought that we would possibly implement new, some would say, bleeding edge code on production systems, a couple of things:
- We always run bleeding edge code, usually because we have need of the bleeding edge features the code brings. In this case, IPv6 features sorely lacking in prior code versions
- We have adopted a very aggressive IPv6 stance, as I have written about before, and we tend to find our aspirations and designs are well out in front of significant portions of the code available for our equipment
- Noting the prior two items again, I’ll also add that Firewalls, in particular, seem to have code that is months or years behind the route/switch world. That holds true across all vendor platforms. Why? I don’t know, but that’s another post.
With our need-for-upgrade bona fides established, I dutifully read the entire Release notes for the 9.0(x) ASA code and while I was excited at many of the new features–mostly around IPv6–and disappointed at others (No OSPFv3 address families? Really?) something immediately jumped out at me: Page 19 and its disturbing title, “ACL Migration in Version 9.0”
Any time you see the word “migration” in any documentation referring to an upgrade of production code or configuration, you know two things:
- It probably happens auto-magically, which is basically a synonym for “we’re going to bork your code but we’re only going to loosely tell you where, how, and why.”
- You’d better have good backups and be prepared, because a roll-back is likely going to be as painful as just plowing ahead.
To summarize the boring details for you, prior to the new code you had two categories of Access Control Lists (ACLs): those for IPv4 and those for IPv6. Inside each of those macro-levels you had the normal standard and extended lists and whatever other features. You applied the IPv4 and the IPv6 access-lists to the interface in whatever direction and that was that. True to the dual-stack model, you really were running two parallel networks and never the ‘twain shall meet.
During and after the upgrade to 9.0(x) ASA code, a couple of things happen:
- IPv6 standard ACLs are no longer supported, and any you have are migrated to extended ACLs.
- If IPv4 and IPv6 ACLs are applied in the same direction, on the same interface they are merged.
- The new keywords any4 and any6 are added in place of the old any keyword.
- Supposedly, if certain conditions are met (and they were in my case) your IPv4 and IPv6 ACLs should be merged into one (they were not).
While it is a bit scary to have any vendor automagically migrating portions of your configuration to a new format, it happens and as long as they document well and you do your due diligence, things can work out just fine. Other times they completely go to hell because of an undocumented feature. This upgrade fell somewhere in the middle.
As it turns out, a critical fact was left out of the documentation. Namely, that all of your access-groups that had been applied in some direction or another would now, quite frankly, not be applied to anything. In other words, my firewalls were now letting anything out of the network and nothing in. I quickly applied my new access-lists to the interfaces a couple of times before I discovered that you can now only have one applied in any direction (par for most IOS devices).
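One check I could have scripted ahead of time is a comparison of the access-group bindings in the saved pre-upgrade configuration against the running configuration after the upgrade. The following is a minimal sketch, not anything Cisco ships: the function names are my own, it assumes both configurations are available as plain text, and it only looks at the per-interface form of the access-group command.

```python
import re

# Matches ASA lines such as: access-group OUTSIDE_IN in interface outside
ACCESS_GROUP_RE = re.compile(
    r"^access-group\s+(\S+)\s+(in|out)\s+interface\s+(\S+)\s*$", re.MULTILINE
)


def access_group_bindings(config_text):
    """Return {(interface, direction): [acl names]} found in a saved config."""
    bindings = {}
    for acl, direction, interface in ACCESS_GROUP_RE.findall(config_text):
        bindings.setdefault((interface, direction), []).append(acl)
    return bindings


def missing_bindings(before_text, after_text):
    """List (interface, direction) pairs that were bound before but not after."""
    before = access_group_bindings(before_text)
    after = access_group_bindings(after_text)
    return sorted(key for key in before if key not in after)
```

Feed it the backup taken before the upgrade and a fresh copy of the running configuration; any pair it returns is an interface and direction that no longer has an ACL applied.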
Since these were production and I had some higher risk on the IPv4 side (we have a lot of rules, and a default-block outbound policy) than the IPv6 side, I did the following:
- I blocked IPv6 in and out, then applied the IPv4 lists to the interfaces in the correct directions.
- I hand migrated (notepad is your friend) the IPv6 access rules into the IPv4 lists and brought IPv6 access back online.
- I then deleted the redundant (old) ACLs.
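The hand migration in the second step is mechanical enough to sketch in a few lines. This is an illustration under my own assumptions rather than the procedure I actually ran: it assumes each old rule is a single "ipv6 access-list NAME ..." line without a line-number argument, that the target is an existing extended ACL, and that a bare any in an IPv6 rule should become any6. The function name and ACL names are hypothetical.

```python
def migrate_ipv6_rule(line, target_acl):
    """Rewrite one pre-9.0 'ipv6 access-list' rule as a unified ACL entry."""
    tokens = line.split()
    if len(tokens) < 4 or tokens[:2] != ["ipv6", "access-list"]:
        raise ValueError(f"not an ipv6 access-list rule: {line!r}")
    body = tokens[3:]  # tokens[2] is the old ACL name, which is dropped
    if body[0] == "remark":
        # Remarks carry free text, so leave their wording alone.
        return " ".join(["access-list", target_acl] + body)
    # In a unified ACL the IPv6 wildcard is spelled any6 instead of any.
    body = ["any6" if token == "any" else token for token in body]
    return " ".join(["access-list", target_acl, "extended"] + body)
```

Review the output by eye before pasting it into a firewall; rule order still matters once the IPv4 and IPv6 entries share one list.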
Everything came back, life was good, mostly nobody noticed anything. What are the lessons learned from this experience? Besides don’t upgrade ASAs? How about these:
- Always have a backup of your configuration, preferably taken a few minutes before you start the upgrade. In this case I didn’t use the backups for more than a reference, but they were available if I had wanted to roll back.
- Know your configuration and your devices. This seems intuitive, but a lot of people would have gotten part way through this migration, seen that their ACLs were borked, and been lost. If you’re going to live on the edge, at least have a helmet.
- Read the documentation. I did, and while it didn’t directly help, I at least knew ahead of time what was likely to break. I also knew once it broke what the likely problem area was. To tie this into the CCIE Lab (back to studying, so it’s on my mind) it’s a bit like being able to look at a network diagram and instinctively know where you’ll have problems (two routers doing redistribution between EIGRP and OSPF, check).
At the end of the day, it all worked out for a variety of reasons listed above. Would I suggest any readers out there try this sort of “no net” upgrade to bleeding edge code? Probably not. In my case, I’m a masochist it seems, and this is my therapy. Now on to my 6500 upgrade to 15.1(1)SY. I’m sure I’ll be writing about that not long from today.
Policy-based VPN between Juniper SRX and Cisco ASA
One of the things that I am called upon to do fairly often in my current role is to configure remote access VPN devices for some site or another. Often these sites are transient in nature, only staying active for weeks or months at the longest before disappearing forever–or at least until I don’t care any more.
Because I spend a fair amount of time setting these VPN tunnels up, I have gotten fairly good at the ins and outs of IPsec VPN tunnel configuration and troubleshooting. I was even beginning to think of myself as a bit of a VPN whisperer. That was about to change, however.
We use Cisco’s ASA product line everywhere for this type of thing, and as many of you no doubt know, if you use one vendor’s product for VPN tunnels you are generally in a good position. You’ll likely still have problems from time to time–just enough to keep you honest–but it works itself out. This is the model we’d been following for several years. Until now.
For reasons of only passing interest here, we made the decision to start exploring the use of Juniper’s SRX product line for our remote, transient tunnel destinations and smaller offices. We were not–and are not–prepared to rip-and-replace our entire ASA installed base, though, so these Juniper devices would have to integrate with the existing infrastructure. This, as they say, is when the proverbial excrement hit the fan.
As anyone who deals in these things knows, mixing vendors on either side of a VPN tunnel is generally a recipe for trouble. Depending on vendors, your troubles range from the “few drinks after work” type to the “move to Mexico and open a beach bar” kind. I had heard from many people that I trust that the SRX to ASA configuration was this latter type.
Juniper SRX devices prefer a type of VPN tunnel known as a route-based VPN. I’m not going to go into specifics here, but suffice it to say it’s a technique that makes sense and a lot of vendors work this way. Cisco’s ASA, on the other hand, prefers a type of VPN tunnel known as policy-based. Policy-based VPNs have some limitations and seem to be favored mostly by Cisco and anyone who wants to integrate with Cisco. I had to make things work without changing much on the HQ side (where the Corporate ASA units sit) outside of what we normally do, so that meant that the SRX at the remote site needed to be configured in a policy-based way.
Bring on the pain.
The process I followed went something like this:
- Spend a couple of days learning the Juniper SRX syntax. This part was actually kind of fun.
- Spend 5 minutes configuring new tunnel on corporate ASA.
- Spend 3+ days trying to get Juniper to talk to ASA. Spend only slightly more time configuring than banging head on desk.
- Spend 5.5 hours spread over two more days with JTAC. Bang head more while they can’t figure it out either.
- Go home, relax, have eureka moment, race back to office, make one-line change, FIX EVERYTHING!
- Go back home and drink.
As it turns out, the problem can best be described as something that every Cisco Engineer learns in infancy: Cisco is consistently inconsistent. For example, when the ASA refers to SHA, do you suppose it’s referring to the old SHA‑0 or the SHA‑1 that corrected a flaw in the old SHA‑0? Dunno. Since they, in some places, only say “SHA” and in others “SHA‑1” it’s anybody’s guess, really.
The main thing I re-learned from this experience is that the defaults–the little timers, identities, and various other little bits–that make up a successful negotiation for an IPsec tunnel are different across platforms. Sometimes you really have to search to find what they’re called. Sometimes you have to bang your head on the desk for a few days.
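One way to take some of the guesswork out of this is to write both sides' phase 1 settings down in a vendor-neutral form and compare them. Here is a minimal sketch of that idea. The values are copied from the sanitized configurations in this post, the keyword mapping is my own, and treating the ASA's "sha" as SHA-1 is an assumption, which is rather the point of the complaint above.

```python
# Per-field mapping from vendor keyword to one canonical spelling.
# Assumption: ASA "sha" means SHA-1 and ASA "aes-256" means AES-256-CBC.
CANONICAL = {
    "authentication": {"pre-share": "psk", "pre-shared-keys": "psk"},
    "encryption": {"aes-256": "aes-256-cbc"},
    "hash": {"sha": "sha1"},
    "dh_group": {"2": "group2"},
}

ASA_IKEV1 = {"authentication": "pre-share", "encryption": "aes-256",
             "hash": "sha", "dh_group": "2", "lifetime": 28800, "mode": "main"}
SRX_IKE = {"authentication": "pre-shared-keys", "encryption": "aes-256-cbc",
           "hash": "sha1", "dh_group": "group2", "lifetime": 28800, "mode": "main"}


def normalize(params):
    """Translate each value to its canonical spelling (as a string)."""
    return {field: CANONICAL.get(field, {}).get(str(value), str(value))
            for field, value in params.items()}


def phase1_mismatches(side_a, side_b):
    """Return {field: (a_value, b_value)} for every field that differs."""
    a, b = normalize(side_a), normalize(side_b)
    return {field: (a.get(field), b.get(field))
            for field in sorted(set(a) | set(b))
            if a.get(field) != b.get(field)}
```

An empty result only says the settings you wrote down agree. It cannot see defaults that never appear in either configuration, and those are exactly the ones that bit me.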
One more thing: this post was typed up quickly, with no editing, and way too much coffee. If I’ve overlooked things, gotten things wrong, or just confused you more, I apologize and I’ll try to come back and clean it up later. In the meantime, hopefully the diagram and configurations below will help you in your own quest to get a policy-based VPN configured between the SRX and ASA.
Oh, and as soon as I recover from this experience I’ll build out a configuration to do a route-based example, which looks to only require a couple of changes on the SRX side and a tad more tweaking perhaps on the ASA side.
ASA CONFIGURATION (Sanitized):
crypto map outside_map 1 match address outside_cryptomap_3
crypto map outside_map 1 set connection-type bi-directional
crypto map outside_map 1 set peer 5.5.5.5
crypto map outside_map 1 set ikev1 phase1-mode main
crypto map outside_map 1 set ikev1 transform-set ESP-AES-256-SHA
crypto map outside_map 1 set reverse-route
crypto map outside_map interface outside

crypto ipsec ikev1 transform-set ESP-AES-256-SHA esp-aes-256 esp-sha-hmac
crypto ipsec security-association lifetime seconds 28800
crypto ipsec security-association lifetime kilobytes 4608000
crypto ipsec security-association replay window-size 64
crypto ipsec fragmentation before-encryption outside
crypto ipsec fragmentation before-encryption inside
crypto ipsec df-bit copy-df outside
crypto ipsec df-bit copy-df inside

crypto ikev1 enable outside
crypto ikev1 policy 4
 authentication pre-share
 encryption aes-256
 hash sha
 group 2
 lifetime 28800

nat (inside,outside) source static CISCO_NETWORK CISCO_NETWORK destination static JUNIPER_NETWORK JUNIPER_NETWORK no-proxy-arp

object network CISCO_NETWORK
 subnet 10.0.0.0 255.255.0.0
 description Cisco Network

object network JUNIPER_NETWORK
 subnet 10.7.24.0 255.255.255.0
 description Juniper Network
JUNIPER CONFIGURATION (Sanitized):
proposal ike-policy1 {
    authentication-method pre-shared-keys;
    dh-group group2;
    authentication-algorithm sha1;
    encryption-algorithm aes-256-cbc;
    lifetime-seconds 28800;
}
policy ike-policy1 {
    mode main;
    proposals ike-policy1;
    pre-shared-key ascii-text "$%#@!!!"; ## SECRET-DATA
}
gateway ike-gate {
    ike-policy ike-policy1;
    address 5.5.5.5;
    local-identity inet 6.6.6.6;
    external-interface fe-0/0/0;
}

proposal IPSEC_PROPOSAL {
    protocol esp;
    authentication-algorithm hmac-sha1-96;
    encryption-algorithm aes-256-cbc;
    lifetime-seconds 28800;
}
policy IPSEC_POLICY {
    proposals IPSEC_PROPOSAL;
}
vpn ike-vpn {
    ike {
        gateway ike-gate;
        inactive: proxy-identity {
            local 10.7.24.0/24;
            remote 10.0.0.0/16;
            service any;
        }
        ipsec-policy IPSEC_POLICY;
    }
    establish-tunnels immediately;
}

source {
    rule-set trust-to-untrust {
        from zone trust;
        to zone untrust;
        rule nonat {
            match {
                source-address 10.7.24.0/24;
                destination-address 10.0.0.0/16;
            }
            then {
                source-nat {
                    off;
                }
            }
        }
        rule ZZZ {
            match {
                source-address 0.0.0.0/0;
                destination-address 0.0.0.0/0;
            }
            then {
                source-nat {
                    interface;
                }
            }
        }
    }
}
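With a policy-based tunnel, the protected networks on one side have to be the mirror image of the other side's, and the two platforms write them differently (dotted netmask on the ASA, prefix length on the SRX). A minimal sketch of that check follows, using the subnets from the configurations above. The helper names are my own, and I am assuming the crypto ACL outside_cryptomap_3, which is not shown, matches CISCO_NETWORK to JUNIPER_NETWORK.

```python
import ipaddress


def asa_subnet(address, netmask):
    """Convert an ASA 'subnet ADDRESS NETMASK' pair to a network object."""
    return ipaddress.ip_network(f"{address}/{netmask}")


def selectors_mirror(side_a, side_b):
    """True if side A's (local, remote) pair equals side B's (remote, local)."""
    a_local, a_remote = side_a
    b_local, b_remote = side_b
    return a_local == b_remote and a_remote == b_local


# ASA side: local is CISCO_NETWORK, remote is JUNIPER_NETWORK.
ASA_SELECTORS = (asa_subnet("10.0.0.0", "255.255.0.0"),
                 asa_subnet("10.7.24.0", "255.255.255.0"))
# SRX side: local and remote as written in the proxy-identity stanza.
SRX_SELECTORS = (ipaddress.ip_network("10.7.24.0/24"),
                 ipaddress.ip_network("10.0.0.0/16"))
```

Calling selectors_mirror(ASA_SELECTORS, SRX_SELECTORS) compares the two pairs after both have been converted to the same representation, so a /16 written as 255.255.0.0 on one side and /16 on the other is not flagged.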
Windows 7 and IPv6
One of the projects that I began early last year is an IPv6 rollout across our entire worldwide network. During the initial heady days of the project, I managed to get the infrastructure largely configured in a dual-stack arrangement. Then something came up. I can’t remember exactly what, at this point, but something did, and so the project sat. And sat. And sat some more.
Earlier this week, I reviewed the appalling (lack of) progress so far and began moving forward. I had already completed the acquisition of Provider Independent Address Space (PI), getting our upstream providers to advertise the routes, and the setting up of 90% of the infrastructure (SVI addressing, the different IPv6 settings like IPv6 CEF and unicast-routing, etc.). Everything here worked as expected.
I decided to start with the infrastructure more related to the client side, including servers of interest (DNS, AD, mail, etc.) to the inside, and the end-user workstations. It should be noted here that we had IPv6 turned off on all workstations and servers, despite Microsoft’s best practices which state that IPv6 should be left on because several services will break with it off. Since these services (Homegroups, for one) don’t really apply in a corporate environment, I didn’t care too much what their best practices recommended. The first step was to turn IPv6 on for a few test machines in the IT VLAN, and a couple of the Active Directory servers.
The good news is that the servers came up and were happy as soon as they got their addresses. We’re going with a scheme of static assignment for servers, infrastructure management addresses, etc., so all I really had to do was click the checkbox for IPv6 (these are Windows 2008 R2 servers), assign address and gateway, and voila, we had reachability. Next, I fixed up some DNS settings, added some AAAA records (the IPv6 equivalent of IPv4 A records), and we had DNS resolution on IPv6 working as well.
Windows 7 wouldn’t prove to be quite as easy, for a few reasons. While the checkboxes and settings are ostensibly the same, some of the back-end code is obviously different and needs some non-obvious tweaking to get right. I couldn’t for the life of me figure out why I had no reachability, when the configuration should actually be easy as pie (stateless auto configuration–just “turn on” IPv6 and walk away). In fact, I ended up testing my MacBook Pro and a Linux machine, and both of them worked as expected right out of the gate, so I knew this was a Windows 7 problem.
The first thing that occurred to me after testing the other non-Windows machines was domain policies. We have a lot of them, and you never know what might be causing issues. I had one of my Server guys check that, and then check a non-domain Windows 7 machine with the same image (Windows 7 Enterprise), and he came up with no applicable policy problems and the same behavior.
Finally, after spending too much time with Google, I managed to piece together an idea of what was happening. As it turns out, the mechanism by which Windows 7 puts together its interface identifiers is to blame. I found the applicable information in a posting from February, 2010 on a site called itexpertvoice.com, linked to here. It probably won’t surprise anyone to find out that Microsoft screwed up implementing something they helped create (RFC 4941). The command you’ll want to run, from an elevated command prompt (or GPO, SCCM, etc.) is:
netsh interface ipv6 set global randomizeidentifiers=disabled
This takes you back to behavior supported by standards-compliant hardware, and allows your Windows 7 machine to actually use IPv6.
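As I understand it, with randomization disabled the interface identifier falls back to the modified EUI-64 form derived from the adapter's MAC address (RFC 4291, Appendix A). The sketch below shows that derivation so you can predict the address a workstation should end up with. The function names, the documentation prefix, and the MAC address are made up for illustration.

```python
import ipaddress


def eui64_interface_id(mac):
    """Build the 8-byte modified EUI-64 identifier from a 48-bit MAC."""
    octets = bytearray(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected a 48-bit MAC address, got {mac!r}")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the vendor half and the device half.
    return bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])


def slaac_address(prefix, mac):
    """Combine a /64 prefix with the EUI-64 identifier for the given MAC."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("stateless autoconfiguration expects a /64 prefix")
    interface_id = int.from_bytes(eui64_interface_id(mac), "big")
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)
```

For example, slaac_address("2001:db8:0:1ff::/64", "00:1a:2b:3c:4d:5e") yields the address whose last 64 bits are 021a:2bff:fe3c:4d5e.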
Another command you may want to use, and this is strictly a personal and policy choice, is:
netsh interface ipv6 set privacy disabled
This turns off privacy addressing, an idea also described in RFC 4941. Given the globally unique, routable nature of IPv6, I can see where some people might get squirrelly, but I just don’t see the need. Again, this is something you’ll want to review for your own environment.
Another interesting problem I found, and one more I had to solve before my Windows 7 machines were ready to be released back into the wild, revolves around voice. Specifically, voice VLANs. In a voice-enabled network using Cisco IP telephony, each access port on a switch typically belongs to two VLANs: a voice VLAN and a data VLAN. You plug a phone in, your computer plugs in to a pass-through port on the phone, and each auto-magically gets an IP address in the correct VLAN. Except with IPv6.
I should say, however, it didn’t work that way in my environment. What happened here is that my workstation would actually get an IP address in both the data and the voice VLANs. And, due to the way prefix policy works (more on that in a minute), would use the voice VLAN as the source address in traffic originating from the workstation. The way I solved the problem for now is to turn off IPv6 and take away the IPv6 addressing from all of the voice VLANs (specifically, on the SVIs), since I haven’t gotten to the voice part of the IPv6 project anyway.
I don’t know what’s causing the workstation to get assigned a voice address, but I’m going to continue to look into it. More than likely it has to do with Stateless Auto configuration and ND, but time will tell. In the meantime, what’s this prefix policy thing I mentioned above? Read about it in RFC 3484.
Put simply, IPv6 prefix policy exists because any given interface in IPv6 is going to have several addresses; everything from Unicast to link-local, multicast, and maybe more. So, if I’m a workstation and I want to source traffic outbound on a given interface, which address do I use? Prefix policy helps to solve that problem by “ranking” different addresses on an interface in a standards-specific way.
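To make the "ranking" idea concrete, here is a small sketch of the policy table lookup, using the default table from RFC 3484 section 2.1, plus one of the source selection rules (rule 6, prefer a source whose label matches the destination's). This is a simplification for illustration, not a model of the Windows implementation: the real algorithm applies several other rules first, and the function names are my own.

```python
import ipaddress

# Default policy table from RFC 3484, section 2.1: (prefix, precedence, label).
DEFAULT_POLICY = [
    ("::1/128", 50, 0),
    ("::/0", 40, 1),
    ("2002::/16", 30, 2),
    ("::/96", 20, 3),
    ("::ffff:0:0/96", 10, 4),
]


def lookup(address, table=DEFAULT_POLICY):
    """Longest-prefix match in the policy table; returns (precedence, label)."""
    addr = ipaddress.IPv6Address(address)
    best = None
    for prefix, precedence, label in table:
        network = ipaddress.IPv6Network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0]):
            best = (network.prefixlen, precedence, label)
    if best is None:
        raise LookupError(f"no policy entry matches {address}")
    return best[1], best[2]


def prefer_matching_label(candidates, destination, table=DEFAULT_POLICY):
    """Rule 6 only: pick the first candidate whose label matches the destination."""
    destination_label = lookup(destination, table)[1]
    for candidate in candidates:
        if lookup(candidate, table)[1] == destination_label:
            return candidate
    return candidates[0]  # no label match, so fall back to the first candidate
```

With the default table, a 6to4 destination (2002::/16) gets label 2, so a host holding both a native and a 6to4 address would source from the 6to4 one when talking to it.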
After reading the RFC, open up an elevated command prompt on your Windows 7 machine and enter the following command to see how your machine is set up:
netsh int ipv6 show prefixpolicies
and you’ll get something that looks like so:
You can manipulate the order of things here, and add entries if needed, using the following two commands:
netsh int ipv6 add prefixpolicy prefix=d2a:d00c:b678:ecfb::1/128 precedence=2 label=22
netsh int ipv6 add prefixpolicy prefix=d2a:d00c:b678::/48 precedence=2 label=22
This gives you a matching prefix pair, and says that for all traffic destined for d2a:d00c:b678::/48 network, use the d2a:d00c:b678:ecfb::1/128 address on the interface as the source. In my case, my voice VLAN (applicable hextet: 107) was showing up in the interface prefix list before my data VLAN (applicable hextet: 1ff). I could have used the example above to set the correct precedence, but this isn’t exactly scalable to thousands of phones and workstations, all of which may have a different mix of addresses.
So, we still have the voice VLAN issue to solve, but the Windows 7 machines can now talk comfortably on the network to other workstations and servers, including using SSH, DNS, RDP, and all other major services. A lot of software still isn’t IPv6 compatible, but that’s why we’re in a dual-stack world for the foreseeable future.
Next challenge, if anyone has suggestions? Working around some of the feature non-parity issues mostly in the area of security. For instance, the ASA line as of last check still doesn’t support OSPFv3 or failover via IPv6, and no tools exist that I’m aware of for mapping complex security policy from IPv4 to IPv6, which is why our IPv6 world here still remains sequestered to just our network.
Ideally we’ll have outside traffic allowed and all of our public-facing services running in time for World IPv6 day this year, which happens on June 6th. Mark your calendars!
Cisco Live Sunday Labtorial
This post is late in coming, considering that I’ve been back from Cisco Live for a good couple of weeks now. Nevertheless I’m posting it now, so hopefully someone finds the information useful.
Without going into the details of the entire Cisco Live experience, I’d just like to talk about the class I took on the first Sunday of the show–or the day before the show officially starts, depending on who you talk to.
On Sunday I attended a full-day mock CCIE R&S lab (Session LTRCCIE-3001). This was an instructor-led affair, with Bruce Pinsky (Distinguished Engineer) and Bruno van de Werve (Product Manager) acting as facilitators and proctors. Considering Bruno’s experience as both a proctor for the actual R&S lab, and now the head of the R&S program, this was an experience well-worth having if only for the ability to ask questions.
Unfortunately for all of us, and through no fault of either Bruce or Bruno, the in-class network was crashed from the moment we all got there. There were a number of failures, including some bad cables (how do you miss that in testing) which resulted in all of us essentially sitting around for over an hour.
To make up for the delay in getting started, someone from Cisco came in and apologized and handed out gift cards to Mandalay Bay. It was a nice gesture, but considering the gift cards had a face value of five dollars, it might have been better to not hand out anything. It had the effect of actually irritating several students, and giving the rest of us something to joke about for a while. The class cost $1000 (or 10 Cisco Learning Credits) so the value of even an hour should have been closer to $125 or so.
After that snafu, and a brief presentation by Bruno and Bruce on numbers of CCIE in the world, with breakdowns by region, we got started with the meat of the class: the labs themselves. We were all looking forward to this, since it was being run by Cisco and had the smell of real-world vs. some of the third-party labs (note that I use third party labs for training, and have no problems with them, but this was officially sanctioned and so had a little something extra, at least in “feel.”)
The troubleshooting section came first, and used the same system as the real lab so that was a nice touch. In our case we had only five trouble tickets to complete in one hour vs. the real lab which has ten in two hours. I believe this was done to facilitate the “instructor led” nature of the class, and allow us to ask plenty of questions. Bruce and Bruno were stellar in this regard, coming around to any student with a question and helping them to understand the problem or just passing out hints to those who still wanted to figure it out on their own.
I learned a lot about myself and my troubleshooting techniques during this portion of the day, as I got bogged down on the first ticket and blew the rest of my time. It was a relatively straightforward ticket where a particular address wasn’t answering an ICMP Echo from another device. It was a few routers together, with BGP. I spent the entire hour re-architecting the BGP–down to bare metal and rebuilding the config from scratch–and was almost done when time expired. As it turned out, it was a simple address statement that was missing.
Bruno got a chuckle out of this and pointed out that the lab is not intended as a “best practices” lab. He said that in most cases you won’t be removing configuration at all during the TS section; you’ll simply be adding something missing or correcting route statements, etc. It was helpful for me to hear this and to go through the experience, because it taught me that I really need to focus on finding the simple problem quickly and not rebuilding things the way I think they ought to be built. After 17 years in the industry, that’s a difficult habit to change, but one I’ll have to in order to be successful on the real lab.
After a brief recap and break, we moved on to the configuration section. For the most part there were no surprises here, and I had my Layer‑2 (Frame, Spanning-tree, VTP, etc.) and IGP (RIP, OSPF, and EIGRP here) set up quickly enough. Redistribution was what you’d expect, with a lot of everything going every which way. Again, no one in their right mind would ever design that network, but it’s what you can expect to see in the lab.
The one thing I did miss, and had to have Bruno point out to me, was in a redistribution task regarding OSPF. The task wanted a route from one area to show up in area 0. I got the route there, but Bruno said that I had it wrong. Reason? The area where the route originated was discontiguous, or detached from area 0. We all know that typically means you want a virtual link, but since the task didn’t specify this I simply brought the route into area 0 as an external. Bruno said that the task “implied” a virtual link, and while I disagree with the wording of the task and the nature of implied configurations, it was helpful to hear since this is likely the same kind of thing I’ll see in the real lab.
Where I slowed down–and I knew I would–is on the MPLS and BGP configuration sections. As a long-time enterprise engineer, I simply don’t touch either of these technologies in the real-world, and I haven’t spent enough time with them in the lab to feel comfortable. I still muddled my way through some of it, but with the amount of time it took I’d never make it through the real lab. The message for me here is that I really need to take some time with these technologies until I not only understand them well, but can configure them quickly.
Overall, this was a very valuable experience and one I would heartily recommend to anyone looking to take the R&S lab. It gave valuable insight into the time pressures you’ll face, as well as the number of tasks, the wording, and the level of difficulty you can expect to see. This is just one more reason that Cisco Live is where you want to be every year if you’re at all serious about your networking career.