“In theory there is no difference between theory and practice. In practice there is.” Yogi Berra
Usually, no matter how much planning, testing, thinking, or stalling you build into an upgrade project, the bill comes due at the end and isn’t what you expected. Maybe someone ordered the lobster, or multiple drams of Johnnie Walker Blue, but either way you have a situation to deal with… right now. How you deal with it says a lot about your skills and your character, but that’s for another post, as I’m going to try my best to keep this one short and to the point.
I recently had the wonderful experience of upgrading an Active/Passive failover pair of ASA units from 8.4.2(8) to the newest of the new code, 9.0(2). After the 8.3 kerfuffle (NAT changes automagically, anyone?) I was particularly keen not to miss any possible gotchas in this upgrade. I also scheduled a larger-than-usual maintenance window, even though we didn’t expect any downtime, just in case.
For those of you aghast at the thought that we would implement new (some would say bleeding-edge) code on production systems, I should interject a few things:
- We always run bleeding-edge code, usually because we need the bleeding-edge features it brings. In this case, that meant IPv6 features sorely lacking in prior code versions.
- We have adopted a very aggressive IPv6 stance, as I have written about before, and we tend to find our aspirations and designs are well out in front of significant portions of the code available for our equipment.
- Noting the prior two items again, I’ll also add that firewalls, in particular, seem to have code that is months or years behind the route/switch world. That holds true across all vendor platforms. Why? I don’t know, but that’s another post.
With our need-for-upgrade bona fides established, I dutifully read the entire release notes for the 9.0(x) ASA code. I was excited by many of the new features, mostly around IPv6, and disappointed by others (No OSPFv3 address families? Really?), but one thing immediately jumped out at me: page 19 and its disturbing title, “ACL Migration in Version 9.0”.
Any time you see the word “migration” in any documentation referring to an upgrade of production code or configuration, you know two things:
- It probably happens auto-magically, which is basically a synonym for “we’re going to bork your code but we’re only going to loosely tell you where, how, and why.”
- You’d better have good backups and be prepared, because a roll-back is likely going to be as painful as just plowing ahead.
To summarize the boring details for you, prior to the new code you had two categories of Access Control Lists (ACLs): those for IPv4 and those for IPv6. Within each category you had the usual standard and extended lists, plus whatever other features you used. You applied the IPv4 and the IPv6 access-lists to the interface in whatever direction and that was that. True to the dual-stack model, you really were running two parallel networks, and never the twain shall meet.
During and after the upgrade to 9.0(x) ASA code, several things happen:
- IPv6 standard ACLs are no longer supported, and any you have are migrated to extended ACLs.
- If IPv4 and IPv6 ACLs are applied in the same direction on the same interface, they are merged.
- The new keywords any4 and any6 are added in place of the old any keyword.
- Supposedly, if certain conditions are met (and they were in my case), your IPv4 and IPv6 ACLs are merged into one (mine were not).
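To make the keyword change concrete, here is a minimal Python sketch of the rewrite as I understand it from the list above: a bare `any` in an IPv4 list becomes `any4`, and a bare `any` in an IPv6 list becomes `any6`. This is an illustration, not Cisco's migration logic. The function name, the ACL names, and the addresses are all hypothetical, and the sketch leaves the rest of each line untouched.

```python
def migrate_any_keyword(acl_line: str) -> str:
    """Rewrite bare `any` tokens to `any4` or `any6` based on the ACL family.

    Assumption: lines beginning with `ipv6 access-list` are IPv6 rules and
    everything else is treated as IPv4. Only the `any` token is changed.
    """
    tokens = acl_line.split()
    is_ipv6 = tokens[:2] == ["ipv6", "access-list"]
    replacement = "any6" if is_ipv6 else "any4"
    return " ".join(replacement if tok == "any" else tok for tok in tokens)


# Hypothetical pre-upgrade rules (documentation address ranges only).
old_rules = [
    "access-list OUTSIDE_IN extended permit tcp any host 192.0.2.10 eq 443",
    "ipv6 access-list OUTSIDE_IN_V6 permit tcp any host 2001:db8::10 eq 443",
]

for rule in old_rules:
    print(migrate_any_keyword(rule))
```

Running a saved configuration through something like this before the upgrade gives you a rough preview to compare against whatever the firewall actually produces.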
While it is a bit scary to have any vendor automagically migrating portions of your configuration to a new format, it happens and as long as they document well and you do your due diligence, things can work out just fine. Other times they completely go to hell because of an undocumented feature. This upgrade fell somewhere in the middle.
As it turns out, a critical fact was left out of the documentation: all of the access-groups that had been applied in one direction or another were now, quite frankly, not applied to anything. In other words, with no ACLs bound to the interfaces, my firewalls were letting anything out of the network and nothing in. I quickly tried applying my new access-lists to the interfaces a couple of times before I discovered that you can now have only one applied per interface in any direction (par for the course on most IOS devices).
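A check like the following would have flagged the problem in seconds. It is a minimal Python sketch that compares the `access-group ... interface ...` lines in a pre-upgrade backup against the post-upgrade configuration and lists any bindings that disappeared. The config fragments, ACL names, and function names are hypothetical, and the sketch ignores other forms of the command (such as a global access-group).

```python
import re

# Matches lines such as: access-group OUTSIDE_IN in interface outside
ACCESS_GROUP = re.compile(r"access-group\s+(\S+)\s+(in|out)\s+interface\s+(\S+)")


def bindings(config_text):
    """Return the set of (interface, direction, acl_name) bindings in a config."""
    found = set()
    for line in config_text.splitlines():
        match = ACCESS_GROUP.fullmatch(line.strip())
        if match:
            acl_name, direction, interface = match.groups()
            found.add((interface, direction, acl_name))
    return found


def missing_bindings(before, after):
    """List bindings present in the backup but absent after the upgrade."""
    return sorted(bindings(before) - bindings(after))


# Hypothetical config fragments, not taken from the real firewalls.
backup = """\
access-list OUTSIDE_IN extended permit tcp any host 192.0.2.10 eq 443
access-group OUTSIDE_IN in interface outside
access-group INSIDE_OUT in interface inside
"""
upgraded = """\
access-list OUTSIDE_IN extended permit tcp any4 host 192.0.2.10 eq 443
"""

for interface, direction, acl in missing_bindings(backup, upgraded):
    print(f"missing: access-group {acl} {direction} interface {interface}")
```

Feeding it the saved pre-upgrade config and a fresh copy of the running config is enough; anything it prints is a binding you need to reapply by hand.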
Since these were production and I had some higher risk on the IPv4 side (we have a lot of rules, and a default-block outbound policy) than the IPv6 side, I did the following:
- I blocked IPv6 in and out, then applied the IPv4 lists to the interfaces in the correct directions.
- I hand-migrated (Notepad is your friend) the IPv6 access rules into the IPv4 lists and brought IPv6 access back online.
- I then deleted the redundant (old) ACLs.
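The hand migration in the steps above can be roughed out in code. This is a minimal Python sketch, not the procedure I followed in Notepad: it takes IPv6 rules from a pre-upgrade backup and re-homes them under the IPv4 list's name. It assumes the old rules look like `ipv6 access-list <name> <action> ...` and that the rule body carries over unchanged apart from `any` becoming `any6`. Both the output line format and every name here are assumptions to verify against your own device before pasting anything in.

```python
def merge_into_ipv4_acl(ipv4_lines, ipv6_lines, target_name):
    """Append IPv6 rules to an IPv4 ACL, renaming them to target_name.

    Assumption: each IPv6 rule has the form
    `ipv6 access-list <name> <action> <rest of rule>`.
    """
    merged = list(ipv4_lines)
    for line in ipv6_lines:
        tokens = line.split()
        if tokens[:2] != ["ipv6", "access-list"] or len(tokens) < 4:
            raise ValueError(f"not an IPv6 ACL rule: {line!r}")
        # Drop `ipv6 access-list <old name>` and keep the action and match terms.
        body = ["any6" if tok == "any" else tok for tok in tokens[3:]]
        merged.append(" ".join(["access-list", target_name, "extended"] + body))
    return merged


# Hypothetical rules using documentation address ranges.
ipv4_rules = ["access-list OUTSIDE_IN extended permit tcp any4 host 192.0.2.10 eq 443"]
ipv6_rules = ["ipv6 access-list OUTSIDE_IN_V6 permit tcp any host 2001:db8::10 eq 443"]

for rule in merge_into_ipv4_acl(ipv4_rules, ipv6_rules, "OUTSIDE_IN"):
    print(rule)
```

The sketch appends the IPv6 rules at the end. ACLs are evaluated top-down with first match winning, so if the IPv4 list ends in an explicit deny that matches both address families, the appended rules would need to sit above it.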
Everything came back, life was good, and almost nobody noticed anything. What are the lessons learned from this experience? Besides don’t upgrade ASAs? How about these:
- Always have a backup of your configuration, preferably taken a few minutes before you start the upgrade. In this case I didn’t use the backups for more than a reference, but they were available if I had wanted to roll back.
- Know your configuration and your devices. This seems intuitive, but a lot of people would have gotten partway through this migration, seen that their ACLs were borked, and been lost. If you’re going to live on the edge, at least have a helmet.
- Read the documentation. I did, and while it didn’t directly help, I at least knew ahead of time what was likely to break. I also knew once it broke what the likely problem area was. To tie this into the CCIE Lab (back to studying, so it’s on my mind) it’s a bit like being able to look at a network diagram and instinctively know where you’ll have problems (two routers doing redistribution between EIGRP and OSPF, check).
At the end of the day, it all worked out for a variety of reasons listed above. Would I suggest any readers out there try this sort of “no net” upgrade to bleeding edge code? Probably not. In my case, I’m a masochist it seems, and this is my therapy. Now on to my 6500 upgrade to 15.1(1)SY. I’m sure I’ll be writing about that not long from today.