IPv6 Feature Parity (Lack of) Rant

The current state of IPv6 support in many vendors’ products makes me want to donkey-kick someone right in the…  well, let’s just say it upsets me.

I have been leading an IPv6 roll-out for some time now, among other things, and have found interesting and widely differing levels of support for the next generation Internet Protocol.  With some vendors, many things work and work well, while with others it’s as if things haven’t changed in a decade or more.  Even with the vendors who do have relatively good support for IPv6, however, there often remain odd, even inexplicable, gaps in that support.  This has made our deployment a lot more challenging than it needed to be.

Much has been made of the chicken-and-egg nature of the problem: does demand drive the support, or does having support create the demand?  Self-named analysts, vendor representatives, media pundits, and even my dog seem to have an opinion on this, but I’ve heard little from the people in the trenches actually trying to implement this stuff.  Implement as in across the board full feature parity, not half-assed or “it worked in the lab” analyses.

Further exasperation comes as you figure out that you don’t know what you don’t know, and get 65% into the project before you discover that some feature is missing.  A feature like, say, HSRP.  Whenever I complain that HSRP support (or any other FHRP) is missing, someone inevitably suggests RA tuning as a solution to the problem, which is a bit like handing a popsicle to someone asking for a hamburger; nice, but not the same.  Just how fast do you think you can achieve failover with RA tuning anyhow?  And don’t even get me started on what happens in a dual stacked failover scenario where RA tuning is handling IPv6 and some FHRP is handling IPv4.  At least BFDv3 is available for route failovers.

It’s not even that the big, significant, oh-my-god features are always the ones missing, however.  Oftentimes it’s the random, little features.  Cisco’s ASA, for instance, can’t do stateful failover using anything but an IPv4 address.  Why?  They’ve implemented IPv6 ACLs, objects, NAT (god help us all) and a lot of the bulk gotta-have-it features.  Why not failover?  Oh, and OSPFv3 support is missing too.  Why?  Dunno.

Our UCS is no exception to this rule, as almost nothing is IPv6 ready that I can find.  Ditto for the VMware installation we run on it.  Never mind that we’re at the newest patch levels, running vSphere 5, ESXi, etc.  View?  Nope, no support there either.  Our NetApp array on the back-end?  The big beast with multiple glorious 10-Gig connections?  Bupkis for IPv6 support there as well.  Although they do have a nice bit of marketing available online here.  See if you can tell when they’ll have IPv6 support from that document.

In all fairness here, I should point out that the Virtual Machines that you run in VMware, on the UCS do support IPv6 just fine, or at least as fine as the individual OS you’re installing (see previous rant on Windows 7 here).  SLES (Suse’s flagship server product) supports IPv6 from the command line, for instance, but not from within YaST.  Not a big deal if you’ve used Linux or any flavor of Unix for a while, but for a junior engineer?  That can mean more escalations and a more inefficient NOC.

In a lot of ways, actually, the Operating System purveyors seem to be way ahead of most infrastructure (network, storage, security) providers in supporting IPv6, even with their flaws, but that may simply be due to the number of features they have to port vs. what a Cisco, Juniper or HP has to support.  The notable exception here is Apple, which for some inexcusable reason just dropped IPv6 support from their Airport Wireless product.

At the end of the day, I understand that rewriting absolutely everything to support an entirely new protocol is incredibly difficult.  I also understand that IPv6 has some behaviours that mean feature parity is not always going to be at 100% because it just doesn’t make sense.  I even understand that features will be rolled out in some sort of priority-ranked order, and that maybe management interfaces aren’t at the top of that list.  But what I don’t understand, or can’t get my head around, is why so many glaring inconsistencies exist when we’ve had so long to work at it.  Or why some vendors give little more than lip service to IPv6 while not supporting any of it in their products.


Windows 7 and IPv6

One of the projects that I began early last year is an IPv6 rollout across our entire worldwide network.  During the initial heady days of the project, I managed to get the infrastructure largely configured in a dual-stack arrangement.  Then something came up.  I can’t remember exactly what, at this point, but something did, and so the project sat.  And sat.  And sat some more.

Earlier this week, I reviewed the appalling (lack of) progress so far and began moving forward.  I had already completed the acquisition of Provider Independent Address Space (PI), getting our upstream providers to advertise the routes, and the setting up of 90% of the infrastructure (SVI addressing, the different IPv6 settings like IPv6 CEF and unicast-routing, etc.)  Everything here worked as expected.

I decided to start with the infrastructure more related to the client side, including servers of interest (DNS, AD, mail, etc.) to the inside, and the end-user workstations.  It should be noted here that we had IPv6 turned off on all workstations and servers, despite Microsoft’s best practices which state that IPv6 should be left on because several services will break with it off.  Since these services (Homegroups, for one) don’t really apply in a corporate environment, I didn’t care too much what their best practices recommended.   The first step was to turn IPv6 on for a few test machines in the IT VLAN, and a couple of the Active Directory servers.

The good news is that the servers came up and were happy as soon as they got their addresses.  We’re going with a scheme of static assignment for servers, infrastructure management addresses, etc., so all I really had to do was click the checkbox for IPv6 (these are Windows 2008 R2 servers), assign address and gateway, and voila, we had reachability.  Next, I fixed up some DNS settings, added some AAAA records (the IPv6 equivalent of IPv4 A records), and we had DNS resolution on IPv6 working as well.

Windows 7 wouldn’t prove to be quite as easy, for a few reasons.  While the checkboxes and settings are ostensibly the same, some of the back-end code is obviously different and needs some non-obvious tweaking to get right.  I couldn’t for the life of me figure out why I had no reachability, when the configuration should actually be easy as pie (stateless auto configuration–just “turn on” IPv6 and walk away).  In fact, I ended up testing my Macbook Pro and a Linux machine, and both of them worked as expected right out of the gate, so I knew this was a Windows 7 problem.

The first thing that occurred to me after testing the other non-Windows machines was domain policies.  We have a lot of them, and you never know what might be causing issues.  I had one of my Server guys check that, and then check a non-domain Windows 7 machine with the same image (Windows 7 Enterprise), and he came up with no applicable policy problems and the same behavior.

Finally, after spending too much time with Google, I managed to piece together an idea of what was happening.  As it turns out, the mechanism by which Windows 7 puts together its interface identifiers is to blame.  I found the applicable information in a posting from February, 2010 on a site called itexpertvoice.com, linked to here.  It probably won’t surprise anyone to find out that Microsoft screwed up implementing something they helped create (RFC 4941).  The command you’ll want to run, from an elevated command prompt (or GPO, SCCM, etc.) is:

netsh interface ipv6 set global randomizeidentifiers=disabled

 

This takes you back to behavior supported by standards-compliant hardware, and allows your Windows 7 machine to actually use IPv6.
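
For reference, with randomization disabled the interface identifier should, as I understand it, be derived from the adapter's MAC address in modified EUI-64 form, which is what most other stateless autoconfiguration clients do.  Here is a minimal Python sketch of that derivation.  The MAC address and the 2001:db8:0:1::/64 documentation prefix are made-up values for illustration, and the function names are my own, not anything Windows exposes.

```python
import ipaddress


def eui64_interface_id(mac: str) -> int:
    """Return the modified EUI-64 interface identifier for a 48-bit MAC."""
    octets = bytes(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit in the first octet, then insert ff:fe
    # between the vendor half and the device half of the MAC.
    eui = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:6]
    return int.from_bytes(eui, "big")


def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine an advertised /64 prefix with the EUI-64 interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("stateless autoconfiguration expects a /64 prefix")
    return net[eui64_interface_id(mac)]


if __name__ == "__main__":
    # Hypothetical MAC and documentation prefix, for illustration only.
    print(slaac_address("2001:db8:0:1::/64", "00-1b-21-3c-9d-f8"))
```

If the address on your Windows 7 machine ends in something containing ff:fe after you run the netsh command above, the change took.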

Another command you may want to use, and this is strictly a personal and policy choice, is:

netsh interface ipv6 set privacy disabled

 

This turns off privacy addressing, an idea also described in RFC 4941.  Given the globally unique, routable nature of IPv6, I can see where some people might get squirrelly, but I just don’t see the need.  Again, this is something you’ll want to review for your own environment.

Another interesting problem I found, and one more I had to solve before my Windows 7 machines were ready to be released back into the wild, revolves around voice.  Specifically, voice VLANs.  In a voice-enabled network using Cisco IP telephony, each access port on a switch typically belongs to two VLANs: a voice VLAN and a data VLAN.  You plug a phone in, your computer plugs in to a pass-through port on the phone, and each auto-magically gets an IP address in the correct VLAN.  Except with IPv6.

I should say, however, it didn’t work that way in my environment.  What happened here is that my workstation would actually get an IP address in both the data and the voice VLANs.  And, due to the way prefix policy works (more on that in a minute), would use the voice VLAN as the source address in traffic originating from the workstation.  The way I solved the problem for now is to turn off IPv6 and take away the IPv6 addressing from all of the voice VLANs (specifically, on the SVIs), since I haven’t gotten to the voice part of the IPv6 project anyway.

I don’t know what’s causing the workstation to get assigned a voice address, but I’m going to continue to look into it.  More than likely it has to do with stateless autoconfiguration and ND, but time will tell.  In the meantime, what’s this prefix policy thing I mentioned above?  Read about it in RFC 3484.

Put simply, IPv6 prefix policy exists because any given interface in IPv6 is going to have several addresses; everything from global unicast to link-local, multicast, and maybe more.  So, if I’m a workstation and I want to source traffic outbound on a given interface, which address do I use?  Prefix policy helps to solve that problem by “ranking” the different addresses on an interface in a standards-specified way.

After reading the RFC, open up an elevated command prompt on your Windows 7 machine and enter the following command to see how your machine is set up:

netsh int ipv6 show prefixpolicies

 


and you’ll get a table listing each prefix along with its precedence and label.  [Sample output not preserved.]

You can manipulate the order of things here, and add entries if needed, using the following commands:

netsh int ipv6 add prefixpolicy prefix=d2a:d00c:b678:ecfb::1/128 precedence=2 label=22
netsh int ipv6 add prefixpolicy prefix=d2a:d00c:b678::/48 precedence=2 label=22

 

This gives you a matching prefix pair (both entries carry the same label), and says that for all traffic destined for the d2a:d00c:b678::/48 network, use the d2a:d00c:b678:ecfb::1/128 address on the interface as the source.  In my case, my voice VLAN (applicable hextet: 107) was showing up in the interface prefix list before my data VLAN (applicable hextet: 1ff).  I could have used the example above to set the correct precedence, but this isn’t exactly scalable to thousands of phones and workstations, all of which may have a different mix of addresses.
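
To make the label-matching idea a little more concrete, here is a rough Python sketch of just that one piece of RFC 3484: look up the label for an address by longest-prefix match against the policy table, then prefer a source address whose label matches the destination's.  It is not the full source address selection algorithm (the RFC has several other rules that run before and after this one), and it is certainly not how Windows implements it.  The first two table rows are taken from the RFC's default policy table; the other two are the example entries from above.  The 2001:db8: addresses and the function names are made up for illustration.

```python
import ipaddress

# (prefix, precedence, label)
POLICY_TABLE = [
    ("::1/128", 50, 0),                    # RFC 3484 default: loopback
    ("::/0", 40, 1),                       # RFC 3484 default: everything else
    ("d2a:d00c:b678:ecfb::1/128", 2, 22),  # example entry from the text
    ("d2a:d00c:b678::/48", 2, 22),         # example entry from the text
]


def lookup(addr, table=POLICY_TABLE):
    """Return (precedence, label) of the longest prefix that contains addr."""
    ip = ipaddress.IPv6Address(addr)
    best = None
    for prefix, precedence, label in table:
        net = ipaddress.IPv6Network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, precedence, label)
    if best is None:
        raise LookupError("no policy entry matches " + addr)
    return best[1], best[2]


def prefer_source(dest, candidates, table=POLICY_TABLE):
    """Pick the first candidate whose label matches the destination's label.

    Falls back to the first candidate if none match. This models only the
    "prefer matching label" rule, nothing else.
    """
    dest_label = lookup(dest, table)[1]
    for candidate in candidates:
        if lookup(candidate, table)[1] == dest_label:
            return candidate
    return candidates[0]


if __name__ == "__main__":
    sources = ["2001:db8:107::10", "d2a:d00c:b678:ecfb::1"]
    print(prefer_source("d2a:d00c:b678:1ff::20", sources))
```

Note that any two addresses inside the same /48 would get the same label here, which is exactly why per-host entries like the /128 above don't scale.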

So, we still have the voice VLAN issue to solve, but the Windows 7 machines can now talk comfortably on the network to other workstations and servers, including using SSH, DNS, RDP, and all other major services.  A lot of software still isn’t IPv6 compatible, but that’s why we’re in a dual-stack world for the foreseeable future.

Next challenge, if anyone has suggestions?  Working around some of the feature non-parity issues mostly in the area of security.  For instance, the ASA line as of last check still doesn’t support OSPFv3 or failover via IPv6, and no tools exist that I’m aware of for mapping complex security policy from IPv4 to IPv6, which is why our IPv6 world here still remains sequestered to just our network.

Ideally we’ll have outside traffic allowed and all of our public-facing services running in time for World IPv6 day this year, which happens on June 6th.  Mark your calendars!


IPv6 Half-truths

This post will be a short one, and mostly just comes from a discussion I had the other day with another engineer.  It turns out that even among people who are comfortable with IPv6, and maybe even have experience deploying it, a lot of misinformation still persists.  Hopefully I can correct a couple of those today.  I also tossed in a hot-potato at the end just to see how many folks get hopped up.  Discussion is welcome, and in addition to comments here I can be found on twitter hiding behind the handle: @someclown.

You must turn on IPv6 by using the IPv6 unicast-routing command.

Not true.  This is one of the more persistent, yet wildly incorrect, pieces of information regarding IPv6.  I have even seen many training centers and instructors at the CCIE level get this one wrong and it falls into the category of attention to detail.  What this command actually does is enable unicast routing for IPv6, just as it says.  To actually enable IPv6 you simply need to go to any interface and use the ipv6 enable command.  And yes, you can enable IPv6 on the interface without enabling unicast routing.  Of course, it would be helpful to have an address on the interface as well.

Yes, but if you don’t turn on unicast routing you can’t route IPv6 traffic.

Not strictly speaking true.  You can still set up a default route for IPv6 traffic and get it off of your system.  You’re welcome to argue about whether or not this is actually routing, but you can move IPv6 traffic off of your local device using a default route without ever having enabled routing for IPv6.

Using a /127 address on point-to-point links is wrong, wrong, wrong.

This is an interesting one, and usually sparks a fair amount of debate.  Up until very recently, the recommendation across the board (RFC 4291) was to use /64 prefixes even on point-to-point links, ostensibly because the IPv6 space is so big anyhow, and because several things will break with a /127 (notably subnet-router anycast, a problem laid out in RFC 3627).  While I’m not disputing that this is what the current best-practices reflect, I will say that RFC 6164, which has a status of Proposed Standard, makes a fairly compelling case for using /127 on point-to-point links.  I’m sure this won’t be resolved anytime soon, standards or no, but I would say that if you have a compelling reason for using /127 and know what you’re doing it for, go for it.  Just be aware that standards can change, and you don’t want to leave a steaming pile for the poor person who has to follow you.
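
If the subnet-router anycast objection seems abstract, a few lines of Python show where the collision comes from.  This is only an illustration using made-up documentation prefixes (2001:db8:0:1::/127 and 2001:db8:0:2::/64) and a helper name of my own; it says nothing about how any particular router behaves.

```python
import ipaddress


def subnet_router_anycast(net: ipaddress.IPv6Network) -> ipaddress.IPv6Address:
    """RFC 4291 defines this as the subnet prefix with an all-zero interface ID."""
    return net.network_address


# Hypothetical point-to-point and LAN prefixes, for illustration only.
p2p = ipaddress.IPv6Network("2001:db8:0:1::/127")
lan = ipaddress.IPv6Network("2001:db8:0:2::/64")

if __name__ == "__main__":
    # A /127 holds exactly two addresses, one per end of the link...
    print(p2p.num_addresses, [str(a) for a in p2p])
    # ...and the lower one is also the subnet-router anycast address,
    # which is the conflict the /127 debate revolves around.
    print(subnet_router_anycast(p2p) in list(p2p))
    # On a /64 the anycast address is just one of 2**64, so nobody misses it.
    print(lan.num_addresses == 2 ** 64)
```

As I read RFC 6164, its answer is simply to not use subnet-router anycast on those links, which is worth knowing before you commit to /127 everywhere.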