Driving Multicast Adoption

The benefits of IPv6 and multicast are clear.

Adoption of both is lagging.

IPv6 adoption is slow, but happening.

In the case of multicast, it's not happening and never really has been. IPv4 multicast was a failure. It is used in some very specific cases, but IPv4 multicast support is not widespread, and developing anything for it to run on the public Internet was doomed to fail. The MBone helped a bit, but it was beyond the reach of most developers. Too hard.

IPv6 gives us another opportunity. IPv6 multicast is much improved over IPv4 and, crucially, is baked into IPv6. IPv6 doesn't work without it. On a single network segment IPv6 multicast just works. OK, I'm glossing over MLD switch support here, but even without it, multicast works, just not as well.
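To show how little is needed on a single segment, here is a minimal sketch of joining an IPv6 multicast group with plain sockets. The function names, the group and the port are made up for illustration; this is not librecast's API. It assumes a platform where Python exposes IPV6_JOIN_GROUP (Linux does), and leaving ifindex at 0 asks the kernel to pick the interface.

```python
import ipaddress
import socket
import struct


def make_mreq(group: str, ifindex: int = 0) -> bytes:
    """Build a struct ipv6_mreq: 16-byte group address + unsigned int interface index."""
    return ipaddress.IPv6Address(group).packed + struct.pack("@I", ifindex)


def join_group(group: str, port: int, ifindex: int = 0) -> socket.socket:
    """Open a UDP socket bound to `port` and join `group` on it.

    Illustrative helper only; error handling is left to the caller.
    """
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("::", port))
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP,
                    make_mreq(group, ifindex))
    return sock
```

Something like join_group("ff12::1234", 4242) would then receive anything sent to that (hypothetical) link-local group on the chosen interface.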

There are, however, several barriers to IPv6 multicast adoption on the public Internet.

Firstly, and most obviously, to have IPv6 multicast, we need to have IPv6. Bummer, that. Fortunately, IPv6 is coming, and a lot of work has been put into transitional technologies to make it work even where there is no direct support for it.

People don't develop for multicast because it's hard. Even where it is clear there is a benefit, the path to using it is not just difficult, but completely impractical for most software projects.

Most modern commercial routers support 6cast[1]. I'd go so far as to say "all". The exceptions are far outweighed by the rule here. SME kit, I'm not so sure. We can assume, on the basis of IPv6 support to date, that it is lacking.

There are several transitional technologies for 6cast, but these are all aimed at ISPs and router manufacturers (e.g. Automatic Multicast Tunneling (AMT)). BGP and Anycast are not protocols that are available to software developers.

To be clear, I'm aiming at making the benefits of IPv6 Multicast available, easily, to developers of software like Puppet, Nagios etc. Until we reach grassroots developers, the technology is a failure.

So, we have a chicken and egg problem.

No software developer can use multicast with confidence on the public Internet because it isn't supported. There's no drive to support multicast on the public Internet, because no one uses it.

So, we need a transitional technology. One aimed at developers and end users. One that makes IPv6 and Multicast available by default to everyone, regardless of whether their local network supports it. One that brings the benefits of multicast while guaranteeing at least the levels of reachability that we already have with TCP and UDP.

If we can make it easy to use 6cast via a transitional technology and demonstrate the (so far theoretical) benefits, then there will exist clear business benefits to enabling it on the wider Internet and no penalty to deploying it now.

Enter librecast.

A programming library, with wrappers for all major languages, which provides secure, scalable group communication using IPv6 and multicast by default, falling back to legacy modes where required to ensure no loss of reachability.

Think ZeroMQ, but with a different, narrower focus. ZeroMQ is a great idea, but IPv6, encryption and multicast are afterthoughts, barely supported, and added with some difficulty due to incompatibilities with the design. ZeroMQ's focus is fast, reliable messaging. Librecast's focus is scalable, secure, group communication. Speed is still a priority, and will be a byproduct of simple design and short codepaths.

In the case where IPv6 and Multicast work natively, librecast will be faster and much more scalable than ZeroMQ. If it isn't, we've done something wrong. The *design* is more efficient, so any performance or scalability failing is a bug.

How does it work?

Simple. If 6cast works, we use it. A new node starting up enters the "discovery" phase. It will try to connect to our core multicast network ("Stratum 0")[2]. If it cannot, it will request a relay node via unicast to something on Stratum 0.

If a relay node is required, the Stratum 0 node that is contacted will send out a multicast message to the "relay" group. These nodes (or some of them) will respond via unicast to the requesting node. The node can then select one (or more) of them to act as relays.

(Stratum 0) [relay] <-- GRE (udp) tunnel --> [node] (Stratum 1)

The new node can now act as a Stratum 1 gateway node for any other nodes local to it.

This is very similar to how Automatic Multicast Tunneling (AMT) works, except without the need for Anycast or BGP which are unavailable to our target developers.
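The discovery flow described above can be sketched as a small decision function. Everything here is hypothetical: the Attachment type, the two callbacks and the "pick the first relay that answered" policy are placeholders for illustration, not how librecast does it. I'm also assuming a node that reaches the core network natively counts as Stratum 0, while one attached through a relay is Stratum 1.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Attachment:
    mode: str                    # "native" or "relay"
    stratum: int                 # 0 = on the core network, 1 = behind a relay
    relay: Optional[str] = None  # address of the chosen relay, if any


def discover(probe_native: Callable[[], bool],
             request_relays: Callable[[], List[str]]) -> Attachment:
    """Decide how a new node attaches to the multicast network.

    probe_native:   returns True if Stratum 0 is reachable over native 6cast.
    request_relays: asks a Stratum 0 node (via unicast) for relays and returns
                    the addresses of the relays that responded.
    """
    if probe_native():
        return Attachment("native", 0)
    relays = request_relays()
    if not relays:
        raise ConnectionError("no native multicast and no relay responded")
    # Simplest possible policy: take the first responder.
    return Attachment("relay", 1, relays[0])
```

A real implementation could pick several relays, or choose by round-trip time; the point is only that the fallback is automatic and invisible to the application.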

We're focussed on Any Source Multicast (ASM) - true many to many group communications. Interestingly, Source-Specific Multicast (SSM) is basically a simpler, special case of ASM. SSM will often penetrate where ASM will not[3]. SSM can, therefore, be used as an alternative to GRE tunneling. Several nodes can connect to a single repeater using SSM more efficiently than setting up numerous GRE tunnels.

SSM has some interesting properties which we can exploit. If a node (N1) on one network segment sends a listener request to a node several segments away, not only can the nodes on the same segment as N1 now receive the multicast traffic, but all nodes on all intervening segments can now also receive that traffic, even though they are using ASM not SSM.

Group addresses. With ASM we have 112 of our 128 IPv6 bits to use for the group address. This means we can have a group for pretty much anything[4]. We can, for example, take any word like "webservers", use a hash function like SHA3 and squeeze out bits to fill the group address[5], making it unique and collisions very unlikely. Castinet does this already (albeit using SHA1).
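A minimal sketch of the name-to-group idea, assuming SHA3-256 truncated to 112 bits. The function name and the ff1e prefix (transient flag, global scope) are my choices for the example; Castinet's actual scheme differs, as noted above.

```python
import hashlib
import ipaddress


def group_address(name: str, prefix: int = 0xFF1E) -> ipaddress.IPv6Address:
    """Map a group name to an IPv6 multicast address.

    The top 16 bits are the multicast prefix (ff, then flags and scope);
    the remaining 112 bits are the first 14 bytes of SHA3-256(name).
    """
    digest = hashlib.sha3_256(name.encode("utf-8")).digest()
    group_id = int.from_bytes(digest[:14], "big")  # 14 bytes = 112 bits
    return ipaddress.IPv6Address((prefix << 112) | group_id)
```

Any node that knows the word "webservers" can compute the same address independently, so there is nothing to register or look up.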

When relaying, we need to be careful to avoid duplicating traffic. Rather than using the full 112 bits for the group address, I plan to reserve, say, 8 bits for flags. We can then set a relay flag so that any nodes listening to live stratum 0 traffic don't get repeats via a relay because the group address differs by one bit.
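To make the flag idea concrete, here is a sketch that hashes into 104 bits and keeps the low 8 bits of the address for flags. Where the flag byte lives and which bit means "relay" are not decided anywhere above; FLAG_RELAY and the layout below are assumptions for illustration only.

```python
import hashlib
import ipaddress

FLAG_RELAY = 0x01  # hypothetical bit assignment within the 8 flag bits


def flagged_group(name: str, flags: int = 0,
                  prefix: int = 0xFF1E) -> ipaddress.IPv6Address:
    """Build a group address: 16-bit prefix, 104-bit hash, 8-bit flags."""
    digest = hashlib.sha3_256(name.encode("utf-8")).digest()
    hashed = int.from_bytes(digest[:13], "big")  # 13 bytes = 104 bits
    return ipaddress.IPv6Address(
        (prefix << 112) | (hashed << 8) | (flags & 0xFF))


def with_relay_flag(addr: ipaddress.IPv6Address) -> ipaddress.IPv6Address:
    """Return the relayed twin of a live group address (same hash, relay bit set)."""
    return ipaddress.IPv6Address(int(addr) | FLAG_RELAY)
```

The live group and its relayed twin are distinct multicast groups, so a node joined only to the live one never sees the relayed copies.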

Next (technical) step is to replace recv() with recvmsg() so that I can obtain the destination address of each multicast packet and can therefore set the relay flag and send it on to the correct group address.
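For reference, this is roughly what the recvmsg() approach looks like, sketched in Python rather than C. It assumes a platform where the socket module exposes IPV6_RECVPKTINFO and IPV6_PKTINFO (Linux does); the helper names are mine. The destination address arrives as ancillary data in a struct in6_pktinfo.

```python
import ipaddress
import socket
import struct

# struct in6_pktinfo { struct in6_addr ipi6_addr; unsigned int ipi6_ifindex; }
PKTINFO = struct.Struct("16sI")


def enable_pktinfo(sock: socket.socket) -> None:
    """Ask the kernel to attach the destination address to each datagram."""
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_RECVPKTINFO, 1)


def dest_from_ancdata(ancdata):
    """Return (destination address, interface index) or None if absent."""
    for level, ctype, data in ancdata:
        if level == socket.IPPROTO_IPV6 and ctype == socket.IPV6_PKTINFO:
            raw, ifindex = PKTINFO.unpack(data[:PKTINFO.size])
            return ipaddress.IPv6Address(raw), ifindex
    return None


def recv_with_dest(sock: socket.socket, bufsize: int = 1500):
    """Like recvfrom(), but also reports which group the packet was sent to."""
    data, ancdata, _flags, src = sock.recvmsg(
        bufsize, socket.CMSG_SPACE(PKTINFO.size))
    return data, src, dest_from_ancdata(ancdata)
```

With the destination group in hand, a relay can set its relay flag on that address and forward the payload to the matching group.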

On the coding front, I'm focusing exclusively on the multicast bits, as everything else is a distraction. We know how to do authentication and encryption etc. so I'm taking it for granted that is all prior art and there's no innovation to be done there. At some point I'll fill in some of those blanks. Necessary to make it functional and usable, but not necessary to demonstrate my point.

Other next steps are for me to write some blog posts (with pictures/diagrams) on various aspects of IPv6 multicast, produce some demo videos, write business and technical papers and put some slides together. Oh, and submit some talk proposals in various places. As a RIPE member, we get invited to various conferences, so I might try one of those to reach beyond the Open Source community.

[1] I made up "6cast", but I'm tired of writing IPv6 Multicast. The fact that there isn't an accepted abbreviation already tells you all you need to know about adoption status.

[2] How we determine what is Stratum 0 is an open question. We can use DNS A or SRV records in a similar way to puppet/saltstack. We could also set this in configuration. Ideally, I'd rather this were automatically discoverable. We can embed an address (or addresses) in IPv6 so the node can figure it out from this. Doesn't matter, really. It's a technical detail; we'll sort it.

[3] ASM requires a Rendezvous Point (RP) to be configured or embedded in the IPv6 address. SSM does not. Full SSM works by simply typing "ipv6 multicast-routing" on a Cisco router. Further configuration is required for full ASM support.

[4] Yes, there's some overhead on routers for each group we use, but we're trading a little router memory for much more efficient group communications.

[5] Truncating a hash in this case is fine.

2017-03-23