Websockets at scale

I made these quick notes mainly for my own future reference, to document once and for all some ideas about this one architectural template and be able to move on to something else. For that reason the style is less suited to presentation than it should be, and less structured than it could be, but if the post is still useful to others despite all its stylistic shortcomings, I’ll be happy.


For reasons explained here and here, the best way to avoid interference from mediating network elements (routers and proxies) in websocket communication is to hide the content that their ability to interfere depends on, by encrypting the TCP payload with TLS.
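As a minimal stdlib sketch of what that encryption layer looks like from a client’s point of view (the wrap_socket call is shown as a comment only, with a placeholder hostname – no real endpoint is assumed):

```python
import ssl

# A client-side default context verifies the server certificate
# against the system CA store and checks the hostname – both of
# which wss:// relies on.
ctx = ssl.create_default_context()
assert ctx.check_hostname
assert ctx.verify_mode == ssl.CERT_REQUIRED

# Wrapping the TCP socket with this context, before the HTTP
# Upgrade request is sent, is what turns ws:// into wss://:
#   tls = ctx.wrap_socket(raw_socket, server_hostname="example.com")
# From then on the whole exchange is encrypted TCP payload, opaque
# to the routers and proxies along the way.
```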

This is step one.

“Step zero” would have been validating that you are a good Internet citizen, by making sure your decision to keep connections open through all those globally shared network elements was economically and environmentally rational.

Step two then is about load balancing (LB). And it has two main aspects to it.

Support the fallback mechanism

The first is about stone-age browsers that don’t support websockets. There are also still network routers that don’t support them. If this is still a problem for you, then your solution needs to deal with the long-polling mechanism generally used as a fallback by the websocket libraries out there (e.g. sockjs, socket.io). This fallback requires that all the separate HTTP connections emulating a single websocket connection over time be consistently routed to the same web server.

You can use session stickiness for this, if you are load balancing at the application layer. Your LB needs to be the TLS endpoint for this to work, of course.
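A hedged sketch of the cookie-based variant of that idea (the pool and cookie name are made up): the LB pins the first response with a cookie, and every later fallback request carrying the cookie lands on the same back end:

```python
import random

SERVERS = ["app-1", "app-2", "app-3"]  # hypothetical back-end pool
COOKIE = "lb_sticky"                   # hypothetical stickiness cookie

def route(request_cookies: dict) -> tuple[str, dict]:
    """Return (chosen server, cookies to set on the response)."""
    pinned = request_cookies.get(COOKIE)
    if pinned in SERVERS:
        return pinned, {}               # honour the existing pin
    chosen = random.choice(SERVERS)     # first request: pick a server
    return chosen, {COOKIE: chosen}     # ...and pin it via the cookie

# First long-polling request arrives with no cookie:
server, set_cookies = route({})
# Every later request in the emulated connection carries the cookie
# and lands on the same server:
again, _ = route(set_cookies)
assert again == server
```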

Otherwise, your LB can also work based on IP affinity, consistently routing all connection requests from the same source IP address to the same server. This has the potential to create strong load imbalances in the web server infrastructure: when many of your app’s clients are represented to the world by the same NAT gateway public IP(v4) address, they all look like the same client, so they all hit the same back-end server. It also generally means that when the LB transforms the source IP address, no X-Forwarded-For header with the original client source IP is added to the HTTP Upgrade request that initiates a websocket connection with the back-end web server.
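The imbalance is easy to see in a toy model (server names are placeholders; the client address is from the documentation range): a stable hash of the source IP picks the server, so every client behind one NAT gateway collapses onto a single back end:

```python
from zlib import crc32

SERVERS = ["app-1", "app-2", "app-3"]  # hypothetical back-end pool

def route_by_ip(source_ip: str) -> str:
    # A stable hash of the source address picks the back end.
    return SERVERS[crc32(source_ip.encode()) % len(SERVERS)]

# Thousands of clients behind one corporate NAT gateway share one
# public IPv4 address – so, to the LB, they are all the same client
# and all land on a single server:
nat_clients = [route_by_ip("203.0.113.7") for _ in range(1000)]
assert len(set(nat_clients)) == 1
```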

The problem with both solutions is scale. And this, as already alluded to above, is exactly the second aspect we must consider when discussing load-balancing websockets.

Scale

The fallback mechanism should really be considered a non-issue by now, after all these years of websocket existence, shouldn’t it? Let’s face it, do you really have much money to make from your IE 9 user base? Probably not. But scale? That’s a real topic, very much pertinent not only to websockets but also to HTTP/2 and, more generally, to any long-lived connection between your clients and servers. In these days of massive hardware resources, the issue doesn’t come so much from resource limits as from a basic TCP limit.

A load balancer has only a limited number of ephemeral TCP ports it can open at the same time towards the servers. That’s how TCP was designed: 16 bits to represent port numbers, and that’s it. And of all those ports, fewer than half are ephemeral, which means for a Linux box, in the best case, 28,232 outgoing connections from your LB to the web servers. That’s 28,232 simultaneous clients. Maximum. Not much for your massive multiplayer online game, is it?
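The arithmetic behind that number, assuming Linux’s default ephemeral port range of 32768–60999:

```python
# Linux default ephemeral port range, as reported by
# /proc/sys/net/ipv4/ip_local_port_range: 32768 to 60999.
first_port, last_port = 32768, 60999

ephemeral_ports = last_port - first_port + 1
print(ephemeral_ports)                 # 28232

# Out of the full 16-bit port space:
total_ports = 2 ** 16                  # 65536
print(ephemeral_ports / total_ports)   # ≈ 0.43 – indeed less than half
```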

There are technological approaches to this (e.g. F5’s OneConnect) that multiplex (better said: serialize – not cool!) different incoming connections into a small pool of reusable outgoing connections to the web servers, but they are immensely tricky to configure, may require your application to be aware of the multiplexing, and simply don’t work at all for things like HTTP/2. However, when you go down the hardware route (pardon the pun) – and you should if you have the money, if nothing else for the integrated DDoS protection – the limits above stop making much sense, because internally these appliances can pack a bunch of different NICs and route 28,232 outbound connections (or more – they own their own kernel) independently from each of them.

Hardware solutions are expensive though… and they don’t scale vertically indefinitely.

An architectural solution

Multiple load balancers, each taking care of a different set of web servers, with a DNS round-robin scheme mapping the site name to their alternative IP addresses.
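A toy model of the round-robin half of that scheme (the A records are placeholder addresses from the documentation range): the resolver hands out each LB address in rotation, so new clients spread across the load balancers:

```python
from itertools import cycle

# Hypothetical A records published for the site name – one per LB.
A_RECORDS = ["198.51.100.1", "198.51.100.2", "198.51.100.3"]

# Round-robin DNS hands the addresses out in rotation, so successive
# clients connect to different load balancers:
resolver = cycle(A_RECORDS)
first_six = [next(resolver) for _ in range(6)]
assert first_six == A_RECORDS * 2
```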

True, another group of problems, associated with availability and fail-over, is raised by the DNS cache refresh delay (in a micro-service architecture, by the way, using Consul as a registry may suffer from the same issue, for exactly the same reasons). We do not deal with that here, not yet.

Obviously, such a solution for scale completely invalidates the stickiness strategies I mentioned above, so the stickiness strategy at scale, when needed – even beyond the websocket fallback domain – should be to drop stickiness completely 🙂! Enable all servers to deal with any incoming connection instead. That means sharing state (e.g. authentication, authorization) across all servers (but check JWT before that, and also this). It should also mean doing so with an in-memory cache (e.g. Ignite, Hazelcast, Redis).

I argue that it’s not expensive to start any web application project by enabling this architecture as the foundation of its presentation layer. Basically, prepare a distributed cache solution from the start.

Or

Simply use the cloud. Use an AWS Application Load Balancer and you get websockets over TLS, session stickiness for any necessary fallback, all the balancing capacity you’ll ever need inside any single Region, and basic DDoS protection. Across Regions, however, you may have to consider something closer to what was described above, but even then, most of the time you can probably be geographically precise and have Route 53 always target the same Region’s ALB. Note that the DNS cache TTL in your clients’ browsers or proxies is still a problem in this solution.

You should also prepare the architecture to allow rolling deployment upgrades from the start, but that will be another post 🙂

Going even further

At scale, you may also want to analyze the desired throughput, priority, coupling and lifecycle of your business flows, to check whether asynchronous notifications – the natural match for websockets – could benefit from a segregated, independent infrastructure (even if the notifications per se may not represent a separate business bounded context). You’ll probably want to use CORS if you go down this path. You’ll probably also need authentication on your websocket endpoint, so remember to set the withCredentials property in your CORS request in that case (and while you’re at it, google for ‘jwt websockets’).
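One detail of credentialed CORS trips people up: with withCredentials set, the server cannot answer with a wildcard origin – browsers reject it. A sketch of the response headers such an exchange needs (the origins are placeholders):

```python
# With credentials, Access-Control-Allow-Origin must echo the exact
# requesting origin – the '*' wildcard is rejected by browsers.
def cors_headers(request_origin: str, allowed: set) -> dict:
    if request_origin not in allowed:
        return {}  # no CORS headers: the browser blocks the response
    return {
        "Access-Control-Allow-Origin": request_origin,  # never '*'
        "Access-Control-Allow-Credentials": "true",
        "Vary": "Origin",  # shared caches must key on the origin
    }

headers = cors_headers("https://app.example.com", {"https://app.example.com"})
assert headers["Access-Control-Allow-Origin"] != "*"
assert headers["Access-Control-Allow-Credentials"] == "true"
assert cors_headers("https://evil.example", {"https://app.example.com"}) == {}
```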

 
