Can the Internet scale indefinitely? Many people, if not most, assume that it can. Here is a cautionary tale of a network that strangely and unexpectedly didn’t scale so well. I hope that you find it of interest.
Once upon a time, in a country not so far away, there was a fixed broadband provider whose service was sick. They called in my friend, the performance doctor, who did a network X-ray. He reported a worrying symptom.
The doctor observed a loss rate that was bursting over 1% during peak periods. (The reported average loss was much lower.) This even occurred on end user links that were operating at only tens of kilobits per second. Yet the links themselves could support many megabits in each direction.

Under these network conditions, many applications will deliver a poor user experience. Streaming video buffers. Web browsing feels “lumpy”. As for the gaming experience, don’t even ask.
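To get a feel for why a sustained 1% loss rate is so punishing, here is a rough back-of-the-envelope sketch using the well-known Mathis approximation for a single long-lived TCP flow (roughly MSS/(RTT·√p) times a small constant). The segment size and 30ms round trip time below are illustrative assumptions, not measurements from this network.

```python
# Rough ceiling on a single long-lived TCP flow's throughput under random loss,
# per the Mathis et al. approximation: rate ~ (MSS / RTT) * C / sqrt(p).
# MSS and RTT here are illustrative assumptions, not measured values.
from math import sqrt

MSS_BYTES = 1460      # typical Ethernet-sized segment (assumed)
RTT_S = 0.030         # assumed 30 ms round trip
C = 1.22              # constant from the approximation

for loss in (0.001, 0.01, 0.03):          # 0.1%, 1%, 3% packet loss
    ceiling_bps = (MSS_BYTES * 8 / RTT_S) * C / sqrt(loss)
    print(f"loss {loss:.1%}: ceiling ~ {ceiling_bps / 1e6:.1f} Mbit/s")
```

Even under these generous assumptions, 1% loss caps a long-lived flow at a few megabits per second, and bursty loss during peak periods hurts more than the benign-looking average suggests.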
So what was going on here?
The broadband service provider was handing off their traffic at multiple interconnection points to an access network. Each interconnection point is a cross-connect with many interfaces, each running at 10Gbps. That means oodles of local capacity.
Whilst interconnect capacity may be plentiful, all access networks inevitably face capacity constraints somewhere. Because of this, the cross-connect interface is also a traffic management point for things like DDoS mitigation.
This point may also be a wholesale or internal pricing point, with billing based on peak load. To keep costs under control, many broadband service providers use the traffic management capability to rate limit the traffic into these cross connects.
This rate limiting is “neutral”, in the sense that there is no bias in how it is performed. That means there is a uniform loss probability per packet during periods of saturation. These periods might last just a few tens of seconds.
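As a minimal sketch of what “neutral” means here, imagine a simple proportional policer: whenever the offered load exceeds the configured rate, every arriving packet is dropped with the same probability, regardless of which flow it belongs to. (The operator’s actual equipment and parameters are not described in this detail; this is only a toy model.)

```python
import random

def neutral_policer(packets, capacity_pps, offered_pps):
    """Toy model of a 'neutral' rate limiter: during saturation every packet,
    whatever flow it belongs to, faces the same drop probability."""
    drop_p = max(0.0, 1.0 - capacity_pps / offered_pps)
    delivered = [pkt for pkt in packets if random.random() >= drop_p]
    return delivered, drop_p

# A burst running ~3% over the configured rate yields ~3% loss for every flow.
packets = [(f"flow-{i % 1000}", i) for i in range(100_000)]
delivered, p = neutral_policer(packets, capacity_pps=97_000, offered_pps=100_000)
print(f"uniform drop probability ~ {p:.1%}; delivered {len(delivered)} of {len(packets)} packets")
```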
This setup appears to have uncovered a (previously unreported?) scaling problem in TCP/IP.
An unexpected new behaviour emerges
What was supposed to happen at each interconnect when it rate limits the traffic? Well, TCP’s congestion avoidance algorithm is “elastic”, so when loss occurs it “backs off”. This reduces the rate at which it sends traffic, and thus reduces the load on the network.
Crucially, this lower load is intended to result in less loss, both individually and collectively.
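Here is a minimal sketch of that intended “elastic” reaction, using the classic additive-increase/multiplicative-decrease rule (Reno-style; real TCP stacks differ in many details):

```python
def aimd_trace(loss_rtts, cwnd=10.0, total_rtts=100):
    """Toy congestion-avoidance loop: the congestion window grows by one
    segment per RTT and is halved whenever a loss is signalled. Halving the
    window is what reduces the offered load on the network."""
    trace = []
    for rtt in range(total_rtts):
        if rtt in loss_rtts:
            cwnd = max(1.0, cwnd / 2.0)   # multiplicative decrease: "back off"
        else:
            cwnd += 1.0                   # additive increase: probe for capacity
        trace.append(cwnd)
    return trace

trace = aimd_trace(loss_rtts={30, 60})
print("cwnd around the first loss:", trace[28:33])   # window halves at RTT 30
```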
When the network doctor took his high-fidelity X-ray pictures he got a direct measurement of the instantaneous loss and delay. Rather than this assumed behaviour, something strange appeared to be occurring.
Surprisingly, the experienced loss rate had become disconnected from the offered load.
Our best understanding is that this is an emergent behaviour that:
- is intrinsic to the interaction between an elastic protocol (like TCP) and “neutral” packet handling (where “all packets are equal”), and
- only emerges with highly aggregated loads at high data rates.
This last point is important: the only scaling approach for the current Internet design is to aggregate ever more flows at ever higher data rates.
How TCP is meant to work
Why did this undesirable behaviour emerge? In this circumstance, there was a breach of key assumptions about TCP’s operation.
Four long-standing and ubiquitous assumptions for TCP are that:
- The overall load is mainly due to long-lived TCP flows.
- The “elastic” protocol behaviour will (eventually) cause the loss rate to improve for that same stream. That means there is an individual reward for “backing off” and being “cooperative”.
- The actions of many “cooperative” individual streams will “collaborate” to lower the collective load, and hence the overall loss rate.
- Some kind of stable operational equilibrium will result.
Assumption #1: Long-lived flows dominate
We have come a long way since the early Internet, which was typically used for file transfers and terminal access. The load structure of the contemporary Internet increasingly consists of interactive Web browsing and video on demand. These applications mostly comprise (on the timescales that we are looking at here) short-lived and bursty active flows.
Interactions that would have been longer-lived flows in the past have now become much more short-lived. The combination of higher-speed server infrastructure and faster access links has dissipated much of the “smoothness” of the system, replacing it with a “jackhammer” effect. (Technically speaking, the “non-stationarity” has increased.)
The upshot is that demand is now “choppy”. Flows keep entering the misnamed “slow start” (i.e. fast exponential growth), and then stopping for a while.
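A toy model shows why such flows live and die inside “slow start”. The sketch below assumes the flow never hits the slow-start threshold or a loss before it finishes, and the flow sizes and initial window are illustrative only.

```python
def slow_start_rtts(flow_kbytes, mss_kbytes=1.46, init_cwnd=10):
    """Toy slow-start trace: the congestion window doubles every RTT until the
    (short) flow has nothing left to send, so it never settles into a steady
    sending rate. Ignores ssthresh, loss and pacing for simplicity."""
    cwnd, sent_kb, rtts = init_cwnd, 0.0, 0
    while sent_kb < flow_kbytes:
        sent_kb += cwnd * mss_kbytes
        cwnd *= 2                 # exponential growth, one doubling per RTT
        rtts += 1
    return rtts, cwnd

for size_kb in (50, 500, 5_000):  # small web object ... short video chunk
    rtts, final_cwnd = slow_start_rtts(size_kb)
    print(f"{size_kb:>5} kB flow: finished in {rtts} RTTs, still accelerating at cwnd={final_cwnd}")
```

Each such flow arrives, accelerates exponentially, and vanishes: that is the “jackhammer” effect on the aggregate load.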
These changes in demand and supply structure have had the combined effect of undermining the first assumption.
Assumption #2: Individual reward exists
In the early Internet, it was a reasonable assumption that experiencing loss implied that your own flow was having a significant impact on the network resource. There was an individual reward for appearing “cooperative” by “backing off”: your own subsequent packets were likely to face a lower probability of loss.
At these interconnect points we face a very different situation. There are millions of unrelated streams, increasingly many of them in exponential growth. Any “slack” that is created by one stream backing off is relatively small, and is also immediately consumed by one or more others.
The consequence is that it doesn’t matter how far any one stream backs off: it still experiences the same (or at least indistinguishable) loss rate. This breaks the second assumption.
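The numbers below are purely illustrative, but they show the arithmetic of why backing off earns no individual reward once the aggregate is large: under a uniform-loss policer, halving one flow’s rate leaves its own loss probability essentially unchanged.

```python
def drop_probability(offered_pps, capacity_pps):
    """Same 'neutral' policer as before: uniform loss once saturated."""
    return max(0.0, 1.0 - capacity_pps / offered_pps)

N_FLOWS  = 2_000_000                      # illustrative number of concurrent flows
PER_FLOW = 50.0                           # illustrative packets/s per flow
CAPACITY = 0.97 * N_FLOWS * PER_FLOW      # aggregate running ~3% over the limit

before = drop_probability(N_FLOWS * PER_FLOW, CAPACITY)
# One flow dutifully halves its sending rate...
after = drop_probability((N_FLOWS - 1) * PER_FLOW + PER_FLOW / 2, CAPACITY)

print(f"loss seen before backing off: {before:.5%}")
print(f"loss seen after backing off:  {after:.5%}")   # indistinguishable
```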
Assumption #3: Collective reward exists
The individual reward for “cooperative” behaviour is assumed to turn into a “collaborative” collective reward. How is this supposed to work?
When an individual flow “backs off”, the network has a decaying memory of its past “good deed”: a queue that has dissipated to some degree. If the network persists in being overdriven, increasingly many flows are hit with loss. More and more flows back off, as they are incentivised to do. The system is then increasingly “nice” in return for this “collaboration”.
Contrast this to what happens at the interconnect points. A side-effect of “ultra-neutral” rate limiting is that the system is memoryless. There is neither a concept of a flow to be rewarded, nor a good deed that requires remembering. The system has no means to offer a collective reward for individual collaboration. Our experience is that this kind of design choice is widespread.
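The contrast can be sketched schematically (this is not the operator’s implementation, just the shape of the two designs): a buffered queue carries state between arrivals, so earlier back-offs change the fate of later packets, whereas a memoryless policer judges each packet in isolation.

```python
import random
from collections import deque

class BufferedQueue:
    """Has memory: the backlog left by earlier arrivals (and drained when flows
    back off) determines whether the next packet gets through."""
    def __init__(self, limit):
        self.backlog, self.limit = deque(), limit

    def arrive(self, pkt):
        if len(self.backlog) >= self.limit:
            return False              # tail drop depends on accumulated state
        self.backlog.append(pkt)
        return True

    def serve(self):
        if self.backlog:
            self.backlog.popleft()    # past restraint shows up as free space

class MemorylessPolicer:
    """No memory: each packet is judged only against the instantaneous rate,
    so there is no 'good deed' for the system to remember or reward."""
    def __init__(self, drop_p):
        self.drop_p = drop_p

    def arrive(self, pkt):
        return random.random() >= self.drop_p
```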
A loss event for a long-lived flow produces a bigger reduction in load than one for a short-lived flow. When there are many long-lived flows, the emergent behaviour is to appear to “collaborate” in a desirable way. In this cautionary tale, the flows are instead biased towards being short-lived. The collective benefit of individual “good behaviour” fails to spontaneously emerge.
As a consequence, it doesn’t matter how far any individual flow “backs off”. It is simply not possible for the flows to “collaborate” to get a stable collective throughput. This behaviour breaks the third assumption.
Assumption #4: Stable equilibrium exists
Because collective “good behaviour” is not rewarded with a lower loss rate, there is no stable equilibrium. All active end users suffer reduced QoE as a consequence.
Furthermore, there is nothing that they can do to improve their delivered QoE. No change in individual behaviour has the required effect. Nor can any modification to adaptive application protocols (such as Skype’s) help.
This breaks the fourth assumption. The common understanding of TCP’s behaviour has not delivered the hoped-for outcome.
Why is this problem only happening now?
In TCP there is an assumption that offered load and the rate of loss are coupled. This assumption underpins elastic protocol behaviour and its delivery of a self-stabilising overall system. Hence it is at the heart of global Internet provision.
This assumption has scaled over many orders of magnitude in the past. However, these circumstances have surfaced a scaling limitation. So why was the past not a good guide to the future here?
This interconnect is just one element of the overall end-to-end stochastic system. The data rates here are tens of gigabits per second. That means the state of the system is changing millions of times per second at each interface.
As networks get “faster” there are more frequent state changes, but the time constant of the TCP control loop stays roughly the same (since it is dominated by distance and the speed of light).
As a result, the ratio of “system steps” to “round trip response time” has diverged. The system is now evolving far too “fast” for the end-to-end TCP control loop to successfully interact with it.
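The divergence is easy to see with some back-of-the-envelope arithmetic; the packet size and round trip time below are assumptions for illustration, not measurements.

```python
# How many queue state changes happen within one TCP control-loop round trip?
PKT_BITS = 1500 * 8        # assume ~1500-byte packets
RTT_S = 0.030              # assume a ~30 ms control loop, set by distance

for label, link_bps in (("1.5 Mbit/s access link, early Internet", 1.5e6),
                        ("10 Gbit/s interconnect interface, today", 10e9)):
    pps = link_bps / PKT_BITS
    print(f"{label}: ~{pps:,.0f} packets/s, ~{pps * RTT_S:,.0f} state changes per RTT")
```

The control loop that is supposed to steer the system gets one round trip’s worth of feedback while the resource it is steering changes state tens of thousands of times.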
Root cause analysis
For the record, none of the parties involved are motivated by any form of “rent extraction”. They merely want a reasonable and standard control over their input costs, which reflect underlying intrinsic delivery costs. There is a genuine desire to deliver good QoE to all network users.
We have also talked through the scenario with several of our senior industry contacts. They and we agree that there is nothing that the service provider is doing that is outside industry norms.
This leads us to conclude that the problem is systemic, and not specific to this service provider. The “end-to-end principle” that underpins the Internet has long been known to rest on shaky assumptions, and it has not fulfilled its promise here.
Why is this? The queues in the system are the means by which flows interact with one another, and how their performance is coupled. The increasing rate of state changes means that the performance information conveyed from inside the network to the edge decays in usefulness ever faster.
So the 1970s protocols being used really do have scaling limits. This instance of a rate limiter appears to us to be a harbinger of a more general and widespread issue of “stochastic disorder”.
This means there is an absence of a stable equilibrium, which puts the ability to deliver consistent QoE at risk. This in turn puts the intrinsic value of Internet services at general risk. It appears to us that, in this circumstance, inappropriate use of “packet neutrality” has exacerbated this problem.
The Internet is not a scale-free architecture
As service providers scale their networks they have assumed that adding higher capacity links would not introduce new performance hazards. In other words, they have assumed that the Internet’s architecture is “scale-free”.
It is the nature of assumptions that you only need a single counter-example to prove them to be false.
Our strong hypothesis is that what we have witnessed here is a crunch that the whole industry has to face. Indeed, it may already (unknowingly) be facing it.
The naïve resolution is to throw capacity at the design problem, resulting in ever-lower efficiency as you move to ever-higher speeds. This is unlikely to excite investors, and customers won’t appreciate rising bills.
The wiser alternative is to engage with the design issues directly, and engineer the stochastic properties of the system to deliver both the desired performance and a managed cost structure.
From our perspective, this has been an interesting engagement. We identified the root cause and gathered the definitive evidence. We then constructed a stochastic solution to restore the desired behaviour.
If you are an operator and want to discuss this issue and how to safely run networks in saturation, then we can help.
If you are a regulator and feel worried that “neutrality” might have unintended consequences, then we can help. Get in touch.
For the latest fresh thinking on telecommunications, please sign up for the free Geddes newsletter.