Here is a crate of apples.
Nice, aren’t they? Don’t they all look smooth and shiny! What if I told you that the average apple in this crate was only picked a week ago? So fresh, too!
Would you like one to eat? I’ll just pick it at random for you… What do you say?
You ought to be pausing for some serious thought at this moment. What trickery could be afoot?
Here you go! Your own, individual, bottom-of-the-crate apple!
What do you mean, “YUK!”? You ungrateful person! These apples are, on average, extremely fresh!
I think you get the point. As the famous statistician Hans Rosling says in the video The Joy of Stats: “Useful as averages are, they don’t tell you the whole story. Almost everyone in Sweden has more than the average number of legs: the variation is just as important as the average.”
OK, so this is meant to be a serious article about telecoms, not over-ripe apples or legless Swedes. How does this all relate to our world of broadband networks?
Let’s take a real and important example. Consider the specification IR.34 v9.1 for IP-based long distance and international phone calls. It’s called IPX and you can read more in my previous article “IPX: Telecoms salvation or suffering?”. The spec is managed by the GSM Association and i3forum. Your future ability to make voice calls over technologies like VoLTE depends on it working.
Now please note well: I am picking on the GSMA, i3forum and IPX only because theirs is an example that happens to be to hand and is also published. These issues are endemic to the entire telecoms industry. I just can’t publish the network planning rules for any telcos, or internal design specs for other cloud application services. No loss of face should be imputed to those randomly selected for mild public humiliation as an educational case study.
Take a look at their requirement for packet loss in section 6.3.4, which forms a part of their quality specification for voice services:
You’ve no doubt spotted the danger word: “average”. Over a Gregorian calendar month, indeed. (Why not a fortnight, or maybe a Mayan Haab’ month? Who knows.)
So, I can be compliant with their spec by having just under 0.1% packet loss, on average, in a 30-ish day period. Assuming a constant offered load, that means I could lose every single packet for a continuous 40-minute period at 2pm on the first Tuesday of every month, as long as I have 100% delivery for the rest of the month. This is what most people call an “outage”.
From the end user’s point of view, this is a voice telephony service that is worth avoiding like a plague-infested rat. However, based on my reading, it would be completely compliant with the IPX packet loss spec, with no SLA breach. In other words, this technical specification based on averages is not sufficient to create a working voice service.
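The arithmetic behind that 40-minute example is easy to check. The sketch below is only an illustration under stated assumptions: a 30-day month, a constant offered load (so each minute carries equal weight), and a loss profile modelled per minute. The names `monthly_loss_profile`, `mean_loss` and `worst_minute` are invented for this example and come from no specification.

```python
# Hypothetical illustration: per-minute packet loss over one month.
# Assumptions (not from any spec): 30-day month, constant offered load.
MINUTES_PER_MONTH = 30 * 24 * 60   # 43,200 minutes
OUTAGE_MINUTES = 40                # the continuous outage from the text
THRESHOLD = 0.001                  # 0.1% average loss, as discussed above


def monthly_loss_profile(outage_minutes, total_minutes=MINUTES_PER_MONTH):
    """Loss fraction for each minute: 1.0 during the outage, 0.0 otherwise."""
    return [1.0] * outage_minutes + [0.0] * (total_minutes - outage_minutes)


def mean_loss(profile):
    """Average loss over the month (equal weight per minute)."""
    return sum(profile) / len(profile)


def worst_minute(profile):
    """Loss fraction in the single worst minute of the month."""
    return max(profile)


profile = monthly_loss_profile(OUTAGE_MINUTES)
average = mean_loss(profile)

print(f"average loss over the month: {average:.4%}")
print(f"within the 0.1% average:     {average < THRESHOLD}")
print(f"loss in the worst minute:    {worst_minute(profile):.0%}")
```

Under these assumptions the monthly average comes out at roughly 0.093%, inside the 0.1% figure, while the worst minute shows total loss. The average and the distribution are describing two very different services.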
Now, any IPX service supplier reading this will be squirming and wanting to shout back “but we wouldn’t do that, it’s not how it works, there’s also an availability requirement!”. Well, the retort is: how should it work? What distribution of loss is acceptable? We need to know the distribution because, as you will now fully grasp:
There is no quality in averages.
So what? The point and purpose of broadband networks is to manufacture performance for distributed computing applications. We do this by copying a quantity of data between computation processes, with a quality of timeliness. At every stage of design, construction, operation and marketing we are using quality metrics based on averages. These don’t adequately reflect the actual requirement for a working service in the users’ eyes. They also don’t decompose and compose as engineering specifications: when we join all the pieces together, we end up either using excessive resources or creating systems that unexpectedly fail.
Our collective inability to grasp the absence of quality in averages causes immense waste. Compared to what is possible, there is a tremendous misallocation of capital, a horrendous shortfall in the customer experience, and a scandalous rate of technical failure of new products and services.
This issue pervades every aspect of the telecoms business. Every time you see a bandwidth claim for multi-megabits or gigabits per second (an average), remember that sending recorded media in the post easily delivers the same average throughput, just with a rather different distribution of arrival times. Once more, as it is worth repeating over and over: there is no quality in averages.
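To put a rough number on the postal comparison, here is a back-of-envelope sketch. The figures are assumptions chosen purely for illustration (a 1 TB drive, delivered in one day), and `average_throughput_mbps` is a hypothetical helper, not anything from a standard.

```python
# Hypothetical illustration: average throughput of media sent by post.
# Assumed figures (not from the article): 1 TB payload, 24-hour delivery.
ONE_TERABYTE = 10**12          # bytes, decimal terabyte
ONE_DAY = 24 * 60 * 60         # seconds


def average_throughput_mbps(payload_bytes, delivery_seconds):
    """Average throughput in megabits per second over the whole delivery."""
    return payload_bytes * 8 / delivery_seconds / 1e6


posted = average_throughput_mbps(ONE_TERABYTE, ONE_DAY)

print(f"average throughput by post: {posted:.1f} Mbit/s")
print(f"wait before the first byte: {ONE_DAY / 3600:.0f} hours")
```

On these assumed figures the post averages a little over 90 Mbit/s, yet nothing at all arrives for the first 24 hours. The average throughput says nothing about that.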
So what next, one might well ask? Well, if we don’t want this rotten situation to continue, we need to adopt a different approach. Our metrics and requirements need to reflect a non-negotiable reality: there is only quality in distributions.
To learn more about metrics and measurements based on distributions, see the presentation “How Communications Service Providers can create new value from quality attenuation analytics”.