Imagine that someone had constructed the world’s most technically capable ISP, but practically nobody knew about it. They say that truth is stranger than fiction, so here is the true story of the world’s first quality-assured ISP. It was commissioned by the Welsh Assembly in 2006 to serve the needs of deaf people across Wales.
I interviewed Dr Neil Davies, who was the main force behind its design. The technology he invented, Contention Management, could have a profound impact on the future cost and value of broadband services.
MG: What problem was the Welsh Assembly trying to solve?
ND: Around 3% of people are hearing impaired and need visual cues in addition to sound. The problem the Welsh Assembly saw was the delivery of fit-for-purpose visual communications over commodity DSL broadband (at the time, 256kbps up and 512kbps down).
This was not seen as a “nice to have” thing, but rather as a vital service that citizens could rely on, for example to make healthcare appointments, or to summon emergency services.
We had worked with the Centre for Deaf Studies on issues of accessibility for many years. For example, we constructed a real-time transcription system in the late 1980s to be able to transcribe live video and lectures. As part of the project consortium, they came to us asking for help to design and deliver the service.
How does sign language differ from ordinary video?
Sign language is hard in terms of delivering interactive video:
- The visual stream contains all the information; this is not video-enhanced speech. As the video has all the information, there is no ability to “fill in” visual quality issues from the audio.
- A large amount of the information content is in facial expressions, and subtle hand and finger motion. There are only small differences between frames, and these have to be captured and rendered faithfully.
- Video codecs have a different response to packet loss and delay than audio codecs. In particular, certain standard implementations (typically those with the most compression) would completely collapse when there was loss, with a half to one second gap in the video stream. This made them unsuitable for sign language use.
- PC webcam applications like FaceTime and Skype are poor for sign language, as you need a good view of the head, shoulders and face. This means you typically need dedicated hardware.
- There is also a high degree of concurrency in sign language; it’s not turn-taking as with audio. If you sit in a room with a dozen deaf people, each one will be involved in several simultaneous conversations, as they don’t “interfere” with each other like with sound.
How did you go about understanding the problem?
We deployed a series of pairwise trials where two sign language users engaged in a video call over existing broadband services. There were around 60 test calls, with different pairs, over 6-8 weeks, at different times of day.
For each trial call the user had to rate the quality, rather like mean opinion scores (MOS) for voice. We also instrumented the systems to gather the loss and delay characteristics of the video traffic streams end-to-end.
Of all those calls, only one worked perfectly! We wanted to reproduce that single element of success, where the underlying infrastructure had been sufficiently capable.
We needed to identify what the difference was between the perfect session, and the others, so that we could satisfy the users’ performance needs.
What distinguished success from failure?
We performed lab trials where we varied the packet loss and delay, and filmed people signing and the corresponding video artefacts. This helped us to understand the constraining factors.
We then assessed the performance needs of the control signalling, as well as audio and video streams. Although audio isn’t figural for sign language, it still matters: some people are deafened, not deaf, so there is a range of requirements.
We worked out what pattern of packet loss and delay was needed for these control and media streams, so that the application would stay within its acceptable performance envelope. It turns out that effective sign language conversations are possible with substantially higher delay than for voice, by a factor of five (i.e. a 750ms one-way delay is OK).
We then compared the measured user network performance data with the application performance envelope data. From this we deduced that the QoE failures were due to contention effects, and their inappropriate management.
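To illustrate the comparison described above, here is a minimal Python sketch that checks one measured stream against a performance envelope. It is not the project's actual tooling. The 750ms one-way delay bound comes from the interview; the loss bound, the choice of the 99th percentile, and the names `Envelope` and `within_envelope` are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Hypothetical performance envelope for one stream."""
    max_one_way_delay_ms: float  # 750 ms is the figure quoted in the interview
    max_loss_ratio: float        # placeholder value, not taken from the project


def within_envelope(delays_ms, packets_sent, envelope, percentile=99):
    """Return True if the measured stream stays inside the envelope.

    delays_ms    -- one-way delays of the packets that arrived
    packets_sent -- number of packets sent (arrived or not)
    percentile   -- which delay percentile to test (an assumption)
    """
    loss_ratio = 1 - len(delays_ms) / packets_sent
    ordered = sorted(delays_ms)
    # Index of the requested percentile, clamped to the last sample.
    index = min(len(ordered) - 1, int(len(ordered) * percentile / 100))
    tail_delay = ordered[index]
    return (loss_ratio <= envelope.max_loss_ratio
            and tail_delay <= envelope.max_one_way_delay_ms)


# Assumed envelope for the video stream (loss bound is illustrative only).
video_envelope = Envelope(max_one_way_delay_ms=750.0, max_loss_ratio=0.001)
```

A call whose delay tail or loss ratio falls outside the envelope is a candidate QoE failure; the next step is to work out which part of the path introduced the excess.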
We then investigated where the contention was occurring, and if it was something that we could effectively manage.
How did you analyse the cause of contention?
Our project timing meant that we had an opportunity to access the underlying infrastructure. The UK regulator, Ofcom, has wisely unbundled the UK market, hence allowing innovative experiments by new entrants (like us). To enable multiple retail ISPs, BT Wholesale supports ISP cross-connects into its network.
We got access to an early cross-connect link that was then empty. We constructed a test ISP with a couple of end user points, and measured the distribution of packet loss and delay from those end points through to the central core.
This was a 3-point measure: user to interconnect, and interconnect to an external measurement point. This separated out the BT Wholesale contribution to contention from any general Internet backbone effects.
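The idea of the three-point measurement can be sketched in a few lines of Python. This is an illustrative reconstruction, not the measurement system that was used: it assumes each packet is timestamped (in milliseconds, with synchronised clocks) at the user, the interconnect and the external measurement point, and the function names are hypothetical.

```python
def segment_delays(observations):
    """Split each packet's delay into its two path segments.

    observations -- list of (t_user, t_interconnect, t_external) timestamps
                    in milliseconds for packets travelling upstream
    Returns (access_delays, external_delays).
    """
    access = [t_ic - t_user for t_user, t_ic, _ in observations]
    external = [t_ext - t_ic for _, t_ic, t_ext in observations]
    return access, external


def variability(delays):
    """Spread above the smallest observed delay.

    The minimum approximates the fixed part of the delay, so the spread is
    a crude stand-in for the variable, contention-related part.
    """
    return max(delays) - min(delays)
```

Computing `variability` separately for each segment shows whether the variable delay arises between the user and the interconnect or beyond it.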
What did you find?
From that data we derived the structure of the delay of the underlying connectivity, in both directions. We worked out what the underlying contention variability was (i.e. that which was under BT’s control, before we applied our own load).
The variability was less than 5ms in the upstream, and 15ms in the downstream. There was also some loss, due to contention at peak hours, albeit at low rates. DSL connections are not perfect, so there was a loss rate due to dropouts.
So the underlying BT infrastructure was stable and solid. The contention effects were mostly due to the load that we were applying, and thus were under our own control.
We ascertained that the unavoidable loss at the packet level was one in ten thousand or better, so the video codecs would deliver fewer than 3 visual artefacts per operating hour. That was acceptable as a QoE bound, if we managed the remaining contention.
How did you manage the contention?
We couldn’t dedicate the DSL line to a single use, so we had to make it work consistently and assure the sign language service whilst also delivering a normal domestic broadband experience.
We had done previous work for Boeing on Future Combat Systems. That project taught us that to achieve this you had to know how to run networks in saturation. That means differential distribution of “quality attenuation” to packet streams, and an appropriate scheduling device to do this.
So when we built the assured sign language solution, we had in fact constructed a generic “contention management” solution to assure any kind of traffic.
How does Contention Management work?
Contention Management (CM) works exactly as the name suggests: it distributes contention. By knowing the contention points, we pre-contend the traffic so that it fits into the constraints.
In this deployment the constraining points of contention along the path are clear. In the upstream, the contention is at the egress from the premises. In the downstream it is at the ingress over the last mile DSL, and ingress into the wholesale over the first hop. Both of these have to be managed.
We now knew both the demand and supply properties, so could construct appropriate scheduling that fulfilled the requirement, even in saturation. That means we could create a custom contention treatment so that video phone behaviour was acceptable, even when other traffic on the system pushed it into overload.
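The interview does not describe the Contention Management scheduler itself, so the following Python sketch substitutes a much simpler, well-known technique: strict-priority scheduling into a known bottleneck rate. It only illustrates the general point that, in saturation, loss and delay can be steered away from an assured stream and onto other traffic. The function name, the two traffic classes and all parameters are hypothetical.

```python
from collections import deque


def run_link(arrivals, capacity, queue_limit):
    """Simulate a saturated bottleneck with two traffic classes.

    arrivals    -- list of (assured_packets, best_effort_packets) per tick
    capacity    -- packets the bottleneck can send per tick
    queue_limit -- maximum packets queued per class
    Returns per-class counts of packets sent and dropped.
    """
    queues = {"assured": deque(), "best_effort": deque()}
    stats = {name: {"sent": 0, "dropped": 0} for name in queues}

    for tick, (n_assured, n_best_effort) in enumerate(arrivals):
        # Enqueue arrivals; a full queue means the packet is dropped.
        for name, count in (("assured", n_assured),
                            ("best_effort", n_best_effort)):
            for _ in range(count):
                if len(queues[name]) < queue_limit:
                    queues[name].append(tick)
                else:
                    stats[name]["dropped"] += 1

        # Serve the assured class first, then best effort with what is left.
        budget = capacity
        for name in ("assured", "best_effort"):
            while budget > 0 and queues[name]:
                queues[name].popleft()
                stats[name]["sent"] += 1
                budget -= 1

    return stats
```

For example, `run_link([(2, 10)] * 100, capacity=5, queue_limit=20)` offers 12 packets per tick to a link that can carry 5. In this toy model the assured class loses nothing, and the best-effort class absorbs all of the overload.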
What was the effect on the user experience?
It was perfect! The assured applications worked within specification, with a quality close to that of a dedicated circuit costing ten or a hundred times the price. The only performance issue we were left with was DSL line retrains, which were outside of our control.
What happened next?
We designed this cutting-edge service, constructed the working solution, and figured out how to price it. But it didn’t roll out to its intended users. Whilst it was technically working, the service delivery wasn’t in our hands.
The video phones were £700 each, and there was a need for some level of installation assistance. There is a high correlation between deafness and poverty: it’s rare to be disabled and rich.
Sadly, there was a personality clash between consortium staff, who were in a dispute over who would install the system. Without deaf installers, how could you communicate with clients? After all, you can’t just say “watch a video on how to install it”. It needed a social interaction.
So whilst we had a solution to serve every deaf person in Wales at under £1000, minor nuances of deployment got in the way – to the extent that the Welsh Assembly and Sign Wales relationship broke down, and the project was never completed.
In a future article we will explore the future for quality-assured broadband. In the meantime, if you would like to discuss how to apply Contention Management technology in your business, please get in touch.