A 50-Year Legacy of Accurate Audio Analysis
How Dolby audio quality design and the ITU-T standard for measuring voice voltage level, invented some 50 years ago, came together to give the communications industry the most accurate PESQ and POLQA MOS measurement available today.
Malden, based in south London and a partner of Teraquant since 2005, was a small company that handled the outsourced manufacture of modems and other equipment for British Telecom. British Telecom would supply the design spec, and Malden (now owned by Opale System/Tenedis in Paris, France) would manufacture it.
One day British Telecom asked Malden, “Can you build us a speech voltmeter?” As we all know, voices range between a whisper and a shout. Voice is also a complex waveform containing many different audio frequencies all together. This is the opposite of a sine wave, the simplest form of signal, which contains a single spectral frequency component. A sine wave has been easy to measure with a voltmeter reading RMS (root mean square) voltage for about a century and a half. So how do you express the level of a voice as a single voltage figure?
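To make the contrast concrete, here is a minimal Python sketch of a plain RMS measurement. The sample rate, tone frequency and amplitudes are assumptions chosen for illustration only. It shows that a steady sine wave gives a stable RMS reading, while the same tone with a pause in it (a crude stand-in for speech with gaps) reads lower, even though the "talker" is no quieter while actually talking.

```python
import math

def rms(samples):
    """Root mean square value of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# Assumed values for illustration: 8 kHz sampling, 1 kHz tone, 1.0 V peak.
SAMPLE_RATE = 8000
TONE_HZ = 1000

# One second of continuous sine wave.
sine = [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]

# The same tone for half a second, followed by half a second of silence.
burst = sine[:SAMPLE_RATE // 2] + [0.0] * (SAMPLE_RATE // 2)

sine_rms = rms(sine)    # peak / sqrt(2) for a pure sine wave
burst_rms = rms(burst)  # pulled down by the silent half

print(f"continuous tone: {sine_rms:.3f} V RMS")
print(f"tone with pause: {burst_rms:.3f} V RMS")
```

The pause drags the long-term RMS figure down, which is why a simple RMS voltmeter is not enough for speech.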
Creating a Long-Lasting Communications Solution
BT had to invent something that would average all the speech energy over a small period of time. They came up with a measurement that is now the ITU-T P.56 standard and, back in the 1970s, designed a box to measure this voltage: the “Speech Voltmeter No 6” (SV6). Malden designed the PCSV6, a two-channel SV6 controlled by a PC, to supersede the SV6. Subsequently, the DSLA (Digital Speech Level Analyser) was developed to generate speech at a defined level and to measure the received level.
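The idea of averaging speech energy only while someone is actually talking can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the ITU-T P.56 algorithm, which defines its own envelope tracking, hangover and threshold rules. The function name, the 20 ms frame length and the -40 dB activity threshold are all assumptions made for this example.

```python
import math

def active_level_db(samples, frame_len=160, threshold_db=-40.0):
    """Simplified active-level estimate (NOT the P.56 method).

    Splits the signal into frames, keeps only frames whose mean power
    exceeds threshold_db (relative to a full scale of 1.0), and averages
    the power of those frames. Returns (level_db, activity_factor).
    """
    active_power = 0.0
    active_frames = 0
    total_frames = 0
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        total_frames += 1
        if power > 0 and 10 * math.log10(power) > threshold_db:
            active_power += power
            active_frames += 1
    if active_frames == 0:
        return float("-inf"), 0.0
    level_db = 10 * math.log10(active_power / active_frames)
    return level_db, active_frames / total_frames

# Assumed test signal: 0.5 s of a 1 kHz tone at 0.1 peak, then 0.5 s of silence.
SAMPLE_RATE = 8000
tone = [0.1 * math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE // 2)]
signal = tone + [0.0] * (SAMPLE_RATE // 2)

level_db, activity = active_level_db(signal)
long_term_db = 10 * math.log10(sum(s * s for s in signal) / len(signal))

print(f"active level:    {level_db:.2f} dB, activity {activity:.0%}")
print(f"long-term level: {long_term_db:.2f} dB")
```

Because the silent frames are excluded, the active level reflects how loud the signal is while it is present, whereas the long-term figure is diluted by the pause.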
So, when we started measuring Mean Opinion Score (MOS), we needed something to capture audio with a large dynamic range. It had to make accurate measurements of low-level speech, such as whispers, and at the same time of high-level speech, such as shouting, or cases where the gain has been turned up in the network so that a whisper becomes a shout by the time it reaches the loudspeaker or earpiece.
It turns out that the DSLA (Digital Speech Level Analyser) was perfect for this. It has a low noise floor and a highly linear amplifier, so it can capture speech with almost zero non-linearity, i.e., without the test equipment itself introducing impairments. It was designed by an engineer from Dolby Laboratories, a company famous for high-fidelity audio. It is therefore a very pure amplifier that adds no impairments to the speech it captures.
When you are seeking to measure the impairments or artefacts in an audio or speech file, it is essential that the test equipment or audio capture device introduces no artefacts or impairments of its own, as these would reduce the accuracy of the speech quality measurement. This applies when measuring audio MOS to standards such as PESQ (ITU-T P.862) or POLQA (ITU-T P.863).

This has turned out to be extremely useful over the 20 to 25 years that we have been measuring MOS, since the advent of the Voice over IP (VoIP) industry. The DSLA makes the most accurate audio PESQ MOS measurements because it has the purest amplifier and an extremely high-resolution analog-to-digital (A/D) converter, and it introduces no impairments or artefacts of its own into the captured audio.
The result is that measurements made with MultiDSLA have a 97% correlation with the mean subjective opinion (i.e., the MOS) as heard and scored by a large population of human listeners, within a 95% confidence interval. This is claimed to be the most accurate in the industry.
The leaders in this voice technology, who create the new voice and video codecs that have become the industry standard in most of our cell phones, have stated: “Malden, you are the only audio capture instrument that accurately measures our codec.”
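For readers who want to see what such a correlation figure means, the following Python sketch computes the Pearson correlation between a set of objective scores and the matching subjective scores. The numbers are invented purely for illustration and are not measurement results from MultiDSLA or any other instrument. The function name is also just a label chosen for this example.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for five test conditions (illustrative values only).
subjective_mos = [1.5, 2.3, 3.1, 3.8, 4.4]  # mean listener scores
objective_mos = [1.7, 2.2, 3.3, 3.7, 4.3]   # scores from a measurement tool

r = pearson_correlation(objective_mos, subjective_mos)
print(f"correlation with listeners: {r:.1%}")
```

The closer the value is to 100%, the more closely the instrument's scores track what human listeners report across the tested conditions.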
What does this mean for you?
It means that when you measure MOS for your product in your lab, you get accurate measurements, so you do not reject good product. A test instrument that introduces artefacts reports a lower MOS. You might take this as an accurate representation of the audio quality of your device under test (DUT), when in fact the artefacts were introduced by the test instrument and dragged down the MOS measurement. As a result, you would reject a good product.
In summary, accurate MOS measurements allow you to ship good product more quickly and eliminate waste, thereby reducing costs and increasing productivity.
In addition, if you are comparing yourselves with the competition, an inaccurate instrument might misrepresent your product, i.e., score it lower than your competitor's. In operational networks, accuracy lets you troubleshoot and address the problems that actually affect user experience rather than imperfections in your measurement apparatus. This avoids chasing ghosts: problems that don’t actually impact your customers, your SLAs or your service quality.
Also, in operational networks it is important to understand the difference between standards-based “audio MOS” metrics and “packet-based MOS” metrics.
If you would like to know more about the subjects in this article, please get in touch as below.
For more information, view:
Audio Voice Quality Proactive Monitoring as a Service


