Discovering the Audio Dynamic Range Delivering the Best PESQ/POLQA MOS Voice Quality Scores

Posted 10/26/2021

In Troubleshooting and Performance Monitoring


The ACSI (American Customer Satisfaction Index, updated by the University of Michigan) rates network quality based on customer evaluations of call quality, call reliability (dropped calls), network coverage, and data speed. Call quality, as measured by POLQA/PESQ MOS scores, is still a major factor in how consumers rate a cell phone service.

What do we do if we can’t hear the person at the other end?

How often do you find yourself turning up the received volume on your cell phone when the call quality, specifically the clarity of the audio, is bad?

If voice quality is bad, we tend to increase the volume so that we can make out what the speaker is saying. As a network provider, you should therefore test your call quality end-to-end, from device over network to device, at several audio levels to determine the level at which optimum speech quality is delivered. This is an easy, low-cost way to ensure the best audio quality end to end.

A wider dynamic range in the network, like a wider audio bandwidth when moving from narrowband to wideband audio, helps listeners understand the content. Measuring the audio level at which the best voice clarity is achieved allows you to engineer your network efficiently to deliver great voice quality to your subscribers.

Completing a POLQA/PESQ drive test without testing at multiple audio input levels diminishes the value of the test and wastes money.

A typical mobile connected to a wireless cellular network is tuned to send audio at -18.2 dBov, i.e., 18.2 dB below the maximum audio level permitted by the system, which is the level at which clipping commences. Audio within 10 dB above or below this nominal level is sent with its full dynamic range preserved. In other words, speech signals between -28.2 dBov and -8.2 dBov are transmitted linearly in amplitude with the input signal. Audio signals outside this range are companded: their dynamic range is reduced so that, for example, a 10 dB variation at the input may result in only a 5 dB variation in the audio transmitted through the network.
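The level behaviour described above can be sketched as a simple input/output curve. This is only an illustrative model, not any device's actual companding algorithm: the function name is hypothetical, and the 2:1 ratio outside the linear window is an assumption taken from the "10 dB in, 5 dB out" example.

```python
NOMINAL_DBOV = -18.2          # nominal send level quoted in the text
LINEAR_HALF_RANGE_DB = 10.0   # linear window is nominal +/- 10 dB
COMPRESSION_RATIO = 2.0       # assumption: 10 dB input change -> 5 dB output change


def transmitted_level_dbov(input_dbov: float) -> float:
    """Return the modelled transmitted level for a given input level (both in dBov)."""
    upper = NOMINAL_DBOV + LINEAR_HALF_RANGE_DB   # about -8.2 dBov
    lower = NOMINAL_DBOV - LINEAR_HALF_RANGE_DB   # about -28.2 dBov
    if input_dbov > upper:
        # Above the linear window: level changes are halved.
        return upper + (input_dbov - upper) / COMPRESSION_RATIO
    if input_dbov < lower:
        # Below the linear window: level changes are halved as well.
        return lower - (lower - input_dbov) / COMPRESSION_RATIO
    # Inside the window the signal passes through linearly.
    return input_dbov


if __name__ == "__main__":
    for level in (-38.2, -28.2, -18.2, -8.2, 1.8):
        print(f"{level:6.1f} dBov in -> {transmitted_level_dbov(level):6.1f} dBov out")
```

Plotting this curve against levels measured on a real device is one way to see where its linear window actually begins and ends.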

What is the transmitted audio from your devices and how does that match to the level range at which your network will deliver the best POLQA/PESQ measurements?

What is Dynamic Range?

Dynamic range is the ability of an audio system, such as the voice subsystem of a telephone network, to transmit quiet, low-level voice signals and high-amplitude audio signals at the same time without distortion. The dynamic range of the human voice, from quietest to loudest, is about 96 dB. However, the telephone network reduces this to a theoretical best case of 42 dB with narrowband voice, using the 8 bits of vertical resolution available in a 64 kbit/s PCM system.
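As a rough check on these figures, the dynamic range of a linear quantizer is about 6 dB per bit. The sketch below assumes that rule of thumb; reading the 42 dB figure as 7 magnitude bits (8 bits minus one sign bit) and the 96 dB figure as a 16-bit range is our interpretation, not something the text states.

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Ratio between the largest and smallest step of a linear quantizer, in dB."""
    return 20.0 * math.log10(2 ** bits)


if __name__ == "__main__":
    print(round(dynamic_range_db(16), 1))  # about 96.3 dB
    print(round(dynamic_range_db(8), 1))   # about 48.2 dB
    print(round(dynamic_range_db(7), 1))   # about 42.1 dB (8 bits minus a sign bit, assumed)
```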

Audio Dynamic Range of a Wireless Network

The nominal level for POLQA is -26 dBov, so the levels received from your testing apparatus are adjusted with full linearity to this level.

Voice level is measured using the standard definition in ITU-T Recommendation P.56. See the previous Teraquant article, “The Rusty Old Red Box”.

This means that the optimum audio level at the input of the POLQA receiver is 26 dB below full scale, that is, below the point at which clipping occurs in the receiving amplifier and ADC (analog-to-digital converter). Bursts of increased audio energy are expected in typical speech, e.g., during an exclamation or shout. These must still stay within the linear dynamic range of the system so that no impairments or non-linearities are introduced that would reduce the POLQA quality score.
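A minimal sketch of linear level alignment to -26 dBov follows. It assumes samples normalized to the range -1.0 to 1.0 and takes 0 dBov as an RMS of 1.0 (a full-scale square wave). It uses a plain RMS over the whole signal, which is a simplification: the P.56 active speech level excludes pauses and will differ on real speech. All function names are hypothetical.

```python
import math

TARGET_DBOV = -26.0  # nominal level quoted in the text


def rms_level_dbov(samples):
    """Plain RMS level in dBov of samples normalized to [-1.0, 1.0]."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        raise ValueError("signal is silent; level is undefined")
    return 10.0 * math.log10(mean_square)


def gain_to_target_db(samples, target_dbov=TARGET_DBOV):
    """Gain in dB that moves the measured level onto the target level."""
    return target_dbov - rms_level_dbov(samples)


def apply_gain(samples, gain_db):
    """Scale every sample by the same factor, so the adjustment is fully linear."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]


if __name__ == "__main__":
    # One period of a full-scale sine wave sits about 3 dB below 0 dBov.
    sine = [math.sin(2.0 * math.pi * i / 1000) for i in range(1000)]
    gain = gain_to_target_db(sine)
    aligned = apply_gain(sine, gain)
    print(f"input {rms_level_dbov(sine):.2f} dBov, gain {gain:.2f} dB, "
          f"output {rms_level_dbov(aligned):.2f} dBov")
```

Because a single gain factor is applied to every sample, peaks keep their distance above the average level; the headroom between -26 dBov and full scale is what lets a shout pass without clipping.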

Imagine a mobile device tuned for a nominal audio level of -20 dBov, which is great for conversational speech. In a first responder scenario, a loud siren in the background is coupled into the radio or mobile device at 0 dBov. We need to attenuate or compand that so it does not drown out the conversation. At the same time, someone shouts a command “off mic” at -10 dBov. This is a lifesaving warning for everybody and needs to be heard.

Audio levels need to be tuned throughout the network, from the microphone and transmitting mobile device, over the wireless link, through the IP/IMS network (as is the case in an LTE/4G/5G wireless network), to the receiving mobile, to achieve the best user experience.

The quality delivered by the system under test (SuT) or the network under test varies with the input audio level. Both low and high audio levels push the signal into non-linear parts of the end-to-end system, so it is essential to determine the audio level at which the best speech quality, or POLQA score, is obtained.

Field Testing of Wireless Voice Services

Voice quality measurements made using the POLQA/PESQ algorithms are very sensitive to the audio level transmitted. It is essential to understand which audio level within a wireless network (end-to-end, including mobile devices) delivers the best POLQA score. With this information, you can engineer your network and devices to ensure optimum voice quality is delivered to your customers end-to-end across your network with your given range of supported mobile devices.

The Level Offset parameter adjusts the level at which test signals are played into the network. This level significantly influences the speech quality scores measured. Playing speech at the right or wrong level can mean the difference between achieving a maximum practical score of 4.5 and a failing score of 2.75. If the level is too low, the noise floor degrades the speech quality score. If the level is too high, amplitude clipping degrades the score.
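A level sweep of this kind can be sketched as a loop over candidate offsets. The code below is only an outline: `find_best_level` and `toy_measure` are hypothetical names, and `toy_measure` is a synthetic stand-in curve, not real POLQA data. In practice the callable would trigger a test call at the given level offset and return the measured score.

```python
def find_best_level(offsets_db, measure_mos):
    """Measure a score at each level offset and return (best_offset, all_scores)."""
    scores = {offset: measure_mos(offset) for offset in offsets_db}
    best_offset = max(scores, key=scores.get)
    return best_offset, scores


def toy_measure(offset_db):
    """Synthetic placeholder: the score falls off as the level moves away from 0 dB offset.

    This is NOT a POLQA model; replace it with a real measurement.
    """
    return max(1.0, 4.5 - 0.1 * abs(offset_db))


if __name__ == "__main__":
    best, scores = find_best_level(range(-20, 11, 5), toy_measure)
    for offset, score in scores.items():
        print(f"offset {offset:+3d} dB -> score {score:.2f}")
    print(f"best offset: {best:+d} dB")
```

Keeping the full table of scores, not just the winner, shows how quickly quality falls away on either side of the optimum.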

What Requirements does this Create for Drive Test or Field-Based Test Equipment?

Low-quality test equipment often introduces such impairments itself because of its limited dynamic range. Many vendors take a standard PC card, put it in their own enclosure, and offer that as measurement instrumentation.

Industry-class test systems must have a high dynamic range, exceeding 100 dB, to be able to test with audio levels varying by up to 30 dB while remaining in the linear range of the receiving amplification system. It is necessary to measure the speech quality (POLQA) obtained well into the companding range, with more than 10 dB of average level variation from the nominal level of the system. To comply with POLQA recommendations, the degraded (received) test signal must be within +5 dB and -20 dB of the reference signal.
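The level window quoted above is easy to check automatically before a score is accepted. The following sketch assumes both levels have already been measured in the same unit (for example dBov); the function name and parameters are hypothetical.

```python
def within_level_window(degraded_db, reference_db, max_above_db=5.0, max_below_db=20.0):
    """True if the degraded signal is no more than 5 dB above and 20 dB below the reference."""
    delta = degraded_db - reference_db
    return -max_below_db <= delta <= max_above_db


if __name__ == "__main__":
    reference = -26.0
    for degraded in (-20.0, -21.0, -26.0, -46.0, -47.0):
        verdict = "ok" if within_level_window(degraded, reference) else "out of range"
        print(f"degraded {degraded:6.1f} dB vs reference {reference:6.1f} dB: {verdict}")
```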

See for Yourself!

Try out the Opale Systems multiDSLA with 110 dB dynamic range and find out why advanced voice labs worldwide use it. Compare the POLQA score you are getting today with an accurate measurement of your real result. Find the optimum audio level, which may differ from your nominal level, that provides the best audio quality in your network with your certified devices. Measure the benefit of noise-cancelling devices and see which systems work better in the presence of background noise. And see, for the first time, detailed diagnostics that show what happens to your audio within each speech utterance, for each frame coming out of the voice codec.

If you would like to read more white papers on these topics, please refer to our Voice Quality Partner, Opale Systems in France.

To see wider content on voice, UC & WAN/SASE monitoring and implementation, look us up on LinkedIn.

Please get in touch as below to schedule your loaner equipment so you can make the measurements in your own environment and discover the difference. If you have any questions, please just send us an email using the Contact Us button below.

