Wednesday, December 24, 2008

Motivation

In performing this work, we would like to determine the impact of design and environmental changes (e.g. network conditions, such as packet loss) on voice quality. The ability to quantify voice quality is important for a number of reasons. First, we would like to compare the quality of voice over packet networks to that of the public switched telephone network (PSTN), as the PSTN has become the de facto standard for acceptable voice quality. Second, we would like to test the effectiveness of the various network protocols and policies that are meant to support real-time traffic. Lastly, from a business perspective, measurements of voice quality help a vendor offer better features than its competitors and provide the basis for voice quality service level agreements (SLAs).

Voice quality can be measured using the Mean Opinion Score (MOS). MOS uses the Absolute Category Rating (ACR) procedure to determine the general acceptability or quality of voice communication systems or products. A MOS measurement is made by having a group of listeners rate a speech sample on a scale of 1-5, where 1 is very bad, 5 is excellent and 4 is normally considered ‘toll-quality’ (what one hears on the PSTN). MOS is inherently subjective and hard to reproduce. A test requires assembling a group of listeners, creating ideal test facilities, selecting proper sound files and setting up audio devices, and the procedure is not suitable for long-term measurement.

To address the shortcomings of this subjective testing, a number of methods have been developed to produce an objective and reproducible measurement of perceived voice quality. Two clarity measurements are currently in use: PSQM (Perceptual Speech Quality Measurement), developed by KPN Research, and PAMS (Perceptual Analysis/Measurement System), developed by British Telecom. Both techniques use natural speech or speech-like samples as their inputs. The speech samples are played over a network that is set up in different configurations, and the received speech sample is compared with the original speech sample using clarity algorithms.
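
To make the MOS arithmetic concrete, here is a minimal sketch in Python. It assumes the score is simply the arithmetic mean of the listeners' 1-5 ratings. The function name `mean_opinion_score`, the constant `TOLL_QUALITY` and the sample ratings are hypothetical, chosen only for illustration; they are not measured data.

```python
# Toll-quality threshold on the 1-5 scale, as described in the text above.
TOLL_QUALITY = 4.0


def mean_opinion_score(ratings):
    """Return the arithmetic mean of listener ratings on the 1-5 scale.

    Assumption: each listener gives one whole-number rating from 1 to 5.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if rating not in (1, 2, 3, 4, 5):
            raise ValueError(f"rating out of range: {rating!r}")
    return sum(ratings) / len(ratings)


if __name__ == "__main__":
    # Hypothetical ratings from five listeners for one speech sample.
    ratings = [4, 5, 3, 4, 4]
    score = mean_opinion_score(ratings)
    verdict = "toll-quality" if score >= TOLL_QUALITY else "below toll-quality"
    print(f"MOS = {score:.2f} ({verdict})")
```

Even in this toy form, the result depends entirely on who the listeners are and how many of them there are, which is the reproducibility problem that the objective methods try to avoid.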
