An introduction to ISO/MPEG Layer 3 and Layer 2 coding
Telos Systems' pioneering work in ISDN for broadcast use resulted in the world's first codec with MPEG Layer 3 and Layer 2 capability, the Telos Zephyr, and the world's first codec with AAC (Advanced Audio Coding), the Zephyr Xstream. Today, Telos is recognized as an authority on telephone network interfacing and the use of audio compression for broadcast. Portions of the following text were written by our founder, Steve Church, for the NAB Engineering Handbook; we thank the NAB for permission to use them here.
Perceptual Coding and MPEG Compression
The broadcast world has been transformed by the introduction of perceptual audio coding techniques. With perceptual coding methods, it is possible to pass studio-quality audio with 15 or 20 kHz bandwidth over ISDN channels.
By far the most popular perceptual coders rely upon techniques developed under the MPEG umbrella. When the CD had just been introduced, the first proposals for audio coding were greeted with suspicion and disbelief. There was widespread agreement that it would not be possible to satisfy golden-ear listeners while deleting 80% or more of the digital audio data. In response, MPEG (the Moving Picture Experts Group) was formed, and since 1988 the group has been working on the standardization of high-quality, low bit rate audio coding.
Three standards have been completed: MPEG-1 (ISO/IEC 11172: coding of mono and stereo signals at sampling rates of 32, 44.1 and 48 kHz), MPEG-2 (ISO/IEC 13818: coding of 5+1 multi-channel sound signals and low bit rate coding of mono and stereo audio at sampling rates of 16, 22.05 and 24 kHz) and the latest standard, MPEG-4 (ISO/IEC 14496). Today almost all agree not only that audio bit rate reduction is effective and useful, but also that the MPEG process has been successful at picking the best technology and encouraging compatibility across a wide variety of equipment.
In 1992, this process resulted in the selection of three related audio coding methods, each targeted to different bit rates and applications. These are the famous layers. In 1997, another algorithm, Advanced Audio Coding (AAC), was added to the MPEG standard.
All of the MPEG codecs rely upon the celebrated acoustic masking principle—an amazing property of the human aural perception system. When a tone—called a masker—is presented at a particular frequency, we are unable to perceive audio at nearby frequencies that are sufficiently low in volume. As a result, it is not necessary to use precious bits to encode these inaudible, masked frequencies. In perceptual coders, a filter bank divides the audio into multiple bands. When audio in a particular band falls below the masking threshold, few or no bits are devoted to encoding that signal, resulting in a conservation of bits that can then be used for the bands where they are needed.
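The band-by-band bit saving described above can be illustrated with a minimal sketch. It is not the allocation procedure of any MPEG layer: the function name `allocate_bits`, the example levels and thresholds, and the rule of thumb of roughly 6 dB of signal-to-noise ratio per quantizer bit are all assumptions made for illustration.

```python
import math

# Assumption: each quantizer bit buys roughly 6.02 dB of signal-to-noise ratio.
DB_PER_BIT = 6.02


def allocate_bits(band_levels_db, mask_thresholds_db):
    """Return a bits-per-sample figure for each band (illustrative only)."""
    bits = []
    for level, threshold in zip(band_levels_db, mask_thresholds_db):
        smr = level - threshold  # signal-to-mask ratio in dB
        if smr <= 0:
            # The band lies at or below the masking threshold: spend no bits.
            bits.append(0)
        else:
            # Enough bits to push quantization noise under the threshold.
            bits.append(math.ceil(smr / DB_PER_BIT))
    return bits


# Hypothetical band levels and masking thresholds, in dB.
levels = [60.0, 42.0, 20.0, 35.0]
thresholds = [30.0, 45.0, 25.0, 23.0]
print(allocate_bits(levels, thresholds))  # prints [5, 0, 0, 2]
```

In this example the second and third bands fall below their thresholds and receive no bits, which leaves more of the budget for the two audible bands.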
Layer 3 Technical Description
Layer 3 (popularly known as MP3) implements a unique combination of methods to attain high compression ratios while preserving audio quality. Telos Systems' Layer 3 implementation was developed in close collaboration with the algorithm's developer, Fraunhofer-Gesellschaft.
Psychoacoustic masking: The audio in Layer 3 is divided into 576 frequency bands. First, a polyphase filter bank performs a division into the 32 "main" bands, which correspond in frequency to those used by Layer 2. A second filtering stage (a modified discrete cosine transform) then subdivides each main band into 18 narrower bands. At the 32 kHz sampling rate, the resulting bandwidth is 27.78 Hz (compared to 500 Hz for Layer 2), allowing very accurate calculation of the masking threshold values.
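The figures quoted above follow from simple arithmetic: the audio band extends to half the sampling rate, is split into 32 main bands, and each main band is split into 18. The short sketch below reproduces the numbers; the function name `band_widths_hz` is hypothetical.

```python
def band_widths_hz(sample_rate_hz, main_bands=32, subdivisions=18):
    """Return (total band count, Layer 2 band width, Layer 3 band width)."""
    nyquist_hz = sample_rate_hz / 2          # highest representable frequency
    layer2_width = nyquist_hz / main_bands   # width of one "main" band
    layer3_width = layer2_width / subdivisions
    return main_bands * subdivisions, layer2_width, layer3_width


for rate in (32000, 44100, 48000):
    total, l2, l3 = band_widths_hz(rate)
    print(f"{rate} Hz: {total} bands, Layer 2 {l2:.2f} Hz, Layer 3 {l3:.2f} Hz")
```

At 32 kHz this gives 576 bands, 500 Hz for Layer 2 and 27.78 Hz for Layer 3, matching the text; at higher sampling rates the bands are proportionally wider.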
Redundancy reduction: Redundancy reduction is accomplished by a Huffman (entropy) coding process that takes advantage of the statistical properties of the signal output from the psychoacoustic stage. This lossless redundancy reduction process is the ideal supplement to psychoacoustic masking. In general, maskers with high tonality have more redundancy but allow less masking, while noise-like signals have low redundancy and high masking effect.
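As a rough illustration of why entropy coding saves bits, the sketch below builds a Huffman code from the symbol counts of a short run of hypothetical quantized values and compares its size with a fixed-length code. This shows the general principle only: Layer 3 itself selects from predefined Huffman tables rather than building a code per signal, and the function name and sample data here are assumptions.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries are (weight, tiebreak, {symbol: depth}); the unique
    # tiebreak keeps Python from ever comparing the dictionaries.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


# Hypothetical quantized values: small magnitudes dominate.
quantized = [0] * 12 + [1] * 5 + [-1] * 4 + [2] * 2 + [3] * 1
lengths = huffman_code_lengths(quantized)
coded_bits = sum(lengths[s] for s in quantized)
fixed_bits = len(quantized) * 3  # 5 distinct values need 3 bits at fixed length
print(coded_bits, fixed_bits)  # prints 46 72
```

The most frequent value gets a one-bit code, so the 24 values fit in 46 bits instead of 72, and the original values can be recovered exactly.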
Bit reservoir buffering: Often, there are some critical parts in a piece of music that cannot be encoded at a given data rate without "softening" of the transients. Layer 3 addresses this with a short-term "bit reservoir" buffer. Bits left unused during passages that are easy to encode are saved, and when a critical part occurs, the encoder can spend the saved bits to code that part at a higher data rate.
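The reservoir behaviour can be sketched as a simple running balance. This is a simplified model, not the bitstream mechanism of the standard: the function name `encode_with_reservoir`, the per-frame budget, the reservoir limit and the bit demands are all hypothetical values chosen for illustration.

```python
def encode_with_reservoir(demands, frame_budget, reservoir_max):
    """Return the bits actually granted to each frame.

    demands       -- bits each frame would like to use
    frame_budget  -- bits the channel delivers per frame
    reservoir_max -- most bits that can be carried forward
    """
    reservoir = 0
    granted = []
    for demand in demands:
        available = frame_budget + reservoir
        used = min(demand, available)
        granted.append(used)
        # Carry unused bits forward, up to the reservoir limit.
        reservoir = min(available - used, reservoir_max)
    return granted


# Two easy frames, then a demanding transient, then a normal frame.
demands = [600, 700, 1500, 900]
print(encode_with_reservoir(demands, frame_budget=1000, reservoir_max=500))
# prints [600, 700, 1500, 900]
print(encode_with_reservoir(demands, frame_budget=1000, reservoir_max=0))
# prints [600, 700, 1000, 900]
```

With the reservoir, the third frame receives the 1500 bits it asks for; without it, the same frame is capped at the 1000-bit frame budget.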
Joint-stereo: Layer 3's joint stereo mode takes advantage of the redundancy in stereo program material. The Telos Zephyr Xstream encoder switches from discrete L/R to a matrixed L+R/L-R mode on a per-frame basis (a frame lasts 24 ms at the 48 kHz sampling rate and 36 ms at 32 kHz). By contrast, the intensity coding scheme used in Layer 2 codecs combines audio above about 6 kHz to mono and pans it to seven fixed positions across the stereo stage.
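The L+R/L-R matrixing mentioned above can be sketched in a few lines. The scaling by one half is an assumption chosen so the round trip is exact in this example; it is not taken from the standard, and the function names are hypothetical.

```python
def to_mid_side(left, right):
    """Matrix discrete L/R samples into sum (mid) and difference (side)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Recover L/R from the sum and difference signals."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Hypothetical samples: the two channels are nearly identical.
left = [0.5, 0.25, -0.75]
right = [0.5, 0.25, -0.5]
mid, side = to_mid_side(left, right)
print(mid)   # prints [0.5, 0.25, -0.625]
print(side)  # prints [0.0, 0.0, -0.125]
```

When the channels are similar, the side signal is close to zero and needs few bits, which is where the saving comes from. When the channels differ strongly, discrete L/R coding can be the better choice, which is why a per-frame switch is useful.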
Layer 2 Technical Description
The ISO/MPEG-1 Audio Layer 2 compression algorithm, commonly called Layer 2, has been an internationally acknowledged standard for high-quality digital delivery of audio since 1992. Its audio coding scheme is used in a wide variety of broadcasting, storage, and telecommunications applications. This industry benchmark was developed by CCETT (Centre Commun d'Etudes de Telecommunications et de Telediffusion) in Rennes, France; the IRT in Munich, Germany; and Philips in Eindhoven, Netherlands, who solely or jointly hold patent rights on the technology.
The ISO/MPEG-1 Audio Layer 2 standard defines the bit stream syntax and the decoder specifications. The encoder's open architecture allows continuous improvements and easily accommodates application-specific requirements. Because it is a worldwide standard, licenses are issued on a nondiscriminatory and fair basis.
MUSICAM or Layer 2: What's in a Name?
The licensors, CCETT, Philips, and the IRT, initially dubbed their ISO/MPEG Layer 2 audio coding scheme MUSICAM, an acronym for Masking pattern adapted Universal Subband Integrated Coding And Multiplexing. The three used the name MUSICAM before the algorithm was selected by ISO/MPEG as Layer 2 of the MPEG-1 audio standard.
Throughout much of the world, the protected name MUSICAM is held by the French broadcasting organization TDF and Thomson Brandt. However, in the United States the MUSICAM trademark is held by one of the licensees of Layer 2. To avoid confusion in the marketplace and enhance user awareness of compatibility worldwide, many of the algorithm's implementers now describe the coding scheme as MPEG Audio Layer 2 or just Layer 2.
Evidence of Layer 2's value and performance is provided by:
The outcome of critical international listening tests conducted by prestigious organizations including ISO/IEC, ITU-R, EUREKA, CRC and the Electronic Industries Association (EIA).
The wide range of applications.
The production of Layer 2 decoder chips by the world's key semiconductor companies.
The Layer 2 implementation in Zephyr Xstream meets the MPEG audio worldwide coding standard. We have worked closely with the Institut für Rundfunktechnik, co-developer of Layer 2, to incorporate the latest technological improvements for superior audio quality. Our Layer 2 is fully compatible with all other Layer 2 implementations, including those marketed as MUSICAM.