New MPEG-4 High-Efficiency AAC Audio: Enabling New Applications
MPEG-4 High Efficiency AAC is the combination of MPEG AAC and the SBR Bandwidth Extension amendment, which was finalized during the March 2003 MPEG meeting. The amendment is based on Coding Technologies' SBR (Spectral Band Replication) technology, which Coding Technologies and its customers deploy under the aacPlus™ brand name. In June 2001, XM Radio became the first commercial system to deploy what has now become High Efficiency AAC. Since then, it has gained market momentum by delivering CD-quality stereo at 48Kbps and excellent quality stereo at 32Kbps. In late 2001, its value was recognized by MPEG and it is now on track to become a core profile for MPEG-4 audio called “High-Efficiency AAC.”
MPEG-4 High Efficiency AAC (HE AAC) is not a replacement for AAC, but rather a superset which extends the reach of high-quality MPEG-4 Audio to much lower bit rates. High Efficiency AAC decoders will decode both plain AAC and the enhanced AAC plus SBR. The result is a backward compatible extension of the standard which nearly doubles the efficiency of MPEG-4 Audio.
SBR is a unique bandwidth extension technique that enables audio codecs to deliver the same listening experience at around half the bit rate. As a result, High Efficiency AAC delivers CD-quality stereo at 48Kbps and 5.1 channel surround sound at 128Kbps. This level of efficiency is ideal for Internet content delivery and fundamentally enables new applications in the markets of mobile and digital broadcasting.
According to Coding Technologies, licensing of the SBR technology follows the simple licensing model of MPEG-4 AAC with annual caps for personal computer applications, per-unit fees for hardware devices and no additional fees for electronic music distribution. This license structure and its technical capability are designed to make HE-AAC well suited to commercial content services.
MPEG-4 “High-Efficiency AAC” audio profile
The new “High-Efficiency AAC” audio profile consists of the existing MPEG-4 AAC object type and the new SBR object type.
In March 2003, MPEG Audio took a major step forward with the finalization of the High Efficiency AAC specification. In order to address the needs of the digital TV and audio industry, MPEG-2 AAC LC plus SBR will also be standardized in an amendment to MPEG-2 AAC (Part 7 of MPEG-2). This means that wherever MPEG-2 and MPEG-4 AAC are used today, system operators can leverage the new, open, standardized technology to reduce their bandwidth requirements by up to half.
In addition, MPEG recognizes a “low-power” decoder variant for HE AAC. Developed jointly by Panasonic, NEC and Coding Technologies, this low-power decoding method requires 40% less processing power and decodes HE AAC bitstreams with only slightly reduced audio quality. The availability of both low-power and high-quality decoders for HE AAC enables the standard to run on the widest possible range of processors in mobile and portable device applications.
Listening Tests Show Superior Audio Quality
Several independent tests have been conducted over the past 2 years to demonstrate the value of MPEG-4 HE AAC compared to ‘normal’ AAC, other audio coding standards and proprietary codecs. These tests show that MPEG-4 HE AAC offers a significant benefit over these proprietary codecs and over AAC without extensions, and places it clearly as the most efficient audio codec in existence.
MPEG Listening Test
MPEG conducted a listening test in the course of technology selection for the MPEG-4 Audio Extension 1. For the MPEG test, experienced listeners used the MUSHRA blind test method to rank the items relative to a known unencoded reference. As is typical for MPEG testing, items known to be difficult to encode properly (e.g. harpsichord, glockenspiel, pitchpipe, male German speech, castanets) were used. Even with this high bar, HE AAC displayed good absolute performance and provided a clear improvement over AAC.
Digital Radio Mondiale Listening Test
Digital Radio Mondiale is an international industry consortium creating the new global standard for digital broadcasting in the AM bands below 30 MHz. Their testing for technology selection targeted 24Kbps mono and also used the MUSHRA method with average broadcasting content. HE AAC again demonstrated superior performance, even when compared to AAC at 32Kbps.
European Broadcasting Union Internet Audio Listening Test
In 2002, the European Broadcasting Union (EBU) conducted its second round of Internet audio tests. This suite of tests compared a variety of codecs at several bit rates, with Coding Technologies’ aacPlus included in the mix at 48Kbps. The results show aacPlus (now standardized as HE AAC) as the clear winner, significantly outperforming proprietary competitors and improving over other standards. The EBU report also credited the SBR technology in particular (used in both the aacPlus and the mp3PRO submissions) as the only fundamental enhancement to audio compression since the same suite of tests two years earlier.
The unique capability of MPEG-4 HE AAC to achieve high quality at very low bit rates not only enhances existing markets but also enables new markets for digital audio. Where bandwidth is constrained, the value of “High-efficiency AAC” is magnified. It enhances not only audio-only services, but also video services like digital TV. By coupling HE AAC with MPEG-4 Video, more bits can be allocated to the video signal without degrading the quality of the audio signal. This is true for mono, stereo, and multichannel applications. Combined with the new MPEG/ITU Advanced Video Coding standard (included in MPEG-4 as part 10), even more significant gains in quality are possible.
Mobile streaming and download
There is significant expectation for mobile multimedia services associated with the new 2.5G and 3G mobile service networks. Streaming video and other high-bandwidth services have been showcased as the ideal applications for this new infrastructure. The problem is that for much of the coverage area the peak bandwidth in these networks is generally around 144Kbps with individual users sustaining connections of about 40Kbps. Delivering quality video over this type of connection is problematic and may not meet the consumer expectation.
HE AAC is well suited to solve this problem by being able to provide consumer-grade download and streaming audio services within today’s bandwidth. 48Kbps HE AAC provides CD-quality stereo programming while at 32Kbps, it provides excellent quality stereo programming. These bit rates combined with the growing market for subscription audio services show a strong business opportunity starting in 2003 on into 2004 and beyond.
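As a rough illustration of what these bit rates mean for a mobile service, the sketch below computes the audio payload of a single track and checks each rate against the sustained per-user rate of about 40Kbps quoted above. The 4-minute track length is a hypothetical value, "Kbps" is assumed to mean 1000 bits per second, and container and protocol overhead are ignored, so the numbers are indicative only.

```python
def payload_megabytes(bitrate_kbps: float, duration_s: float) -> float:
    """Audio payload in megabytes (10^6 bytes), ignoring container overhead."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


def fits_sustained_link(bitrate_kbps: float, link_kbps: float) -> bool:
    """True if the stream bit rate does not exceed the sustained link rate."""
    return bitrate_kbps <= link_kbps


TRACK_SECONDS = 4 * 60   # hypothetical 4-minute track (assumption)
SUSTAINED_KBPS = 40      # sustained per-user rate quoted in the text

for rate in (48, 32):
    size = payload_megabytes(rate, TRACK_SECONDS)
    ok = fits_sustained_link(rate, SUSTAINED_KBPS)
    print(f"{rate} Kbps: {size:.2f} MB per track, real-time streaming fits: {ok}")
```

Under these assumptions the 32Kbps stream fits a 40Kbps sustained connection in real time, while 48Kbps content would need buffering or download delivery on such a link.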
Digital broadcast via satellite and cable
aacPlus gained its first commercial success with XM Satellite Radio. Over a given transponder bandwidth, aacPlus allowed XM Radio to offer more channels at a higher quality than the competing Sirius Radio system which uses the proprietary PAC codec from Lucent. aacPlus was also selected by the Digital Radio Mondiale consortium as part of the standard for Shortwave and AM digital radio.
This heritage has brought credibility for High-efficiency AAC in the open standards world of digital satellite and cable broadcasting. As operators look to enhance their services with more channels or with high-definition, the efficiency of HE AAC gives them more options to either consolidate audio bandwidth to make room for more video or to layer more audio services like multi-lingual and 5.1 surround. Since Spectral Band Replication (SBR) is being added to the MPEG-2 standard as well, operators have the flexibility to use HE AAC within the MPEG-2 or MPEG-4 context as desired.
Video on demand and subscription content services are continuing to grow on the Internet. In these services, aggregate server bandwidth usage and last-mile bandwidth constraints make it difficult for operators and aggregators to provide high-quality, reliable services to consumers. By cutting the audio bandwidth requirements nearly in half, MPEG-4 HE AAC both reduces costs and increases reliability for Internet content services. Since HE AAC also follows AAC in being compatible with rights management and in carrying no additional content distribution fees, it helps create a safe and economical vehicle to deploy audio and audio/video content over the Internet.
Availability and Licensing
Following the lead of MPEG-4 AAC, the SBR technology has a straight per-codec licensing structure without electronic content distribution or usage fees. This structure makes MPEG-4 HE AAC ideal for commercial content distribution services. Licensing for HE AAC consists of two parts, with schedules and details available from Coding Technologies for the SBR object types and from Via Licensing for the AAC object types. MPEG-4 HE AAC is a proven technology which is already widely deployed and ready for use today. Reference decoder source code will soon be made available through MPEG, and optimized source code for both encoders and decoders is available for license from different vendors, including Coding Technologies and Fraunhofer IIS. Optimized binary implementations for Win32, Linux, MacOS X, ST Micro, ARM, Motorola, and Trimedia are available now and other firmware implementations are being completed.
SBR or Spectral Band Replication was developed by Coding Technologies as a generic method to significantly enhance the efficiency of perceptual audio codecs like MPEG Layer-3 (mp3) and MPEG AAC. SBR does not replace the core codec, but rather operates in conjunction with it to create a more efficient superset that can cut the required bit rate in half. MPEG-4 Audio uses SBR in conjunction with AAC to create the “High-efficiency AAC” profile which Coding Technologies has given the name “aacPlus”.
Present in both the encoding and decoding process, SBR leverages the correlation between the low and high frequencies in an audio signal to describe the high end of the signal using only a very small amount of data. This SBR data describing the high frequencies is coupled with the low-frequency compressed data from the AAC codec. Once combined, the complete HE AAC bitstream contains enough data to recreate the original signal.
For example, to create 48Kbps stereo HE AAC, the encoder generates two signals: an MPEG AAC signal at about 42Kbps and an SBR signal at about 6Kbps. The SBR signal is then placed into the MPEG AAC auxiliary fields as defined in MPEG-4 and sent out as a complete 48Kbps MPEG-4 HE AAC bitstream.
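The bit budget in this example can be sketched in a few lines. The function name and the fixed 6Kbps SBR share are assumptions taken from the single example in the text; a real encoder decides the split itself and the SBR share will vary with the signal and the target rate.

```python
from typing import Tuple


def split_he_aac_bitrate(total_kbps: float, sbr_kbps: float = 6.0) -> Tuple[float, float]:
    """Split a total HE AAC bit rate into (AAC core, SBR side data).

    The 6 Kbps default mirrors the 48 Kbps example in the text and is
    an illustrative assumption, not a property of the standard.
    """
    if not 0 <= sbr_kbps <= total_kbps:
        raise ValueError("SBR rate must lie between 0 and the total rate")
    return total_kbps - sbr_kbps, sbr_kbps


total = 48.0
core, sbr = split_he_aac_bitrate(total)
print(f"AAC core: {core:g} Kbps, SBR: {sbr:g} Kbps, SBR share: {sbr / total:.1%}")
```

With the figures from the text, the SBR side data accounts for one eighth of the stream, which is what "a very small amount of data" amounts to in this example.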
Since the SBR data is placed within the AAC auxiliary fields, the enhanced signal will be accepted by both an existing AAC decoder and a new HE AAC decoder. If sent to an AAC decoder, only the low-frequency audio signal will be recognized and decoded. If sent to an HE AAC decoder, the SBR and the AAC will be decoded to recreate the full frequency signal. This technique makes the new profile forward compatible with AAC. Also, since the HE AAC decoder contains a full-fledged AAC decoder, it is able to decode both the “Plain AAC” and “High-efficiency AAC” MPEG-4 Audio profiles. This combination makes HE AAC backward compatible with AAC.
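The compatibility behavior described above can be modeled with a toy example. The `Frame` class, the two decoder functions and the returned labels are hypothetical and only illustrate the decision logic; they do not reflect the actual MPEG-4 bitstream syntax or any real decoder API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    """Toy frame: AAC core payload plus optional SBR data in an auxiliary field."""
    aac_core: bytes
    sbr_extension: Optional[bytes] = None


def decode_plain_aac(frame: Frame) -> str:
    """A plain AAC decoder reads the core and skips any auxiliary SBR data."""
    return "core audio only"


def decode_he_aac(frame: Frame) -> str:
    """An HE AAC decoder decodes the core and applies SBR when it is present."""
    if frame.sbr_extension is None:
        return "core audio only"          # plain AAC stream
    return "core audio + SBR high band"   # full-bandwidth reconstruction


plain = Frame(aac_core=b"core")
enhanced = Frame(aac_core=b"core", sbr_extension=b"sbr")

for name, frame in (("plain AAC stream", plain), ("HE AAC stream", enhanced)):
    print(f"{name}: AAC decoder -> {decode_plain_aac(frame)}; "
          f"HE AAC decoder -> {decode_he_aac(frame)}")
```

Every combination of stream and decoder yields audio; only the pairing of an HE AAC stream with an HE AAC decoder restores the high band.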
MPEG – http://mpeg.tilab.com/
M4IF – http://www.m4if.org/
Coding Technologies – http://www.codingtechnologies.com/
Via Licensing - http://www.vialicensing.com
Fraunhofer IIS - http://www.iis.fraunhofer.de/amm/
Digital Radio Mondiale – http://www.drm.org/
XM Radio – http://www.xmradio.com/
Martin Dietz and Stefan Meltzer, aacPlus: A State of the Art Audio Coding Scheme, European Broadcasting Union Technical Review: No. 291 (July 2002), http://www.ebu.ch/trev_291-dietz.pdf
Gerhard Stoll and Franc Kozamernik, EBU Listening Test on Internet Audio Codecs, EBU Technical Review: No. 283 (June 2000), http://www.ebu.ch/trev_283-kozamernik.pdf
MPEG Meeting Press Release (October 2002),
Oliver Kunz, Codec Designs Fine-tuned with Spectral Band Replication, EE Times (June 24, 2002), http://www.commsdesign.com/design_center/homenetworking/design_corner/OEG20020621S0076
Martin Dietz, Lars Liljeryd, Kristofer Kjörling, and Oliver Kunz, Spectral Band Replication, a novel approach in audio coding, 112th AES Convention, Munich, May 2002, Preprint #5553
ETSI TS 101980 v1.1.1 (September 2001), Digital Radio Mondiale; System Specification
Recommendation ITU-R BS.1348-1 (February 2001): Service requirements for digital sound broadcasting at frequencies below 30 MHz
MPEG document N5201, Text of ISO/IEC 14496-3:2001/FPDAM 1, Bandwidth Extension