Your car probably has four (or more) speakers – but you are using them to listen to two-channel FM broadcasts – a technology that was invented in the ’50s. Now, imagine that your radio is feeding your ears immersive digital surround music and cinematic production effects – the sort of thing that you hear in a well-equipped movie theater or on a state-of-the-art home surround set-up. Wouldn’t you think this would be a much better way to introduce the benefits of digital radio broadcasting to the public than the “improved stereo” message we have now?
Look around, and you will see plenty of action in the surround audio sphere. DVD Video audio tracks are universally in 5.1 format. You will see surround speaker setups in any store that sells audio or video equipment. Even computer shops are full of 5.1 sound cards and speaker systems. The new DVD Audio and SACD discs are almost always produced with a surround option.
While the focus for multi-channel audio has been elsewhere, surround actually makes a lot of sense for radio. Most listening occurs in cars, and the environment there is pretty good for enjoying multichannel music. There is no problem finding space for the 4 or 5 speakers and sub-woofer. In contrast to an office or home, you are in a stable position relative to the speakers. Audio is the time-tested accompaniment to driving and surround is a natural next step.
But there hasn’t been much talk yet about surround for radio. That is because the technology needed to accomplish it effectively has just recently been invented, and is only now being introduced. Just a few years ago, it seemed we didn’t have enough bandwidth for quality stereo in IBOC, let alone surround. But multi-channel audio coding technology has advanced quite amazingly, and with surround a real here-and-now possibility for radio broadcasting, you can expect to hear a lot about it in the coming months.
As always, Fraunhofer Institute (FhG), the people who invented MP3 and most of MPEG AAC, have been busy pushing the frontiers of audio perceptual research. The latest result is a powerful spatial audio coding system that takes advantage of the most up-to-date knowledge in aural perception. From psychoacoustic studies, it has been learned that the level difference, time difference, and coherence between channels are what create the perception of a spatial image. The key to FhG’s multichannel system is that they represent these difference values with very compact coding, rather than transmitting all of the individual audio channels. The encoder estimates the values as a function of frequency (that is, within each sub-band) and transmits them to the decoder in an ancillary stream that accompanies the main coded audio stream.
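To make the idea of per-band spatial cues concrete, here is a minimal Python sketch that estimates a level difference, a phase difference (the frequency-domain counterpart of a time difference), and a coherence value for each sub-band of one channel pair. It is an illustration only, not the FhG algorithm: the function names, the naive DFT, the band edges, and the exact cue formulas are all assumptions chosen for readability.

```python
import cmath
import math

def dft(frame):
    """Naive DFT of a short real frame; returns bins 0..N/2 (demo use only)."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
            for k in range(n // 2 + 1)]

def spatial_cues(left, right, band_edges):
    """Return (level_diff_db, phase_diff_rad, coherence) for each sub-band.

    band_edges lists DFT bin boundaries (hypothetical layout):
    [1, 4, 8] means one band covering bins 1-3 and one covering bins 4-7.
    """
    spec_l, spec_r = dft(left), dft(right)
    eps = 1e-12  # keeps silent bands from dividing by zero
    cues = []
    for lo, hi in zip(band_edges, band_edges[1:]):
        power_l = sum(abs(spec_l[k]) ** 2 for k in range(lo, hi))
        power_r = sum(abs(spec_r[k]) ** 2 for k in range(lo, hi))
        cross = sum(spec_l[k] * spec_r[k].conjugate() for k in range(lo, hi))
        level_db = 10 * math.log10((power_l + eps) / (power_r + eps))
        phase = cmath.phase(cross)
        coherence = abs(cross) / math.sqrt((power_l + eps) * (power_r + eps))
        cues.append((level_db, phase, coherence))
    return cues

# Example: the right channel is the left channel at half amplitude.
n = 32
left = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
right = [0.5 * x for x in left]
cues = spatial_cues(left, right, [1, 4, 8])
```

In this example the first band should show a level difference of about 6 dB, a phase difference near zero, and a coherence close to 1, since the two channels differ only in gain. A handful of such numbers per band is far more compact than a second coded audio channel, which is the point of the approach described above.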
The block diagrams illustrate how an encoder/decoder pair would work within a broadcast channel. The first step is to create the compatible stereo downmix from the multi-channel material. The resulting stereo signal is coded using any perceptual codec. Since there are no changes to the basic codec, this signal can be received by stereo radios. The spatial encoder extracts the various spatial cue parameters from the multi-channel input, which are transmitted in an ancillary data channel. The decoder, if present in the receiver, recreates the original multichannel audio.
In the diagrams, you can see that we need to have a downmix function to create the compatible stereo channels from the multi-channel source. The most obvious way to do this is with a simple linear combiner, as follows:
where a and b are constant scale factors, with the values usually ranging from 0.5 to 0.7. But this simple procedure is far from the best possible.
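As a rough illustration of such a fixed-gain combiner, the Python sketch below adds the center channel to both outputs with gain a and each surround channel to its own side with gain b. That assignment of a and b is a common convention and is assumed here, not taken from the FhG system; the function name is hypothetical, and the low-frequency effects channel is left out for simplicity.

```python
def linear_downmix(l, r, c, ls, rs, a=0.7, b=0.7):
    """Fold five channels of samples into a stereo pair using fixed gains.

    Assumed form (not from the source):
        Lo = L + a*C + b*Ls
        Ro = R + a*C + b*Rs
    """
    left_out = [l_i + a * c_i + b * ls_i for l_i, c_i, ls_i in zip(l, c, ls)]
    right_out = [r_i + a * c_i + b * rs_i for r_i, c_i, rs_i in zip(r, c, rs)]
    return left_out, right_out

# One-sample example with a = b = 0.5.
lo, ro = linear_downmix([1.0], [0.0], [1.0], [0.0], [1.0], a=0.5, b=0.5)
```

Because a and b never change, every recording is folded down the same way regardless of how it was mixed, which is why a fixed combiner like this leaves room for improvement.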
When making an optimized downmix, there are a number of considerations which come from both psychoacoustics and production practices. We must present a stereo mix to listeners without multi-channel receivers that sounds as good as a stereo-only broadcast. Simply collapsing the front and back signals into a 2-channel representation may cause confusion in the normal binaural cues and degrade stereo listening. And it almost certainly will sound different from the version that listeners are used to hearing. The FhG system allows a producer to make a manual downmix, thus preserving maximum artistic freedom and allowing maximum flexibility to adapt to different kinds of audio material. Since almost all music released in surround format also has a stereo version on the same disc that could be used as input to the encoder, this stereo version is what would be heard by listeners with non-surround radios – with no modification or compromise of any kind.
Advanced automated downmixing is also an option when manual mixes are not available. A processor could dynamically modify the scaling values and relative phase during mixdown. Such a processor would use advanced algorithms that take into consideration the absolute source positioning, panning laws, the way sources were mixed into the multi-channel signals, and original inter-channel phase relationships. This approach would have the potential to achieve a quality that is comparable to manual downmixes.
All well and good, I hear you asking, but will this work with HD Radio? The astonishing answer is: Yes. The FhG spatial encoding system is fully compatible with HD Radio’s current codec for the stereo channels. And the side-channel for spatial information is less than 20 kbps, a rate that is possible in HD Radio’s ancillary data channel.
Maybe you remember the quadraphonic systems from the ’70s that had a brief and unsuccessful run on vinyl and at a few radio stations? Don’t confuse this modern multi-channel perceptual approach with those – or any of the current descendants that are around. While these latter have new names, they are lipstick applied to the withered old lips of the failed ’70s vinyl quad schemes. They have the critical drawback that only fixed-scale downmixes are possible, so stereo compatibility suffers. This is one reason the ’70s-era matrix systems didn’t catch on – they had a weird soft and indistinct quality in stereo. Clearly, this is an important issue for broadcasters. With most people listening in stereo, we can’t afford to compromise our fundamental service. And that is why the FhG approach is so well-suited to radio broadcast: the system does not depend upon any specific downmix procedure to work. Indeed, the downmixing process can be thought of as a component outside of the basic spatial coding system.
Another problem with matrix schemes is poor surround separation. Matrix systems must mingle everything into a 2-channel signal, which is a crippling constraint on performance. They can have only a few dB separation between some of the channel pairs. (Which channels get the separation and which don’t are design compromises. Each system deals with this differently.) Because FhG’s spatial encoding uses an independent digital side-channel and a modern perceptual approach to spatial cue encoding, it offers very high separation that does not depend on the nature of the audio and does not need to be compromised for stereo compatibility. By the way, beware of matrix demonstrations using material in one or two channels at a time. These are deceptive because a steering circuit – a gain processor in something like a noise-gate configuration combined with an operation that dynamically varies the matrix coefficients – detects this very directional condition and steers the strongest signal into the target channel, while reducing gain or providing some kind of cancellation in the other channels. (This approach is also a leftover from the ’70s, having first been used in the Tate and Vario-Matrix “logic” schemes.) With normal non-ping-pong programming, which has material present in all the channels simultaneously, the separation is dependent upon the underlying matrix scheme and is much poorer than the demonstrations suggest. In today’s digital world, there is no reason we should bind ourselves to such limited approaches.
The ISO/MPEG audio group has noted these recent advances and their market potential and has started a new work item with the working title Spatial Audio Coding. FhG will submit their spatial approach to MPEG for consideration and testing, and chances are good that it or some variation will eventually be approved as an international standard. Thus there will be the usual advantages of MPEG: an independent confirmation of performance, and assurance of fair access to licensing.
Where will we get the music to play? In fact, there is a lot of material already available in multichannel. There are a few hundred DVD Audio and SACD surround discs to get us started. These are perfect source material for surround HD Radio, and are off-the-shelf today. A bit less perfect, but still useful, are the Dolby Digital and DTS 5.1 audio tracks that accompany DVD video clips and concerts. With surround broadcasting up and running, record companies will have a powerful incentive to release new material in multi-channel. If the music industry offers music in a surround format, and radio promotes it, there is a fighting chance to get affluent baby boomers interested in buying music again. And if younger people have the opportunity to hear music in this modern format, they might get more interested in both high-end audio and the high-resolution music discs that feed it.
One lesson from DAB in Europe is that mere “improved digital sound” is not enough to cause listeners to buy new and more expensive radios. We need a significant and clear message to motivate change. The one country that has had DAB success is the UK, and they did it with a combination of new programs and cross-promotion. The new program door is closed to us in the USA because HD Radio must be simulcast. But have a look at any shop that sells car audio gear. See all the multi-speaker setups? The subwoofers? The early adopters who are looking for the maximum aural experience? What about all the people with surround home theater systems? Wouldn’t immersive audio on your air appeal to them – and be a good thing for your station? Wouldn’t you like to listen to your station in surround?
And isn’t this a win for all? Listeners get something compellingly new and interesting. They already know about 5.1 from their exposure to home theater and could readily imagine the benefits of having that experience in their cars. FM radio stations again take the lead in offering a superior audio technology. HD Radio gets a clear and understandable value proposition. Record companies get to sell their libraries all over yet again – and in a format less amenable to MP3 copying. Programmers and production directors get to create cinematic high-wow-factor promo pieces to breathe new life into programming. And there’s even something for the sales department…audio shops will sell a lot of new gear, and will need to advertise their wares – on radio, of course! So, when do we start?