Most of the literature about Ambisonics focuses on its mathematical underpinnings. This article is about the practical use of Ambisonics in mixing music--not theories and formulas.
The big deal about Ambisonics is that it allows one to record, synthesize, and mix material--including 3D spatial information independent of any specific speaker configuration--using just 4 discrete audio channels. The Ambisonics B-Format signal can be decoded to basically any speaker configuration, be it one of the traditional Ambisonics arrays (square, hexagon, cube, etc.) or one of the more commercially popular surround configurations such as 5.1, 7.1, etc. The advantage of this should be obvious: from one master mix, an infinite number of consumer configurations can be derived. Or, in a perfect world, you could publish your work in Ambisonics B-Format or UHJ and let the listeners decode it to their specific configuration.
You can even decode it to plain old stereo, and the result will have a more stable image than you're likely to get with the industry standard intensity panning.
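If you're curious what a decoder actually computes, here is a minimal sketch in Python (using numpy), assuming a naive first-order projection decode to a regular horizontal ring of speakers. The function name is mine; a real decoder such as ambdec additionally applies psychoacoustic shelf filters and speaker-distance compensation that this omits.

    import numpy as np

    def decode_ring(w, x, y, n_speakers):
        """Naive projection decode of horizontal B-Format to a regular
        ring of n_speakers (sketch only; no shelf filters, no distance
        compensation)."""
        angles = 2.0 * np.pi * np.arange(n_speakers) / n_speakers
        # sqrt(2) undoes the -3 dB traditionally applied to W at encode time
        feeds = [(np.sqrt(2.0) * w + x * np.cos(a) + y * np.sin(a)) / n_speakers
                 for a in angles]
        return np.stack(feeds)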
Ambisonics can be recorded directly with a device generally known as a soundfield microphone. The most popular configuration comprises four capsules arranged on the faces of a tetrahedron. The output of the capsules is in so-called A-Format, which is then transformed into B-Format with some simple math and some not-so-simple filtering to compensate for the fact that the capsules can never be truly coincident.
When recorded this way, the resultant B-Format signal encodes the spherical soundfield that existed at the time of the recording, including all the distance cues and reverberation inherent to the natural space.
If one's intent is to record live material in this way, then there's not much need for additional processing.
However, as with any ambient recording, options for further manipulation are limited. For this reason, in these sections we will focus instead on the synthetic encoding of position and distance--which is more in line with multi-track DAW scenarios.
Most Ambisonics encoders (often referred to simply as panners), and all extant Ambisonics LADSPA plugins, are capable of encoding only one aspect of the sound's position: its angular relationship to the point of origin (what would be the microphone in a soundfield recording). The math of Ambisonics encoders does not simulate any other aspect of the natural environment, such as the distance of a sound source, the speed of sound in air, the inverse square law, or reverberance.
It is this point that distinguishes Non Mixer's Spatializer from an ordinary encoder/panner.
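For reference, here is roughly what such an ordinary encoder computes, sketched in Python with the traditional Furse-Malham first-order weights (function and parameter names are mine). Note that azimuth and elevation are all it knows about; distance doesn't figure into it.

    import numpy as np

    def encode_first_order(mono, azimuth, elevation=0.0):
        """Plain first-order Ambisonics panner (Furse-Malham weighting).
        It encodes only the source's angle relative to the origin."""
        w = mono / np.sqrt(2.0)                          # omnidirectional
        x = mono * np.cos(azimuth) * np.cos(elevation)   # front-back
        y = mono * np.sin(azimuth) * np.cos(elevation)   # left-right
        z = mono * np.sin(elevation)                     # up-down
        return w, x, y, z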
The four channels of Ambisonics B-Format contain enough information to not only place a sound in a circle around the listener, but also to place it at any elevation above or below. This is often referred to as periphony. However, few playback systems have the necessary speaker arrangement to reproduce vertical positioning. Furthermore, given the typical physical arrangement of real-world performances, the musical value of such reproduction is questionable. Nevertheless, the information is there, in the Z channel.
Since Ambisonics encodes the entire soundfield, and not simply discrete speaker positions, it is possible to extract the response of a virtual microphone aimed at any point on the soundsphere. Typically, this technique is used to extract close-mic-like signals from an existing soundfield recording. However, as previously mentioned, any reverberance inherent in the recording that arrives from the direction the virtual microphone is pointed in cannot be eliminated.
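A common formulation of such a virtual microphone is sketched below (names are my own choosing): the pattern parameter blends between omni, cardioid, and figure-of-eight responses.

    import numpy as np

    def virtual_mic(w, x, y, z, azimuth, elevation=0.0, pattern=0.5):
        """Extract a virtual microphone from first-order B-Format.
        pattern: 1.0 = omni, 0.5 = cardioid, 0.0 = figure-of-eight."""
        directional = (x * np.cos(azimuth) * np.cos(elevation)
                       + y * np.sin(azimuth) * np.cos(elevation)
                       + z * np.sin(elevation))
        # sqrt(2) restores W to the original pressure level
        return pattern * np.sqrt(2.0) * w + (1.0 - pattern) * directional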
Some misconceptions about reverb in Ambisonics deserve debunking. The first is the idea that you have to convert from B-Format to A-Format, do some reverb, and then convert back to B-Format. This is simply false. Because the A/B transformation is linear, and applying the same reverb to every channel commutes with any linear channel matrix, the result of converting from B-Format to A-Format, running a mono reverb on each channel, and converting back to B-Format is 100% identical to the result of just running that mono reverb on each channel of the B-Format signal. (This concept originates from the analog days and the idea of sending Ambisonics A-Format through a 'quad' reverb tank, where the delay lines would never be truly identical.)
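This is easy to verify numerically. In the sketch below, a random orthogonal matrix stands in for the B-to-A transform; because convolution is linear and every channel gets the identical impulse response, the matrix and the reverb commute, so the detour through "A-Format" changes nothing.

    import numpy as np

    rng = np.random.default_rng(0)
    b = rng.standard_normal((4, 256))      # a B-Format signal (W, X, Y, Z)
    ir = rng.standard_normal(64)           # one mono reverb impulse response
    M, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # stand-in B->A matrix

    def mono_reverb_per_channel(sig):
        # the SAME linear reverb applied independently to every channel
        return np.stack([np.convolve(ch, ir) for ch in sig])

    direct   = mono_reverb_per_channel(b)
    detoured = np.linalg.inv(M) @ mono_reverb_per_channel(M @ b)

    print(np.allclose(direct, detoured))   # True: identical results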
Note that just applying a mono reverb doesn't produce a very convincing effect. This is due to the lack of back-and-forth reflection (as between the walls of a room). A simple simulation of such reflection can be added by including an Ambisonics rotation operation in the reverb's feedback loop--no need to involve A-Format or separate delays on each channel at all.
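As a sketch of the idea, here is an unrolled single-delay-line version in Python (a real reverb would use several delay lines plus damping filters; the angle and feedback values here are arbitrary):

    import numpy as np

    def rotate_z(b, angle):
        """Rotate a first-order B-Format signal about the vertical axis."""
        w, x, y, z = b
        return np.stack([w,
                         x * np.cos(angle) - y * np.sin(angle),
                         x * np.sin(angle) + y * np.cos(angle),
                         z])

    def rotating_echo(b_in, delay=4410, feedback=0.5, angle=np.pi / 3, taps=8):
        """Toy B-Format echo: every trip around the (unrolled) feedback
        loop rotates the soundfield, crudely imitating reflections
        bouncing between the walls of a room."""
        n = b_in.shape[1]
        out = np.zeros((4, n + taps * delay))
        tap = b_in
        for i in range(taps + 1):
            out[:, i * delay : i * delay + n] += tap
            tap = feedback * rotate_z(tap, angle)  # each bounce is rotated
        return out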
The second misconception is that a convolution reverb can't operate on a B-Format signal directly. The published Ambisonics jconvolver configs operate under this assumption. All of them either have a single input (presumably fed by an impulse from the frontal direction) or a number of separate IRs for discrete impulse positions. The idea is that you have to pick the right (mono) input. Sometimes one may use the trick of sending a B-Format signal to a number of directional decoders whose outputs are mapped to the corresponding convolution inputs. This doesn't work very well because, at first order, virtual microphone rejection is not that great and there will always be some bleed into the other inputs, resulting in a smeared image with phasing issues. And connecting each sound source to each reverb input separately is an annoying burden (and it can't cope with moving sources at all).
However, there's another option: when a B-Format signal is fed *directly*, W,X,Y,Z to W,X,Y,Z, into the channels of a B-Format IR, you get a reverb whose back-and-forth reflections are expressed relative to the input. Now, this doesn't result in the *exact* same thing you'd get from an impulse at, say, 45 degrees, but it is close to what you'd get if you took that 0 degree impulse and rotated it by 45 degrees. If the room the IR was captured in was circular, then the two are actually fairly equivalent. And when the room was not circular, the difference between changing the angle of the impulse and changing the angle of the IR is so subtle that I sincerely doubt anyone can hear it (I certainly can't). Maybe a bat or dolphin or some other creature with a biology tuned to echo-location can distinguish the shape of a room by the sound of its reverberations, but all us lowly humans can do (and even then with some straining) is determine the direction of the first reflections. Your mileage may vary, and this technique might not scale up to higher orders well.
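In signal terms, as I read it, this wiring amounts to a per-channel (diagonal) convolution, which could be sketched like this:

    import numpy as np

    def bformat_reverb(b_sig, b_ir):
        """Channel-matched convolution of a B-Format signal with a
        B-Format impulse response: W against W, X against X, and so on."""
        return np.stack([np.convolve(b_sig[i], b_ir[i]) for i in range(4)])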
For a high-quality Ambisonics reverb that does a good job of preserving the direction of early reflections, use one of my jconvolver configurations:
Load the configuration of your choice in jconvolver (typically run via nsm-proxy)
Then, in Non Mixer, create two strips to represent your reverb bus:
[Early Reverb] (4 channels) and [Late Reverb] (1 channel)
and one to represent a master bus:
[Master] (4 channels)
Create strips for your instruments, then add Spatializer modules to all of your instrument strips. Connect the 'early reverb' outputs of each spatializer module to the [Early Reverb] bus. Connect the 'late reverb' outputs to the [Late Reverb] bus. Then connect the direct outputs of the strips to the [Master] bus.
Now connect the outputs of the [Early Reverb] bus to the W,X,Y,Z inputs of jconvolver and the output of the [Late Reverb] bus to the Late Reflection input of jconvolver. Connect jconvolver's W,X,Y,Z outputs to the inputs of the [Master] bus.
The outputs of the [Master] bus can be connected to ambdec for decoding, or one of the AMB decoder LADSPA plugins may be used.
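If you prefer to script the jconvolver wiring rather than patch it by hand, something like the following would do it, using the JACK-Client Python package. Every port name here is illustrative--Non Mixer and jconvolver name their ports according to your session, so list the real names with jack_lsp and adjust:

    import jack

    client = jack.Client("ambi-wiring")
    client.activate()

    for i in range(4):  # W, X, Y, Z, in order
        # [Early Reverb] bus -> jconvolver's B-Format inputs (names assumed)
        client.connect(f"Non-Mixer/Early Reverb:out-{i + 1}",
                       f"jconvolver:in_{i + 1}")
        # jconvolver's wet B-Format outputs -> [Master] bus (names assumed)
        client.connect(f"jconvolver:out_{i + 1}",
                       f"Non-Mixer/Master:in-{i + 1}")

    # [Late Reverb] bus -> the config's late-reflection input (name assumed)
    client.connect("Non-Mixer/Late Reverb:out-1", "jconvolver:in_5")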