Reading Summaries from “THE ART OF DIGITAL AUDIO RECORDING” by Steve Savage (written for Audio Production II, Fall 2012)
Reading Summary #1 (Chapter 1 – Complete; Chapter 2.4 – Mixing Boards & Control Surfaces p. 29 – 44)
Chapter 1 of the text offers a quick overview of the qualities of digital recording, particularly those that have revolutionized the recording process and left analog behind, to the point where analog is now regarded as an anachronism that a small minority of audiophiles still swear by but that few people ever actually use. (Try shopping for “recording tape” sometime if you want confirmation of this fact.)
First of all, digital recording is nondestructive, because with no tape, there is no need to erase. Even when punching in, the original audio track remains intact in case you decide you prefer it after all. Nondestructiveness also applies to editing: when you make a “cut”, you are actually instructing the software to play back the audio in a certain way, without changing the underlying file. This opens up many more aesthetic choices to a producer, engineer, or artist than were available in analog, when tape was physically cut with razor blades and reconnected with splicing tape. (It sounds primitive now, but it was not so long ago that this was the norm.)
Most signal processing units have been recreated digitally now, with added tricks that were previously unavailable, like the ability to alter tempo without changing pitch (and vice versa…hello Auto-Tune). Automation has developed to a new level at the mixing stage; while automated faders and other controls were already available on high quality analog consoles in the past, the subtlety, complexity, and sheer ease of digital automation offers engineers abilities and choices they couldn’t have had previously.
Two things that have not changed: 1) the importance of keeping aware of signal path in order to stay on top of your process and easily ferret out any potential problems, and 2) Ray Charles’ sage observation that at the end of the day, the only question that matters is “What does it sound like?”
In the fourth section of Chapter 2, we focus on mixing boards and control surfaces. The virtual control surface of a digital audio workstation is modeled after a traditional mixer and the channel strips look strikingly similar. You can create a large number of channels of different types; how many are available depends on your software, but generally you have more options than on physical consoles.
I/O settings are very important. Each channel must have its primary input and output assigned. One must know the difference between interface routing (to or from external gear such as an MBox/other audio interface, or speakers, etc.) and buss routing (to or from another internal track or plug-in within the computer itself).
An external audio interface may not necessarily have a mic preamp, which is crucial to getting a decent mic signal. Do your research and make sure you have what you need, especially if you are using a mic that requires phantom power.
Your software mixer will include both inserts (which will be covered more thoroughly in a later chapter) and aux sends, which are additional outputs routed somewhere other than the main output, usually to an aux channel that offers further signal processing and control possibilities for that channel’s audio. Aux sends are also used for headphone mixes. They can be pre-fader or post-fader, depending on whether or not you want the diverted signal to be controlled by the audio channel’s fader.
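The pre-fader vs. post-fader distinction can be sketched as simple arithmetic. This is a minimal illustration, not real DSP; the gain values are invented, and levels are expressed as linear multipliers rather than dB for simplicity.

```python
def send_output(signal: float, fader: float, send_gain: float, pre_fader: bool) -> float:
    """Return the level sent to the aux buss for one signal level value."""
    if pre_fader:
        # Pre-fader: the send taps the signal before the channel fader,
        # so pulling the fader down does not affect (say) a headphone mix.
        return signal * send_gain
    # Post-fader: the send follows the fader, so channel-level moves
    # carry through to the aux buss as well.
    return signal * fader * send_gain

level = 1.0
print(send_output(level, fader=0.5, send_gain=0.8, pre_fader=True))   # 0.8
print(send_output(level, fader=0.5, send_gain=0.8, pre_fader=False))  # 0.4
```

With the fader halved, only the post-fader send level drops, which is why pre-fader sends are the usual choice for headphone mixes.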
We get a short look at panning, the placement of each track across the stereo spectrum, and how it works. We then have an introduction to “groups”, a handy way to assign multiple channels to a single fader and control them all together. Also, the book stresses the importance of naming your tracks as soon as you create them, in order to avoid the nightmare of searching through a list of regions named “Audio 1.01” or “Audio 9.08”, etc. We get a quick look at other types of channels besides audio, such as aux input, master fader, MIDI, and “instrument”, a convenient combination of a MIDI and an aux channel designed for virtual instrument plug-ins. The section ends by comparing the use of external mixers (analog or digital) vs. sticking to the virtual control surface.
Reading Summary #2 (Chapter 6, 6.1, 6.2 – Mixing p. 170 – 186; Chapter 6.2 – Mixing procedures p. 195 – 199):
The chapter begins by pointing out, essentially, that mixing is more of an art than a science, and there is no simple cut-and-dried way to make a good mix. However, along with the creativity and imagination, there is a great deal of fine detail involved, and a good mix requires intense focus and attention. Listeners may not be analyzing the mix in a conscious fashion, but it will have major effects on how they hear the music; it may even be the factor that determines whether they like a song or not.
Mixing is defined as “the combining of audio elements”, or taking multiple channels/tracks/mic inputs etc. and combining them into one track (mono), two tracks (stereo) or more tracks (5.1 or 7.1 surround sound). While surround sound is the norm for film and mono is loved by some retro music fans and audiophiles, stereo remains the dominant audio format. Remixing, which once meant simply redoing a mix with different volume levels, panning, signal processing etc., now has been expanded to mean not only this but adding additional elements and synching them with the original tracks in creative ways, for instance for the dance floor. (Jamaican dub reggae and extended 12” disco mixes may have been the beginning of this rethinking of the creative possibilities of mixing.) Mixing can drastically alter the nature of a song, which raises serious issues of who has the right to approve a mix. It can be a touchy subject; I can attest to this from personal experience.
We are reminded again that speakers with a reasonably flat frequency response and a room with good acoustics are essential for a good mix, as one must be able to listen critically and create a result that will sound good in any environment. The terms “in the box” (using the virtual control surface in your recording software) and “out of the box” (using a physical console plugged into your interface) are defined. The chapter advises us to listen to our work at a variety of levels, since the Fletcher-Munson curves show that we perceive different frequency balances depending on the volume at which we listen (our own ears do not have a perfectly flat frequency response either).
Organization of the tracks in our file is stressed. There is a standard for ordering tracks (drums, percussion, bass, guitars, keys, vocals, etc.) but no hard and fast rule beyond arranging them in a way that makes sense to you. Grouping is a handy technique that allows you to “lock” a group of tracks together once you have worked out their relative levels (for instance a drum kit) and move them together in a single action. Submixes and subgroups can also be created by sending several tracks to a stereo aux track and controlling them all from that aux track. The master fader works the same way as the master fader on a physical console: a single stereo fader that controls all the tracks in the mix together. Plug-ins can be installed on a master fader, but compression applied here can seriously mess up the fade-out at the end of a song.
There are a daunting number of decisions to make when it comes to levels. You must create a balance between all of the individual elements. It is helpful to think in terms of foreground and background, even if you’re going for the “wall of sound” approach. Panning will make a big difference in clarity and detail. The book encourages you to find alternatives beyond simply placing elements at hard left, hard right, or dead center.
We jump to the end of the chapter to look at levels again and introduce the concept of “the three dimensional mix”. Height is a metaphor for both amplitude and frequency, while width would be represented by panning, but how do you create depth? Volume level, of course, provides some illusion of distance; another tool to enhance our “audio z-axis” is reverb and delay. The final piece of this reading advises us to continually revise our work and keep our previous attempts safe by always saving each mix under a new name, using a consistent chronological numbering system. (Songtitle Mix 1, Songtitle Mix 2, etc.)
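The “save each mix under a new name” habit can even be automated. A minimal sketch, assuming the book’s example “Songtitle Mix N” pattern; the function name and titles are invented for illustration.

```python
import re

def next_mix_name(title: str, existing: list[str]) -> str:
    """Scan existing mix names and produce the next one in the sequence."""
    pattern = re.compile(re.escape(title) + r" Mix (\d+)$")
    # Collect the numbers of any saved mixes that match the naming scheme.
    numbers = [int(m.group(1)) for name in existing
               if (m := pattern.match(name))]
    return f"{title} Mix {max(numbers, default=0) + 1}"

print(next_mix_name("Songtitle", ["Songtitle Mix 1", "Songtitle Mix 2"]))
# → "Songtitle Mix 3"
```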
Reading Summary #3 (Chapter 5.1 – Plug-ins p. 154 – 159; Chapter 2.5 – EQ p. 44 – 55; Chapter 6.2 – Mixing continued p. 185 – 186)
This assignment brings together three sections of the book dealing with plug-ins in general and EQ in particular. First, we take a look at plug-ins, specifically as used in inserts. Inserts are an option used to directly alter the audio channel itself, which distinguishes them from sends. An example offered here is the way that EQ is almost always built into an analog mixer. Dynamics-based signal processors are also preferred as inserts, which is logical if you consider how a compressor or limiter is used. (The goal there is to change the dynamic range of a signal, so to use a compressor as an auxiliary send simply would not do the job.)
An important consideration in digital recording is the order in which inserts are added. On a DAW channel strip, inserts are arranged in vertical top-to-bottom order, and if you insert two or more plug-ins on a single track, the signal passes through each plug-in in that same order. This makes a big difference, for instance, if you are adding both EQ and compression: if the compressor comes second, it will be compressing the already-EQ’d signal. It is more common for the EQ to follow the compressor, but what you want will depend on the sound you are looking for. (As the little sidebar on page 158 reminds us, we should be cognizant of the rules, but the ultimate test is in the result, and one should never be afraid to “break” the rules and experiment as time allows.)
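Why order matters can be shown with two toy processors applied in opposite orders. The “EQ” here is just a gain boost and the “compressor” a simple clamp; these are crude stand-ins for illustration, not real audio processing.

```python
def eq_boost(x: float) -> float:
    return x * 2.0          # crude stand-in for an EQ boost

def compress(x: float) -> float:
    return min(x, 1.0)      # crude stand-in for compression/limiting

signal = 0.8
print(compress(eq_boost(signal)))  # EQ first, then compression → 1.0
print(eq_boost(compress(signal)))  # compression first, then EQ → 1.6
```

Same two processors, same input, different results, which is exactly the situation on a channel strip when you reorder inserts.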
Inserts can also be placed on submixes, such as an auxiliary track with all of the drums routed through it, to both support sonic unity and conserve CPU power.
We move back to chapter 2 with a review of EQ, or equalization, which raises (boosts) or lowers (dips) particular frequencies within the range of human hearing (approximately 20 Hz to 20 kHz) to alter the tone and timbre of our audio signal. (The reading includes an explanation of how timbre is the result of particular combinations of overtones and harmonics interacting with the fundamental pitch, to illustrate that EQ manipulation can subtly transform the timbre of your sound.) The three primary EQ parameters are level (how much boost or dip), the frequency being altered, and bandwidth (or Q), the RANGE of frequencies affected, which can be narrow or wide and appears onscreen as a bell-curve shape. Another option is shelving, which raises or lowers all frequencies above or below a certain point in the frequency range. A more extreme version of this is a high-pass or low-pass filter, which sharply cuts all frequencies below or above your chosen point, respectively. These are useful for cleaning up unwanted noise at the low or high ends.
Before parametric EQs became the current standard, graphic EQs were very common. (You still see these sometimes on live sound boards.) These mainly consist of a selection of faders, each capable of boosting or dipping a particular frequency. Changing the Q setting is not an option here.
EQ will change the phase relationship of sound as well, which can alter the results in unexpected ways.
Also, one use of EQ is to compensate for the limitations of human hearing itself. We do not hear with a flat frequency response; our ears have evolved such that mid-range sounds (the range of the human voice) are relatively more audible than sounds of equal amplitude at the high and low ends of the spectrum. (This is what the Fletcher-Munson curves describe.) An EQ tactic known as “the loudness curve” raises those less-audible frequencies so that we perceive something closer to a flat frequency response. On a graphic equalizer this is referred to as a “smile curve” because of the shape created when you gradually raise the faders on each end.
Do you EQ during the actual tracking process? This is usually frowned on, but there are circumstances, such as unwanted signals on a certain drum mic, where it can be useful. Particularly ugly sounds such as the dreaded 60-cycle hum can be filtered out. But in all cases where EQ is used during tracking, an engineer must keep in mind that any change made will also alter the other sounds coming in through that channel.
Remember the concept of “sounds best vs. fits best” when mixing. You are not just EQ’ing one instrument to sound good by itself; you must also consider how it interacts with the other instruments. An excellent example offered here is acoustic guitars. An acoustic strum can flavor a rock mix nicely, but the sound you want in that context is very different from the way you would EQ a solo folk musician.
Reading Summary #4 (Chapters 2.7 to 2.8 – Reverb and Effects p. 67– 71; Chapter 6.2 – Mixing continued p. 191 – 195)
This week’s reading deals with delay-based effects, which in this book’s definition include reverb. What we usually think of as “delay” or “echo” is referred to here as “long and medium delays”, which are characterized by discrete, repeating sounds that can be perceived as separate. Though these sounds do not exist in nature, they are reminiscent of the sort of natural echo found in certain environments.
A delay unit will have controls for the length of the delay (in milliseconds or ms…certain plug-ins can also be set according to musical notation if your recording is on a grid) and feedback (meaning the number of repeats, which can be anywhere from one when it is set to “0” all the way to infinity). Each repeat diminishes in volume, though this is also subject to control. “Medium” delays are between 100 and 175 ms, and “long” delays are about 250 ms or more. Medium delays are often referred to as “slapback”, usually used with no feedback for that classic rockabilly echo.
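The relationship between delay time and feedback can be sketched as follows: each repeat arrives one delay time later and quieter than the last. The delay time and feedback gain below are made-up example values, and feedback is modeled as a simple linear gain per repeat.

```python
def delay_taps(delay_ms: float, feedback_gain: float, repeats: int):
    """Return (time_ms, level) pairs for each repeat of a unit-level sound."""
    return [(delay_ms * (n + 1), feedback_gain ** (n + 1))
            for n in range(repeats)]

# A 150 ms medium ("slapback"-range) delay with three diminishing repeats:
for time_ms, level in delay_taps(150, 0.5, 3):
    print(f"{time_ms} ms  level {level}")
# 150 ms level 0.5 / 300 ms level 0.25 / 450 ms level 0.125
```

A true slapback would set the repeats to one (no feedback); higher feedback values stretch the tail of repeats out toward infinity.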
“Short” delays (1-50 ms) are not perceived as a delay or echo, and are not used to simulate natural environments but to “thicken” sounds or just add a cool effect. “Chorusing” uses a modulated short delay (modulation creates slight changes in pitch so that the staggered sounds are just slightly out of pitch with each other) that is often heard applied to guitars on 80s Goth records, among others. Doubling does a similar thing without the modulation, thickening the sound. Phasing and flanging use both modulation and extremely short delay times, causing controlled comb filtering as repeated out-of-phase sounds interfere with one another, creating that beloved “whooshing” effect we hear on old psychedelic records.
Reverb is a very different member of the delay family, a simulation of the complex delays that are created in natural environments. We don’t hear separate echoes; instead the effect is, as the book calls it, a sort of “cloud” of sound. Our instincts interpret reverb as a sonic depiction of the size, shape, and texture of the environment in which the sound is generated. Types of reverb are determined by “pre-delay” (time between the direct sound and its first reflections—larger rooms have longer pre-delay times), early reflection, and time/length (reverb “tail”).
We get some advice on using reverbs in mixing in the next section of the reading. Reverb can easily be abused; with it, you can create cool ambience or turn your mix into a muddy mess. We are advised not to “tweak” our presets; it’s better to find a different preset that suits your needs. Short reverbs are generally preferable for more rhythmic sounds, while long reverbs work better for sustained sounds. It’s not a good idea to put the same reverb on every musical element, because they will blend into one another too much, unless perhaps you’re going for some sort of extreme quasi-Spector sound. Sending your sound to a stereo aux track with a different panning location from your direct sound can create a nice sense of space. A lot of creativity and aesthetic decision making goes into choosing how reverb and delay effects should interact on an audio track. You can do it “in parallel” (independently of one another) or “in serial” (the signal goes through one effect, then the next, so that the second effect alters the sound that was altered by the first), depending on what you’re after.
Reading Summary #5 (Chapters 2.6 to 2.9 – Dynamics p. 55 – 66; Chapter 6.2 – Mixing continued p. 186 – 188)
Our next reading focuses on dynamics processing. Dynamic range is the variation in volume between the loudest and softest sounds in an audio program. Compressors, the main type of dynamics processor, reduce the dynamic range by reducing the volume of the loudest sounds while leaving softer sounds unaffected. The three parameters of a compressor are: 1) the threshold, which is the dB level at which the compressor takes effect; 2) the ratio, which is how severely the louder sounds are reduced (expressed as, yes, a ratio of x:1…e.g., a 2:1 ratio means that a sound at 2 dB above the threshold will be reduced to 1 dB above it); and 3) makeup gain. (Since compression reduces the overall amplitude of the audio track or tracks, gain is used to bring the average level back up to an acceptable one.)
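The 2:1 ratio example can be worked through numerically: levels above the threshold are scaled down by the ratio, while levels below pass unchanged. A minimal sketch; the threshold and input values are illustrative, not from the book.

```python
def compress_db(level_db: float, threshold_db: float, ratio: float) -> float:
    """Apply a simple compressor curve to a level expressed in dB."""
    if level_db <= threshold_db:
        return level_db                      # below threshold: untouched
    # Above threshold: the overshoot is divided by the ratio.
    return threshold_db + (level_db - threshold_db) / ratio

# With a threshold of -10 dB and a 2:1 ratio, a signal 2 dB over the
# threshold comes out only 1 dB over it:
print(compress_db(-8.0, threshold_db=-10.0, ratio=2.0))   # → -9.0
print(compress_db(-12.0, threshold_db=-10.0, ratio=2.0))  # → -12.0 (below threshold)
```

Makeup gain would then be a constant added to the output to restore the average level lost to compression.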
A limiter is basically a stronger compressor with a ratio of 20:1 or above, but a brickwall limiter works differently, controlling both the threshold and the output ceiling (the loudest level the processor allows) so that the overall volume can be raised. These are used primarily in mastering.
More advanced compressor/limiter controls specifically affect the speed of “attack” and “release” (does the compressor take effect nearly instantaneously or with some delay? A too-quick attack or release can create unpleasant results) and “knee”, which is a ratio that adjusts with the sound level. (“Hard-knee” means the compression ratio is constant whenever sound exceeds the threshold, while “soft-knee” means the ratio increases with amplitude, which tends to give a more natural-sounding result.)
Multiband compressors are used to apply compression to particular frequency ranges and are helpful when you want to affect very specific sounds and leave others relatively untouched. The de-esser is a very specific type of frequency-conscious compressor that is used to reduce sibilance in the human voice. (As a singer, I have the misfortune of having an unusually sibilant voice, so engineers have had to spend a lot of time on the de-esser settings in past recording sessions, though I’ve also learned that singing slightly off-mic is helpful.)
Pages 186–188 offer some practical tips on how to use dynamics processors at the mixing stage. The two functions of compression, according to this section, are 1) to create a “natural” sound without clipping or noise/distortion problems, and 2) to create effects that are not natural at all, but sound really cool. Elements like vocals and bass can be made to “sit more comfortably” in a mix through compression; in a mix with many instruments, a more even dynamic range allows a more consistent balance between tracks without elements popping out at arbitrary moments. Steve also recommends “buss compression” (compression applied to an entire mix), which works as a “glue” to blend the tracks together, but this should be applied with great care.
Reading Summary #6 (Chapter 6.3 – Automation, Recall, collaboration p. 199 – 209)
One of the most revolutionary aspects of the shift from analog to digital recording is the expanded capabilities of automation and recall. While many high quality analog consoles afforded an engineer the ability to “record” fader moves (automation) so that they would “remember” (recall) to replicate them on repeated playbacks, the nature of computer files allows one to create as many subtle and complex real-time changes in volume, panning, and other parameters as your CPU power will allow.
Two types of automation are “online” and “off-line”. Online automation refers to changes made while your project is actually playing, and off-line automation is done using a graphic interface and “drawing” the changes you want to make. Online automation is generally employed with a hardware interface that mimics the physical action of an analog board. The DAW software has different settings such as “write”, “touch” or “latch” that affect how the hardware controller (or the mouse, but this is not necessarily the best way to employ your motor skills) affects the “writing” of parameter data in your mix.
In off-line automation, you select what parameter you wish to see and alter onscreen. The default view will be a straight line. When you want to, for instance, pan to one speaker or change the volume of a track at one point in the song, you create “breakpoints” on the line (similar to “keyframes” in video editing software) and move the line up or down according to your needs. One disadvantage of online automation becomes clear when you view your parameters in this way; the real-time changes you write create a very large number of breakpoints, and an excess of these may eventually eat up your available CPU power.
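The way a drawn automation lane reads values between breakpoints is straight-line (linear) interpolation between neighboring points. A minimal sketch; the breakpoint list below is an invented example, not from the book.

```python
def value_at(breakpoints: list[tuple[float, float]], t: float) -> float:
    """Read an automation lane at time t; breakpoints are (time, value)
    pairs sorted by time."""
    if t <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:]):
        if t0 <= t <= t1:
            # Linear interpolation along the segment between breakpoints.
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return breakpoints[-1][1]

# A fade from full level down to half level between 4 s and 8 s:
fade = [(0.0, 1.0), (4.0, 1.0), (8.0, 0.5)]
print(value_at(fade, 6.0))  # → 0.75 (halfway through the fade)
```

Three breakpoints describe this entire fade, which is why drawn (off-line) automation is so much lighter on the CPU than the dense breakpoint clusters written by real-time fader moves.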
With so many possibilities available to you, there is always the danger of overdoing things, or as the book aptly puts it, creating “complexity with little audible advantage.” It is easy to lose sight of the forest for the trees, and one must keep the result as a whole in view. Keep in mind that a simpler mix may fulfill your needs just as well, or maybe even better.
Automation and recall are often thought of as one thing, but it’s worthwhile to consider recall on its own. Recall is the ability to “keep” all the moves you have written. The computer’s capacity to retain everything that has been saved is what has taken much of the tedium out of mixing, and it is one of the big selling points for going digital.
In the past, when you mixed a multitrack recording to a stereo tape, that was that, and if you liked 95% of a mix but wanted to change one or two elements, you may well have had to start all over. (This happened to my old band at Soma Sync (the studio founded by author and instructor for this very class, Steve Savage), when Lorry Fleming of Alias Records asked our producer to redo an already-completed and very complicated mix we had just finished in order to bring the vocal up just a little bit higher; it was the only time I have ever seen the normally amiable and unflappable Greg Freeman show genuine anger.) In digital recording this is no trouble at all, as long as you have a saved session file. In fact, it is now the norm for collaborators to send their files back and forth to different locations in order to tweak elements a bit; we now have the freedom to exchange ideas and nitpick to our hearts’ content. While this flexibility is a boon to both artists and recordists, it does not eliminate the need for clear communication. Brother Ray’s wisdom may still be the last word on the ultimate goal of the recording process, but it’s crucial that everyone involved is on the same page when it comes to describing “what it sounds like, baby!”