Audio-Vision (Aesthetics Of Sound, April 2012)

Chion’s Audio-Vision Ch. 1-4 (Reviewed for BECA 435, Aesthetics Of Sound, at San Francisco State University, April 2012)

Michel Chion’s book Audio-Vision is a highly theoretical, provocative work that challenges you to stretch your imagination and take in many new concepts of how sound and vision combine in cinema. (Excerpts from the text were later used, with Michel Chion’s permission, in the Granite Countertops song, High Definition.)

We begin with a very poetic and entertaining foreword from Walter Murch that sets us up for what Chion has to tell us with an elaborate metaphor about King Sight and Queen Sound and how the latter is under-recognized despite her power.

In the first chapter, Chion introduces us to the concept of “added value”, which is defined as the “expressive and informative value” sound brings to an image that gives the impression that they are linked. In fact, Chion points out that added value works so well that the audience ends up believing that sound only played a supporting role, rather than being the crucial element that made the audiovisual impression so believable in the first place. The principle of “synchresis” (a portmanteau word combining “synchronism” with “synthesis”) is the creation of this “immediate and necessary” relationship between what one sees and what one hears.

Chion first looks at “value added by text”, saying that cinema is “vococentric”, i.e. it privileges the voice. The voice is isolated in the sound mix with other sounds functioning as accompaniment. The text structures vision, Chion argues, giving the example of a TV broadcast of an air show, where the narration describes what the viewer plainly sees on the screen, pointing out different things the narrator could have said that would have affected what we paid close attention to on the screen. A documentary called Letter From Siberia, where voiceovers expressing differing ideologies are dubbed over the same montage, is brought up with some criticism: essentially, the aim of the film was to show how political preconceptions inform how a neutral image might be viewed, but Chion argues that the added value of text is much broader than this application and, in fact, structures and frames all vision by defining it with words.

Value added by music is split into two categories: “empathetic music”, which directly expresses the emotions of a scene, and “anempathetic music”, which carries on, seemingly indifferent to the actions and feelings being depicted, yet intensifying these through the very contrast it provides, or in Chion’s colorful language, “inscribing it on a cosmic background.” The anempathetic role can be played by effects as well as by music.

We then look at how sound affects the perception of movement and speed, beginning with some fundamental differences in how the two senses are processed by the brain. Visual imagery can be in motion or frozen in time as a still, while sound is always in motion. (Try to imagine a single “frame” of sound. It’s difficult, if not impossible; sound simply does not work this way.) Very few sounds are perfectly continuous with no variation over time. By the same principle, it is much easier to reverse a visual sequence and pass it off as reality, even if it seems to violate the laws of physics; a reversed audio track will almost always sound backwards, due to its envelope (attack, decay, sustain, release: a sequence that exists in time and determines much of the character of every sound we hear). Or, as Chion phrases it, “sounds are vectorized.”
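The envelope argument can be made concrete with a small sketch (my own illustration, not from the book): a piecewise-linear ADSR envelope peaks early in forward playback, but when the signal is reversed the peak arrives late and the sound ends with an abrupt cutoff, a time-asymmetry the ear picks up immediately.

```python
# Minimal illustration (not from Chion) of why reversed audio "sounds
# backwards": reversing a signal also reverses its ADSR envelope.

def adsr_envelope(n, attack=0.1, decay=0.2, sustain_level=0.6, release=0.3):
    """Piecewise-linear attack/decay/sustain/release amplitude envelope over n samples."""
    a, d, r = int(n * attack), int(n * decay), int(n * release)
    s = n - a - d - r  # remaining samples hold the sustain level
    env = [i / a for i in range(a)]                             # attack: 0 -> 1
    env += [1 - (1 - sustain_level) * i / d for i in range(d)]  # decay: 1 -> sustain
    env += [sustain_level] * s                                  # sustain: flat
    env += [sustain_level * (1 - i / r) for i in range(r)]      # release: sustain -> 0
    return env

forward = adsr_envelope(1000)
backward = forward[::-1]

# Forward playback peaks early (at the end of the attack); the reversed
# version peaks late and then cuts off almost instantly.
print(forward.index(max(forward)), backward.index(max(backward)))  # 100 899
```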

In Chapter 2, we are introduced to “the three listening modes”: causal listening, which is listening to a sound to gather information about its source (“What is that? Where is it coming from?”); semantic listening, which uses a code, especially language, to interpret a message (“What are they saying?”); and reduced listening, the mode that focuses on the qualities of the sound itself rather than its cause or meaning. Reduced listening is actually a greater challenge than it may seem. How does one describe a sound without any reference to its cause, meaning, or effect? An objective description of a sound is difficult, but the effort can sharpen our senses and teach us a lot. I would add that aspiring sound designers and Foley artists in particular can benefit from the willful divorcing of sounds from their sources and from asking themselves “what ELSE could this be?”

Another concept in this chapter is acousmatic listening, which occurs when one hears sounds without seeing their sources. Contrary to what one may think, listening in an acousmatic situation does not aid in reduced listening; instead, the listener feels compelled to guess the source of the sound all the more. Acousmatic sound intensifies causal listening, and it takes repeated hearings before the listener can attend to the traits of the sound itself and stop imagining its source.

The third chapter begins with a thought-exercise in which we are invited to compare the relationship between audio and visual stimuli with the music theory concepts of harmony (notes stacked together in the same time space to combine into chords) and counterpoint (independent lines occurring simultaneously and harmonizing in differing rhythmic patterns). Neither seems to fully apply in Chion’s view, but this doesn’t stop him from trying.

The point is to talk about the “lines” in video and audio and how they run parallel to one another, intersect, diverge, and comment on each other. Chion asserts “there is no soundtrack”, meaning to say that 1) the sounds of a movie by themselves are not a complete, coherent entity in and of themselves, and 2) audio elements are in a “simultaneous vertical relationship” with the visuals on the screen. Offscreen sounds are only perceived as such in the company of an image; without the visual element, sounds can’t be classified as source-connected or source-disconnected. (For the most part, I agree; however, I find it amusingly ironic that the cover image of this book is a still from David Lynch’s Eraserhead, whose soundtrack I have in my own CD collection and consider to be an evocatively eerie work of musique concrète in its own right.)

Much of the third chapter deals specifically with nondiegetic music. Chion talks extensively about an early John Ford film, The Informer, whose score was particularly literal, responding to every action onscreen in an almost operatic manner. Defending the technique from critics who find it so dated it’s amusing, Chion points out many of the more subtle elements of this score, and presents it as a good example, however extreme, of leitmotif.

Elsewhere, we are presented with what is by now a familiar notion in this class: how sound effects are used to make the film not just lifelike, but more vivid than real life itself. Focusing on how an actual punch in the jaw makes far less noise than the Foley-enhanced punches in the movies, we are reintroduced to the central concept of synchresis, the joining in our minds of what we hear and what we see. It’s the fact that our brains make these connections that makes sound design so effective in the first place.

Chapter 4 takes us further into conceptual territory, talking about how the frame is the container that holds all visual images, but there is no equivalent “enclosure” for sounds. It is the image that “magnetizes” the sounds; we perceive the sound as coming from the character or other “source” depicted on the screen. When we are placing sounds in the stereo or surround-sound field, we must keep this phenomenon in mind. Chion: “Today’s multitrack mixes very often strike a compromise between psychological localization and real localization.”

Bringing back the concept of the “acousmatic” (sounds divorced from visual sources), it is natural to ask: what is the opposite of acousmatic sound? Chion rejects the word “direct” in favor of “visualized sound” and proposes the idea of looking at the main three categories of film sound (the “visualized zone” of “onscreen” or source-connected diegetic sound, and the “acousmatic zones” of “offscreen” or source-disconnected diegetic sound and nondiegetic sound) as a circle. He further divides the onscreen and offscreen portions of the pie chart into “external” and “internal” sound, and the internal into “objective internal” (breathing, heartbeat, etc.) and “subjective internal” (interior monologues). More distinctions are provided, such as “pit music” (incidental music that is not part of the action of the film world, analogous to an accompaniment performed by a pit orchestra below the “stage”) and “screen music” (music with a source in the movie; for example, a musician plays an instrument, someone drops a needle on a vinyl record, etc.). The chapter draws to a close with Chion contemplating directionality in surround sound.
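Chion’s circle of zones can be summarized as a small decision procedure. This sketch is my own schematic of the taxonomy just described; the category labels follow the chapter, but the function and its arguments are mine:

```python
# A schematic of Chion's film-sound zones from Chapter 4 (the labels follow
# the book; the function itself is my own illustrative framing).

def chion_zone(diegetic, onscreen=False, internal=None):
    """Classify a film sound.

    diegetic: does the sound belong to the film's story world?
    onscreen: is its source visible in the frame ("visualized")?
    internal: None, "objective" (heartbeat, breathing), or "subjective"
              (interior monologue) for sounds inside a character.
    """
    if not diegetic:
        return "nondiegetic (acousmatic), e.g. pit music"
    place = "onscreen" if onscreen else "offscreen"
    if internal == "objective":
        return place + " diegetic, internal objective (heartbeat, breathing)"
    if internal == "subjective":
        return place + " diegetic, internal subjective (interior monologue)"
    if onscreen:
        return "onscreen diegetic, external (visualized), e.g. screen music"
    return "offscreen diegetic, external (acousmatic)"

print(chion_zone(diegetic=False))                        # the "pit music" zone
print(chion_zone(diegetic=True, onscreen=True))          # the "screen music" zone
print(chion_zone(diegetic=True, internal="subjective"))  # interior monologue
```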

This is certainly not a simple book. It will take many more readings to tease out all the ideas Chion is playing with here, but it’s an intriguing ride.

Chion, M. (1990). Audio-vision: Sound on screen. (pp. vii-94). New York: Columbia University Press.