TUTORIAL for the HANDBOOK FOR ACOUSTIC ECOLOGY


VOICE AND TEXT-BASED COMPOSITION


A Composition Class


At this point in the Tutorial, if you have been following the suggested Personal Studio Experiments (which are all listed here), you will have recorded sounds, derived sound objects from those recordings, processed your source sounds in a variety of ways, and assembled some mixes, either as exercises or as a self-contained piece, a sound object study.

This approach to sound design (i.e. starting with minimal material and maximizing variants of it) is a tried-and-true way to learn processing techniques, but it’s likely that you don’t feel you have mastered them all – that simply takes more time. However, we hope that you’ve discovered a lot about sound along the way, and developed your analytical listening abilities as well as your audio design techniques.

The best way to proceed now is to design a larger scale composition. It still may be of the “evolving idea” type of process – a bottom-up approach – but it can also be a more top-down process where you have an idea to begin with, and flesh it out as you progress. Hopefully, you’ll have a better sense of what techniques will contribute to the idea, and what possibilities can be explored.

You can obviously continue with the sound object approach, and essentially create an acousmatic style piece where sounds are organized for their own qualities in a manner that ranges from abstracted to something much more abstract. However, in these last two modules, we are going to concentrate on two fascinating genres: voice and text-based projects in this module, and soundscape composition in the next.

In each section of this module, we will include a box with questions and issues that can apply to any compositional project, but supplement them with practical examples regarding an approach based on voice and text.

Compositional genres based on vocal material are numerous and highly varied, such as documentary, oral history, radiophonic style drama, narrative, poetry, text-sound, and abstracted uses of vocal material treated as sound objects, among others that blend and blur the boundaries between these genres – perhaps the most interesting of all.

A) Common starting points: Idea or material?

B) The role of processing

C) Structuring a work

D) Maximizing the compositional process



A. Common starting points: Idea or material?

Which comes first: the Idea or the Source Material? Top-down or bottom-up? Will both start to dialogue with each other?
If the Idea dominates, is it 'composing with sound' and if the Source Material dominates, is it 'composing through sound'? How will this dialogue between them evolve over time?
The Idea: is it ...
(a) practical (length, resources, time)
(b) can it be made aurally interesting?
(c) too conceptual (i.e. the listener needs to be told what it's about!)
(d) could audio processing enhance the content? (i.e. could the listener 'experience' the idea?)
(e) is there a danger of 'forcing' materials into a preconceived plan, instead of letting available materials influence it?
The Source Material: is it ...
(a) good quality (or could it be improved?)
(b) aurally evocative
(c) what does your contextual knowledge contribute to its meaning?
(d) what do the sounds suggest to your imagination, memory, mood or emotions?
(e) what types of processing will make it more aurally appealing and reflective of its meaning?


With voice and text, the first question about the material is its actual source. As Cathy Lane (Organised Sound, 11(1), 2006) suggests, it can be scripted, derived from everyday conversations, events and interviews, or taken from pre-recorded archives and other sources. In the article, she goes on to discuss a catalogue of approaches from a wide range of compositions, and her list provides many works that would be good to listen to.

But in aural terms, the main variable is still which voice is being recorded. The quality, expressiveness and appropriateness of a voice are what really matter. There are many changes you can make with studio processing, such as editing out material, boosting the presence band (2-3 kHz) to add clarity, or transposing the pitch and/or speed to make the person sound older or younger (but keep in mind that in most cases this will only seem natural within a limit of plus or minus 5%, and maybe 10% if you’re lucky). But all of the subtleties of expression and paralanguage are basically going to remain the same.
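To put those percentages in musical terms, here is a minimal sketch in Python. It assumes a tape-style transposition where pitch and speed are coupled; the function name is ours and purely illustrative.

```python
import math

def speed_change_to_semitones(percent):
    """Pitch shift in semitones produced by a tape-style speed change.

    Assumes pitch and speed are coupled, so a +5% speed change
    multiplies every frequency by 1.05.
    """
    ratio = 1.0 + percent / 100.0
    return 12.0 * math.log2(ratio)

# Compare the "natural" limits mentioned in the text.
for pct in (-10, -5, 5, 10):
    print(f"{pct:+d}% speed -> {speed_change_to_semitones(pct):+.2f} semitones")
```

Under this assumption, the 5% limit corresponds to a shift of a little less than a semitone, and the 10% limit to less than a whole tone.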

With scripted material (written by yourself or someone else) the first option you should always consider is to record yourself. Most people are hesitant to do that, and don’t like the way they sound (as discussed in the Speech Acoustics module, hearing yourself without the bone conduction that is normally present can be disconcerting). But at least you don’t have to book an appointment.

But, in many cases, it is good to find a person whose voice projects the kind of image you’re looking for. Below, we’ll show examples of both those with a lot of experience with voice recording, and those with less, but in each case, the result was excellent.

At the technical level, there are several aspects to keep in mind. First of all, find a quiet space, preferably a studio. A person’s home, if acoustically suitable, may make the person feel more comfortable and even add a typical indoor ambience. However, if it seems appropriate, a quiet outdoor space can work too, and you’ll avoid low frequency room resonances (i.e. eigentones).

Secondly, keep the microphone at the same distance for all takes. This avoids obvious shifts in level and makes later editing easier. If the mic has no screen, then place it at a 30°-40° angle away from straight ahead to avoid plosive consonants sending air directly into the mic.

I prefer to monitor the recording on headphones with my back to the person, as it minimizes any distraction I might cause them, and allows me to focus on exactly what is being recorded. Recording levels should always allow 10 dB or more headroom, to accommodate sudden emphases, laughter or other emotional peaks. Always do a level check before you start.
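As a rough illustration of a level check, the following Python sketch measures how much headroom a test take leaves below digital full scale. It assumes floating-point samples normalized to the range -1 to 1; the function name and the sample values are hypothetical.

```python
import math

def headroom_db(samples):
    """Headroom below digital full scale (1.0), in dB, for float samples in [-1, 1]."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return float("inf")  # silence: nothing to measure against full scale
    return -20.0 * math.log10(peak)

# Hypothetical peak values captured during a level check.
level_check = [0.02, -0.15, 0.25, -0.1]
print(f"{headroom_db(level_check):.1f} dB of headroom")
```

Here the loudest sample is 0.25, which leaves about 12 dB of headroom, enough to satisfy the 10 dB guideline.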

Keep the recording going even when “mistakes” have been made. Lots of stopping and starting again may put more pressure on the person, and it’s easy to edit out your own comments or instructions. Have the person pick up the previous phrase if it’s scripted (not just the word in question) in order for inflection patterns to be consistent. If the person is improvising, just let the recording continue – you never know what “gem” you may be getting. Any extra time between the vocal parts will provide you with room tone which may be useful for editing later.

Unless the person is very experienced, the biggest problems may be (a) clarity of enunciation (most people slur their words), and (b) speaking too fast, the two problems often being related. Unless rapid speech is somehow desirable for the text, try to get the person to slow down (you can’t alter this significantly afterwards without distorting the phonemes, as mentioned above). Also, your eventual listener can’t lipread the speaker, and you may be processing or mixing the voice with other sounds, so if you don’t start with a clearly understood recording, it will only get worse.

The Blind Man. Our first example is one where the author of the poem I wanted to use, Norbert Ruebsaat, was definitely the best person to read it, and he was comfortable with recording. His sense of tempo and spacing, clarity of expression, and rather understated emotional quality always seemed to draw the listener in. Not all poets and writers are the best readers of their work, but when they are, the result is usually very satisfying.

Since the poem is fairly short, Norbert was able to read it all in a single take, which could be repeated until he was satisfied, which as I recall was the third take. Then I asked him to improvise on the text, particularly because he was very sensitive to the sound of the words, and could easily vary and juxtapose them in novel ways that would have been awkward to separate out or edit later.

The improv went on for several minutes, and we’ll hear an excerpt of it to give you the idea. In the final mix, the complete text, divided into the five sections that he indicated with his phrasing, was used without processing, and bits of the improv were the sole source of all other transformations that will be shown below.

Excerpt of a reading of The Blind Man, by author Norbert Ruebsaat
Excerpt of an improvisation by the author based on his poem

Patterns. Another project involved a poem by Amy Lowell (1874-1925) titled Patterns, originally intended for a live performance by Diana McIntosh who commissioned the piece. The poem, now regarded as proto-feminist, tells the story of an 18th-century aristocratic lady whose fiancé, a lord, is killed in battle before they can be married.

The lady’s inner thoughts are to rebel against the “patterns” to which she is confined, starting with her tight-fitting gown, and the structured layout of the garden. But she also imagines a seduction scene with her lover, so there are many high emotional elements, leading up to her final outburst about the “pattern called a war” that has deprived her of happiness.

When the time came to make a stand-alone mix of the piece, I decided to record the poem with Elizabeth Carefoot, a local acquaintance whose lower-pitched articulate voice I admired. Admittedly, the role could have been for a younger woman, but I thought it would also work with someone more mature.

Elizabeth claimed not to have any acting experience, but she was comfortable being recorded, and so we spent a noon-hour when she was free trying it out. The poem, as you can see here, is quite lengthy, and a lot of the time was spent getting a clean recording and working on the high emotional moments.

However, there is also a low-key moment after the messenger delivers the news that shatters her. He asks “Any answer, madam?” and she responds negatively, but then, realizing that she needs to be a considerate hostess, she asks someone to “see that the messenger gets some refreshment”. The line seems simple, but getting the right nuance was tricky, as you can hear from the original studio recording. Elizabeth tried it various ways over four takes, and in the background you can hear me giving some suggestions, the final one being to seem to be holding back tears. Only later, with time for reflection, can one decide which take is best. The complete work can be heard on sonus.ca, or the CSR-CD 0102, Twin Souls.

Multiple takes of a line from Amy Lowell's Patterns, as recorded by Elizabeth Carefoot

Wings of Fire. Another work combining a text on a soundtrack with a live performer, in this case a female cellist, is based on a poem by Joy Kirstin called Wings of Fire. I turned to another friend whose lower-pitched voice seemed suitable for a cello piece, Ellie Epp, a filmmaker whose visual work is strongly influenced by sound.

Being an artist in her own right, Ellie had some good ideas about the poem and how it should be recorded – in her home, close-miked, all in one take in an intimate manner she described as “pillow talk”. I agreed and we did some run-throughs which I thought were quite successful. Then she agreed to “polish” a few lines that were tricky, and then did a final complete take which was the one used.

The beauty of her reading is the contrast between the emotional intensity of the poem and her quiet voice at close range. We get so used to the proximity of miked sound that it’s easy to forget, when we hear it on speakers, that this is a paradoxical situation (intimate but larger-than-life). But this approach is very dramatic. The entire piece can be heard on the same website and CD mentioned above, and some examples of processing in the piece will be presented below.

Excerpt of a reading of Joy Kirstin's Wings of Fire by Ellie Epp

Sometimes, with a bit of luck and persistence, one can find an archival recording of a voice that is inspiring, and not just the famous ones. A student discovered a good recording of British poet Stevie Smith reading her rather famous poem “Not Waving But Drowning” with her inimitable voice, along with an interview in which she speaks about it. Such material can go a long way to being effective when surrounded by some appropriate other sounds that elaborate on the meaning of the poem, which turns out to have a more intimate side than just the story of the unfortunate man who drowned.

Found sound. If you’re a persistent recordist, you may catch random items including voice fragments that are suggestive. Of course, the uncontrolled acoustic conditions and the lack of close miking are a challenge, but they can be overcome. Katharine Norman has a piece called “Anything from the minibar?” which a hotel employee asked her on checking out, and which Katharine turned into something much less banal. I’d also recommend her pieces (London, Three sound pieces) based on her mother’s reminiscences of the war years in London, and People Underground, recorded in the foot tunnel under the Thames at Greenwich, as well as her innovative writing and multi-media projects.

One of the members of the WSP, Peter Huse, being a writer himself, took an interest in all aspects of language, so when he and Bruce Davis were on the cross-Canada recording tour in 1973, they made it a practice to have the Nagra tape recorder running every time they rolled down the car window and asked for directions.

The result, called Directions, is a complex montage of dozens of snippets of language from coast to coast, and because of the banality of the answers (though often amusing), our attention is drawn to the paralanguage of how things are said, and how it differs with regional and immigrant accents.

This radiophonic piece is entirely made with untransformed original clips, but some overlap in mixing them was used, perhaps inspired by Glenn Gould’s Idea of North which had appeared a few years earlier and was famous for the initial 3-minute introduction where voices overlap in what Gould called “contrapuntal radio”. Here is the section Peter devoted to Newfoundland.

The Newfoundland section of Peter Huse's Directions (1974) from Soundscapes of Canada



B. The role of processing. In the first section, we asked a couple of questions about processing that are worth repeating here.
Processing vocal material

Can audio processing enhance the content? (i.e. could the listener 'experience' the idea?)

What types of processing will make the material more aurally appealing and reflective of its meaning?

Transforming a sound is both a challenge and a pleasure in the studio, particularly if it is guided by a connection to the semantic, symbolic or emotional meaning involved with the vocal material. The first of the above questions suggests a motivation of letting the listener experience something in the context that’s been created or referred to. That is, we’re looking for a way to communicate other than with words (or paradoxically, by the way the words are processed, not the meaning of the words themselves).

The second question suggests that, as fascinating as vocal material can be, a mix can be more aurally appealing if it includes non-verbal elements. This is the big problem with the “talking heads” style of documentary or podcast, which is basically a lecture, maybe supplemented by interviews, or media clips. Of course, some topics have very little to do with sound, so those cases will be a challenge.

Of course, one can use several voices, and if their timbres and paralanguage are different, they can be edited or mixed smoothly for aural contrast. Once self-identified, each speaker can come to represent their own perspective or role without a narrator, and interact fluidly.

However, with more abstract ideas, an aural component other than voice may be difficult to find – but don’t fall back on background music! That merely imposes an emotional underlay that basically tells the listener how they should feel, which when you think of it, is quite manipulative.

Besides adding appropriate environmental sounds, we can also use transformation techniques on vocal material to add a counterpoint to the text.

The Blind Man. This work was created during the “high analog” period of the 1970s in the Bourges studio, as documented here. The speech transformations realized there are examples of classical sound object processing, such as filtering, transposition and gating.

In the first example, all of the words in Norbert's improv with sibilants were edited together into a long sequence, including the short excerpt shown next. In order to maximize the extraction of just the sibilants, their frequency band (approx. 5-8 kHz) was boosted with an EQ before passing it through a bandpass filter set to the same range, in other words a parametric style of processing.

Because this band was to be transposed downwards by 1, 2 and 3 octaves, doubling in length each time, starting from this overly bright version helped it keep its spectral loudness through the transpositions, and it was EQ’d again each time. Because the “ss” sibilants are quite narrow noise bands in the top end of the overall spectrum, it is very easy to isolate them with a bandpass filter.
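The arithmetic of these octave transpositions can be sketched in Python. This is only a digital stand-in for the analog speed change used in the studio: it assumes a simple linear-interpolation resampler, and the function names are ours.

```python
def band_after_octaves_down(low_hz, high_hz, octaves):
    """Frequency band after transposing down by whole octaves (each halves the frequencies)."""
    factor = 2.0 ** octaves
    return low_hz / factor, high_hz / factor

def varispeed(samples, ratio):
    """Resample by linear interpolation, like changing tape speed.

    ratio 0.5 plays at half speed: one octave down and twice as long.
    """
    n_out = int(len(samples) / ratio)
    out = []
    for i in range(n_out):
        pos = i * ratio            # fractional read position in the source
        j = int(pos)
        frac = pos - j
        a = samples[j]
        b = samples[j + 1] if j + 1 < len(samples) else samples[j]
        out.append(a + frac * (b - a))
    return out

# The approximate sibilant band from the text, followed through three octaves.
for n in range(4):
    print(n, "octave(s) down:", band_after_octaves_down(5000.0, 8000.0, n))
```

After three octaves the 5-8 kHz band sits at 625-1000 Hz and the sequence is eight times its original length, which is why a high frequency boost at each stage helps to keep the result bright.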

And in a classic moment of discovery, the transpositions revealed a hidden acoustic element, a whistled phoneme in what was originally the phrase “whistles and scents”. After the first octave down, this element becomes very clear (after the fourth sound), and overall the aural impression of the transformation is to turn the sibilants into the sound of wind, one of the main images in the poem.

Excerpt of edited sibilants text from The Blind Man

Equalized and bandpass filtered sibilants

Sibilants transposed down 1 octave

Sibilants transposed down 2 octaves

Sibilants transposed down 3 octaves



An important element in the poem is the presence of many percussive words with hard consonants (e.g. pick, tap, cane, catcalls). Two loops were created with a montage of these words taken from the improvisation, similar to the sibilant montage. One loop (“touch, tap”) was placed on the left channel, and the other (“catcalls, tapping his cane”) on the right channel. A very fast analog gate was applied to each channel, with a sharp attack and release, and the threshold level set to isolate just the stronger consonants. A high frequency boost was also added because of the downward transpositions in order to keep the consonants bright.
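The gating step can be approximated digitally. The following Python sketch assumes a simple peak follower with an instant attack and a fast exponential release; the threshold and release values are illustrative and do not document the settings of the original analog gate.

```python
def gate(samples, threshold, release=0.5):
    """Pass samples only while a fast peak envelope is at or above the threshold.

    The envelope jumps up instantly (sharp attack) and decays by the
    `release` factor on each sample (sharp release).
    """
    env = 0.0
    out = []
    for x in samples:
        level = abs(x)
        env = level if level > env else env * release
        out.append(x if env >= threshold else 0.0)
    return out

# A weak onset, a strong consonant burst, then a quiet tail.
burst = [0.05, 0.8, 0.3, 0.05, 0.02]
print(gate(burst, threshold=0.25))
```

Only the strong burst and the sample immediately after it survive; the weaker material on either side is silenced, which keeps the envelopes sharp.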

Then, like the sibilants, the first gated version was transposed down 1 and 2 octaves, with gating applied again to keep the envelopes sharp. In the example, we just show the first section of the resulting sequences. Notice the seemingly random rhythm which actually derives from the text elements. In the final mix, the 2 octave downward version was combined with the original text such that some rhythmic relations between phonemes could be fleetingly heard, the image being of the wind knocking things around.

Left-channel loop "touch, tap" from The Blind Man

Right-channel loop "catcalls, tapping"

Gated EQ'd consonants  (excerpt)

Gated consonants transposed down 1 octave

Gated consonants transposed down 2 octaves



In Song of Songs, each movement (Morning, Afternoon, Evening, Night & Daybreak) has a characteristic environmental ambience embedded in it. In the Microsound module, I demonstrated some simple time stretching and harmonization to enhance the voices, and one example in another module demonstrated the use of comb filters to alter the voices.

Here is an example from the emotional peak of the work in the Evening section, where a crackling fire is the main ambience. The text (from the Song of Solomon in the Bible) culminates with the line “I am my beloved’s and his desire is towards me” and so the emotional, even erotic, level reaches a peak. After we hear both the male and female speakers reading the text several times, the phrase “and his desire” is granulated and the final phoneme of “desire” is harmonized with three lower harmonics (3, 2 and 1 where normal pitch is 4) and stretched up to 200 times, to create the desired peak and a rich bandwidth for the resulting texture, even as it gradually ebbs.
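The harmonic numbers translate directly into transposition ratios. Here is a small Python sketch of that arithmetic; the function names are ours and the code is not taken from the software used in the piece.

```python
import math
from fractions import Fraction

def harmonization_ratios(reference=4, harmonics=(3, 2, 1)):
    """Frequency ratios of the added voices when normal pitch is harmonic `reference`."""
    return [Fraction(h, reference) for h in harmonics]

def ratio_to_semitones(ratio):
    """Equal-tempered size of a frequency ratio, in semitones."""
    return 12.0 * math.log2(float(ratio))

for r in harmonization_ratios():
    print(f"ratio {r} -> {ratio_to_semitones(r):+.2f} semitones")
```

The three added voices therefore lie a perfect fourth (ratio 3/4), an octave (1/2) and two octaves (1/4) below the original pitch.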

Stretched and harmonized text from the Evening movement of Song of Songs
CSR-CD 9401 Song of Songs



Wings of Fire. The poem includes many striking images that suggest appropriate transformations of the text. Many of these are guided by the basic dramatic concept for the piece, which is that the female cellist is addressing the poem’s love imagery to her instrument as her “lover”. This illusion is supported by using the pitches of the open strings on the cello (C, G, D, A) as the resonating pitches with the Karplus-Strong resonator, which in fact is based on a physical model of the string itself. As shown here, the resonator can be used as a signal processor such that any sound can be resonated through the instrument.
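To make the idea of a resonator as a signal processor more concrete, here is a minimal Python sketch of a Karplus-Strong style feedback delay line. It is an illustration under stated assumptions, not the implementation used in the piece: the open-string frequencies assume standard equal-tempered tuning (A = 220 Hz for the top string), the sample rate is 44.1 kHz, and the lowpass in the loop is a simple two-point average.

```python
# Assumed open-string frequencies of the cello in Hz (standard tuning).
CELLO_OPEN_STRINGS = {"C": 65.41, "G": 98.00, "D": 146.83, "A": 220.00}

def delay_samples(freq_hz, sample_rate=44100):
    """Delay-line length whose round trip matches one period of freq_hz."""
    return max(2, round(sample_rate / freq_hz))

def resonate(samples, freq_hz, sample_rate=44100, feedback=0.95):
    """Feed any input through a delay line with lowpass-filtered feedback."""
    delay = delay_samples(freq_hz, sample_rate)
    line = [0.0] * delay
    out = []
    idx = 0
    prev = 0.0
    for x in samples:
        delayed = line[idx]
        filtered = 0.5 * (delayed + prev)   # two-point average: gentle lowpass
        prev = delayed
        y = x + feedback * filtered         # input plus recirculated signal
        line[idx] = y
        idx = (idx + 1) % delay
        out.append(y)
    return out

for name, freq in CELLO_OPEN_STRINGS.items():
    print(name, delay_samples(freq), "samples")
```

Raising the feedback value towards 1 makes the resonance ring longer, while lowering the input level lets the resonated pitch dominate over the source.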

The first example is a relatively simple granulation of the text “We cross continents together….”, with an added harmonization at the interval of the 4th below, suggesting two voices intertwined. However, the last syllable of the word “touch” is time stretched, and the effect of stretching the sibilant is to create a kind of bird cry.

Stretched and harmonized text from Wings of Fire
CSR-CD 0102 Twin Souls



The second example, with a highly emotional line from the poem, starting with “Your tongue speaks languages …”, illustrates the effect of the strong resonances described above, pitched on the two lowest strings of the cello, C and G. The voice dominates the first time through, then the input level is lowered and the feedback slowly raised until the resonances dominate and ring at the end. In the performed version of the piece, this processed text is followed by the cellist playing the same open strings in a chord.

Notice how the resonances interact with the pitch inflections of the speaker, Ellie Epp. Her melodic phrasing is quite simple, based generally on pitches C, D and E, but very musically phrased, all of which is brought out by the resonators tuned to the same harmonic relationship.

Resonated text from Wings of Fire


Androgyne, Mon Amour. This work incorporates a setting of six poems by Tennessee Williams from his book of the same title, as read by Douglas Huffman. The first poem "You and I" is divided into two complementary sections for the start and end of the piece.

The poems are intensely lyrical, intimate and erotic in a celebration of gay love that is acted out, both musically and dramatically, by the live male performer interacting in a variety of conventional and unconventional ways with the double bass which is personified as his lover – the gender being switched from Wings of Fire.

Both the vocal part and various sound material from the bass are digitally processed through resonators that, like those in Wings of Fire, model the characteristics of the open strings of the instrument, thereby linking them sonically and musically, as if each is speaking through the other.

In terms of the text, it is worth noting that not all of the chosen poems were used in their entirety, but rather were edited into a more compact form. This is a typical problem when using existing texts – there is simply too much material there and it will tend to dominate your work as you try to “get through it all” – whereas with a sound piece, the time is better spent elaborating the accompanying material which sustains aural interest on its own.

We start first with two examples of granulation only, the first being a very brief phrase “shaken up and scattered on the floor”, a definite invitation for granulating the text with very short grains. In fact, there are nearly 40 examples in the piece of individual phrases being processed, making it the most tailored of all the pieces being presented here (a full documentation is on the WSP Database in the HTML section).

The second example is from the “Wolf’s hour” section where that phrase, after some repetitions, is maximally stretched to create the dark and sinister atmosphere of 3 am when one has been left alone. However, notice that after the long stretch of the word “wolf”, the consonant beginning the word “hour” is re-attacked by briefly returning to the original speed before the next stretch begins. The sound is also harmonized at the interval of a fourth lower.
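For readers who want to experiment, granular time stretching can be sketched in a few lines of Python. This is a generic overlap-add approach and not the software used for these pieces: the grain and hop sizes are assumed values, and a periodic Hann window is chosen so that overlapping grains sum to a constant level.

```python
import math

def granular_stretch(samples, stretch, grain=440, hop=110):
    """Time-stretch by overlap-adding windowed grains without changing pitch.

    Grains are written every `hop` samples in the output, but read from the
    input at a position that advances `stretch` times more slowly.
    Assumes `grain` is a multiple of `hop`.
    """
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / grain) for n in range(grain)]
    scale = 2.0 * hop / grain              # overlapping Hann windows sum to grain / (2 * hop)
    out_len = int(len(samples) * stretch)
    out = [0.0] * (out_len + grain)
    for start_out in range(0, out_len, hop):
        start_in = int(start_out / stretch)
        for n in range(grain):
            src = start_in + n
            if src < len(samples):
                out[start_out + n] += scale * window[n] * samples[src]
    return out[:out_len]

# Stretch a short test tone to twice its length.
tone = [math.sin(2.0 * math.pi * 220.0 * n / 44100.0) for n in range(4410)]
stretched = granular_stretch(tone, 2.0)
print(len(tone), "->", len(stretched), "samples")
```

With very large stretch factors, successive grains read from almost the same spot in the source, which is what produces a sustained texture from a single phoneme.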

Granulation of a short text from Androgyne, Mon Amour
CSR-CD 0102 Twin Souls

Extended time-stretching of "Wolf's hour"


Spectrogram of "Wolf's hour"

Next we have three examples using the resonators described above to modify the text. First is the title “Androgyne, mon amour” where the first word has a 2.5 times stretch, followed by a 20:1 stretch on the last two words, where the resonances dominate, and this version concludes the work.

The other two examples use the resonators to add a harmonic accompaniment to a very lyrical stanza (from the Winter Smoke poem), starting with “Scent of thyme is cool and tender” and ending with “music to remember”. The resonators on the left and right channel are independent, and real-time changes in the length of the delay line were made for each phrase on the right channel resonator. This processed version is mixed with the phrase “girls are music” resonated similarly, first with a 2:1 stretch, then amplitude correlated for the rest of the phrase, similar to the auto-stretch control option in MacPod.

Title text resonated and stretched in Androgyne, Mon Amour


Double resonator on Winter Smoke text

Stretched version


The overall intent and effect is to integrate the lyrical aspects of the text with the double bass, the performer also acting out the text by adopting different costume changes and unusual playing positions with the instrument. A stand-alone video version which incorporates a dancer interacting with the text and the bassist is available here.



C. Structuring a work.

Some common structural issues:
(a) what guides the time flow? a storyline or soundscape, a 'journey', a text, a series of images, moods, energy levels?
(b) where does it start and where will it end?
(c) what are the foreground and background elements, and are they balanced?
(d) are there natural sections and punctuating moments? do the sections contrast each other?
(e) do any elements build tension, reach a climax, and eventually resolve to a cadence?
(f) do all elements create a unity in the mix, even if some appear to be outliers?
(g) is the pacing of events justified by the level of perceptual interest in the materials?
(h) is the mix linear (i.e. one sound at a time) or is there effective use of counterpoint, i.e. multiple layers?
Technical issues:
(a) mixing levels: are all elements balanced for audibility and presence? is there sufficient dynamic contrast?
(b) spectral balance: are all frequency ranges used? (they don't need to be, but it's a resource for adding layers/tension)
(c) timing issues: does the pacing maintain interest and is there a satisfying trajectory to the piece?

The time flow. Structuring the time flow is much easier when there’s a text or story involved, as it is in the next module when there’s a specific place or a soundwalk type of journey in a soundscape composition. The beginning and end may be known from the start, or else the end may evolve as you work on the project. You may even get a title out of this material, which many composers will tell you is often the hardest part of the process, so if it’s a “given” from the start, count yourself lucky.

This structural issue is much more difficult to resolve in an acousmatic style piece where it’s the sounds themselves, and only them, that determine the overall flow and structure – in which case it might be wise to invoke a soundscape model. Some would argue there’s always a narrative even with abstract sounds, but it’s less likely the average listener will rely on that.

However, you’ve already been warned about “too much text” as it will weigh down the structure of your piece. If nothing else, working with text should make you a more critical editor, as in “do I really need this?”. Of course, if you’re working with the author, you may need to be diplomatic on this point! Keep in mind that your other sounds should be able to elaborate on even a simple text, so you don’t need more words.

Although this is not directly a structural issue, there is also a warning needed about the text, once you’ve used lots of transformations: does it remain intelligible? The trap is that by now, you’ve heard it so many times (and maybe are getting tired of doing so) that you essentially “pre-understand” it, that is, you know what you’re going to hear next before it occurs. One solution is to try listening to a provisional mix with a fresh listener.

Another strategy is to combine more intelligible, untransformed versions of the text with the transformed ones, being careful to adjust levels so the text remains clear. Repetition of key words and phrases is often desirable, particularly if you’re using poetic language, and not colloquial speech where there’s a lot of redundancy. If the vocal material is receding a bit into the ambience, try boosting the presence band (2-3 kHz), rather than just raising its level (and reduce any reverb level if that’s contributing to the problem).

Another structural strategy with too much speech is to use multiple phrases and key words mixed together with each occupying its own “space”, starting with left and right channels (or more in multi-channel mode), or panned to the middle, with a keen ear for the counterpoint between them. Speech doesn’t need to be heard linearly, depending on what it needs to communicate.
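If you are placing phrases by hand, the left and right gains for a given position can be sketched as follows in Python. This assumes an equal-power pan law, one common convention among several; your mixing software may use a different one.

```python
import math

def equal_power_pan(position):
    """Left/right gains for a position from -1.0 (hard left) to +1.0 (hard right).

    Equal-power law (an assumed convention): the squared gains always sum to 1,
    so a phrase keeps roughly the same loudness wherever it is placed.
    """
    angle = (position + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)

for pos in (-1.0, 0.0, 1.0):
    left, right = equal_power_pan(pos)
    print(f"position {pos:+.1f}: left {left:.3f}, right {right:.3f}")
```

A phrase panned to the middle gets a gain of about 0.707 in each channel, rather than 0.5, which is why it does not seem to drop in level compared with phrases placed hard left or right.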

What you are doing is catering to the listener’s ability to use the cocktail party effect, that is, to isolate different streams of sound coming from different directions. But always keep in mind that when listeners have a degree of hearing loss, this will not work as well, and may in fact be confusing.

Sections and paragraphing. It’s fairly easy in a sound object exercise, for instance, to do a sound mix that holds your attention for 2-3 minutes. After that, depending on the nature of the material, it is wise to think in sections, or paragraphs if you want to use the literary analogy. There doesn’t need to be a full stop between them (i.e. silence, although that may help if it’s brief), but what’s important is that the listener senses closure of whatever development they’ve been following. There’s a moment of repose (called a cadence in music), and the listener is then ready for something new.

So, what’s the “new” element you’re introducing? Ideas like “variation, development, and contrast” are typical strategies to answer that. If you’ve started with something relatively simple, this may be the time to introduce new or contrasting elements that build and elaborate on what we’ve already heard. But, it’s even better if there is some underlying coherence between where we’ve been and where we’re going, even if it isn’t apparent right away to the listener (as in, we’ll tie this together later).

The build. Some of the best experiences with mixing are when all of the elements come together and create a coherent whole (as in, greater than the sum of the parts) in terms of mood and energy level. Everything seems to fit and make sense. Usually it takes a minimum of three layers to achieve this, although more may work if evenly balanced in the mix. As mentioned above, a spectral balance of different frequency ranges is a big asset, as they can be easily followed by the listener.

Once your sounds work well together, you can think of the overall shape that we’ll call “the build”. Listeners are very good at perceiving a “tendency” or direction in the sound material, and this raises a sense of “expectation” (or “sweet anticipation” as musicologist David Huron calls it in his book of the same title).

The pacing of the build is crucial, and even more important is the sense that it reaches a definable peak moment, or climax. If the pace is wrong, or there’s no real peak moment, you’re going to lose or confuse the listener. It’s not a matter of loudness or speed, though these may support the build.

The concept of volume, developed in the Magnitude module as the perceived magnitude of a sound, is the more useful concept here. Volume can build with more layers, not more loudness. This approach also makes it less likely that the mix levels will exceed the available dynamic range and require compression. If you’re dealing with textures, they can become more complex, not just louder. Increased rhythmic complexity is another type of build, but in this case it has natural psychoacoustic limits (high density rhythm leads to fusion).

And finally, what do you do once the peak is realized? Well, there are a variety of solutions. A sharp cut is not one of the more successful ones, as it leaves the listener “hanging”. However, if it is brief, and followed by a perceived resolution, then your work will be called “dramatic”. At the other end of the continuum is probably the gentle downward slope. However, it’s harder to keep interest when the energy level is falling, so it’s best to make the downward portion shorter than the ascent.

In literary terms, there’s what’s called the “denouement”, a lower energy “back to normal” type of moment where the listener can react calmly, maybe have a moment to reflect on the journey, maybe indulge in a remembered image, or simply come to a relaxed (and hopefully satisfied) state of awareness.

Our final example will be the last movement (Coda) from The Wings of Nike, which is a three-minute rhythmic build with increasing density and energy, based on just two phonemes, lasting about one-third of a second in total. There is a little granular figure floating above this process in the shape of the title, but the basic strategy is based on microsound where, as events speed up, they fuse into a continuous texture or pitch, and eventually result in a rising bandwidth, like a rocket taking off.

Although I think it stands on its own, the movement was designed to accompany computer graphic images by Theo Goldberg which followed a similar trajectory. The video version of the work is now available on a CSR DVD, and in a performance version for video and 8-channel soundtrack.

Final movement from The Wings of Nike,
CSR-CD 9101 Pacific Rim


Spectrogram



D. Maximizing the compositional process (friendly advice).


(a) just get started! what materials do you feel the strongest about? (they may or may not be the eventual starting sequence)
(b) listen to your materials at every stage (including when you get stuck)
(c) review your work between studio times (you'll get a better impression of it when you're relaxed and not dealing with technical issues)
(d) don't worry too much about perceived flaws (they will either seem to get worse over time or disappear)
(e) play the work in progress for fellow students or in class to get a fresh, less jaded set of reactions
(f) be patient and listen

