"The categories of background scoring and source music are logical ones into which all film music can be divided."1

At first glance, Irene Kahn Atkins appears to be correct: source music and background music seem unassailable as the basic types for music in cinema. Part of their power is certainly that they form a neat binary pair: source music is tied to something within the physical space depicted in the film, background music is not. Hence the two terms now commonly used in film-music discourse: diegetic, meaning "story-world," and non-diegetic.2

Nevertheless, it takes very little examination to discover that these categories are not so clear and straightforward as one assumes; indeed, they prove quite fragile. A character standing in the middle of a room and humming or singing is an obvious enough case of diegetic music, provided that sound levels and resonance are believable for such a space. The same conditions apply to someone humming or singing offscreen, whether or not we ever see the person responsible. Likewise, orchestral or dance-band music under a main-title sequence is plainly non-diegetic: it functions like the traditional overture or "curtain-raising" music in the theater, and at the same time it provides legitimation for all other "disembodied" music in a film.

On the other hand, an appropriate label is harder to determine for music which appears to be coming from a radio during one scene but continues without interruption, and without change of volume, into another scene set in a different location. Kahn Atkins proposes the term "source scoring" for this progression from the diegetic to the non-diegetic, but that only gives a name to the problem, rather than solving it.3 Or, consider the reverse: Strauss waltzes from the main-title sequence of Grand Hotel (1932) reappear in the opening scene after a character says ". . . music all the time. Oh, it's wonderful." The reference is obviously to the hotel orchestra, which would very likely be playing waltzes and similar traditional music during the mornings and afternoons. Yet sound levels are high: to be realistic, the orchestra would have to be assembled in the hotel lobby, rather than, more appropriately, in a large dining room or similar venue somewhere off the lobby. According to Rick Altman, this failure of "auditory perspective" (that is, a mismatch between image and sound) is actually a defining feature of sound design in Hollywood after the mid-1930s.4 Finally, the "Trolley Song" from Meet Me in St. Louis (1944) is typical of production numbers in most film musicals: the music was recorded first, then the scene was shot and cut to match. Thus, although Judy Garland and her several friends move throughout the trolley in the course of the number, the sound is remarkably consistent with that of a person standing in front of a stationary microphone with a chorus behind her. And what do we make of the orchestra? Is it diegetic? No physical space can reasonably be imagined for the orchestra members to occupy (the sidewalk? another trolley following behind?). Is the orchestra non-diegetic? If so, then the singing cannot be diegetic, since it is utterly implausible that a large group of persons would all be singing to an imaginary orchestra.

Even these few examples should suggest that the source/background or diegetic/non-diegetic pair will not, by itself, offer an adequate basis for a conceptual framework within which to interpret film music. A heuristic for film-music analysis requires a more complex theoretical framework. In what follows, I offer a sketch for such a heuristic, based on a reinterpretation of spatial realism as it relates to music. Brief scenes from two feature films, one very familiar, Casablanca (1943), the other not, Liliom (1934), will serve to demonstrate the model's potential uses.

Several scholars, including Claudia Gorbman, Royal Brown, and Kathryn Kalinak, have already argued in different ways for the complexity of the source/background music relationship.5 The model proposed here goes a step further by demoting the diegetic/non-diegetic pair to just one in a field or network of several interacting binary pairs. The model is presented here as an outline: Figure 1 gives the top levels, Figures 2 and 3 expand specific sub-levels. Figure 1 first separates narrative agency—that is, who the narrator is or might be—from the processes of narration—the "how." Under the latter, I have listed the components of narration as David Bordwell delineates them.6 The first of these, "systems," is the one of interest to us, since it is even more narrowly the "how" of narration, "plot" being the particular ordering of the story's elements and "style" bringing us to the specifically filmic; that is, by this point we know a film is being used to present the narrative rather than a novel or a stage play. A sound film's two physical components are the image track and the sound track.


FIGURE 1: A scheme of filmic narration, with music (in part, after David Bordwell, Narration in the Fiction Film): first two main headings, the second partly expanded

A. Narrative Agency 
B. Narration (as Process) 
   1. Systems 
       a. Plot 
       b. Style 
            1. Image track 
            2. Sound track 
       c. Interplay of plot and story 
       d. Interplay of style and plot 
   2. "Excess" 
   3. Story 


The expanded headings of Figure 2 show the three elements of the sound track: dialogue, effects, and music. The expansion of the "music" subheading is based on the three large musical codes proposed by Gorbman: pure, cultural, and cinematic. To explain by example, these are, respectively, music as a traditional performance; music as broadly conventional (tom-toms for Indians, bagpipes for Scotsmen); and music as tied to a specific film (leitmotifs linked to characters, and so on).7 To these, I have added two "mixed" codes. "Cultural-cinematic" refers to cultural codes reinforced by cinematic use and then strongly associated with cinema (that is, used in film music but not often in concert music), such as certain gross style differences which are identified as referential or narrative cues: "Indian" music; jazz for bars, nightclubs, or to signify the "urban"; etc. The leitmotif technique, heavily exploited in early Hollywood sound cinema, may be included here, too, as may stereotyped formal placement of music taken over from the musical stage, as in main titles and end credits, or as transitions or entr'actes. "Cinematic-cultural" refers to cinematic codes which have become cultural codes: for example, devices of musical expressionism as codes for horror, suspense, the supernatural, or psychological imbalances; or the quotation of historical musics as referential cues.


FIGURE 2: Same, with the subhead "style" and sub-subheads "sound track" and "music" expanded further

A. Narrative Agency
B. Narration (as Process)
    1. Systems
         a. Plot
         b. Style
              1. Image track
              2. Sound track
                   a. Dialogue (speech)
                   b. Sound effects (all non-speech, non-music sounds)
                   c. Music
                        1. Presence
                             a. Cultural musical codes (cultural conventions)
                             b. "Pure" musical codes (familiar processes such as formal design
                                 [contrast, recurrence, . . . ], thematic statement and development,
                                 textural processes)
                             c. Cinematic musical codes (properly, cinematic-theatrical musical
                                 codes)
                             d. Cultural-cinematic musical codes (cultural codes now associated
                                 with cinema)
                             e. Cinematic-cultural musical codes (cinematic codes which have
                                 become cultural codes)
                        2. Absence


Finally, Figure 3 takes the third of the large codes, the "cinematic musical," and expands it considerably.8 Only now do we reach the diegetic/non-diegetic pair. The model represented in this outline does not reduce all other attributes of music in the sound track to qualifiers of the diegetic/non-diegetic pair. Instead, the model assumes that any of the categories or oppositions defined under "cinematic musical" (that is, any of the items numbered 1-10 in Figure 3) may be used to direct interpretation of music's functions in a sequence or scene. This is consistent with Rick Altman's evaluation of spatial realism in film sound: "far from constituting the foundation and goal of [a] sequence, [spatial realism] must instead be understood as part of an overall strategy involving not only all image and sound events but also a more general manipulation of audience reaction."9 In other words, music works within the context of a system of narrative processes, a system which uses plausibility in the physical world represented by the image track as one, but only one, of its means.

In general, the binary terms in Figure 3 are better viewed as endpoints on a continuum, rather than absolute categories. The first, unambiguous examples of diegetic and non-diegetic music cited above do represent extremes, but others may be less clear. "Point-of-view music," for instance, sits midway between diegetic and non-diegetic, combining properties of both. Common examples of point-of-view music include a stinger (sudden sforzando chord) to register surprise or shock, or a band playing a melody which has a particular, strong association for a character. Similarly, the "off/on" alternatives for musical closure (item 7 in the list) might be construed, instead, as a series of levels extending from the conventional final cadence of a clearly defined musical form, to clear cadential closure but on an intermediate phrase, to a fade-out managed by the sound editor, to an abrupt mid-phrase cutoff.


FIGURE 3: Expansion of the sub-subhead "cinematic musical codes"

c. Music  
      1. Presence 
             a. Cultural musical codes 
             b. "Pure" musical codes 
             c. Cinematic musical codes (Subheadings assume a continuum, not a simple
                    polar opposition, between items of each pair.) 
                    1. Diegetic/non-diegetic (or source/background).   
                                Spatial anchoring, or plausibility in the physical world represented in
                                the image track. Certain types of music, such as "mickey-mousing"
                                or "point of view" music, are uncertainly situated between the
                                diegetic and non-diegetic.
                    2. Onscreen/offscreen.  
                                What the camera frames at the time the music sounds.  
                    3. Vocal/instrumental: performance forces.  
                    4. Rerecording: synchronized/not-synchronized.
                                 The degree to which synchronization or the lack of it becomes a
                                 noticeable feature of a shot or sequence. 
                    5. Sound levels: "Realistic"/unrealistic (for diegetic music); loud/soft
                                 (for non-diegetic music). Sound levels are perhaps the most likely
                                 sound-track elements to undermine the diegetic/non-diegetic
                                 distinction.
                    6. Musically continuous/discontinuous.   
                                 Traditional musical continuity of phrase, harmonic and melodic
                                 development, etc.  
                    7. Musically closed/open.  
                                 The "cadence" and related devices of musical closure, including
                                 those created by completing conventional musical-form schemata.  
                    8. Thematic/motivic referentiality. 
                                 A specific theme used as a referential cue for person, thing, or
                                 concept (but not the idea of using themes in this way, which
                                 belongs to a larger code, the cultural-cinematic).
                    9. Formal interaction of cutting and music: yes/no.
                                 Design parallelisms between shot sequences and musical phrasing
                                 or expressive articulations. 
                    10. Motivation, or narrative plausibility: yes/no.  
                                 Motivation supplied by narrative, dialogue, or image track.

The list in Figure 3 provides enough options to facilitate richer readings of music's role in a film's discursive structure. Take, for example, a scene from Casablanca (1943). This sequence appears early in the film, beginning just before Major Strasser's first appearance at Rick's Café and ending with Ugarte's arrest for the murder of two German couriers. Music fades in as Major Strasser (Conrad Veidt) enters: the band is offscreen, playing an arrangement of "I'm Just Wild About Harry." Near the end of the scene, music fades out under gunshots as Ugarte (Peter Lorre) tries in vain to escape.

A dance band playing in a nightclub is, of course, one of the most common uses of diegetic music in early Hollywood sound film. The band's performance is entirely offscreen [item 2 in Figure 3], but we readily accept it as diegetic because of earlier onscreen performances [item 1]. Nevertheless, the music's diegetic status is compromised by an unmotivated drop-out in the middle of the scene [10]—the music simply disappears for several moments while Ugarte is talking in the casino room.

The performance, then, is uncertainly diegetic [1], offscreen [2], instrumental [3], synchronized [4], and with a generally realistic sound level: Captain Renault and Rick (Claude Rains and Humphrey Bogart, respectively) are talking in the latter's office; the music comes in as the office door is opened from the outside; sound comes up plausibly with the cut to the café floor, although it then goes under dialogue noticeably [5]. The performance is musically discontinuous because of the unmotivated drop-out [6], and it is also musically open: the ending of the song is masked as the music goes out under gunshots [7]. These last characteristics are unusual but by no means unknown in diegetic performances.

Film editing and musical design interact: the pianist plays the verse, then the orchestra plays the chorus, the articulation happening as Renault greets Strasser [9]. Nor is there any ambiguity about motivation: the performance starts when the musicians are ready; it goes out with the disturbance of Ugarte's arrest, as if the musicians are disconcerted by the gunfire, too [10]. Whether the performance leans more toward the pure or coded is unclear, but this is typical of performances that are not foregrounded. They combine a cultural musical code (café dance band playing popular music) with the role of "disengaged" background music; that is, the music interacts minimally with the image track and could be listened to separately and autonomously.

The juxtaposition of the song "I'm Just Wild About Harry" with Major Strasser's arrival is obviously ironic, and thus less believable as diegetic, since this rather enthusiastic love song is blatantly inappropriate for the unloved, unloving, and unlovable villain [8]. Such links between song titles and action in the environment of the performance ought to occur at random, with nothing like the persistent regularity with which they happen in the movies (particularly in Casablanca). One might best place such associations under a larger level of editing and musical design: the song performance is placed "by design" at this point rather than some other equally good spot because of the association. The motivation is not so strong as it might be, of course, because the filmmakers could not be sure how many in the audience would know the song and therefore catch the allusion.

One might readily argue, then, that the diegetic status of the music is a secondary factor in this sequence, because the performance is partly treated as it would be were it properly non-diegetic [1]. More important are the linking of the song title with Strasser [8], the formal interaction of cutting and musical form [9], and even the lack of musical continuity and closure [6, 7].

The complexity of music in film sound and narrative processes was not unique to the Hollywood sound-design practices of the later 1930s and 1940s. In fact, most of the issues relating to film music were worked out in the early years of sound cinema in the United States and Europe, roughly in the years between 1927 and 1935.10 One film from this era, Fritz Lang's Liliom, may serve to demonstrate.

Lang was among a number of professionals from the German film industry who briefly took refuge in Paris after the Nazis took power in 1933. Most moved on to the United States, including Lang, Billy Wilder, Peter Lorre, and the composer Franz Waxman. Waxman had studied in Dresden in the early 1920s, and thereafter in the Berlin Musikhochschule. By 1928, he was working as the principal arranger for Frederick Hollander's cabaret shows and was later employed as an arranger by UFA, the largest German film studio. Waxman emigrated to Paris in March 1933, shortly after the Reichstag fire. He worked on three films there, the last being Liliom, the first film version of Franz Molnar's stageplay—the same story that was later used for the musical Carousel.11

Lang's Liliom is about one hundred fifteen minutes long and includes more than fifty minutes of music, divided into two main blocks: carnival music, which begins with the main-title sequence and continues nearly non-stop for almost twenty-five minutes; and background scoring, which begins about the eighty-fifth minute and occupies about twenty-four of the film's remaining thirty-two minutes. Waxman's original cues for Liliom made his career; they impressed Hollywood director James Whale so much that he hired the composer to work on The Bride of Frankenstein in 1935.

The carnival music obviously acts as source music, but it is treated with a freedom that belies—and undercuts—a simple realism. For example, when Liliom and one of the other carnival barkers get into a fight, all sound is abruptly cut out while they circle each other; when they engage, the sound—including the music—just as abruptly reappears. Before the fight, Liliom (who is played by Charles Boyer) and the crowd follow through their merry-go-round ride in the manner of a production number in a musical, with many unrealistic coincidences of crowd singing and the playing of the merry-go-round's mechanical organ, not to mention deleted background noise and convenient miking for Boyer. Although the carnival music finally stops in the twenty-fifth minute, it reappears periodically in subsequent scenes, sounding in the distance. The carnival music draws on a number of familiar melodies, including dances and popular songs from operettas and vaudeville.

Of the nine background cues in the latter part of the film, the most important by far is the first, "Going to Heaven." Out of work—and not really interested in working—Liliom becomes involved in an abortive robbery scheme with a fellow street tough, Ficsur. When things go wrong, Liliom sees no way out and stabs himself. He is taken back to the room where he and his pregnant wife Julie stay, and there he dies. Julie lingers briefly by the corpse, then leaves, after which Ficsur sneaks in. He does not at first comprehend that Liliom is dead. At the moment the realization hits him, a high-pitched organ chord sounds and holds for a few seconds until it is joined by a sudden, dissonant fanfare from the brasses. The rest of the scene is Liliom's ascent to heaven accompanied by two angelic policemen. Partway through this ascent Liliom encounters first a choir of cherubs, then a larger choir of saints.

At first glance, Liliom might seem a poor example for my argument. After all, the distinction between the carnival music of the first part and the orchestral background music of the latter part manifestly follows and intensifies a basic articulation of the film. The shift from the one to the other occurs in the scene just described above. At the beginning of this scene, the carnival music represents physical reality and the orchestral music represents the supernatural, as straightforward an opposition of diegetic and non-diegetic as one could want. But placing music in the physical space of the film's story is only one of several things that music is about, even here. Consider the carnival music: the sound should be continuous, at least until the organ and brasses enter, but it is not: it cuts out several times, for no apparent reason, then resumes in mid-phrase. The sound track has obviously been edited; interaction of music and editing [item 9] takes precedence over a "correct" simulation of music's position in filmic space. Similarly, the last quotation of the carnival music—as the camera cuts outdoors and we look down onto the ground—is referential rather than spatial; that is, it "names" the earth and the physical life that Liliom is now leaving. The brass entry is a stinger, a sudden sforzando figure that intensifies shock or surprise. Here it clearly shows Ficsur's terror at seeing the heavenly policemen. Stingers, though widely used, are problematic: the dynamic level is important, and they cannot be considered "pure" non-diegetic music because they are closely tied to something onscreen, in this case either "naming" Ficsur's terror or acting as point-of-view, expressing the terror that he feels. In the stinger, motivation or narrative plausibility [10] is high but non-diegetic status is compromised.

As Liliom and the angels ascend slowly into the air, the camera focuses repeatedly on his face, as he looks toward something which it is reasonable to think we then see in the next shot. This again is point of view [1]; all the music until the boys' choir enters might be uncertainly situated between the diegetic and the non-diegetic, as it expresses Liliom's wonderment at clouds, planets, and stars. Ascending scales in the ondes martenot during this sequence might be taken as referential [8]: they depict the ascent. And of course the heavenly choir is diegetic; even though the viewer can tell that it is just a screen painting, we are meant to understand that Liliom does see the cherubs, and yet there is no break whatever in the musical continuity [6].

In this scene, then, motivation, musical continuity, the shaping of the cue by performance forces, and perhaps referentiality and the formal interaction of editing and music are all as important as the source/background distinction. It is true that, on the largest level of the film's discourse, the diegetic and non-diegetic are placed in a simple binary opposition, but at the level of the sequence the treatment is more complex, a condition which allows the viewer/listener to attach a motivation to the opposition. True, the carnival music is often merely neutral diegetic music, playing behind the scenes because music is expected in such a setting; that is, the setting motivates the music. Nevertheless, the drop-out for the fight, a generally high volume level in the first part, and the intermittent, implausible recurrences (though at believable volume levels) during the second part suggest that the diegetic music serves a referential or even symbolic function: it represents the earthly, or perhaps the banality of Liliom's earthly life. The non-diegetic music, similarly, is used less as dramatic background music than to represent the heavenly, or Liliom's subjectivity in his experience of the heavenly. It is, in other words, point-of-view music. Thus, the diegetic/non-diegetic pair is not the fundamental category for music in this film's discursive system; instead, the pair serves referentiality and the broadest narrative motivation.

The value of an analytic model based on the ten binary pairs in Figure 3 is that it promotes descriptions of film music that are at once more flexible and more intricate, thereby facilitating richer readings of music in the sound track. Despite the speed and compartmentalized functions of studio film production, cinema composers, editors, and directors could achieve practices we may call artistic, and these did shape the final products, even though those can rarely be regarded as perfectly self-contained texts in the sense of a Beethoven piano sonata. As they developed in early sound cinema, those practices were more complex than a simple "source music equals 'reality.'"

David Neumeyer

David Neumeyer is Marlene and Morton Meyerson Professor of Music in the Sarah and Ernest Butler School of Music, The University of Texas at Austin. Before coming to UT in 2000, he taught at Indiana University-Bloomington, where he was also Director of Graduate Studies from 1993 to 2000. He holds a BM in piano performance from Michigan State University (1972) and a PhD in music theory from Yale University (1976). He is the author of The Music of Paul Hindemith (Yale 1986), editor of the Oxford Handbook of Film Music Studies (2013), and co-author of the textbook Hearing the Movies: Music and Sound in Film History (Oxford 2012).

