Every analytical claim about music rests on a prior decision about what counts as evidence. Is the relevant fact a ratio of string lengths, a chord's function within a key, a voice-leading graph that reveals deep structure, or a statistical pattern across thousands of recordings? Music analysis has never settled on a single answer. Instead, its history is a series of competing frameworks, each proposing different analytical units, different methods for turning sound into data, and different arguments about what music analysis is for.
The earliest systematic frameworks for analyzing music were not primarily about individual works. They were about the order of the cosmos, and music was evidence of that order.
Greek harmonic theory (c. 500–200 BCE) treated intervals as ratios of string lengths. The Pythagoreans discovered that consonant intervals—octave, fifth, fourth—correspond to simple whole-number ratios (2:1, 3:2, 4:3). Analysis meant identifying the mathematical proportions that made a scale or melody coherent. The framework's analytical unit was the interval, and its evidence was numerical. It could explain why certain pitch combinations sounded stable, but it had no vocabulary for rhythm, form, or the temporal unfolding of a piece.
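The arithmetic behind these ratios can be sketched in a few lines. This is only an illustration of the proportions named above, not a reconstruction of any Greek procedure: the helper name `cents` and the conversion to cents (a modern logarithmic unit) are assumptions added for comparison.

```python
from fractions import Fraction
import math

# Ratios named in the text, kept as exact fractions.
OCTAVE = Fraction(2, 1)
FIFTH = Fraction(3, 2)
FOURTH = Fraction(4, 3)


def cents(ratio: Fraction) -> float:
    """Convert a ratio to cents (modern unit: 1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))


# Stacking intervals means multiplying ratios:
# a fifth plus a fourth spans exactly one octave.
assert FIFTH * FOURTH == OCTAVE

# Subtracting intervals means dividing ratios:
# the gap between a fifth and a fourth is the 9:8 whole tone.
WHOLE_TONE = FIFTH / FOURTH

for name, ratio in [
    ("octave", OCTAVE),
    ("fifth", FIFTH),
    ("fourth", FOURTH),
    ("whole tone", WHOLE_TONE),
]:
    print(f"{name:10s} {ratio}  {cents(ratio):8.2f} cents")
```

Exact fractions are used so that the identity between a fifth plus a fourth and an octave holds exactly instead of approximately in floating point.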
Chinese lü-lü modal theory (c. 300 BCE–present) began from a different cosmological premise: pitch pipes of standardized length generated a twelve-pitch gamut, and the selection of five or seven pitches from that gamut produced modes with ethical and seasonal associations. Analysis meant determining which pitches belonged to a mode and how that mode aligned with ritual or natural cycles. The analytical unit was the modal scale, and the evidence was the tuning system itself. Unlike Greek theory, which abstracted intervals from performance, lü-lü theory was embedded in court ritual: analysis and practice were inseparable.
Sanskritic raga-tala analysis (c. 200 BCE–present) developed a more elaborate grammar. A raga is not merely a scale but a melodic framework with prescribed ascending and descending forms, characteristic phrases, and rules about which notes are emphasized or avoided. Tala is a rhythmic cycle with defined beat groupings. Analysis meant identifying the raga and tala of a performance and judging whether the performer adhered to their constraints while improvising within them. The analytical unit was the melodic-rhythmic type, not the fixed composition. Evidence came from the performer's choices against the background of the grammar.
Arabic maqam theory (c. 800 CE–present) shares with raga-tala analysis the idea of a modal framework with characteristic intervals, melodic pathways, and emotional affect. But maqam theory is more concerned with the microtonal intervals that distinguish one maqam from another, and with the hierarchy of pitches within the maqam (tonic, dominant, resting notes). Analysis meant mapping the interval structure and the typical melodic development. The analytical unit was the maqam family, and evidence was the pitch set plus the conventional melodic gestures.
All four of these early traditions treated music as an instance of a larger order: mathematical, cosmological, grammatical, or affective. None treated a composition as a unique object to be analyzed on its own terms. That shift would come only with European tonal theory.
Functional tonality analysis (1722–present) emerged from Rameau's Traité de l'harmonie (1722). Its core claim was that chords have functions—tonic, dominant, subdominant—that create a directed harmonic motion toward closure. The analytical unit was the chord, and evidence was the chord's label and its position in a key. Functional analysis could explain why a cadence feels final or why a modulation creates tension. It gave analysts a systematic vocabulary for describing harmony, but it treated each chord as a discrete event, not as part of a deeper structure.
Schenkerian analysis (1900–present) absorbed functional tonality's vocabulary but recontextualized it within a hierarchical theory of voice-leading. Heinrich Schenker argued that a tonal work is not a sequence of chord functions but a prolongation of a single underlying contrapuntal structure, the Ursatz: a stepwise descending upper line (the Urlinie) over a bass arpeggiation (the Bassbrechung) that together unfold the tonic triad across the entire piece. Analysis meant producing a voice-leading graph that showed how surface events (chords, melodies) are elaborations of deeper structural levels. The analytical unit was the prolongational hierarchy, and evidence was the graph itself. Schenkerian analysis narrowed the scope of functional tonality: it claimed that only the deepest structural levels truly mattered, and that only a largely Austro-German canon (Bach through Brahms) fully realized this principle. It coexists with functional analysis today, each serving a different analytical purpose: functional analysis for labeling chord progressions, Schenkerian analysis for showing large-scale coherence.
Berlin School comparative musicology (1885–1935) broke with the European canon entirely. Its practitioners—Stumpf, Hornbostel, Abraham—collected recordings of non-Western music and analyzed them with acoustic measurements and transcription into Western notation. The analytical unit was the recorded sound sample, and evidence was the measured pitch and rhythm. The framework was ethnocentric: it assumed that Western notation and acoustic categories were universal tools. But it opened the door to asking whether tonal analysis could account for music organized on different principles.
Ethnomusicological analysis (1950–present) absorbed the Berlin School's cross-cultural mission while reversing its method. Instead of imposing Western categories, ethnomusicologists argued that analysis must proceed from the culture's own conceptual categories. This meant learning the tradition's own terminology for scales, modes, rhythms, and forms; transcribing performances in ways that respect the performers' categories; and treating the social context of performance as part of the analytical evidence. The analytical unit shifted from the recorded sound to the culturally defined musical act. Ethnomusicological analysis does not replace tonal or Schenkerian analysis for Western music, but it challenges their claim to universality by showing that different musical systems require different analytical tools.
Pitch-class set theory (1955–present) was developed by Allen Forte and others to analyze atonal and serial music that functional tonality could not handle. Its core move was to treat pitches as members of unordered sets, classified by their interval content. Analysis meant identifying the set classes that a passage uses and showing how they relate through transposition, inversion, and complementation. The analytical unit was the pitch-class set, and evidence was the set's prime form and interval vector. Set theory replaced voice-leading with combinatorial interval-class reasoning: it could show structural relations in music that had no tonal center, but it could not explain why one set succession sounds coherent and another does not. It coexists with Schenkerian analysis by domain: Schenker for tonal music, set theory for post-tonal.
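The two pieces of evidence named above, prime form and interval vector, can be computed mechanically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, pitch classes are integers 0 to 11 with C = 0, the input set is non-empty, and ties between equally compact rotations are broken by packing toward the left. Published conventions for prime form differ for a handful of set classes, so the output should not be taken as matching any one catalogue in every case.

```python
from itertools import combinations


def _best_rotation(pcs):
    """Most compact rotation of a pitch-class set, transposed to start on 0.

    Compactness: smallest span first, then smallest intervals from the left.
    """
    s = sorted(set(p % 12 for p in pcs))
    candidates = []
    for i in range(len(s)):
        rotation = s[i:] + s[:i]
        candidates.append(tuple((p - rotation[0]) % 12 for p in rotation))
    return min(candidates, key=lambda t: (t[-1], t))


def prime_form(pcs):
    """Representative of a set class under transposition and inversion."""
    inverted = [(-p) % 12 for p in pcs]
    return min(
        _best_rotation(pcs),
        _best_rotation(inverted),
        key=lambda t: (t[-1], t),
    )


def interval_vector(pcs):
    """Count of each interval class (1 through 6) among all pairs in the set."""
    s = sorted(set(p % 12 for p in pcs))
    vector = [0] * 6
    for a, b in combinations(s, 2):
        distance = (b - a) % 12
        interval_class = min(distance, 12 - distance)
        vector[interval_class - 1] += 1
    return tuple(vector)


# C major, D major, and A minor triads all reduce to the same set class.
for name, pcs in [("C major", [0, 4, 7]), ("D major", [2, 6, 9]), ("A minor", [9, 0, 4])]:
    print(name, prime_form(pcs), interval_vector(pcs))
```

Triads are used here only because they are easy to check by hand. The same functions apply to any collection, for example `[0, 1, 4, 6]`, whose interval vector contains each interval class exactly once.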
Semiotic music analysis (1970–present), particularly as developed by Jean-Jacques Nattiez, proposed that music is a symbolic form that can be analyzed at three levels: the poietic (what the composer or performer does), the neutral (the acoustic trace), and the esthesic (what the listener perceives). Analysis meant moving among these levels, showing how compositional choices produce acoustic patterns that in turn shape perception. The analytical unit was the sign or symbol, and evidence came from all three levels. Semiotic analysis built a bridge to ethnomusicological analysis by treating cultural meaning as analytically relevant, but it differed in its systematic tripartite method and its roots in linguistic theory.
Neo-Riemannian theory (1980–present) revived an idea from Hugo Riemann's 19th-century harmonic theory, that triads can be transformed into one another through voice-leading operations, and formalized it with algebraic tools. The three basic transformations are Parallel (P: C major to C minor), Leading-tone exchange (L: C major to E minor), and Relative (R: C major to A minor). Each holds two common tones and moves the remaining voice by step: a semitone for P and L, a whole tone for R. Analysis meant tracing a sequence of P, L, and R operations through a chromatic triadic progression, showing how the music moves through a network of closely related triads. Neo-Riemannian theory differs from pitch-class set theory in modeling the moves between triads as transformations rather than classifying passages by unordered sets, and it differs from functional tonality in handling chromatic progressions that functional labels cannot explain. It occupies a middle ground: algebraic like set theory, but preserving the triadic vocabulary of tonal analysis.
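The P, L, and R operations described above can be written as functions on triads. This is a sketch under assumptions: the `Triad` type, the `apply` helper, and the note spellings are all hypothetical choices made for illustration, and a triad is reduced to a root pitch class plus a major/minor flag.

```python
from typing import NamedTuple

NOTE_NAMES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]


class Triad(NamedTuple):
    root: int    # pitch class of the root, 0 = C ... 11 = B
    major: bool  # True for major, False for minor

    def __str__(self) -> str:
        quality = "major" if self.major else "minor"
        return f"{NOTE_NAMES[self.root]} {quality}"


def P(t: Triad) -> Triad:
    """Parallel: same root, opposite quality (C major <-> C minor)."""
    return Triad(t.root, not t.major)


def L(t: Triad) -> Triad:
    """Leading-tone exchange (C major <-> E minor)."""
    step = 4 if t.major else -4
    return Triad((t.root + step) % 12, not t.major)


def R(t: Triad) -> Triad:
    """Relative (C major <-> A minor)."""
    step = 9 if t.major else 3
    return Triad((t.root + step) % 12, not t.major)


OPS = {"P": P, "L": L, "R": R}


def apply(start: Triad, word: str) -> list:
    """Apply a string of operations left to right; return every triad visited."""
    path = [start]
    for letter in word:
        path.append(OPS[letter](path[-1]))
    return path


# Alternating P and L from C major walks a six-triad cycle back to C major.
for triad in apply(Triad(0, True), "PLPLPL"):
    print(triad)
```

Each operation is its own inverse, so applying the same letter twice returns the starting triad. That property is what makes a progression traceable as a path through a network.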
Cognitive and empirical music analysis (1980–present) shifted the burden of proof from the analyst's expert hearing to the listener's perceptual response. Instead of asserting that a Schenkerian graph reveals the true structure, cognitive analysts test whether listeners actually hear that structure. Methods include behavioral experiments (e.g., probe-tone studies that measure perceived stability), computational modeling of expectation, and brain imaging. The analytical unit is the perceptual response, and evidence is empirical data. This framework directly challenges Schenkerian claims: for example, Schenker's assertion that the Ursatz governs the entire piece has little perceptual support—listeners do not seem to hear a single tonic prolongation across a 30-minute sonata. Cognitive analysis coexists with traditional analysis by asking different questions: not 'what structure does the score contain?' but 'what structure does the listener construct?'
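One way probe-tone data enters computational work is as a template against which a passage's pitch content is correlated. The sketch below is an assumption-laden illustration, not a method described in this text: the function names are hypothetical, only major keys are considered, the pitch-class counts are invented, and the stability ratings are the major-key values as commonly cited from Krumhansl and Kessler (1982), which should be checked against the original before being relied on.

```python
import math

# Major-key probe-tone ratings, index 0 = tonic, as commonly cited from
# Krumhansl and Kessler (1982). Verify against the source before reuse.
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def major_key_scores(pc_counts):
    """Correlate 12 pitch-class counts with the profile rotated to each tonic."""
    scores = {}
    for tonic in range(12):
        rotated = [KK_MAJOR[(pc - tonic) % 12] for pc in range(12)]
        scores[tonic] = pearson(pc_counts, rotated)
    return scores


# Invented counts for a passage that emphasizes G, D, and B within a G major scale.
counts = [1, 0, 3, 0, 1, 0, 1, 4, 0, 1, 0, 2]
scores = major_key_scores(counts)
best_tonic = max(scores, key=scores.get)
print("best-fitting major tonic (pitch class):", best_tonic)
```

The point of the sketch is the direction of inference: the template comes from listeners' ratings, so the resulting key estimate is a claim about perception, not about the score alone.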
Corpus studies and digital music analysis (1990–present) use large collections of scores or audio recordings to test generalizations about musical style. Instead of analyzing a single masterpiece, corpus analysts ask: how often does a particular chord progression appear in Bach chorales? What is the typical distribution of note lengths in Beethoven string quartets? The analytical unit is the corpus, and evidence is statistical. Corpus studies share an empirical orientation with cognitive analysis, but they draw on different data: not perceptual responses but the frequency of features in a repertoire. A typical finding might be that the leading tone appears in 85% of cadences in a given corpus, a claim that no single-work analysis could support. Corpus studies have transformed music analysis by making it possible to test claims about style and convention with quantitative rigor.
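The kind of counting behind such claims can be sketched briefly. Everything below is hypothetical: the four-phrase toy corpus, its Roman-numeral labels, and the function names are invented for illustration and carry no statistical weight. A real study would parse encoded scores or audio features.

```python
from collections import Counter

# Hypothetical toy corpus: each inner list is one phrase labeled with Roman numerals.
TOY_CORPUS = [
    ["I", "IV", "V", "I"],
    ["I", "ii", "V", "I"],
    ["I", "vi", "IV", "V", "I"],
    ["I", "IV", "I"],
]


def bigram_counts(corpus):
    """Count every adjacent chord pair across all phrases."""
    counts = Counter()
    for phrase in corpus:
        counts.update(zip(phrase, phrase[1:]))
    return counts


def ending_share(corpus, penultimate="V", final="I"):
    """Fraction of phrases whose last two chords match the given pair."""
    hits = sum(1 for phrase in corpus if phrase[-2:] == [penultimate, final])
    return hits / len(corpus)


counts = bigram_counts(TOY_CORPUS)
for (first, second), n in counts.most_common(3):
    print(f"{first} -> {second}: {n}")
print("share of phrases ending V-I:", ending_share(TOY_CORPUS))
```

With four phrases the resulting proportion means nothing on its own. The design point is that the claim is a ratio over a collection, so its strength depends on corpus size and on how the corpus was sampled.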
The leading frameworks today—Schenkerian analysis, pitch-class set theory, neo-Riemannian theory, cognitive and empirical analysis, and corpus studies—agree on one thing: no single framework captures everything music does. They disagree on what counts as the most important evidence. Schenkerians and set theorists privilege the score and the analyst's trained judgment. Cognitive analysts privilege perceptual data. Corpus analysts privilege statistical patterns. Ethnomusicologists and semioticians insist that cultural context is inseparable from musical structure. This pluralism is not a failure of the field to converge. It reflects the fact that music is simultaneously a physical signal, a perceptual experience, a cultural practice, and a symbolic system. Each framework illuminates one of those dimensions, and the productive work of music analysis today lies in understanding how those dimensions interact.