Audiovisual Translation (AVT) deals with the transfer of multimodal texts—combining image, sound, and written language—across languages and cultures. From the spatial and temporal constraints of subtitling to the lip-sync demands of dubbing, AVT poses challenges that earlier translation theories, focused primarily on written texts, could not fully address. The subfield’s development has been shaped by five major frameworks, each responding to the limitations of its predecessors and contributing new tools for understanding how and why audiovisual translations take the forms they do.
In the 1970s and 1980s, Descriptive Translation Studies (DTS) broke with earlier equivalence-based models by shifting attention from what translators should do to what they actually do. DTS introduced the concept of translation norms: regularities in translation behavior that reflect social expectations. For AVT, this meant analyzing real subtitling and dubbing choices empirically, revealing patterns such as condensation rates in subtitles or the avoidance of taboo language. DTS provided a rigorous method for describing AVT practices but stopped short of explaining why particular norms emerged in some cultures and not others. This gap was addressed by Polysystem Theory, developed by Itamar Even-Zohar. Polysystem Theory viewed literature as a stratified system of systems, where translated works could occupy a central or peripheral position. Applied to AVT, it offered a systemic explanation for the differing prevalence of dubbing versus subtitling across national contexts: on this account, in countries where translated film and television are central to the cultural polysystem (e.g., Germany, Italy), dubbing tends to dominate, whereas in peripheral translation cultures (e.g., the Netherlands, Scandinavia), subtitling is the norm. Polysystem Theory did not replace DTS but extended it by adding a cultural dimension: where DTS described norms, Polysystem Theory located them within broader power dynamics between source and target cultures. Both frameworks share an empirical, descriptive orientation, and they continue to be used together in studies of national AVT patterns.
Around the same period, Skopos Theory emerged from the work of Katharina Reiss and Hans Vermeer, offering a functionalist perspective that contrasted with the descriptive stance. Skopos Theory argues that the purpose (skopos) of the translation determines the strategies employed, overriding strict equivalence to the source text. In AVT, this provides a powerful rationale for departures from the original, such as condensing subtitles to fit time constraints, rewriting jokes for cultural resonance, or adjusting dialogue for lip synchronization. Skopos Theory complements DTS and Polysystem Theory rather than supplanting them: while those frameworks describe what is done, Skopos Theory explains in functional terms why it should be done. However, Skopos Theory has been critiqued for potentially legitimizing any alteration as long as it serves the stated purpose, a concern that becomes acute when ideological interests are at stake. This limitation paved the way for Cultural Translation Studies, which gained prominence in the 1990s. Drawing on postcolonial and feminist thought, Cultural Translation Studies insists that translation is never neutral: it is a site of power, identity, and resistance. In AVT, this framework has been used to study how censorship shapes dubbing and subtitling in authoritarian contexts, how gender representation is reinforced or subverted through character speech, and how fansubbing communities challenge official translations as a form of cultural activism. Cultural Translation Studies diverges from Skopos Theory by prioritizing ideological critique over functional efficiency; it shares with Polysystem Theory an interest in cultural hierarchies, but it adds a critical edge, questioning the very systems that Polysystem Theory describes in structural terms. While Skopos Theory remains influential in translator training and industry practice, Cultural Translation Studies opened up AVT to questions of ethics and power that earlier frameworks had largely overlooked.
By the early 2000s, Sociological Translation Studies emerged as a leading framework, drawing on Pierre Bourdieu’s concepts of field, habitus, and capital to analyze the social conditions of translation work. In AVT, this turn focused attention on the professionalization of subtitlers and dubbers, labor conditions in the subtitling industry, the impact of digital platforms on translation workflows, and the role of translation companies as gatekeepers. Sociological Translation Studies both extends and challenges Cultural Translation Studies: it shares the latter’s concern with power but grounds ideological analysis in material practices, studying, for example, how the habitus of a subtitler is shaped by training, workplace norms, and technology. This framework has proven especially useful for understanding new AVT modes such as crowdsourced subtitling on streaming platforms, audio description for visually impaired audiences, and the global distribution of content via services like Netflix. It does not supersede earlier frameworks but rather absorbs them into a more comprehensive sociological perspective: DTS norms become part of the professional field, Skopos purposes are negotiated within habitus and institutional constraints, and the cultural hierarchies described by Polysystem Theory are reproduced or contested through capital.
Today, AVT research draws on all five frameworks, each with a distinct focus. Sociological Translation Studies leads in analyzing institutional and labor-related transformations, but DTS and Polysystem remain essential for large-scale descriptive studies of subtitling norms or dubbing preferences across markets. Skopos Theory continues to inform pedagogical approaches and industry best practices, while Cultural Translation Studies drives critical work on censorship, representation, and activist subtitling. The main disagreements center on the relative weight of structural constraints versus individual agency (sociological vs. cultural approaches) and on whether description should be separated from prescription (DTS vs. Skopos). New AVT modalities—game localization, voice-over for documentaries, audio description, subtitling for the deaf and hard-of-hearing—are pushing existing frameworks to adapt, and no single framework can cover the entire range of AVT phenomena. This pluralism is a sign of the subfield’s maturity, reflecting the recognition that audiovisual translation is too multifaceted to be captured by any one theoretical lens.