Phonoscènes are montage sequences in which sound and image are deliberately decoupled: music over rapid cuts, voiceover across unrelated visuals, dialogue layered abstractly over the picture. The result is rhythmic and emotional density.
You're sitting in the edit suite and you realize that this dialogue doesn't work over the images you have. The actress is talking about a memory, but the images show the present. So you deliberately decouple them, and suddenly the sequence has a depth that synchronous editing could never achieve. This is the core principle of phonoscènes: sound and image do not follow the same logic. They work in parallel, contrapuntally, sometimes even in opposition.
The practical power lies in rhythmic and emotional density. You can lay calm music over hectic jump cuts, and the contrast creates tension. Or you can run a resigned voiceover over fast cuts of a bustling city, so that the discrepancy between sound and image becomes a statement in itself. In the edit, you work with two independent editing logics at once. The cutting rhythm follows the music or the dialogue, not the action in the image. The images do not need to be in sync with the spoken words: they can illustrate them, contrast with them, or wander off completely. This takes courage in the edit suite. You have to be willing to lay dialogue over mismatched or surreal images and trust that the strangeness works.
Typical scenarios: A character's inner monologue runs over their external actions, and what they think contradicts what they do. Or an energetic music sequence is cut with archival material that is unrelated in content to the music but drives the visual beat. Then there is the classic montage opening: fast cuts to music, with the story told without anyone speaking and without the images merely illustrating the music. The sound sets its own tempo, and the cuts either follow it or deliberately break it.
The danger: Phonoscènes can look like sloppy editing if they are not precisely constructed. Every decoupling must be intentional. The cut has to be even sharper when the audio track tells a different truth from the picture. You are building tension on two levels, and that demands that both levels be precisely timed. Unlike synchronous editing, where image and sound support each other, here you have to shape each track on its own terms. This makes phonoscènes complex, but also one of the most expressive editing techniques there is.