One camera captures depth data via sensor or computational methods instead of stereo pairs—lighter rig, faster setup. Depth reconstructed in post-production.
You unpack one camera, not two. That's the core: A single-sensor unit captures depth information directly during recording — either through structured light (projector casts pattern, sensor reads distortion), time-of-flight (transit time measurement of infrared pulses), or newer phase-shift methods. No stereo rig with two lenses that you have to synchronize, calibrate, and juggle on set. One camera, one cable, done — and yet you have spatial data for the 3D space.
Why this was revolutionary: Classic stereo 3D cameras — two lenses, parallax adjustment, convergence management — required space, maintenance, and precise geometry. Close-ups became critical, movement tricky. Single-film methods free you from that. You shoot with normal freedom of movement; depth reconstruction happens later. The sensor captures RGB + depth in one take. In editing, you then calculate parallax afterwards, shift virtual planes, or create true stereo output for the respective 3D version — depending on the shot and requirements. This not only saves shooting time but also camera weight and allows for lenses that are impossible with classic stereo rigs.
Practically, this means: You shoot standard footage but get a depth map for free. In post, you — or your VFX house — assemble the spatial layers. You need fewer fakes or color keys because the depth is machine-recognized. The quality depends on the scene (reflective surfaces are tricky) and the sensor resolution (some 3D cameras only give you 320x240 depth at 4K RGB). But for documentary material, fast guerrilla shoots, or VFX tracking, it's worth its weight in gold — significantly faster than a stereo setup, similarly flexible as a normal mono camera.
Caution: Depth map ≠ automatic 3D magic. You still need color space management, color correction, and artifacts appear in post if the sensor depth is unclean. But the foundation is solid: one recording, full freedom of movement, spatial data in-house. This doesn't spare you the stereo graphics (convergence, screen parallax) — you do that in grading — but the hardware complexity on set drops dramatically.