Two overlaid perspectives—one for each eye—create spatial depth. Foundation of all 3D cinema and VR production.
On set, you need two cameras or a dedicated rig that simultaneously captures two slightly offset perspectives, much as human eyes do. The horizontal offset between the left and right camera is called the interaxial distance, or stereo base. You set this distance deliberately: too small, and the 3D effect falls flat; too large, and viewers get headaches because their visual system is overwhelmed. A base of 50 to 65 millimeters, roughly the spacing of human eyes, is standard. For close-ups, I reduce the base below that. For landscape shots or action scenes, you can go up to 75 millimeters.
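To make the effect of the base concrete, here is a minimal sketch of the usual pinhole-camera relation between base and on-screen offset. It assumes a parallel rig whose screen plane is set by shifting the images horizontally in post; the function name, the focal length, the sensor width, and the distances are illustrative assumptions, not values from a real production.

```python
def parallax_fraction(interaxial_mm, focal_mm, sensor_width_mm,
                      convergence_m, subject_m):
    """Horizontal offset of a subject between the two eyes,
    as a fraction of the image width (pinhole approximation).

    Positive: subject appears behind the screen plane.
    Negative: subject appears in front of it.
    """
    convergence_mm = convergence_m * 1000.0
    subject_mm = subject_m * 1000.0
    # Offset on the sensor: focal length * base * (1/convergence - 1/subject)
    offset_mm = focal_mm * interaxial_mm * (1.0 / convergence_mm - 1.0 / subject_mm)
    return offset_mm / sensor_width_mm


# Hypothetical setup: 35 mm lens, 25 mm wide sensor, screen plane at 3 m.
for base_mm in (32.5, 65.0, 75.0):
    background = parallax_fraction(base_mm, 35.0, 25.0, 3.0, 10.0)
    print(f"base {base_mm:5.1f} mm -> background offset {background:.2%} of image width")
```

Under these assumptions the offset grows in direct proportion to the base, which is why a small change in base is so visible in the result.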
Later, in the edit, the two images are aligned precisely against each other. Shifting them horizontally sets the convergence, that is, the depth that appears to lie on the screen plane. Vertical alignment is a common source of error: if it deviates by even two or three pixels, the image flickers for the viewer, or the eyes cannot fuse the two images into a single spatial impression. That is why I always check for vertical parallax errors during dailies review. Professional editing software has alignment tools for this; without them, the work becomes laboriously manual.
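A vertical alignment check of this kind can be sketched as follows. The sketch assumes you already have matched feature points from both eyes (how they are found is out of scope here); the function names, the sample coordinates, and the 2-pixel tolerance are illustrative assumptions based on the figure mentioned above.

```python
from statistics import median


def vertical_misalignment(matches):
    """Median vertical offset in pixels between the two eyes.

    matches: list of ((x_left, y_left), (x_right, y_right)) pairs,
    each pair marking the same scene feature in both images.
    """
    offsets = [y_right - y_left for (_, y_left), (_, y_right) in matches]
    return median(offsets)


def needs_alignment(matches, tolerance_px=2.0):
    """True if the median vertical offset reaches the assumed tolerance."""
    return abs(vertical_misalignment(matches)) >= tolerance_px


# Hypothetical matched points from one dailies frame.
sample_matches = [
    ((412.0, 300.0), (398.0, 302.5)),
    ((1020.0, 640.0), (1011.0, 642.0)),
    ((1500.0, 210.0), (1494.0, 213.0)),
]

print(vertical_misalignment(sample_matches), needs_alignment(sample_matches))
```

The median is used rather than the mean so that a single bad match does not dominate the result. Horizontal offsets are deliberately ignored, since those carry the depth information.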
A practical point: the depth budget. You can't pack unlimited depth into the image. As a rule of thumb, the total parallax, meaning the horizontal offset between the left and right image of the same object, should stay within about 2 to 3 percent of the image width; beyond that, viewing becomes strenuous. Concretely, on a cinema screen 20 meters wide, 2 to 3 percent corresponds to an on-screen offset of 40 to 60 centimeters. You can also play with negative parallax, where objects appear to come out of the screen toward the viewer, but this too has limits. A few centimeters are elegant; a meter feels forced and tiring.
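The arithmetic behind the 2 to 3 percent figure is simple enough to sketch. The function names and the 2048-pixel frame width are assumptions for illustration; the percentages and the 20-meter screen come from the text above.

```python
def parallax_budget(width, low=0.02, high=0.03):
    """Lower and upper budget, in the same unit as `width`."""
    return width * low, width * high


def within_budget(near_px, far_px, width_px, limit=0.03):
    """Check whether the spread between nearest and farthest offset fits.

    near_px: offset of the nearest object (negative = in front of screen).
    far_px:  offset of the farthest object (positive = behind screen).
    """
    return (far_px - near_px) <= width_px * limit


screen_low_m, screen_high_m = parallax_budget(20.0)  # 20 m wide screen
frame_low_px, frame_high_px = parallax_budget(2048)  # assumed 2048 px wide frame

print(f"screen: {screen_low_m:.2f} to {screen_high_m:.2f} m")
print(f"frame:  {frame_low_px:.0f} to {frame_high_px:.0f} px")
print(within_budget(near_px=-20, far_px=30, width_px=2048))
```

Working in pixels is usually more practical during review, because the same percentage then applies regardless of the screen the film is eventually shown on.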
Lighting must be consistent between both cameras; otherwise the fused image appears to flicker. Make sure color and brightness match between the two. For special effects or VFX, for example when shooting green screen with stereoscopic cameras, the compositing effort doubles: each layer must be created for both perspectives.
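A rough brightness and color comparison between the two eyes might look like the sketch below. It assumes frames are available as lists of (r, g, b) values between 0 and 1 and uses the Rec. 709 luma weights; the function names and the sample frames are illustrative assumptions, and any acceptance threshold would have to be chosen per production.

```python
REC709_WEIGHTS = (0.2126, 0.7152, 0.0722)  # Rec. 709 luma coefficients


def channel_means(pixels):
    """Mean of each RGB channel over a list of (r, g, b) tuples."""
    count = len(pixels)
    return tuple(sum(p[i] for p in pixels) / count for i in range(3))


def mean_luma(pixels):
    """Mean luma of a frame, weighted per Rec. 709."""
    means = channel_means(pixels)
    return sum(w * m for w, m in zip(REC709_WEIGHTS, means))


def brightness_mismatch(left, right):
    """Relative luma difference between the eyes (0.0 = identical)."""
    luma_left, luma_right = mean_luma(left), mean_luma(right)
    brightest = max(luma_left, luma_right)
    if brightest == 0:
        return 0.0
    return abs(luma_left - luma_right) / brightest


# Hypothetical frames: the right eye is slightly darker and warmer.
left_frame = [(0.50, 0.50, 0.50)] * 4
right_frame = [(0.47, 0.45, 0.43)] * 4

print(f"brightness mismatch: {brightness_mismatch(left_frame, right_frame):.1%}")
print("channel difference:",
      [round(l - r, 3) for l, r in zip(channel_means(left_frame),
                                       channel_means(right_frame))])
```

Comparing whole-frame averages only catches global mismatches such as exposure or white balance. Local differences, for example a reflection visible to one camera only, need a closer look at the images themselves.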