3D Video vs 3D Animated Scenes

I have been exploring OpenUSD (formerly USD from Pixar) for creating computer generated animated videos. I create the characters, backgrounds, and animations as 3D models, then render them to a 2D video you can view on social media. Out of curiosity I just got a store demo of the Apple Vision Pro, exploring both 180 degree 3D videos and 3D scenes. The quality of the experience was impressive, and it made me wonder about 3D scenes vs 3D videos.

What do I mean by scene vs video? Well, 2D video is what we are all used to. It is on our TV sets, computer monitors, etc. 3D videos add a feeling of depth. 3D image experiences have been around for a LONG time (Wikipedia shows some images from the 1860s!). 3D videos have been shown in cinemas for years as well (Wikipedia indicates since 1915!), typically involving wearing special glasses. More recently VR headsets have also been used for consuming 3D video by sending a different image to the left and right eye.

How does it work? Put simply, you capture images with two cameras side by side. You then play the images back, one to each eye of the viewer. Our brains compare the two flat images and use the small differences between them (parallax) to work out how far away things are.
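To make the parallax idea concrete, here is a minimal sketch of the standard textbook relationship for an idealized pair of parallel cameras: distance = focal length × camera spacing / disparity. The function name and all of the numbers are my own illustrative assumptions, not measurements from any real camera or headset.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Estimate distance (in meters) to a point seen by two parallel cameras.

    focal_length_px: camera focal length, expressed in pixels
    baseline_m:      sideways distance between the two cameras, in meters
    disparity_px:    how far the point shifts between the left and right image
    """
    if disparity_px <= 0:
        # Zero shift means the point is effectively infinitely far away.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical values chosen only for illustration.
FOCAL_LENGTH_PX = 1000.0
BASELINE_M = 0.065  # assumed spacing, roughly the distance between human eyes

near = depth_from_disparity(FOCAL_LENGTH_PX, BASELINE_M, 65.0)
far = depth_from_disparity(FOCAL_LENGTH_PX, BASELINE_M, 6.5)
print(f"large shift -> about {near:.1f} m, small shift -> about {far:.1f} m")
```

The takeaway is that nearby objects shift a lot between the two images and distant objects barely shift at all, which is the cue our brains pick up on.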

So what is the difference between watching a 3D video and a 3D scene in a VR headset? Err, I mean in a spatial computing device.

The difference is pretty simple. A 3D video is shot from a single point. You can perceive depth, you can rotate your head to see left/right, but you cannot move your head sideways to “look behind” an object on the screen. A 3D scene with a VR headset on the other hand allows you to move around in the scene and look at things from different angles. It is more like being in a 3D video game.

The Apple Vision Pro has gone with OpenUSD as its main format instead of glTF. I create animations using OpenUSD, so it raised the question: if I wanted to target content at the device, should I create 3D videos, or should I create 3D scenes with animations and play them back directly on the device? Here are my thoughts.

3D videos are easier for content producers. With a 3D video the creator can review exactly what the user sees. With a 3D scene, the audience can try to look at your content from another direction, so it has to look good from all angles. That can involve a lot more work. For example, I often hide “problems” from the viewer by changing the camera angle. Maybe the cloth physics for a dress does not look good when a character is sitting. With 3D video I can frame the shot to hide such problems.

Jumping between shots is easier with 3D videos. If viewers get used to moving around in physical space to look at a scene from different angles, what happens when you jump to a shot with a new camera angle? The viewer may suddenly be too close to, or even inside, something in the scene. Not a good experience.

Performance is easier with 3D videos. With an animated scene, the animation is computed and rendered on the viewing device. Because different people have devices with different amounts of processing power, it can be hard to ensure smooth animation on all viewing devices.
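As a rough illustration of why this is hard, here is a small sketch of the frame budget arithmetic. The 90 Hz refresh rate and the per-frame render times are assumed values for illustration only, not measurements from any particular headset.

```python
def frame_budget_ms(refresh_hz):
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / refresh_hz


def count_missed_frames(frame_times_ms, refresh_hz):
    """Count frames that took longer to render than the budget allows."""
    budget = frame_budget_ms(refresh_hz)
    return sum(1 for t in frame_times_ms if t > budget)


# Assumed refresh rate and made-up render times for a slower device.
REFRESH_HZ = 90
frame_times_ms = [8.0, 10.5, 12.0, 15.0]

budget = frame_budget_ms(REFRESH_HZ)
missed = count_missed_frames(frame_times_ms, REFRESH_HZ)
print(f"budget per frame: {budget:.1f} ms, missed frames: {missed}")
```

A pre-rendered video sidesteps this: each frame can take minutes to render on my machine, and the viewing device only has to decode and display it.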

Rendering quality can be higher with 3D videos. Related to performance, you can spend as much time as you need to render out high quality video from a 3D scene. You don’t need to play back animations in real time. This can significantly improve the quality of the final image.

3D video streaming is easier. If you want to play back a 3D scene in real time, you either have to download all the assets in advance, or you risk getting to a scene where not all of the assets are available yet. Think of the pauses you get in a video game when you enter a new level.

Lighting and color consistency is easier with 3D videos. The creator of videos has complete control over lighting and color grading. Delivering the same 3D model content to different rendering engines normally results in a different look and feel to the scene. So you either need to ship the rendering engine with the content (like a video game does), or all client side rendering engines need to behave the same (which is not the case today).

So are there any benefits of 3D scenes over 3D video? Yes. If you are sitting watching a movie, a 3D scene does “feel better” if you move slightly sideways. It is more immersive. 3D scenes are just harder to create.

So for now, even though I use OpenUSD for modeling my 3D scenes, and the Apple Vision Pro supports OpenUSD natively, I still plan to render scenes to 3D video rather than ship them as native OpenUSD scenes.

The ultimate answer may be more along the lines of NeRFs (neural radiance fields) or Gaussian Splatting. These are techniques for capturing a 3D scene in a way that allows minor head movement. Move too far, and the 3D scene illusion is broken. But minor movements do feel better. Interesting times ahead for sure!

VR headsets like the Vision Pro and Quest 3 can be used in other ways for video content creation. Modern devices can capture your body movements and facial expressions. Control your own 3D characters by acting out the roles of characters (capturing body movements and applying them to 3D characters in your show). A solo creator? Record the first character, then play back the recording while capturing a second character, allowing them to more naturally interact.

