OpenUSD on the Web – October ’23 Musings

TL;DR: In this post I explore the pros and cons of using OpenUSD directly in web applications.

What is OpenUSD?

“Developed by Pixar, Universal Scene Description (USD) is the first open-source software that can robustly and scalably interchange 3D scenes that may be composed of many different assets, sources, and animations, while fostering highly collaborative workflows.” — https://www.pixar.com/usd

I personally don’t like the focus of this definition, as it is the content that matters to me, not the software. And content is stored in files. For now, however, the software is the de facto definition of the semantics of the file format, although the Alliance for OpenUSD is working to address this issue. I think the distinction between software and file format is particularly important for web delivery of OpenUSD content.

So, if you are not familiar with OpenUSD, here is my quick overview.

  • A single OpenUSD file contains a tree of “prims” (short for “primitive elements”), such as geometry meshes and materials. A single file could hold part of a 3D model (a “sublayer”), a complete 3D model, or a scene assembled from many models. There are common practices for organizing file contents, but there is a lot of flexibility.
  • Each prim has a “schema type”, where the type defines how to interpret properties of the prim. For example, the Xform (transform) schema type defines properties for position, rotation, and scale of a prim and its descendants; the Mesh schema type defines how to encode the geometry of 3D models.
  • New schema types can be defined through a standardization effort, to make content easier to exchange as new uses emerge. For example, there is an existing schema for skeletal animation, but it could be extended to standardize how to associate default idle, walk, and sit animations with a character (if enough people would find such standardization useful).
  • A scene can be assembled from a set of files, either by having a prim “reference” a prim from another file, or by layering files so that the uppermost layer overrides lower layers.
  • The same concepts of files, prims, properties, and metadata are used universally all the way up and down the stack.
  • OpenUSD defines two file encodings: USDC (.usdc), which is a binary format that is efficient to process, and USDA (.usda), which is a text format. Files with a .usd file extension can be encoded either way.
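Because a .usd file can use either encoding, a loader has to look at the first few bytes to decide how to read it. Here is a minimal sketch, assuming the usual file signatures: binary files start with the ASCII text "PXR-USDC" and text files start with "#usda". The helper names (`sniffUsdEncoding`, `asciiBytes`) are my own, not part of any OpenUSD API.

```typescript
// Hypothetical helper: guess the encoding of a .usd file from its leading bytes.
type UsdEncoding = "usdc" | "usda" | "unknown";

// Convert an ASCII string to bytes (only used to build the demo input below).
function asciiBytes(text: string): Uint8Array {
  const out = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) out[i] = text.charCodeAt(i);
  return out;
}

function startsWithAscii(bytes: Uint8Array, prefix: string): boolean {
  if (bytes.length < prefix.length) return false;
  for (let i = 0; i < prefix.length; i++) {
    if (bytes[i] !== prefix.charCodeAt(i)) return false;
  }
  return true;
}

function sniffUsdEncoding(bytes: Uint8Array): UsdEncoding {
  if (startsWithAscii(bytes, "PXR-USDC")) return "usdc"; // binary encoding
  if (startsWithAscii(bytes, "#usda")) return "usda"; // text encoding
  return "unknown";
}

const textLayer = asciiBytes('#usda 1.0\ndef Xform "World"\n{\n}\n');
console.log(sniffUsdEncoding(textLayer)); // usda
```

A web loader could run a check like this on the first bytes of a response before choosing a text or binary parser.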

The end result is that it is common for projects to consist of multiple files, combined in different ways. For example, in an animated movie, a character file may be shared between different camera shot files. It is also common for a single character model to consist of multiple files by, say, splitting meshes, materials, and animation rigging into separate files. This can help larger organizations manage projects where different teams are responsible for different parts of a character. Put each character in a separate directory, then the rigging team can make changes to rigging.usd files with confidence their changes won’t collide with changes made by the materials team.

Do we need yet another file format?

OpenUSD is not a new file format, but it is gaining popularity, with over 50 tools now supporting import/export of OpenUSD content. OpenUSD was originally designed by Pixar for their internal projects, but has since been selected by Apple and NVIDIA for their 3D efforts, as well as being supported by creative tools companies such as Adobe (e.g., for Substance Painter), Autodesk, and more.

To me, the benefit of OpenUSD is its breadth. It is a single file format that supports all of the different stages of a 3D project. Further, the extensibility of OpenUSD is attractive as groups can form to reach consensus on a new schema, and thus extend the base definition. A recent example of this was defining physics-related properties for models. This means the one file format can be used to support a wide range of applications, from movies to video games, marketing materials, architectural visualizations, physics simulations, robotics, and digital twins.

It is this reuse of content that is a core power of OpenUSD. 3D content can be repurposed by a range of projects. 3D models can be expensive to create, so reusing them across projects is highly beneficial. Worried about inefficient files due to too much content? Set up conventions on how to split content across files without duplication – it all stays in sync.

OpenUSD for the Web

The web is a common delivery platform for content. So is OpenUSD a good format for web delivery? Many have expressed the opinion “no, there are other better formats for publishing 3D content on the web, such as glTF/GLB”. glTF is a JSON file accompanied by binary buffer files and texture (image) files, with GLB being a binary container that bundles the JSON, buffers, and textures into a single file.

glTF files can contain full scenes (with lights and cameras), not just 3D models (with geometry, materials, and textures). But a key difference is that a GLB file cannot reference other external files. It is more suitable as a delivery format than an authoring format. OpenUSD, on the other hand, is a better authoring format as it allows large projects to be maintained as a set of smaller files.
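To make the single-file point concrete: a GLB file is a 12-byte header followed by length-prefixed chunks (a JSON chunk, then an optional binary chunk holding buffers and images). Here is a rough sketch of walking that structure. The names `GlbChunk` and `listGlbChunks` are made up for illustration, and the sketch does only basic validation.

```typescript
// Hypothetical description of one chunk inside a GLB container.
interface GlbChunk {
  type: "JSON" | "BIN" | "other";
  byteOffset: number; // where the chunk data starts
  byteLength: number; // length of the chunk data
}

const GLB_MAGIC = 0x46546c67; // ASCII "glTF", read as a little-endian uint32
const CHUNK_JSON = 0x4e4f534a; // ASCII "JSON"
const CHUNK_BIN = 0x004e4942; // ASCII "BIN" plus a zero byte

function listGlbChunks(buffer: ArrayBuffer): GlbChunk[] {
  const view = new DataView(buffer);
  if (view.byteLength < 12 || view.getUint32(0, true) !== GLB_MAGIC) {
    throw new Error("not a GLB file");
  }
  // Header layout: magic (4 bytes), version (4 bytes), total length (4 bytes).
  const totalLength = Math.min(view.getUint32(8, true), view.byteLength);
  const chunks: GlbChunk[] = [];
  let offset = 12;
  while (offset + 8 <= totalLength) {
    const byteLength = view.getUint32(offset, true);
    const typeCode = view.getUint32(offset + 4, true);
    if (offset + 8 + byteLength > totalLength) {
      throw new Error("chunk runs past the end of the file");
    }
    const type =
      typeCode === CHUNK_JSON ? "JSON" : typeCode === CHUNK_BIN ? "BIN" : "other";
    chunks.push({ type, byteOffset: offset + 8, byteLength });
    offset += 8 + byteLength;
  }
  return chunks;
}
```

Everything a viewer needs is inside that one buffer, which is convenient for delivery but leaves no natural place for a second team's file.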

So why not just convert OpenUSD files to GLB files for viewing on the web? I think this is frequently a completely reasonable approach, but it precludes web apps modifying the original content. It is a “web 1.0 read-only” model, not a “web 2.0 read/write” model. Also, I find that converting between OpenUSD and formats like GLB adds friction and complexity to web applications.

The other mundane problem is that converting between file formats is a pain. You have two sets of files that you have to keep in sync. Any change to a file in one format has to be propagated to the other format. Can it work? Absolutely. There are many web services that can resize images automatically based on the target screen size. But it adds complexity to the overall solution.

So I am personally interested in whether it is feasible to use OpenUSD natively on the web for both presentation and authoring use cases.

WebAssembly (WASM) OpenUSD port

One effort by Autodesk is to port the current OpenUSD C++ code base to WASM. It makes the entire current code base for manipulating OpenUSD files available in the browser.

Benefits:

  • Full OpenUSD functionality is available in a web browser.
  • The same API everywhere.

Challenges:

  • In a quick unscientific test, the WASM build of the OpenUSD code base grows a single browser session by over 500MB. That is problematic on a mobile device.
  • This can lead to a reputation in web development circles of “OpenUSD is too inefficient for the web”. It can take a lot of effort to change opinions once set. (Perception often becomes reality.)

I should note I have already seen posts on the web saying “USD is slow”. Having converted a GLB file to an OpenUSD file, I have not seen such a big difference myself. The core content of geometry (meshes), textures, and so on is fundamentally the same. It appears to me that the binary encoding of USD could frequently be more efficient to process than JSON encodings. On sites such as Sketchfab, I have seen USDZ and GLB files with the same claimed texture resolutions etc. that are similar in size.

To me, the WASM port is a useful stepping stone, but I think it should be clearly presented as part of a broader story of how OpenUSD could be used on the open web. WASM should be one way of implementing OpenUSD on a web page, not the only way. Want the full existing API available, great – use WASM and eat the cost. Otherwise, use a lighter weight JavaScript library developed for web applications.

Delivering USD files over HTTP

Is it reasonable to make USD files available over HTTP? Relative paths can be supported on URLs, so if you organize files appropriately that should not be a problem.
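As a small illustration, the standard URL class already knows how to resolve a relative path against the URL of the file that contains it. The sketch below assumes a loader that anchors relative asset paths to the URL of the referencing layer. The example.com URLs and the `resolveAssetPath` name are hypothetical.

```typescript
// Hypothetical URL of a layer that references other files by relative path.
const robotLayerUrl = "https://example.com/assets/robot/robot.usd";

// Resolve an asset path found inside a layer against that layer's own URL.
function resolveAssetPath(assetPath: string, referencingLayerUrl: string): string {
  return new URL(assetPath, referencingLayerUrl).href;
}

console.log(resolveAssetPath("./materials.usd", robotLayerUrl));
// https://example.com/assets/robot/materials.usd
console.log(resolveAssetPath("../shared/textures/metal.png", robotLayerUrl));
// https://example.com/assets/shared/textures/metal.png
```

So a directory tree of USD files can be published as-is, as long as the references between files are relative.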

OpenUSD projects often consist of many smaller files. Could that be inefficient? Possibly, but with the advent of HTTP/2, which multiplexes many requests over a single connection (and offers server push where that is still supported), having many small files is no longer the same performance bottleneck it once was. In fact, multiple smaller files that can be downloaded in parallel can be faster.
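On the client side, downloading in parallel just means starting all the requests before waiting on any of them. A minimal sketch follows. The `LayerFetcher` type and `fetchLayers` function are hypothetical, and the actual network call is passed in so the same code can be used with any transport.

```typescript
// Hypothetical loader signature: fetch one layer and return its raw bytes.
type LayerFetcher = (url: string) => Promise<ArrayBuffer>;

// Start every request immediately, then wait for all of them together.
async function fetchLayers(
  urls: string[],
  fetchLayer: LayerFetcher
): Promise<Map<string, ArrayBuffer>> {
  const entries = await Promise.all(
    urls.map(async (url) => [url, await fetchLayer(url)] as const)
  );
  return new Map(entries);
}
```

In a browser, the fetcher could be a thin wrapper around the standard fetch() call that returns response.arrayBuffer(). Note that Promise.all rejects as soon as any one download fails, so a real loader would probably want retry or partial-result handling.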

Should USD files just be converted into JSON? I don’t think so, as there are performance considerations. For example, JavaScript in a web browser supports typed arrays (Float64Array, etc.), which have no equivalent in JSON. So while using JavaScript objects to represent OpenUSD makes sense, I think it should leverage the richer set of JavaScript types for efficiency, such as typed arrays, and not be limited to the JSON subset of types.
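A quick sketch of the difference. Mesh points that would be nested arrays of numbers in JSON can instead live in one contiguous typed array, and a slice of that memory can be viewed without copying. I use Float32Array here with made-up sample points; Float64Array works the same way. The `packPoints` helper is hypothetical.

```typescript
// Pack (x, y, z) points into one contiguous block of 32-bit floats.
// Assumes every point has exactly three components.
function packPoints(points: number[][]): Float32Array {
  const packed = new Float32Array(points.length * 3);
  points.forEach((point, i) => packed.set(point, i * 3));
  return packed;
}

// Made-up sample data: three points, as they might appear in a JSON encoding.
const packedPoints = packPoints([
  [0, 0, 0],
  [1.5, 0, 0],
  [0, 2.25, 0],
]);
console.log(packedPoints.byteLength); // 36 (9 floats, 4 bytes each)

// View the second point in place: same memory, no parsing and no copy.
const secondPoint = new Float32Array(packedPoints.buffer, 12, 3);
console.log(Array.from(secondPoint)); // [ 1.5, 0, 0 ]
```

Typed arrays like this can also be handed to WebGL or WebGPU buffer uploads directly, which is a large part of why I would not want to limit a library to JSON types.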

So I think the challenge OpenUSD presents to web application developers is the lack of a small footprint JavaScript library to manipulate USD files. The rich set of rules for combining USD files into a final result is great for authoring, but does add complexity at presentation time. Such rules need to be built into a JavaScript library that has been verified to be consistent with the core OpenUSD rules.
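To give a feel for what such rules look like in code, here is a heavily simplified sketch of just one of them: in a stack of layers, the strongest layer that has an opinion about an attribute wins. Real OpenUSD composition is far richer (references, variants, inherits, and more), and the `Layer` type, file names, and attribute paths here are all hypothetical.

```typescript
// Hypothetical, heavily simplified layer: attribute path -> authored value.
interface Layer {
  name: string;
  opinions: Map<string, unknown>;
}

// layerStack is ordered strongest first; the first layer with an opinion wins.
function resolveValue(layerStack: Layer[], attributePath: string): unknown {
  for (const layer of layerStack) {
    if (layer.opinions.has(attributePath)) {
      return layer.opinions.get(attributePath);
    }
  }
  return undefined; // no layer authored this attribute
}

const modelLayer: Layer = {
  name: "robot.usd",
  opinions: new Map<string, unknown>([
    ["/Robot/Body.displayColor", "grey"],
    ["/Robot/Body.size", 2],
  ]),
};
const shotLayer: Layer = {
  name: "shot_010.usd",
  opinions: new Map<string, unknown>([["/Robot/Body.displayColor", "red"]]),
};

const shotStack = [shotLayer, modelLayer]; // the shot overrides the model
console.log(resolveValue(shotStack, "/Robot/Body.displayColor")); // red
console.log(resolveValue(shotStack, "/Robot/Body.size")); // 2
```

Even this toy version shows why verification matters: get the ordering wrong and a web viewer silently shows different content than every other tool.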

And more than just a small footprint library, such a library should follow JavaScript best practices, such as supporting “tree shaking” of JavaScript code. Tree shaking is where JavaScript web application build tools (like webpack) can discard JavaScript code that is never called, even if it is in the same file as other code that is used. Such principles should be taken into account when building the library as it is the expectation of modern web developers.
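As a rough sketch of what tree-shaking-friendly code looks like: each feature is a separate named export, and nothing runs at module load time. The module and function names below are hypothetical and the parsing is deliberately naive.

```typescript
// usd-lite.ts: a hypothetical library module. Each feature is its own named
// export with no top-level side effects, so a bundler can drop unused ones.

// Cheap check: does this text look like a USDA (text encoded) layer?
export function isUsdaText(text: string): boolean {
  return text.slice(0, 5) === "#usda";
}

// Rough count of "def" statements. A real parser would tokenize properly.
export function countPrimDefs(usdaText: string): number {
  const matches = usdaText.match(/^\s*def\s/gm);
  return matches ? matches.length : 0;
}

// An app that only needs the cheap check would import just that:
//   import { isUsdaText } from "./usd-lite";
// and a tree-shaking bundler could then omit countPrimDefs from its output.
```

The opposite pattern, one big class or default-exported object with every feature attached as a method, tends to defeat tree shaking because the bundler cannot tell which methods are unused.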

Modifying 3D content in web applications

Ignoring the friction of converting backwards and forwards between file formats, OpenUSD offers benefits for authoring content (mentioned above). Different teams can manipulate separate files without fear of overwriting work of other teams. This is especially important for binary files where text merging algorithms are not applicable.

It can also improve efficiency. If a web application modifies OpenUSD content, it can send just the changes back to the main server, either by uploading the individual modified files or by using OpenUSD’s “delta” capability supported by sublayers. Deltas have the advantage of enabling a Google Docs-like collaborative editing environment: capture a delta, send it up to a central shared server, and have it apply the delta to the shared representation of the scene. Support for sublayers does impose efficiency overheads in presenting data, but can be useful to improve the efficiency of modifying content.
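The capture-and-apply idea can be sketched in a few lines. This is a toy model under my own assumptions: scene state is a flat map from attribute path to a primitive value, deletions are ignored, and the function names are hypothetical. In real OpenUSD the delta would be a sparse layer of overrides.

```typescript
// Hypothetical flat scene state: attribute path -> primitive value.
type Opinions = Map<string, unknown>;

// Client side: keep only the values that changed during an editing session.
function captureDelta(before: Opinions, after: Opinions): Opinions {
  const delta: Opinions = new Map();
  after.forEach((value, key) => {
    if (!before.has(key) || before.get(key) !== value) delta.set(key, value);
  });
  return delta;
}

// Server side: layer the delta over the shared state without mutating it.
function applyDelta(shared: Opinions, delta: Opinions): Opinions {
  const merged: Opinions = new Map(shared);
  delta.forEach((value, key) => merged.set(key, value));
  return merged;
}

// One user moves the robot while another user has recolored it on the server.
const sessionStart: Opinions = new Map<string, unknown>([
  ["/Robot.translateX", 0],
  ["/Robot/Body.displayColor", "grey"],
]);
const sessionEnd: Opinions = new Map<string, unknown>([
  ["/Robot.translateX", 2],
  ["/Robot/Body.displayColor", "grey"],
]);
const serverState: Opinions = new Map<string, unknown>([
  ["/Robot.translateX", 0],
  ["/Robot/Body.displayColor", "red"],
]);

const moveDelta = captureDelta(sessionStart, sessionEnd);
const mergedState = applyDelta(serverState, moveDelta);
console.log(moveDelta.size); // 1
console.log(mergedState.get("/Robot.translateX")); // 2
console.log(mergedState.get("/Robot/Body.displayColor")); // red
```

Because the delta only carries the moved attribute, the other user's color change survives, which uploading the whole file would have overwritten.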

It may be worth going further here with conventions for collaborative editing of OpenUSD content, such as is supported by NVIDIA Nucleus. That would mean putting in place standards for sharing and modifying content stored on central servers.

Conclusions

Wrapping up:

  • Converting OpenUSD files to GLB (or similar) for presentation in web 1.0 use cases is a completely valid strategy for sites publishing 3D content on the web.
  • For more advanced web applications where you want to modify 3D content, I think using OpenUSD natively is potentially a better candidate.
  • WASM may be a short term bridge to get such apps up and going initially.
  • But for wider adoption, OpenUSD needs a lighter weight JavaScript library for use in web applications for it to take off on the web.
  • I have not seen anything inherently wrong with using OpenUSD directly in web applications.

Building a JavaScript library will not be a small undertaking, due to the richness of OpenUSD. But I think it is a valuable one if OpenUSD is to take off on the open web.

UPDATE: October 21, 2023

Since posting, I discovered there are a large number of industry players thinking through USD and the web. There was a great “encore performance” of a SIGGRAPH session on USD and glTF that I expect to be published “soon” on https://www.youtube.com/user/khronosgroup (recorded Oct 18th). It covered topics such as the relative power of glTF and USD and how standardized each is. The summary was that USD is a lot richer, but there is still a lot of work to do to improve standardization across toolchains. glTF -> USD conversion generally works better than the other way around, so it may make sense to work in glTF in the browser and export to USD, for now at least.

Also note glTF encoding of some data structures is closer to what the hardware needs to render, hence the comments about efficiency. USD has more options, which is good for authoring but less efficient for rendering. More options in USD however also means that if you don’t support all of USD, you may end up incompatible with some other tool out there (leading to problems like the blue circle above being much smaller than the total scope of USD).

