WebGPU - What is it and why does it matter?
A new, cross-platform API for advanced graphics & ML
On Monday, May 1st, WebGPU officially launched in Chrome. WebGPU is a new browser API for running computations, most notably graphics and AI/ML workloads, on the device's GPU. It is probably the most significant change to browser-based GPU workloads since WebGL launched in 2011, so let's explore what it is, why it matters, and what it might enable.
A Brief History of GPU Computation in the Browser
For most of the past decade, WebGL has been the core library for GPU-based computation in the browser. The vast majority of GPU use cases until the mid-2010s were graphics and rendering, and WebGL reflected that reality: its APIs were all designed around running computations on the GPU and rendering the output as a set of pixels in a <canvas> element.
This was transformational at the time; prior to WebGL, doing any kind of 3D work in the browser required something like Flash, with all the issues that entailed. The rise of WebGL led to a number of new companies being created that otherwise would not have been possible to build. One of the most notable examples is Figma, whose spatial canvas would have been essentially impossible to render in the browser prior to WebGL. Google Earth also falls into this camp.
However, over the past decade, a few key things changed in the world of GPUs. First, workloads outside of graphics became extremely commonplace, most notably ML workloads. These are often referred to as GPGPU workloads, aka "General Purpose Computing on GPUs". Second, GPU hardware, and therefore GPU APIs, changed and evolved dramatically over this period. Native GPU APIs like Microsoft's Direct3D 12, Apple's Metal, and Khronos's Vulkan (used by Android) all launched over this time, with no plans for OpenGL to gain equivalent capabilities.
It became clear that these trends necessitated a rethinking of the API for running GPU computations in the browser, and so a consortium of all the key players (Google, Mozilla, Apple, Intel, Microsoft) came together in 2017 to propose WebGPU. After six years, it is finally going live in Chrome, with Firefox and Safari expected to follow soon.
So, what does it actually do? Broadly, there are a few key things that are interesting about WebGPU relative to WebGL.
Native Support for GPGPU Workloads
WebGPU treats non-graphics workloads, such as ML models, as a first-class citizen through its GPU compute API. While it was theoretically possible to run these sorts of non-graphics workloads in WebGL, it was extremely cumbersome and hacky, essentially requiring you to "pretend" you were doing a rendering task by performing all the computation in pixel shaders and then decoding the rendered pixels back into data.
Having a proper API for non-graphics workloads massively simplifies the challenge of running ML models and other GPU-intensive computations in the browser. It also leads to major performance benefits since you avoid unnecessary and inefficient encoding/decoding of data into graphics-oriented pipelines.
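To make this concrete, below is a minimal sketch of a WebGPU compute pass: a WGSL shader that doubles an array of floats, operating directly on a storage buffer with no rendering machinery involved. This is an illustrative example rather than production code, and it assumes a browser with WebGPU enabled and an enclosing async context.

```js
// Minimal WebGPU compute sketch: double an array of floats on the GPU.
const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();

const shaderModule = device.createShaderModule({
  code: `
    @group(0) @binding(0) var<storage, read_write> data: array<f32>;

    @compute @workgroup_size(64)
    fn main(@builtin(global_invocation_id) id: vec3<u32>) {
      if (id.x >= arrayLength(&data)) { return; } // guard extra invocations
      data[id.x] = data[id.x] * 2.0;
    }
  `,
});

const input = new Float32Array([1, 2, 3, 4]);

// Storage buffer the shader reads and writes, initialized with our data.
const storageBuffer = device.createBuffer({
  size: input.byteLength,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
  mappedAtCreation: true,
});
new Float32Array(storageBuffer.getMappedRange()).set(input);
storageBuffer.unmap();

// Staging buffer used to read results back on the CPU.
const readbackBuffer = device.createBuffer({
  size: input.byteLength,
  usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ,
});

const pipeline = device.createComputePipeline({
  layout: 'auto',
  compute: { module: shaderModule, entryPoint: 'main' },
});

const bindGroup = device.createBindGroup({
  layout: pipeline.getBindGroupLayout(0),
  entries: [{ binding: 0, resource: { buffer: storageBuffer } }],
});

const encoder = device.createCommandEncoder();
const pass = encoder.beginComputePass();
pass.setPipeline(pipeline);
pass.setBindGroup(0, bindGroup);
pass.dispatchWorkgroups(Math.ceil(input.length / 64));
pass.end();
encoder.copyBufferToBuffer(storageBuffer, 0, readbackBuffer, 0, input.byteLength);
device.queue.submit([encoder.finish()]);

await readbackBuffer.mapAsync(GPUMapMode.READ);
console.log(new Float32Array(readbackBuffer.getMappedRange())); // [2, 4, 6, 8]
```

Note that there is no canvas and no pixels anywhere in this code; the same task in WebGL would have required encoding the data into a texture and reading it back out of a framebuffer.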
Performance
WebGPU is in most respects far more performant than WebGL. Below you can find a matrix multiplication benchmark from PixelsCommander as well as a person segmentation benchmark with TensorFlow.js run by Drifting In Space. In each case you observe 2-3x speedups relative to WebGL. Google similarly saw a 3x speedup in a diffusion model when run on WebGPU vs. WebGL.
Performance improvements in WebGPU come from a few key things:
For non-graphics workloads, no unnecessary conversion to and from pixel-space (computations happen directly on buffers)
WebGPU runs asynchronously and does not block the main thread
WebGPU is able to take advantage of many advanced rendering features present in modern GPUs that were previously unavailable in the browser
WebGPU allows many operations that previously had to run on the CPU in the browser to instead run on the GPU (e.g. culling); this is related to the previous point about modern GPU features
Advanced APIs, like Render Bundles, which optimize performance in certain situations via caching and more efficient command replay (see the sketch after this list). Babylon's Snapshot Rendering is built on Render Bundles and in some cases allows 10x faster rendering.
WebGPU generally allows more precise control over GPU operations, meaning you can do far more performance tuning and tweaking
WebGPU front-loads much of the validation and security checking that the GPU driver must perform, pushing it outside of the core drawing loop during rendering
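To illustrate the Render Bundles point, here is a hedged sketch of recording a bundle once and replaying it every frame. The device, context, pipeline, bindGroup, vertexBuffer, and vertexCount variables are assumed to be set up elsewhere, and 'bgra8unorm' is just a common canvas format; this shows the shape of the API rather than a complete program.

```js
// Record draw commands once; validation happens here, not per frame.
const bundleEncoder = device.createRenderBundleEncoder({
  colorFormats: ['bgra8unorm'], // must match the render pass attachments
});
bundleEncoder.setPipeline(pipeline);
bundleEncoder.setBindGroup(0, bindGroup);
bundleEncoder.setVertexBuffer(0, vertexBuffer);
bundleEncoder.draw(vertexCount);
const bundle = bundleEncoder.finish();

// Each frame, replay the pre-validated commands instead of re-encoding them.
function frame() {
  const encoder = device.createCommandEncoder();
  const pass = encoder.beginRenderPass({
    colorAttachments: [{
      view: context.getCurrentTexture().createView(),
      loadOp: 'clear',
      clearValue: { r: 0, g: 0, b: 0, a: 1 },
      storeOp: 'store',
    }],
  });
  pass.executeBundles([bundle]);
  pass.end();
  device.queue.submit([encoder.finish()]);
  requestAnimationFrame(frame);
}
requestAnimationFrame(frame);
```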
Of course, WebGPU doesn’t magically make everything more performant. There are still hard memory limits for what can be done in a browser, and there will be some cases where WebGPU will not be materially faster than WebGL, particularly in graphics use cases that don’t take advantage of newer/modern APIs and have few state transitions or complex re-renders.
Developer Experience
The WebGPU API and abstraction is profoundly different from WebGL. Some of the major differences include:
Statefulness - WebGL requires a developer to reason about a very complex global state object, whereas the WebGPU API has very little global state
Shader Language - A shader is essentially a small function that runs some operations on a GPU. WebGL uses GLSL (OpenGL Shading Language) whereas WebGPU uses WGSL (WebGPU Shading Language), though you can argue that these map to each other somewhat directly.
Debuggability - WebGPU has richer error handling and better error messages
Idioms - In general, WebGPU feels more idiomatic to modern web development. It tends to reuse existing web platform features for common operations like image loading, rather than doing everything its own way.
Abstractions - WebGPU centers on adapters and logical devices (see the sketch after this list). Mozilla has a good overview of this here
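To make the last two points concrete, here is a minimal sketch of the adapter/device model and of WebGPU's error scopes (again assuming a browser with WebGPU enabled, inside an async function). The deliberately broken createBuffer call is there purely to show what a validation error looks like.

```js
// Adapter = a handle to a physical GPU; device = the logical GPU you program.
const adapter = await navigator.gpu.requestAdapter();
if (!adapter) throw new Error('WebGPU is not supported in this browser');
const device = await adapter.requestDevice();

// Error scopes return structured, awaitable errors with readable messages,
// instead of WebGL-style global getError() polling.
device.pushErrorScope('validation');
device.createBuffer({ size: 4, usage: 0 }); // invalid: usage must be nonzero
const error = await device.popErrorScope();
if (error) console.warn(error.message); // a descriptive validation message
```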
Overall, although WebGPU still has a high learning curve (especially given how new it is and the correspondingly thin documentation and ecosystem), most developers I have spoken with strongly prefer it.
Portability
One of the most interesting facets of WebGPU is the fact that it is not designed for just the web. There are essentially three ways to use WebGPU:
A JavaScript API in the browser
A Rust/C++ API compiled to WebAssembly in the browser
A Rust/C++ API in a standalone application outside of the browser
This offers immense portability benefits. For example, a GPU application running on a server that is implemented with WebGPU could be effortlessly ported to the browser with no code changes. Similarly, WebGPU code running on Windows through Direct3D will produce identical results to WebGPU code running on Android via Vulkan.
In this sense, WebGPU is in some respects more of an emerging, cross-platform standard for GPU-based computation than just a browser API. wgpu is an emerging Rust library that implements the WebGPU API and makes it easy to run GPU computations on the server, and Dawn is a similar library out of Google. Because WebGPU sits on top of all the native OS GPU APIs and abstracts them so cleanly, with such portability benefits, it is a compelling abstraction for an engineer to build against. This is very analogous to how WebAssembly is starting to be widely adopted outside of the web.
"While WebGL is just a thin wrapper around OpenGL, WebGPU chose a different approach. It introduces its own abstractions and doesn’t directly mirror any of these native APIs. This is partially because no single API is available on all systems, but also because many concepts (such as extremely low-level memory management) aren’t idiomatic for a web-facing API. Instead, WebGPU was designed to both feel “webby” and to comfortably sit on top of any of the native graphics APIs while abstracting their idiosyncrasies. It’s being standardized in the W3C with all major browser vendors having a seat at the table." - Surma
“WebGPU is in the web browser, and Microsoft and Apple are on the browser standards committee, so they're "bought in", not only does WebGPU work good-as-native on their platforms but anything WebGPU can do will remain perpetually feasible on their OSes regardless of future developer lock-in efforts. (You don't have to worry about feature drift like we're already seeing with MoltenVK.) WebGPU will be on day one (today) available with perfectly equal compatibility for JavaScript/TypeScript (because it was designed for JavaScript in the first place), for C++ (because the Chrome implementation is in C, and it's open source) and for Rust (because the Firefox implementation is in Rust, and it's open source).” - Cohost
WebGPU Today
Today, WebGPU is used and being implemented by a number of different projects. TensorFlow.js uses it to run ML inference more efficiently in the browser. Babylon, a popular JavaScript 3D engine, has full support for WebGPU. PlayCanvas and Three.js are working on support. WebLLM is a new project that runs large language models fully inside the browser.
While it is likely that most existing rendering & graphics libraries will announce support for WebGPU over the coming months & years, what is arguably most interesting about WebGPU are the truly “net new” things it will enable that were generally not possible before. Given that general purpose compute shaders are the most truly new thing about WebGPU, embedded AI in the browser is likely where you’ll see the most novelty.
This is analogous to how, in the early days of WebGL, most people thought it would primarily be used for browser games (a carryover of what Flash was most well known for), but in reality the most interesting use cases ended up being 3D rendering and canvas-based UIs. This podcast discusses this history in more detail. WebLLM is a good example of a project that is only now possible thanks to WebGPU.
Some of the things I suspect will be the most interesting moving forward are:
AI-native design tools in areas like CAD, 3D design, film editing, software engineering, animation, and more (e.g. think things like Bezel & Sequence)
Browser native 3D engines (e.g. Ambient)
Applications that take advantage of WebGPU’s portability - if you can run the same GPU logic on the client, the edge, and the cloud, does this enable novel things to be done? (Analogous to how Motherduck is able to heavily innovate on the concept of a data warehouse since DuckDB can be run everywhere)
More broadly, WebGPU is one more step in the broader direction of the web becoming a powerful operating system in its own right, a concept which Matt Rickard touches on in more depth here. It is increasingly possible to build completely browser-native applications that for the most part run everything locally in the web - see things like RillData & Malloy. You can run a database in the browser, run UDP in the browser, run ML models in the browser, run a data warehouse in the browser…you get the idea. I suspect there is a lot of runway for more applications to be built that take advantage of these concepts, rethinking traditional system architectures.
If you’re building a new startup that is taking advantage of WebGPU, I’d love to chat.