**Ray Tracing: GPU Edition**
[Arman Uguray][]
Draft
!!! WARNING
This is a living document for a work in progress. Please bear in mind that the contents will
change frequently and go through many edits before the final version.
Introduction
====================================================================================================
_Ray Tracing_ is a rendering method in Computer Graphics that simulates the flow of light. It can
faithfully recreate a variety of optical phenomena and can be used to render photorealistic images.
_Path tracing_ is an application of this approach used to compute _Global Illumination_. Its
core idea is to repeatedly trace millions of random rays through the scene and bounce them off
objects based on surface properties. The algorithm is remarkably simple and relatively easy
to implement when applied to a small number of material and geometry types. Peter
Shirley's [_Ray Tracing In One Weekend_][RTIOW] (RTIOW) is a great introduction to building the
foundation for a hobby renderer.
A challenge with path tracing is its high computational cost. Rendering a scene takes a
long time, and this only gets worse as scenes grow more complex. This has historically made path
tracing unsuitable for real-time applications. Fortunately -- like many problems in Computer
Graphics -- the algorithm lends itself very well to parallelism. It is possible to achieve a
significant speedup by distributing the work across many processor cores.
The GPU (Graphics Processing Unit) is a type of processor designed to run the same set of operations
over large amounts of data in parallel. This parallelism has been instrumental to achieving
realistic visuals in real-time applications like video games. GPUs have been traditionally used to
accelerate scanline rasterization but have since become programmable and capable of running
a variety of parallel workloads. Notably, modern GPUs are now equipped with hardware cores dedicated
to ray tracing.
GPUs aren't without limitations. Programming a GPU requires a different approach than a typical CPU
program. Taking full advantage of a GPU often involves careful tuning based on its architecture and
capabilities which can vary widely across vendors and models. Rendering fully path-traced scenes
at real-time rates remains elusive even on the most high-end GPUs. This is an active and vibrant
area of Computer Graphics research.
This book introduces GPU programming by building a simple GPU-accelerated path tracer.
We'll focus on building a renderer that can produce high-quality, correct images using a fairly
simple design. It won't be full-featured and its performance will be limited; however, it will expose
you to several fundamental GPU programming concepts. By the end, the renderer you'll have built can
serve as a great starting point for extensions and experiments with more advanced GPU techniques. We will
avoid most optimizations in favor of simplicity but the renderer will be able to achieve interactive
frame rates on a decent GPU when targeting simple scenes.[^ch1] The accompanying code intentionally
avoids hardware ray tracing APIs that are present on newer GPU models, instead focusing on
implementing the same functionality on a programmable GPU unit using a shading language.
This book follows a similar progression to [_Ray Tracing In One Weekend_][RTIOW]. It covers some of
the same material but I highly recommend completing _RTIOW_ before embarking on building
the GPU version. Doing so will teach you the path tracing algorithm in a much more approachable
way and it will make you appreciate both the advantages and challenges of moving to a GPU-based
architecture.
If you run into any problems with your implementation, have general questions or corrections, or
would like to share your own ideas or work, check out [the GitHub Discussions forum][discussions].
[^ch1]: A BVH-accelerated implementation can render a version of the RTIOW cover scene with ~32,000
spheres, 16 ray bounces per pixel, and a resolution of 2048x1536 on a 2022 _Apple M1 Max_ in 15
milliseconds. The same renderer performs very poorly on a 2019 _Intel UHD Graphics 630_ which takes
more than 200ms to render a single sample.
GPU APIs
--------
Interfacing with a GPU and writing programs for it typically requires the use of a special API. This
interface depends on your operating system and GPU vendor. You often have various options depending
on the capabilities you want. For example, an application that wants to get the most juice out of an
NVIDIA GPU for general purpose computations may choose to target CUDA. A developer who prefers
broad hardware compatibility for a graphical mobile game may choose OpenGL ES or Vulkan. Direct3D
(D3D) is the main graphics API on Microsoft platforms while Metal is the preferred framework on
Apple systems. Vulkan, D3D12, and Metal each provide an API specifically designed to accelerate ray
tracing.
You can implement this book using any API or framework that you prefer, though I generally assume
you are working with a graphics API. In my examples I use an API based on [WebGPU][webgpu],
which I think maps well to all modern graphics APIs. The code
examples should be easy to adapt to those libraries. I avoid using ray tracing APIs (such as
[DXR][dxr] or [Vulkan Ray Tracing][vkrt]) to show you how to implement similar functionality on
your own.
If you're looking to implement this in CUDA, you may also be interested in Roger Allen's
[blog post][rtiow-cuda] titled _Accelerated Ray Tracing in One Weekend in CUDA_.
Example Code
------------
Like _RTIOW_, you'll find code examples throughout the book. I use [Rust][] as
the implementation language but you can choose any language that supports your GPU API of choice. I avoid
most esoteric aspects of Rust to keep the code easily understandable to a large audience. On the few
occasions where I had to resort to a potentially unfamiliar Rust-ism, I provide a C example to add
clarity.
I provide the finished source code for this book on [GitHub][gt-project] as a reference but I
encourage you to type in your own code. I decided to also provide a minimal source template that you
can use as a starting point if you want to follow along in Rust. The template provides a small
amount of setup code for the windowing logic to help get you started.
### A note on Rust, Libraries, and APIs
I chose Rust for this project because of its ease of use and portability. It is also the language
that I tend to be most productive in.
An important aspect of Rust is that a lot of common functionality is provided by libraries outside
its standard library. I tried to avoid external dependencies as much as possible except for the
following:
* I use *[wgpu][]* to interact with the GPU. This is a native graphics API based on
WebGPU. It's portable and allows the example code to run on Vulkan, Metal, Direct3D 11/12, OpenGL
ES 3.1, as well as WebGPU and WebGL via WebAssembly.
wgpu also has [native bindings in other languages](https://github.com/gfx-rs/wgpu-native).
* I use [*winit*](https://docs.rs/winit/latest/winit/) which is a portable windowing library. It's
used to display the rendered image in real-time and to make the example code interactive.
* For ease of Rust development I use [*anyhow*](https://docs.rs/anyhow/latest/anyhow/) and
[*bytemuck*](https://docs.rs/bytemuck/latest/bytemuck/). *anyhow* is a popular error handling
utility and integrates seamlessly. *bytemuck* provides a safe abstraction for the equivalent of
`reinterpret_cast` in C++, which normally requires [`unsafe`][rust-unsafe] Rust. It's used to
bridge CPU data types with their GPU equivalents.
* Lastly, I use [*pollster*](https://docs.rs/pollster/latest/pollster/) to execute asynchronous
wgpu API functions (which are only called from a single place).
[wgpu][] is the most important dependency as it defines how the example code interacts with the
GPU. Every GPU API is different but their abstractions for the general concepts used in this book
are fairly similar. I will highlight these differences occasionally where they matter.
A large portion of the example code runs on the GPU. Every graphics API defines a programming
language -- a so-called **shading language** -- for authoring GPU programs. wgpu is based on WebGPU,
so my GPU code examples are written in the *WebGPU Shading Language* (WGSL)[^ch1.2.1].
I also recommend keeping the following references handy while you're developing:
* wgpu API documentation (version 0.19.1): https://docs.rs/wgpu/0.19.1/wgpu
* WebGPU specification: https://www.w3.org/TR/webgpu
* WGSL specification: https://www.w3.org/TR/WGSL
With all of that out of the way, let's get started!
[^ch1.2.1]: wgpu also supports shaders in the
[SPIR-V](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html) binary format. You could
in theory write your shaders in a shading language that can compile to SPIR-V (such as OpenGL's GLSL
and Direct3D's HLSL) as long as you avoid any language features that can't be expressed in WGSL.
Windowing and GPU Setup
====================================================================================================
The first thing to decide is how you want to view your image. One option is to write the output from
the GPU to a file. I think a more fun option is to display the image inside an application window.
I prefer this approach because it allows you to see your rendering as it resolves over time and it
will allow you to make your application interactive later on. The downside is that it requires a
little bit of wiring.
First, your program needs a way to interact with your operating system to create and manage a
window. Next, you need a way to coordinate your GPU workloads to output a sequence of images at the
right time for your OS to be able to composite them inside the window and send them to your display.
Every operating system with a graphical UI provides a native *windowing API* for this purpose.
Graphics APIs typically define some way to integrate with a windowing system. You'll have various
libraries to choose from depending on your OS and programming language. You mainly need to make sure
that the windowing API or UI toolkit you choose can integrate with your graphics API.
In my examples I use *winit* which is a Rust framework that integrates smoothly with wgpu. I put
together a [project template][gt-template] that sets up the library boilerplate for the window
handling. You're welcome to use it as a starting point.
The setup code isn't a lot, so I'll briefly go over the important pieces in this chapter.
The Event Loop
--------------
The first thing the template does is create a window and associate it with an *event loop*. The OS
sends a message to the application during important "events" that the application should act on,
such as a mouse click or when the window gets resized. Your application can wait for these events
and handle them as they arrive by looping indefinitely:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
use {
anyhow::{Context, Result},
winit::{
event::{Event, WindowEvent},
event_loop::{ControlFlow, EventLoop},
window::{Window, WindowBuilder},
},
};
const WIDTH: u32 = 800;
const HEIGHT: u32 = 600;
fn main() -> Result<()> {
let event_loop = EventLoop::new()?;
let window_size = winit::dpi::PhysicalSize::new(WIDTH, HEIGHT);
let window = WindowBuilder::new()
.with_inner_size(window_size)
.with_resizable(false)
.with_title("GPU Path Tracer".to_string())
.build(&event_loop)?;
// TODO: initialize renderer
event_loop.run(|event, control_handle| {
control_handle.set_control_flow(ControlFlow::Poll);
match event {
Event::WindowEvent { event, .. } => match event {
WindowEvent::CloseRequested => control_handle.exit(),
WindowEvent::RedrawRequested => {
// TODO: draw frame
window.request_redraw();
}
_ => (),
},
_ => (),
}
})?;
Ok(())
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [main-initial]: [main.rs] Creating a window and handling window events]
This code creates a window titled "GPU Path Tracer" and kicks off an event loop.
`event_loop.run()` internally waits for window events and notifies your application by calling the
lambda function that it gets passed as an argument.
The lambda function only handles a few events for now. The most important one is `RedrawRequested`,
which is the signal to render and present a new frame. At the end of that handler we call
`window.request_redraw()` to keep drawing repeatedly -- requesting a redraw triggers another
`RedrawRequested` event, which in turn requests another redraw, and so on until someone closes the
window (which we handle via `CloseRequested`).
Running this code should bring up an empty window like this:
![Figure [empty-window]: Empty Window](../images/img-01-empty-window.png)
GPU and Surface Initialization
------------------------------
The next thing the template does is establish a connection to the GPU and configure a surface. The
surface manages a set of *textures* that allow the GPU to render inside the window.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
async fn connect_to_gpu(window: &Window) -> Result<(wgpu::Device, wgpu::Queue, wgpu::Surface)> {
use wgpu::TextureFormat::{Bgra8Unorm, Rgba8Unorm};
// Create an "instance" of wgpu. This is the entry-point to the API.
let instance = wgpu::Instance::default();
// Create a drawable "surface" that is associated with the window.
let surface = instance.create_surface(window)?;
// Request a GPU that is compatible with the surface. If the system has multiple GPUs then
// pick the high performance one.
let adapter = instance
.request_adapter(&wgpu::RequestAdapterOptions {
power_preference: wgpu::PowerPreference::HighPerformance,
force_fallback_adapter: false,
compatible_surface: Some(&surface),
})
.await
.context("failed to find a compatible adapter")?;
// Connect to the GPU. "device" represents the connection to the GPU and allows us to create
// resources like buffers, textures, and pipelines. "queue" represents the command queue that
// we use to submit commands to the GPU.
let (device, queue) = adapter
.request_device(&wgpu::DeviceDescriptor::default())
.await
.context("failed to connect to the GPU")?;
// Configure the texture memory backing the surface. Our renderer will draw to a surface
// texture every frame.
let caps = surface.get_capabilities(&adapter);
let format = caps
.formats
.into_iter()
.find(|it| matches!(it, Rgba8Unorm | Bgra8Unorm))
.context("could not find preferred texture format (Rgba8Unorm or Bgra8Unorm)")?;
let size = window.inner_size();
let config = wgpu::SurfaceConfiguration {
usage: wgpu::TextureUsages::RENDER_ATTACHMENT,
format,
width: size.width,
height: size.height,
present_mode: wgpu::PresentMode::AutoVsync,
alpha_mode: caps.alpha_modes[0],
view_formats: vec![],
desired_maximum_frame_latency: 3,
};
surface.configure(&device, &config);
Ok((device, queue, surface))
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [main-initial]: [main.rs] The connect_to_gpu function]
The code that sets this all up is a bit wordy. I'll quickly go over the important bits:
1. The first ~20 lines request a connection to a GPU that is compatible with the window.
`wgpu::PowerPreference::HighPerformance` is a hint to the API that we want the higher-powered GPU
if the system has more than one available.
2. The rest of the function configures the dimensions, pixel format, and presentation mode of the
surface. `Rgba8Unorm` and `Bgra8Unorm` are common pixel formats that store each color component
(red, green, blue, and alpha) as an 8-bit unsigned integer. The "unorm" part stands for "unsigned
normalized", which means that our rendering code can represent the component values as a real
number in the range `[0.0, 1.0]`. We set the size to simply span the entire window.
The bit about `wgpu::PresentMode::AutoVsync` tells the surface to synchronize the presentation of
each frame with the display's refresh rate. The surface will manage an internal queue of textures
for us and we will render to them as they become available. This prevents a visual artifact known
as "tearing" (which can happen when frames get presented faster than the display refresh rate) by
setting up the renderer to be *v-sync locked*. We will discuss some of the implications of this
later on.
The last bit that I'll highlight here is `wgpu::TextureUsages::RENDER_ATTACHMENT`. This just
indicates that we are going to use the GPU's rendering function to draw directly into the surface
textures.
After setting all this up the function returns 3 objects: a `wgpu::Device` that represents the
connection to the GPU, a `wgpu::Queue` which we'll use to issue commands to the GPU, and a
`wgpu::Surface` that we'll use to present frames to the window. We will talk a lot about the first
two when we start putting together our renderer in the next chapter.
You may have noticed that the function declaration begins with `async`. This marks the function as
*asynchronous* which means that it doesn't return its result immediately. This is only necessary
because the API functions that we invoke (`wgpu::Instance::request_adapter` and
`wgpu::Adapter::request_device`) are asynchronous functions. The `.await` keyword is syntactic sugar
that makes the asynchronous calls appear like regular (synchronous) function calls. What happens
under the hood is somewhat complex but I wouldn't worry about this too much since this is the one
and only bit of asynchronous code that we will encounter. If you want to learn more about it, I
recommend checking out the [Rust Async Book](https://rust-lang.github.io/async-book/).
### Completing Setup
Finally, the `main` function needs a couple of updates: first, we make it `async` so that we can
"await" on `connect_to_gpu`. Technically the `main` function of a program cannot be async and
running an async function requires some additional utilities. There are various alternatives but I
chose to use a library called `pollster`. The library provides a special macro (called `main`) that
takes care of everything. Again, this is the only asynchronous code that we'll encounter so don't
worry about what it does.
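If you're curious, the macro essentially boils down to driving the async function to completion with
`pollster::block_on`. Roughly something like this (a sketch, not the exact expansion):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
// Roughly equivalent to annotating an async main with #[pollster::main]:
fn main() -> Result<()> {
    pollster::block_on(async_main())
}

async fn async_main() -> Result<()> {
    // ...everything our async main does...
    Ok(())
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~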
The second change to the main function is where it handles the `RedrawRequested` event. For every
new frame, we first request the next available texture from the surface that we just created. The
queue has a limited number of textures available. If the CPU outpaces the GPU (i.e. the GPU takes
longer than a display refresh cycle to finish its tasks), then calling
`surface.get_current_texture()` can block until a texture becomes available.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
#[pollster::main]
async fn main() -> Result<()> {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
let event_loop = EventLoop::new()?;
let window_size = winit::dpi::PhysicalSize::new(WIDTH, HEIGHT);
let window = WindowBuilder::new()
.with_inner_size(window_size)
.with_resizable(false)
.with_title("GPU Path Tracer".to_string())
.build(&event_loop)?;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
let (device, queue, surface) = connect_to_gpu(&window).await?;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
// TODO: initialize renderer
event_loop.run(|event, control_handle| {
control_handle.set_control_flow(ControlFlow::Poll);
match event {
Event::WindowEvent { event, .. } => match event {
WindowEvent::CloseRequested => control_handle.exit(),
WindowEvent::RedrawRequested => {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
// Wait for the next available frame buffer.
let frame: wgpu::SurfaceTexture = surface
.get_current_texture()
.expect("failed to get current texture");
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
// TODO: draw frame
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
frame.present();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
window.request_redraw();
}
_ => (),
},
_ => (),
}
})?;
Ok(())
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [main-setup-complete]: [main.rs] Putting together the initial main function]
Once a frame texture becomes available, the example issues a request to display it as soon as
possible by calling `frame.present()`. All of our rendering work will be scheduled before this call.
That was a lot of boilerplate -- this is sometimes necessary to interact with OS resources. With all
of this in place, we can start building a real-time renderer.
### A note on error handling in Rust
If you're new to Rust, some of the patterns above may look unfamiliar. One of these is error
handling using the `Result` type. I use this pattern frequently enough that it's worth a quick
explainer.
A `Result` is a variant type that can hold either a success (`Ok`) value or an error (`Err`) value.
The types of the `Ok` and `Err` variants are generic:
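~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
// The (simplified) definition from the standard library:
pub enum Result<T, E> {
    Ok(T),
    Err(E),
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~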
`T` and `E` can be any type. It's common for a library to define its own error types to represent
various error conditions.
The idea is that a function returns a `Result` if it has a failure mode. A caller must check the
status of the `Result` to unpack the return value or recover from an error.
In a C program, a common way to handle an error is to return early from the calling function and
perhaps return an entirely new error. For example:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
#include <stdbool.h>

// Returns false on failure and writes its result through `out_result` on success.
bool function_with_result(Foo* out_result);

int main() {
    Foo foo;
    if (!function_with_result(&foo)) {
        return -1;
    }
    // ...do something with `foo`...
    return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Rust provides the `?` operator to automatically unpack a `Result` and return early if it holds an
error. A Rust version of the C program above could be written like this:
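~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
// `Error` stands in for whatever error type `function_with_result` returns.
fn caller() -> Result<(), Error> {
    let foo: Foo = function_with_result()?;
    // ...do something with `foo`...
    Ok(())
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~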
If `function_with_result()` returns an error, the `?` operator will cause `caller` to return and
propagate the error value. This works as long as `caller` and `function_with_result` either return
the same error type or types with a known conversion. There are various other ways to handle an
error -- for instance, you can handle both cases explicitly with a `match`, or simply panic on
failure with `unwrap`/`expect`:
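~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
// A sketch using the same hypothetical `function_with_result` as above.
fn caller() {
    // Handle both cases explicitly with a `match`:
    match function_with_result() {
        Ok(foo) => { /* ...do something with `foo`... */ }
        Err(error) => eprintln!("failed to get foo: {error}"),
    }

    // Or panic on failure, which is fine for quick experiments:
    let foo = function_with_result().expect("failed to get foo");
    // ...do something with `foo`...
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~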
I like to keep things simple in my code examples and use the `?` operator. Instead of defining
custom error types and conversions, I use a catch-all `Error` type from a library called *anyhow*.
You'll often see the examples include `anyhow::Result<T>` (an alias for `Result<T, anyhow::Error>`)
and `anyhow::Context`. The latter is a useful trait for adding an error message while converting to
an `anyhow::Error`:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
fn caller() -> anyhow::Result<()> {
let foo: Foo = function_with_result().context("failed to get foo")?;
// ...do something with `foo`...
Ok(())
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can read more about the `Result` type in [its module
documentation](https://doc.rust-lang.org/std/result/index.html).
Drawing Pixels
====================================================================================================
At this stage, we have code that brings up a window, connects to the GPU, and sets up a queue of
textures that is synchronized with the display. In Computer Graphics, the term "texture" is
generally used in the context of *texture mapping*, which is a technique to apply detail to geometry
using data stored in memory. A very common application is to map color data from the pixels of a 2D
image onto the surface of a 3D polygon.
Texture mapping is so essential to real-time graphics that all modern GPUs are equipped with
specialized hardware to speed up texture operations. It's not uncommon for a modern video game to
use texture assets that take up hundreds of megabytes. Processing all of that data involves a lot
of memory traffic which is a big performance bottleneck for a GPU. This is why GPUs come with
dedicated texture memory caches, sampling hardware, compression schemes and other features to
improve texture data throughput.
We are going to use the texture hardware to store the output of our renderer. In wgpu, a *texture
object* represents texture memory that can be used in three main ways: texture mapping, shader
storage, or as a *render target*[^ch3-cit1]. A surface texture is a special kind of texture that can
only be used as a render target.
Not all native APIs have this restriction. For instance, both Metal and Vulkan allow their version
of a surface texture -- a *drawable* texture in Metal or a *swapchain image* in Vulkan -- to be
configured for other usages, though this sometimes comes with a warning about impaired performance
and is not guaranteed to be supported by the hardware.
wgpu doesn't allow any other usage for surface textures, so I'm going to start by implementing a render pass. This is
a fundamental and very widely used function of the GPU, so it's worth learning about.
[^ch3-cit1]: See [`wgpu::TextureUsages`](https://docs.rs/wgpu/0.17.0/wgpu/struct.TextureUsages.html).
The render Module
---------------------
I like to separate the rendering code from all the windowing code, so I'll start by creating a file
named `render.rs`. Every Rust file makes up a *module* (with the same name) which serves as a
namespace for all functions and types that are declared in it. Here I'll add a data structure called
`PathTracer`. This will hold all GPU resources and eventually implement our path tracing algorithm:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
pub struct PathTracer {
device: wgpu::Device,
queue: wgpu::Queue,
}
impl PathTracer {
pub fn new(device: wgpu::Device, queue: wgpu::Queue) -> PathTracer {
device.on_uncaptured_error(Box::new(|error| {
panic!("Aborting due to an error: {}", error);
}));
// TODO: initialize GPU resources
PathTracer {
device,
queue,
}
}
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [render-initial]: [render.rs] The PathTracer structure]
We start out with an associated function called `PathTracer::new` which will serve as the
constructor and eventually initialize all GPU resources. The `PathTracer` takes ownership of the
`wgpu::Device` and `wgpu::Queue` that we created earlier and it will hold on to them for the rest of
the application's life.
`wgpu::Device` represents a connection to the GPU. It is responsible for creating resources like
texture, buffer, and pipeline objects. It also defines some methods for error handling.
The first thing I do is set up an "uncaptured error" handler. If you look at the [declarations
](https://docs.rs/wgpu/0.17.0/wgpu/struct.Device.html) of resource creation methods you'll notice
that none of them return a `Result`. This doesn't mean that they always succeed; as a matter of fact,
all of these operations can fail. This is because wgpu closely mirrors the WebGPU API, which uses a
concept called *error scopes* to detect and respond to errors.
Whenever there's an error that I don't handle using an error scope it will trigger the uncaptured
error handler, which will print out an error message and abort the program[^ch3.1-cit1]. For now,
I won't set up any error scopes in `PathTracer::new` and I'll abort the program if the API fails to
create the initial resources.
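For illustration, here's a rough sketch of what an error scope looks like in wgpu (the helper function
and the buffer it creates are just placeholders -- we won't actually need error scopes yet):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
// A sketch of wrapping a resource creation call in a validation error scope.
fn create_buffer_checked(device: &wgpu::Device) -> Option<wgpu::Buffer> {
    device.push_error_scope(wgpu::ErrorFilter::Validation);
    let buffer = device.create_buffer(&wgpu::BufferDescriptor {
        label: Some("example buffer"),
        size: 1024,
        usage: wgpu::BufferUsages::STORAGE,
        mapped_at_creation: false,
    });
    // pop_error_scope resolves to Some(error) if anything inside the scope failed.
    if let Some(error) = pollster::block_on(device.pop_error_scope()) {
        eprintln!("failed to create the buffer: {error}");
        return None;
    }
    Some(buffer)
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~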
Next, let's declare the `render` module and initialize a `PathTracer` in the `main` function:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
...
};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Rust highlight
mod render;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
const WIDTH: u32 = 800;
const HEIGHT: u32 = 600;
#[pollster::main]
async fn main() -> Result<()> {
let event_loop = EventLoop::new()?;
let window_size = winit::dpi::PhysicalSize::new(WIDTH, HEIGHT);
let window = WindowBuilder::new()
.with_inner_size(window_size)
.with_resizable(false)
.with_title("GPU Path Tracer".to_string())
.build(&event_loop)?;
let (device, queue, surface) = connect_to_gpu(&window).await?;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Rust highlight
let renderer = render::PathTracer::new(device, queue);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
event_loop.run(move |event, control_handle| {
control_handle.set_control_flow(ControlFlow::Poll);
match event {
Event::WindowEvent { event, .. } => match event {
WindowEvent::CloseRequested => control_handle.exit(),
WindowEvent::RedrawRequested => {
// Wait for the next available frame buffer.
let frame: wgpu::SurfaceTexture = surface
.get_current_texture()
.expect("failed to get current texture");
// TODO: draw frame
frame.present();
window.request_redraw();
}
_ => (),
},
_ => (),
}
})?;
Ok(())
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [main-renderer-init]: [main.rs] Initializing a Renderer]
Now that we have the skeleton in place, it's time to paint some pixels on the screen.
[^ch3.1-cit1]: This is actually the default behavior so I didn't really need to call
`on_uncaptured_error`.
Display Pipeline
----------------
Before setting up the render pass let's first talk about how it works. Traditionally, graphics
systems have been modeled after an abstraction called the *graphics pipeline*.[#Hughes13] At a
very high level, the input to the pipeline is a mathematical model that describes what to draw
-- such as geometry, materials, and light -- and the output is a 2D grid of pixels. This
transformation is processed in a series of standard *pipeline stages* which form the basis of the
rendering abstraction provided by GPUs and graphics APIs. wgpu uses the term *render pipeline* which
is what I'll use going forward.
The input to the render pipeline is a polygon stream represented by points in 3D space and their
associated data. The polygons are described in terms of geometric primitives (points, lines, and
triangles) which consist of *vertices*. The *vertex stage* transforms each vertex from the input
stream into a 2D coordinate space that corresponds to the viewport. After some additional processing
(such as clipping and culling) the assembled primitives are passed on to the *rasterizer*.
The rasterizer applies a process called scan conversion to determine the pixels that are covered by
each primitive and breaks them up into per-pixel *fragments*. The output of the vertex
stage (the vertex positions, texture coordinates, vertex colors, etc) gets interpolated between the
vertices of the primitive and the interpolated values get assigned to each fragment. Fragments are
then passed on to the *fragment stage* which computes an output (such as the pixel or sample color)
for each fragment. Shading techniques such as texture mapping and lighting are usually performed
in this stage. The output then goes through several other operations before getting written to the
render target as pixels.[^ch3-footnote1]
![Figure [render-pipeline]: Vertex and Fragment stages of the render pipeline
](../images/fig-01-render-pipeline.svg)
What I just described is very much a data pipeline: a data stream goes through a series of
transformations in stages. The input to each stage is defined in terms of smaller elements (e.g.
vertices and pixel-fragments) that can be processed in parallel. This is the fundamental principle
behind the GPU.
Early commercial GPUs implemented the graphics pipeline entirely in fixed-function hardware. Modern
GPUs still use fixed-function stages (and at much greater data rates) but virtually all of them
allow you to program the vertex and fragment stages with custom logic using *shader programs*.
[^ch3-footnote1]: I glossed over a few pipeline stages (such as geometry and tessellation) and
important steps like multi-sampling, blending, and the scissor/depth/stencil tests. These play an
important role in many real-time graphics applications but we won't make use of them in our path
tracer.
### Compiling Shaders
Let's put together a render pipeline that draws a red triangle. We'll define a vertex shader that
outputs the 3 corner vertices and a fragment shader that outputs a solid color. We'll write
these shaders in the WebGPU Shading Language (WGSL).
Go ahead and create a file called `shaders.wgsl` to host all of our WGSL code (I put it next to the
Rust files under `src/`). Before we can run this code we need to compile it into a form that the
GPU can execute. We start by creating a *shader module*:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
pub struct PathTracer {
device: wgpu::Device,
queue: wgpu::Queue,
}
impl PathTracer {
pub fn new(device: wgpu::Device, queue: wgpu::Queue) -> PathTracer {
device.on_uncaptured_error(Box::new(|error| {
panic!("Aborting due to an error: {}", error);
}));
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
let shader_module = compile_shader_module(&device);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust delete
// TODO: initialize GPU resources
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
PathTracer {
device,
queue,
}
}
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
fn compile_shader_module(device: &wgpu::Device) -> wgpu::ShaderModule {
use std::borrow::Cow;
let code = include_str!(concat!(env!("CARGO_MANIFEST_DIR"), "/src/shaders.wgsl"));
device.create_shader_module(wgpu::ShaderModuleDescriptor {
label: None,
source: wgpu::ShaderSource::Wgsl(Cow::Borrowed(code)),
})
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [render-shader-module]: [render.rs] Creating the shader module]
The `compile_shader_module` function loads the file we just created into a string using the
`include_str!` macro. This bundles the contents of `shaders.wgsl` into the program binary at build
time. This is followed by a call to `wgpu::Device::create_shader_module` to compile the WGSL source
code.[^ch3-footnote2]
Let's define the vertex and fragment functions, which I'm calling `display_vs` and `display_fs`.
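Here's a minimal version of the two entry points (the exact triangle coordinates below are just an
example -- any three points inside NDC will do):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
@vertex fn display_vs(@builtin(vertex_index) vid: u32) -> @builtin(position) vec4f {
    // Example triangle corners in NDC, listed in counter-clockwise order.
    var vertices = array<vec2f, 3>(
        vec2f(-0.5, -0.5),
        vec2f(0.5, -0.5),
        vec2f(0.0, 0.5)
    );
    return vec4f(vertices[vid], 0.0, 1.0);
}

@fragment fn display_fs() -> @location(0) vec4f {
    return vec4f(1.0, 0.0, 0.0, 1.0);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [display-shaders]: [shaders.wgsl] The display_vs and display_fs entry points]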
I'm using the "vs" and "fs" suffixes as shorthand for "vertex stage" and "fragment stage". Together,
these two functions form our "display pipeline" (the "display" part will become more clear later).
The `@vertex` and `@fragment` annotations are WGSL keywords that mark these two functions as entry
points to each pipeline stage program.
Since graphics workloads generally involve a high amount of linear algebra, GPUs natively support
SIMD operations over vectors and matrices. All shading languages define built-in types for vectors
and matrices of up to 4 dimensions (4x4 in the case of matrices). The `vec4f` and `vec2f` types that
are in the code represent 4D and 2D vectors of floating point numbers.
`display_vs` returns the vertex position as a `vec4f`. This position is defined relative to a
coordinate space called the *Normalized Device Coordinate Space*. In NDC, the center of the viewport
marks the origin $(0, 0, 0)$. The $x$-axis spans horizontally from $(-1, 0, 0)$ on the left edge of
the viewport to $(1, 0, 0)$ on the right edge while the $y$-axis spans vertically from $(0,-1,0)$ at
the bottom to $(0,1,0)$ at the top. The $z$-axis is directly perpendicular to the viewport, going
*through* the origin.
![Figure [ndc]: Our triangle in Normalized Device Coordinates](../images/fig-02-ndc.svg)
`display_vs` takes a *vertex index* as its parameter. The vertex function gets invoked for every
input vertex across different GPU threads. `vid` identifies the individual vertex that is assigned
to the *invocation*. The number of vertices and where they exist within the topology of the input
geometry is up to us to define. Since we want to draw a triangle, we'll later issue a *draw call*
with 3 vertices and `display_vs` will get invoked exactly 3 times with vertex indices ranging from
$0$ to $2$.
Since our 2D triangle is viewport-aligned, we can set the $z$ coordinate to $0$. The 4th
coordinate is known as a *homogeneous coordinate* used for projective transformations. Don't worry
about this coordinate for now -- just know that for a vector that represents a *position* we set
this coordinate to $1$. We can declare the $x$ and $y$ coordinates for the 3 vertices as an array
of `vec2f` and simply return the element that corresponds to `vid`. I enumerate the vertices in
counter-clockwise order which matches the winding order we'll specify when we create the pipeline.
`display_fs` takes no inputs and returns a `vec4f` that represents the fragment color. The 4
dimensions represent the red, green, blue, and alpha channels of the destination pixel. `display_fs`
gets invoked for all pixel fragments that result from our triangle and the invocations are executed
in parallel across many GPU threads, just like the vertex function. To paint the triangle solid red,
we simply return `vec4f(1., 0., 0., 1.)` for all fragments.
[^ch3-footnote2]: The `Cow::Borrowed` bit is a Rust idiom that creates a "copy-on-write borrow".
This allows the API to take ownership of the WGSL string if necessary. This is not really an
important detail for us.
### Creating the Pipeline Object
Before we can run the shaders, we need to assemble them into a *pipeline state object*. This is
where we specify the data layout of the render pipeline and link the shaders into a runnable binary
program. Let's add a new function called `create_display_pipeline`:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
...
fn compile_shader_module(device: &wgpu::Device) -> wgpu::ShaderModule {
use std::borrow::Cow;
let code = include_str!(concat!(env!("CARGO_MANIFEST_DIR"), "/src/shaders.wgsl"));
device.create_shader_module(wgpu::ShaderModuleDescriptor {
label: None,
source: wgpu::ShaderSource::Wgsl(Cow::Borrowed(code)),
})
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
fn create_display_pipeline(
device: &wgpu::Device,
shader_module: &wgpu::ShaderModule,
) -> wgpu::RenderPipeline {
device.create_render_pipeline(&wgpu::RenderPipelineDescriptor {
label: Some("display"),
layout: None,
primitive: wgpu::PrimitiveState {
topology: wgpu::PrimitiveTopology::TriangleList,
front_face: wgpu::FrontFace::Ccw,
polygon_mode: wgpu::PolygonMode::Fill,
..Default::default()
},
vertex: wgpu::VertexState {
module: shader_module,
entry_point: Some("display_vs"),
buffers: &[],
},
fragment: Some(wgpu::FragmentState {
module: shader_module,
entry_point: Some("display_fs"),
targets: &[Some(wgpu::ColorTargetState {
format: wgpu::TextureFormat::Bgra8Unorm,
blend: None,
write_mask: wgpu::ColorWrites::ALL,
})],
}),
depth_stencil: None,
multisample: wgpu::MultisampleState::default(),
multiview: None,
cache: None,
})
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [display-pipeline]: [render.rs] The `create_display_pipeline` function]
This code describes a render pipeline that draws a list of triangle primitives. The vertex winding
order is set to counter-clockwise which defines the orientation of the triangle's *front
face*.[^ch3-footnote3]
We request that the interior of each polygon be completely filled (rather than drawing just the
edges or vertices). We specify that `display_vs` is the main function of the vertex stage and that
we're not providing any vertex data from the CPU (since we declared our vertices in the shader
code). Similarly, we set up a fragment stage with `display_fs` as the entry point and a single
color target.[^ch3-footnote4] I set the pixel format of the render target to `Bgra8Unorm` since
that happens to be widely supported on all of my devices. What's important is that you assign a
pixel format that matches the surface configuration in your windowing setup and that your GPU device
supports this as a *render attachment* format.
Let's instantiate the pipeline and store it in the `PathTracer` object. Pipeline creation is
expensive so we want to create the pipeline state object once and hold on to it. We'll reference it
later when drawing a frame:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
pub struct PathTracer {
device: wgpu::Device,
queue: wgpu::Queue,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
display_pipeline: wgpu::RenderPipeline,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
}
impl PathTracer {
pub fn new(device: wgpu::Device, queue: wgpu::Queue) -> PathTracer {
device.on_uncaptured_error(Box::new(|error| {
panic!("Aborting due to an error: {}", error);
}));
let shader_module = compile_shader_module(&device);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
let display_pipeline = create_display_pipeline(&device, &shader_module);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
PathTracer {
device,
queue,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
display_pipeline,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
}
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [display-pipeline-init]: [render.rs] Initializing the display pipeline]
[^ch3-footnote3]: The GPU can automatically discard triangles that are oriented away from the
viewport. This is a feature called *back face culling* which our code doesn't make use of.
[^ch3-footnote4]: The `fragment` field of `wgpu::RenderPipelineDescriptor` is optional
(notice the *Some* in `Some(wgpu::FragmentState {...})` ?). A render pipeline that only outputs to
the depth or stencil buffers doesn't have to specify a fragment shader or any color attachments. An
example of this is *shadow mapping*: a shadow map is a texture that stores the distances between a
light source and geometry samples from the scene; it can be produced by a depth-only render-pass
from the point of view of the light source. The shadow map is later sampled from a render pass from
the camera's point of view to determine whether a rasterized point is visible from the light or in
shadow.
The Render Pass
---------------
We now have the pieces in place to issue a draw command to the GPU. The general abstraction modern
graphics APIs define for this is called a "command buffer" (or "command list" in D3D12). You can
think of the command buffer as a memory location that holds the serialized list of GPU commands
representing the sequence of actions we want the GPU to take. To draw a triangle we'll *encode*
a draw command into the command buffer and then *submit* the command buffer to the GPU for execution.
With wgpu, the encoding is abstracted by an object called `wgpu::CommandEncoder`, which we'll use to
record our draw command. Once we are done, we will call `wgpu::CommandEncoder::finish()` to produce
a finalized `wgpu::CommandBuffer` which we can submit to the GPU via the `wgpu::Queue` that we
created at start up.
Let's add a new `PathTracer` function called `render_frame`. This function will take a texture as
its parameter (our *render target*) and tell the GPU to draw to it using the pipeline object we
created earlier:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
...
impl PathTracer {
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
pub fn render_frame(&self, target: &wgpu::TextureView) {
let mut encoder = self
.device
.create_command_encoder(&wgpu::CommandEncoderDescriptor {
label: Some("render frame"),
});
let mut render_pass = encoder.begin_render_pass(&wgpu::RenderPassDescriptor {
label: Some("display pass"),
color_attachments: &[Some(wgpu::RenderPassColorAttachment {
view: target,
depth_slice: None,
resolve_target: None,
ops: wgpu::Operations {
load: wgpu::LoadOp::Clear(wgpu::Color::BLACK),
store: wgpu::StoreOp::Store,
},
})],
..Default::default()
});
render_pass.set_pipeline(&self.display_pipeline);
// Draw 1 instance of a polygon with 3 vertices.
render_pass.draw(0..3, 0..1);
// End the render pass by consuming the object.
drop(render_pass);
let command_buffer = encoder.finish();
self.queue.submit(Some(command_buffer));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [render_frame-stub]: [render.rs] The `render_frame` function]
`target` here is defined as a `wgpu::TextureView`. wgpu makes the distinction between a texture
resource (represented by `wgpu::Texture`) and how that texture's memory is accessed by a pipeline
(which is represented by the *view* into the texture). When we want to bind a texture we first
create a view with the right properties. In this case we'll assume that the caller already created
a `TextureView` of the render target.
The first thing we do in `render_frame` is create a command encoder. We then tell the encoder to
begin a *render pass*. There are 4 important API calls we make to encode the draw command:
1. Create a `wgpu::RenderPass`. We tell it to store the colors that are output by the render
pipeline to the `target` texture by assigning it as the only color attachment. We also tell it
to clear all pixels of the target to black (i.e. $(0, 0, 0, 1)$ in RGBA) before drawing to it.
2. Assign the render pipeline.
3. Record a single draw with 3 vertices.
4. End the render pass by destroying the `wgpu::RenderPass` object.
We then serialize the command buffer and submit it to the GPU. Finally, let's invoke `render_frame`
from our windowing event loop, using the current surface texture as the render target:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
async fn main() -> Result<()> {
...
event_loop.run(move |event, control_handle| {
...
WindowEvent::RedrawRequested => {
// Wait for the next available frame buffer.
let frame: wgpu::SurfaceTexture = surface
.get_current_texture()
.expect("failed to get current texture");
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust delete
// TODO: draw frame
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
let render_target = frame
.texture
.create_view(&wgpu::TextureViewDescriptor::default());
renderer.render_frame(&render_target);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
frame.present();
window.request_redraw();
}
...
})?;
Ok(())
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [render_frame-call]: [main.rs] Rendering to a surface texture]
Running this code should bring up a window that looks like this:
![Figure [first-triangle]: First Triangle](../images/img-02-first-triangle.png)
Finally drawing something! A single triangle may not look that interesting but you can model highly
complex 3D scenes and geometry by putting many of them together. It takes only a few tweaks to the
render pipeline to shape, animate, and render millions of triangles many times per second.
Full-Screen Quad
----------------
The render pipeline that we just put together plays a rather small role in the overall renderer:
its purpose is to display the output of the path-tracer on the window surface.
The output of our renderer is a 2D rectangular image and I would like it to fill the whole window.
We can achieve this by having the render pipeline draw two right triangles that are adjacent at
their hypotenuse. Remember that the viewport coordinates span the range $[-1, 1]$ in NDC, so
setting the 4 corners of the rectangle to $(-1, 1)$, $(1, 1)$, $(1, -1)$, $(-1, -1)$ should cover
the entire viewport regardless of its dimensions.
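In `display_vs`, this amounts to growing the vertex array to six entries -- two counter-clockwise
triangles assembled from those four corners (a sketch that matches the corners listed above):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
@vertex fn display_vs(@builtin(vertex_index) vid: u32) -> @builtin(position) vec4f {
    // Two counter-clockwise triangles that together cover the whole viewport.
    var vertices = array<vec2f, 6>(
        vec2f(-1., -1.),
        vec2f( 1., -1.),
        vec2f( 1.,  1.),
        vec2f( 1.,  1.),
        vec2f(-1.,  1.),
        vec2f(-1., -1.)
    );
    return vec4f(vertices[vid], 0.0, 1.0);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [full-screen-vertices]: [shaders.wgsl] A vertex array covering the full viewport]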
![Figure [half-screen-quad]: Half-Screen Triangle](../images/img-03-half-screen-quad.png)
That painted only one of the triangles. We also need to update the draw command with the new vertex
count:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
...
impl PathTracer {
...
pub fn render_frame(&self, target: &wgpu::TextureView) {
...
render_pass.set_pipeline(&self.display_pipeline);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust delete
// Draw 1 instance of a polygon with 3 vertices.
render_pass.draw(0..3, 0..1);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
// Draw 1 instance of a polygon with 6 vertices.
render_pass.draw(0..6, 0..1);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
// End the render pass by consuming the object.
drop(render_pass);
let command_buffer = encoder.finish();
self.queue.submit(Some(command_buffer));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [render_frame-stub]: [render.rs] The `render_frame` function]
![Figure [full-screen-quad]: Full-Screen Quad](../images/img-04-full-screen-quad.png)
Viewport Coordinates
--------------------
In this setup, every fragment shader invocation outputs the color of a single pixel. We can identify
that pixel using the built-in `position` input to the pipeline stage.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
@fragment fn display_fs(@builtin(position) pos: vec4f) -> @location(0) vec4f {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
return vec4f(1.0, 0.0, 0.0, 1.0);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [position-builtin]: [shaders.wgsl] Position Built-In]
The input is defined as a `vec4f`. The $x$ and $y$ coordinates are defined in the _Viewport
Coordinate System_. The origin $(0, 0)$ corresponds to the top-left corner pixel of the viewport.
The $x$-coordinate increases towards the right and the $y$-coordinate increases towards the bottom.
A whole number increment in $x$ or $y$ represents an increment by 1 pixel (and fractional increments
can fall "inside" a pixel). For example, for a viewport with the physical dimensions of
$800\times600$, the coordinate ranges are $0\le x\le799, 0\le y\le599$.
![Figure [viewport-coords]: Viewport Coordinate System](../images/fig-03-viewport-coords.svg)
Let's assign every pixel fragment a color based on its position in the viewport by mapping the
coordinates to a color channel (red and green). The render target uses a normalized color format
(i.e. the values must be between $0$ and $1$), so we divide each dimension by the largest possible
value to convert it to that range:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
const WIDTH: u32 = 800u;
const HEIGHT: u32 = 600u;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
@fragment fn display_fs(@builtin(position) pos: vec4f) -> @location(0) vec4f {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
let color = pos.xy / vec2f(f32(WIDTH - 1u), f32(HEIGHT - 1u));
return vec4f(color, 0.0, 1.0);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [pos-to-color]: [shaders.wgsl]]
There are two language expressions here that are worth highlighting. `pos.xy` is a so called
_vector swizzle_ that extracts the $x$ and $y$ components and produces a `vec2f` containing only
those. Next, we divide that `vec2f` by another `vec2f`. Here, the division operator performs a
component-wise division of every element of the vector on the left-hand side by the corresponding
element on the right-hand side, so `pos.xy / vec2f(f32(WIDTH - 1u), f32(HEIGHT - 1u))` is equivalent
to `vec2f(pos.x / f32(WIDTH - 1u), pos.y / f32(HEIGHT - 1u))`.
Now we are able to separately color each individual pixel. Running this should produce a picture
that looks like this:
![Figure [viewport-gradient]: Viewport Coordinates as a color gradient
](../images/img-05-viewport-gradient.png)
Resource Bindings
====================================================================================================
Our program is split across separate runnable parts: the main executable that runs on the CPU and
pipelines that run on the GPU. As we add more features we will want to exchange data between the
different parts. The main way to achieve this is via memory resources.
The CPU side of our program can create and interact with resources by making API calls. On the GPU
side, the shader program can access those via _bindings_. A binding associates a resource with a
unique slot number that can be referenced by the shader. Each slot is identified by an index number.
The shader code declares a variable for each binding with a decoration that assigns it a binding
index. The CPU side is responsible for setting up the resources for a GPU pipeline according to its
binding layout.
WebGPU introduces an additional concept around bindings called _bind group_. A bind group
associates a group of resources that are frequently bound together.[^ch4-footnote1] Like individual
bindings, each bind group is identified by an index number. Our pipelines won't make use of more
than one bind group at a time, so we'll always assign $0$ as the group index.
[^ch4-footnote1]: The bind group concept is similar to "descriptor set" in Vulkan, "descriptor
table" in D3D12, and "argument buffer" in Metal.
Uniform Declaration
-------------------
The first binding we are going to set up is a _uniform buffer_. Uniforms are read-only data that
don't vary across GPU threads. We are going to use a uniform buffer to store certain globals, like
camera parameters.
Our renderer currently assumes a window dimension of $800\times600$ and declares this in two
different places (`shaders.wgsl` and `main.rs`) which must be kept in sync. Let's make `WIDTH` and
`HEIGHT` uniforms and upload their values from the CPU side. We'll first declare a uniform buffer
and assign it to binding index $0$:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
struct Uniforms {
width: u32,
height: u32,
}
@group(0) @binding(0) var<uniform> uniforms: Uniforms;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
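On the CPU side, we can mirror this struct and upload it through wgpu's buffer-initialization
helper. Here's a rough sketch (the names are illustrative; it assumes `use wgpu::util::DeviceExt;`
and bytemuck's `derive` feature):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
// CPU-side mirror of the WGSL Uniforms struct. #[repr(C)] plus the bytemuck derives
// let us safely reinterpret it as raw bytes.
#[repr(C)]
#[derive(Copy, Clone, bytemuck::Pod, bytemuck::Zeroable)]
struct Uniforms {
    width: u32,
    height: u32,
}

fn create_uniform_buffer(device: &wgpu::Device, width: u32, height: u32) -> wgpu::Buffer {
    use wgpu::util::DeviceExt;
    let uniforms = Uniforms { width, height };
    device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
        label: Some("uniforms"),
        contents: bytemuck::bytes_of(&uniforms),
        usage: wgpu::BufferUsages::UNIFORM | wgpu::BufferUsages::COPY_DST,
    })
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~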
[^ch9-footnote1]: Radiance has units Watts per square meter per steradian ($\frac{W}{m^2 \cdot sr}$).
"Steradian" is the unit of solid angle. It's analogous to radian but in 3D.
[^ch9-footnote2]: [_Ray Tracing In One Weekend_][RTIOW], Chapter 10.3 Modeling Light Scatter and Reflectance.
[^ch9-footnote3]: In terms of "radiant power leaving or arriving at a unit surface",
albedo is the ratio of radiosity to irradiance.
[^ch9-footnote4]: For a complete derivation, see the article ["Deriving Lambertian BRDF from first principles"](https://sakibsaikia.github.io/graphics/2019/09/10/Deriving-Lambertian-BRDF-From-First-Principles.html) by Sakib Saikia.
[^ch9-footnote5]: For a refresher on Monte Carlo integration, take a look at [_Ray Tracing: The Rest of Your Life_][RTTROYL], Chapters 3 and 4.
[^ch9-footnote6]: The surface area of a sphere is $4\pi r^2$. The surface area of a unit sphere (where $r = 1$) is $4\pi$ and that of a hemisphere is simply $2\pi$.
[^ch9-footnote7]: See [Physically Based-Rendering: From Theory To Implementation, 4th Edition, A.5.3 "Cosine-Weighted Hemisphere Sampling"](https://pbr-book.org/4ed/Sampling_Algorithms/Sampling_Multidimensional_Functions#Cosine-WeightedHemisphereSampling).
Specular Transmission
---------------------
We have so far focused only on the reflected portion of scattered light and assumed the rest gets absorbed into
the material. This approach results in surfaces that are opaque, though some materials (such as glass, water, and air)
transmit light without fully absorbing it. These materials can appear transparent or translucent according to certain
physical properties, including how smooth the surface is and how light interacts with its molecules.
The speed of light varies depending on the medium through which it propagates. The speed of
light in a vacuum is 299,792,458 m/s while the speed in glass is approximately two-thirds of that (~199,861,638 m/s). When a
wave transitions between two media and undergoes a sudden change in its phase velocity (e.g. when light enters
glass from air, or when water waves travel from shallow to deep water) the wavefronts appear to change direction. This
phenomenon is called _refraction_. The change in phase velocity is given by the material's
_index of refraction_ (IOR)[^ch9-footnote8] and the change in direction is governed by _Snell's Law_.
![Figure [refracting-wavefronts]: Wavefronts undergoing refraction at the transition to a different medium. (Image Credit: Oleg Alexandrov, https://en.wikipedia.org/wiki/File:Snells_law_wavefronts.gif )](../images/Snells_law_wavefronts_by_Oleg_Alexandrov.gif class="small-gif" loop)
The IOR for a given medium is defined as the ratio between $c$, the speed of light in vacuum, and the speed of light
in that medium. For example, glass with an IOR of 1.5 propagates light at $\frac{2}{3}c.$ Here are the indices of
refraction for some transparent materials (taken from _Physically Based Rendering: From Theory To Implementation_[^ch9-footnote8]):
Medium | Index of refraction
:------:|:-------------------:
Vacuum | 1.0
Air at sea level | 1.00029
Ice | 1.31
Water (20 degrees C) | 1.333
Fused quartz | 1.46
Glass | 1.5-1.6
Sapphire | 1.77
Diamond | 2.42
Snell's Law relates the angle of incidence $\theta_i$ (the angle between the incident light ray and the surface normal)
to the angle of refraction $\theta_o$ (the angle between the refracted ray and the negative surface normal) in terms
of the ratio of the indices of refraction of the two media:
$$
\frac{\eta_o}{\eta_i} = \frac{sin~\theta_i}{sin~\theta_o}
$$
![Figure [snells-law]: Snell's Law](../images/fig-20-snell.svg)
In order to correctly implement refraction for a transmissive surface, we need to extend our material definition
to specify an index of refraction. For a transmissive material, it makes sense to treat the scattering event at a
ray-surface intersection as a transition from one type of medium (e.g. air) into a different type (e.g. glass).
The surface of a sphere defines an enclosed volume, so the IOR relates the speed of light inside the sphere to the speed
of light outside.
According to Snell's law, we need the ratio of the IOR values on both sides of the surface, i.e. the _relative_
index of refraction. We have a choice when it comes to the material definition: the material can either store the
IOR relative to vacuum or it can directly store the relative IOR at the interface boundary. The former is a bit more
complicated to implement, since we need to keep track of the IOR of the surrounding volume along the light transport
path and compute the relative IOR when we intersect a transmissive material. We'll instead go with the second approach:
if we want to place a glass sphere inside a volume of water, the material will store $\frac{\eta_{glass}}{\eta_{water}}$.
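Using the values from the table above, that ratio would be $\frac{1.5}{1.333} \approx 1.13$; the same sphere
surrounded by air would instead store approximately $1.5$.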
This makes the implementation simpler though at the cost of some flexibility; for instance we can't easily
represent a glass sphere that is partially submerged in water and partially exposed to air.[^ch9-footnote9]
We can represent the relative IOR as a 32-bit floating point number in a new field of type `f32`. For now, we are going
to treat any material that has a non-zero `ior` parameter as transmissive and ignore `specular`:
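The updated struct isn't reproduced in this excerpt, but a minimal sketch of the change could look like the
following (the `color` and `specular` fields are assumptions based on how the struct is used in `scatter`):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
struct Material {
    color: vec3f,
    specular: f32, // 1. selects the mirror material, 0. selects lambertian
    ior: f32,      // relative index of refraction; 0. means the material is not transmissive
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The new scene entry would then pair a color of `vec3(1.)` with an `ior` of `1.5` (the exact declaration of the
materials array isn't shown here).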
We now have a 4th material entry with a relative refractive index of $1.5$, which approximately represents the
air/glass interface. The color is set to `vec3(1.)`, so the material should transmit all light without any absorption.
Let's now update the scatter function. Once again we're in luck: just like `reflect`, WGSL defines an intrinsic called
`refract` to compute the refracted ray for an incident ray direction, a surface normal, and a relative index of
refraction. Let's plug this into our `scatter` function:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
var scattered: vec3f;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
if material.ior > 0. {
scattered = refract(input_ray.direction, hit.normal, material.ior);
} else if material.specular == 1 {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
scattered = reflect(input_ray.direction, hit.normal);
} else {
scattered = sample_lambertian(hit.normal);
}
let output_ray = Ray(point_on_ray(input_ray, hit.t), scattered);
let attenuation = material.color;
return Scatter(attenuation, output_ray);
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [refract-in-scatter]: [shaders.wgsl] Implementing a transmissive material with `refract`]
When I run this on my computer, I get a very strange result:
![Figure [nan-propagation]: (video) NaN propagation](../images/vid-09-nan-propagation.mp4 autoplay muted loop)
On the 3 GPUs and operating systems[^ch9-footnote10] that I ran this program on, I see a blue ring with a large black
circle in the middle instead of a refractive sphere. Don't be alarmed if your result looks different; the color and
behavior can vary depending on your GPU and the graphics API you're using.
For me, the blue pixels originate at the ring and gradually spread to its surroundings. They seemingly "radiate" along
light transport paths, even casting a "shadow" on the opposite side of the lambertian sphere. The problem has something
to do with our use of `refract`, so let's consult the API documentation provided in the WGSL specification:[^ch9-footnote11]
> For the incident vector `e1` and surface normal `e2`, and the ratio of indices of refraction `e3`,
> `let k = 1.0 - e3 * e3 * (1.0 - dot(e2, e1) * dot(e2, e1))`. If `k` < 0.0, returns the refraction vector 0.0,
> otherwise return the refraction vector `e3 * e1 - (e3 * dot(e2, e1) + sqrt(k)) * e2`. The incident vector `e1` and
> the normal `e2` should be normalized for desired results according to Snell’s Law; otherwise, the results may not
> conform to expected physical behavior.
There are two important points being discussed here. Let's first focus on the second half: the input vectors need to be
normalized. Our sphere intersection routine already returns a unit-length normal vector, so we'll assume that
`hit.normal` is always normalized. We shouldn't assume the same for the incident ray
direction, so let's update our code to explicitly normalize it:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
let incident = normalize(input_ray.direction);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
var scattered: vec3f;
if material.ior > 0. {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
scattered = refract(incident, hit.normal, material.ior);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
} else if material.specular == 1 {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
scattered = reflect(incident, hit.normal);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
} else {
scattered = sample_lambertian(hit.normal);
}
let output_ray = Ray(point_on_ray(input_ray, hit.t), scattered);
let attenuation = material.color;
return Scatter(attenuation, output_ray);
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [scatter-normalize-incident-ray]: [shaders.wgsl] Normalizing the input ray direction]
Let's run this now:
![Figure [buggy-refract-input-normalized]: Buggy refract with normalized input rays](../images/img-32-buggy-refract-with-normalized-input.png)
That's better. The black hole is gone (or at least it's harder to see) but the refraction is still buggy. The API
documentation states: _If k < 0.0, returns the refraction vector 0.0._ This means that under certain circumstances, the
scattered ray direction can be a null vector. A null vector doesn't represent a meaningful ray direction and is bound to
cause numerical issues, so we need to prevent it.
### NaN and INF
Let's briefly discuss why a null vector can lead to a bug like this. After finding a valid
intersection with the buggy sphere, the null ray direction is used for sphere intersection tests in the next iteration
of the path tracing loop. Let's look back at `intersect_sphere`:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn intersect_sphere(ray: Ray, sphere: Sphere) -> Intersection {
let v = ray.origin - sphere.center;
let a = dot(ray.direction, ray.direction);
let b = dot(v, ray.direction);
let c = dot(v, v) - sphere.radius * sphere.radius;
let d = b * b - a * c;
if d < 0. {
return no_intersection();
}
let sqrt_d = sqrt(d);
let recip_a = 1. / a;
let mb = -b;
let t1 = (mb - sqrt_d) * recip_a;
let t2 = (mb + sqrt_d) * recip_a;
let t = select(t2, t1, t1 > EPSILON);
if t <= EPSILON {
return no_intersection();
}
let p = point_on_ray(ray, t);
let N = (p - sphere.center) / sphere.radius;
return Intersection(N, t);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [intersect-sphere-reminder]: [shaders.wgsl] `intersect_sphere`]
When `ray.direction` is null, `a` and `b` are both $0$, and consequently so is `d`. The `d < 0.` check evaluates to false
and the function moves on to compute `t`. Now we have a problem: the line `let recip_a = 1. / a;` divides $1$ by
zero. The value of `recip_a` is mathematically undefined but if your GPU follows the standard
rules for floating-point arithmetic, the result will be _infinity_.
The binary representation of floating point types and the operations over them are governed by the IEEE-754
standard.[^ch9-footnote12] According to the standard, when the result of an operation exceeds
the representable range it can produce positive or negative _infinity_ (`±INF`). An example is dividing a very large
number by a very small number -- which, perhaps surprisingly, includes dividing a non-zero number by zero. Another
special value is _NaN_ (not a number) which is produced by invalid operations, such as dividing
zero by zero and multiplying zero by infinity. When at least one operand in an operation is NaN, the result is
defined to be NaN. As such, NaNs are said to "poison" any calculation they are involved in. NaNs are also defined
as _unordered_, meaning that boolean comparison operators (i.e. `<`, `<=`, `>=`, `>`, `==`) evaluate to `false` when
an operand is NaN.[^ch9-footnote13]
Following these rules, we should expect `recip_a` to be `+INF`. Since `d` and `b` are both zero, so are `sqrt_d` and `mb`,
which should cause `t1`, `t2`, and `t` to be `NaN`. `t <= EPSILON` would evaluate to false and NaNs would propagate through
the rest of the function. Fortunately, `intersect_scene` should treat this as "no intersection" (since
the `hit.t > 0. && hit.t < closest_hit.t` check would evaluate to false). Because the procedure doesn't find any intersections,
it moves on to compute the sky color with a null ray direction. The first thing that `sky_color` does with
the ray direction is normalize it (`normalize(ray.direction).y`), which is implemented by dividing all components of
the vector by its scalar magnitude (or "length"). The magnitude of a null vector is $0$, so that division produces NaN
components, and the `t` value that `sky_color` derives from the `y` component becomes NaN as well. This means that
whenever a light transport path intersects our buggy sphere, our program computes a
received radiance of NaN and stores it in the radiance texture, poisoning all future samples for that pixel.
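To make the chain of events concrete, here is the same failure traced through the relevant lines of
`intersect_sphere` as comments, assuming strict IEEE-754 semantics (which, as discussed below, the shader compiler is
not actually required to honor):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
// Hypothetical trace for ray.direction == vec3(0.) under strict IEEE-754 rules:
let a = dot(ray.direction, ray.direction); // 0
let b = dot(v, ray.direction);             // 0
let d = b * b - a * c;                     // 0, so the `d < 0.` early-out is not taken
let recip_a = 1. / a;                      // 1 / 0 = +INF
let t1 = (mb - sqrt_d) * recip_a;          // 0 * INF = NaN
let t2 = (mb + sqrt_d) * recip_a;          // 0 * INF = NaN
let t = select(t2, t1, t1 > EPSILON);      // NaN > EPSILON is false, so t = t2 = NaN
// `t <= EPSILON` is also false, so the function returns an Intersection with t = NaN.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~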
Typically NaNs in a framebuffer are treated as $0$, so one might expect the NaN pixels to be displayed as black. I'm seeing
blue instead; the color of the sphere is exactly the RGB triplet $(0, 0, 1)$. Oddly enough, if I change any
component of the sky color `vec3(0.3, 0.5, 1.0)` to `1.0`, that component is $1$ in the output pixel and any value that
is less than $1$ seems to become $0$. Shouldn't all channels be 0 if the radiance value is `vec3(NaN, NaN, NaN)`?
In practice, shader compilers support optimizations for floating-point arithmetic that don't strictly adhere to the
standard. For instance, Metal and Vulkan both support _fast math_ modes in which NaN and INF behavior is
_undefined_[^ch9-footnote14] (and the WGSL specification explicitly allows them[^ch9-footnote15]). Therefore, `sky_color`
is allowed to return `vec3(NaN, NaN, 1.0)` due to an optimization. Your program may produce blue pixels, black pixels,
or even a plain white sphere with no NaN propagation whatsoever. It all depends on your shader compiler configuration
and the specific GPU you're running on.
### Total Internal Reflection
Let's get back to `refract`. The specification says the result is a null vector if `k` < 0, where
`k = 1.0 - e3 * e3 * (1.0 - dot(e2, e1) * dot(e2, e1))`, or equivalently, when
`e3 * e3 * (1.0 - dot(e2, e1) * dot(e2, e1))` > 1. This rule is derived from a special case in Snell's Law in which the
angle of refraction has no solution. Let's re-arrange the equation from earlier and solve for $\theta_o$:
$$
\begin{eqnarray}
\frac{\eta_o}{\eta_i} &=& \frac{sin~\theta_i}{sin~\theta_o} \nonumber \\
sin~\theta_o &=& \frac{\eta_i}{\eta_o} sin~\theta_i \nonumber
\end{eqnarray}
$$
If $\eta_i$ is greater than $\eta_o$, it is possible for the right-hand side of the equation to be greater than one. The
range of the sine function is limited to $[-1, 1]$, so no value of $\theta_o$ can satisfy the equation. When this is the
case, no refraction occurs and light is completely reflected instead. To handle this case, we'll compute
$\frac{\eta_i}{\eta_o} sin~\theta_i$ and reflect the ray if the value is greater than 1.
Remember the Pythagorean identity:
$$
\begin{eqnarray}
sin^2~\theta + cos^2~\theta &=& 1 \nonumber \\
sin~\theta &=& \sqrt{(1 - cos^2~\theta)} \nonumber
\end{eqnarray}
$$
This involves a square-root, which is a relatively expensive computation. Given the inequality that we're solving for,
we can technically avoid it. If we square both sides, we get:
$$
\begin{eqnarray}
1 &\ge& \frac{\eta_i}{\eta_o}\sqrt{(1 - cos^2~\theta_i)} \nonumber \\
1 &\ge& \left(\frac{\eta_i}{\eta_o}\right)^2(1 - cos^2~\theta_i) \nonumber
\end{eqnarray}
$$
We know how to compute $\cos\theta_i$ using the dot product: `abs(dot(incident, hit.normal))`. Calling that value
`cos_theta`, the right-hand side of the inequality can then be computed as `ior * ior * (1.0 - cos_theta * cos_theta)`:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
let incident = normalize(input_ray.direction);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
let incident_dot_normal = dot(incident, hit.normal);
let cos_theta = abs(incident_dot_normal);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
var scattered: vec3f;
if material.ior > 0. {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
let ior = material.ior;
let cannot_refract = ior * ior * (1.0 - cos_theta * cos_theta) > 1.;
if cannot_refract {
scattered = reflect(incident, hit.normal);
} else {
scattered = refract(incident, hit.normal, ior);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
} else if material.specular == 1 {
scattered = reflect(incident, hit.normal);
} else {
scattered = sample_lambertian(hit.normal);
}
let output_ray = Ray(point_on_ray(input_ray, hit.t), scattered);
let attenuation = material.color;
return Scatter(attenuation, output_ray);
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [scatter-handle-total-internal-reflection]: [shaders.wgsl] Handling total internal reflection]
Let's run this now:
![Figure [broken-refraction-ratio]: Handling total internal reflection](../images/img-33-total-internal-reflection.png)
The issue with NaN/INF values is gone and we now see a mirror reflection where the blue ring used to be. The image is
still wrong though, as there is now a black circle in the middle again. In the black region, the camera ray refracts
into the sphere but the rest of the light transport path gets stuck in a total internal reflection loop. The path
tracing loop eventually exits without reaching the sky, hence the black.
### Fixing the refraction ratio
The third parameter of `refract` is defined as the _ratio of the indices of refraction_, i.e. $\frac{\eta_i}{\eta_o}$.
When a ray arrives from outside the sphere the ratio is $\frac{\eta_{outside}}{\eta_{inside}}$. Conversely, when a ray
arrives from the inside the ratio is $\frac{\eta_{inside}}{\eta_{outside}}$. We defined `ior` as
$\frac{\eta_{inside}}{\eta_{outside}}$, so we need to take the reciprocal of it when the ray arrives from the outside.
How we distinguish the "outside" of a shape from the "inside" is a matter of convention. Typically, the surface
normals are defined as facing outward, so we can treat the side facing the normal direction as the "front" face.
![Figure [flipped-normal]: The outward face normal and the flipped normal used for shading when the ray intersects the surface from behind](../images/fig-17-front-facing-normal.svg)
The front face is easy enough to detect: if the dot product of the ray direction and the normal vector is negative, then
the vectors "oppose" each other and the intersection is on the front face. It is also common practice to flip the normal
vector when the intersection is on the back face, so that material evaluation uses a consistent frame of reference.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
let incident = normalize(input_ray.direction);
let incident_dot_normal = dot(incident, hit.normal);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
let is_front_face = incident_dot_normal < 0.;
let N = select(-hit.normal, hit.normal, is_front_face);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
let cos_theta = abs(incident_dot_normal);
var scattered: vec3f;
if material.ior > 0. {
let ior = material.ior;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
let ref_ratio = select(ior, 1. / ior, is_front_face);
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
if cannot_refract {
scattered = reflect(incident, N);
} else {
scattered = refract(incident, N, ref_ratio);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
} else if material.specular == 1 {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
scattered = reflect(incident, N);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
} else {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
scattered = sample_lambertian(N);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
}
let output_ray = Ray(point_on_ray(input_ray, hit.t), scattered);
let attenuation = material.color;
return Scatter(attenuation, output_ray);
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [fix-refraction-ratio]: [shaders.wgsl] Detecting the front face to select the correct refraction ratio]
That was the last piece of the puzzle. This should produce the following image:
![Figure [working-specular-transmission]: Working specular transmission](../images/img-34-working-specular-transmission.png)
As a quick test, change the index of refraction assigned to the material from `1.5` to `1.0/1.5`. That should trigger
the total internal reflection case but the black sphere inside should look different:
![Figure [working-specular-transmission-inverted-ior]: Working specular transmission with inverted IOR](../images/img-35-working-specular-transmission-inverted-ior.png)
We now have two distinct scenarios in which the scatter routine can choose a specular reflection. Instead of
having nested branches, let's unify the control flow that results in a specular reflection. This will help keep
things clean when we later introduce other situations that can trigger one:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
let incident = normalize(input_ray.direction);
let incident_dot_normal = dot(incident, hit.normal);
let is_front_face = incident_dot_normal < 0.;
let N = select(-hit.normal, hit.normal, is_front_face);
let cos_theta = abs(incident_dot_normal);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
// `ior`, `ref_ratio`, and `cannot_refract` only have meaning if the material is transmissive.
let is_transmissive = material.ior > 0.;
let ior = material.ior;
let ref_ratio = select(ior, 1. / ior, is_front_face);
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
var scattered: vec3f;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
if material.specular == 1 || (is_transmissive && cannot_refract) {
scattered = reflect(incident, N);
} else if is_transmissive {
scattered = refract(incident, N, ref_ratio);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
} else {
scattered = sample_lambertian(N);
}
let output_ray = Ray(point_on_ray(input_ray, hit.t), scattered);
let attenuation = material.color;
return Scatter(attenuation, output_ray);
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [specular-refactor]: [shaders.wgsl] Minor refactor with fewer if-branches]
[^ch9-footnote8]: See [Physically Based Rendering: From Theory To Implementation, 4th Edition, 9.3.2 "Specular Reflection and Transmission: The Index of Refraction"](https://pbr-book.org/4ed/Reflection_Models/Specular_Reflection_and_Transmission#TheIndexofRefraction)
[^ch9-footnote9]: Another property that we overlooked is that the index of refraction varies with the wavelength of light (since the phase velocity of a wave depends on its wavelength). The visual effect is called _dispersion_. This is the mechanism that creates a rainbow as light disperses through water droplets suspended in the air. Our path tracer doesn't represent spectra, so it can't render this effect.
[^ch9-footnote10]: I ran this on Apple silicon running macOS (Metal), an Nvidia RTX GPU on Windows (Vulkan + D3D12), and an Intel iGPU on Linux (Vulkan).
[^ch9-footnote11]: WGSL specification, Section 17.5.50: https://www.w3.org/TR/WGSL/#refract-builtin
[^ch9-footnote12]: 754-2008 - IEEE Standard for Floating-Point Arithmetic: https://ieeexplore.ieee.org/document/4610935
[^ch9-footnote13]: The standard defines two kinds of NaN: quiet and signaling. Signaling NaNs allow the hardware to generate an exception while quiet NaNs "quietly" propagate through operations. The WGSL and Metal shading language standards explicitly disallow signaling NaNs.
[^ch9-footnote14]: See Metal Shading Language Specification, Version 3.2, Section 7.1 "INF, NaN, and Denormalized Numbers"; SPIR-V Specification, Version 1.6, Revision 5, Section 3.15 "FP Fast Math Mode". At the time of writing, "fast math" is the default behavior of the Metal shader compiler.
[^ch9-footnote15]: See WGSL Specification, Section 15.7.2 "Differences from IEEE-754" (https://www.w3.org/TR/WGSL/#differences-from-ieee754)
Dielectric BSDF
---------------
Most materials exhibit a mixture of transmission and reflection. Water and glass are both mostly transparent when viewed
head on but appear increasingly more mirror-like at grazing angles. The mechanisms that give rise to this effect depend
on the physical properties of the surface material as well as the wavelength and polarization of the incident light.
To render such a surface accurately we have to compute both the reflected and transmitted radiance and combine
them using some view-dependent ratio that is plausible for what we want to simulate.
A physics-based method to compute this ratio is to use the Fresnel equations.[^ch9-footnote16] The Fresnel equations relate
the surface reflectance (i.e. the reflected
portion of the incident beam) to the angle of incidence and the indices of refraction at the surface
interface.[^ch9-footnote17]
Light is fundamentally a wave of oscillations in the electromagnetic field, and the
orientation of these oscillations is called _polarization_. Reflection and transmission depend on
the incident polarization, so the Fresnel equations come in a pair that define reflectance separately for the
two orthogonal linear polarizations relative to the _plane of incidence_[^ch9-footnote18], i.e. the "perpendicular"
($\bot$) and "parallel" ($\parallel$) polarizations:
$$
R_{\parallel} = \left(\frac{\eta ~ cos ~ \theta_i - cos ~ \theta_t}{\eta ~ cos ~ \theta_i + cos ~ \theta_t}\right)^2, ~
R_{\bot} = \left(\frac{cos ~ \theta_i - \eta ~ cos ~ \theta_t}{ cos ~ \theta_i + \eta ~ cos ~ \theta_t}\right)^2
$$
where $\eta$ is the relative index of refraction, $\theta_i$ is the angle of the incident ray, and $\theta_t$ is the angle
of refraction given by Snell's law. We can implement these equations directly for a system that takes
polarization into account. We can also take their average to derive a single unpolarized
_Fresnel reflectance_ factor.[^ch9-footnote17] Alternately, we can resort to an approximation that looks good enough
and save on computation.
We defined a valid index of refraction (i.e. a positive valued `ior`) in the `Material` type to
mean that the surface is transparent and refractive. In reality, opaque materials can have an index of refraction too.
Stainless steel -- which visually appears reflective when viewed from any direction -- has a complex-valued refractive
index. The real part represents the change in phase velocity and the imaginary part represents how rapidly
light gets absorbed by the material (known as the _absorption coefficient_). Electrical conductors (like metals) have a
high absorption coefficient for visible light. _Dielectrics_ (which are electrical insulators) generally tend to have a
low absorption coefficient for visible wavelengths.
### Schlick Approximation
The Fresnel equations can be solved for both metals and dielectrics using complex arithmetic but we'll simplify things.
We'll assume that an index of refraction is only present for a dielectric and ignore the absorption coefficient (i.e.
pretend that its value is $0$). We are also going to ignore polarization. Next, instead of solving the equations above
we'll use a very commonly used approximation called _Schlick's Formula_.[^ch9-footnote19] The formula defines Fresnel
reflectance $F$ for incident angle $\theta_i$ as:
$$
F(\theta_i) = F_0 + (F_{90} - F_0) ~ (1 - cos ~ \theta_i)^5
$$
$F_0$ is reflectance at _normal incidence_ (i.e. 0 degrees) where the surface appears the least reflective. $F_{90}$ is
the reflectance at 90 degrees. We'll define $F_{90}$ to be $1$, meaning all materials will exhibit perfect reflectance
at a tangent angle (i.e. transmission is 0):
$$
F(\theta_i) = F_0 + (1 - F_0) ~ (1 - cos ~ \theta_i)^5
$$
For dielectrics (for which the absorption coefficient is $0$), $F_0$ can be computed from the real-valued index of
refraction:
$$
F_0 = \left(\frac{\eta - 1}{\eta + 1}\right)^2
$$
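For example, plugging in the IOR of glass yields the commonly quoted reflectance of roughly 4% at normal incidence:
$$
F_0 = \left(\frac{1.5 - 1}{1.5 + 1}\right)^2 = \left(\frac{0.5}{2.5}\right)^2 = 0.04
$$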
Let's quickly translate this to code. If you look closely, the Schlick formula blends between $F_0$ and $1$ using
$(1 - cos ~ \theta_i)^5$ as the blend factor. This is a simple enough expression that you can just type it out exactly,
or use the [`mix`](https://www.w3.org/TR/WGSL/#mix-builtin) intrinsic. Mix is a convenient utility function that
implements a _linear interpolation_ or _linear blend_. This is an extremely common operation in Computer Graphics
and is used to mix two values with a weight between them. Given two values $a$ and $b$, their linear blend using weight
$x$ can be expressed as $a (1 - x) + bx$ or alternately $a + (b - a)x$. This latter form matches the Schlick formula
above, which blends between $F_0$ and $1$ using $(1 - cos~\theta)^5$ as the weight:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
fn schlick_fresnel(ior: f32, cos_theta: f32) -> f32 {
let u = 1 - cos_theta;
let sqrt_f0 = (ior - 1.) / (ior + 1.);
let f0 = sqrt_f0 * sqrt_f0;
return mix(f0, 1., u * u * u * u * u);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [schlick-fresnel]: [shaders.wgsl] The `schlick_fresnel` function]
Now, if the material is transmissive we need to compute the contributions from both the refracted
and reflected paths and blend them with the Fresnel factor. This involves tracing both paths
all the way to a light source. This is difficult to do in our current setup in which the loop in
the shader's entry point iterates over a single path and invokes `scatter` once for every path
segment.
Let's imagine the combined reflection and transmission distributions (i.e. the BRDF and BTDF) for
a dielectric as a unified _BSDF_ (i.e. a Bi-directional _scattering_ distribution function). The
Fresnel factor represents the relative distribution of reflected and transmitted rays, in other
words it provides the likelihood that a ray gets reflected instead of transmitted.[^ch9-footnote20]
We can use this as a PDF to sample the BSDF one segment at a time and integrate the result over
many frames, just as we did for the Lambertian distribution.
After the check for total internal reflection we are going to randomly decide whether to reflect or
refract using the Fresnel reflectance as the probability of reflecting. This choice of probability
actually fits well into our Monte Carlo framework, in that we are selecting a distribution that
closely matches the BSDF in much the same way as our choice of the cosine-weighted hemisphere
distribution matched the Lambertian BRDF (i.e. the weights cancel out).
With that, let's introduce a new `choose_specular` variable to represent whether we chose specular
reflection over other scattering types and factor in the Fresnel-weighted probability:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
let incident = normalize(input_ray.direction);
let incident_dot_normal = dot(incident, hit.normal);
let is_front_face = incident_dot_normal < 0.;
let N = select(-hit.normal, hit.normal, is_front_face);
let cos_theta = abs(incident_dot_normal);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
// `ior` and `ref_ratio` only have meaning if the material is transmissive.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
let is_transmissive = material.ior > 0.;
let ior = material.ior;
let ref_ratio = select(ior, 1. / ior, is_front_face);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL delete
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
// Determine whether to use specular reflection.
var choose_specular: bool;
if is_transmissive {
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
choose_specular = cannot_refract || schlick_fresnel(ref_ratio, cos_theta) > rand_f32();
} else {
choose_specular = material.specular == 1;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
var scattered: vec3f;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
if choose_specular {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
scattered = reflect(incident, N);
} else if is_transmissive {
scattered = refract(incident, N, ref_ratio);
} else {
scattered = sample_lambertian(N);
}
let output_ray = Ray(point_on_ray(input_ray, hit.t), scattered);
let attenuation = material.color;
return Scatter(attenuation, output_ray);
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [specular-dielectric-bsdf]: [shaders.wgsl] Implementing a perfect specular dielectric BSDF]
Note that due to WGSL's short-circuiting rules, the call to `schlick_fresnel` won't be evaluated when `cannot_refract`
is true, skipping the computation at angles of total internal reflection.
![Figure [dielectric-bsdf]: Specular Dielectric BSDF](../images/img-36-dielectric-bsdf.png)
Here is another look from a different angle. Notice that the reflection of the sky is more prominent on the fringes:
![Figure [dielectric-bsdf-up-close]: Specular Dielectric BSDF up close](../images/img-37-dielectric-bsdf-up-close.png)
### Dielectric Material Color
Let's say we want to add a color tint to the glass sphere. Setting the material color to black means the
material should fully attenuate all incoming light.
This will result in an image like this:
![Figure [black-dielectric-bad]: Black dielectric without Fresnel reflectance](../images/img-38-black-dielectric-bad.png)
That doesn't look like glass as it is missing the sheen that's characteristic of dielectrics. Even if the base color is
defined as black, the dielectric material should appear somewhat reflective. Remember that we are currently using the
material color directly as the attenuation factor but ideally the attenuation should match the reflectance given by
the Fresnel equations. Given that $F_{90}$ is $1$, we should see a mirror reflection of all color channels without
attenuation at grazing angles, and see more of the material color at $F_0$.[^ch9-footnote21] In other words,
the transmitted portion of the incident light should get fully attenuated but not the reflected portion.
Let's special-case the Fresnel reflection and set the attenuation to $1$. Since our sampling probability matches the
Fresnel reflectance, the attenuation will naturally interpolate between the base color and $1$ based on the viewing
angle:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
...
// Determine whether to use specular reflection.
var choose_specular: bool;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
var attenuation = material.color;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
if is_transmissive {
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
choose_specular = cannot_refract || schlick_fresnel(ref_ratio, cos_theta) > rand_f32();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
if choose_specular {
attenuation = vec3(1.);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
} else {
choose_specular = material.specular == 1;
}
var scattered: vec3f;
if choose_specular {
scattered = reflect(incident, N);
} else if is_transmissive {
scattered = refract(incident, N, ref_ratio);
} else {
scattered = sample_lambertian(N);
}
let output_ray = Ray(point_on_ray(input_ray, hit.t), scattered);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL delete
let attenuation = material.color;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
return Scatter(attenuation, output_ray);
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [specular-reflectance-for-dielectrics]: [shaders.wgsl] Fixing the specular attenuation for dielectrics]
This should give us something that looks more like glass:
![Figure [black-dielectric-good]: Black dielectric with Fresnel reflectance](../images/img-39-black-dielectric-good.png)
The color shift from base color to white makes the sphere really feel like glass. This serves to show that Fresnel
reflectance matters for opaque materials too.
While this isn't exactly physically accurate -- real glass doesn't absorb all light at the surface but rather gradually
attenuates it as light scatters inside the material -- this still gives us a plausible look. Try out different colors and
see how that changes the overall appearance. For example, some common glass varieties have a slightly greenish tint due
to iron-oxide impurities.
### A More Compact Representation
The `Material` structure has a 16-byte alignment due to `color`. `ior` itself only uses 4 bytes, so the
compiler will add 12 bytes of padding to the end of the struct. This means that every entry in the `materials` array
takes up 32 bytes in total. On the other hand, we're not really using all 32 bits that are available in the `specular`
field. The two material types that `specular` represents (lambertian and mirror) are both opaque, and defining an
additional index of refraction isn't particularly useful when the material isn't transmissive.
`f32` is a signed type and we're not going to deal with negative IOR values in this book, so the entire negative
range of values is unused. With that knowledge, here is an alternative encoding that keeps the footprint of `Material`
at 16 bytes: we merge `specular` and `ior` into a single field of type `f32`. If the value is negative, then the
material is transmissive and the absolute value represents the IOR. If the value is $0$, then the material is
lambertian. Otherwise, a positive value means the material acts like a mirror.
Let's update the material definition with this in mind. I'm choosing the name `specular_or_ior` for this new field,
which isn't particularly creative but clearly conveys the intent:
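The updated definition isn't reproduced in this excerpt, but a minimal sketch of the encoding could look like the
following (treat it as an illustration of the idea rather than the book's exact listing):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
struct Material {
    color: vec3f,
    // Encodes both of the old fields in 4 bytes:
    //   value < 0: transmissive; abs(value) is the relative IOR
    //   value = 0: lambertian
    //   value > 0: mirror-like specular
    specular_or_ior: f32,
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With this layout the struct occupies exactly 16 bytes, and `scatter` can recover the material kind with checks such
as `material.specular_or_ior < 0.` and `abs(material.specular_or_ior)`.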
[^ch9-footnote16]: Augustin-Jean Fresnel was an early-19th-century physicist. His work on diffraction and polarization
reinforced the theory that light behaves like a wave. He also invented the "Fresnel Lens" -- a type of lighthouse reflector
made of concentric prisms that combine reflection and refraction to focus the beacon.
[^ch9-footnote17]: See [Physically Based Rendering: From Theory To Implementation, 4th Edition, 9.3.5 "Specular Reflection and Transmission, The Fresnel Equations"](https://pbr-book.org/4ed/Reflection_Models/Specular_Reflection_and_Transmission#TheFresnelEquations)
[^ch9-footnote18]: https://en.wikipedia.org/wiki/Plane_of_incidence
[^ch9-footnote19]: Christophe Schlick published this approximation in his paper titled "An Inexpensive BRDF Model for Physically-based Rendering" ([#Schlick1994]), in which he builds on an earlier approximation by Cook and Torrance in their seminal paper "A Reflectance Model for Computer Graphics" ([#CookTorrance1982]). They observed that the complex index of refraction (including the absorption coefficient) is often unknown for all wavelengths in the visible spectrum. Schlick argued that even when the values are known, the precision gained by directly solving the Fresnel equations is not worth the computational cost.
[^ch9-footnote20]: See [Physically Based Rendering: From Theory To Implementation, 4th Edition, 9.5 "Dielectric BSDF"](https://pbr-book.org/4ed/Reflection_Models/Dielectric_BSDF)
[^ch9-footnote21]: See the discussion of "color shift" in [#CookTorrance1982]. Note that the Fresnel equations are parameterized by wavelength -- the (complex) index of refraction depends on wavelength -- so the reflectance curve varies across the spectrum of colors. Cook and Torrance observe that $F_{90}$ matches the color of the light source at every wavelength (for metals as well as dielectrics).
Mixed BRDF
----------
Our material options for opaque surfaces are currently limited to perfectly diffuse and perfectly specular. Most
common objects are actually somewhere in between. Think about ceramic, smooth plastic, polished hardwood flooring, car
paint, etc. These materials are not perfect mirrors but still appear shiny. A common way to achieve this appearance is to
mix diffuse and specular reflectance by some proportion.
### Combining Specular and Diffuse
Non-negative values of `specular_or_ior` currently represent a binary choice between perfectly diffuse and perfectly
specular. Let's change it to represent the proportion of specular reflectance, such that $0$ means perfectly
diffuse, $1$ means perfectly specular, and values in between are a blend between the two. We can employ the same trick
that we used to blend between specular reflection and specular transmission for the dielectric material, i.e. use the
"specularness" value as a probability to sample one or the other:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
...
// Determine whether to use specular reflection.
var choose_specular: bool;
var attenuation = material.color;
if is_transmissive {
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
choose_specular = cannot_refract || schlick_fresnel(ref_ratio, cos_theta) > rand_f32();
if choose_specular {
attenuation = vec3(1.);
}
} else {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
choose_specular = material.specular_or_ior > rand_f32();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
}
var scattered: vec3f;
if choose_specular {
scattered = reflect(incident, N);
} else if is_transmissive {
scattered = refract(incident, N, ref_ratio);
} else {
scattered = sample_lambertian(N);
}
let output_ray = Ray(point_on_ray(input_ray, hit.t), scattered);
return Scatter(attenuation, output_ray);
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [specularness]: [shaders.wgsl] Mixing diffuse and specular reflection]
Let's add some new spheres with a range of specular values. We'll also update the initial camera parameters so that
the new spheres are in view:
![Figure [mixed-specular-blend]: Blending diffuse and specular](../images/img-40-mixed-specular-blend.png)
This behaves exactly the way I would expect. It gets us part of the way there, though something important is missing:
Fresnel reflectance. The materials that I listed earlier (such as ceramic and plastic) are dielectrics. The material
would look more natural if reflectance varied with the viewing angle and exhibited a color shift like the black
dielectric from the end of the previous section. I took some pictures from around my house to demonstrate what this
looks like on some real-world objects:
To compute the Fresnel reflectance we need an index of refraction but our `Material` type only defines IOR for transparent
surfaces. For now, let's hardcode 1.5 (the IOR of glass) and see how that looks:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
...
// Determine whether to use specular reflection.
var choose_specular: bool;
var attenuation = material.color;
if is_transmissive {
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
choose_specular = cannot_refract || schlick_fresnel(ref_ratio, cos_theta) > rand_f32();
if choose_specular {
attenuation = vec3(1.);
}
} else {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
choose_specular = schlick_fresnel(1. / 1.5, cos_theta) > rand_f32();
if choose_specular {
attenuation = vec3(1.);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
}
...
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [glass-ior-for-specular-pdf]: [shaders.wgsl] Using glass IOR for specular Fresnel term]
![Figure [mixed-brdf-with-glass-ior]: Mixed BRDF with glass IOR](../images/img-46-mixed-brdf-with-glass-ior.png)
That looks really nice. Note that we are using the IOR only to determine the reflectance at normal incidence
(i.e. $F_0 = \left(\frac{\eta - 1}{\eta + 1}\right)^2$), since we don't need to compute a refracted ray. Let's
try assigning $F_0$ directly from `specular_or_ior`. We'll also refactor `schlick_fresnel` to accept $F_0$ as a
parameter:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
fn schlick_f0_from_ior(ior: f32) -> f32 {
let sqrt_f0 = (ior - 1.) / (ior + 1.);
return sqrt_f0 * sqrt_f0;
}
fn schlick_fresnel(f0: f32, cos_theta: f32) -> f32 {
let u = 1 - cos_theta;
return mix(f0, 1., u * u * u * u * u);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL delete
fn schlick_fresnel(ior: f32, cos_theta: f32) -> f32 {
let u = 1 - cos_theta;
let sqrt_f0 = (ior - 1.) / (ior + 1.);
let f0 = sqrt_f0 * sqrt_f0;
return mix(f0, 1., u * u * u * u * u);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
...
// Determine whether to use specular reflection.
var choose_specular: bool;
var attenuation = material.color;
if is_transmissive {
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
let f0 = schlick_f0_from_ior(ref_ratio);
choose_specular = cannot_refract || schlick_fresnel(f0, cos_theta) > rand_f32();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
if choose_specular {
attenuation = vec3(1.);
}
} else {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
let f0 = material.specular_or_ior;
choose_specular = schlick_fresnel(f0, cos_theta) > rand_f32();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
if choose_specular {
attenuation = vec3(1.);
}
}
...
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [specify-f0-directly]: [shaders.wgsl] Specifying $F_0$ directly]
![Figure [specify-f0-directly-img]: Specifying $F_0$ directly](../images/img-47-specify-f0-directly.png)
### Metalness
We're getting closer but there are some issues. First, when $F_0$ is $0$ the surface still appears somewhat shiny due to
Fresnel reflectance even though we want a $0$ valued `specular_or_ior` to represent a perfectly diffuse surface. That's
easy enough to fix: let's make $0$ a special case in which reflections are always Lambertian:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn schlick_f0_from_ior(ior: f32) -> f32 {
let sqrt_f0 = (ior - 1.) / (ior + 1.);
return sqrt_f0 * sqrt_f0;
}
fn schlick_fresnel(f0: f32, cos_theta: f32) -> f32 {
let u = 1 - cos_theta;
return mix(f0, 1., u * u * u * u * u);
}
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
...
// Determine whether to use specular reflection.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
var choose_specular = false;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
var attenuation = material.color;
if is_transmissive {
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
let f0 = schlick_f0_from_ior(ref_ratio);
choose_specular = cannot_refract || schlick_fresnel(f0, cos_theta) > rand_f32();
if choose_specular {
attenuation = vec3(1.);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
} else if material.specular_or_ior > 0. {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
let f0 = material.specular_or_ior;
choose_specular = schlick_fresnel(f0, cos_theta) > rand_f32();
if choose_specular {
attenuation = vec3(1.);
}
}
...
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [special-case-lambertian]: [shaders.wgsl] Disable Fresnel reflectance for a Lambertian surface]
That should remove the specular reflections from the ground and the blue diffuse sphere. The remaining issue is that the
material loses its color as it gets more specular.
![Figure [fixed-diffuse-ground]: Fixed diffuse ground](../images/img-47b-fixed-diffuse-ground.png)
All dielectrics reflect a portion of light and allow the rest to travel into the material. Light that travels into
an opaque dielectric scatters within the material and may eventually radiate back out of the surface in some
random direction. This sub-surface scattering behavior is often represented with diffuse reflectance. Importantly,
the reflected portion is not tinted while the diffusely scattered portion takes on the material color. At normal
incidence, the diffuse reflectance accounts for a bigger portion of the appearance and specular reflections are faint
since most dielectrics have fairly small values of $F_0$ (less than $0.1$ for most materials; diamond is a notable exception).
Unlike dielectrics, metals have high values of $F_0$ (almost always 0.5 or above) and they exhibit colored
reflections.[^ch9-footnote22] Any transmitted light gets rapidly absorbed by the material, so metals do not exhibit any
sub-surface scattering and hence no diffuse reflectance. If we want a material to appear "more metallic", we need a way
to incorporate its color into $F_0$ while simultaneously reducing its diffuse reflectance.
The Fresnel reflectance $F$ -- i.e. the value returned by `schlick_fresnel` -- _is_ the correct
specular attenuation value. We have so far simply assigned `vec3(1.)` because we chose $F$ to be the specular
probability and the weight simply manifested itself through the distribution of samples over time. If we think back
to the Monte Carlo formula, we ended up with
$$
\frac{f(\bar{\mathbf{x}})}{\rho(\bar{\mathbf{x}})} = \frac{F}{F} = 1
$$
The diffuse probability (i.e. the probability of _not_ choosing specular) is by definition $1 - F$, so it follows
that the diffuse reflectance value that we want to compute is effectively $\text{color} * (1 - F)$. As the specular and diffuse
samples combine over time, the accumulated color becomes, in some sense, a linear blend between $1$ and $\text{color}$,
where $F$ is the blend factor: $1 * F + \text{color} * (1 - F)$.
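For example, with a red base color $\text{color} = (1, 0, 0)$ and a Fresnel reflectance of, say, $F = 0.1$ (a value
picked purely for illustration), the accumulated color per channel works out to
$$
1 \cdot F + \text{color} \cdot (1 - F) = 0.1 \cdot (1, 1, 1) + 0.9 \cdot (1, 0, 0) = (1.0, 0.1, 0.1),
$$
i.e. the base color lifted toward white in proportion to $F$.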
Let's verify this by choosing between specular and diffuse with equal probability and computing the reflectance
values explicitly, without canceling out the terms. This will result in a noisier image that takes longer to resolve,
particularly when `specular_or_ior` is closer to the extremes (where the desired distribution is not 50%). However,
the final output should look the same. We are going to compute specular attenuation and diffuse attenuation
both using $\rho(\bar{\mathbf{x}}) = 0.5$:
$$
\begin{eqnarray}
\text{specular} &=& \frac{f_{\text{specular}}(\bar{\mathbf{x}})}{\rho(\bar{\mathbf{x}})} = \frac{F}{0.5} \nonumber \\
\text{diffuse} &=& \frac{f_{\text{diffuse}}(\bar{\mathbf{x}})}{\rho(\bar{\mathbf{x}})} = \frac{\text{color}~(1 - F)}{0.5} \nonumber
\end{eqnarray}
$$
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn schlick_f0_from_ior(ior: f32) -> f32 {
let sqrt_f0 = (ior - 1.) / (ior + 1.);
return sqrt_f0 * sqrt_f0;
}
fn schlick_fresnel(f0: f32, cos_theta: f32) -> f32 {
let u = 1 - cos_theta;
return mix(f0, 1., u * u * u * u * u);
}
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
...
// Determine whether to use specular reflection.
var choose_specular = false;
var attenuation = material.color;
if is_transmissive {
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
let f0 = schlick_f0_from_ior(ref_ratio);
choose_specular = cannot_refract || schlick_fresnel(f0, cos_theta) > rand_f32();
if choose_specular {
attenuation = vec3(1.);
}
} else if material.specular_or_ior > 0. {
let f0 = material.specular_or_ior;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
let F = schlick_fresnel(f0, cos_theta);
choose_specular = 0.5 > rand_f32();
if choose_specular {
attenuation = vec3(F) / 0.5;
} else {
attenuation = material.color * vec3(1. - F) / 0.5;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
}
...
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [sampling-specular-with-equal-probability]: [shaders.wgsl] Sampling specular and diffuse with equal probability]
![Figure [specular-diffuse-equal-prob]: Choosing between specular and diffuse with equal probability](../images/img-47c-equal-prob.png)
Having removed the implicit cancellations, let's incorporate the material color into the Fresnel factor.
`schlick_fresnel` is defined in terms of a scalar, so let's add a vectorized version so that we can pass in the
material color as $F_0$:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn schlick_f0_from_ior(ior: f32) -> f32 {
let sqrt_f0 = (ior - 1.) / (ior + 1.);
return sqrt_f0 * sqrt_f0;
}
fn schlick_fresnel(f0: f32, cos_theta: f32) -> f32 {
let u = 1 - cos_theta;
return mix(f0, 1., u * u * u * u * u);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
fn schlick_fresnel_vec3(f0: vec3f, cos_theta: f32) -> vec3f {
let u = 1 - cos_theta;
return mix(f0, vec3(1.), u * u * u * u * u);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
...
// Determine whether to use specular reflection.
var choose_specular = false;
var attenuation = material.color;
if is_transmissive {
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
let f0 = schlick_f0_from_ior(ref_ratio);
choose_specular = cannot_refract || schlick_fresnel(f0, cos_theta) > rand_f32();
if choose_specular {
attenuation = vec3(1.);
}
} else if material.specular_or_ior > 0. {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
let f0 = material.color;
let F = schlick_fresnel_vec3(f0, cos_theta);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
choose_specular = 0.5 > rand_f32();
if choose_specular {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
attenuation = F / 0.5;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
} else {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
attenuation = material.color * (1. - F) / 0.5;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
}
}
...
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [metallic-f0]: [shaders.wgsl] Metallic $F_0$]
![Figure [metallic-f0-img]: Metallic $F_0$](../images/img-48-metallic-f0.png)
Now we have a way to create a metallic appearance that exhibits the Fresnel color shift at grazing angles. Let's
re-incorporate `specular_or_ior` into how we determine $F_0$. We know how to define a metallic $F_0$ (the
material color) and a dielectric $F_0$. Let's interpolate between the two using `specular_or_ior` such that $0$ means
dielectric and $1$ means metallic. You can choose any value that you like for dielectric $F_0$, though for now let's stick
with the one for glass ($\eta = \frac{1}{1.5}$, $F_0 = 0.04$):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
const MAX_PATH_LENGTH: u32 = 13u;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
const GLASS_F0: vec3f = vec3(0.04);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
...
// Determine whether to use specular reflection.
var choose_specular = false;
var attenuation = material.color;
if is_transmissive {
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
let f0 = schlick_f0_from_ior(ref_ratio);
choose_specular = cannot_refract || schlick_fresnel(f0, cos_theta) > rand_f32();
if choose_specular {
attenuation = vec3(1.);
}
} else if material.specular_or_ior > 0. {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
let metallic = material.specular_or_ior;
let f0 = mix(GLASS_F0, material.color, metallic);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
let F = schlick_fresnel_vec3(f0, cos_theta);
choose_specular = 0.5 > rand_f32();
if choose_specular {
attenuation = F / 0.5;
} else {
attenuation = material.color * (1. - F) / 0.5;
}
}
...
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [metalness]: [shaders.wgsl] Interpolating between metallic and dielectric $F_0$]
![Figure [metalness-img]: Interpolating between metallic and dielectric $F_0$](../images/img-49-interpolate-metallic-and-dielectric-f0.png)
Now we have a clear transition between a metallic sphere on the left, a dielectric sphere on the right, and one in the
middle that behaves like both.
You may have noticed that the colors of the "more metallic" spheres look saturated and don't quite match the tone of
the material color. This is because there is still some diffuse reflectance contributing to the overall reflectance.
While `metallic` is factored into `F`, the diffuse reflectance doesn't completely go down to $0$ for a perfectly metallic
material when `material.color` is anything other than white. So we need to tweak things a little if we want the
resulting color to remain consistent.
Sometimes it's helpful to visualize things using a graph. The following plot shows how the specular (green),
diffuse (orange), and total (dashed blue) reflectance values vary with different values of `metallic` at normal incidence. The
material color is defined as $C = 0.5$. Notice how the total reflectance gradually increases, meaning the material
gets brighter the more metallic it is:
![Figure [saturating-reflectance-plot]: Saturated total reflectance with diffuse reflectance defined as $D = C (1 - F).$](../images/fig-21-saturating-reflectance-plot.png)
The green line (specular) behaves just like we want: it is exactly 0.04 (our dielectric $F_0$) when `metallic` is $0$
and it is equal to $C$ when `metallic` is $1$. Ideally the diffuse reflectance should drop to $0$ when `metallic` is $1$.
One way to guarantee that is to multiply the diffuse color by $(1 - \text{metallic})$:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
...
// Determine whether to use specular reflection.
var choose_specular = false;
var attenuation = material.color;
if is_transmissive {
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
let f0 = schlick_f0_from_ior(ref_ratio);
choose_specular = cannot_refract || schlick_fresnel(f0, cos_theta) > rand_f32();
if choose_specular {
attenuation = vec3(1.);
}
} else if material.specular_or_ior > 0. {
let metallic = material.specular_or_ior;
let f0 = mix(GLASS_F0, material.color, metallic);
let F = schlick_fresnel_vec3(f0, cos_theta);
choose_specular = 0.5 > rand_f32();
if choose_specular {
attenuation = F / 0.5;
} else {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
attenuation = material.color * (1. - metallic) * (1. - F) / 0.5;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
}
}
...
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [metalness-with-diffuse-reflectance]: [shaders.wgsl] Scaling diffuse reflectance for metals]
![Figure [metalness-with-diffuse-reflectance-img]: The sphere on the right (metallic = 0.1) gets almost all of its yellow color from diffuse scattering while specular reflection is untinted. The sphere on the left (metallic = 0.9) gets nearly all of its color from specular reflection and diffuse bounces contribute little to outgoing radiance.](../images/img-50-adjust-diffuse-reflectance-for-metals.png)
This is much better. The colors are no longer saturated and remain much more consistent. We
could certainly stop here, but there is still one issue: while the left and right spheres match the material color
closely, the middle sphere looks slightly darker. To demonstrate the issue more clearly, I'll introduce a cool material
debugging method called the _white furnace test_.
### White Furnace Test
In a uniformly lit environment with purely white materials (i.e. all materials completely
reflect all incident light with no absorption), shapes should not be discernible from the background. If you can discern
any shading, the material is said to _lose energy_. We can implement this with a fairly small change: make all materials
white and remove the ground to ensure that light arrives uniformly in all directions. The light color is typically set to
off-white to detect any potential brightening:
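How exactly you make this change depends on your scene setup code. As a hypothetical sketch, assuming the background
radiance comes from a sky/miss function like the one below (the function name and its original gradient are assumptions;
yours may differ), replace it with a constant off-white value and set every material color in the scene to pure white:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
// Hypothetical sketch: return the same slightly off-white radiance for every
// ray that misses the scene so that light arrives uniformly from all directions.
fn sky_color(ray: Ray) -> vec3f {
    return vec3(0.8);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~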
![Figure [white-furnace-test]: White furnace test](../images/img-53-white-furnace-test-1.png)
That clearly shows some darkening on all three spheres, with the middle one appearing the darkest. You should see
no darkening if you set `specular_or_ior` to either $1$ or $0$, so the material seems to darken for values that
are in-between.
Here is a plot of the current version with $C = 1$:
![Figure [darkening-reflectance-plot]: Darkening with diffuse reflectance defined as $D = C (1 - \text{metallic}) (1 - F).$](../images/fig-22-darkening-reflectance-plot.png)
The diffuse curve falls off quadratically, which explains the issue. This makes sense: $F$ already
factors in `metallic`, so multiplying by $(1 - \text{metallic})$ makes the diffuse term quadratic in `metallic`.
Let's see if we can come up with a formula that keeps the blue curve constant for all values of `metallic`. Let's define
the total reflectance as $T = S + D$, where $S$ is the specular and $D$ is the diffuse portion. I would like
to keep the total albedo always equal to the material color, so I'll set $T = C$ and solve for $D$:
$$
\begin{eqnarray}
D &=& C - F \\
D &=& C - F_0 - (1 - F_0)(1 - cos~\theta)^5 \\
\end{eqnarray}
$$
Let's consider normal incidence where $cos~\theta = 1$:
$$
\begin{eqnarray}
D &=& C - F_0 - (1 - F_0)(1 - 1)^5 \\
D &=& C - F_0 \\
\end{eqnarray}
$$
Let's rewrite $F_0$ in terms of the material parameters. We expressed this with the WGSL statement `let f0 = mix(GLASS_F0, material.color, metallic)`, which becomes $F_0 = F_0^{\text{glass}} + (C - F_0^{\text{glass}})~\text{metallic}$:
$$
\begin{eqnarray}
D &=& C - [F_0^{\text{glass}} + (C - F_0^{\text{glass}})~\text{metallic}] \\
D &=& (C - F_0^{\text{glass}}) - (C - F_0^{\text{glass}})~\text{metallic} \\
D &=& (C - F_0^{\text{glass}}) (1 - \text{metallic}) \\
\end{eqnarray}
$$
where $F_0^{\text{glass}}$ is our default $F_0$ for dielectrics. Here is a plot with this new definition of $D$:
![Figure [ok-reflectance-plot]: No darkening at normal incidence with $D = (C - F_0^{\text{glass}})(1 - \text{metallic}).$](../images/fig-23-ok-reflectance-plot.png)
Let's plug this in and see what happens:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
...
// Determine whether to use specular reflection.
var choose_specular = false;
var attenuation = material.color;
if is_transmissive {
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
let f0 = schlick_f0_from_ior(ref_ratio);
choose_specular = cannot_refract || schlick_fresnel(f0, cos_theta) > rand_f32();
if choose_specular {
attenuation = vec3(1.);
}
} else if material.specular_or_ior > 0. {
let metallic = material.specular_or_ior;
let f0 = mix(GLASS_F0, material.color, metallic);
let F = schlick_fresnel_vec3(f0, cos_theta);
choose_specular = 0.5 > rand_f32();
if choose_specular {
attenuation = F / 0.5;
} else {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
attenuation = (material.color - GLASS_F0) * (1. - metallic) / 0.5;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
}
}
...
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [wrong-diffuse-with-metallic]: [shaders.wgsl] New diffuse curve, factoring in `metallic`]
![Figure [white-furnace-test-2]: White furnace test with adjusted diffuse curve, showing brightening](../images/img-54-white-furnace-test-2.png)
Good news and bad news: the color looks correct towards the center of each sphere where the view direction is at
normal incidence. Unfortunately the color gets brighter at grazing angles. Such a material is said to not
_conserve energy_, as the material increases the total energy in the scene.
Note that we still need a Fresnel-like falloff to make sure that the diffuse term goes to zero at grazing angles.
Ideally, we should have $D = 0$ when $cos~\theta = 0$ and get the above solution when $cos~\theta = 1$. In-between
values should interpolate between those extremes following the same curve as the Schlick approximation, which uses
$(1 - cos~\theta)^5$ as the interpolant. So let's give that a shot.
Of course, we want to flip it by subtracting it from $1$, so that the factor evaluates to $0$ at grazing angles. This results in $D = (C - F_0^{\text{glass}})(1 - \text{metallic})[1 - (1 - cos~\theta)^5]$:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
...
// Determine whether to use specular reflection.
var choose_specular = false;
var attenuation = material.color;
if is_transmissive {
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
let f0 = schlick_f0_from_ior(ref_ratio);
choose_specular = cannot_refract || schlick_fresnel(f0, cos_theta) > rand_f32();
if choose_specular {
attenuation = vec3(1.);
}
} else if material.specular_or_ior > 0. {
let metallic = material.specular_or_ior;
let f0 = mix(GLASS_F0, material.color, metallic);
let F = schlick_fresnel_vec3(f0, cos_theta);
choose_specular = 0.5 > rand_f32();
if choose_specular {
attenuation = F / 0.5;
} else {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
let u = 1 - cos_theta;
let u5 = u * u * u * u * u;
attenuation = (material.color - GLASS_F0) * (1. - metallic) * (1. - u5) / 0.5;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
}
}
...
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [good-diffuse-with-metallic]: [shaders.wgsl] New diffuse curve, factoring in `metallic` and Schlick approximation]
This should result in the following image:
![Figure [white-furnace-test-3]: White furnace test with energy conserving materials](../images/img-55-white-furnace-test-3.png)
This result looks correct and there seems to be no energy loss or gain. I won't prove here why this formulation
is energy-conserving for all values of $cos~\theta$; I'll leave that to you as an exercise.[^ch9-footnote23] Let's
revert the scene back to the way it was before the white furnace test and take a look at the result:
![Figure [fixed-metallic]: Spheres with varying metalness](../images/img-56-fixed-metallic.png)
All three spheres look noticeably brighter. More importantly, they have the same color while only their reflectance
properties vary. The new formulation lets us control how metallic a surface appears by effectively reducing the
influence of diffuse scattering, and it ensures that the color remains consistent.
Incorporating "metallicness" as a configurable parameter is common practice for PBR (i.e. physically-based)
materials and supported by popular rendering tools like Unity, Unreal Engine, and Blender. You could provide
even more control and turn the specular $F_{90}$ and $F_0$ into configurable `Material` parameters as well.
Energy conservation isn't necessarily a requirement -- for instance, the popular Disney BRDF[^ch9-footnote24]
is not strictly energy conserving since it prioritizes artistic freedom over correctness.
Given its new meaning, it makes sense to rename `specular_or_ior` to `metallic_or_ior`:
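The full `Material` definition appears in an earlier chapter, so I won't repeat it here; as a sketch, the shader-side
change amounts to renaming the field (the Rust-side struct that fills the material buffer needs the same rename).
Only the fields used in this chapter are shown below -- your struct likely has more:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
struct Material {
    color: vec3f,
    // Renamed from `specular_or_ior`: negative values still encode an index of
    // refraction, while values in [0, 1] express how metallic the surface is.
    metallic_or_ior: f32,
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Every reference to `material.specular_or_ior` in `scatter` changes accordingly; the listing at the end of the
next section reflects the new name.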
One more thing: you may have noticed that we now compute the $(1 - cos~\theta)^5$ term twice. We can make a small
optimization and compute it once and reuse it across all branches:
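Concretely, the Fresnel helpers can take the precomputed term directly instead of recomputing it from `cos_theta`,
which is how the listing in the next section calls them (shown here as a sketch, with `scatter` abbreviated):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
fn schlick_fresnel(f0: f32, u5: f32) -> f32 {
    return mix(f0, 1., u5);
}

fn schlick_fresnel_vec3(f0: vec3f, u5: f32) -> vec3f {
    return mix(f0, vec3(1.), u5);
}

fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
    ...
    // The (1 - cos(theta))^5 term from the Schlick approximation, computed once
    // and shared by the Fresnel helpers and the diffuse term.
    let u = 1 - cos_theta;
    let u5 = u * u * u * u * u;
    ...
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~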
### A Better Sampling PDF
The code currently samples between specular and diffuse scattering with equal (50%) probability. This isn't efficient,
especially when one contributes a lot more to the final color than the other. When the material is perfectly metallic,
the diffuse reflectance gets scaled down to 0. Half of the sampled paths don't contribute any radiance while still
influencing the accumulated average in the radiance texture. Similarly, a perfectly dielectric surface wastes many
specular samples near normal incidence.
The ideal PDF for a specular sample should exactly match the proportion of specular reflectance relative to the total
reflectance. In particular, the specular probability should be $1$ at tangent incidence or when `metallic` equals $1$.
Fortunately for us, this is simply given by:
$$
\rho_{\text{specular}} = \frac{S}{S + D}
$$
$\rho_{\text{specular}}$ approaches $1$ as $D$ decreases, which can happen due to a dark material color, a high value
of `metallic`, or a high Fresnel factor.
We already compute $S$ and $D$ for the final `attenuation` value but we can't use these directly to compute a
probability because they are RGB triplets. Since we don't trace each color channel separately, we need a way to
convert the 3-dimensional RGB value into a scalar.
Fortunately there is a formula we can use. This formula converts a (linear) RGB value to a grayscale _luminance_
(i.e. the perceptual brightness of the color) based on a linear combination of the individual primary colors:[^ch9-footnote25]
$$
Y = 0.2126\,R + 0.7152\,G + 0.0722\,B
$$
The coefficient assigned to each color channel reflects the sensitivity of the human eye to that primary color based on
careful measurements (green being the highest and blue the lowest), giving us a measure of reflectance that is
weighted for human perception. It doesn't mean that a blue surface (R=0, G=0, B=1) is less reflective but it does appear
darker to us than a green surface when viewed on the same computer display.
We'll introduce a new function called `luminance` and use it to compute the specular sampling PDF
from the specular and diffuse attenuation factors:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
// Convert RGB to a grayscale luminance value
fn luminance(rgb: vec3f) -> f32 {
return dot(rgb, vec3(0.2126, 0.7152, 0.0722));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
...
fn scatter(input_ray: Ray, hit: Intersection, material: Material) -> Scatter {
...
// `ior` and `ref_ratio` only have meaning if the material is transmissive.
let is_transmissive = material.metallic_or_ior < 0.;
let ior = abs(material.metallic_or_ior);
let ref_ratio = select(ior, 1. / ior, is_front_face);
// The (1 - cos(theta))^5 term from the Schlick approximation.
let u = 1 - cos_theta;
let u5 = u * u * u * u * u;
// Determine whether to use specular reflection.
var choose_specular = false;
var attenuation = material.color;
if is_transmissive {
let cannot_refract = ref_ratio * ref_ratio * (1.0 - cos_theta * cos_theta) > 1.;
let f0 = schlick_f0_from_ior(ref_ratio);
choose_specular = cannot_refract || schlick_fresnel(f0, u5) > rand_f32();
if choose_specular {
attenuation = vec3(1.);
}
} else if material.metallic_or_ior > 0. {
let metallic = material.metallic_or_ior;
let f0 = mix(vec3(GLASS_F0), material.color, metallic);
let F = schlick_fresnel_vec3(f0, u5);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
let specular = F;
let diffuse = (material.color - GLASS_F0) * (1. - metallic) * (1. - u5);
let S = luminance(specular);
let D = luminance(diffuse);
let specular_pdf = S / (S + D);
choose_specular = specular_pdf > rand_f32();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
if choose_specular {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
attenuation = specular / specular_pdf;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
} else {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL highlight
attenuation = diffuse / (1. - specular_pdf);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
}
}
...
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [fresnel-and-color-weighted-specular-pdf]: [shaders.wgsl] The new specular PDF]
You should notice a significant difference in the amount of noise (especially when you move the camera).
Here is a close-up of the before and after. Both screenshots were taken after running the program for about 5 seconds:
 
That's a significant improvement. We have effectively reduced the number of samples required to resolve
the image with a simple change to our sampling strategy.
[^ch9-footnote22]: See Real-Time Rendering (Fourth Edition), Section 9.5 "Fresnel Reflectance"
[^ch9-footnote23]: It can be proved using algebra that $C \le T \le 1$. I also put together a handy [Desmos page](https://www.desmos.com/calculator/fqvqyydbja) where you can visualize the different formulas and play with their parameters.
[^ch9-footnote24]: See _Physically Based Shading at Disney_ by Brent Burley
[^ch9-footnote25]: This normalized form of luminance (where $1$ is the "perfect white") is called _relative luminance_. This specific formula comes from the ITU-R BT.709 standard used for color displays. The $Y$ refers to the first component in the $YC_BC_R$ color space which encodes video as the combination of "perceived brightness", "redness", and "blueness". This is closely related to the $YUV$ format used in analog television broadcast which is itself derived from the original CIE XYZ color space published in 1931. The history of color science and how color exchange formats evolved over the development of television and computer displays is quite fascinating.
Next Steps
----------
Our material system can combine multiple scattering types to express a mixture of metallic and dielectric surfaces.
This type of representation is highly common; in fact nearly all physically-based shading models that I'm familiar with
use a combination of diffuse and specular reflectance to emulate a variety of surface materials.
We won't go beyond Lambertian diffuse and perfectly specular scattering in this book; however, you can make your materials
a lot more expressive with some tweaks.
As a next step, think about adding a _roughness_ parameter to the `Material` struct. Perfectly specular reflection
and refraction give the appearance of a smooth surface but most objects are rough and don't act like perfect mirrors
(think of brushed metal or frosted glass). A simple implementation would slightly perturb the shading normal vector (stored in
the variable `N`) towards a random direction based on the material _roughness_ and use that to compute scattering.
You can refer to [_Ray Tracing In One Weekend, 10.6. Fuzzy Reflection_](https://raytracing.github.io/books/RayTracingInOneWeekend.html#metal/fuzzyreflection), which defines this as "fuzz" and perturbs the specular reflection vector instead of the normal.
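As a hypothetical sketch (this is not the book's code): assuming a new `roughness` field on `Material` and a helper
that returns a uniformly distributed random unit vector (here called `rand_unit_vector`), the perturbation could look
like this:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
// Hypothetical: jitter the shading normal by a random unit vector scaled by
// roughness, then use the result in place of `N` for the specular bounce.
// roughness = 0 keeps a mirror-like surface; larger values blur reflections.
let jittered_n = normalize(N + material.roughness * rand_unit_vector());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You would also want to guard against the jittered normal dipping below the surface (for example by rejecting samples
that point away from the geometric normal), otherwise rays can scatter into the object.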
An even better approach is to implement a _Microfacet_ BRDF. Microfacet theory defines a surface as numerous microscopic
patches with varying orientations based on its roughness. These "microfacets" act like many tiny mirrors and the theory
models their collective reflectance as a statistical distribution that can be sampled like a BRDF.
Here is a list of books and articles that I highly recommend if you want to dive deeper into the theory:
* _Real-Time Rendering (Fourth Edition)_ by Tomas Akenine-Möller et al. Chapter 9 is a comprehensive and
practical introduction to Physically Based Shading.
* _Crash Course in BRDF Implementation_ by Jakub Boksansky is an excellent introduction to BRDFs that covers
the fundamentals and provides many references to the literature ([link](https://boksajak.github.io/blog/BRDF)).
* The [course notes](https://blog.selfshadow.com/publications/s2013-shading-course/hoffman/s2013_pbs_physics_math_notes.pdf)
from the 2013 SIGGRAPH course _Physics and Math of Shading_ by Naty Hoffman.
* _Physically Based Shading at Disney_ by Brent Burley is a fundamental study of diffuse and specular reflectance based on
empirical observations. It describes the set of material parameters used at Disney (commonly known as the _Disney BRDF_)
and provides many visuals.
* _Physically Based Rendering: From Theory To Implementation (Fourth Edition)_ by Matt Pharr et al. I recommend
taking a look at the chapters on Reflection Models, Materials, and Light Transport.
This is by no means an exhaustive list but I hope it can provide some inspiration. We have really only scratched the
surface (no pun intended).
Field of View
=============
Let's rearrange the scene to show off our new materials. I'll position 5 spheres in a circle around a bigger one
and reposition the camera so that they are all in view. This should result in the following image:
![Figure [turtle-scene]: Spheres from above](../images/img-57-turtle-scene-from-above.png)
You may have noticed that the 5 smaller spheres surrounding the bigger one look slightly stretched
and egg-shaped. This distortion is more pronounced the closer the object is to the edges of the
frame. Here is the same scene from a different angle that increases the effect on the two closer
spheres:
![Figure [turtle-scene-angled]: Perspective distortion](../images/img-58-turtle-scene-from-above-2.png)
The pinhole camera model projects all visible points in the scene onto a single point along the camera rays
(see Figure [camera-view-space]). When the angle between a camera ray and the look vector (i.e. the vector
towards the center of the viewport) is large, the objects projected onto the image plane look more distorted.
The wider the angle, the bigger the distortion.
The extent of a camera's visible range is called its _field of view_ (or _FOV_ for short). The FOV is measured
as an angle, so a camera whose visible range spans the full range of directions side-to-side would have a
FOV of 180 degrees. The widest angle between a camera ray and the look vector in such a scenario would be
90 degrees.
![Figure [vertical-fov]: Vertical field of view at 90 degrees and 45 degrees with the same aspect ratio and focus distance](../images/fig-21-fov.svg)
A wide FOV conveys a sense of depth especially when objects are close to the camera. A narrow FOV will minimize
the distortion due to perspective but requires the camera to be positioned at a distance to keep more of the
scene in view. The angle is typically defined as either _horizontal_, _vertical_, or sometimes _diagonal_
relative to the image plane.
Our renderer currently doesn't have a notion of FOV. Remember that we defined the height of the viewport as $2$
-- with a vertical extent ranging from $-1$ to $1$ -- and computed the width of the viewport based on the provided
aspect ratio. We also set the distance between the camera origin and the viewport to $1$.
Now, consider a vertical plane that is parallel to the look vector and intersects the camera origin
and the viewport. The plane intersects the viewport at two points: one at its center and one on its top edge. These two
points and the camera origin form a right isosceles triangle with two 45 degree angles. The angle at the camera origin
spans only half of the vertical FOV, so the renderer's full vertical FOV is 90 degrees.
![Figure [vertical-fov-2d]: Our renderer currently has a FOV of 90 degrees and a viewport height of 2](../images/fig-22-fov-90-deg-triangle.svg)
Since we compute our camera rays based on points on the viewport, changing the height of the viewport will
also change the vertical field of view. If we want to achieve a specific FOV angle $\theta$ given focal
distance $d$, we can compute the viewport height $h$ with some basic trigonometry:
$$
\begin{eqnarray}
tan~\frac{\theta}{2} &=& \frac{h}{2d} \nonumber \\
h &=& 2d~tan~\frac{\theta}{2} \nonumber
\end{eqnarray}
$$
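As a quick sanity check, our current setup has $d = 1$ and $\theta = 90$ degrees, which recovers the viewport height
we have been using all along:
$$
h = 2 \cdot 1 \cdot tan~45^\circ = 2
$$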
Let's turn this into code. We'll compute the viewport height using this formula and derive the viewport width
from the aspect ratio. Then we'll simply change our UV mapping to use the new dimensions instead of the hardcoded
viewport size. For now let's hardcode 90 degrees as the vertical FOV and verify that we get the
same result:
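I won't reproduce the whole ray generation function here, but the change might look something like this sketch
(variable names such as `aspect_ratio` and the exact UV convention are assumptions based on the earlier camera code):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ WGSL
// Sketch: derive the viewport extents from the vertical FOV instead of
// hardcoding a half-height of 1. The focal distance is still 1.
let fov_y = radians(90.);
let half_height = tan(0.5 * fov_y);
let half_width = half_height * aspect_ratio;
// Scale the normalized viewport coordinates (which previously spanned [-1, 1]
// vertically) by these extents when constructing the camera ray direction.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~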
This should result in the same image as Figure [turtle-scene-angled]. `fov_y` is short for "FOV along
the y-axis" and represents the vertical field of view angle. Let's change it to 30 degrees:
![Figure [vertical-fov-30-deg]: Vertical FOV set to 30 degrees](../images/img-59-fov-30-deg.png)
Since the viewport spans a narrower view it also fits less of the scene from the same camera position.
Let's move the camera back a little:
![Figure [vertical-fov-30-deg-zoomed-out]: 30 degrees vertical FOV zoomed out](../images/img-60-fov-30-deg-zoomed-out.png)
Now all spheres are in view and they no longer look distorted.
Interactive FOV
---------------
Let's turn the vertical field of view constant into a field of the `CameraUniforms` structure. This will
allow us to adjust it interactively like the other camera parameters.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
...
use crate::algebra::Vec3;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
const DEFAULT_FOV_Y: f32 = 45_f32.to_radians();
const MIN_FOV_Y: f32 = 10_f32.to_radians();
const MAX_FOV_Y: f32 = 170_f32.to_radians();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
#[derive(Debug, Copy, Clone, Pod, Zeroable)]
#[repr(C)]
pub struct CameraUniforms {
origin: Vec3,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
fov_y: f32,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
u: Vec3,
_pad1: u32,
v: Vec3,
_pad2: u32,
w: Vec3,
_pad3: u32,
}
...
impl Camera {
...
pub fn from_spherical_coords(
center: Vec3,
up: Vec3,
distance: f32,
azimuth: f32,
altitude: f32,
) -> Camera {
let mut camera = Camera {
uniforms: CameraUniforms::zeroed(),
center,
up,
distance,
azimuth,
altitude,
};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
camera.uniforms.fov_y = DEFAULT_FOV_Y;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
camera.calculate_uniforms();
camera
}
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
pub fn adjust_fov(&mut self, delta: f32) {
let fov_y = self.uniforms.fov_y;
self.uniforms.fov_y = (fov_y + delta).clamp(MIN_FOV_Y, MAX_FOV_Y);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
fn calculate_uniforms(&mut self) {
...
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [fovy-uniform]: [camera.rs] `fov_y` uniform]
This time we're not going to use the mouse but the keyboard. Let's implement a handler for
`WindowEvent::KeyboardInput` that assigns the "up" and "down" arrow keys to increment and
decrement the vertical FOV by 1 degree:
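Here's a sketch of what that handler could look like. It assumes a winit 0.29-style keyboard API and that the event
loop has mutable access to the `Camera`; the surrounding event-loop structure is omitted:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
use winit::{
    event::{ElementState, WindowEvent},
    keyboard::{Key, NamedKey},
};
...
// Inside the existing `WindowEvent` match:
WindowEvent::KeyboardInput { event, .. } => {
    if event.state == ElementState::Pressed {
        let step = 1_f32.to_radians();
        match event.logical_key {
            Key::Named(NamedKey::ArrowUp) => camera.adjust_fov(step),
            Key::Named(NamedKey::ArrowDown) => camera.adjust_fov(-step),
            _ => {}
        }
    }
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Depending on how your application tracks camera changes, you will likely also want to mark the camera uniforms as
dirty (or reset the accumulated samples) here, just like the existing mouse handlers do.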
![Figure [vid-10-fovy-up-and-down]: (video) Interactive FOV](../images/vid-10-fovy-up-and-down.mp4 autoplay muted loop)
That feels like a real camera, doesn't it? Let's finally add a function to specify an optional initial FOV value
during construction:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
...
impl Camera {
...
pub fn from_spherical_coords(
center: Vec3,
up: Vec3,
distance: f32,
azimuth: f32,
altitude: f32,
) -> Camera {
...
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust highlight
pub fn with_fov(mut self, fov_y: f32) -> Self {
self.uniforms.fov_y = fov_y;
self
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Rust
pub fn uniforms(&self) -> &CameraUniforms {
&self.uniforms
}
...
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Listing [with-fov-ctor]: [camera.rs] The `with_fov` builder method]
(insert acknowledgments.md.html here)
References
==========
[#AlanWolfe2024]: Alan Wolfe, [*Beyond White Noise for Real-Time Rendering*](https://youtu.be/tethAU66xaA?si=qIPEwF5XTm8kO3tF)
[#CookTorrance1982]: Robert L. Cook, Kenneth E. Torrance, *A Reflectance Model for Computer Graphics*, 1982
[#Hughes13]: J.F. Hughes, A. van Dam, M. McGuire, D.F. Sklar, J.D. Foley, S.K. Feiner, K. Akeley, *Computer Graphics: Principles and Practice, 3rd Edition, Section 1.6*
[#Immel86]: David S. Immel, Michael F. Cohen, Donald P. Greenberg, *A Radiosity Method For Non-Diffuse Environments*
[#Jenkins13]: Bob Jenkins, [*A Hash Function for Hash Table Lookup*](https://www.burtleburtle.net/bob/hash/doobs.html), 2013
[#Kajiya86]: James T. Kajiya, *The Rendering Equation*, 1986
[#Lambert1760]: Johann Heinrich Lambert, *Photometria sive de mensura et gradibus luminis, colorum et umbrae*, 1760. Courtesy of ETH-Bibliothek Zürich, Switzerland.
[#Marsaglia03]: George Marsaglia, [*Xorshift RNGs*](https://www.jstatsoft.org/article/download/v008i14/916), 2003
[#McGuire2024GraphicsCodex]: Morgan McGuire, *The Graphics Codex*, 2024
[#Moller2018]: Tomas Akenine-Möller, Eric Haines, Naty Hoffman, Angelo Pesce, Michal Iwanicki, Sébastien Hillaire, *Real-Time Rendering (Fourth Edition)*
[#Pharr2023]: Matt Pharr, Wenzel Jakob, and Greg Humphreys, *Physically Based Rendering: From Theory To Implementation, 4th Edition*
[#Saikia]: Sakib Saikia, [*Deriving Lambertian BRDF from first principles*](https://sakibsaikia.github.io/graphics/2019/09/10/Deriving-Lambertian-BRDF-From-First-Principles.html)
[#Schlick1994]: Christophe Schlick, *An Inexpensive BRDF Model for Physically-based Rendering*, 1994
[#Shirley2019]: Peter Shirley et al, [*Sampling Transformations Zoo*](https://research.nvidia.com/labs/rtr/publication/shirley2019sampling/)
[^ericson]: C. Ericson, Real Time Collision Detection
[^mcguire-codex]: https://graphicscodex.courses.nvidia.com/app.html
[Arman Uguray]: https://github.com/armansito
[Steve Hollasch]: https://github.com/hollasch
[Trevor David Black]: https://github.com/trevordblack
[RTIOW]: https://raytracing.github.io/books/RayTracingInOneWeekend.html
[RTTROYL]: https://raytracing.github.io/books/RayTracingTheRestOfYourLife.html
[rt-project]: https://github.com/RayTracing/
[gt-project]: https://github.com/RayTracing/gpu-tracing/
[gt-template]: https://github.com/RayTracing/gpu-tracing/blob/dev/code/template
[discussions]: https://github.com/RayTracing/gpu-tracing/discussions/
[dxr]: https://en.wikipedia.org/wiki/DirectX_Raytracing
[vkrt]: https://www.khronos.org/blog/ray-tracing-in-vulkan
[rtiow-cuda]: https://developer.nvidia.com/blog/accelerated-ray-tracing-cuda/
[webgpu]: https://www.w3.org/TR/webgpu/
[Rust]: https://www.rust-lang.org/
[rust-unsafe]: https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html
[wgpu]: https://wgpu.rs