
Configuring our Unreal project to precompile PSOs

In this post, we're going to cover actually using Unreal Engine's PSO caching in a project once it has been enabled.

Before we begin, I'm assuming the project is already configured to enable the shader pipeline cache, with r.ShaderPipelineCache.Enabled set in the device profiles window.

For more information, my personal blog has a post that covers configuring this feature.

On to actually implementing PSO precompilation at startup. For this project, we're simply going to wait on the title screen until the shaders are compiled.

Set Up

To precompile all of our shaders before the game starts, we want a few things.

Firstly, we want a lightweight title screen. If our title screen is too heavy on shaders, we'll have to pause before we can even start rendering the title screen. If we want a heavier screen, we may want to save it for the menu.

Secondly, we can configure the shader pipeline cache to precompile on startup with r.ShaderPipelineCache.PreOptimizeEnabled. This is optional, as it's easy to flip the pipeline cache into precompile mode in code anyway.

Thirdly, we want a widget that displays a progress bar. Players will probably be slightly confused if the game seemingly freezes for a long period of time.

Widget

For the widget, we're going to create a new User Widget.

Image showing loading bar across the bottom of the screen
Progress Bar

In this widget, I've placed a progress bar across the bottom and created a small function that I can call to update its percentage.

Image showing the update function
Update Function

We need to add this to the viewport on our title screen, which is done with a call to Add to Viewport in blueprint.

Image showing adding the created widget to the viewport
Creating the widget for the local player and adding it to the viewport

Don't worry too much about the use of player index 0 here. It's the first screen and I'm not in a networked environment at this point.

Be sure to promote the widget created by Create UI Shader Bar Widget to a variable as we'll use the reference later to update the percent.

PSO

For this portion, we need to delve into C++. I'm not aware of blueprint calls that can achieve what we want out-of-the-box.

When you want to precompile everything, I recommend you set MaxPrecompileTime to some non-zero number for testing. When waiting on FShaderPipelineCache::NumPrecompilesRemaining(), the returned value is based on an interpolation from the opening of the cache to MaxPrecompileTime. This essentially leads to the compile always taking MaxPrecompileTime when used this way. For actual use in a title, you may want to set max time to zero and rely on the NumPrecompilesRemaining function to check when the compilation is finished.
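To make the effect of MaxPrecompileTime concrete, here's a tiny standalone model of the interpolation as I've described it. This is not the engine's code: EstimateRemaining and its parameters are names I've made up for illustration, and the real logic inside FShaderPipelineCache may well differ.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical model, NOT engine code: the reported "remaining" count when
// it is interpolated over a fixed time budget instead of counted directly.
std::uint32_t EstimateRemaining(std::uint32_t totalTasks,
                                std::uint32_t actualRemaining,
                                double elapsedSeconds,
                                double maxPrecompileTime)
{
    if (maxPrecompileTime <= 0.0)
    {
        // No time budget (assumed behaviour): report the real outstanding count.
        return actualRemaining;
    }

    // Fraction of the budget still left, clamped to [0, 1].
    const double fractionLeft =
        std::max(0.0, std::min(1.0, 1.0 - elapsedSeconds / maxPrecompileTime));

    // The real count is ignored here; only elapsed time drives the value.
    return static_cast<std::uint32_t>(totalTasks * fractionLeft);
}
```

With a 10 second budget, this model still reports 500 of 1000 tasks at the 5 second mark even if nothing is actually left to compile, which is why waiting for zero ends up taking the full MaxPrecompileTime.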

Before I cover that, let's discuss a thorny issue with PSO caching. As far as I can tell, if another, new PSO is discovered during PSO compilation, the current compilation will halt if MaxPrecompileTime is greater than zero. If you do things correctly during the PSO logging process, you might not notice this as the .psc file will contain the shaders it needs.

If, however, you retrofit a title screen onto a game that was already doing PSO caching without renewing the cache, that screen might log a new PSO and then halt the process. Bear in mind that I'm specifically waiting until I have 0 shaders left to compile before I load the menu.

You can get around this issue with a quick r.ShaderPipelineCache.Open or r.ShaderPipelineCache.MaxPrecompileTime 0 to continue, so it's not insurmountable, but it is slightly annoying. If you want to generate PSOs during the run, use the latter, as I'm not sure whether things will write back correctly if we reopen the cache.

And as a second note: my test project launcher profile doesn't appear to want to actually pass -logpso to the executable, or rather, it passes it, and it has no effect. I used cmd to set the level and pass the PSO argument, which does activate PSO logging. It also unsticks the halting issue above, so you probably won't need to issue the workaround commands.

Anyway, let's get to actually setting it up.

Unreal doesn't, as far as I know, have a way to expose the state of compilation to blueprint. We want it in blueprint to drive the UI, as it's just easier that way. What we really want is a blueprint node that fires when an update occurs.

Now, there is a class in Unreal called UBlueprintAsyncActionBase that I've used to implement the blueprint bindings. We want the asynchronous nature of this type of node as we want to avoid stalling the game thread. As an aside, we probably also want to set a 30-fps limit for this menu: there's little happening and we have no reason to go crazy with GPU load.

The code below is a variant of what I'm using to wait for the game to finish precompilation. Our waiting loop is implemented in WaitForBatch using the timer manager. If we did compile more shaders, we call an OnUpdate callback, and if we finish compilation, we call an OnComplete callback.

FShaderPipelineCache::SetBatchMode(FShaderPipelineCache::BatchMode::Precompile);
if (FShaderPipelineCache::IsBatchingPaused())
{
	FShaderPipelineCache::ResumeBatching();
}

// Wait for X shaders to compile, unless we got a zero
if ((-1 == MaxShaders || MaxShaders > 0) && World)
{
	ReferenceCurrentShaders = FShaderPipelineCache::NumPrecompilesRemaining();

	if (0 == ReferenceCurrentShaders)
	{
		// Don't trip update callback
		ExecuteCompleted(0);
		return;
	}
	
	World->GetTimerManager().SetTimer(StepHandler, this, &UPSOLevelLoadHelper::WaitForBatch, 0.35f, true);
	return;
}

In our timer manager call, the 0.35f specifies that we wait for 350ms between the calls to WaitForBatch. This delay is a bit longer than I'd use when shipping something, but it works for testing. We could have this update fire at 0.033f (33ms) to line it up better with a 30-fps frame rate, but it won't be perfect.

A better way to have the update sync to the framerate is to use ...SetTimerForNextTick(). This solution also scales, but it does require resetting the timer in WaitForBatch as next tick only triggers once. The flip-side of this requirement is that SetTimer requires that we clear the timer when we complete the compilation.
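Here's a rough standalone sketch of that re-arming pattern. NextTickQueue is a hypothetical stand-in I've written to mimic a one-shot "run on the next tick" call; it is not Unreal's timer manager, and PollEveryTick is likewise just an illustrative name.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a one-shot "next tick" scheduler.
// It only mimics the fire-once behaviour; it is not Unreal's FTimerManager.
class NextTickQueue
{
public:
    void RunNextTick(std::function<void()> callback)
    {
        Pending.push_back(std::move(callback));
    }

    // Runs everything queued for this tick. Callbacks queued while ticking
    // land in the following tick, not the current one.
    void Tick()
    {
        std::vector<std::function<void()>> current;
        current.swap(Pending);
        for (auto& callback : current)
        {
            callback();
        }
    }

    bool HasPending() const { return !Pending.empty(); }

private:
    std::vector<std::function<void()>> Pending;
};

// Polls `remaining` once per tick and re-arms itself until it reads zero.
void PollEveryTick(NextTickQueue& queue,
                   std::function<int()> remaining,
                   std::function<void()> onDone)
{
    queue.RunNextTick([&queue, remaining, onDone]()
    {
        if (remaining() == 0)
        {
            onDone();
            return; // not re-armed, so there is no timer left to clear
        }
        PollEveryTick(queue, remaining, onDone); // re-arm for the next tick
    });
}
```

The trade-off shows up in the early return: a one-shot callback that doesn't re-arm simply stops, whereas a looping timer keeps firing until it is explicitly cleared.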

In my code, the removal of the timer is handled in ExecuteCompleted, which also calls the OnComplete callback and calls SetReadyToDestroy(). The latter is required for proper operation of the async node class.
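Since WaitForBatch itself isn't shown, here's an engine-independent sketch of what a single polling step might look like. PrecompileWatcher, Begin and Step are hypothetical names for illustration only, and the progress calculation (one minus remaining over the reference count) is my assumption about how you'd feed a progress bar, not a copy of my actual class.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical, engine-independent model of one polling step.
struct PrecompileWatcher
{
    std::uint32_t Reference = 0;          // count captured when waiting began
    std::uint32_t Last = 0;               // count seen on the previous poll
    std::function<void(float)> OnUpdate;  // receives progress in [0, 1]
    std::function<void()> OnComplete;

    void Begin(std::uint32_t remaining)
    {
        Reference = remaining;
        Last = remaining;
    }

    // Returns true once compilation has finished and polling should stop.
    bool Step(std::uint32_t remaining)
    {
        if (remaining == 0)
        {
            // Finished: fire the completion callback only, no update.
            if (OnComplete) { OnComplete(); }
            return true;
        }

        if (remaining < Last && Reference > 0)
        {
            // Some shaders compiled since the last poll: report progress.
            Last = remaining;
            if (OnUpdate)
            {
                OnUpdate(1.0f - static_cast<float>(remaining) /
                                    static_cast<float>(Reference));
            }
        }
        return false;
    }
};
```

In the real node, the value passed to Step would come from FShaderPipelineCache::NumPrecompilesRemaining(), and a true result is the point at which the timer is cleared and the node marked ready to destroy.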

A quick and curious note about the D3D12 related cache warnings you may see when launching with PSO enabled: neither r.D3D12.PSO.DiskCache nor r.D3D12.PSO.DriverOptimizedDiskCache that get mentioned appear to exist at present.