Shadow Overhaul #7529

Open
BMagnu wants to merge 31 commits into scp-fs2open:master from BMagnu:shadow_overhaul_2

@BMagnu BMagnu commented Jun 16, 2026

Member

This refactors how FSO handles shadows pretty fundamentally.

First, the user-facing changes:

  • The cascade count is now dynamic: users are no longer locked to 4 for cockpits / main scene, and more or fewer can be used as appropriate
  • The smoothing applied to each cascade is now table-configurable
  • VSM is replaced by PCSS, resulting in less light leakage and fewer halo artifacts
  • The shadow cascades can now share data. Notably, the player ship now casts a shadow onto both the cockpit and the main scene, and the main scene casts a shadow onto the player cockpit (and show ship, given the correct table settings).

Then the performance enhancements (with reference to the points in #7499):

  • Change to PCSS, letting the hardware do the sampling (point 4)
  • Accordingly, shadow maps are now kept only in Depth24 rather than RGBA32 + Depth32, reducing memory from 1.3GB to 400MB at top resolution and cutting bandwidth even further, since we only write once per frame rather than twice (points 6 and 7; from my testing, this and the last bullet point likely account for most of the performance gain)
  • Rendering shadow maps without geometry shaders (point 1)
  • Rendering shadows in their own, slim, non-side-effecty render-pass with maximum batching (point 2, 12)
  • Separate shadow shaders with less UBO overhead (point 15)
  • Cockpit cascades are rendered alongside main cascades in a single pass (point 5)
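
For readers unfamiliar with PCSS, the penumbra estimate at its core can be sketched on the CPU as below. This is the textbook similar-triangles formula for an area light, not code from this PR: the function name, the parameters, and the units are assumptions made for illustration, and the actual shader may estimate the penumbra differently (for example for a directional light). The estimated width is what sizes the filter kernel, whose individual taps can then be depth comparisons done by the texture hardware.

```cpp
#include <cassert>

// Textbook PCSS penumbra estimate (similar triangles for an area light).
// All names are illustrative and are not taken from the FSO code.
//   receiver_depth: light-space depth of the fragment being shaded
//   blocker_depth:  average light-space depth of the occluders found
//                   by the blocker search
//   light_size:     apparent size of the light, in filter-radius units
float pcss_penumbra_width(float receiver_depth, float blocker_depth, float light_size)
{
    assert(blocker_depth > 0.0f);
    // No occluder in front of the fragment: fully lit, no filtering needed.
    if (blocker_depth >= receiver_depth)
        return 0.0f;
    // Penumbra grows with the receiver/blocker gap and with the light size.
    return (receiver_depth - blocker_depth) * light_size / blocker_depth;
}
```

A fragment close behind its occluder gets a narrow kernel (hard contact shadow), while one far behind it gets a wide kernel (soft shadow).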

This results in a good overall speedup and, as such, closes #7499. Note that on my machine, shadows are rarely the bottleneck because my GPU is fairly overkill, especially in the physics-heavy scenes where the framerate drops low. The benchmark is Icarus:
[Image: Icarus Benchmark]
I've had informal tests from @wookieejedi on FotG (which benefits doubly due to the second large buffer clear for the cockpit shadow pass being optimized out entirely), with reports of the framerate effectively doubling from the low 30s to consistently over 60.

Backend-wise, this PR also has some cleanup and fixes.

  • Shadow generation now has its own shader and render pipeline, disentangling two render paths that are almost completely divergent except for both rendering models.
  • At the same time, 3D-batched rendering is now abstracted into a common batching class, should we ever need it somewhere else as well.
  • OpenGL buffer binding logic was slightly incorrect. Until now that wasn't an issue, but with specific usage as needed in this PR, this would cause trouble.

@BMagnu BMagnu added the enhancement, cleanup, graphics, opengl, and Waiting for Stable labels Jun 16, 2026
@SamuelCho

Contributor

Nice job, I like what I'm seeing for the most part. But I don't know if I like having two separate model drawing pipelines. It seems like a potential point of divergence if modifications get added to the color model drawing functions but aren't reflected in the shadow pass, or vice versa. Seems like a potential maintenance hazard. Virtually all engines I've worked on share the same model drawing path between color and shadows, so I don't see why we need to split it here.

Everything else seems good. One recommendation is to randomly rotate the Poisson disc per fragment to get even better PCSS results.
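
A minimal CPU-side sketch of that rotation idea, for illustration only: each fragment derives an angle from its pixel coordinate and rotates the whole disc by it before sampling. The hash, the `Vec2` type, and both function names are assumptions rather than anything in the PR; a shader would more likely read the angle from a small noise texture or a noise function.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

struct Vec2 { float x, y; };

// Cheap integer hash of the pixel coordinate, mapped to an angle in
// [0, 2*pi). The constants are arbitrary mixing constants.
float rotation_angle(std::uint32_t px, std::uint32_t py)
{
    std::uint32_t h = (px * 73856093u) ^ (py * 19349663u);
    h ^= h >> 16;
    h *= 0x45d9f3bu;
    h ^= h >> 16;
    return static_cast<float>(h & 0xFFFFu) / 65536.0f * 6.28318530718f;
}

// Rotate every tap of the disc by the same per-fragment angle.
// Tap lengths are preserved, so the filter radius does not change.
template <std::size_t N>
std::array<Vec2, N> rotate_disc(const std::array<Vec2, N>& disc, float angle)
{
    assert(std::isfinite(angle));
    const float c = std::cos(angle);
    const float s = std::sin(angle);
    std::array<Vec2, N> out{};
    for (std::size_t i = 0; i < N; ++i)
        out[i] = { disc[i].x * c - disc[i].y * s,
                   disc[i].x * s + disc[i].y * c };
    return out;
}
```

Because neighbouring fragments sample with differently oriented discs, the regular banding of a fixed tap pattern is traded for high-frequency noise, which is usually less objectionable.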

BMagnu commented Jun 17, 2026

Member Author

The reason I did this was because the old combined pipeline was a stateful mess, with 90% of the main pipeline not relevant to shadows.
It might be feasible to use the same pass, but that certainly involves a lot more cleanup, ideally a full refactor of how shadow-gen statefulness is handled, and a different dispatch for the render queue, since we really don't want the full material info in the shadowgen shader if we are to benefit from proper batching.

Also, after all, the only thing they really share deep down is the submodel iteration, and transform handling.
Everything else is subtly different and divergent anyways, and would need branches: transparent texture handling, detailbox compute (currently handled the same, but long term, detailboxes for shadows should not be relative to the eye position, but relative to the light direction), material handling, objecttype handling...

SamuelCho commented Jun 17, 2026

Contributor

Yeah if I were to redo the model queue code now, we'd just queue all the models once then reuse the draw list for different passes. You can sort of see glimpses of that with regards to the opaque vs transparency passes. Ideally we would rely on one pipeline and queue draws into shadows, opaque, then transparency all at once with each add_buffer_draw().
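
A rough sketch of the "queue once, reuse per pass" idea, under stated assumptions: `PassMask`, `BufferDraw`, `DrawList`, and the parameters of this `add_buffer_draw()` are made up for illustration and do not match the engine's real signatures.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical pass flags; not FSO's actual types.
enum PassMask : std::uint32_t {
    PASS_SHADOW      = 1u << 0,
    PASS_OPAQUE      = 1u << 1,
    PASS_TRANSPARENT = 1u << 2,
};

struct BufferDraw {
    int           buffer_id;        // which vertex/index buffer to draw
    int           transform_offset; // slot in the shared transform buffer
    std::uint32_t passes;           // every pass this draw takes part in
};

class DrawList {
public:
    // Queue a draw once, tagged with every pass it belongs to.
    void add_buffer_draw(int buffer_id, int transform_offset,
                         bool casts_shadow, bool transparent)
    {
        assert(buffer_id >= 0 && transform_offset >= 0);
        std::uint32_t passes = transparent ? PASS_TRANSPARENT : PASS_OPAQUE;
        if (casts_shadow)
            passes |= PASS_SHADOW;
        draws_.push_back({buffer_id, transform_offset, passes});
    }

    // Each pass filters the same list instead of re-walking the models.
    std::vector<BufferDraw> draws_for(PassMask pass) const
    {
        std::vector<BufferDraw> out;
        for (const BufferDraw& d : draws_)
            if (d.passes & pass)
                out.push_back(d);
        return out;
    }

private:
    std::vector<BufferDraw> draws_;
};
```

The shadow pass would then consume `draws_for(PASS_SHADOW)` and only ever look at the buffer and transform slot, never at material state.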

But yeah I get why you did what you did though. At the very least, perhaps the add_model_draw method could be shared by the regular model draw and the shadow model draw? Conceptually the regular model draw is basically the new shadow add_model_draw just with extra crap. Maybe we can sweep all the extra legacy crap into a function or two and that could be the delta?

But I get it if you don't find that worth doing? Maybe once I'm done stabilizing the Vulkan PR I can try to take a stab at what I described earlier in this post.

BMagnu commented Jun 17, 2026

Member Author

That would amount to merging model_render_children_buffers from modelrender.cpp with render_submodel_children from shadows.cpp.
model_render_queue and add_model_draws are conceptually similar in the sense that they push the transform buffer, check root-submodel detailboxes, and submit the geometry to the buffer, but that's where it ends.
There are so many extra gubbins in there, all of which are not necessary for shadows. It supports unbatched drawing, which I really don't feel like making possible to do with shadows. Mostly, model_material reaaaally should not be interacting with the shadow pipeline, because it carries so much data and state that's outright incompatible with the shadow pipeline.

I feel like, the only sensible way to do it is as you describe.
Have different queues for transparent, main scene, and shadows, each with their own material definitions.
Walk the tree in a generic way, with callbacks for each to process (since something that might get culled for shadows may not get culled for main scene or the other way around, and almost all of the stuff done in these is pass-specific material and texture handling), while going for:

  1. The tree-walking not being affected by global state, or otherwise being stateful in any way
  2. The code that adds draws to the queues does not interact with, and is not interwoven between, different queues. Any common code (which is pretty much just adding the transform buffer and maybe detailbox handling; the entire rest of the model handling code and queue batching is pretty much fully divergent) must be in external functions or be part of the superstructure that calls the individual type handlers.
  3. We kick some old legacy handling out of the codebase (do we really need non-batched rendering? It'd be so much simpler if we just rendered all models the same way using the transform buffer)
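
To make the shape of that concrete, here is a minimal sketch of a stateless walk with per-pass callbacks. `Submodel`, `SubmodelVisitor`, and `walk_submodels` are hypothetical stand-ins, not the existing model structures or functions.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a model's submodel hierarchy.
struct Submodel {
    int              id;
    std::vector<int> children; // indices into the submodel array
};

// Per-pass handler: does its own culling and queueing, and returns
// false to skip this submodel's children.
using SubmodelVisitor = std::function<bool(const Submodel&, int depth)>;

// Walks the tree using only its arguments: no globals, no hidden state.
void walk_submodels(const std::vector<Submodel>& submodels, int index,
                    const SubmodelVisitor& visit, int depth = 0)
{
    assert(index >= 0 && index < static_cast<int>(submodels.size()));
    const Submodel& sm = submodels[index];
    if (!visit(sm, depth))
        return;
    for (int child : sm.children)
        walk_submodels(submodels, child, visit, depth + 1);
}
```

The shadow pass and the main pass would each hand in their own visitor, so a submodel culled for shadows can still be drawn in the main scene (or the other way around) without either handler knowing about the other.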

I could try to merge the child-walking functions and maaaaaybe the parent render_model_queue, but I'll need to think if I can find a good way to make sure that neither model_material nor model_render_params touch the shadow side of the code, since that's just mostly invalid for shadows.

However, I feel like it's not worth it, for the precise reason that if you do decide to redo the rendering as you describe, that'd make this all obsolete anyways. And if you don't, I probably will do that once I'm through my current ToDo list, though for sure in a different PR, as that needs its own dedicated testing and would be a long PR in itself that we don't just want to tack onto this one.

@BMagnu BMagnu force-pushed the shadow_overhaul_2 branch from f6e65d8 to 00abd57 June 18, 2026 01:30

Development

Successfully merging this pull request may close these issues.

Shadow performance audit

2 participants