Blender Git Log

Blender Git "tmp-drw-callbatching" branch commits.


August 17, 2019, 12:48 (GMT)
GPU: Add API to use multidrawindirect using GPUbatch

This new API records a list of commands that use the same batch and submits
them to the GPU in one call.
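
A minimal sketch of the record-then-submit idea, not the actual GPUBatch API. The command struct uses the field layout OpenGL defines for indexed indirect draws. `CommandList`, `command_list_record` and `command_list_submit` are hypothetical names, and the submit step only stands in for a single multi-draw-indirect call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field layout as OpenGL's DrawElementsIndirectCommand. */
typedef struct IndirectCommand {
  uint32_t count;          /* Number of indices to draw. */
  uint32_t instance_count; /* Number of instances. */
  uint32_t first_index;    /* Offset into the index buffer. */
  int32_t base_vertex;     /* Value added to each index. */
  uint32_t base_instance;  /* First instance, usable as a resource index. */
} IndirectCommand;

#define COMMAND_LIST_MAX 64

/* Hypothetical recorder: every command refers to the same batch. */
typedef struct CommandList {
  IndirectCommand commands[COMMAND_LIST_MAX];
  size_t len;
} CommandList;

static void command_list_clear(CommandList *list)
{
  list->len = 0;
}

/* Record one command. Returns 0 on success, -1 if the list is full. */
static int command_list_record(CommandList *list,
                               uint32_t index_count,
                               uint32_t instance_count,
                               uint32_t base_instance)
{
  if (list->len == COMMAND_LIST_MAX) {
    return -1;
  }
  IndirectCommand *cmd = &list->commands[list->len++];
  cmd->count = index_count;
  cmd->instance_count = instance_count;
  cmd->first_index = 0;
  cmd->base_vertex = 0;
  cmd->base_instance = base_instance;
  return 0;
}

/* Stand-in for the submission step. Real code would upload `commands` to an
 * indirect buffer and issue one multi-draw-indirect call for all of them.
 * Returns the number of commands covered by that single call. */
static size_t command_list_submit(CommandList *list)
{
  size_t submitted = list->len;
  command_list_clear(list);
  return submitted;
}
```

The point of the design is that the per-call driver cost is paid once per list rather than once per recorded command.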
August 17, 2019, 12:48 (GMT)
DRW: Add draw call sorting

This makes rendering lots of similar objects much faster (with lower CPU

29 fps -> 38 fps
34 ms -> 26 ms
In my test case with 30K instances of 4 different meshes
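
As a rough illustration of why sorting helps, the sketch below orders calls by batch and then by resource ID so that calls using the same mesh end up adjacent. `DrawCallSketch` and its two keys are assumptions made for the example. The commit does not state which keys the real sort uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical draw call: which mesh to draw and which resource slot it uses. */
typedef struct DrawCallSketch {
  uint32_t batch_id;
  uint32_t resource_id;
} DrawCallSketch;

/* Order by batch first, then by resource ID inside each batch. */
static int draw_call_compare(const void *a, const void *b)
{
  const DrawCallSketch *ca = a;
  const DrawCallSketch *cb = b;
  if (ca->batch_id != cb->batch_id) {
    return (ca->batch_id < cb->batch_id) ? -1 : 1;
  }
  if (ca->resource_id != cb->resource_id) {
    return (ca->resource_id < cb->resource_id) ? -1 : 1;
  }
  return 0;
}

static void draw_calls_sort(DrawCallSketch *calls, size_t len)
{
  qsort(calls, len, sizeof(*calls), draw_call_compare);
}
```

After sorting, runs of calls that share a batch and have consecutive resource IDs sit next to each other, which is what a later merge step needs.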
August 17, 2019, 12:48 (GMT)
Workbench: Remove object_id and optimize material hash generation

This greatly reduces the shgroup count when rendering with outlines.
In my test case (30K Suzannes with random instancing, 5 materials) it went
from 27 to 39 fps (playback performance, no update).
August 17, 2019, 12:48 (GMT)
Workbench: Use resource_id instead of own index

This only modifies the shader and should not bring any real improvement by itself.
However, it makes it possible to reduce the shgroup count drastically
when outline rendering is enabled.
August 17, 2019, 12:48 (GMT)
DRW: Make workaround for drivers with broken gl_InstanceID
August 17, 2019, 12:48 (GMT)
Cleanup: GPUBatch: rename arguments
August 17, 2019, 12:48 (GMT)
Edit Curve: Fix curve normals
August 17, 2019, 12:48 (GMT)
DRW: Remove common_view_lib uniform default values

Those default values mean the uniforms are never disabled, so they
are updated even when they are not used.
August 17, 2019, 12:48 (GMT)
DRW: Use int instead of uint for DRWCall

This lets us tag non-instancing calls with -1.
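
A small sketch of the tagging idea, with hypothetical names (`TaggedCall`, `CALL_NO_INSTANCING`). A signed count leaves 0 available as a valid instance count while -1 marks a regular call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tag value: the call does not use instancing at all. */
#define CALL_NO_INSTANCING (-1)

typedef struct TaggedCall {
  int vertex_count;
  int instance_count; /* CALL_NO_INSTANCING for regular calls. */
} TaggedCall;

static bool call_uses_instancing(const TaggedCall *call)
{
  return call->instance_count != CALL_NO_INSTANCING;
}

/* Number of copies actually drawn: a regular call draws exactly one. */
static int call_effective_instances(const TaggedCall *call)
{
  return call_uses_instancing(call) ? call->instance_count : 1;
}
```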
August 17, 2019, 12:48 (GMT)
Object Mode: Add back lightprobe selection outlines
August 17, 2019, 12:48 (GMT)
DRW: Add builtin uniform to get full DRWResourceHandle from shader

This solves the issue of losing outlines around meshes that have
the same ID. Now it needs to have 16K objects in the scene for that to
August 17, 2019, 12:48 (GMT)
Object Mode: Outlines: Rewrite id pass generation

This makes the ID pass for outline detection use the new
resource_id in order to differentiate drawcalls.

Since drawcall IDs are in the range [0..511], objects
with the same ID will have their outlines merged. This will be fixed in
the next commit.

Lightprobes have their outlines disabled for now.
August 17, 2019, 12:48 (GMT)
DRW: Refactor to support draw call batching

This refactor improves the CPU/memory efficiency of the draw structures and lowers the
driver overhead of issuing many drawcalls.

- Model Matrix is now part of big UBOs that contain 1024 matrices.
- Object Infos follow the same improvement.
- Matrices are indexed by gl_BaseInstanceARB or a fallback uniform.
- All these resources use a single 32-bit identifier (DRWResourceHandle).
- DRWUniform & DRWCall are allocated in chunks to improve cache coherence & memory usage.
- DRWUniform now supports up to vec4_copy.
- Draw calls are batched together if their resource IDs are consecutive.
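
The handle and merging rules above can be sketched as follows. The bit layout is an assumption for illustration only: 10 low bits for the index inside a 1024-entry UBO, the remaining bits for the chunk. The real DRWResourceHandle layout is not given in this log. `merge_consecutive` is a hypothetical helper that expects handles already sorted for one batch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ResourceHandle;

/* Assumed layout: 10 bits of index (1024 entries per UBO), rest is the chunk. */
#define HANDLE_ID_BITS 10u
#define HANDLE_ID_MASK ((1u << HANDLE_ID_BITS) - 1u)

static ResourceHandle handle_make(uint32_t chunk, uint32_t id)
{
  return (chunk << HANDLE_ID_BITS) | (id & HANDLE_ID_MASK);
}

static uint32_t handle_chunk(ResourceHandle handle)
{
  return handle >> HANDLE_ID_BITS;
}

static uint32_t handle_id(ResourceHandle handle)
{
  return handle & HANDLE_ID_MASK;
}

/* One merged draw: `instance_count` resources starting at `first`. */
typedef struct DrawRange {
  ResourceHandle first;
  uint32_t instance_count;
} DrawRange;

/* Merge runs of consecutive handles that live in the same chunk (same UBO).
 * `out` must have room for `len` ranges. Returns the number of ranges. */
static size_t merge_consecutive(const ResourceHandle *handles, size_t len, DrawRange *out)
{
  size_t range_count = 0;
  for (size_t i = 0; i < len; i++) {
    if (range_count > 0) {
      DrawRange *last = &out[range_count - 1];
      bool consecutive = handles[i] == last->first + last->instance_count;
      bool same_chunk = handle_chunk(handles[i]) == handle_chunk(last->first);
      if (consecutive && same_chunk) {
        last->instance_count++;
        continue;
      }
    }
    out[range_count].first = handles[i];
    out[range_count].instance_count = 1;
    range_count++;
  }
  return range_count;
}
```

In this sketch a run never crosses a chunk boundary, on the assumption that each chunk corresponds to a separately bound UBO.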

This has a great impact on CPU usage when using lots of instances. Even if the biggest
bottleneck in these situations is the depsgraph iteration, the driver overhead when doing
thousands of drawcalls is still high.

This only improves situations where the CPU is the bottleneck: small geometry, lots of

The next step is to sort the drawcalls inside a DRWCallChunk to improve the batching process
when the instancing order is fairly random.

Reviewers: brecht, antoniov

Differential Revision:
August 17, 2019, 12:48 (GMT)
BLI_memblock: Use aligned memory blocks

This is because we upload certain chunks directly as UBO data and
AMD suggests using aligned blocks to speed up memcpy.

However, this implies the driver is also doing an aligned allocation
which seems not to be the case on linux_x64 + radeon + mesa.
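
For illustration, an aligned block can be requested with C11 `aligned_alloc`. This is a sketch under the assumption of a hosted C11 toolchain, not the BLI_memblock code. `block_alloc_aligned` is a hypothetical helper, and the alignment used in the usage below is only an example value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Round `size` up to the next multiple of `alignment`. */
static size_t round_up(size_t size, size_t alignment)
{
  return (size + alignment - 1) / alignment * alignment;
}

/* Allocate a zeroed block whose address is a multiple of `alignment`.
 * `alignment` must be a power of two. Release the block with free(). */
static void *block_alloc_aligned(size_t size, size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  /* C11 requires the requested size to be a multiple of the alignment. */
  size_t padded_size = round_up(size, alignment);
  void *ptr = aligned_alloc(alignment, padded_size);
  if (ptr != NULL) {
    memset(ptr, 0, padded_size);
  }
  return ptr;
}
```

As the commit notes, the copy only benefits if the destination the driver provides is aligned too.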
August 17, 2019, 12:48 (GMT)
Cleanup: DRW: Break rendering loop into smaller functions
August 17, 2019, 12:48 (GMT)
Cleanup: DRW: move DRWShadingGroup uniform locations to DRWUniform
August 17, 2019, 12:48 (GMT)
GPU: Make Eevee shader pass resourceID to the fragment shader

This way we don't need a fallback uniform, and we overcome the
impossibility of merging draw calls with these shaders.
August 17, 2019, 12:48 (GMT)
DRW: Put DRWCalls into DRWCallChunks

This is in order to improve cache coherence and make it possible
to sort and merge drawcalls in the future.
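
A minimal sketch of chunked storage, assuming a fixed number of calls per chunk. The names (`ChunkedCall`, `CallChunk`, `CallList`) and the chunk size are hypothetical and do not reflect the real DRWCallChunk definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CALLS_PER_CHUNK 128 /* Example value, not taken from the commit. */

typedef struct ChunkedCall {
  int vertex_count;
  int resource_id;
} ChunkedCall;

/* Calls are stored contiguously inside each chunk. */
typedef struct CallChunk {
  struct CallChunk *next;
  size_t used;
  ChunkedCall calls[CALLS_PER_CHUNK];
} CallChunk;

typedef struct CallList {
  CallChunk *first;
  CallChunk *last;
  size_t chunk_count;
} CallList;

/* Return a slot for a new call, adding a chunk when the last one is full.
 * Returns NULL if the allocation fails. */
static ChunkedCall *call_list_append(CallList *list)
{
  if (list->last == NULL || list->last->used == CALLS_PER_CHUNK) {
    CallChunk *chunk = calloc(1, sizeof(*chunk));
    if (chunk == NULL) {
      return NULL;
    }
    if (list->last != NULL) {
      list->last->next = chunk;
    }
    else {
      list->first = chunk;
    }
    list->last = chunk;
    list->chunk_count++;
  }
  return &list->last->calls[list->last->used++];
}

static void call_list_free(CallList *list)
{
  CallChunk *chunk = list->first;
  while (chunk != NULL) {
    CallChunk *next = chunk->next;
    free(chunk);
    chunk = next;
  }
  list->first = NULL;
  list->last = NULL;
  list->chunk_count = 0;
}
```

Because each chunk is a plain array, the calls inside it can be iterated linearly and later sorted or merged in place.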
August 17, 2019, 12:48 (GMT)
Cleanup: Wireframe: Use vec3_copy and remove unused code
August 17, 2019, 12:48 (GMT)
GPUBatch: Bypass empty drawcalls

These should be taken care of at a higher level, but that is not
always possible. Bypassing them here is still useful for cleaning up
captures in RenderDoc.
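
The bypass amounts to an early return before anything reaches the driver. A sketch with hypothetical names (`DrawStats`, `draw_or_skip`), where the counters stand in for the real draw:

```c
#include <assert.h>
#include <stdbool.h>

typedef struct DrawStats {
  int submitted;
  int skipped;
} DrawStats;

/* Returns true if a draw was issued, false if the call was empty. */
static bool draw_or_skip(int vertex_count, int instance_count, DrawStats *stats)
{
  if (vertex_count == 0 || instance_count == 0) {
    /* Nothing would be rasterized, so do not emit a draw call at all. */
    stats->skipped++;
    return false;
  }
  /* A real implementation would issue the GPU draw here. */
  stats->submitted++;
  return true;
}
```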
Made by: Miika Hämäläinen. Last updated: 07.11.2014 14:18. MiikaH's Pages a.k.a. MiikaHweb | 2003-2020