Blender Git Commit Log


Revision 0421434 by Jeroen Bakker (master)
October 26, 2020, 10:02 (GMT)
LatticeDeform: Performance

This patch improves the single core performance of the lattice deform.

1. Prefetching the deform verts during initialization. This data is constant
for each inner loop. This reduces the complexity of the inner loop, which
frees CPU resources for other optimizations.
2. Prefetching the Lattice instance, which is constant. Although the
performance difference isn't noticeable, it is always good to free some
space in the branch prediction tables.
3. Remove branching in all loops by not exiting early when an iteration has
no effect. The checks in the inner loops detected whether an iteration had
no effect on the final result and, if so, continued to the next iteration.
This made the branches unpredictable and caused many mispredictions. For
smaller inner loops it is always better to replace unpredictable if
statements with branchless code patterns.
4. Use SSE2 instructions when available.

This gives a performance increase of roughly 50% (about 1.5x throughput),
measured on an Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz with GCC 9.3.
Also check other compilers.

Before:
```
performance_no_dvert_10000 (4 ms)
performance_no_dvert_100000 (30 ms)
performance_no_dvert_1000000 (268 ms)
performance_no_dvert_10000000 (2637 ms)
```

After:
```
performance_no_dvert_10000 (3 ms)
performance_no_dvert_100000 (21 ms)
performance_no_dvert_1000000 (180 ms)
performance_no_dvert_10000000 (1756 ms)
```

Reviewed By: Campbell Barton

Differential Revision: https://developer.blender.org/D9087

Commit Details:

Full Hash: 042143440d7668d3e357805ffdd20b1a4d2e2975
Parent Commit: 2ddecff
Lines Changed: +214, -76
