Manage GPU_matrix stacks per GPUContext

Previously, there was a single global `GPU_matrix` stack, so the API was not
thread safe. This patch makes the stack per `GPUContext`, which effectively
makes it thread-local (`GPUContext` is located in thread-local storage).
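
A minimal sketch of the idea, not the actual patch: the stack is a member of the
context, and each thread reaches its own context through a thread-local pointer.
All names here (`Context`, `MatrixStack`, `active_context`, `matrix_push`,
`matrix_pop`, `matrix_depth`, `depth_after_pushes_on_new_thread`) are
hypothetical stand-ins for illustration, and a single float stands in for a
full matrix.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical stand-in for a matrix stack; one float replaces a 4x4 matrix.
struct MatrixStack {
  static constexpr std::size_t kMaxDepth = 32;
  std::array<float, kMaxDepth> values{};
  std::size_t top = 0;  // index of the current top entry; 0 is the base entry
};

// Hypothetical context type: the stack is owned by the context, not a global.
struct Context {
  MatrixStack model_view;
};

// Each thread sees its own pointer, so each thread sees its own stack.
thread_local Context *active_context = nullptr;

void matrix_push(float value) {
  assert(active_context != nullptr);
  MatrixStack &stack = active_context->model_view;
  assert(stack.top + 1 < MatrixStack::kMaxDepth);
  stack.values[++stack.top] = value;
}

void matrix_pop() {
  assert(active_context != nullptr && active_context->model_view.top > 0);
  --active_context->model_view.top;
}

std::size_t matrix_depth() {
  assert(active_context != nullptr);
  return active_context->model_view.top;
}

// Creates a context on a new thread, pushes `pushes` entries, and returns the
// depth that thread observed. The calling thread's stack is never touched.
std::size_t depth_after_pushes_on_new_thread(int pushes) {
  std::size_t depth = 0;
  std::thread worker([&depth, pushes] {
    Context ctx;
    active_context = &ctx;  // sets the worker thread's pointer only
    for (int i = 0; i < pushes; ++i) {
      matrix_push(static_cast<float>(i));
    }
    depth = matrix_depth();
  });
  worker.join();
  return depth;
}
```

Because the push and pop functions only ever go through `active_context`, two
threads with different active contexts cannot corrupt each other's stacks, and
no locking is needed around the stack itself.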

Reviewed By: brecht
