Add support for cufinufft in Legion
This adds support for cufinufft in Legion:
- All code paths now use the CUDA primary context
- The MPI and sequential code paths DO NOT push/pop the context repeatedly; they push once at the start of the program and pop once at the end
- The Legion code path does a push/pop on each task
- Important: The push/pop is done at the task level so that all of the task's local variables fall out of scope before the context is popped; otherwise GPU memory is leaked
- cufinufft has been enabled in the Legion CI
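
The scoping point in the push/pop bullets can be illustrated with a minimal sketch. It does not use the CUDA or Legion APIs: `FakeContext`, `FakeBuffer`, `with_context`, `leaky_task` and `scoped_body` are all hypothetical stand-ins invented for this example, and the behaviour shown relies on CPython's reference counting destroying locals as soon as a function returns. The real code may be structured differently.

```python
import functools


class FakeContext:
    """Hypothetical stand-in for a CUDA primary context (not a real API)."""

    def __init__(self):
        self.depth = 0   # how many times the context is currently pushed
        self.live = 0    # buffers allocated and not yet freed
        self.leaked = 0  # buffers destroyed while the context was not current

    def push(self):
        self.depth += 1

    def pop(self):
        self.depth -= 1


class FakeBuffer:
    """Hypothetical GPU allocation that can only be freed while its context is pushed."""

    def __init__(self, ctx):
        assert ctx.depth > 0, "allocation requires a current context"
        self.ctx = ctx
        ctx.live += 1

    def __del__(self):
        if self.ctx.depth > 0:
            self.ctx.live -= 1    # freed correctly
        else:
            self.ctx.leaked += 1  # context already popped: memory is leaked


def with_context(ctx):
    """Wrap a task body so the push/pop happens outside the body's scope."""
    def decorate(body):
        @functools.wraps(body)
        def task(*args, **kwargs):
            ctx.push()
            try:
                # The body's locals are destroyed when it returns,
                # which is before the pop in the finally clause.
                return body(*args, **kwargs)
            finally:
                ctx.pop()
        return task
    return decorate


def leaky_task(ctx):
    """Anti-pattern: pop in the same scope that still holds GPU objects."""
    ctx.push()
    buf = FakeBuffer(ctx)
    ctx.pop()  # buf is still alive here, so it is destroyed after the pop


def scoped_body(ctx):
    """Task body whose locals all die before the wrapper pops the context."""
    buf = FakeBuffer(ctx)
    return ctx.live
```

Wrapping the body with `with_context(ctx)(scoped_body)` leaves `ctx.leaked` at zero after the call, while calling `leaky_task(ctx)` directly increments it, which mirrors the leak described above.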