`N_orientations` off by a factor of 2
Discovered by @ihchang
TL;DR: there is an off-by-a-factor-of-2 error in either `N_orientations` or `M` when comparing MPI CUDA and MPI CPU.
I only have a limited understanding based on my discussions on Slack, so I invite @ihchang to close this issue once it's resolved, or to explain it further here if more hands are needed.