A lot of the C++ stuff (in `spinfel/device`) will likely move to a separate PyPI package soon. This MR makes two changes:
- Get the device count from the CUDA API. I checked and this respects `CUDA_VISIBLE_DEVICES` on Cori GPU, Perlmutter, and Summit.
- Remove the `gpu.devices_per_node` setting -- this is controlled by `srun` or `CUDA_VISIBLE_DEVICES` now.

Re point 2 above: you need to remove the following section from your own TOML files:

    [gpu]
    devices_per_node = ...

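
If you have many config files to update, the cleanup could be scripted. Below is a minimal sketch, not part of this MR: `strip_devices_per_node` is a hypothetical helper that works line by line, assumes plain `[table]` headers and single-line values, and drops the `[gpu]` header only if nothing else is left in that table.

```python
import re

# Matches a plain table header such as "[gpu]" (not an array-of-tables "[[x]]").
HEADER = re.compile(r"^\s*\[([^\[\]]+)\]\s*(#.*)?$")
# Matches the key this MR removes.
KEY = re.compile(r"^\s*devices_per_node\s*=")


def strip_devices_per_node(toml_text: str) -> str:
    """Remove `devices_per_node` from the `[gpu]` table of a TOML document.

    Hypothetical helper: a line-based sketch, not a full TOML parser.
    """
    # Split the text into blocks: [table name, lines]. The first block
    # holds any top-level keys that appear before the first header.
    blocks = [[None, []]]
    for line in toml_text.splitlines():
        match = HEADER.match(line)
        if match:
            blocks.append([match.group(1).strip(), [line]])
        else:
            blocks[-1][1].append(line)

    kept = []
    for name, lines in blocks:
        if name == "gpu":
            # Keep the header, drop the obsolete key.
            lines = [lines[0]] + [l for l in lines[1:] if not KEY.match(l)]
            # If only blank lines remain, drop the whole table.
            if not any(l.strip() for l in lines[1:]):
                continue
        kept.extend(lines)

    return "\n".join(kept).rstrip("\n") + "\n"
```

The function takes the file contents as a string and returns the cleaned text, so reading and writing the file is left to the caller. Review the result before committing it, since comments or multi-line values inside `[gpu]` are not handled specially.
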
This also fixes a bug where the orientation matching code would use the `gpu.devices_per_node` setting rather than `context.dev_id`.