
Tpetra::DistObject::doTransferNew: Don't recreate DualViews for LID arrays each call

James Willenbring requested to merge mhoemmen:Tpetra-Transfer-DualView-2 into develop

Created by: mhoemmen

@trilinos/tpetra

  1. Export and Import now store {permuteTo, permuteFrom, remote, export}LIDs arrays as Kokkos::DualView, rather than as Teuchos::Array.

  2. DistObject::doTransferNew no longer allocates a new Kokkos::DualView from a host Teuchos::ArrayView for each of the aforementioned four LID arrays on every call. This should avoid a major overhead that @crtrott and I noticed last Friday in the Tpetra CG solve benchmark on 2 GPUs on waterman.

  3. Export and Import had duplicated code (for data access) and state (e.g., out_ and verbose_). I factored out duplication into their common base class, Transfer.

  4. I made verbose debugging output (TPETRA_VERBOSE) more consistent.

  5. I fixed CUDA build errors in https://github.com/trilinos/Trilinos/pull/4322 and rebased on top of current develop, in order to get @kyungjoo-kim's BlockCrs fixes. The above changes address part of Kyungjoo's concerns about DistObject taking const DualView<const T>& but sync'ing it. (We still need to change const DualView& to DualView& in the case where the DistObject subclass will certainly want to sync, but now the four LID arrays mentioned above will never need sync'ing.)
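
The effect of items 1 and 2 can be illustrated with a small sketch in plain C++. It does not use Kokkos or Tpetra: `ToyDualView`, `TransferBefore`, `TransferAfter`, `useOnDevice`, and `countAllocs` are hypothetical names invented for this example, and the "device" copy is just a second `std::vector`. The only point is to show roughly why building the dual view once, at construction, removes a per-call allocation and copy.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Counts simulated device allocations (illustration only).
static int g_deviceAllocs = 0;

// Hypothetical stand-in for a Kokkos::DualView of LIDs: a host copy plus a
// pretend "device" copy. This is NOT the Kokkos API.
struct ToyDualView {
  std::vector<int> host;
  std::vector<int> device;

  ToyDualView() = default;

  // Constructing from host data simulates one device allocation plus a
  // host-to-device copy.
  explicit ToyDualView(const std::vector<int>& lids)
    : host(lids), device(lids) {
    ++g_deviceAllocs;
  }
};

// Stand-in for whatever a transfer does with the LIDs on device.
inline long useOnDevice(const ToyDualView& dv) {
  assert(dv.host.size() == dv.device.size());
  return std::accumulate(dv.device.begin(), dv.device.end(), 0L);
}

// Before: the transfer object stores only a host array, so each transfer
// builds a fresh dual view.
struct TransferBefore {
  std::vector<int> remoteLIDs;

  long doTransfer() const {
    ToyDualView tmp(remoteLIDs);  // allocate + copy on every call
    return useOnDevice(tmp);
  }
};

// After: the transfer object owns the dual view, built once at construction.
struct TransferAfter {
  ToyDualView remoteLIDs;

  explicit TransferAfter(const std::vector<int>& lids) : remoteLIDs(lids) {}

  long doTransfer() const {
    return useOnDevice(remoteLIDs);  // no allocation, no copy
  }
};

// Runs `calls` transfers and returns how many simulated device allocations
// happened during those calls.
template <class Transfer>
int countAllocs(const Transfer& t, int calls) {
  const int before = g_deviceAllocs;
  for (int i = 0; i < calls; ++i) {
    (void) t.doTransfer();
  }
  return g_deviceAllocs - before;
}
```

In this toy model, ten calls through `TransferBefore` trigger ten simulated device allocations, while ten calls through an already constructed `TransferAfter` trigger none; both produce the same result from `doTransfer`. The real cost in Tpetra depends on the array sizes and the device, which this sketch does not attempt to model.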

I still want to test downstream, but this is more ready for review now.
