Tpetra::DistObject::doTransferNew: Don't recreate DualViews for LID arrays each call
Created by: mhoemmen
@trilinos/tpetra
- Export and Import now store the `{permuteTo, permuteFrom, remote, export}LIDs` arrays as `Kokkos::DualView`, rather than as `Teuchos::Array`.
- `DistObject::doTransferNew` no longer allocates new `Kokkos::DualView` from host `Teuchos::ArrayView` for the aforementioned four LIDs arrays. This should avoid a major overhead that @crtrott and I noticed last Friday in the Tpetra CG solve benchmark on 2 GPUs on waterman.
- Export and Import had duplicated code (for data access) and state (e.g., `out_` and `verbose_`). I factored out duplication into their common base class, Transfer.
- I made verbose debugging output (`TPETRA_VERBOSE`) more consistent.
- I fixed CUDA build errors in https://github.com/trilinos/Trilinos/pull/4322 and rebased on top of current develop, in order to get @kyungjoo-kim's BlockCrs fixes. The above changes address part of Kyungjoo's concerns about DistObject taking `const DualView<const T>&` but sync'ing it. (We still need to change `const DualView&` to `DualView&` in the case where the DistObject subclass will certainly want to sync, but now the four LID arrays mentioned above will never need sync'ing.)
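For reviewers, here is a minimal sketch of the allocation pattern described above. It is not the Tpetra code: `MirroredArray`, `TransferOld`, `TransferNew`, and the global counter are hypothetical stand-ins, and the "device" copy is just a second `std::vector` so the example runs without Kokkos or a GPU. It only illustrates why building the host/device mirror once, when the transfer object is constructed, avoids a per-call allocation and copy.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Counts simulated device allocations (hypothetical; for illustration only).
int g_deviceAllocs = 0;

// Hypothetical stand-in for a host/device mirrored array such as Kokkos::DualView.
struct MirroredArray {
  std::vector<int> host;
  std::vector<int> device;

  explicit MirroredArray(const std::vector<int>& hostData)
    : host(hostData), device(hostData) {
    ++g_deviceAllocs;  // each construction pays for an allocation plus a copy
  }
};

// Old pattern: LIDs live in a host-only array; the mirror is rebuilt per transfer.
struct TransferOld {
  std::vector<int> exportLIDs;

  explicit TransferOld(const std::vector<int>& lids) : exportLIDs(lids) {}

  int doTransfer() const {
    MirroredArray lids(exportLIDs);  // re-created on every call
    assert(lids.device.size() == lids.host.size());
    return std::accumulate(lids.device.begin(), lids.device.end(), 0);
  }
};

// New pattern: the mirror is built once and reused by every transfer.
struct TransferNew {
  MirroredArray exportLIDs;

  explicit TransferNew(const std::vector<int>& lids) : exportLIDs(lids) {}

  int doTransfer() const {
    return std::accumulate(exportLIDs.device.begin(), exportLIDs.device.end(), 0);
  }
};

// Returns the number of simulated device allocations over numCalls transfers.
int allocsForOldPattern(int numCalls) {
  g_deviceAllocs = 0;
  const std::vector<int> lids = {0, 1, 2, 3};
  TransferOld t(lids);
  for (int i = 0; i < numCalls; ++i) t.doTransfer();
  return g_deviceAllocs;
}

int allocsForNewPattern(int numCalls) {
  g_deviceAllocs = 0;
  const std::vector<int> lids = {0, 1, 2, 3};
  TransferNew t(lids);
  for (int i = 0; i < numCalls; ++i) t.doTransfer();
  return g_deviceAllocs;
}
```

In this sketch, the old pattern's allocation count grows with the number of `doTransfer` calls, while the new pattern allocates only at construction. In the real code each such allocation would also involve a host-to-device copy, which is presumably where the overhead in the benchmark came from.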
I still want to test downstream, but this is more ready for review now.