Tpetra: Make CrsMatrix::transferAndFillComplete do thread-parallel pack & unpack
Created by: mhoemmen
@trilinos/tpetra "Superstory": #797
Tpetra::CrsMatrix::transferAndFillComplete implements a specialized pack and unpack for CrsMatrix. Tpetra's sparse matrix-matrix multiply uses this.
Try to share as much code as possible with #800 (closed). See, e.g., packRow in Trilinos/packages/tpetra/core/src/Tpetra_Import_Util2.hpp. It would make sense to adapt the PackTraits methods for use inside Kokkos::parallel_*. That would call for changes to Stokhos and perhaps also Sacado.
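
For discussion, here is a rough sketch of the count / scan / pack pattern that makes row-wise packing thread-parallel. It is not the Tpetra implementation: `Csr`, `packedRowBytes`, `packRows`, and `unpackRows` are hypothetical names, the buffer layout (entry count, then column indices, then values) is an assumption, and plain serial loops stand in for the places where `Kokkos::parallel_scan` and `Kokkos::parallel_for` would presumably go. The point is that once per-row offsets are known, each row reads and writes a disjoint slice of the buffer, so the pack and unpack loops have no cross-iteration dependencies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical minimal CSR storage; NOT Tpetra's actual data structures.
struct Csr {
  std::vector<std::size_t> rowPtr;  // size numRows + 1
  std::vector<int> colInd;          // one column index per entry
  std::vector<double> values;       // one value per entry
};

// Assumed layout of one packed row: entry count, column indices, values.
inline std::size_t packedRowBytes(std::size_t numEnt) {
  return sizeof(std::size_t) + numEnt * (sizeof(int) + sizeof(double));
}

// Pack the rows listed in exportRows. On return, row i of the export list
// occupies bytes [offsets[i], offsets[i+1]) of the returned buffer.
std::vector<char> packRows(const Csr& A,
                           const std::vector<std::size_t>& exportRows,
                           std::vector<std::size_t>& offsets) {
  const std::size_t n = exportRows.size();
  offsets.assign(n + 1, 0);
  // Phase 1: per-row byte counts plus running sum (candidate for parallel_scan).
  for (std::size_t i = 0; i < n; ++i) {
    const std::size_t r = exportRows[i];
    offsets[i + 1] = offsets[i] + packedRowBytes(A.rowPtr[r + 1] - A.rowPtr[r]);
  }
  std::vector<char> buf(offsets[n]);
  // Phase 2: each row writes its own disjoint slice (candidate for parallel_for).
  for (std::size_t i = 0; i < n; ++i) {
    const std::size_t r = exportRows[i];
    const std::size_t beg = A.rowPtr[r];
    const std::size_t numEnt = A.rowPtr[r + 1] - beg;
    char* out = buf.data() + offsets[i];
    std::memcpy(out, &numEnt, sizeof(numEnt));
    out += sizeof(numEnt);
    if (numEnt > 0) {
      std::memcpy(out, A.colInd.data() + beg, numEnt * sizeof(int));
      out += numEnt * sizeof(int);
      std::memcpy(out, A.values.data() + beg, numEnt * sizeof(double));
    }
  }
  return buf;
}

// Unpack a buffer produced by packRows into a new CSR structure.
// On a receiving process, offsets would have to be rebuilt from the
// per-row byte counts that arrive with the data (an assumption here).
Csr unpackRows(const std::vector<char>& buf,
               const std::vector<std::size_t>& offsets) {
  assert(!offsets.empty() && offsets.back() == buf.size());
  const std::size_t n = offsets.size() - 1;
  Csr B;
  B.rowPtr.assign(n + 1, 0);
  // Phase 1: read each row's entry count and build rowPtr (scan).
  for (std::size_t i = 0; i < n; ++i) {
    std::size_t numEnt = 0;
    std::memcpy(&numEnt, buf.data() + offsets[i], sizeof(numEnt));
    B.rowPtr[i + 1] = B.rowPtr[i] + numEnt;
  }
  B.colInd.resize(B.rowPtr[n]);
  B.values.resize(B.rowPtr[n]);
  // Phase 2: each row copies into its own slice of colInd / values.
  for (std::size_t i = 0; i < n; ++i) {
    const std::size_t numEnt = B.rowPtr[i + 1] - B.rowPtr[i];
    if (numEnt == 0) continue;
    const char* in = buf.data() + offsets[i] + sizeof(std::size_t);
    std::memcpy(B.colInd.data() + B.rowPtr[i], in, numEnt * sizeof(int));
    in += numEnt * sizeof(int);
    std::memcpy(B.values.data() + B.rowPtr[i], in, numEnt * sizeof(double));
  }
  return B;
}
```

In a Kokkos version, the per-row copy in phase 2 is where a device-callable PackTraits method would be needed, which is why scalar types from Stokhos or Sacado would likely need matching changes.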