Tpetra: Add fused nonblocking dot+norm
Created by: mhoemmen
Story: #748 (closed)
It turns out my collaborator needs a fused nonblocking dot+norm for the variant of Pipelined CG that does one all-reduce per iteration. It fuses the norm of the vector r (computed as the dot product of r with itself) with the dot product of r and (A*r), so that both results come back from a single all-reduce.
This has two subtasks:
- Add a run-time branch in KokkosBlas::dot for 2-D Views where exactly one of the Views has a single column ( https://github.com/kokkos/kokkos-kernels/issues/13 ).
- Add a run-time branch in Tpetra::idot for this use case, implementing the dot product of [Ar, r] with [r, r], without copying r.
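
To make the intended local computation concrete, here is a minimal sketch in plain C++. It is not the Tpetra or Kokkos Kernels implementation: `fusedDotNormLocal` is a hypothetical name, and `std::vector` stands in for the Views. It only illustrates how the single column r can be reused against both columns of [Ar, r] without forming a second copy of r.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of Tpetra or Kokkos Kernels.
// Returns {dot(Ar, r), dot(r, r)} computed in one pass over the local entries.
// r is read in place for both results, so [r, r] is never materialized.
std::array<double, 2> fusedDotNormLocal(const std::vector<double>& Ar,
                                        const std::vector<double>& r)
{
  assert(Ar.size() == r.size());
  std::array<double, 2> result{0.0, 0.0};
  for (std::size_t i = 0; i < r.size(); ++i) {
    result[0] += Ar[i] * r[i]; // contribution to dot(Ar, r)
    result[1] += r[i] * r[i];  // contribution to dot(r, r), the squared 2-norm of r
  }
  return result;
}
```

In a distributed run, the two local results would presumably be combined across processes with one nonblocking all-reduce of length 2 (for example, `MPI_Iallreduce` with `MPI_SUM`), which is what lets Pipelined CG get by with one all-reduce per iteration.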