Ifpack2::OverlappingRowMatrix::apply is unnecessarily sequential
Created by: mhoemmen
@trilinos/ifpack2
It uses the old ArrayRCP host access interface of Tpetra::MultiVector, when it could just use Tpetra::CrsMatrix::localApply and get thread / CUDA parallelism for free. This could also be related to #4353 (closed).
This will require making Tpetra::CrsMatrix::localApply public, but I'm OK with that; it's a useful function for implementing block operators efficiently.
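
For illustration only, here is a minimal standalone sketch of why a local sparse mat-vec parallelizes so easily. It does not use Tpetra or Kokkos. The `CsrMatrix` struct and the `csrRowProduct` and `csrApply` functions are hypothetical names invented for this example, and the code is not the actual Ifpack2 or Tpetra implementation. The point is that each output row depends only on read-only inputs, so the outer row loop is the part a thread-parallel or CUDA kernel could distribute.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a local CSR matrix; NOT a Tpetra type.
struct CsrMatrix {
  std::size_t numRows;
  std::vector<std::size_t> rowPtr;  // numRows + 1 offsets into colInd/values
  std::vector<std::size_t> colInd;  // column index of each stored entry
  std::vector<double> values;       // value of each stored entry
};

// Compute one row of y = A * x. It reads A and x and writes only y[row],
// so calls for different rows do not interfere with each other.
inline void csrRowProduct(const CsrMatrix& A, const std::vector<double>& x,
                          std::vector<double>& y, std::size_t row) {
  double sum = 0.0;
  for (std::size_t k = A.rowPtr[row]; k < A.rowPtr[row + 1]; ++k) {
    sum += A.values[k] * x[A.colInd[k]];
  }
  y[row] = sum;
}

// Sequential driver. Because the rows are independent, this outer loop is
// the one a parallel backend could hand to threads (assumption: the
// backend supplies a parallel-for over row indices).
inline std::vector<double> csrApply(const CsrMatrix& A,
                                    const std::vector<double>& x) {
  assert(A.rowPtr.size() == A.numRows + 1);
  std::vector<double> y(A.numRows, 0.0);
  for (std::size_t row = 0; row < A.numRows; ++row) {
    csrRowProduct(A, x, y, row);
  }
  return y;
}
```

A host-only loop over element-wise accessors has this same shape but always runs on one thread. Delegating to a local kernel that already runs on the execution space would remove that restriction.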
Related Issues
- Related to #4353 (closed)