Tpetra: Improve MPI+X scalability of sparse matrix-matrix multiply and global assembly
Created by: mhoemmen
@trilinos/tpetra @trilinos/muelu
[mfh edit 13 Jul 2017: Promote #802 (closed) from task to story]
Stories:
- #629: Make sparse matrix-matrix multiply thread-parallel
- #797: Improve thread scalability of Import/Export
- #802 (closed): Improve thread scalability of transferAndFillComplete
- #829 (closed): Improve thread scalability of CrsMatrix::fillComplete
- #832: Improve thread scalability of CrsGraph::fillComplete
#832 may or may not matter directly for sparse matrix-matrix multiply, but it's reasonable to do CrsMatrix::fillComplete (#829) and CrsGraph::fillComplete (#832) at the same time, since the two have much in common.