KokkosKernels: CrsMatrix sumIntoValuesSorted minor questions
Created by: mhoemmen
@bathmatt's commit https://github.com/trilinos/Trilinos/commit/1e65cffac05ae95bbc4e6f0d6d4d428886704834 added a sumIntoValuesSorted method to KokkosSparse::CrsMatrix. I have a few comments and questions:
- We should introduce the "hint" that Epetra and Tpetra use for optimizing the search for multiple column indices. It introduces an extra branch per input index, but avoids the search in common cases. @etphipp first implemented it in Tpetra and found it to be useful, and `findRelOffset` (in `tpetra/core/src/Tpetra_Util.hpp`) does it too.
- It's legit to use `ordinal_type` (32-bit) instead of `size_type` (64-bit on everything but CUDA) for the difference between two consecutive row offsets, as long as the row doesn't have too many duplicate entries. `SparseRowView(Const)` already uses `ordinal_type` for the row length, for this reason.
- Was there a particular reason for the `hi - low > 10` cut-off, or is that just a good guess?
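
To make the first and last points concrete, here is a rough sketch of what I mean by the hint, in plain C++ with no Kokkos dependency. The names `find_offset_with_hint` and `sum_into_row` are hypothetical, as is the `linear_cutoff` parameter; this is not the actual `findRelOffset` signature or the committed `sumIntoValuesSorted` code. It assumes the row's column indices are sorted, checks the hinted position first, then falls back to binary search that hands off to a linear scan once the range is at most `linear_cutoff` entries long.

```cpp
#include <cassert>

// Hypothetical helper (not the Kokkos or Tpetra API).
// Returns the offset of `target` in the sorted array cols[0..len),
// or `len` if `target` is not present.
template <class Ordinal>
Ordinal find_offset_with_hint(const Ordinal* cols, Ordinal len, Ordinal target,
                              Ordinal hint, Ordinal linear_cutoff = 10) {
  assert(len >= 0);
  // The one extra branch per input index: if the hint is right, skip the search.
  if (hint >= 0 && hint < len && cols[hint] == target) {
    return hint;
  }
  Ordinal lo = 0;
  Ordinal hi = len;
  // Binary search while the candidate range [lo, hi) is longer than the cut-off.
  while (hi - lo > linear_cutoff) {
    const Ordinal mid = lo + (hi - lo) / 2;
    if (cols[mid] == target) {
      return mid;
    } else if (cols[mid] < target) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  // Linear scan over the remaining short range.
  for (Ordinal k = lo; k < hi; ++k) {
    if (cols[k] == target) {
      return k;
    }
  }
  return len;  // not found
}

// Hypothetical driver: adds in_vals into one sorted row and returns the number
// of input indices found. The hint for each lookup is "one past the previous
// hit", which is correct whenever the inputs are consecutive row entries.
template <class Ordinal, class Scalar>
Ordinal sum_into_row(const Ordinal* row_cols, Scalar* row_vals, Ordinal row_len,
                     const Ordinal* in_cols, const Scalar* in_vals, Ordinal num_in) {
  Ordinal num_found = 0;
  Ordinal hint = 0;
  for (Ordinal i = 0; i < num_in; ++i) {
    const Ordinal off = find_offset_with_hint(row_cols, row_len, in_cols[i], hint);
    if (off != row_len) {
      row_vals[off] += in_vals[i];
      ++num_found;
      hint = off + 1;
    }
  }
  return num_found;
}
```

Note that everything inside a row is indexed with the (32-bit) ordinal type here, which is the second point above. The cut-off is a parameter in this sketch only so that different values are easy to benchmark.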