KokkosKernels: CrsMatrix sumIntoValuesSorted minor questions
Created by: mhoemmen
@bathmatt's commit https://github.com/trilinos/Trilinos/commit/1e65cffac05ae95bbc4e6f0d6d4d428886704834 added a sumIntoValuesSorted method to KokkosSparse::CrsMatrix. I have a few comments and questions:
- We should introduce the "hint" that Epetra and Tpetra use for optimizing search for multiple column indices. It introduces an extra branch per input index, but avoids search for common cases. @etphipp first implemented it in Tpetra and found it to be useful, and `findRelOffset` (in tpetra/core/src/Tpetra_Util.hpp) does it too.
- It's legit to use `ordinal_type` (32-bit) instead of `size_type` (64-bit on everything but CUDA) for the difference between two consecutive row offsets, as long as the row doesn't have too many duplicate entries. SparseRowView(Const) already uses `ordinal_type` for the row length, for this reason.
- Was there a particular reason for the `hi - low > 10` cut-off, or is that just a good guess?
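
To make the first two points concrete, here is a rough sketch of what I have in mind. It is plain C++ on `std::vector`, not the actual KokkosSparse code and not Tpetra's `findRelOffset`. The names `findOffsetWithHint` and `sumIntoRowSorted`, the `Ordinal` alias, and the `linearCutoff` parameter are all made up for illustration. It assumes each row's column indices are sorted and that the row length fits in `Ordinal`. The default cut-off of 10 just mirrors the `hi - low > 10` threshold asked about above.

```cpp
#include <cstddef>
#include <vector>

using Ordinal = int;  // stand-in for ordinal_type (assumed 32-bit)

// Hypothetical helper: offset of `col` in the sorted array cols[0..len),
// or `len` if `col` is not present. `hint` is a guess at the offset.
Ordinal findOffsetWithHint(const Ordinal* cols, Ordinal len, Ordinal col,
                           Ordinal hint, Ordinal linearCutoff = 10) {
  // The one extra branch per input index: if the hint is right, skip the search.
  if (hint >= 0 && hint < len && cols[hint] == col) {
    return hint;
  }
  // Binary search (lower-bound style) while the range is still large.
  Ordinal lo = 0;
  Ordinal hi = len;
  while (hi - lo > linearCutoff) {
    const Ordinal mid = lo + (hi - lo) / 2;
    if (cols[mid] < col) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  // Linear scan over the small remaining range. Everything before `lo`
  // is < col, and the row is sorted, so stop once an entry exceeds col.
  for (Ordinal i = lo; i < len && cols[i] <= col; ++i) {
    if (cols[i] == col) {
      return i;
    }
  }
  return len;  // not found
}

// Hypothetical sketch of a sorted sum-into for one row of a CSR matrix.
// Returns the number of input entries that were found and summed.
Ordinal sumIntoRowSorted(const std::vector<std::size_t>& rowPtr,
                         const std::vector<Ordinal>& colInd,
                         std::vector<double>& vals, std::size_t row,
                         const std::vector<Ordinal>& inCols,
                         const std::vector<double>& inVals) {
  const std::size_t start = rowPtr[row];
  // Row offsets are 64-bit, but their difference (the row length) is
  // assumed to fit in Ordinal.
  const Ordinal len = static_cast<Ordinal>(rowPtr[row + 1] - start);
  Ordinal hint = 0;
  Ordinal numValid = 0;
  for (std::size_t k = 0; k < inCols.size(); ++k) {
    const Ordinal off =
        findOffsetWithHint(colInd.data() + start, len, inCols[k], hint);
    if (off != len) {
      vals[start + off] += inVals[k];
      hint = off + 1;  // sorted input: the next index is likely the next entry
      ++numValid;
    }
  }
  return numValid;
}
```

When the input indices are a contiguous run of the row's entries (the common case in finite-element assembly), every lookup after the first hits the hint and no search happens at all.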