Tpetra,Ifpack2: Add GEMM for small dense blocks
Created by: mhoemmen
@trilinos/tpetra @amklinv @jhux2 @csiefer2
Ifpack2's incomplete LU factorization wants a matrix-matrix multiply (GEMM, in BLAS terms) for small dense matrices (the "blocks" in Tpetra's BlockCrsMatrix). The existing implementation (at the top of Ifpack2_Experimental_RBILUK_decl.hpp) assumes a particular layout of blocks.
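For illustration, here is a minimal sketch of what a layout-agnostic small dense GEMM could look like. This is not the existing Ifpack2 code and not a proposed Tpetra interface: the name `smallBlockGemm` and its stride parameters are hypothetical, and it assumes square blocks addressed through raw pointers. Passing explicit row and column strides for each block is one way to avoid baking in a single storage layout.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration only; not part of Tpetra or Ifpack2.
//
// Computes C := alpha * A * B + beta * C, where A, B, and C are square
// blockSize x blockSize dense blocks. Entry (i,j) of a block X lives at
// X[i * rowStrideX + j * colStrideX], so the same kernel handles
// row-major (rowStride = blockSize, colStride = 1) and
// column-major (rowStride = 1, colStride = blockSize) blocks.
//
// Assumption: C does not alias A or B.
template <class Scalar>
void smallBlockGemm(std::size_t blockSize,
                    Scalar alpha,
                    const Scalar* A, std::size_t rowStrideA, std::size_t colStrideA,
                    const Scalar* B, std::size_t rowStrideB, std::size_t colStrideB,
                    Scalar beta,
                    Scalar* C, std::size_t rowStrideC, std::size_t colStrideC)
{
  assert(A != nullptr && B != nullptr && C != nullptr);

  for (std::size_t i = 0; i < blockSize; ++i) {
    for (std::size_t j = 0; j < blockSize; ++j) {
      // Accumulate the (i,j) entry of A * B.
      Scalar sum = Scalar();
      for (std::size_t k = 0; k < blockSize; ++k) {
        sum += A[i * rowStrideA + k * colStrideA] *
               B[k * rowStrideB + j * colStrideB];
      }
      Scalar& c = C[i * rowStrideC + j * colStrideC];
      // Follow the BLAS convention: when beta is zero, do not read C,
      // so uninitialized or NaN entries in C are overwritten.
      if (beta == Scalar()) {
        c = alpha * sum;
      } else {
        c = alpha * sum + beta * c;
      }
    }
  }
}
```

A caller with row-major 2 x 2 blocks would pass strides `(2, 1)` for each block, and a caller with column-major blocks would pass `(1, 2)`. A real kernel would presumably take Kokkos views or similar rather than raw pointers and strides; that choice is left open here.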
This is like Issue #50 (closed), in that the eventual goal is to make all of these computational kernels available in KokkosKernels. We're starting by putting the kernels in the Tpetra::Experimental namespace.