MueLu: catch-all for KokkosRefactor branch tasks
Created by: aprokop
- Introduce a performance monitoring framework to track performance results. It could be similar to the interface test and could use the performance file structure.
- Implement new smoothed prolongator construction [#1743]
- Kokkosify AmalgamationInfo. It should be the only place to do node <-> dof transformations.
- Optimize block CoalesceDrop
Get rid of
- Replace LWGraph construction by a wrapper?
- Add a single-value set function specialized on the device. This should allow us to skip initializing even rows in the CRS graph.
- Check Chebyshev smoother setup difference between Serial and OMP_NUM_THREADS=1
KOKKOS_FORCEINLINE_FUNCTION? I'm not sure when to use it instead of regular
- Optimize block Tentative P. Even after the rewrite, the block Tentative P is dog slow compared to the regular one. Is it because of shared memory?
getLocalDiagOffsets with the new StaticCrsGraph::rowConst() be faster? It would not need to create a subview...
- Distance Laplacian functor
Parallelize loops in
Bypass amalgamation for
Get rid of
Get rid of subview creation in
ViewAllocateWithoutInitializing? See kokkos/kokkos#1073
FIXME_KOKKOS comments to code to indicate things to look at.
- Fix unit tests. See #1686 (closed).
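
For the AmalgamationInfo item, here is a rough host-side sketch of what a single place for node <-> dof transformations could look like. `NodeDofMap` and its methods are hypothetical names for illustration, not MueLu's actual AmalgamationInfo API. The sketch assumes a constant block size and contiguous node-major DOF numbering, which need not hold in general (e.g. with variable DOFs per node or strided maps).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; not MueLu's AmalgamationInfo.
// Assumption: constant block size, DOFs of a node numbered contiguously.
struct NodeDofMap {
  std::size_t blockSize;  // number of DOFs per node

  // DOF index of the k-th unknown attached to `node`.
  std::size_t nodeToDof(std::size_t node, std::size_t k) const {
    assert(blockSize > 0 && k < blockSize);
    return node * blockSize + k;
  }

  // Node that owns `dof`.
  std::size_t dofToNode(std::size_t dof) const {
    assert(blockSize > 0);
    return dof / blockSize;
  }

  // All DOF indices attached to `node`, in increasing order.
  std::vector<std::size_t> nodeDofs(std::size_t node) const {
    std::vector<std::size_t> dofs(blockSize);
    for (std::size_t k = 0; k < blockSize; ++k) {
      dofs[k] = nodeToDof(node, k);
    }
    return dofs;
  }
};
```

Keeping both directions of the mapping in one type means callers such as CoalesceDrop or Tentative P would not need to repeat the index arithmetic themselves. A device version would presumably mark the methods with the appropriate Kokkos function macros.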