Trilinos issues
https://gitlab.osti.gov/jmwille/Trilinos/-/issues
2016-11-02T19:46:18Z

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/174
Tpetra: Document 3-Map finite-element global assembly use pattern
2016-11-02T19:46:18Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra @amklinv The three Maps in question refer to mesh points. This is for a finite-element code where elements are uniquely owned by processes, but mesh points or other discretization goodies associated with elements may be shared by multiple processes. In the text below, I'll assume that degrees of freedom live on mesh points, but the same considerations apply for degrees of freedom that live on edges.
1. Uniquely owned (nonoverlapping Map), with mesh points that my MPI process owns
2. Overlapping Map, with mesh points belonging to elements that my MPI process owns
3. Overlapping Map, with mesh points connected to points in (1) or (2)
Map (3) is the column Map of the sparse graph / matrix. Use replaceColumnMap if necessary. We will fill out this pattern in more detail in discussion of this issue. It has already proven useful for at least three different applications, two of which use BlockCrsMatrix, and two of which use CrsMatrix.
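To make the three index sets concrete, here is a minimal sketch in plain C++ that does not use Tpetra at all. It assumes a 1-D mesh in which element `e` joins points `e` and `e+1`, and it assumes that point `p` is owned by whichever process owns element `p` (the last point goes to the owner of the last element). The names `ThreeMaps` and `buildThreeMaps` are hypothetical and exist only for this illustration; a real code would turn each set into a Map.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

using GO = long long;  // global index type (alias chosen for this sketch)

struct ThreeMaps {
  std::set<GO> owned;        // (1) points this process owns
  std::set<GO> overlapping;  // (2) points of elements this process owns
  std::set<GO> column;       // (3) points connected to a point in (1) or (2)
};

// elemOwner[e] is the rank that owns element e (elements are uniquely owned).
ThreeMaps buildThreeMaps(const std::vector<int>& elemOwner, int myRank) {
  const GO numElements = static_cast<GO>(elemOwner.size());
  assert(numElements > 0);
  auto ownerOfElement = [&](GO e) {
    return elemOwner[static_cast<std::size_t>(e)];
  };

  ThreeMaps maps;
  // (2): both end points of every element owned by this process.
  for (GO e = 0; e < numElements; ++e) {
    if (ownerOfElement(e) == myRank) {
      maps.overlapping.insert(e);
      maps.overlapping.insert(e + 1);
    }
  }
  // (1): assumed ownership rule, point p follows element min(p, last element).
  for (GO p = 0; p <= numElements; ++p) {
    if (ownerOfElement(std::min(p, numElements - 1)) == myRank) {
      maps.owned.insert(p);
    }
  }
  // (3): every point that shares an element with a point in (1) or (2).
  std::set<GO> seeds = maps.owned;
  seeds.insert(maps.overlapping.begin(), maps.overlapping.end());
  for (GO p : seeds) {
    const GO touching[2] = {p - 1, p};  // elements that contain point p
    for (GO e : touching) {
      if (e < 0 || e >= numElements) continue;
      maps.column.insert(e);
      maps.column.insert(e + 1);
    }
  }
  return maps;
}
```

With four elements owned as `{0, 0, 1, 1}`, rank 0 gets owned points {0, 1}, overlapping points {0, 1, 2}, and column points {0, 1, 2, 3}.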
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/157
Tpetra::Map should check for duplicate GIDs on a calling process
2016-11-02T19:44:52Z
James Willenbring
*Created by: mhoemmen*
See https://github.com/spdomin/Nalu/commit/b11ced445ea2aa7d22a4b2ac99a6eb2a101c7eeb
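For illustration only, the kind of per-process check the title asks for could look like the sketch below. It is plain C++, not Tpetra code; the function name `hasDuplicateGids` is hypothetical, and where such a check would sit inside the Map constructor is not something this issue specifies.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns true if the list of global indices (GIDs) that one process passes
// in contains any index more than once. The argument is taken by value so
// that sorting does not disturb the caller's ordering.
bool hasDuplicateGids(std::vector<long long> gids) {
  std::sort(gids.begin(), gids.end());
  return std::adjacent_find(gids.begin(), gids.end()) != gids.end();
}
```

Sorting a copy costs O(n log n) time and O(n) extra memory, so a real implementation might reasonably restrict the check to debug builds.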
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/122
Tpetra: getGlobalRowCopy should return an empty row instead of throwing if nonowned row
2018-01-23T20:11:28Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
If users give getGlobalRowCopy a row that does not live on the calling process, the method should return an empty row, rather than throwing an exception. This change will make its behavior more consistent with that of the replace\* and sumInto\* methods.
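The proposed behavior can be sketched with a toy container. Everything here is hypothetical: `ToyLocalMatrix` and its method signature are invented for the illustration and do not mirror the real Tpetra interface. The only point is that a lookup of a row the process does not store reports zero entries instead of throwing.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Toy stand-in for the rows of a sparse matrix stored on one process.
class ToyLocalMatrix {
 public:
  void insert(long long globalRow, long long col, double val) {
    Row& row = rows_[globalRow];
    row.cols.push_back(col);
    row.vals.push_back(val);
  }

  // Copies the row into cols/vals and returns its number of entries.
  // A row that this process does not store yields an empty result (0)
  // rather than an exception.
  std::size_t getGlobalRowCopy(long long globalRow,
                               std::vector<long long>& cols,
                               std::vector<double>& vals) const {
    cols.clear();
    vals.clear();
    const auto it = rows_.find(globalRow);
    if (it == rows_.end()) return 0;  // nonowned row: empty, no throw
    cols = it->second.cols;
    vals = it->second.vals;
    return cols.size();
  }

 private:
  struct Row {
    std::vector<long long> cols;
    std::vector<double> vals;
  };
  std::map<long long, Row> rows_;
};
```

A caller can then loop over candidate rows without first testing ownership, which is the consistency argument made above.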
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/119
Tpetra: Graph / matrix insert doesn't merge, taking extra space & hindering thread parallelism
2018-12-12T22:02:12Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
CrsGraph::insert{Local,Global}Indices and CrsMatrix::insert{Local,Global}Values currently do something nonintuitive: multiple inserts to the same row and column index are stored separately and not merged until fillComplete. For example, inserting (1,1) into a CrsGraph 10 times would require storing 10 entries, until fillComplete, at which point the entries get merged together into a single entry. This is especially bad for StaticProfile, which currently would counterintuitively fail on 9 of those 10 inserts if the user reasonably gave CrsGraph an upper bound of 1 entry per row. We don't want users to have to rely on DynamicProfile, which is both slow and (especially due to this issue) memory-intensive.
Commit 68e77d53dbc250add4680244676e47576e6b7e4f begins the process of fixing this. It does not yet change the behavior of CrsGraph or CrsMatrix. For now, Tpetra has new internal utility functions for merging indices (for CrsGraph) or indices and values together (for CrsMatrix). I also added some unit tests for the new functions. However, they still need to be integrated into CrsGraph and CrsMatrix. My initial attempts broke a lot of invariants and made a lot of tests fail. I realize I'll have to do this VERY cautiously.
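As a rough picture of what merge-on-insert means for a single row, here is a sketch in plain C++. It is not the utility code from the commit above; the class `MergingRow` is hypothetical, and it assumes that duplicate inserts combine by summing their values. The row keeps its column indices sorted, so a repeated insert finds the existing entry and never consumes a second slot.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One matrix row with a fixed upper bound on its number of entries,
// modeling a static allocation.
class MergingRow {
 public:
  explicit MergingRow(std::size_t capacity) : capacity_(capacity) {}

  // Inserts (col, val). If col is already present, val is added to the
  // stored value. Returns false only when a new column would exceed the
  // capacity.
  bool insert(long long col, double val) {
    const auto it = std::lower_bound(cols_.begin(), cols_.end(), col);
    const auto pos = it - cols_.begin();
    if (it != cols_.end() && *it == col) {
      vals_[static_cast<std::size_t>(pos)] += val;  // merge, no new storage
      return true;
    }
    if (cols_.size() == capacity_) return false;
    cols_.insert(it, col);
    vals_.insert(vals_.begin() + pos, val);
    return true;
  }

  std::size_t numEntries() const { return cols_.size(); }

  // Returns the stored value for col, or 0.0 if col is absent.
  double value(long long col) const {
    const auto it = std::lower_bound(cols_.begin(), cols_.end(), col);
    if (it == cols_.end() || *it != col) return 0.0;
    return vals_[static_cast<std::size_t>(it - cols_.begin())];
  }

 private:
  std::size_t capacity_;
  std::vector<long long> cols_;  // kept sorted
  std::vector<double> vals_;     // vals_[i] belongs to cols_[i]
};
```

With a capacity of 1, inserting column 1 ten times succeeds every time and leaves a single entry, which is the behavior the example with an upper bound of 1 entry per row calls for.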
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/61
Tpetra::Experimental::GEMM: Fix for mode = "C"(onjugate Transpose)
2017-10-26T19:34:46Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra @amklinv
GEMM currently only implements the Non-Transpose ("N") and Transpose ("T") modes, not the Conjugate Transpose ("C") mode. The GEMM interface has no way to return an error for "not implemented," and can't throw an exception, so we unfortunately do have to implement all the options.
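For reference, the three modes differ only in how an entry of the operand is read. The sketch below is a small standalone C++ illustration, not the Tpetra::Experimental::GEMM code or its interface; `Dense`, `opEntry`, and `gemm` are hypothetical names, and the product is simplified to C = op(A) * op(B) with no alpha or beta scaling.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Scalar = std::complex<double>;

// Small dense matrix in column-major storage, zero-initialized.
struct Dense {
  std::size_t rows, cols;
  std::vector<Scalar> data;
  Dense(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c) {}
  Scalar& operator()(std::size_t i, std::size_t j) {
    return data[i + j * rows];
  }
  const Scalar& operator()(std::size_t i, std::size_t j) const {
    return data[i + j * rows];
  }
};

// Entry (i, j) of op(A) for mode 'N', 'T', or 'C'.
Scalar opEntry(const Dense& A, char mode, std::size_t i, std::size_t j) {
  if (mode == 'N') return A(i, j);
  if (mode == 'T') return A(j, i);
  assert(mode == 'C');
  return std::conj(A(j, i));  // conjugate transpose
}

// C = op(A) * op(B), where each op is selected by its mode character.
Dense gemm(char modeA, char modeB, const Dense& A, const Dense& B) {
  const std::size_t m = (modeA == 'N') ? A.rows : A.cols;
  const std::size_t k = (modeA == 'N') ? A.cols : A.rows;
  const std::size_t kB = (modeB == 'N') ? B.rows : B.cols;
  const std::size_t n = (modeB == 'N') ? B.cols : B.rows;
  assert(k == kB);
  Dense C(m, n);
  for (std::size_t j = 0; j < n; ++j) {
    for (std::size_t i = 0; i < m; ++i) {
      for (std::size_t p = 0; p < k; ++p) {
        C(i, j) += opEntry(A, modeA, i, p) * opEntry(B, modeB, p, j);
      }
    }
  }
  return C;
}
```

For real scalar types "C" and "T" coincide, so the missing mode only changes results for complex scalars.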
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/56
Tpetra: Replace Tpetra_DefaultNode CMake option with Tpetra_DefaultExecutionSpace
2017-11-29T19:08:35Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra @rppawlo @crtrott
Tpetra will get rid of Node altogether at some point, and tie itself completely to Kokkos' execution and memory spaces. It would make sense to deprecate the Tpetra_DefaultNode CMake option and replace it with Tpetra_DefaultExecutionSpace.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/57
Tpetra: Replace Node with Kokkos space
2018-09-06T18:21:35Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
This depends on #56 and #505.
Get rid of Node entirely. Replace with Kokkos space.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/43
Tpetra: Consolidate reduceAll in noncontiguous Map constructor
2017-10-26T19:27:00Z
James Willenbring
*Created by: DrBooom*
@trilinos/tpetra
The noncontiguous Tpetra::Map constructor has a reduceAll at line 465 of Tpetra_Map_def.hpp:
```
if (numGlobalElements != GSTI) {
  numGlobalElements_ = numGlobalElements; // Use the user's value.
} else { // The user wants us to compute the sum.
  reduceAll<int, GST> (*comm, REDUCE_SUM, as<GST> (numLocalElements),
                       outArg (numGlobalElements_));
}
```
And one at line 616:
```
GO minMaxOutput[3];
minMaxOutput[0] = 0;
minMaxOutput[1] = 0;
minMaxOutput[2] = 0;
reduceAll<int, GO> (*comm, REDUCE_MAX, 3, minMaxInput, minMaxOutput);
minAllGID_ = -minMaxOutput[0];
maxAllGID_ = minMaxOutput[1];
const GO globalDist = minMaxOutput[2];
```
Mark thinks that these could be fused into a single call. Since this Map constructor is one of the most expensive and frequent calls in the R4-5 scaling, any reduction in all-reduce calls will help.
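One way to picture the fusion: the first call is a sum and the second is a max, so a single call would need a reduction that treats its slots differently. The sketch below models that in plain C++ by folding per-process contributions with one combining function; it is not Tpetra or Teuchos code. The names `MapReduction`, `combine`, and `allReduceModel` are hypothetical, and the `flag` field merely stands in for the third slot of `minMaxInput`, whose meaning the excerpt does not show. In an MPI setting this would presumably require a user-defined reduction operator (for example one created with `MPI_Op_create`), since the built-in operators apply the same operation to every slot.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using GO = long long;  // global index type (alias chosen for this sketch)

// What one process would contribute to the single fused reduction.
struct MapReduction {
  GO localCount;  // combined with sum (replaces the REDUCE_SUM call)
  GO negMinGid;   // combined with max; negated so that max yields the min
  GO maxGid;      // combined with max
  GO flag;        // combined with max; stand-in for the third slot
};

// Mixed operator: sum the first slot, take the max of the others.
MapReduction combine(const MapReduction& a, const MapReduction& b) {
  return {a.localCount + b.localCount,
          std::max(a.negMinGid, b.negMinGid),
          std::max(a.maxGid, b.maxGid),
          std::max(a.flag, b.flag)};
}

// Stand-in for one all-reduce: fold every process's contribution.
MapReduction allReduceModel(const std::vector<MapReduction>& perProcess) {
  assert(!perProcess.empty());
  MapReduction result = perProcess.front();
  for (std::size_t i = 1; i < perProcess.size(); ++i) {
    result = combine(result, perProcess[i]);
  }
  return result;
}
```

The sum slot is only needed when the user did not supply the global count, so the fused call would save a collective only on that branch.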
Tpetra-backlog