Trilinos issues
https://gitlab.osti.gov/jmwille/Trilinos/-/issues
2017-10-26T21:12:55Z

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1770
Tpetra: Create methods to test runtime sized scalar types
2017-10-26T21:12:55Z
James Willenbring
*Created by: tjfulle*
This issue is to discuss and implement methods to test runtime sized scalar types as used in @trilinos/stokhos, without having to build all the way through to Stokhos.
@trilinos/tpetra
@mhoemmen
@etphipp
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1256
Tpetra::Details::Profiling: Fix if KOKKOS_ENABLE_PROFILING not defined
2017-04-20T18:01:02Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
@pwxy
Apparently it is possible to disable Kokkos Profiling, though it is enabled by default and was specifically designed to have low overhead such that people could reliably leave it enabled. Fix Tpetra::Details::Profiling for that nondefault, generally counterindicated case, by doing nothing if `KOKKOS_ENABLE_PROFILING` is not defined.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1203
Tpetra::CrsGraph: Add another constructor that takes a Kokkos::StaticCrsGraph & all 4 Maps (not just row and column)
2018-05-15T23:15:29Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
Requested by @alanw0.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1195
Tpetra: Use new cuBLAS API for dense matrix-matrix multiply
2019-02-17T22:30:47Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
See #1194 for justification and discussion.
# Use new cuBLAS API
The cuBLAS manual says that "Starting with version 4.0, the cuBLAS Library provides a new updated API, in addition to the existing legacy API. . . . The new cuBLAS library API can be used by including the header file 'cublas_v2.h'."
http://docs.nvidia.com/cuda/cublas/
The manual also says: "In general, new applications should not use the legacy cuBLAS API, and existing applications should convert to using the new API if it requires sophisticated and optimal stream parallelism or if it calls cuBLAS routines concurrently from multiple threads."
# Support concurrent tasks
One Tpetra goal is to support use of multiple Kokkos execution spaces (e.g., CUDA streams) concurrently, in order to support task parallelism. Switching to the new cuBLAS API, which takes a "context" handle, will help with that.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1080
Tpetra: CooMatrix test failing on CUDA
2017-02-21T23:06:30Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
Tpetra::Details::CooMatrix test is failing on CUDA with the following message:
```
Create Export object
Call doExport on CooMatrix
p=0: *** Caught standard std::exception of type 'std::runtime_error' :
Calling sync on a DualView with a const datatype.
```
I think I know how to fix this.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1031
Tpetra: Add function for "dealing out" sparse matrix triples from a Matrix Market file
2017-02-02T06:30:05Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
Story: #353
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1025
Tpetra: Implement sparse matrix class that communicates by triples, for I/O
2017-01-29T06:07:30Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
Story: #353
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1021
Tpetra: Matrix Market writeSparse assumes input row Map is one to one
2017-01-25T22:29:24Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
Tpetra::MatrixMarket::writeSparse (and writeSparseFile) assumes that the input matrix's row Map is one to one (not overlapping). This is because it does an Import from the input matrix, to a matrix with a gathered row Map (all indices on Process 0).
If the gathered row Map is always one to one, we could fix this easily by using an Export instead of an Import.
@vbrunini reported this issue, via CC by @kddevin.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1006
Add gatherv wrapper to Teuchos
2017-10-26T20:52:36Z
James Willenbring
*Created by: mhoemmen*
@trilinos/teuchos @trilinos/tpetra
@dridzal requested a gatherv wrapper for either Teuchos or Tpetra. Tpetra has an implementation of this wrapper now, but it currently lives in an anonymous namespace. @dridzal would like to see it promoted to the public interface. See commit b5b1b8f09812500d5bcbd53e9b2a99ead10823de.
It would be easy for just about any Trilinos developer without much Tpetra or Teuchos knowledge to promote Tpetra's current internal implementation to a public interface, and add a unit test. I'll be happy to review and accept pull requests.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/994
Belos: Change GMRES default orthogonalizer from DGKS to ICGS 2-pass
2019-03-22T09:43:04Z
James Willenbring
*Created by: mhoemmen*
@trilinos/belos @jjellio @hkthorn
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/964
Kokkos::StaticCrsGraph does not use memory space template parameter passed to KokkosSparse::CrsMatrix
2017-01-10T20:54:12Z
James Willenbring
*Created by: mndevec*
I am providing a device to CrsMatrix in the constructor. This differs from the default, where the device is either Kokkos::Cuda with host-pinned space, or Kokkos::OpenMP with Kokkos::HostSpace when HBM is enabled.
I expected CrsMatrix memory to be allocated in the memory space I provide. However, this holds only for the values view; the row pointers and entries are still allocated in the default memory space of the provided execution space.
It seems that StaticCrsGraph does not take the device as a template argument; instead, it is given the execution space. It creates a default device, so the values and entries views end up allocated in different memory spaces.
Shouldn't StaticCrsGraph take the device as template argument instead of execution space?
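
The mismatch described above can be illustrated without Kokkos. The following is only a sketch: `HostSpace`, `HostPinnedSpace`, `Exec`, `Device`, `GraphByExecSpace`, and `GraphByDevice` are hypothetical stand-ins, not the real Kokkos classes. It shows how a graph templated only on an execution space can see nothing but that space's default memory space, while a graph templated on the device keeps the memory space the caller asked for.

```cpp
#include <type_traits>

// Hypothetical stand-ins for two memory spaces (not the real Kokkos types).
struct HostSpace {};
struct HostPinnedSpace {};

// Hypothetical execution space that names its default memory space.
struct Exec {
  using memory_space = HostSpace;
};

// A "device" pairs an execution space with an explicitly chosen memory space.
template <class ExecSpace, class MemSpace>
struct Device {
  using execution_space = ExecSpace;
  using memory_space = MemSpace;
};

// Graph templated on the execution space only: it can only fall back to
// the execution space's default memory space.
template <class ExecSpace>
struct GraphByExecSpace {
  using memory_space = typename ExecSpace::memory_space;
};

// Graph templated on the device: the caller's memory space is preserved.
template <class DeviceType>
struct GraphByDevice {
  using memory_space = typename DeviceType::memory_space;
};

// The caller asks for a non-default memory space.
using RequestedDevice = Device<Exec, HostPinnedSpace>;

// Passing only the execution space loses the requested memory space ...
static_assert(
    std::is_same<GraphByExecSpace<RequestedDevice::execution_space>::memory_space,
                 HostSpace>::value,
    "graph falls back to the default memory space");

// ... whereas passing the whole device keeps it.
static_assert(
    std::is_same<GraphByDevice<RequestedDevice>::memory_space,
                 HostPinnedSpace>::value,
    "graph keeps the requested memory space");
```

Both `static_assert`s hold at compile time, which mirrors the reported behavior: the values view and the graph views agree only in the device-templated variant.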
@srajama1 @crtrott @mhoemmen
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/962
Tpetra: Avoid atomic ops for unpack if no duplicate LIDs
2019-02-17T23:45:14Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
Epic: #796
This story matters to other epics besides #796.
Thread-parallel implementations of Tpetra::DistObject::unpackAndCombine(New) do not need to do atomic updates if the result LIDs have no duplicates (meaning that at most one thread will ever modify any one value of the destination DistObject at a time). "Result LIDs have no duplicates" is a property of the Import / Export object, so the Import / Export object should remember this at construction time for reuse.
This matters because atomic updates have a run-time cost, even if there is no contention between threads. Sparse matrix-vector multiply does an Import on the input MultiVector. In the common case, the result LIDs in this Import should have no duplicates. (This is because the column Map is constructed that way, if users let CrsGraph or CrsMatrix construct the column Map.) Thus, atomic updates add unnecessary run time to this important kernel.
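
A minimal sketch of the idea, assuming a hypothetical `UnpackPlan` struct in place of the real Import / Export object (none of these names are Tpetra API): the "no duplicate LIDs" property is computed once at construction and then used to pick an unpack kernel. The kernels themselves are kept serial here; the enum only marks where a thread-parallel implementation would choose between plain and atomic updates.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the part of an Import / Export object discussed
// above. Assumes every LID lies in [0, numLocal).
struct UnpackPlan {
  std::vector<int> resultLIDs;  // destination local index for each received value
  bool lidsAreUnique;           // computed once here, reused on every unpack

  UnpackPlan(const std::vector<int>& lids, std::size_t numLocal)
      : resultLIDs(lids), lidsAreUnique(true) {
    std::vector<char> seen(numLocal, 0);
    for (int lid : resultLIDs) {
      const std::size_t i = static_cast<std::size_t>(lid);
      if (seen[i]) {
        lidsAreUnique = false;
        break;
      }
      seen[i] = 1;
    }
  }
};

// Which update a thread-parallel unpack would need.
enum class UnpackKernel { Plain, Atomic };

// No duplicates means at most one thread touches each destination entry,
// so plain updates suffice.
UnpackKernel chooseKernel(const UnpackPlan& plan) {
  return plan.lidsAreUnique ? UnpackKernel::Plain : UnpackKernel::Atomic;
}

// Serial reference for an additive unpack; a parallel version would use the
// kernel selected by chooseKernel for the update on the marked line.
void unpackAndAdd(const UnpackPlan& plan, const std::vector<double>& imports,
                  std::vector<double>& dest) {
  for (std::size_t k = 0; k < plan.resultLIDs.size(); ++k) {
    dest[static_cast<std::size_t>(plan.resultLIDs[k])] += imports[k];  // plain or atomic
  }
}
```

The duplicate check costs one pass over the LIDs at construction, which is amortized over every later unpack that reuses the same plan.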
@jjellio and @tjfulle might be interested in this. I would like to fix this for MultiVector in a way that neatly encapsulates its unpack kernels (and ideally also its pack kernels). Fixing this issue should have the side effect of improving MPI-only performance, which is an important use case for many customers (who do not have hardware that requires MPI + threads for good performance).
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/960
Tpetra: Feature request to add "min" CombineMode
2017-10-26T20:50:03Z
James Willenbring
*Created by: ikalash*
@mhoemmen
@jrobbin
@trilinos/tpetra
I would like to request the addition of a method equivalent to Epetra_Min to the Tpetra exporter class. It is needed in the Albany code.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/959
Tpetra::CrsMatrix: Feature request for replaceDiagonalValues and invRowSum
2018-07-17T05:32:44Z
James Willenbring
*Created by: ikalash*
@mhoemmen
@jrobbin
@trilinos/tpetra
I would like to request the addition of the following methods to the Tpetra::CrsMatrix class:
- replaceDiagonalValues
- invRowSum
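
For reference, here is a rough sketch of what the two requested operations could compute on local CSR storage. It assumes the intended semantics mirror Epetra's `ReplaceDiagonalValues` and `InvRowSums` (the latter being the inverse of the sum of absolute values in each row); the issue does not spell this out. `CsrMatrix` is a hypothetical struct, not the Tpetra::CrsMatrix interface.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical local CSR storage used only for this sketch.
struct CsrMatrix {
  std::vector<std::size_t> rowPtr;  // size numRows + 1
  std::vector<int> colInd;          // column index of each stored entry
  std::vector<double> values;       // value of each stored entry
  std::size_t numRows() const { return rowPtr.size() - 1; }
};

// Overwrite each stored diagonal entry A(i,i) with diag[i].
// Returns the number of rows that have no stored diagonal entry.
std::size_t replaceDiagonalValues(CsrMatrix& A, const std::vector<double>& diag) {
  std::size_t missing = 0;
  for (std::size_t i = 0; i < A.numRows(); ++i) {
    bool found = false;
    for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
      if (A.colInd[k] == static_cast<int>(i)) {
        A.values[k] = diag[i];
        found = true;
      }
    }
    if (!found) ++missing;
  }
  return missing;
}

// x[i] = 1 / sum_j |A(i,j)|. Rows whose absolute sum is zero get 0 here;
// that convention is a choice made for this sketch.
std::vector<double> invRowSum(const CsrMatrix& A) {
  std::vector<double> x(A.numRows(), 0.0);
  for (std::size_t i = 0; i < A.numRows(); ++i) {
    double sum = 0.0;
    for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
      sum += std::fabs(A.values[k]);
    }
    if (sum != 0.0) x[i] = 1.0 / sum;
  }
  return x;
}
```

A distributed version would also have to decide how rows and diagonal entries that are not locally owned are handled, which this local sketch ignores.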
These methods are needed in the Albany code.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/958
Tpetra::CrsMatrix: Possible bug in transposed apply
2018-01-23T17:11:16Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
@ikalash reported the following:
> I am trying to use Tpetra::CrsMatrix apply method with a Teuchos::TRANS combine mode, and the method does not appear to be working correctly. I end up with a vector of zeros even though neither the operator nor the input vector is zero. If I set the combine mode to Teuchos::NO_TRANS, things work correctly. I assume Teuchos::TRANS has been tested, so perhaps I am doing something wrong, although I am not sure what it could be. Is there some caveat about the method’s usage with the TRANS combine mode? I printed the Boolean returned when calling hasTransposeApply() and it prints true.
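
One way to narrow down a report like this is to compare the result against a small serial reference for y = Aᵀx. The sketch below is such a reference on a hypothetical `LocalCsr` struct; it is not Tpetra code and says nothing about where the actual problem lies.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical local CSR storage used only for this sketch.
struct LocalCsr {
  std::vector<std::size_t> rowPtr;  // size numRows + 1
  std::vector<int> colInd;          // column index of each stored entry
  std::vector<double> values;       // value of each stored entry
};

// Serial reference for y = A^T x: every stored entry A(i,j) contributes
// A(i,j) * x[i] to y[j].
std::vector<double> applyTranspose(const LocalCsr& A, std::size_t numCols,
                                   const std::vector<double>& x) {
  std::vector<double> y(numCols, 0.0);
  const std::size_t numRows = A.rowPtr.size() - 1;
  for (std::size_t i = 0; i < numRows; ++i) {
    for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
      y[static_cast<std::size_t>(A.colInd[k])] += A.values[k] * x[i];
    }
  }
  return y;
}
```

If the reference gives a nonzero vector for the same local data while the transposed apply returns zeros, the discrepancy is more likely in the setup or communication than in the matrix values.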
@ikalash also sent data, which I'll post here.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/953
LinearSolverFactory: Add a way to add new solvers at run time
2017-09-06T23:25:48Z
James Willenbring
*Created by: mhoemmen*
@trilinos/belos
Story: #748
For a given package's LinearSolverFactory, and for a particular template parameter combination, add a way to add new solvers at run time.
For example, if we implement Pipelined CG _just_ for Tpetra objects, we want a way to add this solver to the list of solvers that Belos' factory knows how to create.
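
One possible shape for this is a name-to-constructor registry. The sketch below is an assumption about the design, not the existing LinearSolverFactory interface; `Solver`, `SolverRegistry`, `PipelinedCg`, and `makePipelinedCg` are hypothetical names.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical solver interface for this sketch.
struct Solver {
  virtual ~Solver() = default;
  virtual std::string name() const = 0;
};

// Hypothetical registry that maps solver names to constructor functions.
class SolverRegistry {
 public:
  using Maker = std::function<std::unique_ptr<Solver>()>;

  // Register a new solver at run time.
  // Returns false (and changes nothing) if the name is already taken.
  bool registerSolver(const std::string& solverName, Maker maker) {
    return makers_.emplace(solverName, std::move(maker)).second;
  }

  // Create a solver by name; returns nullptr for an unknown name.
  std::unique_ptr<Solver> create(const std::string& solverName) const {
    const auto it = makers_.find(solverName);
    if (it == makers_.end()) {
      return nullptr;
    }
    return it->second();
  }

 private:
  std::map<std::string, Maker> makers_;
};

// Example solver that a user could add without touching the factory itself.
struct PipelinedCg : Solver {
  std::string name() const override { return "PipelinedCG"; }
};

std::unique_ptr<Solver> makePipelinedCg() {
  return std::unique_ptr<Solver>(new PipelinedCg());
}
```

In a templated factory, there would be one such registry per template parameter combination, so that a solver written only for Tpetra objects is registered only for that combination.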
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/940
Tpetra::CrsGraph::makeIndicesLocal: Don't 0-fill k_lclInds1D_ (StaticProfile case)
2016-12-18T06:58:21Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
In Tpetra::CrsGraph::makeIndicesLocal, in the StaticProfile and LocalOrdinal != GlobalOrdinal cases, don't zero-fill k_lclInds1D_ when allocating it. The subsequent, already thread-parallel conversion of global indices to local indices will initialize it.
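
The pattern can be sketched in plain C++, with hypothetical names and a hash map standing in for the column Map's global-to-local lookup. The buffer is allocated without value-initialization, and the conversion loop is the first and only write to each entry. In Kokkos, the analogous request is made, as far as I know, with `Kokkos::ViewAllocateWithoutInitializing`.

```cpp
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Convert global indices to local indices (hypothetical helper).
// Throws std::out_of_range if a global index is not in the lookup table.
std::unique_ptr<int[]> makeIndicesLocal(
    const std::vector<long long>& gblInds,
    const std::unordered_map<long long, int>& gblToLcl) {
  const std::size_t n = gblInds.size();

  // `new int[n]` default-initializes, so there is no zero-fill pass.
  // (`new int[n]()` would zero every entry first.)
  std::unique_ptr<int[]> lclInds(new int[n]);

  // The conversion writes every entry exactly once, so the buffer is
  // fully initialized when this loop finishes.
  for (std::size_t k = 0; k < n; ++k) {
    lclInds[k] = gblToLcl.at(gblInds[k]);
  }
  return lclInds;
}
```

Skipping the fill is safe only because every entry is overwritten before it is read; if the conversion could skip entries, the zero-fill would still be needed.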
Note that CrsGraph does not do first-touch allocation correctly for k_lclInds1D_, and this fix won't fix that. The correct way to do first-touch allocation would be to use the row offsets to iterate over the local indices, as if doing a sparse matrix-vector multiply.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/939
Tpetra::MultiVector: Isolate & do separate ETI for pack & unpack kernels
2016-12-17T06:54:09Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
Pack and unpack kernels for Tpetra::MultiVector currently get built on the fly. It would make sense to isolate those kernels and use the ETI system to build them separately from Tpetra_MultiVector_def.hpp.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/925
Tpetra shadow warning
2016-12-14T00:50:38Z
James Willenbring
*Created by: kddevin*
Trilinos/packages/tpetra/core/src/Tpetra_Details_FixedHashTable_def.hpp:1043:82: warning: declaration of 'maxVal' shadows a member of 'this' [-Wshadow]
It appears maxVal is also a method in this class.
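
A typical way to silence this kind of warning is to rename the local declaration. The sketch below uses a hypothetical `KeyRange` class, not the actual FixedHashTable code, to show the pattern: a parameter named `maxVal` would collide with the `maxVal()` method, so the parameter gets a different name.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical class used only to illustrate the rename.
class KeyRange {
 public:
  KeyRange() : maxVal_(0) {}

  // Member function whose name a local declaration could shadow.
  int maxVal() const { return maxVal_; }

  // If this parameter were named `maxVal`, it would shadow the member
  // function above, which the reported compiler flags under -Wshadow.
  // Renaming it to `initMaxVal` avoids the clash without changing behavior.
  void update(const std::vector<int>& keys, int initMaxVal) {
    maxVal_ = initMaxVal;
    for (int key : keys) {
      maxVal_ = std::max(maxVal_, key);
    }
  }

 private:
  int maxVal_;
};
```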
gcc 4.7.2 compiler on linux
Priority for me is low, but this might trigger warnings=errors problems for others.
Tpetra-backlog

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/915
Tpetra: Build warning in test, in non-MPI (SERIAL_RELEASE) build
2016-12-10T03:10:41Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
I saw the following build warning in a Tpetra test, in a non-MPI (SERIAL_RELEASE) build.
```
.../Trilinos/packages/tpetra/core/test/Comm/isInterComm.cpp:80:13: warning: unused variable 'myRank' [-Wunused-variable]
const int myRank = origCommWrapped.getRank ();
^
.../Trilinos/packages/tpetra/core/test/Comm/isInterComm.cpp:81:13: warning: unused variable 'numProcs' [-Wunused-variable]
const int numProcs = origCommWrapped.getSize ();
```
Tpetra-backlog