Trilinos issues
https://gitlab.osti.gov/jmwille/Trilinos/-/issues
2017-10-27T04:10:00Z
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/690
AllowPadding and WithoutInitializing moving to Kokkos namespace change downstream Trilinos code accordingly
2017-10-27T04:10:00Z
James Willenbring
*Created by: mhoemmen*
Kokkos/develop has moved AllowPadding and WithoutInitializing out of the Kokkos::Experimental namespace, into the Kokkos namespace. Once this gets moved into Kokkos/master and snapshotted into Trilinos, change Trilinos downstream code accordingly.
https://github.com/kokkos/kokkos/issues/325
@trilinos/tpetra
@trilinos/stokhos
@trilinos/sacado
@trilinos/shylu
@trilinos/stk
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/688
Tpetra: Add TPETRA_DEBUG environment variable
2017-10-25T15:55:11Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
This depends on #654.
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/684
Tpetra: Add TPETRA_USE_BLAS environment variable
2017-10-27T04:05:48Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra This depends on #654.
The point of this issue is that Tpetra might not be able to trust the BLAS implementation to be threaded. If TPETRA_USE_BLAS is true (has the value 1, or just plain is defined?), Tpetra should defer to the BLAS for GEMV and GEMM. Otherwise, Tpetra should implement threading for these methods. Tpetra may choose to call the underlying GEMV / GEMM implementation for single-thread kernels.
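A minimal sketch of how such a flag might be read. The issue leaves the exact policy open, so this assumes one possible answer: the flag is true when the variable is defined and its value is not "0". The names `envFlagIsTrue`, `GemmPath`, and `chooseGemmPath` are hypothetical and are not Tpetra API.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of Tpetra): interpret an environment
// variable as a Boolean flag.
//
// Assumed policy (the issue leaves this open): the flag is true if the
// variable is defined and its value is not "0"; it is false if the
// variable is undefined or set to "0".
inline bool envFlagIsTrue (const char varName[])
{
  assert (varName != nullptr);
  const char* const value = std::getenv (varName);
  if (value == nullptr) {
    return false; // variable is not defined
  }
  return std::string (value) != "0";
}

// Hypothetical dispatch: which GEMV / GEMM path a caller would take.
enum class GemmPath { DeferToBlas, ThreadedInTpetra };

inline GemmPath chooseGemmPath ()
{
  return envFlagIsTrue ("TPETRA_USE_BLAS") ?
    GemmPath::DeferToBlas : GemmPath::ThreadedInTpetra;
}
```

Reading the variable once and caching the result would avoid repeated `std::getenv` calls in hot paths; that choice is left out of the sketch.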
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/682
Tpetra::Crs{Graph,Matrix}: Add local_offset_type typedef
2016-11-02T20:11:36Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra Per request by @kddevin (see #674 discussion), add a `local_offset_type` typedef to Tpetra::CrsGraph and Tpetra::CrsMatrix. This type tells users the type that Tpetra uses to store row offsets, in the local sparse graph / matrix. The `local_` prefix makes it clear that this refers to the _local_ data structure.
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/660
Tpetra::CrsGraph: 2- or 3-level thread parallelization of sortAllIndices & mergeAllIndices
2017-10-26T20:30:14Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra Do a 2-level or 3-level thread parallelization of Tpetra::CrsGraph methods sortAllIndices and mergeAllIndices.
This is a "story" because this may call for a thread-parallel segmented sort, or segmented sort-and-merge.
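For orientation, here is a sequential sketch of what a segmented sort-and-merge computes on a CSR-style graph. It is not Tpetra's implementation; `sortAndMergeRows` and its argument layout are assumptions made for illustration. Each row (segment) is sorted and its duplicate column indices are merged, and since every iteration touches only its own segment, the outer loop is the natural place for the first level of thread parallelism.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sequential reference for a segmented sort-and-merge.
// rowOffsets has numRows+1 entries; row r owns the half-open range
// colInds[rowOffsets[r] .. rowOffsets[r+1]).
//
// Returns the number of unique column indices in each row.  The unique
// indices are compacted to the front of each row's segment; the tail of
// a segment holds unspecified leftover values.
inline std::vector<std::size_t>
sortAndMergeRows (const std::vector<std::size_t>& rowOffsets,
                  std::vector<int>& colInds)
{
  assert (! rowOffsets.empty ());
  assert (rowOffsets.back () == colInds.size ());

  const std::size_t numRows = rowOffsets.size () - 1;
  std::vector<std::size_t> numUnique (numRows);

  // Rows are independent of each other, so this loop could be run
  // in parallel over rows.
  for (std::size_t r = 0; r < numRows; ++r) {
    const auto beg =
      colInds.begin () + static_cast<std::ptrdiff_t> (rowOffsets[r]);
    const auto end =
      colInds.begin () + static_cast<std::ptrdiff_t> (rowOffsets[r+1]);
    std::sort (beg, end);                        // "sortAllIndices" step
    const auto newEnd = std::unique (beg, end);  // "mergeAllIndices" step
    numUnique[r] = static_cast<std::size_t> (newEnd - beg);
  }
  return numUnique;
}
```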
Update (12 Nov 2016): I rewrote this issue to reflect a multiple-step process. See #832. The first step will be a single-level thread parallelization. The second step (likely done at the same time) will be to remove any implicit UVM assumptions that the methods may make. The third step would be this issue, a 2-level or 3-level parallelization that relies on a segmented sort (which does not exist yet; see #662).
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/659
Tpetra::Map::getRemoteIndexList: Thread-parallelize it & remove any UVM assumptions
2017-10-26T20:29:11Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra Thread-parallelize Tpetra::Map::getRemoteIndexList and remove any UVM assumptions.
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/630
Tpetra::{Import, Export}: Add no-communication constructor
2017-09-14T19:53:59Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
This blocks #628. See that issue for details.
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/628
Tpetra::CrsGraph: Fuse column Map and Import construction
2018-05-15T23:30:08Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra Tpetra::CrsGraph constructs its column Map, if it doesn't already have one, in makeColMap(). This method needs information that could be reused to build the Import more efficiently, but CrsGraph currently throws away this information. Here is a comment from the inside of makeColMap() that explains:
```
// FIXME (mfh 03 Apr 2013) Now would be a good time to use the
// information we collected above to construct the Import. In
// particular, building an Import requires:
//
// 1. numSameIDs (length of initial contiguous sequence of GIDs
//    on this process that are the same in both Maps; this
//    equals the number of domain Map elements on this process)
//
// 2. permuteToLIDs and permuteFromLIDs (both empty in this
//    case, since there's no permutation going on; the column
//    Map starts with the domain Map's GIDs, and immediately
//    after them come the remote GIDs)
//
// 3. remoteGIDs (exactly those GIDs that we found out above
//    were not in the domain Map) and remoteLIDs (which we could
//    have gotten above by using the three-argument version of
//    getRemoteIndexList() that computes local indices as well
//    as process ranks, instead of the two-argument version that
//    was used above)
//
// 4. remotePIDs (which we have from the getRemoteIndexList()
//    call above)
//
// 5. Sorting remotePIDs, and applying that permutation to
//    remoteGIDs and remoteLIDs (by calling sort3 above instead
//    of sort2)
//
// 6. Everything after the sort3 call in Import::setupExport():
//    a. Create the Distributor via createFromRecvs(), which
//       computes exportGIDs and exportPIDs
//    b. Compute exportLIDs from exportGIDs (by asking the
//       source Map, in this case the domain Map, to convert
//       global to local)
//
// Steps 1-5 come for free, since we must do that work anyway in
// order to compute the column Map. In particular, Step 3 is
// even more expensive than Step 6a, since it involves both
// creating and using a new Distributor object.
```
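Step 5 of the comment above (sort by process rank and apply the same permutation to the two companion arrays) can be sketched as follows. This is a stand-alone illustration, not the `sort3` routine the comment refers to; the function name `sortByPidAndPermute` and the element types are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a three-array sort: sort remotePIDs
// (the keys) and apply the same permutation to remoteGIDs and
// remoteLIDs, so that entry k of all three arrays still describes
// the same remote index afterwards.
inline void
sortByPidAndPermute (std::vector<int>& remotePIDs,
                     std::vector<long long>& remoteGIDs,
                     std::vector<int>& remoteLIDs)
{
  const std::size_t n = remotePIDs.size ();
  assert (remoteGIDs.size () == n && remoteLIDs.size () == n);

  // Build the sorting permutation instead of moving data directly.
  std::vector<std::size_t> perm (n);
  std::iota (perm.begin (), perm.end (), std::size_t (0));
  // A stable sort keeps the original relative order of entries that
  // share a process rank.
  std::stable_sort (perm.begin (), perm.end (),
    [&remotePIDs] (const std::size_t a, const std::size_t b) {
      return remotePIDs[a] < remotePIDs[b];
    });

  // Apply the permutation to all three arrays.
  std::vector<int> pids (n);
  std::vector<long long> gids (n);
  std::vector<int> lids (n);
  for (std::size_t k = 0; k < n; ++k) {
    pids[k] = remotePIDs[perm[k]];
    gids[k] = remoteGIDs[perm[k]];
    lids[k] = remoteLIDs[perm[k]];
  }
  remotePIDs.swap (pids);
  remoteGIDs.swap (gids);
  remoteLIDs.swap (lids);
}
```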
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/627
Tpetra::CrsMatrix: Store matrix in such a way as to allow overlap of communication & computation
2017-10-26T20:25:13Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
Epic: #767.
This blocks #385.
For example, we could keep an extra row offsets array (where remotes start in each row) in CrsGraph.
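One way such an extra offsets array might be computed is sketched below. It rests on two assumptions that the issue does not state: local column indices below `numLocalCols` refer to locally owned columns (remotes come after them in the column Map), and each row's column indices are sorted. The name `computeRemoteStarts` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for each row of a CSR-style graph, find the
// position in colInds where that row's remote entries begin.
//
// Assumptions (not from the issue): column indices < numLocalCols are
// local, indices >= numLocalCols are remote, and each row is sorted.
inline std::vector<std::size_t>
computeRemoteStarts (const std::vector<std::size_t>& rowOffsets,
                     const std::vector<int>& colInds,
                     const int numLocalCols)
{
  assert (! rowOffsets.empty ());
  assert (rowOffsets.back () == colInds.size ());

  const std::size_t numRows = rowOffsets.size () - 1;
  std::vector<std::size_t> remoteStarts (numRows);
  for (std::size_t r = 0; r < numRows; ++r) {
    const auto beg =
      colInds.begin () + static_cast<std::ptrdiff_t> (rowOffsets[r]);
    const auto end =
      colInds.begin () + static_cast<std::ptrdiff_t> (rowOffsets[r+1]);
    assert (std::is_sorted (beg, end));
    // First entry in the row whose column index is not local.
    const auto firstRemote = std::lower_bound (beg, end, numLocalCols);
    remoteStarts[r] =
      static_cast<std::size_t> (firstRemote - colInds.begin ());
  }
  return remoteStarts;
}
```

With this array, a sparse matrix-vector multiply could process entries in `[rowOffsets[r], remoteStarts[r])` while communication is in flight, and entries in `[remoteStarts[r], rowOffsets[r+1])` after the remote data arrive.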
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/606
Tpetra::Crs{Graph,Matrix}: Either remove getGlobalNumEntries & getGlobalMaxNumRowEntries, or make them global collectives
2018-05-08T18:52:15Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
@trilinos/amesos2 @trilinos/ifpack2 @trilinos/muelu @trilinos/xpetra @trilinos/zoltan2
Tpetra::CrsGraph and Tpetra::CrsMatrix provide three methods, getGlobalNumDiags, getGlobalNumEntries, and getGlobalMaxNumRowEntries. These methods do _not_ currently have collective semantics. Thus, it must be correct to call them at any time (when the graph / matrix is fillComplete), on any process. This implies that the graph / matrix must compute them via all-reduce at first fillComplete. This increases set-up cost.
Most users don't need to know the global (over all MPI processes in the communicator) number of entries, or diagonal entries. Some users might; for example, Amesos2 might want to know the global number of entries in order to prepare enough space to gather in the matrix for a direct solve. However, those users can do the all-reduce themselves, and save and reuse its result as part of the "symbolic factorization" set-up phase.
Thus, I think it would be good to deprecate and remove these methods. @csiefer2 talked about another option, namely to change the methods to have collective semantics that cache the value on first call and clear the cache at resumeFill (if the graph's structure can change). Please explain which option you would prefer here.
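The caching option can be sketched without MPI by passing the reduction in as a callable. Everything here is a hypothetical stand-in: `GraphCountsSketch` is not a Tpetra class, and the `globalSum` callback stands for whatever all-reduce the real code would perform.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical sketch of "collective with cache" semantics.
class GraphCountsSketch {
public:
  // Stand-in for an all-reduce: maps this process' local count to
  // the global sum.  In real code this would be a collective call.
  using reduce_type = std::function<std::size_t (std::size_t)>;

  GraphCountsSketch (const std::size_t localNumEntries,
                     reduce_type globalSum)
    : localNumEntries_ (localNumEntries),
      globalSum_ (std::move (globalSum))
  {
    assert (static_cast<bool> (globalSum_)); // need a callable reduction
  }

  // Collective: every process must call this the first time after
  // (re)filling; later calls just return the cached value.
  std::size_t getGlobalNumEntries ()
  {
    if (! haveGlobalNumEntries_) {
      globalNumEntries_ = globalSum_ (localNumEntries_);
      haveGlobalNumEntries_ = true;
    }
    return globalNumEntries_;
  }

  // The structure may change after this, so drop the cached value.
  void resumeFill () { haveGlobalNumEntries_ = false; }

  void insertLocalEntries (const std::size_t numNew)
  {
    localNumEntries_ += numNew;
  }

private:
  std::size_t localNumEntries_;
  reduce_type globalSum_;
  std::size_t globalNumEntries_ = 0;
  bool haveGlobalNumEntries_ = false;
};
```

Under this design fillComplete does no reduction at all; the cost is paid only by callers who ask for the global count.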
Edit (21 Dec 2016): I changed the issue title, to make clear the consequences of a fix, and edited the text a bit to give affected packages a chance to offer feedback for their preferred solution.
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/608
Tpetra::Distributor: Split out execution of comm pattern into separate class
2017-10-26T20:19:36Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
Tpetra::Distributor currently combines setting up a communication pattern, with executing that pattern. Methods for executing the communication pattern are templated on Packet type.
We might like to have different Distributor back-ends, in order to support communication protocols other than MPI 2-sided. (For example, we could have an MPI 1-sided implementation, or a PGAS implementation, or a Kokkos-wrapping-PGAS implementation.) This implies a base class with subclasses for the different implementations.
In order to make execution of a communication plan happen through virtual methods, we would need to template the class on Packet, not the methods. (Templated methods can't be virtual.) However, the setup code does _not_ depend on the Packet type. We would not want to build all that setup code redundantly for all Packet types (there are a lot of them!).
This suggests splitting the setup code into a separate class from the execution code.
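The shape of that split might look like the sketch below. All class names are hypothetical, and the "communication" is reduced to a local gather so that the example stands alone; the point is only that the plan class is not templated, while the executor is templated at class level and can therefore have virtual methods.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Set-up: not templated on Packet, so it is compiled once.
// Here the "plan" is just the order in which to gather exports.
class CommPlanSketch {
public:
  explicit CommPlanSketch (std::vector<std::size_t> sendOrder)
    : sendOrder_ (std::move (sendOrder)) {}
  const std::vector<std::size_t>& sendOrder () const { return sendOrder_; }
private:
  std::vector<std::size_t> sendOrder_;
};

// Execution: the class (not the method) is templated on Packet,
// which is what allows execute() to be virtual.
template<class Packet>
class PlanExecutorSketch {
public:
  virtual ~PlanExecutorSketch () = default;
  virtual std::vector<Packet>
  execute (const CommPlanSketch& plan,
           const std::vector<Packet>& exports) const = 0;
};

// One hypothetical back end.  A real one would post sends and
// receives; this one only permutes data locally.
template<class Packet>
class LocalCopyExecutorSketch : public PlanExecutorSketch<Packet> {
public:
  std::vector<Packet>
  execute (const CommPlanSketch& plan,
           const std::vector<Packet>& exports) const override
  {
    std::vector<Packet> imports;
    imports.reserve (plan.sendOrder ().size ());
    for (const std::size_t k : plan.sendOrder ()) {
      assert (k < exports.size ());
      imports.push_back (exports[k]);
    }
    return imports;
  }
};
```

The same `CommPlanSketch` object can be handed to executors for different Packet types, so the set-up code is neither rebuilt nor re-instantiated per Packet.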
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/616
Tpetra::Map: Deprecate & remove getIndexBase() method
2017-10-26T20:23:57Z
James Willenbring
*Created by: mhoemmen*
@trilinos/xpetra @trilinos/tpetra @tawiesn @csiefer2
Issues like #613 point out the uselessness of the Tpetra::Map::getIndexBase() method. Users often think that "index base" means "0 for C or C++; 1 for Fortran." This is _not_ the case; Tpetra::Map confusingly requires that the index base equal the globally minimum global index. (Tpetra inherited this requirement from Epetra; Epetra_Map also has this requirement.) This makes the index base redundant. Even if Tpetra::Map didn't have this requirement, the getIndexBase() method serves no useful purpose.
The indexBase argument to the contiguous Tpetra::Map constructors _does_ serve a useful purpose, because it tells the Map the globally minimum global index. However, after construction, users can just ask the Map for its globally minimum global index.
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/581
Tpetra::CrsMatrix: Deprecate and remove localSolve
2018-06-08T17:16:25Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra @trilinos/ifpack2
As discussed in #514, sparse triangular solve properly belongs to a solver or preconditioner package, rather than to Tpetra. Removing this method would reduce the amount of code to build in Tpetra. It would also give us the freedom to reduce the cost of fillComplete, by not computing local constants (like whether the matrix is upper or lower triangular).
Tpetra::CrsMatrix::localSolve does give Tpetra users a way to solve triangular systems where the vectors have a different Scalar type than the matrix. However, KokkosSparse::trsv already exposes this functionality. Furthermore, putting a local sparse triangular solve in Tpetra may confuse users who think of Tpetra as offering global computational kernels, and who don't understand the difference between "local" and "global."
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/564
Tpetra::Map constructors: If node is null, make default Node
2017-03-27T20:15:45Z
James Willenbring
*Created by: mhoemmen*
This came up for me in the Matrix Market Map reader, when debugging #558.
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/555
Tpetra: Make it possible to Import/Export a RowGraph into a CrsGraph
2016-11-02T20:00:46Z
James Willenbring
*Created by: mhoemmen*
@kddevin writes:
> I would like to redistribute a Tpetra::RowGraph into a new Tpetra::CrsGraph.
>
> To distribute a Tpetra::RowMatrix into a new Tpetra::CrsMatrix, we can use CrsMatrix::doImport, as both CrsMatrix and RowMatrix are SrcDistObject and, thus, the RowMatrix source can be the first argument of CrsMatrix::doImport.
>
> But for RowGraph, we cannot use CrsGraph::doImport, as RowGraph is not a SrcDistObject, even though CrsGraph, RowMatrix, and CrsMatrix are all SrcDistObject.
>
> Why isn't RowGraph a SrcDistObject? Is there a better way to do the redistribution?
CrsGraph's implementation of DistObject methods (in particular, checkSizes, copyAndPermute, and packAndPrepare) all use the RowGraph interface. Furthermore, RowGraph implements Packable. Thus, all we need to do is make RowGraph inherit from SrcDistObject.
This is TRIVIAL since SrcDistObject implements no methods other than a virtual destructor. Would you like to do it?
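The proposed change can be illustrated with stand-in classes. None of these are the real Tpetra classes, and the use of virtual inheritance is an assumption made here so that the target graph ends up with a single source-object base.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for SrcDistObject: no methods other than a virtual destructor.
class SrcDistObjectSketch {
public:
  virtual ~SrcDistObjectSketch () = default;
};

// The proposed change: the row-graph interface inherits from the
// source-object base.  (Virtual inheritance is an assumption here.)
class RowGraphSketch : public virtual SrcDistObjectSketch {
public:
  virtual std::size_t numRows () const = 0;
};

// Stand-in for DistObject: doImport accepts any source object.
class DistObjectSketch : public virtual SrcDistObjectSketch {
public:
  bool doImport (const SrcDistObjectSketch& source)
  {
    return checkSizes (source);
  }
protected:
  virtual bool checkSizes (const SrcDistObjectSketch& source) = 0;
};

// Stand-in for CrsGraph: its checkSizes needs only the RowGraph interface.
class CrsGraphSketch : public RowGraphSketch, public DistObjectSketch {
public:
  explicit CrsGraphSketch (const std::size_t n) : numRows_ (n) {}
  std::size_t numRows () const override { return numRows_; }
protected:
  bool checkSizes (const SrcDistObjectSketch& source) override
  {
    return dynamic_cast<const RowGraphSketch*> (&source) != nullptr;
  }
private:
  std::size_t numRows_;
};

// A row graph that is not a CrsGraph, e.g. a user-defined wrapper.
class OtherRowGraphSketch : public RowGraphSketch {
public:
  explicit OtherRowGraphSketch (const std::size_t n) : numRows_ (n) {}
  std::size_t numRows () const override { return numRows_; }
private:
  std::size_t numRows_;
};

// Usage: the non-CrsGraph source is now a valid doImport argument.
inline void demoRowGraphImport ()
{
  OtherRowGraphSketch source (4);
  CrsGraphSketch target (4);
  const bool ok = target.doImport (source);
  assert (ok);
  (void) ok; // avoid an unused-variable warning if NDEBUG is defined
}
```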
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/553
Tpetra::{Export, Import}: Implement 2-arg "copy constructor" like MultiVector's
2016-11-02T20:00:39Z
James Willenbring
*Created by: mhoemmen*
#531 depends on this (the CrsGraph may store an Import and/or Export; it may make sense to deep-copy them, if the graph's structure may change).
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/532
Tpetra::CrsMatrix: Implement 2-arg "copy constructor" like MultiVector's
2017-10-27T04:07:44Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra Give Tpetra::CrsMatrix a two-argument "copy constructor" like MultiVector's. It should take an optional second argument that tells it whether to make a deep copy. If the graph is a const graph, it should be able just to copy the values.
If users want a "true deep copy" of the graph as well, the fix for #531 will offer that.
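A sketch of the intended semantics, using stand-in types. `DataAccessSketch`, `GraphSketch`, and `MatrixSketch` are hypothetical, and defaulting the second argument to a shallow copy is an assumption. With the deep-copy option, the values are duplicated while the const graph is still shared.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a copy-or-view tag.
enum class DataAccessSketch { Copy, View };

struct GraphSketch {
  std::vector<int> colInds;
};

class MatrixSketch {
public:
  MatrixSketch (std::shared_ptr<const GraphSketch> graph,
                std::vector<double> values)
    : graph_ (std::move (graph)),
      values_ (std::make_shared<std::vector<double> > (std::move (values)))
  {
    assert (graph_ != nullptr);
    assert (graph_->colInds.size () == values_->size ());
  }

  // Two-argument "copy constructor".  View shares the values with the
  // source; Copy duplicates the values.  Either way the const graph is
  // shared, since nobody can change it through this matrix.
  MatrixSketch (const MatrixSketch& src,
                const DataAccessSketch access = DataAccessSketch::View)
    : graph_ (src.graph_),
      values_ (access == DataAccessSketch::Copy ?
               std::make_shared<std::vector<double> > (*src.values_) :
               src.values_)
  {}

  double& value (const std::size_t k) { return (*values_)[k]; }
  const GraphSketch* graph () const { return graph_.get (); }

private:
  std::shared_ptr<const GraphSketch> graph_;
  std::shared_ptr<std::vector<double> > values_;
};
```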
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/531
Tpetra::CrsGraph: Implement 2-arg "copy constructor" like MultiVector's
2017-10-27T04:07:56Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra Give Tpetra::CrsGraph a two-argument "copy constructor" like MultiVector's. It should take an optional second argument that tells it whether to make a deep copy.
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/505
Tpetra: Make it possible to use a Kokkos::Device or Kokkos execution space in place of Node
2017-10-26T20:07:17Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra
#57 depends on this.
1. Move Tpetra objects into an inner, hidden namespace.
2. Use C++11 type aliases in the Tpetra namespace, so that supplying a Kokkos::Device or Kokkos execution space in place of the Node works.
For example (default template parameters omitted for brevity):
```
namespace Tpetra {
namespace Impl {
template<class S, class LO, class GO, class Node>
class MultiVector { /* the actual implementation */ };
} // namespace Impl
template<class S, class LO, class GO, class ExecSpace, class MemSpace>
using MultiVector = Impl::MultiVector<S, LO, GO, Kokkos::Compat::KokkosDeviceWrapperNode<ExecSpace, MemSpace> >;
} // namespace Tpetra
```
Tpetra-backlog
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/448
Tpetra::CrsMatrix: Use new KokkosKernels sumInto where possible
2017-10-26T20:06:54Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra See #369 and #447. When @bathmatt added sumIntoValuesSorted to KokkosSparse::CrsMatrix, there was some controversy about whether it was adequately tested. The best way to test it (and optimize it) would be to make Tpetra::CrsMatrix use it, where that is possible.
This requires implementing the search "hint" (see #369 discussion). I'm doing this by changing those methods to call the existing and tested findRelOffset function. Tpetra::Crs{Graph, Matrix} already test this, and the implementation works even if the graph or matrix has never been fill complete.
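The idea of a search "hint" can be sketched as below. This is a stand-alone illustration for a sorted row only; `findOffsetWithHint` is a hypothetical name and signature, not the findRelOffset function mentioned above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of a hinted search in one sorted row.
// Returns the offset of indToFind within sortedRow, or
// sortedRow.size() if the index is not present.
inline std::size_t
findOffsetWithHint (const std::vector<int>& sortedRow,
                    const int indToFind,
                    const std::size_t hint)
{
  assert (std::is_sorted (sortedRow.begin (), sortedRow.end ()));

  // Fast path: the caller's guess is right, so no search is needed.
  if (hint < sortedRow.size () && sortedRow[hint] == indToFind) {
    return hint;
  }
  // Slow path: binary search over the whole row.
  const auto it =
    std::lower_bound (sortedRow.begin (), sortedRow.end (), indToFind);
  if (it != sortedRow.end () && *it == indToFind) {
    return static_cast<std::size_t> (it - sortedRow.begin ());
  }
  return sortedRow.size (); // not found
}
```

When a caller sums into consecutive entries of a row, passing the previous offset plus one as the next hint makes most lookups hit the fast path.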
Tpetra-backlog