Trilinos issues
https://gitlab.osti.gov/jmwille/Trilinos/-/issues
2019-02-26T07:30:02Z

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4439
MueLu "repartition: heuristic target rows per process" not found
2019-02-26T07:30:02Z
James Willenbring
*Created by: spdomin*
Select MueLu-based Nalu regression tests fail when built against Trilinos develop (first seen in the 2/19/2019 nightly testing).
The error is as follows (all platforms):
terminate called after throwing an instance of 'MueLu::Exceptions::RuntimeError'
terminate called after throwing an instance of 'MueLu::Exceptions::RuntimeError'
what(): /scratch/spdomin/nightlyBuildAndTest/Trilinos/packages/muelu/src/MueCentral/MueLu_Level.hpp:196:
Throw number = 1
Throw test that evaluated to true: !IsKey(fac, ename)
"repartition: heuristic target rows per process" not found
terminate called after throwing an instance of 'MueLu::Exceptions::RuntimeError'
what(): /scratch/spdomin/nightlyBuildAndTest/Trilinos/packages/muelu/src/MueCentral/MueLu_Level.hpp:196:
Bad (2/19):
NaluCFD/Nalu SHA1: df59923cdcaf75cb68a715932b134596d5ebf733
Trilinos/develop SHA1: fc0976b09507f493cee89d2e8591bb0fa1e9fbe6
Good (2/18):
NaluCFD/Nalu SHA1: df59923cdcaf75cb68a715932b134596d5ebf733
Trilinos/develop SHA1: 30457dff21c3bbe6dc5ed0de6d44ea776db8f8d9
The following tests FAILED:
1 - ablNeutralEdge (Failed)
27 - elemHybridFluids (Failed)
74 - oversetHybrid (Failed)
75 - uqSlidingMeshDG (Failed)
76 - waleElemXflowMixFrac3.5m (Failed)

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4426
ROL: incomplete StepFactory class ?
2019-02-19T18:23:18Z
James Willenbring
*Created by: jschueller*
@trilinos/rol
I wonder why ROL_StepFactory.hpp only allows building a subset of the available algorithms:
https://github.com/trilinos/Trilinos/blob/master/packages/rol/src/step/ROL_StepFactory.hpp#L73
Others are available, such as Newton-Krylov:
https://trilinos.org/docs/r12.12/packages/rol/doc/html/group__step__group.html
The same applies in isValidStep(), used in the OptimizationSolver interface.
https://github.com/trilinos/Trilinos/blob/master/packages/rol/src/zoo/ROL_Types.hpp#L357

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4418
Tpetra::MultiVector: No CombineModes should do atomic updates with OpenMP if only 1 thread
2019-02-26T19:36:16Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra @crtrott @vbrunini @jjellio
`Tpetra::MultiVector::unpackAndCombineNew` currently uses atomic updates for all CombineModes (but see #4417), except with `Kokkos::Serial`. This means that MultiVector will always use atomic updates with `Kokkos::OpenMP`, even if the number of threads is one. As a result, MultiVector penalizes users for enabling OpenMP. MultiVector should instead check if the number of threads is one, and not use atomic updates in that case.
Fixing this will require refactoring MultiVector's "Op"s that implement the CombineModes. Right now, the Ops are responsible for deciding whether to do atomic updates. To fix this issue, we could just change the Ops so that they have two methods, one for atomic update and one for nonatomic, and let the calling functors decide which to use (based on a run-time switch). (That would also let us fix #962 later if we wanted; the run-time switch would be "using only one thread, or no duplicate LIDs.") This would also let us remove the "ExecutionSpace" template parameter from the Ops (which currently only is a means to select whether to use atomic updates), so it should prove a net simplification.
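The proposed split could look roughly like the sketch below. It is not Tpetra code: `AddOp`, `unpackAndCombine`, and the `useAtomicUpdates` flag are hypothetical names, and `std::atomic` stands in for the Kokkos atomics so that the example compiles on its own. The Op offers both update flavors, and the caller picks one with a run-time switch (for instance "only one thread, or no duplicate LIDs").

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a CombineMode "Op" (here: ADD).
// It offers both update flavors and leaves the choice to the caller.
struct AddOp {
  // Atomic read-modify-write via a compare-exchange loop.
  static void atomicUpdate(std::atomic<double>& dst, double src) {
    double old = dst.load(std::memory_order_relaxed);
    while (!dst.compare_exchange_weak(old, old + src,
                                      std::memory_order_relaxed)) {
      // On failure, 'old' is refreshed with the current value; retry.
    }
  }
  // Separate load and store; only safe if no other thread touches dst.
  static void plainUpdate(std::atomic<double>& dst, double src) {
    dst.store(dst.load(std::memory_order_relaxed) + src,
              std::memory_order_relaxed);
  }
};

// Hypothetical calling functor: the run-time switch lives here, not in the Op.
template <class Op>
void unpackAndCombine(std::vector<std::atomic<double>>& dst,
                      const std::vector<std::size_t>& lids,
                      const std::vector<double>& imports,
                      bool useAtomicUpdates) {
  assert(lids.size() == imports.size());
  for (std::size_t k = 0; k < lids.size(); ++k) {
    if (useAtomicUpdates) {
      Op::atomicUpdate(dst[lids[k]], imports[k]);
    } else {
      Op::plainUpdate(dst[lids[k]], imports[k]);
    }
  }
}
```

With this shape the Op no longer needs an execution-space template parameter, because it never decides for itself which flavor to use.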
## Related Issues
* Related to #962, #4417

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4412
Tpetra::Distributor: Don't fill indicesFrom_
2019-02-18T16:38:12Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra @jjellio @kddevin @vbrunini @crtrott
`Tpetra::Distributor` would always fill `indicesFrom_` with 0, 1, ..., `totalReceiveLength_-1`. This was not only a waste of time, it would unnecessarily make all reverse-mode communication take the slow path. (Reversing a Distributor swaps `indicesTo_` and `indicesFrom_`. A nonempty `indicesTo_` tells Distributor to take the slow path. If `indicesFrom_` is just 0, 1, ..., `totalReceiveLength_-1`, then Distributor does not need to take the slow path.)
Note that `Epetra_MpiDistributor` comments out the code that fills `indices_from_`, and claims that doing so fixes reverse communication.
@vbrunini found that applying my patch for fixing this made Aria's post-solve communication of the solution vector of the ViewFactor system a lot faster on 2 GPUs.
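The rule described above can be illustrated with a small standalone sketch. The function names are hypothetical and this is not the Distributor implementation; it only shows why an index list equal to 0, 1, ..., n-1 carries no information and can be left empty, so that a "nonempty list means slow path" test does not fire needlessly.

```cpp
#include <cstddef>
#include <vector>

// Returns true if 'indices' is exactly 0, 1, ..., n-1, i.e. it carries no
// information beyond its length.
bool isIdentityPermutation(const std::vector<std::size_t>& indices) {
  for (std::size_t i = 0; i < indices.size(); ++i) {
    if (indices[i] != i) return false;
  }
  return true;
}

// Hypothetical helper: keep the index list only when it is not the identity,
// so that an empty list can signal "contiguous data, fast path is fine".
std::vector<std::size_t> compressIndices(std::vector<std::size_t> indices) {
  if (isIdentityPermutation(indices)) indices.clear();
  return indices;
}

// Mirrors the rule in the text: a nonempty index list forces the slow path.
bool needsSlowPath(const std::vector<std::size_t>& indices) {
  return !indices.empty();
}
```

After a reversal swaps the two index lists, a list that was never filled stays empty and the reversed communication can remain on the fast path.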

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4424
Framework: Parameterize CDash Tracks for Pull Request Tests
2019-02-25T17:11:05Z
James Willenbring
*Created by: william76*
@trilinos/framework
We discussed parameterizing the PR testing drivers so we can specify in the Jenkins job which track we'd like the tests to go to.
I'm testing this [on my fork][1]: I added a parameter `PULLREQUEST_CDASH_TRACK` which, if set and non-empty, sets the `CDASH_TRACK` variable that's used inside the `PullRequestLinuxDriver-Test.sh` script.
I'm running a quick test using my Jenkins simulator and it looks like it's working as intended: using '0000' as the PR number, it's showing up in the Clean track [here][2].
### Tasks
- [x] Update `PullRequestLinuxDriver-Test.sh` to take in a parameter for the desired CDash track to report results to.
- [x] Add parameter `PULLREQUEST_CDASH_TRACK` to [Trilinos_pullrequest_cuda_9.2][3]
- [x] Add parameter `PULLREQUEST_CDASH_TRACK` to [Trilinos_pullrequest_gcc_4.8.4][4]
- [x] Add parameter `PULLREQUEST_CDASH_TRACK` to [Trilinos_pullrequest_gcc_4.9.3][5]
- [x] Add parameter `PULLREQUEST_CDASH_TRACK` to [Trilinos_pullrequest_gcc_4.9.3_SERIAL][6]
- [x] Add parameter `PULLREQUEST_CDASH_TRACK` to [Trilinos_pullrequest_gcc_7.2.0][7]
- [x] Add parameter `PULLREQUEST_CDASH_TRACK` to [Trilinos_pullrequest_gcc_7.3.0][8]
- [x] Add parameter `PULLREQUEST_CDASH_TRACK` to [Trilinos_pullrequest_intel_17.0.1][9]
- [x] Update PR Driver configuration in the autotester.
FYI: @jwillenbring
[1]: https://github.com/william76/Trilinos/blob/parameterized-pr-cdash-track/cmake/std/PullRequestLinuxDriver-Test.sh#L298-L312
[2]: https://testing-vm.sandia.gov/cdash/index.php?project=Trilinos&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=2&showfilters=1&filtercombine=and&field1=buildname&compare1=63&value1=PR-0000-test&field2=buildstarttime&compare2=84&value2=NOW
[3]: https://ascic-jenkins.sandia.gov/job/trilinos-folder/job/Trilinos_pullrequest_cuda_9.2/
[4]: https://ascic-jenkins.sandia.gov/job/trilinos-folder/job/Trilinos_pullrequest_gcc_4.8.4/
[5]: https://ascic-jenkins.sandia.gov/job/trilinos-folder/job/Trilinos_pullrequest_gcc_4.9.3/
[6]: https://ascic-jenkins.sandia.gov/job/trilinos-folder/job/Trilinos_pullrequest_gcc_4.9.3_SERIAL/
[7]: https://ascic-jenkins.sandia.gov/job/trilinos-folder/job/Trilinos_pullrequest_gcc_7.2.0/
[8]: https://ascic-jenkins.sandia.gov/job/trilinos-folder/job/Trilinos_pullrequest_gcc_7.3.0/
[9]: https://ascic-jenkins.sandia.gov/job/trilinos-folder/job/Trilinos_pullrequest_intel_17.0.1/

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4417
Tpetra::MultiVector: REPLACE / INSERT CombineMode need not use atomic updates
2019-02-26T10:34:01Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra @crtrott @vbrunini @jjellio
See discussion here: https://github.com/trilinos/Trilinos/issues/962#issuecomment-464524708
@vbrunini , @crtrott , and I found out Friday that MultiVector Import goes faster on multiple GPUs when we remove use of atomic updates (`Kokkos::atomic_assign`) with the REPLACE CombineMode. (INSERT and REPLACE do the same thing with MultiVector.) MultiVector uses atomics to implement "sparse all-reduce," but using them with the REPLACE CombineMode isn't a well-defined reduction, even ignoring rounding error. Thus, it makes sense to get rid of them, for MultiVector's REPLACE implementation at least.
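A small standalone sketch of why REPLACE is not a well-defined reduction when the same local index receives more than one value. `Packet`, `combineAdd`, and `combineReplace` are hypothetical names for illustration only; the real code operates on Kokkos Views.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One incoming value destined for local index 'lid'.
struct Packet {
  std::size_t lid;
  double value;
};

// ADD-style combine: the result does not depend on packet order
// (ignoring rounding error).
std::vector<double> combineAdd(std::vector<double> dst,
                               const std::vector<Packet>& packets) {
  for (const Packet& p : packets) {
    assert(p.lid < dst.size());
    dst[p.lid] += p.value;
  }
  return dst;
}

// REPLACE-style combine: with duplicate LIDs the last writer wins, so the
// result depends on the order in which packets happen to be applied.
std::vector<double> combineReplace(std::vector<double> dst,
                                   const std::vector<Packet>& packets) {
  for (const Packet& p : packets) {
    assert(p.lid < dst.size());
    dst[p.lid] = p.value;
  }
  return dst;
}
```

Applying the packets `{0, 1.0}, {0, 2.0}` in either order gives the same sum, but REPLACE yields 2.0 in one order and 1.0 in the other. An atomic store does not make the outcome any more deterministic, since it does not fix which write lands last.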
## Related Issues
* Related to #962

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4407
Trilinos build error due to missing file in Sacado
2019-02-16T06:49:17Z
James Willenbring
*Created by: ikalash*
The Albany builds failed again due to Trilinos. Here is the error
```
/home/ikalash/nightlyAlbanyTests/Results/Trilinos/build/install/include/Sacado_MathFunctions.hpp:42:10: fatal error: Sacado_Fad_Ops_Fwd.hpp: No such file or directory
```
http://cdash.sandia.gov/CDash-2-3-0/viewBuildError.php?buildid=81523
@trilinos/sacado

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4392
Ifpack2::OverlappingRowMatrix::apply is unnecessarily sequential
2019-02-15T20:01:22Z
James Willenbring
*Created by: mhoemmen*
@trilinos/ifpack2
`Ifpack2::OverlappingRowMatrix::apply` uses the old ArrayRCP host access interface of `Tpetra::MultiVector`, when it could just use `Tpetra::CrsMatrix::localApply` and get thread / CUDA parallelism for free. This could also be related to #4353.
This will require making `Tpetra::CrsMatrix::localApply` public, but I'm OK with that; it's a useful function for implementing block operators efficiently.
## Related Issues
* Related to #4353

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4394
KokkosKernels: KokkosKernels_blas_serial_MPI_1 tests are unstable
2019-02-21T19:32:59Z
James Willenbring
*Created by: william76*
@trilinos/kokkos-kernels
The `KokkosKernels_blas_serial_MPI_1` test has had some [intermittent failures over the past few days][1]. Can someone on the @trilinos/kokkos-kernels team have a look at this and see what's happening?
The [CDash output][2] isn't terribly exciting:
```
Note: Google Test filter = -serial.gemm_double
[==========] Running 89 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 89 tests from serial
[ RUN ] serial.gemm_complex_double
```
@ndellingwood when you have a chance can you have a look at this? I've been seeing the error on the develop->master pull requests. It popped up after the merge of new kokkos stuff last week, but it seems to be intermittent so we don't have a ton of data points just yet.
[1]: https://testing.sandia.gov/cdash/testSummary.php?project=1&name=KokkosKernels_blas_serial_MPI_1&date=2019-02-13
[2]: https://testing-vm.sandia.gov/cdash/testDetails.php?test=65847315&build=4549908

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4398
Belos::LinearProblem MV and OP this_type_is_missing_a_specialization
2019-02-14T09:23:54Z
James Willenbring
*Created by: freaklovesmango*
I get a compile error that involves `Belos::LinearProblem`, and I am not sure how to solve it. What did I miss? Is it something I should have included? Is it a typedef problem?
My code is essentially the following; the matrix A and the vectors x and b are all held in Teuchos::RCPs as well.
```
#include <Tpetra_MultiVector.hpp>
#include <Tpetra_Core.hpp>
#include <Tpetra_Version.hpp>
#include <BelosLinearProblem.hpp>
#include <Teuchos_RCP.hpp>
#include "Tpetra_Operator.hpp"
int main(int argc, char *argv[])
{
  typedef Tpetra::Vector<>::scalar_type scalar_type;
  typedef Tpetra::Vector<>::local_ordinal_type local_ordinal_type;
  typedef Tpetra::Vector<>::global_ordinal_type global_ordinal_type;
  typedef Tpetra::Vector<>::node_type node_type;
  typedef Tpetra::MultiVector<> mv_type;
  typedef Tpetra::Operator<> op_type;
  ...
  Teuchos::RCP<Belos::LinearProblem<scalar_type, mv_type, op_type>> belosProblem =
    Teuchos::rcp(new Belos::LinearProblem<scalar_type, mv_type, op_type> (A, x, b));
}
```
And this is the error message I get:
```
In file included from /home/freaklovesmango/build-trilinos-tpetra/install/Trilinos/include/BelosLinearProblem.hpp:49:0,
from main.cpp:13:
/home/freaklovesmango/build-trilinos-tpetra/install/Trilinos/include/BelosMultiVecTraits.hpp: In instantiation of ‘static ScalarType Belos::UndefinedMultiVecTraits<ScalarType, MV>::notDefined() [with ScalarType = double; MV = Tpetra::MultiVector<>]’:
/home/freaklovesmango/build-trilinos-tpetra/install/Trilinos/include/BelosMultiVecTraits.hpp:219:58: required from ‘static int Belos::MultiVecTraits<ScalarType, MV>::GetNumberVecs(const MV&) [with ScalarType = double; MV = Tpetra::MultiVector<>]’
/home/freaklovesmango/build-trilinos-tpetra/install/Trilinos/include/BelosLinearProblem.hpp:880:51: required from ‘bool Belos::LinearProblem<ScalarType, MV, OP>::setProblem(const Teuchos::RCP<T2>&, const Teuchos::RCP<const T2>&) [with ScalarType = double; MV = Tpetra::MultiVector<>; OP = Tpetra::Operator<>]’
main.cpp:184:41: required from here
/home/freaklovesmango/build-trilinos-tpetra/install/Trilinos/include/BelosMultiVecTraits.hpp:80:55: error: ‘this_type_is_missing_a_specialization’ is not a member of ‘Tpetra::MultiVector<>’
return MV::this_type_is_missing_a_specialization();
^
In file included from /home/freaklovesmango/build-trilinos-tpetra/install/Trilinos/include/BelosLinearProblem.hpp:50:0,
from main.cpp:13:
/home/freaklovesmango/build-trilinos-tpetra/install/Trilinos/include/BelosOperatorTraits.hpp: In instantiation of ‘static void Belos::UndefinedOperatorTraits<ScalarType, MV, OP>::notDefined() [with ScalarType = double; MV = Tpetra::MultiVector<>; OP = Tpetra::Operator<>]’:
/home/freaklovesmango/build-trilinos-tpetra/install/Trilinos/include/BelosOperatorTraits.hpp:131:62: required from ‘static void Belos::OperatorTraits<ScalarType, MV, OP>::Apply(const OP&, const MV&, MV&, Belos::ETrans) [with ScalarType = double; MV = Tpetra::MultiVector<>; OP = Tpetra::Operator<>]’
/home/freaklovesmango/build-trilinos-tpetra/install/Trilinos/include/BelosLinearProblem.hpp:893:21: required from ‘bool Belos::LinearProblem<ScalarType, MV, OP>::setProblem(const Teuchos::RCP<T2>&, const Teuchos::RCP<const T2>&) [with ScalarType = double; MV = Tpetra::MultiVector<>; OP = Tpetra::Operator<>]’
main.cpp:184:41: required from here
/home/freaklovesmango/build-trilinos-tpetra/install/Trilinos/include/BelosOperatorTraits.hpp:71:48: error: ‘this_type_is_missing_a_specialization’ is not a member of ‘Tpetra::Operator<>’
OP::this_type_is_missing_a_specialization();
```

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4375
Galeri/Triutils: Duplicate symbols in Galeri_iohb.cpp and Trilinos_util_iohb.cpp
2019-02-14T22:42:07Z
James Willenbring
*Created by: vbrunini*
## Expectations
It should be possible to link against both galeri and triutils without any duplicate symbols.
## Current Behavior
I observe linker errors about duplicate symbols for functions defined in packages/triutils/src/Trilinos_util_iohb.cpp and packages/galeri/src-epetra/Galeri_iohb.cpp. I noticed this on an NVIDIA build with relocatable device code on and ETI off, but have not tested which (if any) of those settings are required to trigger the linker errors. The contents of those two files appear to be essentially identical; the only difference I saw was that Trilinos_util_iohb.cpp qualifies its calls to std:: functions.

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4358
MueLu: Driver.cpp, adding option for multiple RHS and for belos solver
2019-02-11T14:36:35Z
James Willenbring
*Created by: lucbv*
@trilinos/muelu
## Expectations
The MueLu driver should allow users to run problems with multiple RHS; this in turn requires finer granularity in choosing the associated Belos solver.
## Current Behavior
Currently only a single RHS is allowed, and only Pseudo CG and Block GMRES are available from Belos.
## Motivation and Context
This will allow us to test more solver/preconditioner configurations that users and customers might care about.

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4348
MueLu segfaults in DeterminePartitionPlacement for RefMaxwell on serrano for >= 16 nodes
2019-02-14T20:21:41Z
James Willenbring
*Created by: pwxy*
MueLu segfaults in DeterminePartitionPlacement for RefMaxwell on serrano for >= 16 nodes.
Unfortunately, I am unable to reproduce this issue on mutrino, and the problem is somewhat intermittent. It first appears on 16 nodes, where a run is still more likely to succeed than fail; by 64 nodes, a run is more likely to fail than succeed. The test case is the awesome blob medium-size mesh (refinement=1) with 256 MPI ranks. The 16-node case uses 2 OMP threads per MPI rank; the 64-node case uses 8 OMP threads per MPI rank.
It is failing at line 535 of MueLu_RepartitionFactory_def.hpp in DeterminePartitionPlacement():
```
// Step 4: Assign unassigned partitions if necessary.
// We do that through random matching for remaining partitions. Not all part numbers are valid, but valid parts are a
// subset of [0, numProcs). The reason it is done this way is that we don't need any extra communication, as we don't
// need to know which parts are valid.
// TODO The cost of this loop is numprocs*log(numprocs), as match is a std::set(). Can this cost be reduced?
if (numPartitions - numMatched > 0) {
  for (int part = 0, matcher = 0; part < numProcs; part++) {
    if (match.count(part) == 0) {
      // Find first non-matched rank that accepts partitions
      while (matchedRanks[matcher] || !procWillAcceptPartition[matcher])  // line 535
        matcher++;
      match[part] = matcher++;
    }
  }
}
```
stack trace:
#0 MueLu::RepartitionFactory<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::OpenMP, Kokkos::HostSpace> >::DeterminePartitionPlacement (this=0x7ffffffe7b18, A=...,
decomposition=..., numPartitions=10, willAcceptPartition=248, allSubdomainsAcceptPartitions=false) at ../../packages/muelu/src/Rebalancing/MueLu_RepartitionFactory_def.hpp:535
#1 0x0000000004fc094d in MueLu::RepartitionFactory<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::OpenMP, Kokkos::HostSpace> >::Build (this=0x7ffffffe7b18, currentLevel=...)
at ../../packages/muelu/src/Rebalancing/MueLu_RepartitionFactory_def.hpp:219
#2 0x0000000004620e69 in MueLu::RefMaxwell<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::OpenMP, Kokkos::HostSpace> >::compute() ()
#3 0x0000000003416562 in MueLu::RefMaxwell<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::OpenMP, Kokkos::HostSpace> >::RefMaxwell(Teuchos::RCP<Xpetra::Matrix<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::OpenMP, Kokkos::HostSpace> > > const&, Teuchos::ParameterList&, bool) ()
#4 0x000000000346131f in Thyra::MueLuRefMaxwellPreconditionerFactory<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::OpenMP, Kokkos::HostSpace> >::initializePrec(Teuchos::RCP<Thyra::LinearOpSourceBase<double> const> const&, Thyra::PreconditionerBase<double>*, Thyra::ESupportSolveUse) const ()
#5 0x0000000005e74234 in Thyra::BelosLinearOpWithSolveFactory<double>::initializeOpImpl(Teuchos::RCP<Thyra::LinearOpSourceBase<double> const> const&, Teuchos::RCP<Thyra::LinearOpSourceBase<double> const> const&, Teuchos::RCP<Thyra::PreconditionerBase<double> const> const&, bool, Thyra::LinearOpWithSolveBase<double>*, Thyra::ESupportSolveUse) const ()
#6 0x0000000005e74c23 in Thyra::BelosLinearOpWithSolveFactory<double>::initializeOp(Teuchos::RCP<Thyra::LinearOpSourceBase<double> const> const&, Thyra::LinearOpWithSolveBase<double>*, Thyra::ESupportSolveUse) const ()
#7 0x0000000001e8575d in void Thyra::initializeOp<double>(Thyra::LinearOpWithSolveFactoryBase<double> const&, Teuchos::RCP<Thyra::LinearOpBase<double> const> const&, Teuchos::Ptr<Thyra::LinearOpWithSolveBase<double> > const&, Thyra::ESupportSolveUse) ()
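
One possible failure mode, which this report does not confirm, is that `matcher` runs past `numProcs` in the quoted `while` loop when fewer accepting, unmatched ranks remain than there are unassigned part numbers. The sketch below shows the same loop with an explicit bound. The function name, the container types, and the `bool` return value are assumptions made for this example, not MueLu's actual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Sketch of "Step 4" with an explicit bound on 'matcher'.
// Returns false if it runs out of accepting ranks before all parts are placed.
bool assignUnmatchedPartitions(std::map<int, int>& match,
                               const std::vector<bool>& matchedRanks,
                               const std::vector<bool>& procWillAcceptPartition,
                               int numProcs) {
  assert(matchedRanks.size() >= static_cast<std::size_t>(numProcs));
  assert(procWillAcceptPartition.size() >= static_cast<std::size_t>(numProcs));
  int matcher = 0;
  for (int part = 0; part < numProcs; ++part) {
    if (match.count(part) != 0) continue;  // partition already placed
    // Find the first non-matched rank that accepts partitions, but stop at
    // numProcs instead of reading past the end of the arrays.
    while (matcher < numProcs &&
           (matchedRanks[matcher] || !procWillAcceptPartition[matcher])) {
      ++matcher;
    }
    if (matcher >= numProcs) return false;  // no accepting rank left
    match[part] = matcher++;
  }
  return true;
}
```

Whether running out of ranks should be an error or simply the end of the assignment depends on how invalid part numbers are meant to be handled, which the quoted comment leaves open.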

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4353
Ifpack2_unit_tests_MPI_4 randomly failing on ATDM waterman build
2019-04-21T01:32:25Z
James Willenbring
*Created by: fryeguy52*
CC: @trilinos/ifpack2, @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe, @fryeguy52
## Description
As shown in [this query](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-waterman-cuda-9.2-opt&field2=testname&compare2=61&value2=Ifpack2_unit_tests_MPI_4&field3=site&compare3=61&value3=waterman&field4=buildstarttime&compare4=84&value4=2019-02-08T00%3A00%3A00&field5=buildstarttime&compare5=83&value5=2018-12-27T00%3A00%3A00) the test:
* Ifpack2_unit_tests_MPI_4
is randomly failing in the build:
* Trilinos-atdm-waterman-cuda-9.2-opt
It has failed roughly 6 times in the last month. Here are some examples of the output when it fails:
```
Error, relErr(Y.get1dView ()[9932],Z.get1dView ()[9932]) = relErr(29832,0) = 1 <= tol = 2.22045e-12: failed!
```
```
p=0 | The following tests FAILED:
p=0 | 48. Ifpack2OverlappingRowMatrix_default_scalar_type_default_local_ordinal_type_default_global_ordinal_type_Test0_UnitTest ...
p=0 |
p=0 | Total Time: 6.49 sec
p=0 |
p=1 | Summary: total = 82, run = 82, passed = 81, failed = 1
p=1 |
p=1 | End Result: TEST FAILED
```
## Current Status on CDash
[2 Week history of this test](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-waterman-cuda-9.2-opt&field2=testname&compare2=61&value2=Ifpack2_unit_tests_MPI_4&field3=site&compare3=61&value3=waterman&field4=buildstarttime&compare4=84&value4=tomorrow&field5=buildstarttime&compare5=83&value5=2%20weeks%20ago)
## Steps to Reproduce
One should be able to reproduce the build on waterman where this test is randomly failing as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for waterman are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#waterman
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-waterman-cuda-9.2-opt
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Ifpack2=ON \
$TRILINOS_DIR
$ make NP=16
$ bsub -x -Is -n 20 ctest -j20
```
Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4344
Tpetra::DistObject: DualView arguments have wrong types
2019-02-18T15:51:58Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra @kyungjoo-kim
This concerns the following DistObject virtual methods that its subclasses must implement:
- copyAndPermuteNew
- packAndPrepareNew
- unpackAndCombineNew
As of PR #4328, we have clarified how `DistObject::doTransferNew` expects these methods to behave, with respect to their DualView parameters. Export and Import setup is responsible for ensuring that `permuteToLIDs`, `permuteFromLIDs`, `exportLIDs`, and `remoteLIDs` are sync'd on both host and device. Thus, the above methods need not sync any of these four DualViews. As a result, their type `const DualView<const LO*, buffer_device_type>&` is correct. (`const DualView&` means you can't sync them; `DualView<const LO*, ...>` means you can't modify their entries.)
PR #4328 further clarifies that `packAndPrepareNew` may fill its `exports` and `numPacketsPerLID` DualView parameters wherever the subclass likes, either on host or device. It is not responsible for sync'ing either of these two DualViews to any particular place; `DistObject::doTransferNew` must sync them wherever it needs them to be. Similarly, `unpackAndCombineNew` may unpack received data (`imports` and `numPacketsPerLIDs`) wherever it wants, either on host or device, but is responsible for sync'ing them wherever it needs them. `DistObject::doTransferNew` is not responsible for sync'ing them.
I justify the previous paragraph as follows:
- Avoid unnecessary syncs to host, if the subclass knows how to unpack on device
- Separation of concerns between communication and pack / unpack
This imposes the following requirements on `exports`, `imports`, and `numPacketsPerLIDs`:
1. `DistObject::doTransferNew` allocates the `numPacketsPerLIDs` argument when it calls `packAndPrepareNew`, but `packAndPrepareNew` must allocate the `exports` output argument. Thus, subclasses cannot reallocate `numPacketsPerLID`.
2. `DistObject::doTransferNew` allocates the `imports` and `numPacketsPerLIDs` arguments when it calls `unpackAndCombineNew`; subclasses cannot reallocate them.
3. Subclasses need to be able to sync any of these arrays wherever they like.
4. `packAndPrepareNew` implementations access `exports` and `numPacketsPerLIDs` in write-only fashion.
5. `unpackAndCombineNew` implementations technically only need read-only access to `imports` and `numPacketsPerLIDs`. However, `DistObject::doTransferNew` only accesses them in write-only fashion before calling `unpackAndCombineNew`, and doesn't access them afterwards. Thus, `unpackAndCombineNew` implementations could freely write to these arrays, e.g., to use them as scratch space.
This means that the correct type of `imports` and `numPacketsPerLIDs` is `DualView<T*, ...>`.
- Can't have `DualView<const T*, ...>`, because that would forbid sync'ing.
- Can't have `const DualView&` or `const DualView`, because that would forbid sync'ing.
- Can't have `DualView&`, because that would expose DistObject's internal DualViews to reallocation by subclasses. Passing by value means that even if the subclass reallocates, the caller won't see that. It's just like passing `double*` into a function, instead of `double*&`.
Currently, these arrays have type `const DualView<const T*, ...>&`. This makes implementations do hacks in order to meet the above requirements (esp. (2)).
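The pass-by-value argument can be illustrated without Kokkos. In the sketch below, `ViewHandle` is a hypothetical stand-in for a view-like handle (a `std::shared_ptr` to storage, not a `Kokkos::DualView`): a callee that receives the handle by value can write entries through it, but any reallocation it performs stays local.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a view: a cheap handle to shared storage.
using ViewHandle = std::shared_ptr<std::vector<int>>;

// By value: the callee may write through the handle, and may even rebind its
// local copy to new storage, but the caller's handle is unaffected.
void fillByValue(ViewHandle v) {
  assert(v && !v->empty());
  (*v)[0] = 42;                                   // visible to the caller
  v = std::make_shared<std::vector<int>>(8, -1);  // "reallocation": local only
}

// By non-const reference: a reallocation replaces the caller's storage.
void fillByReference(ViewHandle& v) {
  v = std::make_shared<std::vector<int>>(8, -1);
}
```

This is the same distinction as passing `double*` versus `double*&`: by value, the caller's arrays keep their size and identity no matter what the callee does to its own copy of the handle.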
I propose changing the types of `imports`, and `numPacketsPerLID` (as parameters to `packAndPrepareNew` and `unpackAndCombineNew`) from `const DualView<const T*, ...>&`, to `DualView<T*, ...>`.https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4334Albany build failing in MueLu: AggregationStructuredAlgorithm_kokkos2019-02-20T15:52:41ZJames WillenbringAlbany build failing in MueLu: AggregationStructuredAlgorithm_kokkos*Created by: ibaned*
@trilinos/muelu
## Current Behavior
```
.../Trilinos/packages/muelu/src/Graph/StructuredAggregation/MueLu_StructuredAggregationFactory_kokkos_def.hpp:221:39: error: unknown type name 'AggregationStructuredAlgorithm_kokkos'; did you mean 'AggregationStructuredAlgorithm'?
myStructuredAlgorithm = rcp(new AggregationStructuredAlgorithm_kokkos());
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
AggregationStructuredAlgorithm
/.../Trilinos/packages/muelu/src/Interface/../Headers/MueLu_UseShortNamesOrdinal.hpp:36:80: note: 'AggregationStructuredAlgorithm' declared here
typedef MueLu::AggregationStructuredAlgorithm<LocalOrdinal,GlobalOrdinal,Node> AggregationStructuredAlgorithm;
^
```

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4343
Ifpack2: BlockTriDi test does (bool > 1); probably meant just to eval the bool
2019-02-17T18:52:29Z
James Willenbring
*Created by: mhoemmen*
@trilinos/ifpack2
NVCC warned that the following line evaluates a `bool > 1` expression, which is always false: https://github.com/trilinos/Trilinos/blob/e476b66bcdd0fb2efa6018a4d5ff10f5bb02d2b9/packages/ifpack2/test/unit_tests/Ifpack2_UnitTestBlockTriDiContainerUtil.hpp#L249

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4330
Build failures in SPARC 'master' + Trilinos 'develop' starting 2/5/2019 due to break in Teuchos::SerialDenseMatrix
2019-02-11T21:38:59Z
James Willenbring
*Created by: bartlettroscoe*
CC: @trilinos/teuchos , @jwillenbring (Trilinos Framework Product Lead), @bartlettroscoe, @fryeguy52, @sebrowne, @micahahoward
## Next Action Status
The backward-compatibility-breaking PR #4259, merged to 'develop' on 2/4/2019, was reverted in PR #4336, which was merged on 2/11/2019 (and into 'atdm-develop-nightly' on 2/10/2019). The SPARC 'master' + Trilinos 'develop' builds passed on 2/11/2019, except for some existing SPARC test failures.
## Description
As shown in [this SPARC CDash query for 2019-02-05](http://compsim-dashboard.sandia.gov/cdash/index.php?project=SPARC&date=2019-02-05&filtercount=1&showfilters=1&field1=buildname&compare1=63&value1=-trildev), the version of Trilinos 'develop' for testing day 2019-02-05 has broken the new SPARC 'master' + Trilinos 'develop' builds (see [TRIL-243](https://sems-atlassian-son.sandia.gov/jira/browse/TRIL-243)) which detect defects in Trilinos 'develop' which break SPARC. These build failures did not exist in those builds the previous day as shown in [this SPARC CDash query for 2019-02-04](http://compsim-dashboard.sandia.gov/cdash/index.php?project=SPARC&date=2019-02-04&filtercount=1&showfilters=1&field1=buildname&compare1=63&value1=-trildev).
The 'clang-5.0.1' build shows just one build failure [here](http://compsim-dashboard.sandia.gov/cdash/viewBuildError.php?buildid=92804) which shows:
```
sparc-ear99/src/solver-analysis/ROLInterface.C:313:14: error: invalid operands to binary expression ('std::ofstream' (aka 'basic_ofstream') and 'Teuchos::SerialDenseMatrix' (aka 'SerialDenseMatrix'))
cov_out << cov;
~~~~~~~ ^ ~~~
```
That SPARC 'master' build was built against Trilinos 'develop' for testing day 2019-02-05 for the build `Trilinos-atdm-cee-rhel6_clang-5.0.1_openmpi-1.10.2_serial_static_opt` shown [here](https://testing.sandia.gov/cdash-dev-view/index.php?project=Trilinos&parentid=4494553).
## Current Status on CDash
The current status of the SPARC 'master' + Trilinos 'develop' builds for the current testing day can be seen [here](http://compsim-dashboard.sandia.gov/cdash/index.php?project=SPARC&date=2019-02-05&filtercount=1&showfilters=1&field1=buildname&compare1=63&value1=-trildev) (internal link).
## Steps to Reproduce
Anyone with access to the SPARC development resources can reproduce the build failures using the `sparc-tril-build-helper` scripts.
Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4327
Catalyst IOSS Adapter Python Imports
2019-02-07T16:27:44Z
James Willenbring
*Created by: jrood-nrel*
@tjotaha @jeffmauldin I have one more question about the Catalyst IOSS adapter. I am working on a pull request for Spack regarding this here https://github.com/spack/spack/pull/10501 that might give some more information. Basically if I don't have the Paraview `site-packages/vtkmodules` in the `PYTHONPATH`, then the `phactori.py` file can't import `vtkParallelCorePython`:
```
/python/phactori.py", line 1897, in SmartGetLocalProcessId
import vtkParallelCorePython
ImportError: No module named vtkParallelCorePython
```
Would the better fix be to update the import in the IOSS adapter to account for this, or will the `vtkmodules` be required in the `PYTHONPATH` when loading a Paraview module?

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4298
Tpetra: "declaration shadows a typedef" warnings emitted by Apple clang 10.0.0
2019-02-18T16:38:12Z
James Willenbring
*Created by: CamelliaDPG*
@trilinos/tpetra
## Expectations
Builds with clang that use the `-Wshadow` flag should not emit warnings.
## Current Behavior
Three such warnings are emitted by clang when building `MatrixMarket_Tpetra.hpp`.
```
.../include/MatrixMarket_Tpetra.hpp:6896:56: warning: declaration shadows a typedef in 'Writer<SparseMatrixType>' [-Wshadow]
typedef typename multivector_type::scalar_type scalar_type;
^
.../include/MatrixMarket_Tpetra.hpp:5844:54: note: previous declaration is here
typedef typename SparseMatrixType::scalar_type scalar_type;
^
.../include/MatrixMarket_Tpetra.hpp:7021:56: warning: declaration shadows a typedef in 'Writer<SparseMatrixType>' [-Wshadow]
typedef typename multivector_type::scalar_type scalar_type;
^
.../include/MatrixMarket_Tpetra.hpp:5844:54: note: previous declaration is here
typedef typename SparseMatrixType::scalar_type scalar_type;
^
.../include/MatrixMarket_Tpetra.hpp:976:57: warning: declaration shadows a typedef in 'Reader<SparseMatrixType>' [-Wshadow]
typedef typename ArrayView<const GO>::size_type size_type;
^
.../include/MatrixMarket_Tpetra.hpp:220:49: note: previous declaration is here
typedef Teuchos::ArrayRCP<int>::size_type size_type;
```
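The warnings above appear to share one pattern: a typedef declared in an inner scope of the class reuses the name of a class-scope typedef. A minimal sketch of that pattern, and of one possible way to avoid it by giving the inner typedef a distinct name, follows. `ToyMatrix`, `ToyVector`, `ToyWriter`, `twice`, and `mv_scalar_type` are hypothetical names for illustration and do not appear in `MatrixMarket_Tpetra.hpp`; renaming is an assumed fix, not necessarily the change that Tpetra adopted.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical toy types; each exposes a nested scalar_type.
struct ToyMatrix {
  typedef double scalar_type;
};
struct ToyVector {
  typedef float scalar_type;
  scalar_type value;
};

template <class SparseMatrixType>
struct ToyWriter {
  // Class-scope typedef, analogous to the "previous declaration" in the notes.
  typedef typename SparseMatrixType::scalar_type scalar_type;

  template <class MultiVectorType>
  static typename MultiVectorType::scalar_type
  twice(const MultiVectorType& x) {
    // Declaring `typedef typename MultiVectorType::scalar_type scalar_type;`
    // here would reuse the class-scope name, which is the pattern that
    // -Wshadow reports. A distinct local name avoids the clash.
    typedef typename MultiVectorType::scalar_type mv_scalar_type;
    const mv_scalar_type two = 2;
    return two * x.value;
  }
};

// The class-scope typedef still refers to the matrix's scalar type.
static_assert(
    std::is_same<ToyWriter<ToyMatrix>::scalar_type, double>::value,
    "class-scope scalar_type comes from the matrix type");

// Small usage example.
inline void demoNoShadow() {
  ToyVector v;
  v.value = 1.5f;
  assert(ToyWriter<ToyMatrix>::twice(v) == 3.0f);
}
```

Compiling a file like this with `clang++ -Wshadow` should be warning-free as written; restoring the commented-out name inside `twice` is the quickest way to check whether a given compiler version reports the shadowing.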
## Environment
I'm building on a Mac using Apple clang 10.0.0.