Trilinos issues
https://gitlab.osti.gov/jmwille/Trilinos/-/issues
2018-04-11T15:58:48Z
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/891
Set up automated Nightly testing for PyTrilinos
2018-04-11T15:58:48Z
James Willenbring
*Created by: bartlettroscoe*
**CC:** @trilinos/framework, @wfspotz
**Description:**
Currently PyTrilinos is not under any automated testing that gets posted up to the Trilinos CDash site:
* testing.sandia.gov/cdash/
See the below email chain.
----
From: Bartlett, Roscoe A
Sent: Tuesday, November 29, 2016 4:10 PM
To: Spotz, William F
Cc: Willenbring, James M; Perschbacher, Brent M
Subject: RE: [Pytrilinos-regression] FAILED (c=1): Trilinos/PyTrilinos - Linux-GCC-4.7.2-MPI_RELEASE_DEBUG_SHARED_PT_CI - Continuous
Bill,
PyTrilinos was enabled by accident. See:
* https://github.com/trilinos/Trilinos/issues/482#issuecomment-263575023
* https://github.com/trilinos/Trilinos/commit/1b14dadb154a2f49e1097c91a0340d66c39306b6
It is not enabled in the correct PT CI build as shown here:
* http://testing.sandia.gov/cdash/index.php?project=Trilinos&parentid=2634468
Some work is going to need to be done before PyTrilinos can be built using the SEMS env and included in any automated testing. For example, currently, PyTrilinos is not tested in any automated Trilinos build uploaded to CDash:
* http://testing.sandia.gov/cdash/index.php?subproject=PyTrilinos&project=Trilinos&date=2016-11-28
Getting things set up to test PyTrilinos is something that you are going to need to take up with the Trilinos Framework team (i.e. Jim and Brent) and the SEMS team.
But at the very minimum, you should set up PyTrilinos testing on one of your machines where you have this working.
-Ross
----
From: Spotz, William F
Sent: Tuesday, November 29, 2016 4:04 PM
To: Bartlett, Roscoe A
Subject: Fwd: [Pytrilinos-regression] FAILED (c=1): Trilinos/PyTrilinos - Linux-GCC-4.7.2-MPI_RELEASE_DEBUG_SHARED_PT_CI - Continuous
Hi Ross,
So does “Pytrilinos-regression” include both of us?
If not, the configure errors were that
1) No python numpy module was found
2) The SWIG version, 1.3.40, was too old — required is 3.0.0
-Bill
----
From: CDash <trilinos-regression@sandia.gov>
Subject: [Pytrilinos-regression] FAILED (c=1): Trilinos/PyTrilinos - Linux-GCC-4.7.2-MPI_RELEASE_DEBUG_SHARED_PT_CI - Continuous
Date: November 29, 2016 at 1:13:33 AM MST
To: <pytrilinos-regression@software.sandia.gov>
Reply-To: <noreply@sandia.gov>
A submission to CDash for the project Trilinos has configure errors.
You have been identified as one of the authors who have checked in changes that are part of this submission or you are listed in the default contact list.
Details on the submission can be found at http://testing.sandia.gov/cdash/buildSummary.php?buildid=2633954
Project: Trilinos
SubProject: PyTrilinos
Site: crf450.srn.sandia.gov
Build Name: Linux-GCC-4.7.2-MPI_RELEASE_DEBUG_SHARED_PT_CI
Build Time: 2016-11-29T08:13:32 UTC
Type: Continuous
Configure errors: 1
*Configure*
Status: 1 (http://testing.sandia.gov/cdash/viewConfigure.php?buildid=2633954)
Output:
Configuring Trilinos build directory
-- PROJECT_SOURCE_DIR='/ascldap/users/rabartl/Trilinos.base/SEMSCIBuild/Trilinos'
-- PROJECT_BINARY_DIR='/home/rabartl/Trilinos.base/SEMSCIBuild/BUILD'
-- Trilinos_TRIBITS_DIR='/ascldap/users/rabartl/Trilinos.base/SE
-CDash on testing.sandia.gov
_______________________________________________
Pytrilinos-regression mailing list
Pytrilinos-regression@software.sandia.gov
https://software.sandia.gov/mailman/listinfo/pytrilinos-regression
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1091
Set up automated testing with SuperLUDist enabled
2017-02-24T16:00:22Z
James Willenbring
*Created by: ibaned*
As pointed out in #410, #1083, and #1090, it would be valuable to have automated Trilinos testing that enables the SuperLUDist TPL.
@trilinos/framework
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1159
Set up debug Windows Trilinos nightly build
2017-09-13T17:02:49Z
James Willenbring
*Created by: jwillenbring*
@trilinos/framework
This ticket is for setting up a nightly "debug" build for Trilinos on Windows.
Done for this story includes
-having a nightly build reporting to CDash with a debug configuration (with package options previously sent to Michael).
-the build should run on a Windows VM provided by the Trilinos team
-filing tickets for all errors found in executing the build. For all errors found, tickets should be filed in GitHub issues. If cause of error is known, a pull request could be offered to the package developers for package encountering the failure.
If necessary, another ticket will be filed for cleaning up all failures once the job is set up and running.
The initial work for setting up the build can be done anywhere, but getting the job running through Jenkins on the appropriate VM depends on tickets #1157 and #1158.
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1457
Set up new continuous builds
2017-08-02T01:27:34Z
James Willenbring
*Created by: jwillenbring*
@trilinos/framework
In alignment with the discussion at the Framework Standup this morning, we need to set up 2 new continuous builds on the newer build farm - one to support GCC 4.8.4, the other to support GCC 4.9.3.
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1160
Set up release Trilinos Windows nightly build
2017-09-13T17:01:42Z
James Willenbring
*Created by: jwillenbring*
@trilinos/framework
This ticket is for setting up a nightly "release" build for Trilinos on Windows.
Done for this story includes
-having a nightly build reporting to CDash with a release configuration (with package options previously sent to Michael).
-the build should run on a Windows VM provided by the Trilinos team
-filing tickets for all errors found in executing the build. For all errors found, tickets should be filed in GitHub issues. If cause of error is known, a pull request could be offered to the package developers for package encountering the failure.
If necessary, another ticket will be filed for cleaning up all failures once the job is set up and running.
The initial work for setting up the build can be done anywhere, but getting the job running through Jenkins on the appropriate VM depends on tickets #1157 and #1158.
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1156
Site name consistency
2017-03-21T00:24:31Z
James Willenbring
*Created by: jwillenbring*
@trilinos/framework @bartlettroscoe @bmpersc
Kitware is implementing a feature that will help us identify when a test goes missing on the dashboard. To do this, a consistent site name is required. Our machines are currently identified on the CDash dashboard by machine name. However, in situations where jobs float (including parameterized builds on the other build farm), jobs can run on different machines, and are currently listed as different sites. Could we perhaps assign a single site to all machines on the build farm, or define a "site" per job run?
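One way to picture the first option (a single site for all build-farm machines) is a small normalization step applied before results are submitted. The sketch below is only an illustration under assumptions: the function name, the `build-farm` label, and the `example.gov` host names are hypothetical and are not taken from CDash or from the Trilinos drivers.

```python
def cdash_site_name(hostname, farm_hosts, farm_site="build-farm"):
    """Return a stable site name for a machine (hypothetical helper).

    Every host that belongs to the floating build farm is reported under one
    shared site name; any other machine keeps its own short host name.
    """
    short = hostname.split(".")[0].lower()
    farm = {h.split(".")[0].lower() for h in farm_hosts}
    return farm_site if short in farm else short


if __name__ == "__main__":
    # Hypothetical farm membership list, for illustration only.
    farm = ["node01.example.gov", "node02.example.gov"]
    print(cdash_site_name("node02.example.gov", farm))
    print(cdash_site_name("crf450.srn.sandia.gov", farm))
```

With a mapping like this, a job that floats between farm machines would always appear under the same site, which is the property the missing-test detection needs.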
Done for this ticket is to better define the issue, determine how to get site consistency, and implement the solution.
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/821
static analysis using PVS
2016-11-11T20:51:01Z
James Willenbring
*Created by: davydden*
I did a static code analysis on Trilinos 12.8.1 with gcc 5.4.0 (built by Spack with `-DTrilinos_ENABLE_EXPLICIT_INSTANTIATION:BOOL=ON`) using [PVS for Linux](http://www.viva64.com/en/b/0441/). The results are:
```
Total messages: 3907
Filtered messages: 1657
```
There are certainly some false positives, but I am sure that there are also those to be fixed.
[pvs_tasks.txt](https://github.com/trilinos/Trilinos/files/586596/pvs_tasks.txt)
Keep in mind that (from http://www.viva64.com/en/m/0036/)
> It is important to understand that all files to be analyzed should be compiled. If your project actively uses code generation, then this project should be built before analysis, otherwise there may be errors during preprocessing.
Steps to reproduce
```
$ cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=On <blah-blah-blah>
$ make all -j8
$ pvs-studio-analyzer analyze -l PVS-Studio.lic -o pvs.log -j8
$ plog-converter -a GA:1,2 -t tasklist -o pvs_tasks.txt pvs.log
```
For a trial Linux license see http://www.viva64.com/en/b/0441/ .
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4404
Switch to C bindings of BLAS & LAPACK?
2019-02-15T22:57:28Z
James Willenbring
*Created by: mhoemmen*
@trilinos/framework
@mwglass pointed out that:
- Trilinos, kokkos-kernels, Sierra, ATDM, etc. all spent a lot of effort deducing and maintaining Fortran mangling of BLAS and LAPACK, yet
- the BLAS (and [even LAPACK](http://netlib.org/lapack/#_standard_c_language_apis_for_lapack)) come with standard C bindings.
Why don't we all switch to the C bindings? That would get rid of all that mangling deduction code.
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/440
Switch to 'develop'/'master' branch in all repos listed in ExtraRepositoriesList.cmake
2018-08-15T01:35:32Z
James Willenbring
*Created by: bartlettroscoe*
## Next Action Status:
???
**Blocked By:** #370, #176, #452
## Description:
As described in [this comment](https://github.com/trilinos/Trilinos/issues/370#issuecomment-224391677) in #370, the current implementation of the `TRIBITS_CTEST_DRIVER()` function will clone and use the 'master' branch of the extra repos even if `Trilinos_BRANCH=develop` or `trilinos-release-X-Y-branch`, etc. That is not consistent with treating this set of repos as "one big repo".
Therefore, this story is to switch all of the extra repos listed in Trilinos/cmake/ExtraRepositoriesList.cmake over to the 'develop'/'master' workflow that will be released or used by outside users. Any extra repo that is not going to switch over to the 'develop'/'master' workflow should not be listed in that file and should not be part of automated testing of Trilinos or Trilinos releases. More can be discussed about this, but the basics are simple.
## Tasks:
1. Implement TriBITSPub/TriBITS#130 in TriBITS (This will result in `Trilinos_BRANCH=<branch>` getting checked out in each extra repo as well as the base Trilinos repo.) **[DONE]**
2. Determine what extra repos currently listed in ExtraRepositoriesList.cmake should continue being tested and perhaps released and which should not.
3. For those repos listed in ExtraRepositoriesList.cmake (i.e. that will continue to be tested and perhaps released), add a 'develop' branch.
4. Transition to the 'develop'/'master' workflow for all of the extra repos listed in ExtraRepositoriesList.cmake (in one push to the Trilinos 'develop' branch):
a. Update 'develop' branch from 'master' branch in all these extra repos and then lock down 'master' branch like for the main Trilinos git repo in #370.
b. Update the process that promotes the Trilinos 'develop' branch to the 'master' branch so that it also updates 'master' from 'develop' in these extra repos.
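The effect described in task 1 (the same `Trilinos_BRANCH=<branch>` checked out in the base repo and in every extra repo) can be pictured with a short sketch. This is not the TriBITS implementation; the function name and the extra-repo directory names are hypothetical, and the commands are only built and printed, not executed.

```python
def checkout_commands(branch, repo_dirs):
    """Build one `git checkout` command per repository directory.

    `git -C <dir>` runs the command as if git had been started in <dir>.
    """
    return [["git", "-C", repo_dir, "checkout", branch] for repo_dir in repo_dirs]


if __name__ == "__main__":
    # Hypothetical layout: the base repo plus two extra repos cloned under it.
    repos = ["Trilinos", "Trilinos/extraRepoA", "Trilinos/extraRepoB"]
    for cmd in checkout_commands("develop", repos):
        print(" ".join(cmd))
```

Treating the base repo as just the first entry in the list is what "one big repo" amounts to here: every repo follows the same branch name.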
Improve productivity, stability, and quality of Trilinos
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3585
Test Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4 appears to be randomly failing in many builds including CI, PR, and ATDM builds
2019-04-02T18:21:50Z
James Willenbring
*Created by: bartlettroscoe*
CC: @trilinos/framework, @trilinos/anasazi, @srajama1 (Trilinos Linear Solver Product Area Lead)
## Next Action Status
PR #4052 merged to 'develop' on 12/18/2018 but still failing after that. Next: Try to fix again?
## Description
It would seem that the test `Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4` is very occasionally randomly failing in various builds. As shown in [this query](https://testing.sandia.gov/cdash-dev-view/queryTests.php?project=Trilinos&date=2018-10-09&filtercount=4&showfilters=1&filtercombine=and&field1=testname&compare1=61&value1=Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4&field2=status&compare2=61&value2=failed&field3=details&compare3=64&value3=timeout&field4=buildstarttime&compare4=83&value4=2018-07-01), this test failed 10 times since 7/1/2018 in the builds:
* `Linux-GCC-4.8.4-MPI_RELEASE_DEBUG_SHARED_PT_OPENMP_CI` (post-push CI build): 1 time (today)
* `PR-XXXX-test-Trilinos_pullrequest_gcc_4.9.3-YYYY` (standard PR build): 4 times
* `PR-XXXX-test-Trilinos_pullrequest_gcc_4.8.4-YYYY` (standard PR build): 1 time
* `Trilinos-atdm-chama-intel-debug-openmp` (standard ATDM build): 1 time
* `Trilinos-atdm-rhel6-gnu-opt-openmp` (standard ATDM build): 2 times
* `Trilinos-atdm-waterman-cuda-9.2-debug` (standard ATDM build): 1 time
In each of these 10 failures in the last 3 months, such as the CI failure today shown [here](https://testing.sandia.gov/cdash-dev-view/testDetails.php?test=56264374&build=4031303), it shows failures like:
```
projectAndNormalizeGen() returned rank 5
|| <S,S> - I || after : 2.65912e-11
1|| S_in - X1*C1 - X2*C2 - S_out*B || : 1.70776e-09
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv tolerance exceeded! test failed!
```
The location of these failures seems to change in this test but all of the failures appear to be "tolerance exceeded! test failed!"
Is there some type of non-deterministic behavior in this test or in the underlying Anasazi code that allows for these types of random failures?
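One generic source of this kind of run-to-run variation, offered here only as an illustration and not as a diagnosis of the Anasazi test, is that floating-point addition is not associative. If a reduction accumulates its terms in a different order from run to run (for example across threads or MPI ranks), the result can change in the last bits, and a test with a tight tolerance can then occasionally trip. A minimal sketch with a hypothetical helper name:

```python
def sum_left_to_right(values):
    """Accumulate values strictly in the given order."""
    total = 0.0
    for v in values:
        total += v
    return total


if __name__ == "__main__":
    forward = sum_left_to_right([0.1, 0.2, 0.3])
    backward = sum_left_to_right([0.3, 0.2, 0.1])
    # Same numbers, different accumulation order, different rounding.
    print(repr(forward), repr(backward), forward == backward)
```

The two sums differ by roughly one unit in the last place, which is harmless on its own but can be amplified by later steps such as orthogonalization.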
## Steps to Reproduce
Given that this test seems to be failing randomly only very occasionally, this might be hard to reproduce locally. But given that this has failed in the post-push GCC 4.8.4 CI build and the GCC 4.9.3 PR build one might be able to use one of those.
Keep promoted "ATDM" builds of Trilinos clean
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1909
Testing gap exists for Epetra64
2019-01-16T00:30:49Z
James Willenbring
*Created by: rhoope*
I don't think there is any automated testing for builds based on Epetra64.
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3103
Test randomly failing test ROL_example_poisson-inversion_example_01_MPI_1 failing in PR Intel build
2018-11-30T11:15:41Z
James Willenbring
*Created by: bartlettroscoe*
CC: @trilinos/framework, @trilinos/rol, @rppawlo (Trilinos Nonlinear Solvers Product Lead)
## Next Action Status
PR #3104 merged to 'develop' on 7/13/2018 which disables `ROL_example_poisson-inversion_example_01_MPI_1` in Intel PR test build. Next: ROL developers fix behavior of test offline ...
## Description
As you can see in [this query](https://testing-vm.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2018-07-12&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=3&showfilters=1&filtercombine=and&field1=testname&compare1=61&value1=ROL_example_poisson-inversion_example_01_MPI_1&field2=buildstarttime&compare2=83&value2=2018-07-01&field3=groupname&compare3=61&value3=Pull%20Request), the test `ROL_example_poisson-inversion_example_01_MPI_1` seems to be failing randomly in the Intel PR build. This just killed my PR testing iteration shown in #3100. (Now I have to put on an `AT: RETEST` and hope this does not fail again and then stay up late to click the "merge" button in order for this to clean up the build tomorrow.)
In the case of the [#3100 PR testing iteration](https://github.com/trilinos/Trilinos/pull/3100#issuecomment-404665175), the [failing test output](https://testing-vm.sandia.gov/cdash/testDetails.php?test=48059673&build=3716185) shows:
```
Newton-Krylov using Conjugate Gradients
Line Search: Cubic Interpolation satisfying Strong Wolfe Conditions
iter value gnorm snorm #fval #grad iterCG flagCG ls_#fval ls_#grad
0 2.340112e-03 1.927880e-03
1 1.597727e-04 4.157593e-04 3.727069e+00 2 2 4 2 1 0
2 5.442664e-06 5.009082e-05 8.348624e-01 3 3 5 2 1 0
3 1.146552e-06 6.106086e-06 3.163006e+00 4 4 11 2 1 0
4 8.023717e-07 3.144919e-06 5.128519e-01 6 5 11 2 2 0
5 6.126545e-07 2.642767e-06 5.167993e-01 8 6 15 2 2 0
6 4.613227e-07 2.330904e-06 4.228759e-01 10 7 14 2 2 0
7 3.685626e-07 2.259062e-06 3.602303e-01 12 8 16 2 2 0
8 3.352764e-07 3.447963e-06 5.608285e-01 15 9 19 2 3 0
9 3.352764e-07 3.447963e-06 0.000000e+00 35 10 22 2 20 0
Optimization Terminated with Status: Step Tolerance Met
old_optimal_value = 1.0485417402164909e-07
new_optimal_value = 3.3527637557488306e-07
abs(new_optimal_value - old_optimal_value) / abs(old_optimal_value) = 2.19754915532174255333e+00 > 1.49011611938476562500e-08
End Result: TEST FAILED
```
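The pass/fail line at the end of this output is a relative-change check, and it can be reproduced from the logged numbers. The sketch below is an assumption-laden illustration: the helper name is hypothetical, and the tolerance is taken to be the square root of double-precision machine epsilon only because that value matches the `1.49011611938476562500e-08` printed in the log; how ROL actually computes it is not shown in the source.

```python
import math
import sys


def relative_change(new_value, old_value):
    """Relative difference between a new result and a stored reference."""
    return abs(new_value - old_value) / abs(old_value)


# Assumed tolerance: sqrt(machine epsilon) for doubles, i.e. 2**-26.
TOLERANCE = math.sqrt(sys.float_info.epsilon)

# Values copied from the failing output above.
old_optimal_value = 1.0485417402164909e-07
new_optimal_value = 3.3527637557488306e-07

change = relative_change(new_optimal_value, old_optimal_value)
print(f"relative change = {change:.6e}, tolerance = {TOLERANCE:.6e}")
print("TEST FAILED" if change > TOLERANCE else "TEST PASSED")
```

A relative change of about 2.2 means the run stopped at a value roughly three times the reference, so this is not a borderline rounding miss of the tolerance.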
If you look at a previous Intel PR build shown [here](https://testing-vm.sandia.gov/cdash/testDetails.php?test=45813048&build=3716020), it shows the output:
```
Newton-Krylov using Conjugate Gradients
Line Search: Cubic Interpolation satisfying Strong Wolfe Conditions
iter value gnorm snorm #fval #grad iterCG flagCG ls_#fval ls_#grad
0 2.340112e-03 1.927880e-03
1 1.597727e-04 4.157593e-04 3.727069e+00 2 2 4 2 1 0
2 5.442664e-06 5.009082e-05 8.348624e-01 3 3 5 2 1 0
3 2.334731e-06 2.793001e-05 3.695260e+00 4 4 11 2 1 0
4 1.076543e-06 1.668824e-05 5.083248e-01 6 5 6 2 2 0
5 8.388745e-07 1.439672e-05 1.215272e+00 7 6 7 2 1 0
6 5.152760e-07 9.169432e-06 1.560582e+00 9 7 17 2 2 0
7 1.398695e-07 2.702421e-06 4.159034e-01 10 8 10 0 1 0
8 1.089003e-07 4.590927e-07 2.184686e-01 11 9 11 0 1 0
9 1.060664e-07 6.754781e-07 9.860141e-01 12 10 39 0 1 0
10 1.051188e-07 1.569364e-08 9.726082e-02 13 11 15 0 1 0
11 1.048559e-07 2.698522e-08 2.153267e-01 14 12 50 1 1 0
12 1.048544e-07 2.590253e-10 1.010471e-03 15 13 10 0 1 0
13 1.048542e-07 1.052513e-11 4.755420e-03 16 14 50 1 1 0
14 1.048542e-07 2.725544e-13 1.514146e-04 17 15 50 1 1 0
Optimization Terminated with Status: Converged
old_optimal_value = 1.0485417402273531e-07
new_optimal_value = 1.0485417402191586e-07
End Result: TEST PASSED
```
If you look at the last time this test failed in an Intel PR build on 7/3/2018 [here](https://testing-vm.sandia.gov/cdash/testDetails.php?test=48059673&build=3684202), it showed the output:
```
Newton-Krylov using Conjugate Gradients
Line Search: Cubic Interpolation satisfying Strong Wolfe Conditions
iter value gnorm snorm #fval #grad iterCG flagCG ls_#fval ls_#grad
0 2.340112e-03 1.927880e-03
1 1.597727e-04 4.157593e-04 3.727069e+00 2 2 4 2 1 0
2 5.442664e-06 5.009082e-05 8.348624e-01 3 3 5 2 1 0
3 1.146552e-06 6.106086e-06 3.163006e+00 4 4 11 2 1 0
4 8.023717e-07 3.144919e-06 5.128519e-01 6 5 11 2 2 0
5 6.126545e-07 2.642767e-06 5.167993e-01 8 6 15 2 2 0
6 4.613227e-07 2.330904e-06 4.228759e-01 10 7 14 2 2 0
7 3.685626e-07 2.259062e-06 3.602303e-01 12 8 16 2 2 0
8 3.352764e-07 3.447963e-06 5.608285e-01 15 9 19 2 3 0
9 3.352764e-07 3.447963e-06 0.000000e+00 35 10 22 2 20 0
Optimization Terminated with Status: Step Tolerance Met
old_optimal_value = 1.0485417402164909e-07
new_optimal_value = 3.3527637557488306e-07
abs(new_optimal_value - old_optimal_value) / abs(old_optimal_value) = 2.19754915532174255333e+00 > 1.49011611938476562500e-08
End Result: TEST FAILED
```
If you compare the output, you can see that the passing and the failing algorithms seem to diverge on the 3rd iteration.
So it seems there is some non-deterministic behavior in this code that causes it to reach a different solution randomly. Could there be different local minima, with random floating-point rounding differences causing the algorithm to converge to a different one?
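The point of divergence can be located mechanically by comparing the `value` column of the passing and failing outputs. The sketch below uses a hypothetical helper and only the first five `value` entries copied from the logs above; it compares the printed values exactly, which is enough here because both logs use the same output format.

```python
def first_divergence(run_a, run_b):
    """Return the first iteration index where two histories differ, or None."""
    for k, (a, b) in enumerate(zip(run_a, run_b)):
        if a != b:
            return k
    return None


# Objective values for iterations 0-4, copied from the outputs above.
passing = [2.340112e-03, 1.597727e-04, 5.442664e-06, 2.334731e-06, 1.076543e-06]
failing = [2.340112e-03, 1.597727e-04, 5.442664e-06, 1.146552e-06, 8.023717e-07]

print("first differing iteration:", first_divergence(passing, failing))
```

Iterations 0 through 2 agree to every printed digit, so whatever differs between the runs first becomes visible in the step taken at iteration 3.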
## Motivation and Context
This is occurring in a PR build that is blocking other developers' branch merges.
## Possible Solution
Long-term, the test should be fixed to not randomly fail.
Short-term, the test should be disabled in the Intel auto PR build. It can still be
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3263
Test ShyLU_NodeTacho_Tacho_TestSerial_double_MPI_1 randomly failing in CI and PR GCC 4.8.4 + OpenMP builds
2018-08-09T19:59:42Z
James Willenbring
*Created by: bartlettroscoe*
@trilinos/shylu, @trilinos/framework, @srajama1 (Trilinos Linear Solvers Product Lead)
## Expectations
A test should not fail unless a change is made that breaks it. A test should not randomly fail.
## Current Behavior
Looking at the four most recent failures of the test `ShyLU_NodeTacho_Tacho_TestSerial_double_MPI_1` in [this query](https://testing-vm.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=4&showfilters=1&filtercombine=and&field1=testname&compare1=61&value1=ShyLU_NodeTacho_Tacho_TestSerial_double_MPI_1&field2=status&compare2=62&value2=passed&field3=status&compare3=62&value3=notrun&field4=buildstarttime&compare4=84&value4=now), the test appears to be randomly failing in the GCC 4.8.4 + OpenMP builds. In the most recent case, this test broke the auto PR GCC 4.8.4 + OpenMP build in PR #3260. In each of the last four failures of this test, dating back to 6/28/2018, the test fails showing:
```
....
[ RUN ] CrsMatrixBase.matrixmarket
unknown file: Failure
C++ exception with description "View bounds error of view ap ( 13 < 13 )
Traceback functionality not available
" thrown in the test body.
[ FAILED ] CrsMatrixBase.matrixmarket (23 ms)
...
[ FAILED ] 1 test, listed below:
[ FAILED ] CrsMatrixBase.matrixmarket
1 FAILED TEST
```
## Motivation and Context
## Definition of Done
The test `ShyLU_NodeTacho_Tacho_TestSerial_double_MPI_1` is fixed to make it so that it does not randomly fail or is removed for CI and auto PR testing.
## Possible Solution
Fix it so that it does not randomly fail or remove it from CI and auto PR testing.
## Steps to Reproduce
See https://github.com/trilinos/Trilinos/wiki/Reproducing-PR-Testing-Errors.
## Your Environment
Standard SEMS GCC 4.8.4 auto PR build env (see above).
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4272
The package Pliris needs to be elevated from ST to PT since it is being used by ATDM APP Gemma
2019-01-26T04:04:51Z
James Willenbring
*Created by: bartlettroscoe*
**CC:** @trilinos/framework, @trilinos/shylu, @srajama1 (Trilinos Linear Solvers Product Area Lead)
**Blocking:** #2597
## Description
The Gemma configuration of Trilinos currently enables the Trilinos package `Pliris` (see [TRIL-255](https://sems-atlassian-son.sandia.gov/jira/browse/TRIL-255)). However, the package `Pliris` is currently declared `ST` (Secondary Tested) and therefore is not included in any Trilinos PR builds.
Since an important internal Trilinos customer (i.e., Gemma) is using `Pliris`, [by definition](http://trac.trilinos.org/wiki/TribitsLifecycleModelOverview#test_categories), it needs to be elevated from Secondary Tested (ST) to Primary Tested (PT). Otherwise, `Pliris` will not get enabled in Trilinos PR builds and therefore will not protect SPARC (see #2597).
## Proposed Solution
Update the line:
```
Pliris packages/pliris ST
```
to be
```
Pliris packages/pliris PT
```
in the file
* `Trilinos/PackagesList.cmake`
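As a rough illustration of the proposed one-line change, the following sketch rewrites the test category of a single package in text shaped like the lines quoted above. The function name is hypothetical, and the assumed line layout (package name, directory, category separated by whitespace, with categories `PT`, `ST`, or `EX`) is inferred from the quoted lines rather than from the full `PackagesList.cmake` file.

```python
import re


def set_test_category(packages_list_text, package, new_category):
    """Replace the test category on the line that declares `package`.

    Only the first matching line is changed; all other text is left as-is.
    """
    pattern = re.compile(
        r"^([ \t]*" + re.escape(package) + r"[ \t]+\S+[ \t]+)(PT|ST|EX)\b",
        re.MULTILINE,
    )
    return pattern.sub(
        lambda m: m.group(1) + new_category, packages_list_text, count=1
    )


if __name__ == "__main__":
    # Hypothetical excerpt in the assumed layout.
    excerpt = "  Teuchos packages/teuchos PT\n  Pliris packages/pliris ST\n"
    print(set_test_category(excerpt, "Pliris", "PT"))
```

In practice this is a manual edit of one line; the sketch mainly documents which field changes and which fields stay untouched.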
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1872
There isn't an '@trilinos/<packagename>' tag for all packages
2017-11-27T15:32:49Z
James Willenbring
*Created by: william76*
@trilinos/framework
Since I'm keeping an eye on the nightlies, etc. when the clean track goes red I'll probably put in a ticket if there doesn't appear to be one about that already. I've noticed that there isn't an @trilinos/<package> tag for all the packages set up. (i.e., today there is a test failing in FEI and there is no @trilinos/fei tag)
It'd be nice if we had one for every package. If packages have no 'team' then perhaps to the owner or someone who is best able to look at a package and know what's going on in it?
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/113
TriBITS picking up the wrong boost libraries.
2016-02-25T15:41:18Z
James Willenbring
*Created by: bathmatt*
Building STK_classic is finding the wrong Boost. I set a specific version of Boost and try to compile, and I'm getting missing symbols. The include path is being propagated but the libraries are not, even though I am setting both
-D TPL_Boost_LIBRARY_DIRS:FILEPATH="${BOOST_BASE_DIR}/lib" \
and
-D TPL_Boost_LIBRARIES="${BOOST_BASE_DIR}/lib/libboost_program_options.so;${BOOST_BASE_DIR}/lib/libboost_system.so"
I still get the system libs.
/home/projects/x86-64-haswell-nvidia/openmpi/1.10.0/gcc/4.8.4/cuda/7.5.7/bin/mpicxx -std=c++11 -g -O0 CMakeFiles/STKClassic_stk_algsup_unit_tests.dir/UnitTestAlgorithmRunner.cpp.o CMakeFiles/STKClassic_stk_algsup_unit_tests.dir/UnitTestCudaMgr.cpp.o CMakeFiles/STKClassic_stk_algsup_unit_tests.dir/UnitTestMain.cpp.o CMakeFiles/STKClassic_stk_algsup_unit_tests.dir/UnitTest_helpers.cpp.o -o STKClassic_stk_algsup_unit_tests.exe -rdynamic ../stk_algsup/libstkclassic_algsup.a ../../stk_mesh/stk_mesh/fixtures/libstkclassic_mesh_fixtures.a ../../stk_mesh/stk_mesh/fem/libstkclassic_mesh_fem.a ../../stk_mesh/stk_mesh/base/libstkclassic_mesh_base.a ../../stk_util/stk_util/unit_test_support/libstkclassic_util_unit_test_support.a ../../stk_util/stk_util/parallel/libstkclassic_util_parallel.a ../../stk_util/stk_util/diag/libstkclassic_util_diag.a ../../stk_util/stk_util/environment/libstkclassic_util_env.a ../../stk_util/stk_util/util/libstkclassic_util_util.a ../../../../seacas/libraries/exodus/cbind/libexodus.a ../../../../fei/support-Trilinos/libfei_trilinos.a ../../../../fei/base/libfei_base.a ../../../../belos/tpetra/src/libbelostpetra.a ../../../../belos/epetra/src/libbelosepetra.a ../../../../belos/src/libbelos.a ../../../../ml/src/libml.a ../../../../galeri/src-epetra/libgaleri-epetra.a ../../../../tpetra/core/ext/libtpetraext.a ../../../../tpetra/core/inout/libtpetrainout.a ../../../../tpetra/core/src/libtpetra.a ../../../../tpetra/kernels/src/libtpetrakernels.a ../../../../kokkos/algorithms/src/libkokkosalgorithms.a ../../../../kokkos/containers/src/libkokkoscontainers.a ../../../../tpetra/classic/LinAlg/libtpetraclassiclinalg.a ../../../../tpetra/classic/NodeAPI/libtpetraclassicnodeapi.a ../../../../tpetra/classic/src/libtpetraclassic.a ../../../../ifpack/src/libifpack.a ../../../../amesos/src/libamesos.a ../../../../epetraext/src/libepetraext.a ../../../../seacas/libraries/ioss/src/init/libIonit.a ../../../../seacas/libraries/ioss/src/transform/libIotr.a 
../../../../seacas/libraries/ioss/src/heartbeat/libIohb.a ../../../../seacas/libraries/ioss/src/generated/libIogn.a ../../../../seacas/libraries/ioss/src/pamgen/libIopg.a ../../../../seacas/libraries/ioss/src/exo_fac/libIoexo_fac.a ../../../../seacas/libraries/ioss/src/exo_par/libIopx.a ../../../../seacas/libraries/ioss/src/exo_fpp/libIofx.a ../../../../seacas/libraries/ioss/src/exodus/libIoex.a ../../../../seacas/libraries/ioss/src/libIoss.a ../../../../seacas/libraries/exodus/cbind/libexodus.a -Wl,-Bstatic -lnetcdf -Wl,-Bdynamic -L/home/projects/x86-64-haswell-nvidia/netcdf-exo/4.3.3.1/openmpi/1.10.0/gcc/4.8.4/cuda/7.5.7/lib -lnetcdf -L/home/projects/x86-64-haswell-nvidia/hdf5/1.8.15/openmpi/1.10.0/gcc/4.8.4/cuda/7.5.7/lib -lhdf5_hl -lhdf5 -lz -ldl ../../../../pamgen/src/libpamgen_extras.a ../../../../pamgen/src/libpamgen.a ../../../../aztecoo/src/libaztecoo.a ../../../../triutils/src/libtriutils.a ../../../../epetra/src/libepetra.a ../../../../shards/src/libshards.a ../../../../zoltan/src/libzoltan.a -lm ../../../../sacado/src/libsacado.a ../../../../teuchos/kokkoscomm/src/libteuchoskokkoscomm.a ../../../../teuchos/kokkoscompat/src/libteuchoskokkoscompat.a ../../../../teuchos/remainder/src/libteuchosremainder.a ../../../../teuchos/numerics/src/libteuchosnumerics.a -L/home/projects/x86-64-haswell/lapack/3.5.0/gcc/4.8.4 -llapack -L/home/projects/x86-64-haswell/blas/20150602/gcc/4.8.4 -lblas ../../../../teuchos/comm/src/libteuchoscomm.a ../../../../teuchos/parameterlist/src/libteuchosparameterlist.a ../../../../teuchos/core/src/libteuchoscore.a ../../../../kokkos/core/src/libkokkoscore.a -lcudart -lcublas -lcufft -lboost_program_options -lboost_system -lmpi_usempi -lmpi_mpifh -lgfortran -lquadmath
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3276Trilinos auto PR tester stability issues2019-05-02T13:20:11ZJames WillenbringTrilinos auto PR tester stability issues*Created by: bartlettroscoe*
@trilinos/framework
## Description
Over the last few weeks and months, the Trilinos auto PR tester has seen several cases where one or more PR builds for a given PR testing iteration failed to produce results on CDash or showed build or test failures that were not related to the changes on that particular PR.
This story logs these failures and tracks them so that we have statistics to inform how to address them. It should replace commenting on the individual PRs that exhibit these types of problems, such as #3260 and #3213.
## PR Builds Showing Random Failures
Below are a few examples of the stability problems (this is not a complete list).
| PR ID | Num PR Builds to reach passing | First test trigger | Start first test| Passing test | Merge PR |
| --: | --: | --: | --: | --: | --: |
| #3258 | 2 | [8/8/2018 2:35 PM ET](https://github.com/trilinos/Trilinos/pull/3258#issue-207098955) | [8/8/2018 2:44 PM](https://github.com/trilinos/Trilinos/pull/3258#issuecomment-411510956) | [8/8/2018 9:15 PM ET]() | Not merged |
| #3260 | 4 | [8/8/2018 5:22 PM ET](https://github.com/trilinos/Trilinos/pull/3260#issue-207141537) | [8/8/2018 6:31 PM ET](https://github.com/trilinos/Trilinos/pull/3260#issuecomment-411574370) | [8/10/2018 4:13 AM ET](https://github.com/trilinos/Trilinos/pull/3260#issuecomment-412010497) | [8/10/2018 8:25 AM](https://github.com/trilinos/Trilinos/pull/3260#event-1782381644) |
| #3213 | 3 | [7/31/2018 4:30 PM ET](https://github.com/trilinos/Trilinos/pull/3213#issue-205233060) | [7/31/2018 4:57 PM ET](https://github.com/trilinos/Trilinos/pull/3213#issuecomment-409365522) | [8/1/2018 9:48 AM ET](https://github.com/trilinos/Trilinos/pull/3213#issuecomment-409580677) | [8/1/2018 9:53 AM ET](https://github.com/trilinos/Trilinos/pull/3213#event-1765281809) |
| #3098 | 4 | [7/12/2018 12:52 PM ET](https://github.com/trilinos/Trilinos/pull/3098#issue-201063953) | [7/12/2018 1:07 PM ET](https://github.com/trilinos/Trilinos/pull/3098#issuecomment-404582631) | [7/13/2018 11:12 PM ET](https://github.com/trilinos/Trilinos/pull/3098#issuecomment-404994581) | [7/14/2018 10:59 PM ET](https://github.com/trilinos/Trilinos/pull/3098#event-1733896640) |
| #3369 | 6 | [8/29/2018 9:08 AM ET](https://github.com/trilinos/Trilinos/pull/3369#issue-211746901) | [8/29/2018 9:16 AM ET](https://github.com/trilinos/Trilinos/pull/3369#issuecomment-416948915) | [8/31/2018 6:09 AM ET](https://github.com/trilinos/Trilinos/pull/3369#issuecomment-417618824) | [8/31/2018 8:33 AM ET](https://github.com/trilinos/Trilinos/pull/3369#event-1820478271) |
Improve productivity, stability, and quality of Trilinos
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1795Trilinos build error on VS20172017-10-02T14:22:58ZJames WillenbringTrilinos build error on VS2017*Created by: yiyangzhang37*
Hello guys. Not sure if this is the right place to post such problems...
I am trying to build Trilinos on Windows 10 with VS2017. Things seem to be weird as I get a lot of errors like
```
Severity Code Description Project File Line Suppression State
Error C2065 'arg_data_ptr': undeclared identifier kokkoscore C:\ProgramData\Trilinos-master\packages\kokkos\core\src\impl\Kokkos_ViewMapping.hpp 2419
```
```
Severity Code Description Project File Line Suppression State
Error C3646 'assign': unknown override specifier kokkoscore C:\ProgramData\Trilinos-master\packages\kokkos\core\src\impl\Kokkos_ViewMapping.hpp 2386
```
```
Severity Code Description Project File Line Suppression State
Error C2668 'Kokkos::atomic_increment': ambiguous call to overloaded function kokkoskernels C:\ProgramData\Trilinos-master\packages\kokkos\core\src\impl\Kokkos_TaskQueue.hpp 229
```
```
Severity Code Description Project File Line Suppression State
Error C2953 'Kokkos::Impl::FunctorAnalysis<PatternInterface,Policy,Functor>::has_final_function<F,>': class template has already been defined kokkoskernels C:\ProgramData\Trilinos-master\packages\kokkos\core\src\impl\Kokkos_FunctorAnalysis.hpp 560
```
```
Severity Code Description Project File Line Suppression State
Error C3646 'pointer_type': unknown override specifier kokkoskernels C:\ProgramData\Trilinos-master\packages\kokkos\core\src\Kokkos_View.hpp 568
```
What should I do?
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/2292Trilinos "Clean" and auto PR builds need a Trilinos_ENABLE_DEBUG=ON build2018-05-30T18:37:02ZJames WillenbringTrilinos "Clean" and auto PR builds need a Trilinos_ENABLE_DEBUG=ON build*Created by: bartlettroscoe*
**CC:** @trilinos/framework, @maherou, @rppawlo
## Description
It would seem that all of the current "Clean" builds of Trilinos shown for example yesterday at:
* https://testing.sandia.gov/cdash/index.php?project=Trilinos&date=2018-02-21&filtercount=1&showfilters=1&field1=groupname&compare1=61&value1=Clean
all have `Trilinos_ENABLE_DEBUG=OFF` set. You can see this, for example, by looking at the uploaded CMakeCache.txt files for these three builds at:
* https://testing.sandia.gov/cdash/viewNotes.php?buildid=3398366##note0
* https://testing.sandia.gov/cdash/viewNotes.php?buildid=3398237##note3
* https://testing.sandia.gov/cdash/viewNotes.php?buildid=3398203##note3
which all show:
```
Trilinos_ENABLE_DEBUG:BOOL=OFF
```
This is not a good thing because a lot of run-time checks are turned on when you configure Trilinos with `-DTrilinos_ENABLE_DEBUG=ON`. Such a build catches a lot of undefined and otherwise illegal behavior that a `Trilinos_ENABLE_DEBUG=OFF` build does not catch.
Because none of the "Clean" builds have `-DTrilinos_ENABLE_DEBUG=ON`, one would assume that none of the auto PR builds have it set either. Therefore, can the auto PR builds include at least one build that has this turned on? Also, from looking at recent PRs like:
* https://github.com/trilinos/Trilinos/pull/2289#issuecomment-367782068
it looks like the auto PR tester is now only running one build. If that is the case, it is critical that this one build set `-DTrilinos_ENABLE_DEBUG=ON`.
This is a big issue for supporting developers and users of Trilinos, and especially for ATDM builds of Trilinos that set `-DTrilinos_ENABLE_DEBUG=ON`. For example, this let defects like #2270 slip through.
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1272Trilinos config problem in FiniteValue.cmake and GCC 5.4.02017-05-01T02:17:41ZJames WillenbringTrilinos config problem in FiniteValue.cmake and GCC 5.4.0*Created by: raovgarimella*
Trilinos 12.6.1 and 12.10.1 configuration fails with gcc 5.4.0 and cmake 3.5.1 because of an issue with trilinos/cmake/tribits/core/config_tests/FiniteValue.cmake. The complaint is that "```isnan was not declared in this scope```".
Upon investigating further, it seems that FiniteValue.cmake tries to test for existence of '```isnan```' within the global namespace and the ```std::``` namespace but in both cases uses the ```<cmath>``` header.
According to the following StackOverflow post, however, one should include ```<math.h>``` when calling '```isnan```' and ```<cmath>``` when calling '```std::isnan```':
http://stackoverflow.com/questions/18128899/is-isnan-in-the-std-namespace-more-in-general-when-is-std-necessary-optio
Making this change allows the configuration to proceed. I checked and this problem persists in the HEAD version of the master as well.
Here is a trivial patch file for fixing the problem in the 12.10.1 release:
```
--- cmake/tribits/core/config_tests/FiniteValue.cmake 2016-11-22 15:43:07.000000000 -0700
+++ cmake/tribits/core/config_tests/FiniteValue.cmake.new 2017-04-26 22:27:08.016402800 -0600
@@ -58,7 +58,7 @@ INCLUDE(CheckCXXSourceCompiles)
SET(SOURCE_GLOBAL_ISNAN
"
-#include <cmath>
+#include <math.h>
int main()
{
double x = 1.0;
@@ -105,7 +105,7 @@ ENDIF()
SET(SOURCE_GLOBAL_ISINF
"
-#include <cmath>
+#include <math.h>
int main()
{
double x = 1.0;
```