Trilinos issues (https://gitlab.osti.gov/jmwille/Trilinos/-/issues)

---

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4304
MueLu: structured region driver (James Willenbring, 2019-02-11)

*Created by: lucbv*
@trilinos/muelu
@rstumin @mayrmt @pohm01
## Expectations
This new driver constructs a problem on a perfectly structured grid and provides necessary data for HHG region matrix construction
## Motivation and Context
This driver will provide a testing ground for new HHG capabilities on a simple mesh.

---

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4205
Panzer: Cleanup of Workset interface (James Willenbring, 2019-01-22)

*Created by: seamill*
## Motivation and Context
Currently the workset is more or less a massive storage container for everything we require, and much of it is stored in struct form (direct access to class members). The goal of this update is to add a function interface for accessing this data, and to strip everything out of worksets that isn't required.
The main reason for doing so is to allow for on-demand allocation and filling of geometry/topological data arrays as required by the derived code. Currently the workset allocates and fills *everything* that could possibly be required by the user, which takes lots of memory, and a bit of compute time. By making these calls on-demand, the total allocations made will be greatly reduced for most applications.
Making the interface on-demand also cleans up the workset construction by not requiring the user to know what Basis/Cubature/Geometry/Topology is required by the derived FEM/FV/FD code during workset construction.
@trilinos/panzer
## What do we keep/remove/redesign
Ideally, worksets would be a simple container that would look like:
```c++
class Workset
{
public:

  using Scalar = double;
  using Index = int;

  Index numDimensions() const;
  Index subcellIndex() const;

  Kokkos::View<const Scalar***> cellVertices() const;

  const std::string & blockID() const;

  Kokkos::View<const Index*> localCellIndexes() const;

  const panzer::SubcellConnectivity &
  getSubcellConnectivity(const int subcell_dimension) const;

  const panzer::IntegrationValues2<Scalar> &
  getIntegrationValues(const panzer::IntegrationDescriptor & description) const;

  const panzer::IntegrationRule &
  getIntegrationRule(const panzer::IntegrationDescriptor & description) const;

  // The mutable overload must itself be non-const; overloads cannot
  // differ by return type alone.
  panzer::BasisValues2<Scalar> &
  getBasisValues(const panzer::BasisDescriptor & basis_description,
                 const panzer::IntegrationDescriptor & integration_description);

  const panzer::BasisValues2<Scalar> &
  getBasisValues(const panzer::BasisDescriptor & basis_description,
                 const panzer::IntegrationDescriptor & integration_description) const;

  const panzer::BasisValues2<Scalar> &
  getBasisValues(const panzer::BasisDescriptor & basis_description,
                 const panzer::PointDescriptor & point_description) const;

  const panzer::PointValues2<Scalar> &
  getPointValues(const panzer::PointDescriptor & point_description) const;

  const panzer::PureBasis &
  getBasis(const panzer::BasisDescriptor & description) const;

  Index numCells() const;
  Index numOwnedCells() const;
  Index numGhostCells() const;
  Index numVirtualCells() const;

  // Stream output must be a free function (or friend), since it takes
  // the stream as its first argument.
  friend std::ostream &
  operator<<(std::ostream & os, const panzer::Workset & w);

protected:

  ...
};
```
Naturally, there are a few issues with this. Having 'const' everywhere is required since worksets are passed around as const, but it will make things tricky when adding an on-demand feature. For now we can get away with having a shared pointer to an underlying class that does all the mutable stuff... assuming this doesn't cause issues with CUDA.
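A minimal sketch of that idea (the types and names here are hypothetical stand-ins, not the actual Panzer API): the const workset forwards to a shared, mutable implementation object that caches values keyed by descriptor and fills them on first request.

```c++
#include <map>
#include <memory>

// Hypothetical stand-ins for panzer::IntegrationDescriptor / IntegrationValues2.
struct Descriptor { int order; };
struct Values { int order; bool filled; };

class Workset {
public:
  Workset() : impl_(std::make_shared<Impl>()) {}

  // const interface, but fills the cache on demand via the shared impl.
  const Values & getValues(const Descriptor & d) const {
    auto it = impl_->cache.find(d.order);
    if (it == impl_->cache.end())
      it = impl_->cache.emplace(d.order, Values{d.order, true}).first;
    return it->second;
  }

private:
  struct Impl { std::map<int, Values> cache; };
  std::shared_ptr<Impl> impl_; // shared so a const Workset can still mutate it
};
```

Whether a `shared_ptr`-held cache like this plays nicely with CUDA is exactly the open question noted above.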
It also leaves the question of what to do with the rest of the interface. As far as I can tell, the above is missing the following (most of which probably shouldn't be in the workset):
```c++
Teuchos::RCP< std::vector<std::string> > basis_names;
void setNumberOfCells(int o_cells, int g_cells, int v_cells);
void setIdentifier(std::size_t identifier) { identifier_ = identifier; }
std::size_t getIdentifier() const { return identifier_; }
double alpha;
double beta;
double time;
double step_size;
double stage_number;
std::vector<double> gather_seeds; // generic gather seeds
bool evaluate_transient_terms;
Teuchos::RCP<WorksetDetails> other;
class WorksetDetailsAccessor;
```
I also don't know why there are multiple 'WorksetDetails' available in a workset.
If @eric-c-cyr and @rppawlo could run me through the reasoning behind the additional workset stuff, I would be grateful.
## Definition of Done
The updates to the workset are straightforward. Updating Drekar, Charon, and the various EMPIRE codes will be a bit tricky, so we will break the work into a set of phases. Updating Charon and Drekar may require some help from @jmgate. Updates to EMPIRE will probably go more smoothly with @CamelliaDPG.
### Phase 1
Workset interface changes:
- [ ] Merge Workset and WorksetDetails into a single class.
- [ ] Clean up and minimize the interface after discussing things with Eric and Roger.
- [ ] Clean up workset construction interface (currently there are multiple conflicting ways to do this)
- [ ] Deprecate the old interface
- [ ] Deprecate old construction scheme
- [ ] Update Drekar
- [ ] Update Charon
- [ ] Update EMPIRE
### Phase 2
Workset on-demand changes:
- [ ] Modify workset so that requesting things using `Descriptors` will spawn new BasisValues/PointValues/IntegrationRules/etc.
- [ ] Remove 'old' deprecated interface
- [ ] Remove alternative construction schemes
Note that we will allow the worksets to be fully allocated/filled if the user requests this using the construction interface (i.e. WorksetNeeds is passed to the construction scheme).
### Phase 3
Update BasisValues2 class
- [ ] Split `BasisValues2` into two classes: `BasisValues` and `BasisIntegrationValues`
- [ ] Add function interface to new classes
- [ ] Implement on-demand calculations of various datasets
- [ ] Deprecate old interface
- [ ] Update Drekar
- [ ] Update Charon
- [ ] Update EMPIRE
### Phase 4
Update IntegrationValues2 class
- [ ] Rename to `IntegrationValues`
- [ ] Add function interface to new classes
- [ ] Implement on-demand calculations of various datasets
- [ ] Deprecate old interface
- [ ] Update Drekar
- [ ] Update Charon
- [ ] Update EMPIRE
### Phase 5
Update PointValues2 class
- [ ] Rename to `PointValues`
- [ ] Add function interface to new classes
- [ ] Implement on-demand calculations of various datasets
- [ ] Deprecate old interface
- [ ] Update Drekar
- [ ] Update Charon
- [ ] Update EMPIRE
### Phase 6
Update subcell connectivity interface
- [ ] Add node/edge/face subcell connectivity to topological mesh analysis in Panzer (lots of questions here)
- [ ] Add node/edge/face subcell connectivities to workset (on-demand)
- [ ] Deprecate old interface
- [ ] Update Drekar
- [ ] Update Charon
- [ ] Update EMPIRE
### Phase 7
Cleanup
- [ ] Once everyone stops yelling at me, we can go ahead and remove the deprecated interface
---

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3832
Framework: Add GCC 7.3 MPI dev->master and PR builds (James Willenbring, 2018-12-05)

*Created by: jwillenbring*
@trilinos/framework
## Expectations
We want to have dev->master and PR builds that use GCC 7.3.
## Motivation and Context
Due to key customer requirements, GCC 7.2 is an important compiler for Trilinos to build cleanly with. The closest version we currently have on the SEMS NFS mount is 7.3, so we are going to use that, at least for the time being.
Eventually we want to add warnings as errors flags to this build too, but that will be covered in a separate ticket.
## Definition of Done
Done means that both a PR build and a dev->master build are in place and building cleanly (prior to the changes being tested in each instance).
## Possible Solution
1. Create files for GCC 7.3 analogous to `Trilinos/cmake/std/PullRequestLinuxGCC4.9.3TestingSettings.cmake` and `Trilinos/cmake/std/sems/PullRequestGCC4.9.3TestingEnv.sh`, and get those files into the develop branch.
2. Modify the autotester config file for dev->master testing to include this build, and run the build once to detect any issues. It might be good to set up a parameterized build for 7.3 to find most issues before doing this.
3. Once the dev->master build is clean, repeat the process at the PR level.
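As a rough illustration, the new testing-env script might look something like the following. This is a sketch only; the exact SEMS module names and versions are assumptions modeled on the existing GCC 4.9.3 script, not verified values.

```shell
#!/bin/bash
# Hypothetical PullRequestGCC7.3.0TestingEnv.sh sketch; module names below
# are assumptions patterned after the existing 4.9.3 env script.
source /projects/sems/modulefiles/utils/sems-modules-init.sh
module load sems-gcc/7.3.0
module load sems-openmpi/1.10.1
module load sems-cmake/3.10.3
export OMP_NUM_THREADS=2
```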
---

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3825
Lumped projection fails for tri's and maybe other basis as well... (James Willenbring, 2018-11-24)

*Created by: bathmatt*
The projection produces bad values for higher-order bases with lumping.
@trilinos/<teamName>
## Expectations
## Current Behavior
## Motivation and Context
## Definition of Done
## Possible Solution
## Steps to Reproduce
## Your Environment
- **Relevant repo SHA1s:**
- **Relevant configure flags or configure script:**
- **Operating system and version:**
- **Compiler and TPL versions:**
## Related Issues
* Blocks
* Is blocked by
* Follows
* Precedes
* Related to
* Part of
* Composed of
## Additional Information
---

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3580
Tpetra::Distributor: Fix "slow path" so we can use MPI_Isend (James Willenbring, 2019-02-17)

*Created by: mhoemmen*
@trilinos/tpetra @jjellio @csiefer2
Fix the "slow path" of `Distributor::doPosts`, so we can use nonblocking sends (`MPI_Isend`). The "slow path" kicks in when the data to send are not neatly grouped in contiguous chunks per process. It permutes the data into contiguous-by-target-process-rank chunks for sending. Currently, the slow path uses the same send buffer for all the messages. This means that it cannot use nonblocking sends.
We must fix both the "three-argument" (all messages have the same size) and "four-argument" (different messages may have different sizes) overloads of `doPosts`, and both the `Teuchos::ArrayRCP` and `Kokkos::View` versions of each.
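To make the constraint concrete, here is a hedged sketch (not the actual Tpetra code; the function name and types are hypothetical) of the permutation step. Giving each destination rank its own contiguous buffer is what would let each chunk back an independent `MPI_Isend`, instead of forcing a blocking send that reuses one shared buffer.

```c++
#include <cstddef>
#include <map>
#include <vector>

// exports[i] is destined for procIds[i]; ranks need not be grouped.
std::map<int, std::vector<double>>
packByDestination(const std::vector<double>& exports,
                  const std::vector<int>& procIds)
{
  std::map<int, std::vector<double>> buffers; // one send buffer per rank
  for (std::size_t i = 0; i < exports.size(); ++i) {
    buffers[procIds[i]].push_back(exports[i]);
  }
  return buffers; // each buffers[p] could back its own MPI_Isend to rank p
}
```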
## Motivation and Context
This is part of the overall effort to improve MPI+CUDA performance and make Tpetra's boundary exchange and sparse matrix-vector multiply communication nonblocking.
## Definition of Done
- [x] Fix 3-argument `Teuchos::ArrayRCP` overload of `doPosts`
- [x] Fix 3-argument `Kokkos::View` overload of `doPosts`
- [x] Fix 4-argument `Teuchos::ArrayRCP` overload of `doPosts`
- [x] Fix 4-argument `Kokkos::View` overload of `doPosts`
## Related Issues
* Part of #383

---

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3227
MueLu: Adding region MG for structured/unstructured meshes to test driver (James Willenbring, 2018-10-25)

*Created by: mayrmt*
@trilinos/muelu
## Expectations
Enable the region multigrid driver to handle problems with structured/unstructured meshes, i.e. regional meshes where most regions are structured, but at least one region is unstructured.
This will also require a *hybrid interface aggregation* which is currently under development by @lucbv and @pohm01.
## Current Behavior
All regions have to be structured.
## Motivation and Context
This will add lots of flexibility in mesh design.
## Definition of Done
- [ ] Read in a mesh with structured/unstructured regions
- [ ] Adapt the MueLu hierarchy setup to deal with these hybrid cases
- [ ] Include a hybrid interface aggregation (done in #2687)
## Possible Solution
@rstumin has provided some code snippets. Let's use them as a starting point.
## Related Issues
* Related to #2687
## Interested Parties
@lucbv @rstumin @pohm01
---

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/2970
MueLu: Link to Kokkos Kernels Distance-2 Graph Coloring (James Willenbring, 2018-06-19)

*Created by: william76*
@trilinos/muelu
So, apparently I can't just link to the MueLu folks directly on the Kokkos-Kernels repo... :p
Here's a link to the PR: [kokkos-kernels #263][1]
@jhux2
@lucbv
[1]: https://github.com/kokkos/kokkos-kernels/pull/263

---

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/2933
Change from individual CDash error emails to daily summary emails for the ATDM Trilinos builds (and perhaps other efforts) (James Willenbring, 2019-04-10)

*Created by: bartlettroscoe*
CC: @fryeguy52, @trilinos/framework, @dridzal
## Description
After having to triage the promoted ATDM Trilinos builds for a couple of months now, and from extensive experience on other projects like CASL VERA, I have come to the realization that relying on CDash error emails is not a very effective notification and monitoring scheme in many of these situations. The reasons that CDash error emails are not effective for keeping on top of a lot of builds are that:
1. It is hard to tell if a failing test is new that day or has been failing for multiple days or if that same test is failing across several builds. (All you get is a single email telling you that there is a failure for that one build.)
2. When a failure does occur that results in a CDash error email, there is an urgency to address the problem ASAP (by either fixing, disabling, or reverting commits) in order to make the CDash error email go away. Otherwise, repeated CDash error emails day after day make people accustomed to seeing them, and new failures get ignored (and many people will create email filters and just ignore them from that point on).
3. Catastrophic failures due to system issues can occur that result in a huge number of CDash error emails that can spam people (sometimes a Trilinos developer can get a dozen or more emails, since they are on several different package regression lists). This can occur for many reasons, like the disk filling up, the Intel license server going down, or a module not loading correctly. The huge glut of CDash error emails in these cases can obscure new real failures and can cause some people to add email filters (which then makes the CDash error emails worthless).
Instead of relying on individual CDash error emails, we could move to a notification scheme that creates a single email each day summarizing the builds and tests and giving some information about the history of failing tests. Such a system could solve all of the problems listed above and make top-level triaging and monitoring of a bunch of related builds much easier.
(NOTE: Really CDash error notification emails are the best solution for a small number of post-push CI builds that you expect to fail only very rarely and you need a notification ASAP. For nightly builds, they are not effective for the reasons described above.)
## Possible Solution
It seems that a straightforward solution would be to write a Python script that extracted data off of the CDash site using multiple queries using the API interface that provides data as JSON data-structures. The Python script would analyze the data and create an HTML-formatted email with useful summary information and CDash URL links.
The full specification is given at:
* https://docs.google.com/document/d/13A6tIXCS5EnL0a3ramu-4TvCMwFeiIKEKjOlP-z1Qvo
The input that one would provide to the Python script would be:
* Name of the set of builds being analyzed (e.g. "ATDM Trilinos Builds")
* Base CDash site (e.g. "https://testing-vm.sandia.gov/cdash/")
* CDash project name (e.g. "project=Trilinos")
* Current testing day (e.g. "YYYY-MM-DD")
* CDash query URL fields (minus `data`, `project`, etc.) for queryTests.php to determine tests to be examined
* CDash query URL fields (minus `data`, `project`, etc.) for index.php for list of builds to be examined
* List of expected builds in the triplet ('site', 'build-name', 'group')
Given this data, the Python script would run queries and extract data off of the queryTests.php page for the current day and the previous two testing days (using the `data=YYYY-MM-DD` URL field) and then display that data as described below (sorted into various lists).
The Python script would then run the query on the index.php page and would note the builds that had any configure, build or test failures (including "not run" tests) and it would compare the list of builds extracted to the input list of expected builds and then note the expected builds that did not show up.
Then the Python script would construct an HTML-formatted email with the body having the following data:
* (limited) List of tests that failed today but not the previous day (`t1=???` in summary line)
* (limited) List of tests that failed today and the previous day but not the day before that (`t2=???` in summary line)
* (limited) List of tests that failed today and the previous two consecutive days (`t3+=???` in summary line)
* Total number of "not-run" (non-disabled) tests for current testing day and CDash URL to that list (`tnr=???` in summary line)
* List of current-day builds that had any configure, build, or test failures (including "not run" tests) (`b=???` is the sum of the build failures in those builds shown in summary line)
* List of missing expected builds or builds that exist and pass the configure but don't have test results (`meb=???` in summary) (NOTE: The current CDash implementation will only alert about missing expected builds but it will not alert about builds with missing tests.)
* Total number of builds run and URL to the list of builds.
* Total number of failing tests for the current testing day and the CDash URL
* URL(s) to the list of all failing tests for the current day (but excluding "not run" tests)
The summary line for the email could be something like:
```
FAILED (t1=2, t2=1, t3+=5, tnr=18, b=3, meb=1): ATDM Trilinos Builds
```
That email summary message would look similar to the ones that CDash sends out, and one could see just in the summary line how many tests newly failed in the current testing day (i.e. `t1=2`), how many tests failed in the last two consecutive days (i.e. `t2=1`), and how many tests failed in the last three or more consecutive days (i.e. `t3+=5`). It would also show if there were any build failures (i.e. `b=3`) and how many tests were not run (`tnr=18`). Lastly, it would show if there were any missing expected builds (`meb=1`).
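The consecutive-day classification could be sketched as follows (a hypothetical helper, not part of any existing tool), taking the sets of failing test names for the current day and the two previous testing days:

```python
def classify_failures(today, prev1, prev2):
    """Split today's failing tests into the t1/t2/t3+ categories
    described above (hypothetical sketch, not the real script)."""
    t1 = today - prev1              # new failures today
    t2 = (today & prev1) - prev2    # failed today and yesterday only
    t3_plus = today & prev1 & prev2 # failed three or more days running
    return t1, t2, t3_plus
```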
For the ATDM Trilinos builds, we could run this script on a cron job or a Jenkins job after 12 midnight MT or so (or wait until 5 am to allow all of the jobs to finish).
Other data we might consider reporting on and showing are:
* Number of, URL to, and (limited) list of newly passing tests for the current testing that failed the previous day (or the last day that the matching builds had any test results) (`tnp=???` in summary line)
* Number of, URL to, and (limited) list of newly missing tests compared to yesterday (but only if the build ran the current day and the tests ran for that build and likewise for the previous day) (`tnm=???` in summary line)
The above two bits of data would really help in determining that failing tests got resolved (either by fixing them or temporarily disabling them).
And since you would only get one email, I think it would be good to send out the email with the summary line:
```
PASSED (tnp=2, tnm=1): ATDM Trilinos Builds
```
and that email would contain links to the set of 100% passing builds!
That is an email that even a manager might want to get :-)
This script could also allow you to specify a set of "expected may fail" tests, which would be provided in an array with the four fields `[<test-name>, <build-name>, <site-name>, <github-issue-link>]`; any failing tests that matched these criteria would be listed in their own sublist in the email and could be given `tef=???` in the summary line. These failing tests would not be counted against global pass/fail when they fail, but if they go from failing to passing, that would be listed along with the other "newly passing tests" (e.g. `tnp`). However, a better way to handle this would be to have CTest/CDash mark such tests as EXPECTED_MAY_FAIL as described in [this CTest/CDash backlog item](https://docs.google.com/document/d/1TLHRp8eTNKw7udOhwIxrOYShXQUbxAzsXeOq5cwWnKM/edit#bookmark=id.4w8ld6727hpw), and then this script would automatically handle these tests differently without needing a separate list. On the other hand, allowing someone to label a certain test as "expected may fail" specifically in this script would allow different customers to handle the same test differently. For example, one customer might consider a failing MueLu test a show stopper that affects global pass/fail, while another may not and would therefore want to handle it as an "expected may fail" that does not affect global pass/fail. You can't do that with a single CTest/CDash property per test. But without direct CTest/CDash support, the email body would list the failing test along with the `<github-issue-link>`, so one could immediately go to that issue to see how the failing test is being addressed.
Even for tests that we did **not** want to mark as "Expected may fail" (and therefore remove from the global pass/fail logic), it would be nice to mark known failing tests with the GitHub issue ID if the failure is known and is being tracked. This could be done by passing in an array of "Known failing" tests with entries `[<test-name>, <build-name>, <site-name>, <github-issue-link>]`. This would be useful when looking at the summary email to know whether those tests still need to be triaged. (That is, if one sees failing tests that have failed for more than one consecutive day without an associated GitHub issue, that would be a trigger to triage the failure, create a new Trilinos GitHub issue, and add the test to the "Expected may fail" or "Known failing" list.)
The script could also allow you to specify some "flaky" or "unstable" builds as an array of `[<build-name>, <site-name>]` entries where we expect random test failures. If a test failed in one of these "flaky" or "unstable" builds, it would be reported in a separate section of the email and would not count toward global pass/fail. Currently (as of 7/14/2018) we would categorize all of the ATDM Trilinos builds on 'ride' (see #2511) and the builds on 'mutrino' (see [TRIL-214](https://software-sandbox.sandia.gov/jira/browse/TRIL-214)) in this category. That way, we could keep track of these builds in case something big went wrong, but they would not count toward global pass/fail (and therefore would not disrupt automated processes that update Trilinos between branches and application customers). But if more than a small number of test failures occurred (e.g. 4 tests per build), then this could impact global pass/fail. That would keep a new catastrophic failure on one of these platforms from allowing an update of Trilinos to an ATDM APP, for example.
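Matching a failing test against such a list could look like this (the field order follows the entries proposed above; the helper itself is hypothetical):

```python
def match_expected_may_fail(test_name, build_name, site_name, expected_may_fail):
    """Return the GitHub issue link if this (test, build, site) triple is on
    the 'expected may fail' list, else None (hypothetical sketch)."""
    for entry_test, entry_build, entry_site, issue_link in expected_may_fail:
        if (entry_test, entry_build, entry_site) == (test_name, build_name, site_name):
            return issue_link
    return None
```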
## Tasks:
1. Get an initial script working that keeps track of failing tests with existing GitHub issue trackers, can detect new failing tests that need to be triaged, and has basic unit tests in place (see "TODO.txt" file in the 'atdm-email' branch of the 'TrilinosATDMStatus' repo and 'atdm-email' in the TriBITS branch) ... PROGRESS ...
1. Set up a mailman list and a Jenkins job to run the script and post emails to the mailman list (and we can sign up for the mail list). (The mail list will also provide an archive of past results.) (There should be a different mail list for different types of results, e.g. one for the main "Promoted ATDM Trilinos Builds", a different one for "Specialized ATDM Trilinos Builds", etc.)
1. Create documentation about the script somewhere and put in links to this documentation in the generated HTML emails somehow.
1. Flesh out the script to cover all of the types of failures we need to keep track of.
1. ???
---

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/2689
Add install hooks for ATDM Trilinos configuration and ctest -S driver (James Willenbring, 2019-04-18)

*Created by: bartlettroscoe*
**CC:** @fryeguy52
## Description
In order for the ATDM projects to adopt the new [ATDM Trilinos Configuration and Jenkins Drivers](https://snl-wiki.sandia.gov/display/CoodinatedDevOpsATDM/ATDM+Builds+of+Trilinos) (that submit to CDash as a byproduct), these scripts must support the installation of Trilinos and using Trilinos from that install.
For this to occur:
* There must be a single script, sitting in the installation directory, that can be sourced to load the correct env to use that installed version of Trilinos. **[Done]**
* Trilinos must be installable using the ATDM Trilinos configuration from a Jenkins job
## Possible solutions
First, a new env var `ATDM_CONFIG_TRIL_CMAKE_INSTALL_PREFIX` could be added that would be read in the `ATDMDevEnvSettings.cmake` file and would set the CMake var `CMAKE_INSTALL_PREFIX`. That would give you the correct configuration of Trilinos and would know where to install.
As for doing the installation itself, one simple solution would be to provide a `atdm/install-prebuilt-trilinos.sh` script that will run `make install` on the built Trilinos version. This script would need to be explicitly called by some Jenkins driver script. But any failures would not be posted to CDash.
If we wanted to allow for installation errors to get reported up to CDash, then a `CTEST_DO_INSTALL` option could be added to the [TRIBITS_CTEST_DRIVER()](https://tribits.org/doc/TribitsDevelopersGuide.html#tribits-ctest-driver) function and would do a `cmake_build( ... )` command that would run the `install` target (and in parallel) after the project build was complete. That would also allow logic for not installing if there was a build failure and, again, would allow for any install failures to get reported to CDash.
Also, to help make installation testing for downstream customer codes stronger, the smart-jenkins-driver.sh script could be updated to move the source and build directories out of the way, to catch mistakes where the installed configuration files point into the source or build trees. This could be accomplished by adding the steps:
* Run the smart-jenkins-driver.sh script under `$WORKSPACE/Trilinos`
* If `$WORKSPACE/moved/SRC_AND_BUILD/` already exists, then move it back to `$WORKSPACE/SRC_AND_BUILD/`
* After the build, installation and testing are completed, move `$WORKSPACE/SRC_AND_BUILD/` to `$WORKSPACE/moved/SRC_AND_BUILD/`.
NOTE: The reason to move `$WORKSPACE/SRC_AND_BUILD/` to `$WORKSPACE/moved/SRC_AND_BUILD/` instead of say `$WORKSPACE/SRC_AND_BUILD.moved/` is to catch errors where the installation under `$WORKSPACE/local_install/` might have relative paths to `$WORKSPACE/SRC_AND_BUILD/BUILD/` which would also work for `$WORKSPACE/SRC_AND_BUILD.moved/BUILD/` but would not work for `$WORKSPACE/moved/SRC_AND_BUILD/BUILD/`. Also, the reason to move `$WORKSPACE/SRC_AND_BUILD/` instead of deleting it is to avoid having to re-clone the Trilinos git repo again from scratch every time the job runs and to allow looking through the build artifacts on Jenkins after the job is complete.
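The move/restore steps above can be sketched as shell functions. This is a self-contained illustration, not the real smart-jenkins-driver.sh logic: `$WORKSPACE` is faked with a temp dir and the function names are invented for the sketch.

```shell
#!/bin/sh
# Sketch of the moved/SRC_AND_BUILD workflow; $WORKSPACE is faked with a
# temp dir so the sketch runs anywhere.
WORKSPACE="$(mktemp -d)"

restore_src_and_build() {
  # Before a run: if a previous run parked the tree under moved/, bring it
  # back so the git clone and build artifacts are reused.
  if [ -d "$WORKSPACE/moved/SRC_AND_BUILD" ]; then
    mv "$WORKSPACE/moved/SRC_AND_BUILD" "$WORKSPACE/SRC_AND_BUILD"
  fi
}

park_src_and_build() {
  # After build, install, and test: move the tree out of the way so any
  # installed files still pointing into it fail fast.
  mkdir -p "$WORKSPACE/moved"
  mv "$WORKSPACE/SRC_AND_BUILD" "$WORKSPACE/moved/SRC_AND_BUILD"
}

# Simulate one full cycle:
mkdir -p "$WORKSPACE/SRC_AND_BUILD/BUILD"
park_src_and_build
restore_src_and_build
[ -d "$WORKSPACE/SRC_AND_BUILD/BUILD" ] && echo "tree restored"
```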
Lastly, in order to support loading the correct env from a Trilinos install, the relevant files from `Trilinos/cmake/std/atdm/` that match the configuration of Trilinos being installed also need to be installed. At a minimum, this must include, for example:
* `Trilinos/cmake/std/atdm/load-env.sh`
* `Trilinos/cmake/std/atdm/utils/`
* `Trilinos/cmake/std/atdm/<system-name>/environment.sh`
But that would allow loading any env, including those that don't match the install. So the Trilinos install hooks should also be updated to install a script `<install-prefix>/load_matching_env.sh` that takes no arguments and sources the installed `atdm/load-env.sh` script with the right job name. For example, the installed `load_matching_env.sh` script might look like:
```
source <install-prefix>/share/atdm-trilinos/load-env.sh Trilinos-cuda-9.2-opt
```
Then a client like EMPIRE or SPARC would just source:
```
source <trilinos-install-prefix>/load_matching_env.sh
```
and then inspect the exported `ATDM_CONFIG_*` vars and know that the right compilers, MPI, TPL locations, etc. were loaded to correctly use that installation of Trilinos.
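The client-side flow might look like the following sketch, where a stub written to a temp dir stands in for the installed `load_matching_env.sh`, and the particular `ATDM_CONFIG_*` variable names and values are illustrative only:

```shell
#!/bin/sh
# Stub of an installed <trilinos-install-prefix>/load_matching_env.sh; the
# ATDM_CONFIG_* names and values below are illustrative, not the exact set.
prefix="$(mktemp -d)"
cat > "$prefix/load_matching_env.sh" <<'EOF'
export ATDM_CONFIG_COMPILER=cuda-9.2
export ATDM_CONFIG_BUILD_TYPE=opt
EOF

# A client like EMPIRE or SPARC sources the script, then inspects the
# exported vars to pick up the matching compilers, MPI, and TPL locations.
. "$prefix/load_matching_env.sh"
echo "compiler=$ATDM_CONFIG_COMPILER build=$ATDM_CONFIG_BUILD_TYPE"
```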
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/2430
MueLu: uncoupled aggregation kokkos refactor (2018-10-02T22:42:41Z, James Willenbring)
*Created by: lucbv*
@trilinos/muelu
## Expectations
Aggregation algorithms need to be refactored to run efficiently on new architectures.
## Current Behavior
Aggregation runs on the new platforms, but large parts of the code are still serial.
## Motivation and Context
This will be useful for simulations on the advanced platforms coming into production.
## Definition of Done
- [ ] rewrite aggregation algorithms using kokkos
- [ ] check that the new algorithms compile with both OpenMP and CUDA nodes
- [ ] verify that the aggregates created using these algorithms are reasonable for MueLu's use
## Possible Solution
Pull request xxx is submitted with a proposed implementation that should scale reasonably well, at least with OpenMP.
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/2399
remove or fix the broken cuda/panzer test (2018-03-19T14:45:04Z, James Willenbring)
*Created by: bathmatt*
mySourceTerm.cpp doesn't compile under CUDA; can it be removed or fixed? This is an old issue. @jmgate, can you please resolve it? It's your code.
Thanks
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/2230
MueLu: adding structured aggregation capabilities (2018-09-07T15:17:31Z, James Willenbring)
*Created by: lucbv*
This issue tracks the progresses made toward the addition of structured aggregation capabilities in MueLu.
@rstumin @mayrmt
@trilinos/muelu
## Expectations
The goal is to extract, clean up, and extend the code that already exists in GeneralGeometricPFactory (GGP) to create a separate aggregation factory that can be used by TentativeP, GGP, and BlackBoxP. This should allow for easier maintenance, as the code in each factory will be reduced.
## Current Behavior
Currently, aggregation-like capabilities reside in both GGP and BlackBoxP, which makes it hard to upgrade the code. At the same time, TentativeP uses brick aggregation, which only provides limited capabilities and cannot be directly compared to GGP.
## Motivation and Context
This work is part of a bigger project to add structured-grid capabilities to MueLu, hopefully providing an alternative to unstructured aggregation when mesh alignment is important to the physics or numerics of a problem (line/plane smoothing).
## Definition of Done
Here is a more or less chronological list of tasks that need to happen in order to complete this work:
- [x] extract code from the GGP factory and create a structured aggregation factory that provides aggregates for a piecewise-constant interpolation algorithm
- [x] extend the previous work to allow uncoupled aggregation to be used with structured aggregation; currently only coupled aggregation is possible
- [x] allow support for block structured code (this is more of a long-term goal, not clearly defined yet)
- [x] add support for linear interpolation
- [ ] remove duplicated code from GGP and BlackBoxP
## Related Issues
* Is blocked by #2577
## Additional Information
This work will require a significant amount of coding to complete, and some pieces, such as block structured aggregation, depend on other design decisions. It is therefore more than likely that multiple pull requests will be required to complete all of the above.
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/2012
Phalanx: add support for hierarchic parallelism (2017-11-21T21:21:53Z, James Willenbring)
*Created by: rppawlo*
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1925
Tpetra: Make CrsGraph & CrsMatrix fillComplete check domain & range Map one-to-one-ness in debug mode (2018-12-23T00:36:40Z, James Willenbring)
*Created by: mhoemmen*
@trilinos/tpetra
See https://github.com/NaluCFD/Nalu/issues/211 for an example of a common user error, namely not passing in domain and range Maps to fillComplete when the row Map is overlapping (i.e., not one to one).
Tpetra has the `TPETRA_DEBUG` environment variable now; if you set it to 1, you'll get more debug-mode checks. We can exploit this to help users diagnose their incorrect usage of Tpetra. In particular, we can do the following:
- Make the version of fillComplete that takes no Maps check that the row Map is one to one
- Make the version of fillComplete that takes the domain and range Maps check that these Maps are one to one
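The `TPETRA_DEBUG` gating described above can be sketched in shell; the function and its messages are purely illustrative (in Tpetra the gate would of course be read in C++), but the pattern is the same: the expensive check runs only when the env var is set to 1.

```shell
#!/bin/sh
# Illustrative sketch of gating an expensive check on the TPETRA_DEBUG
# environment variable; this is not Tpetra code.
maybe_check_one_to_one() {
  if [ "${TPETRA_DEBUG:-0}" = "1" ]; then
    echo "debug mode: checking that the row Map is one to one"
  else
    echo "release mode: skipping expensive one-to-one check"
  fi
}

TPETRA_DEBUG=0; maybe_check_one_to_one   # expensive check skipped
TPETRA_DEBUG=1; maybe_check_one_to_one   # extra debug-mode checks run
```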
We should only do these checks in debug mode, since they may be expensive (they may require extra MPI communication).

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1903
Panzer: issues with pthread/multiple execution spaces (2017-10-26T15:25:24Z, James Willenbring)
*Created by: rppawlo*
Reported in kokkos:
https://github.com/kokkos/kokkos/issues/1186
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1833
Phalanx: explore DAG evaluation performance from single parallel_for launch (2017-10-16T22:50:51Z, James Willenbring)
*Created by: rppawlo*
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1751
Discuss and document workflow for directly snapshotting SEACAS into Trilinos to better serve ATDM (2018-06-26T12:49:47Z, James Willenbring)
*Created by: bartlettroscoe*
**Next Action Status:**
@gdsjaar to implement the agreed-to workflow on the SEACAS GitHub site, then start doing the new workflow ...
**CC:** @trilinos/framework, @gsjaardema
**Description:**
This story is to document a discussion and the final workflow for allowing @gsjaardema to snapshot SEACAS directly into Trilinos/packages/seacas/, in addition to snapshotting SEACAS into Sierra.base/seacas/ and then having it snapshotted from there into Trilinos/packages/seacas/. The nature of the snapshotting process (i.e., never a merge) eliminates the potential for merge conflicts that might make snapshotting from Sierra.base/seacas/ to Trilinos/packages/seacas/ more difficult. This would give @gsjaardema the control to address issues for ATDM customers very quickly and would simplify version control and configuration of Trilinos for ATDM customers. For more background and discussion, see:
* https://software-srn.sandia.gov/jira/browse/SPAR-277
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1575
Panzer: CONTRIBUTES-Style Evaluators (2018-10-24T15:19:43Z, James Willenbring)
*Created by: jmgate*
`panzer::Integrator_BasisTimesScalar` and `Integrator_GradBasisDotVector` were refactored in fe8070f to have new constructors that allow a user to create the `Evaluator` in either an `EVALUATES` style (which is what has been done up to this point) or a `CONTRIBUTES` style. When using the latter, no memory is created to store the result of the `evaluateFields()` call; instead the result is simply contributed to some other `Evaluator`. This has great potential for memory savings. Additionally, these new constructors are compile-time checked, as opposed to the old `ParameterList` interface.
We need to begin refactoring the rest of the Panzer `Evaluator`s in a similar fashion. This issue is to be an over-arching issue that will link to smaller issues to tackle the `Evaluator`s themselves.
@trilinos/panzer
**Sub-issues:**
| Class | Issue | Status |
|:----- | ----- | ------ |
| `Integrator_BasisTimesVector` | #1624 | Closed |
| `Integrator_CurlBasisDotVector` | #1890 | Closed |
| `Integrator_DivBasisTimesScalar` | #1891 | Closed |
| `Integrator_GradBasisCrossVector` | #1892 | Closed |
| `Integrator_GradBasisTimesScalar` | #1893 | Open |
| `Integrator_Scalar` | #1894 | Open |
| `Integrator_TransientBasisTimesScalar` | #1895 | Open |

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1473
Collect items for major refactoring of Thyra to make simpler and more sustainable (2017-08-07T15:30:33Z, James Willenbring)
*Created by: bartlettroscoe*
**CC:** @trilinos/thyra, @rppawlo, @atoth1
**Description:**
Thyra is a feature-full but complex set of software supporting the development of abstract numerical algorithms (ANAs). In the years since Thyra came into use, many potential developers have been scared off or intimidated by its complexity.
This story is to collect a set of ideas on how we can refactor Thyra to make it simpler to understand, use, and extend, without destroying the OO design features and functionality.
To help collect these ideas, I have created the Google Doc "Simplifying Thyra":
* https://docs.google.com/document/d/1oHZzTHwl9D2lZ_on8HSg0zmAuB6wmToDtq2SOPzxmQs
For now we are just going to collect ideas in that document.
ToDo: Define full scope for this story and a clear definition of done.
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1403
Panzer: Simplify response support (2017-06-07T00:47:07Z, James Willenbring)
*Created by: rppawlo*
This issue is to simplify the requirements for adding new response functions to panzer.
1. Naming of objects
2. Work to eliminate the builder and factory as separate classes
3. Build simplified derived classes from a decision tree
4. Implement a point-values evaluator as an example