Trilinos issues
https://gitlab.osti.gov/jmwille/Trilinos/-/issues
2017-11-07T18:25:18Z

several packages call exit() in the library
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/1945
2017-11-07T18:25:18Z
James Willenbring
*Created by: nschloe*
Discussions of why calling `exit()` in production code is a bad idea go back as far as 2010; cf. https://software.sandia.gov/bugzilla/show_bug.cgi?id=4969. Unfortunately, many libraries still call `exit()`:
```
$ lintian * | grep shlib-calls-exit
X: libtrilinos-zoltan12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_zoltan.so.12.12.1
X: libtrilinos-aztecoo12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_aztecoo.so.12.12.1
X: libtrilinos-muelu12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_muelu.so.12.12.1
X: libtrilinos-nox12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_noxepetra.so.12.12.1
X: libtrilinos-stokhos12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_stokhos_muelu.so.12.12.1
X: libtrilinos-galeri12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_galeri-epetra.so.12.12.1
X: libtrilinos-epetraext12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_epetraext.so.12.12.1
X: libtrilinos-shylu12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_shylu.so.12.12.1
X: libtrilinos-pamgen12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_pamgen.so.12.12.1
X: libtrilinos-ml12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_ml.so.12.12.1
X: libtrilinos-triutils12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_triutils.so.12.12.1
X: libtrilinos-ifpack12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_ifpack.so.12.12.1
X: libtrilinos-pliris12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_dpliris.so.12.12.1
X: libtrilinos-trilinoscouplings12: shlib-calls-exit usr/lib/x86_64-linux-gnu/libtrilinos_trilinoscouplings.so.12.12.1
```
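For reference, a minimal sketch of the usual alternative: the library returns a status code and leaves the decision to terminate to the application. The names `lib_alloc`, `LIB_OK` and `LIB_MEMERR` are hypothetical and are not taken from any Trilinos package; each package would use its own error conventions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical status codes; real packages define their own. */
enum { LIB_OK = 0, LIB_MEMERR = -1 };

/* Allocate n doubles and report failure to the caller instead of
 * terminating the process from inside the library. */
int lib_alloc(size_t n, double **out)
{
    assert(out != NULL);
    *out = NULL;

    /* Refuse sizes whose byte count would overflow size_t. */
    if (n > (size_t)-1 / sizeof **out)
        return LIB_MEMERR;

    *out = malloc(n * sizeof **out);
    if (*out == NULL)
        return LIB_MEMERR;  /* the flagged pattern would call exit(1) here */

    return LIB_OK;
}
```

The caller checks the return value and can clean up, retry, or shut down (e.g. finalize MPI) on its own terms, which is not possible once a shared library has called `exit()`.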
Would be great to see some progress here.

Zoltan: Closing old PR #3122: "Avoid picking up the wrong include from AMPI"
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4117
2018-12-20T19:31:05Z
James Willenbring
*Created by: william76*
@trilinos/zoltan
Pull Request #3122 "Avoid picking up the wrong include from AMPI" hasn't had any activity in a long time and appears to be stale and/or abandoned. I'm closing that PR and creating this issue to link to it.
I'm doing this because we need to close out some of the old PRs due to GitHub limits on the number of checks per hour that are allowed. The pull request autotester uses a polling model to check existing pull requests' status flags, etc., and we have occasionally hit that limit, which causes GitHub to reject the queries and can cause the Autotester to fail until the counter resets at the start of the next hour. Even "WIP" PRs count against this limit, so long-term PRs should probably be converted to issues if they aren't likely to get merged in the near future.
If this PR needs to be brought back to life it can easily be reopened on the pull request page.
If this PR is truly dead, please close out this issue ticket.
FYI: @jbakosi

Zoltan incorrectly reads MatrixMarket matrices
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/2033
2017-12-11T21:23:35Z
James Willenbring
*Created by: rsln-s*
@trilinos/zoltan
## Expectations
Zoltan should correctly read MatrixMarket format matrices and interpret them in row-net fashion (i.e. each row in the matrix corresponds to a hyperedge and each column to a vertex).
## Current Behavior
As pointed out by @SebastianSchlag, for matrix [VDOL/hangGlider_3](https://www.cise.ufl.edu/research/sparse/matrices/VDOL/hangGlider_3.html), the hypergraph that Zoltan reads in has 10259 edges, 10260 vertices, and 44643 pins (nonzeros). However, the matrix contains 10260 vertices, 10260 edges and 92703 nonzeros. @SebastianSchlag points out that this results in incorrect cut computations when compared to hMetis, for example. Please note that this problem was tested in serial mode (i.e. on one processor, without MPI).
## Motivation and Context
I have previously attempted to fix this problem, but recently discovered (thanks, again, to @SebastianSchlag) that my fix is incorrect. I will close the corresponding [pull request #1198](https://github.com/trilinos/Trilinos/pull/1198).
## Definition of Done
After the fix, Zoltan should read the matrix in correctly. For example, for [VDOL/hangGlider_3](https://www.cise.ufl.edu/research/sparse/matrices/VDOL/hangGlider_3.html) the following should be true at the finest level:
```
hg->nVtx == 10260
hg->nEdge == 10260
hg->nPins == 92703
```
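One way to cross-check these numbers independently of Zoltan is to count them straight from the file. The sketch below assumes the Matrix Market coordinate format with a lowercase banner, where a `symmetric` (or `skew-symmetric`) qualifier means only one triangle is stored; `hermitian` files are not handled. `mm_count` and `mm_counts_t` are hypothetical names, not part of Zoltan. This issue does not say whether the expected pin count refers to stored or mirrored entries, so the function reports both.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type, not part of Zoltan. */
typedef struct {
    long rows;      /* rows declared on the size line */
    long cols;      /* columns declared on the size line */
    long stored;    /* entries physically listed in the file */
    long expanded;  /* entries after mirroring off-diagonals if symmetric */
} mm_counts_t;

/* Count the entries of a Matrix Market coordinate file.
 * Returns 0 on success, -1 on I/O or format errors. */
int mm_count(const char *path, mm_counts_t *out)
{
    char line[1024];
    long declared = 0, i, j;
    int symmetric;
    FILE *fp;

    assert(path != NULL && out != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    /* Banner, e.g. "%%MatrixMarket matrix coordinate real symmetric". */
    if (fgets(line, sizeof line, fp) == NULL ||
        strncmp(line, "%%MatrixMarket", 14) != 0) {
        fclose(fp);
        return -1;
    }
    symmetric = (strstr(line, "symmetric") != NULL);

    /* Skip comment lines; the first other line is "rows cols nnz". */
    do {
        if (fgets(line, sizeof line, fp) == NULL) {
            fclose(fp);
            return -1;
        }
    } while (line[0] == '%');

    if (sscanf(line, "%ld %ld %ld", &out->rows, &out->cols, &declared) != 3) {
        fclose(fp);
        return -1;
    }

    out->stored = 0;
    out->expanded = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (sscanf(line, "%ld %ld", &i, &j) != 2)
            continue;  /* ignore blank or malformed lines */
        out->stored++;
        out->expanded += (symmetric && i != j) ? 2 : 1;
    }
    fclose(fp);

    /* The size line must agree with what was actually listed. */
    return (out->stored == declared) ? 0 : -1;
}
```

Comparing `rows`, `cols`, `stored` and `expanded` against the `hg->nVtx`, `hg->nEdge` and `hg->nPins` values printed by the driver shows which of the counts Zoltan is getting wrong.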
## Steps to Reproduce
To check the number of vertices, hyperedges and nonzeros I have introduced the following three lines in 'phg/phg.c' (after line 475):
```
printf("RS_VERTEX_NUM=%d\n", hg->nVtx);
printf("RS_HEDGE_NUM=%d\n", hg->nEdge);
printf("RS_PINS_NUM=%d\n", hg->nPins);
```
Then to build and run:
```
cd /home/rshaydu/dev/Trilinos/build
cmake -D CMAKE_INSTALL_PREFIX:FILEPATH="/home/rshaydu/dev/Trilinos/build" -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF -D Trilinos_ENABLE_Zoltan:BOOL=ON -D Zoltan_ENABLE_EXAMPLES:BOOL=ON -D TPL_ENABLE_MPI:BOOL=OFF -D Trilinos_ENABLE_Fortran:BOOL=OFF -D Zoltan_ENABLE_TESTS:BOOL=ON ..
cd ~/test_dir
cp ~/Trilinos/build/packages/zoltan/src/driver/*.exe .
./zdrive.exe zdrive.inp.hangGlider_3.agg.2.110
```
Where `zdrive.inp.hangGlider_3.agg.2.110`:
```
Decomposition Method = hypergraph
Zoltan Parameters = HYPERGRAPH_PACKAGE=phg
Zoltan Parameters = lb_approach=partition
File Type = matrixmarket
File Name = hangGlider_3
Parallel Disk Info = number=0
Zoltan Parameters = NUM_GLOBAL_PARTITIONS = 2
Zoltan Parameters = PHG_COARSENING_METHOD= agg
Zoltan Parameters = PHG_CUT_OBJECTIVE=HYPEREDGES
Zoltan Parameters = IMBALANCE_TOL= 1.1
```

Zoltan: Remove Build Warnings
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/5106
2019-06-08T15:27:25Z
James Willenbring
*Created by: ZUUL42*
## Enhancement
@trilinos/zoltan
Issue #3178 is working toward turning on Warnings as Errors for _all_ packages in order to ensure Trilinos maintains a high level of SQA practices across the project.
Currently Zoltan has a warning that needs to be handled before we can set Werror for Zoltan and eventually all packages.
A recent test build was performed with -Werror set. [The CDash report can be found here.](https://testing.sandia.gov/cdash/index.php?project=Trilinos&parentid=4989925&filtercount=2&showfilters=1&field1=buildstamp&compare1=63&value1=Experimental&field2=buildstarttime&compare2=83&value2=2019/04/30&filtercombine=and)
Once the Zoltan build doesn't emit any warnings that will be promoted to errors, [we can set `-Werror` in the GCC 7.2.0 automated build](https://github.com/ZUUL42/Trilinos/tree/Werror_Zoltan).

Zoltan test diff failures in targeted CUDA PR build Trilinos-atdm-white-ride-cuda-9.2-release-debug-pt
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4042
2019-04-06T00:14:47Z
James Willenbring
*Created by: bartlettroscoe*
CC: @trilinos/zoltan, @kddevin (Trilinos <product-area-name> Product Lead), @bartlettroscoe, @fryeguy52
## Next Action Status
<status-and-or-first-action>
## Description
As shown in [this query](https://testing.sandia.gov/cdash-dev-view/viewTest.php?onlyfailed&buildid=4287837) the tests:
* `Zoltan_ch_simple_zoltan_parallel`
* `Zoltan_ch_grid20x19_zoltan_parallel`
* `Zoltan_ch_ewgt_zoltan_parallel`
* `Zoltan_ch_nograph_zoltan_parallel`
fail in the build `Trilinos-atdm-white-ride-cuda-9.2-release-debug-pt` which is the current candidate CUDA PR build described in #2464. They have failed since we switched from a `debug` build to a `release-debug` build for the reasons described in https://github.com/trilinos/Trilinos/issues/2464#issuecomment-444637454. These are the only new tests that are failing since we switched from a `debug` to a `release-debug` build.
These all look to be "diff" failures like [here](https://testing.sandia.gov/cdash-dev-view/testDetails.php?test=61372846&build=4287837) showing:
```
DEBUG moving files: simple.out.4.3 output/simple.rib-partlocal4.4.3
DEBUG comparing files: answers/simple.rib-partlocal4.4.3 output/simple.rib-partlocal4.4.3
DEBUG comparing files: answers/simple.rib-partlocal4.drops.4.3 output/simple.rib-partlocal4.drops.4.3
DEBUG COMPARISON 1 1
Test simple:rib-partlocal4 FAILED (Diff failed on 1 files)
```
## Current Status on CDash
The current status of these tests/builds for the current testing day can be found at:
* [Zoltan tests in Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-release-debug-pt build over last two days](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=3&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-release-debug-pt&field2=testname&compare2=65&value2=Zoltan_&field3=buildstarttime&compare3=83&value3=2%20days%20ago)
NOTE: Click "previous" to see the previous day's test results in case this build did not run today or add the filter ["Build Start Time", "is after", "2 weeks ago"] to see history of tests in previous days. (Or create any filters you want from there.)
## Steps to Reproduce
One should be able to reproduce these test failures on either 'white' or 'ride' by cloning the Trilinos git repo, checking out the 'develop' branch, creating a build directory, and then doing:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cuda-9.2-release-debug
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Zoltan=ON \
$TRILINOS_DIR
$ make NP=16
$ bsub -x -Is -q rhel7F -n 16 ctest -j16
```
Initial cleanup of new ATDM builds of Trilinos