Zoltan: Several tests fail with 64 bit builds of Scotch and ParMETIS
Created by: bartlettroscoe
Next Action Status:
64-bit Scotch and ParMETIS not enabled for Zoltan yet. Next: Zoltan team to fix failing tests then enable ...
CC: @trilinos/zoltan
Description:
As @kddevin predicted in this #158 comment, several of the Scotch and ParMETIS tests fail when using 64 bit builds of Scotch and ParMETIS. These 64 bit builds are the only builds of these TPLs available with the SEMS Dev Env (see a lengthy discussion in #158 (closed)).
In particular, the following Zoltan tests failed with the 64 bit builds of Scotch and ParMETIS:
$ grep " Test " ctest.out | grep "Failed" | grep "Zoltan_" | grep -i "\(parmetis\|scotch\)"
182/221 Test #3: Zoltan_ch_brack2_3_parmetis_parallel ....................***Failed Error regular expression found in output. Regex=[FAILED] 4.54 sec
184/221 Test #6: Zoltan_ch_bug_parmetis_parallel .........................***Failed Error regular expression found in output. Regex=[FAILED] 2.37 sec
192/221 Test #18: Zoltan_ch_ewgt_parmetis_parallel ........................***Failed Error regular expression found in output. Regex=[FAILED] 3.64 sec
193/221 Test #19: Zoltan_ch_ewgt_scotch_parallel ..........................***Failed Error regular expression found in output. Regex=[FAILED] 0.26 sec
194/221 Test #21: Zoltan_ch_grid20x19_parmetis_parallel ...................***Failed Error regular expression found in output. Regex=[FAILED] 3.81 sec
196/221 Test #24: Zoltan_ch_hammond_parmetis_parallel .....................***Failed Error regular expression found in output. Regex=[FAILED] 7.71 sec
197/221 Test #25: Zoltan_ch_hammond_scotch_parallel .......................***Failed Error regular expression found in output. Regex=[FAILED] 0.43 sec
202/221 Test #33: Zoltan_ch_nograph_parmetis_parallel .....................***Failed Error regular expression found in output. Regex=[FAILED] 1.67 sec
204/221 Test #36: Zoltan_ch_onedbug_parmetis_parallel .....................***Failed Error regular expression found in output. Regex=[FAILED] 0.54 sec
208/221 Test #42: Zoltan_ch_simple_parmetis_parallel ......................***Failed Error regular expression found in output. Regex=[FAILED] 5.28 sec
209/221 Test #43: Zoltan_ch_simple_scotch_parallel ........................***Failed Error regular expression found in output. Regex=[FAILED] 0.52 sec
212/221 Test #48: Zoltan_ch_vwgt_parmetis_parallel ........................***Failed Error regular expression found in output. Regex=[FAILED] 3.72 sec
213/221 Test #49: Zoltan_ch_vwgt_scotch_parallel ..........................***Failed Error regular expression found in output. Regex=[FAILED] 0.26 sec
However, what is interesting is that several Zoltan "scotch" and "parmetis" tests also passed:
$ grep " Test " ctest.out | grep "Passed" | grep "Zoltan_" | grep -i "\(parmetis\|scotch\)"
183/221 Test #4: Zoltan_ch_brack2_3_scotch_parallel ...................... Passed 0.11 sec
185/221 Test #7: Zoltan_ch_bug_scotch_parallel ........................... Passed 0.11 sec
186/221 Test #9: Zoltan_ch_degenerate_parmetis_parallel .................. Passed 0.11 sec
187/221 Test #10: Zoltan_ch_degenerate_scotch_parallel .................... Passed 0.11 sec
188/221 Test #12: Zoltan_ch_degenerateAA_parmetis_parallel ................ Passed 0.11 sec
189/221 Test #13: Zoltan_ch_degenerateAA_scotch_parallel .................. Passed 0.11 sec
190/221 Test #15: Zoltan_ch_drake_parmetis_parallel ....................... Passed 0.11 sec
191/221 Test #16: Zoltan_ch_drake_scotch_parallel ......................... Passed 0.11 sec
195/221 Test #22: Zoltan_ch_grid20x19_scotch_parallel ..................... Passed 0.11 sec
198/221 Test #27: Zoltan_ch_hammond2_parmetis_parallel .................... Passed 0.11 sec
199/221 Test #28: Zoltan_ch_hammond2_scotch_parallel ...................... Passed 0.11 sec
200/221 Test #30: Zoltan_ch_hughes_parmetis_parallel ...................... Passed 0.11 sec
201/221 Test #31: Zoltan_ch_hughes_scotch_parallel ........................ Passed 0.13 sec
203/221 Test #34: Zoltan_ch_nograph_scotch_parallel ....................... Passed 0.11 sec
205/221 Test #37: Zoltan_ch_onedbug_scotch_parallel ....................... Passed 0.11 sec
206/221 Test #39: Zoltan_ch_serial_parmetis_parallel ...................... Passed 0.11 sec
207/221 Test #40: Zoltan_ch_serial_scotch_parallel ........................ Passed 0.11 sec
210/221 Test #45: Zoltan_ch_simple3d_parmetis_parallel .................... Passed 0.11 sec
211/221 Test #46: Zoltan_ch_simple3d_scotch_parallel ...................... Passed 0.11 sec
214/221 Test #51: Zoltan_ch_vwgt2_parmetis_parallel ....................... Passed 0.68 sec
215/221 Test #52: Zoltan_ch_vwgt2_scotch_parallel ......................... Passed 0.11 sec
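To compare the two lists at a glance, the two grep pipelines above can be folded into one tally per TPL. This is only a rough sketch: the function name summarize_tpl_results is made up for illustration, and it assumes ctest.out contains lines in exactly the format shown above (test name in the fourth whitespace-separated field).

```shell
#!/bin/sh
# Hypothetical helper (not part of Zoltan or Trilinos): count Passed/Failed
# Zoltan tests per TPL from saved ctest output.
# Assumes the line format shown above: "<n>/<total> Test #<id>: <name> ... <status> ..."
summarize_tpl_results() {
  grep " Test " "$1" | grep "Zoltan_" | grep -Ei "parmetis|scotch" |
    awk '{
      name = tolower($4)                      # 4th field is the test name
      tpl = (name ~ /parmetis/) ? "parmetis" : "scotch"
      if ($0 ~ /\*\*\*Failed/)      status = "Failed"
      else if ($0 ~ / Passed /)     status = "Passed"
      else                          status = "Other"
      count[tpl " " status]++
    }
    END { for (k in count) print k, count[k] }' | LC_ALL=C sort
}

# Only run when the saved output is present in the current directory.
if [ -f ctest.out ]; then
  summarize_tpl_results ctest.out
fi
```

Each output line has the form "<tpl> <status> <count>", which makes it easy to see whether failures cluster on one TPL.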
I can think of three options to address these failing tests:
- Disable only the currently failing tests for just the SEMS Dev Env build: This could be done by setting cache vars <test_name>_DISABLE=TRUE in the SEMSDevEnv.cmake file.
  - Pro: Easy to implement by non-Zoltan developers
  - Pro: Still enables Scotch and ParMETIS TPLs and gets at least some tests run using these
  - Con: Does not exercise some functionality of Zoltan for Scotch and ParMETIS
  - Con: If Zoltan tests that use Scotch and ParMETIS are changed but only tested against 32 bit builds of these TPLs, there is a greater risk that tests currently passing on the SEMS Dev Env will start failing with the 64 bit builds.
  - Summary: Easy short-term solution that yields all passing CI tests with Zoltan
- Disable Scotch and ParMETIS TPL support for Zoltan for just the SEMS Dev Env build: This could be done by setting Zoltan_ENABLE_Scotch=OFF and Zoltan_ENABLE_ParMETIS=OFF in the SEMSDevEnv.cmake file.
  - Pro: Easy to implement by non-Zoltan developers
  - Pro: There would never be a Scotch or ParMETIS related test failure on the SEMS Dev Env.
  - Con: The build and usage of Zoltan with Scotch and ParMETIS would not be getting tested on the SEMS Dev Env.
  - Summary: Easy short-term solution that yields all passing CI tests with Zoltan
- Update the Zoltan test suite to work with 64 bit Scotch and ParMETIS: This would require Zoltan developers to do the updates.
  - Pro: Would allow full Zoltan test suite to be run on the SEMS Dev Env.
  - Pro: Strengthens the Zoltan
  - Con: Requires Zoltan developers to update the Zoltan test suite
  - Summary: Best long-term solution but requires work from the Zoltan developers
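For the first option, the list of cache variable settings could be generated from the ctest output rather than typed by hand. The sketch below is an assumption-laden illustration: gen_disable_lines is a made-up name, the CMake set(... CACHE BOOL ...) form is one plausible way to write <test_name>_DISABLE=TRUE in a .cmake file, and it assumes the test name printed by ctest is the name the <test_name>_DISABLE variable expects.

```shell
#!/bin/sh
# Hypothetical helper (not part of Trilinos): print one CMake cache-variable
# line per failed Zoltan Scotch/ParMETIS test found in saved ctest output.
# Assumes the test name is the 4th whitespace-separated field, as shown above.
gen_disable_lines() {
  grep " Test " "$1" | grep "Failed" | grep "Zoltan_" | grep -Ei "parmetis|scotch" |
    awk '{
      printf "set(%s_DISABLE TRUE CACHE BOOL \"Fails with 64 bit Scotch/ParMETIS\")\n", $4
    }'
}

# Only run when the saved output is present in the current directory.
if [ -f ctest.out ]; then
  gen_disable_lines ctest.out
fi
```

The printed lines would still need to be reviewed and pasted into the SEMSDevEnv.cmake file by hand; the script does not modify any file.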
I will provide detailed reproducibility instructions in a later comment.
Definition of Done:
- No failing Zoltan tests in pre-push CI testing with the SEMS Dev Env
- Zoltan developers decide on best approach to dealing with these failing tests.
Tasks:
???