Trilinos issue #3991

Closed
Created Dec 04, 2018 by James Willenbring (@jmwille, Maintainer)

MueLu: MueLu hangs when trying to "export data" such as matrices after repartitioning has occurred

Created by: pwxy

MueLu hangs when trying to "export data" such as matrices after repartitioning has occurred. The MPI processes that have dropped out after repartitioning throw an exception, and the run hangs:

p=3: *** Caught standard std::exception of type 'Teuchos::bad_any_cast' :

 ../../packages/muelu/src/Interface/../MueCentral/MueLu_VariableContainer.hpp:103:
 
 Throw number = 17
 
 Throw test that evaluated to true: data_->type() != typeid(T)
 
 Error, cast to type Data<Teuchos::RCP<Xpetra::Matrix<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::OpenMP, Kokkos::HostSpace> > >> failed since the actual underlying type is 'Teuchos::RCP<Xpetra::Operator<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::OpenMP, Kokkos::HostSpace> > >!
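
The message above says that the stored value has type RCP<Xpetra::Operator> while the caller asked for RCP<Xpetra::Matrix>, and that the check is an exact typeid comparison. The following is a minimal C++17 sketch of that failure mode, assuming a type-erased slot built on std::any and std::shared_ptr. The names Slot, Operator, Matrix, canRead, and makeOperatorSlot are hypothetical stand-ins for illustration and are not MueLu, Teuchos, or Xpetra code.

```cpp
#include <any>
#include <memory>
#include <utility>

// Hypothetical stand-ins for the two types named in the error message.
struct Operator { virtual ~Operator() = default; };
struct Matrix : Operator {};

// Hypothetical type-erased slot. It is only loosely modeled on the check
// quoted in the log (data_->type() != typeid(T)); it is not MueLu's
// VariableContainer.
class Slot {
 public:
  template <class T>
  void set(T value) { data_ = std::move(value); }

  // Succeeds only when T is exactly the type that was stored.
  // std::any_cast throws std::bad_any_cast on any other type, even one
  // related by inheritance.
  template <class T>
  T get() const { return std::any_cast<T>(data_); }

 private:
  std::any data_;
};

// Returns true if the slot can be read back as T, false if the cast throws.
template <class T>
bool canRead(const Slot& slot) {
  try {
    (void)slot.get<T>();
    return true;
  } catch (const std::bad_any_cast&) {
    return false;
  }
}

// Stores a Matrix object behind a pointer whose static type is Operator.
inline Slot makeOperatorSlot() {
  Slot slot;
  std::shared_ptr<Operator> op = std::make_shared<Matrix>();
  slot.set(op);
  return slot;
}
```

In this sketch, reading the slot from makeOperatorSlot() as std::shared_ptr<Operator> succeeds, while reading it as std::shared_ptr<Matrix> throws, even though the pointee really is a Matrix. The exact-type comparison looks only at the static type that was stored, so the dynamic type of the object does not help.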

This is the Trilinos develop branch, cloned this morning (Dec 4, 2018), SHA1 573e3290b0500eee45e582cb8fcee0b1c6476cec

Example MueLu_Driver.exe run that exhibits this issue:

mpirun -n 4 MueLu_Driver.exe --matrixType=Laplace3D --nx=50 --ny=50 --nz=4 --mx=2 --my=2 --mz=1


[ptlin@ceerws3709 scaling]$ cat scaling.xml
<ParameterList name="MueLu">

  <!--
    For a generic symmetric scalar problem, these are the recommended settings for MueLu.
  -->

  <!-- ===========  GENERAL ================ -->
    <Parameter        name="verbosity"                            type="string"   value="high"/>

    <Parameter        name="coarse: max size"                     type="int"      value="1000"/>

    <Parameter        name="multigrid algorithm"                  type="string"   value="sa"/>

    <!-- reduces setup cost for symmetric problems -->
    <Parameter        name="transpose: use implicit"              type="bool"     value="true"/>

    <!-- start of default values for general options (can be omitted) -->
    <Parameter        name="max levels"                           type="int"      value="10"/>
    <Parameter        name="number of equations"                  type="int"      value="1"/>
    <Parameter        name="sa: use filtered matrix"              type="bool"     value="true"/>
    <!-- end of default values -->

  <!-- ===========  AGGREGATION  =========== -->
    <Parameter        name="aggregation: type"                    type="string"   value="uncoupled"/>
    <Parameter        name="aggregation: drop scheme"             type="string"   value="classical"/>
    <!-- Uncomment the next line to enable dropping of weak connections, which can help AMG convergence
         for anisotropic problems.  The exact value is problem dependent. -->
    <!-- <Parameter        name="aggregation: drop tol"                type="double"   value="0.02"/> -->

  <!-- ===========  SMOOTHING  =========== -->
    <Parameter        name="smoother: type"                       type="string"   value="CHEBYSHEV"/>
    <ParameterList    name="smoother: params">
      <Parameter      name="chebyshev: degree"                    type="int"      value="2"/>
      <Parameter      name="chebyshev: ratio eigenvalue"          type="double"   value="7"/>
      <Parameter      name="chebyshev: min eigenvalue"            type="double"   value="1.0"/>
      <Parameter      name="chebyshev: zero starting solution"    type="bool"     value="true"/>
    </ParameterList>

  <!-- ===========  REPARTITIONING  =========== -->
    <Parameter        name="repartition: enable"                  type="bool"     value="true"/>
    <Parameter        name="repartition: partitioner"             type="string"   value="zoltan2"/>
    <Parameter        name="repartition: start level"             type="int"      value="2"/>
    <Parameter        name="repartition: min rows per proc"       type="int"      value="800"/>
    <Parameter        name="repartition: max imbalance"           type="double"   value="1.1"/>
    <Parameter        name="repartition: remap parts"             type="bool"     value="false"/>
    <!-- start of default values for repartitioning (can be omitted) -->
    <Parameter name="repartition: remap parts"                type="bool"     value="true"/>
    <Parameter name="repartition: rebalance P and R"          type="bool"     value="false"/>
    <ParameterList name="repartition: params">
       <Parameter name="algorithm" type="string" value="multijagged"/>
    </ParameterList> 
    <!-- end of default values -->

    <ParameterList name="export data">
      <Parameter name="A" type="string" value="{2}"/>
    </ParameterList> 


</ParameterList>
[ptlin@ceerws3709 scaling]$ 