Panzer performance with OpenMP back end
Created by: bathmatt
When one calls the Panzer model evaluator creator with the OpenMP Kokkos back end, it is about 100x slower than normal.
Specifically, the timing for the function below is 28.6735 using all the threads vs. 0.249929 using --kokkos-threads=1:
template <typename Scalar>
void panzer::ModelEvaluator<Scalar>::
setupModel(const Teuchos::RCP<panzer::WorksetContainer> & wc,
           const std::vector<Teuchos::RCP<panzer::PhysicsBlock> >& physicsBlocks,
           const std::vector<panzer::BC> & bcs,
           const panzer::EquationSetFactory & eqset_factory,
           const panzer::BCStrategyFactory& bc_factory,
           const panzer::ClosureModelFactory_TemplateManager<panzer::Traits>& volume_cm_factory,
           const panzer::ClosureModelFactory_TemplateManager<panzer::Traits>& bc_cm_factory,
           const Teuchos::ParameterList& closure_models,
           const Teuchos::ParameterList& user_data,
           bool writeGraph,
           const std::string & graphPrefix,
           const Teuchos::ParameterList& me_params)
All of this time is spent in the following two calls, about 80% in the first and 20% in the second. I'm not sure yet where inside these functions the cost is; digging into that now.
fmb->setupVolumeFieldManagers(physicsBlocks,volume_cm_factory,closure_models,*lof_,user_data);
fmb->setupBCFieldManagers(bcs,physicsBlocks,eqset_factory,bc_cm_factory,bc_factory,closure_models,*lof_,user_data);
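For anyone who wants to reproduce the split between the two calls, here is a minimal sketch of how each one could be timed separately. It assumes plain std::chrono wall-clock timing rather than any Trilinos timer, and PhaseTime, timeSeconds, percentShare and printBreakdown are hypothetical helper names for illustration, not part of Panzer.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper type: one timed phase and its wall-clock seconds.
struct PhaseTime {
  std::string label;
  double seconds;
};

// Run `work` once and return its wall-clock duration in seconds.
inline double timeSeconds(const std::function<void()>& work) {
  const auto start = std::chrono::steady_clock::now();
  work();
  const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
  return elapsed.count();
}

// Percentage of the summed time spent in phases[i]; 0 if the total is zero.
inline double percentShare(const std::vector<PhaseTime>& phases,
                           std::size_t i) {
  double total = 0.0;
  for (const auto& p : phases) total += p.seconds;
  return total > 0.0 ? 100.0 * phases[i].seconds / total : 0.0;
}

// Print each phase with its time and its share of the total.
inline void printBreakdown(const std::vector<PhaseTime>& phases) {
  for (std::size_t i = 0; i < phases.size(); ++i) {
    std::cout << phases[i].label << ": " << phases[i].seconds << " s ("
              << percentShare(phases, i) << "%)\n";
  }
}
```

Inside setupModel, each of the two fmb-> calls above would be wrapped in a lambda passed to timeSeconds, with the results collected into a std::vector<PhaseTime> and handed to printBreakdown. Running that once with all threads and once with --kokkos-threads=1 should show whether both calls slow down by the same factor.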
@rppawlo @eric-c-cyr @jmgate