Intrepid2 refactor getPhysicalEdge... has a huge overhead on CUDA
Created by: bathmatt
Many routines that compute normals and edge tangents call down to
CellTools<SpT>::
setSubcellParametrization( subcellParamViewType &subcellParam,
const ordinal_type subcellDim,
const shards::CellTopology parentCell ) {
That would not be a big deal, except that this function allocates several Kokkos views:
referenceNodeDataViewType
v0("CellTools::setSubcellParametrization::v0", Parameters::MaxDimension),
v1("CellTools::setSubcellParametrization::v1", Parameters::MaxDimension),
v2("CellTools::setSubcellParametrization::v1", Parameters::MaxDimension),
v3("CellTools::setSubcellParametrization::v1", Parameters::MaxDimension);
On CudaUVM these allocations take quite a bit of time. Because we do this once for every element we need information from, the total cost is huge. For a small test it accounts for 2 minutes of the run time, versus 6 seconds to fill and solve the resulting matrix.
I'm not sure of the best way to resolve this.
The function calls through to getReferenceVertex, which takes a view.
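One possible direction, purely as a sketch: allocate the scratch storage once and reuse it, instead of allocating on every call. The example below is plain C++ and does not use Kokkos or Intrepid2. ScratchView, ScratchSet, edgeParamPerCall, edgeParamCached, g_allocations and kMaxDimension are all hypothetical names made up for illustration; the counter only stands in for the cost of a device allocation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical counter standing in for the cost of a device allocation.
std::size_t g_allocations = 0;

// Hypothetical stand-in for a small view; NOT a Kokkos type.
struct ScratchView {
    std::vector<double> data;
    explicit ScratchView(std::size_t n) : data(n, 0.0) { ++g_allocations; }
};

// Assumed value for illustration only.
constexpr std::size_t kMaxDimension = 3;

// Pattern described in this issue: four fresh allocations on every call.
double edgeParamPerCall(std::size_t edge) {
    ScratchView v0(kMaxDimension), v1(kMaxDimension),
                v2(kMaxDimension), v3(kMaxDimension);
    v0.data[0] = static_cast<double>(edge);
    return v0.data[0] + v1.data[0] + v2.data[0] + v3.data[0];
}

// Four scratch buffers grouped so they can be constructed together.
struct ScratchSet {
    ScratchView v0{kMaxDimension};
    ScratchView v1{kMaxDimension};
    ScratchView v2{kMaxDimension};
    ScratchView v3{kMaxDimension};
};

// Alternative: the buffers are allocated on the first call and reused after.
double edgeParamCached(std::size_t edge) {
    static ScratchSet scratch;  // constructed once, on first use
    scratch.v0.data[0] = static_cast<double>(edge);
    return scratch.v0.data[0] + scratch.v1.data[0] +
           scratch.v2.data[0] + scratch.v3.data[0];
}
```

With 100 calls, the per-call version performs 400 allocations and the cached version performs 4. Note that the shared scratch storage in this sketch is not safe for concurrent writers, so a real fix would need to account for how the routine is called.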
How should we move forward? @kyungjoo-kim @eric-c-cyr @rppawlo @crtrott