Intrepid2 Kokkos-View-ization
Created by: bathmatt
I am adding a GitHub issue for the Intrepid2 work to make it simpler to track.
Summarizing the email below. The initial step is to remove the two MD fields in Intrepid2 and replace them with runtime-rank Kokkos views. @etphipp has a prototype dimension structure which handles FAD objects.
I'm adding all the people who have been involved in this in the past, like @eric-c-cyr @etphipp @crtrott @kyungjoo-kim @rppawlo
Here are some facts which led us to where we are today.
1: A lot of Intrepid2's functionality uses runtime-ranked arrays.
2: Intrepid2 has two different multi-dimensional arrays which are sort of interchangeable (the APIs of the two are not identical).
2A: Intrepid_FieldContainer. This uses an ArrayRCP to store data, tracks dimensions and rank, and provides a multi-index to 1D array mapping. This can't be used in Kokkos functions.
2B: Intrepid_FieldContainer_Kokkos. This uses a pointer and special code to allocate and reference data, but it does NOT use a Kokkos::View under the hood. This partially works with Kokkos; however, if you have an Intrepid_FC_K this won't work on Kokkos.
3: Phalanx has an MDField which wraps a true Kokkos view, so it should work on both CPU and GPU and with Kokkos functions. However, if it is using Fads, you need to take care to define the size when you create the MDField. If not, you will get an empty MDField because the view length is based on the Fad derivative length.
4: Intrepid2 uses arrays (field containers or some other MD array) of Fads for performing calculations like grads. However, with any of Kokkos::View, I_FC, or MDField, this code will need to be modified so one defines the Fad length at declaration. The MDField could be used for this; however, the other FCs won't perform on the GPU and can cause deadlocks due to allocations.
5: The Intrepid FC types don't roll the Fad derivative lengths into the view dimension. This can make things slow.
6: ArrayWrappers. These allow, in theory, one to send in a statically sized array and use it as a dynamically sized array. They often copy data to temps in and out of parallel regions. They make it harder to debug and may not be needed.
7: We choose the Kokkos execution space based on the memory space of the (first) array type passed into the function.
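To make facts 2, 3, and 5 a bit more concrete, here is a minimal sketch in plain C++ of a runtime-rank array that maps a multi-index to a 1D offset and rolls the Fad length into the storage size. DynRankArray and fad_size are hypothetical names made up for this illustration; this is not the Intrepid2, Phalanx, or Kokkos API, and the choice of putting the derivative entries innermost is an assumption, not how any of those libraries necessarily lay out memory.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a runtime-rank array. Not a real Intrepid2,
// Phalanx, or Kokkos type.
class DynRankArray {
public:
  // extents : visible dimensions, known only at run time.
  // fad_size: scalars stored per entry (1 for a plain double, value plus
  //           derivatives for a Fad-like scalar). Assumed to be supplied by
  //           the caller at allocation time.
  DynRankArray(std::vector<std::size_t> extents, std::size_t fad_size)
      : extents_(std::move(extents)), fad_size_(fad_size) {
    std::size_t n = fad_size_;
    for (std::size_t e : extents_) n *= e;
    data_.assign(n, 0.0);  // fad_size == 0 gives an empty allocation
  }

  std::size_t rank() const { return extents_.size(); }
  std::size_t size() const { return data_.size(); }

  // Row-major offset of the first scalar belonging to entry idx.
  std::size_t offset(const std::vector<std::size_t>& idx) const {
    assert(idx.size() == extents_.size());
    std::size_t off = 0;
    for (std::size_t d = 0; d < extents_.size(); ++d) {
      assert(idx[d] < extents_[d]);
      off = off * extents_[d] + idx[d];
    }
    return off * fad_size_;
  }

  // Value part of an entry.
  double& value(const std::vector<std::size_t>& idx) {
    return data_[offset(idx)];
  }

  // k-th derivative component of an entry (stored right after the value).
  double& deriv(const std::vector<std::size_t>& idx, std::size_t k) {
    assert(k + 1 < fad_size_);
    return data_[offset(idx) + 1 + k];
  }

private:
  std::vector<std::size_t> extents_;
  std::size_t fad_size_;
  std::vector<double> data_;
};
```

With this sketch, constructing DynRankArray({2, 3, 4}, 3) allocates room for a value and two derivative components per entry, while passing a fad_size of 0 produces an empty array, which mirrors the empty-MDField problem described in fact 3.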
So, where do we go from here? Here is how I see things. Please
1: I still believe that we need to remove the Intrepid_FC classes and replace them with something like Phalanx::MDField, or better yet, Kokkos::MDView, which is a multi-dimensional array.
2: I think we need to remove all the array wrappers, and we need to default to Kokkos::DefaultSpace.
3: We need to make everything copy Views by value.
4: We need to go through the 100s of basis types and fix up the Fad classes to make sure the length is known when we allocate temps.
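As a rough illustration of the difference between fact 7 (execution space taken from the first array argument) and proposal 2 (always use a default space), here is a small sketch in plain C++. All the types here (HostExec, DeviceExec, HostMem, DeviceMem, Array, DefaultExec) are hypothetical stand-ins I made up for the example; they are not Kokkos types, and the real dispatch in Intrepid2 may look quite different.

```cpp
#include <cassert>
#include <string>

// Hypothetical execution-space tags.
struct HostExec   { static std::string name() { return "Host"; } };
struct DeviceExec { static std::string name() { return "Device"; } };

// Hypothetical memory-space tags, each naming the execution space that
// can access it.
struct HostMem   { using execution_space = HostExec; };
struct DeviceMem { using execution_space = DeviceExec; };

// Assumed default for this sketch only.
using DefaultExec = HostExec;

// Hypothetical array type that carries its memory space in its type.
template <class T, class MemSpace>
struct Array {
  using value_type = T;
  using memory_space = MemSpace;
};

// Current behaviour as described above: the execution space follows the
// memory space of the first array argument; the others are ignored.
template <class First, class... Rest>
std::string space_from_first_arg(const First&, const Rest&...) {
  return First::memory_space::execution_space::name();
}

// Proposed behaviour: ignore the argument types and use one default.
template <class... Arrays>
std::string space_from_default(const Arrays&...) {
  return DefaultExec::name();
}
```

In this sketch, swapping the order of a device array and a host array changes what space_from_first_arg picks, while space_from_default gives the same answer regardless of argument order.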
This is a big undertaking; thankfully Intrepid2 has good test coverage. I know there will be some pushback on some of this, but we need to do something IMHO. We need to figure out how to move forward.
Matt