Tpetra::DistObject: DualView arguments have wrong types
*Created by: mhoemmen* @trilinos/tpetra @kyungjoo-kim This concerns the following DistObject virtual methods that its subclasses must implement: - copyAndPermuteNew - packAndPrepareNew - unpackAndCombineNew As of PR #4328, we have clarified how `DistObject::doTransferNew` expects these methods to behave, with respect to their DualView parameters. Export and Import setup is responsible for ensuring that `permuteToLIDs`, `permuteFromLIDs`, `exportLIDs`, and `remoteLIDs` are sync'd on both host and device. Thus, the above methods need not sync any of these four DualViews. As a result, their type `const DualView<const LO*, buffer_device_type>&` is correct. (`const DualView&` means you can't sync them; `DualView<const LO*, ...>` means you can't modify their entries.) PR #4328 further clarifies that `packAndPrepareNew` may fill its `exports` and `numPacketsPerLID` DualView parameters wherever the subclass likes, either on host or device. It is not responsible for sync'ing either of these two DualViews to any particular place; `DistObject::doTransferNew` must sync them wherever it needs them to be. Similarly, `unpackAndCombineNew` may unpack received data (`imports` and `numPacketsPerLIDs`) wherever it wants, either on host or device, but is responsible for sync'ing them wherever it needs them. `DistObject::doTransferNew` is not responsible for sync'ing them. I justify the previous paragraph as follows: - Avoid unnecessary syncs to host, if the subclass knows how to unpack on device - Separation of concerns between communication and pack / unpack This imposes the following requirements on `exports`, `imports`, and `numPacketsPerLIDs`: 1. `DistObject::doTransferNew` allocates the `numPacketsPerLIDs` argument when it calls `packAndPrepareNew`, but `packAndPrepareNew` must allocate the `exports` output argument. Thus, subclasses cannot reallocate `numPacketsPerLID`. 2. `DistObject::doTransferNew` allocates the `imports` and `numPacketsPerLIDs` arguments when it calls `unpackAndCombineNew`; subclasses cannot reallocate them. 3. Subclasses need to be able to sync any of these arrays wherever they like. 4. `packAndPrepareNew` implementations access `exports` and `numPacketsPerLIDs` in write-only fashion. 5. `unpackAndPrepareNew` implementations technically only need read-only access to `imports` and `numPacketsPerLIDs`. However, `DistObject::doTransferNew` only accesses them in write-only fashion before calling `unpackAndPrepareNew`, and doesn't access them afterwards. Thus, `unpackAndPrepareNew` implementations could freely write to these arrays, e.g., to use them as scratch space. This means that the correct type of `imports` and `numPacketsPerLIDs` is `DualView<T*, ...>`. - Can't have `DualView<const T*, ...>`, because that would forbid sync'ing. - Can't have `const DualView&` or `const DualView`, because that would forbid sync'ing. - Can't have `DualView&`, because that would expose DistObject's internal DualViews to reallocation by subclasses. Passing by value means that even if the subclass reallocates, the caller won't see that. It's just like passing `double*` into a function, instead of `double*&`. Currently, these arrays have type `const DualView<const T*, ...>&`. This makes implementations do hacks in order to meet the above four requirements (esp. (2)). I propose changing the types of `imports`, and `numPacketsPerLID` (as parameters to `packAndPrepareNew` and `unpackAndCombineNew`) from `const DualView<const T*, ...>&`, to `DualView<T*, ...>`.
issue