Tpetra::DistObject: DualView arguments have wrong types
Created by: mhoemmen
@trilinos/tpetra @kyungjoo-kim
This concerns the following DistObject virtual methods that its subclasses must implement:
- copyAndPermuteNew
- packAndPrepareNew
- unpackAndCombineNew
As of PR #4328, we have clarified how DistObject::doTransferNew
expects these methods to behave, with respect to their DualView parameters. Export and Import setup is responsible for ensuring that permuteToLIDs
, permuteFromLIDs
, exportLIDs
, and remoteLIDs
are sync'd on both host and device. Thus, the above methods need not sync any of these four DualViews. As a result, their type const DualView<const LO*, buffer_device_type>&
is correct. (const DualView&
means you can't sync them; DualView<const LO*, ...>
means you can't modify their entries.)
PR #4328 further clarifies that packAndPrepareNew
may fill its exports
and numPacketsPerLID
DualView parameters wherever the subclass likes, either on host or device. It is not responsible for sync'ing either of these two DualViews to any particular place; DistObject::doTransferNew
must sync them wherever it needs them to be. Similarly, unpackAndCombineNew
may unpack received data (imports
and numPacketsPerLIDs
) wherever it wants, either on host or device, but is responsible for sync'ing them wherever it needs them. DistObject::doTransferNew
is not responsible for sync'ing them.
I justify the previous paragraph as follows:
- Avoid unnecessary syncs to host, if the subclass knows how to unpack on device
- Separation of concerns between communication and pack / unpack
This imposes the following requirements on exports
, imports
, and numPacketsPerLIDs
:
-
DistObject::doTransferNew
allocates thenumPacketsPerLIDs
argument when it callspackAndPrepareNew
, butpackAndPrepareNew
must allocate theexports
output argument. Thus, subclasses cannot reallocatenumPacketsPerLID
. -
DistObject::doTransferNew
allocates theimports
andnumPacketsPerLIDs
arguments when it callsunpackAndCombineNew
; subclasses cannot reallocate them. - Subclasses need to be able to sync any of these arrays wherever they like.
-
packAndPrepareNew
implementations accessexports
andnumPacketsPerLIDs
in write-only fashion. -
unpackAndPrepareNew
implementations technically only need read-only access toimports
andnumPacketsPerLIDs
. However,DistObject::doTransferNew
only accesses them in write-only fashion before callingunpackAndPrepareNew
, and doesn't access them afterwards. Thus,unpackAndPrepareNew
implementations could freely write to these arrays, e.g., to use them as scratch space.
This means that the correct type of imports
and numPacketsPerLIDs
is DualView<T*, ...>
.
- Can't have
DualView<const T*, ...>
, because that would forbid sync'ing. - Can't have
const DualView&
orconst DualView
, because that would forbid sync'ing. - Can't have
DualView&
, because that would expose DistObject's internal DualViews to reallocation by subclasses. Passing by value means that even if the subclass reallocates, the caller won't see that. It's just like passingdouble*
into a function, instead ofdouble*&
.
Currently, these arrays have type const DualView<const T*, ...>&
. This makes implementations do hacks in order to meet the above four requirements (esp. (2)).
I propose changing the types of imports
, and numPacketsPerLID
(as parameters to packAndPrepareNew
and unpackAndCombineNew
) from const DualView<const T*, ...>&
, to DualView<T*, ...>
.