Tpetra: Need work-around for GCC 4.7.2 bug manifesting as Tpetra::Details::idot test failure
Created by: mhoemmen
@trilinos/tpetra
My implementation of Tpetra::Details::idot for noncontiguous Tpetra::MultiVector inputs uses a C++11 lambda to implement a deferred action. The implementation looks like this:
    typedef ::Tpetra::Details::CommRequest req_base_type;
    std::vector<std::shared_ptr<req_base_type> > requests (numVecs);
    for (size_t j = 0; j < numVecs; ++j) {
      RCP<const vec_type> X_j = X.getVector (j);
      RCP<const vec_type> Y_j = Y.getVector (j);
      auto result_j = subview (result, j);
      requests[j] = idot<SC, LO, GO, NT> (result_j, *X_j, *Y_j);
    }
    typedef ::Tpetra::Details::Impl::DeferredActionCommRequest req_type;
    // Default capture by value ([=]): this is the form that triggers the GCC 4.7.2 warning.
    return std::shared_ptr<req_base_type> (new req_type ([=] () {
      for (size_t j = 0; j < numVecs; ++j) {
        if (requests[j].get () != NULL) {
          requests[j]->wait ();
        }
      }
    }));
GCC 4.7.2 warned that numVecs was being used uninitialized. This seemed odd, because (a) neither Clang 3.9.0 nor CUDA 7.5 with GCC 4.8.4 issued these warnings, and (b) numVecs is clearly initialized before the lambda. Even worse, the test I recently added (not in the develop branch yet) that exercises this use case fails with GCC 4.7.2, but not with the other compilers. GCC 4.7.2 has given me trouble with lambdas before, so I suspect this is a compiler bug.
It turns out that if I list the captured variables explicitly in the capture clause, instead of relying on the default capture [=], the "uninitialized" warning goes away and the tests pass. I just need to implement and push this fix. It looks like this:
    // Explicit capture list: numVecs and requests are both copied by value.
    return std::shared_ptr<req_base_type> (new req_type ([numVecs, requests] () {
      for (size_t j = 0; j < numVecs; ++j) {
        if (requests[j].get () != NULL) {
          requests[j]->wait ();
        }
      }
    }));
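For anyone who wants to try the pattern outside of Tpetra, here is a minimal standalone sketch of the explicit-capture version. All the names in it (Request, CountingRequest, DeferredRequest, makeCombinedRequest, countWaits) are hypothetical stand-ins that I made up for illustration; they are not the actual Tpetra::Details classes, and the real DeferredActionCommRequest may well differ. It only shows the shape of the work-around: the lambda names numVecs and requests in its capture list rather than using [=].

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a communication request base class.
struct Request {
  virtual ~Request () {}
  virtual void wait () = 0;
};

// Hypothetical request that just counts how many times wait() is called.
struct CountingRequest : public Request {
  explicit CountingRequest (int& counter) : counter_ (counter) {}
  void wait () override { ++counter_; }
  int& counter_;
};

// Hypothetical deferred-action request: runs a stored callable on wait().
class DeferredRequest : public Request {
public:
  explicit DeferredRequest (std::function<void ()> action) :
    action_ (std::move (action)) {}
  void wait () override { action_ (); }
private:
  std::function<void ()> action_;
};

// Combine several requests into one whose wait() waits on each of them.
std::shared_ptr<Request>
makeCombinedRequest (const std::vector<std::shared_ptr<Request> >& requests)
{
  const std::size_t numVecs = requests.size ();
  // Explicit capture list: both variables are copied into the closure,
  // so the closure stays valid after this function returns.
  return std::shared_ptr<Request> (new DeferredRequest ([numVecs, requests] () {
    for (std::size_t j = 0; j < numVecs; ++j) {
      if (requests[j].get () != nullptr) {
        requests[j]->wait ();
      }
    }
  }));
}

// Build n counting requests plus one null entry, wait on the combined
// request, and return how many wait() calls actually happened.
int countWaits (const std::size_t n)
{
  int counter = 0;
  std::vector<std::shared_ptr<Request> > reqs;
  for (std::size_t j = 0; j < n; ++j) {
    reqs.push_back (std::make_shared<CountingRequest> (counter));
  }
  reqs.push_back (std::shared_ptr<Request> ()); // null entry is skipped
  std::shared_ptr<Request> combined = makeCombinedRequest (reqs);
  assert (combined.get () != nullptr);
  combined->wait ();
  return counter;
}
```

Calling countWaits (n) should report one wait() per non-null request. With a conforming compiler, [=] and [numVecs, requests] should behave identically here, since both copy the two variables by value. The explicit list is only a way to sidestep the suspected GCC 4.7.2 problem, and it also documents exactly what the closure holds.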