Re-enable the ROL_NonlinearProblemTest_MPI_4 test for CUDA testing
Created by: prwolfe
This test was disabled in #4572 to keep PRs moving. A large number of other ROL tests are also disabled for CUDA, which suggests an underlying issue with the code, threading, etc.
Discover the underlying problem and re-enable this test when possible to support downstream users.
This test runs in about 10 seconds on CPU, but takes anywhere from 400 to 600 seconds on white and ride.
Motivation and Context
This test kept several unrelated pull requests from testing cleanly over the past week.