Commit 38ad38f2 authored by Andrey Prokopenko's avatar Andrey Prokopenko

MueLu: removed "repartition: keep proc 0" option

and corresponding code in RepartitionFactory.

Note: the parallel interface tests now miss a chunk of output, as only
processor 0 can write into a file. This could be fixed by:
a) writing a script that calls the executable and redirects the output
   into a file. This relies on MPI combining std::cout output and on
   system utilities to do the redirect. Pretty easy to implement.
b) upgrading FancyOStream to work with parallel file IO.
c) making each processor write into a local buffer and combining those
   buffers at the end of the run. Problematic, as combining would
   require timestamps or something similar (see the sketch below).

I think a) is the way to go, and should be easy enough to implement.
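
For reference, a minimal standalone sketch of option c), assuming plain MPI:
each rank logs into a local buffer, and rank 0 gathers and prints the buffers
at the end of the run. The buffers come out in rank order, not chronological
order, which is exactly the interleaving problem mentioned above. All names
here are illustrative, not MueLu code.

    #include <mpi.h>
    #include <iostream>
    #include <sstream>
    #include <string>
    #include <vector>

    int main(int argc, char** argv) {
      MPI_Init(&argc, &argv);
      int rank, numProcs;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &numProcs);

      std::ostringstream log;                    // per-rank buffer
      log << "[p" << rank << "] some output\n";  // stands in for stream writes

      std::string local = log.str();
      int localLen = static_cast<int>(local.size());

      // Rank 0 learns how much each rank wants to send...
      std::vector<int> lens(numProcs), offsets(numProcs);
      MPI_Gather(&localLen, 1, MPI_INT, lens.data(), 1, MPI_INT, 0, MPI_COMM_WORLD);

      int total = 0;
      if (rank == 0)
        for (int p = 0; p < numProcs; p++) { offsets[p] = total; total += lens[p]; }

      // ...then gathers the buffers and writes them out in rank order.
      std::vector<char> combined(rank == 0 ? total : 0);
      MPI_Gatherv(local.data(), localLen, MPI_CHAR,
                  combined.data(), lens.data(), offsets.data(), MPI_CHAR,
                  0, MPI_COMM_WORLD);
      if (rank == 0 && total > 0)
        std::cout.write(combined.data(), total);

      MPI_Finalize();
      return 0;
    }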
parent 9e0a99f0
@@ -341,14 +341,6 @@
<visible>false</visible>
</parameter>
<parameter>
<name>repartition: keep proc 0</name>
<type>bool</type>
<default>true</default>
<description>Postprocessing for partitioning to keep processor 0 from dropping out. The goal is to keep processor 0 in the original fine level communicator even when we use level subcommunicators.</description>
<visible>false</visible>
</parameter>
<parameter>
<name>repartition: print partition distribution</name>
<type>bool</type>
@@ -93,8 +93,6 @@
\cbb{repartition: remap num values}{int}{4}{Maximum number of components from each processor used to construct the partial bipartite graph.}
\cbb{repartition: keep proc 0}{bool}{true}{Postprocessing for partitioning to keep processor 0 from dropping out. The goal is to keep processor 0 in the original fine level communicator even when we use level subcommunicators.}
\cbb{repartition: print partition distribution}{bool}{false}{Print partition distribution with '+' and '.'}
\cbb{repartition: rebalance P and R}{bool}{true}{Do rebalancing of R and P during the setup. This speeds up the solve, but slows down the setup phases.}
@@ -840,7 +840,7 @@ namespace MueLu {
int strLength = outstr.size();
MPI_Bcast(&strLength, 1, MPI_INT, root, rawComm);
if (comm->getRank() != root)
- outstr.resize(strLength+1);
+ outstr.resize(strLength);
MPI_Bcast(&outstr[0], strLength, MPI_CHAR, root, rawComm);
#endif
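
The one-line change above fixes an off-by-one: the old code resized the
receive string to strLength+1, leaving a stray extra character on the
non-root ranks, while the broadcast sends exactly strLength characters
(std::string manages its own terminator). A self-contained sketch of the
idiom, illustrative only, not the MueLu source:

    #include <mpi.h>
    #include <iostream>
    #include <string>

    int main(int argc, char** argv) {
      MPI_Init(&argc, &argv);
      int rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      const int root = 0;

      std::string outstr;
      if (rank == root) outstr = "hello from root";

      int strLength = static_cast<int>(outstr.size());
      MPI_Bcast(&strLength, 1, MPI_INT, root, MPI_COMM_WORLD);
      if (rank != root)
        outstr.resize(strLength);  // exactly strLength, not strLength+1
      MPI_Bcast(&outstr[0], strLength, MPI_CHAR, root, MPI_COMM_WORLD);

      std::cout << "[p" << rank << "] " << outstr << "\n";
      MPI_Finalize();
      return 0;
    }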
@@ -107,7 +107,6 @@ namespace MueLu {
"<Parameter name=\"repartition: max imbalance\" type=\"double\" value=\"1.2\"/>"
"<Parameter name=\"repartition: remap parts\" type=\"bool\" value=\"true\"/>"
"<Parameter name=\"repartition: remap num values\" type=\"int\" value=\"4\"/>"
"<Parameter name=\"repartition: keep proc 0\" type=\"bool\" value=\"true\"/>"
"<Parameter name=\"repartition: print partition distribution\" type=\"bool\" value=\"false\"/>"
"<Parameter name=\"repartition: rebalance P and R\" type=\"bool\" value=\"true\"/>"
"<Parameter name=\"semicoarsen: coarsen rate\" type=\"int\" value=\"3\"/>"
@@ -123,7 +123,7 @@ namespace MueLu {
Partitions are assigned to processes so as to minimize data movement. The basic idea is that a good owner for a
partition is the pid that already holds the greatest number of nonzeros of that partition.
*/
- void DeterminePartitionPlacement(const Matrix& A, GOVector& decomposition, GO numPartitions, bool keepProc0) const;
+ void DeterminePartitionPlacement(const Matrix& A, GOVector& decomposition, GO numPartitions) const;
}; // class RepartitionFactory
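
A hedged sketch of the greedy idea described in the comment above: consider
(nonzeros, part, pid) edges in decreasing weight order and accept an edge
whenever both the part and the pid are still unmatched. The real factory
builds a partial bipartite graph (at most "repartition: remap num values"
candidate edges per rank) and works in parallel; this standalone serial
version is illustrative only.

    #include <algorithm>
    #include <tuple>
    #include <vector>

    // Returns match[part] = pid, or -1 if the part stayed unmatched.
    std::vector<int> matchPartsToRanks(std::vector<std::tuple<long,int,int>> edges, // (nnz, part, pid)
                                       int numParts, int numProcs) {
      std::sort(edges.begin(), edges.end(), std::greater<>());
      std::vector<int>  match(numParts, -1);
      std::vector<char> rankUsed(numProcs, 0);
      for (const auto& [nnz, part, pid] : edges)
        if (match[part] == -1 && !rankUsed[pid]) {
          match[part]   = pid;  // pid already owns the most data for this part
          rankUsed[pid] = 1;
        }
      return match;  // unmatched parts get assigned to free ranks in a later step
    }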
@@ -82,7 +82,6 @@ namespace MueLu {
SET_VALID_ENTRY("repartition: start level");
SET_VALID_ENTRY("repartition: min rows per proc");
SET_VALID_ENTRY("repartition: max imbalance");
SET_VALID_ENTRY("repartition: keep proc 0");
SET_VALID_ENTRY("repartition: print partition distribution");
SET_VALID_ENTRY("repartition: remap parts");
SET_VALID_ENTRY("repartition: remap num values");
@@ -119,7 +118,6 @@ namespace MueLu {
const LO minRowsPerProcessor = pL.get<LO> ("repartition: min rows per proc");
const double nonzeroImbalance = pL.get<double>("repartition: max imbalance");
const bool remapPartitions = pL.get<bool> ("repartition: remap parts");
const bool keepProc0 = pL.get<bool> ("repartition: keep proc 0");
// TODO: We only need a CrsGraph. This class does not have to be templated on Scalar types.
RCP<Matrix> A = Get< RCP<Matrix> >(currentLevel, "A");
@@ -303,7 +301,7 @@ namespace MueLu {
if (remapPartitions) {
SubFactoryMonitor m1(*this, "DeterminePartitionPlacement", currentLevel);
- DeterminePartitionPlacement(*A, *decomposition, numPartitions, keepProc0);
+ DeterminePartitionPlacement(*A, *decomposition, numPartitions);
}
// ======================================================================================================
@@ -349,93 +347,6 @@ namespace MueLu {
else
sendMap[id].push_back(GID);
}
if (keepProc0 && !remapPartitions) {
// Figuring out how to keep processor 0 is easily and cheaply done in DeterminePartitionPlacement.
// Here, we are in the situation where DeterminePartitionPlacement was not called, but the user still
// asked us to keep processor 0. Figuring that out is going to be slightly more difficult.
// First, let's see if processor 0 gets any data. If it does, there is no need to do anything.
// For that, let's calculate the smallest valid part id.
GO oldPartId, minPartId = Teuchos::OrdinalTraits<GO>::max();
if (myGIDs.size()) minPartId = std::min(minPartId, Teuchos::as<GO>(myRank));
if (sendMap.size()) minPartId = std::min(minPartId, sendMap.begin()->first);
minAll(comm, minPartId, oldPartId);
if (oldPartId == 0) {
// Somebody owns a part with id 0. That means processor 0 gets some data, even if it did
// not have any originally. Our work is done.
GetOStream(Statistics0) << "No remapping is necessary despite that \"alwaysKeepProc0\" option is on,"
" as processor 0 already receives some data" << std::endl;
} else if (oldPartId == Teuchos::OrdinalTraits<GO>::max()) {
// This is weird: nobody has any data. Nothing can be done.
} else {
// There is no partition with id 0, which means processor 0 gets no data. We have to do some extra
// legwork. Specifically, we want to select a part such that the process owning it holds only a
// small fraction of that part.
// NOTE: one could also try minimizing the number of owned GIDs of that part, but assuming
// good load balancing these metrics are the same.
// Here is a neat trick: we can send the minimized value along with the part id using a single
// double. We use the first numFracDigits digits of the mantissa for the actual fraction, and the
// numProcDigits digits after that for storing the part id.
// NOTE: we need 10^{numAllDigits} to be smaller than INT_MAX
const int numFracDigits = 2, numProcDigits = 7, numAllDigits = numFracDigits + numProcDigits;
const double powF = pow(10.0, numFracDigits), powP = pow(10.0, numProcDigits), powD = pow(10.0, numAllDigits);
TEUCHOS_TEST_FOR_EXCEPTION(numProcs > powP, Exceptions::RuntimeError, "Time to update the constant!");
double defaultFrac = 1.1, frac = defaultFrac, fracMin;
if (myGIDs.size()) {
frac = Teuchos::as<double>(myGIDs.size())/decompEntries.size();
} else {
// Some of the processors may have a myGIDs size equal to zero. There are two ways one could get that:
// 1) somebody sends pieces of the part with id = this processor's id to it;
// 2) there is no part id corresponding to this processor's id.
// Differentiating between these two would require a lot more communication. Therefore, we exclude
// all parts with no local GIDs from consideration. This results in a suboptimal algorithm, but
// with no extra communication.
}
frac = (floor(frac*powF))/powF; // truncate the fraction to the first numFracDigits digits
frac = (floor(frac*powD) + myRank)/powD; // store the part id in the trailing digits
minAll(comm, frac, fracMin);
if (fracMin < defaultFrac) {
// Somebody sent some useful information
oldPartId = Teuchos::as<int>(fracMin*powD) % Teuchos::as<int>(powP); // decode
} else {
// Something weird is going on. This probably means that nobody keeps any of its data
}
GetOStream(Statistics0) << "Remapping part " << oldPartId << " to processor 0 as \"alwaysKeepProc0\" option is on" << std::endl;
// Swap partitions
// If a processor has a piece of the partition with id = oldPartId, that means it sends data to it, unless
// its rank is also oldPartId, in which case some data is stored in myGIDs.
if (myRank != 0 && myRank != oldPartId && sendMap.count(oldPartId)) {
// We know for sure that there is no partition with id = 0 (there was a test for that). So we create one
// and swap its data with the existing partition.
sendMap[0].swap(sendMap[oldPartId]);
sendMap.erase(oldPartId);
} else if (myRank == oldPartId && myGIDs.size()) {
// We know for sure that there is no partition with id = 0 (there was a test for that). As all our data
// belongs to processor 0 now, we move the data from myGIDs to the send array.
sendMap[0].swap(myGIDs);
} else if (myRank == 0 && sendMap.count(oldPartId)) {
// We have some data that we would send to processor oldPartId in the original distribution. We own that
// data now, so we merge it into the myGIDs array.
int offset = myGIDs.size(), len = sendMap[oldPartId].size();
myGIDs.resize(offset + len);
memcpy(myGIDs.getRawPtr() + offset, sendMap[oldPartId].getRawPtr(), len*sizeof(GO));
sendMap.erase(oldPartId);
}
}
}
decompEntries = Teuchos::null;
if (IsPrint(Statistics2)) {
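
The encode/decode trick in the removed block above is worth spelling out. A
standalone sketch follows; the values are made up, and the real code uses
Teuchos conversions and a minAll() reduction over the communicator rather
than the stand-ins here.

    #include <cmath>
    #include <cstdio>

    int main() {
      const int    numFracDigits = 2, numProcDigits = 7;
      const double powF = std::pow(10.0, numFracDigits);                   // 1e2
      const double powP = std::pow(10.0, numProcDigits);                   // 1e7
      const double powD = std::pow(10.0, numFracDigits + numProcDigits);   // 1e9 < INT_MAX

      double frac   = 0.3781;  // fraction of the part this rank owns
      int    myRank = 42;      // rank id to smuggle along with the fraction

      frac = std::floor(frac * powF) / powF;                       // keep 2 digits: 0.37
      double encoded = (std::floor(frac * powD) + myRank) / powD;  // 0.370000042

      // A min-reduction over `encoded` would run here; minimizing the leading
      // digits (the fraction) also selects whose rank sits in the trailing digits.
      double fracMin = encoded;

      // Decode: llround guards against the product landing just below an integer.
      int id = static_cast<int>(std::llround(fracMin * powD)) % static_cast<int>(powP);
      std::printf("encoded = %.9f, decoded rank = %d\n", encoded, id);
      return 0;
    }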
@@ -555,8 +466,8 @@ namespace MueLu {
}
template <class Scalar, class LocalOrdinal, class GlobalOrdinal, class Node>
- void RepartitionFactory<Scalar, LocalOrdinal, GlobalOrdinal, Node>::DeterminePartitionPlacement(const Matrix& A, GOVector& decomposition,
-                                                                                                 GO numPartitions, bool keepProc0) const {
+ void RepartitionFactory<Scalar, LocalOrdinal, GlobalOrdinal, Node>::
+ DeterminePartitionPlacement(const Matrix& A, GOVector& decomposition, GO numPartitions) const {
RCP<const Map> rowMap = A.getRowMap();
RCP<const Teuchos::Comm<int> > comm = rowMap->getComm()->duplicate();
@@ -654,23 +565,7 @@ namespace MueLu {
}
GetOStream(Statistics0) << "Number of unassigned paritions before cleanup stage: " << (numPartitions - numMatched) << " / " << numPartitions << std::endl;
// Step 4 [optional]: Keep processor 0
if (keepProc0) {
if (matchedRanks[0] == 0) {
// Reassign a partition to processor 0
// The hope is that the partition we matched last has few elements in it
GetOStream(Statistics0) << "Remapping part " << lastMatchedPart << " to processor 0 as \"alwaysKeepProc0\" option is on" << std::endl;
matchedRanks[match[lastMatchedPart]] = 0; // unassign processor which was matched to lastMatchedPart part
matchedRanks[0] = 1; // assign processor 0
match[lastMatchedPart] = 0; // match part to processor 0
} else {
GetOStream(Statistics0) << "No remapping is necessary despite that \"alwaysKeepProc0\" option is on,"
" as processor 0 already receives some data" << std::endl;
}
}
- // Step 5: Assign unassigned partitions
+ // Step 4: Assign unassigned partitions
// We do that through random matching for the remaining partitions. Not all part numbers are valid, but the valid parts are a subset of [0, numProcs).
// The reason it is done this way is that we need no extra communication, as we do not need to know which parts are valid.
for (int part = 0, matcher = 0; part < numProcs; part++)
@@ -682,7 +577,7 @@ namespace MueLu {
match[part] = matcher++;
}
- // Step 6: Permute entries in the decomposition vector
+ // Step 5: Permute entries in the decomposition vector
for (LO i = 0; i < decompEntries.size(); i++)
decompEntries[i] = match[decompEntries[i]];
}
@@ -39,7 +39,6 @@
<Parameter name="repartition: max imbalance" type="double" value="1.327"/>
<Parameter name="repartition: start level" type="int" value="1"/>
<Parameter name="repartition: remap parts" type="bool" value="true"/>
<Parameter name="repartition: keep proc 0" type="bool" value="true"/>
<Parameter name="repartition: partitioner" type="string" value="zoltan2"/>
<ParameterList name="repartition: params">
<Parameter name="algorithm" type="string" value="multijagged"/>
@@ -39,7 +39,6 @@
<Parameter name="repartition: max imbalance" type="double" value="1.327"/>
<Parameter name="repartition: start level" type="int" value="1"/>
<Parameter name="repartition: remap parts" type="bool" value="true"/>
<Parameter name="repartition: keep proc 0" type="bool" value="true"/>
<Parameter name="repartition: partitioner" type="string" value="zoltan2"/>
<ParameterList name="repartition: params">
<Parameter name="algorithm" type="string" value="multijagged"/>
@@ -66,7 +66,6 @@
<Parameter name="repartition: max imbalance" type="double" value="1.327"/>
<Parameter name="repartition: start level" type="int" value="1"/>
<Parameter name="repartition: remap parts" type="bool" value="true"/>
<Parameter name="repartition: keep proc 0" type="bool" value="true"/>
</ParameterList>
<ParameterList name="myRebalanceProlongatorFact">
@@ -67,7 +67,6 @@
<Parameter name="repartition: max imbalance" type="double" value="1.327"/>
<Parameter name="repartition: start level" type="int" value="1"/>
<Parameter name="repartition: remap parts" type="bool" value="true"/>
<Parameter name="repartition: keep proc 0" type="bool" value="true"/>
</ParameterList>
<ParameterList name="myRebalanceProlongatorFact">
@@ -75,7 +75,6 @@ Level 1
repartition: start level = 1
repartition: min rows per proc = 2000
repartition: max imbalance = 1.327
repartition: keep proc 0 = 1
repartition: print partition distribution = 0 [default]
repartition: remap parts = 1
repartition: remap num values = 4 [default]
@@ -174,7 +173,6 @@ Level 2
repartition: start level = 1
repartition: min rows per proc = 2000
repartition: max imbalance = 1.327
repartition: keep proc 0 = 1
repartition: print partition distribution = 0 [default]
repartition: remap parts = 1
repartition: remap num values = 4 [default]
@@ -273,7 +271,6 @@ Level 3
repartition: start level = 1
repartition: min rows per proc = 2000
repartition: max imbalance = 1.327
repartition: keep proc 0 = 1
repartition: print partition distribution = 0 [default]
repartition: remap parts = 1
repartition: remap num values = 4 [default]
@@ -75,7 +75,6 @@ Level 1
repartition: start level = 1
repartition: min rows per proc = 2000
repartition: max imbalance = 1.327
repartition: keep proc 0 = 1
repartition: print partition distribution = 0 [default]
repartition: remap parts = 1
repartition: remap num values = 4 [default]
@@ -98,208 +97,6 @@ Level 1
Computing Ac (MueLu::RebalanceAcFactory)
useSubcomm = 1 [default]
Setup Smoother (MueLu::IfpackSmoother{type = Chebyshev})
chebyshev: degree = 2
chebyshev: ratio eigenvalue = 20 [unused]
chebyshev: min eigenvalue = 1
chebyshev: zero starting solution = 1
chebyshev: eigenvalue max iterations = 10
Level 2
Build (MueLu::RebalanceTransferFactory)
Build (MueLu::RepartitionFactory)
Computing Ac (MueLu::RAPFactory)
Prolongator smoothing (MueLu::SaPFactory)
Matrix filtering (MueLu::FilteredAFactory)
Build (MueLu::CoalesceDropFactory)
aggregation: drop tol = 0 [default]
aggregation: Dirichlet threshold = 0 [default]
aggregation: drop scheme = distance laplacian
lightweight wrap = 1
filtered matrix: use lumping = 1 [unused]
filtered matrix: reuse graph = 1 [default]
filtered matrix: reuse eigenvalue = 1 [default]
Build (MueLu::TentativePFactory)
Build (MueLu::UncoupledAggregationFactory)
aggregation: mode = old [default]
aggregation: max agg size = 2147483647 [default]
aggregation: min agg size = 2 [default]
aggregation: max selected neighbors = 0 [default]
aggregation: ordering = natural [default]
aggregation: enable phase 1 = 1 [default]
aggregation: enable phase 2a = 1 [default]
aggregation: enable phase 2b = 1 [default]
aggregation: enable phase 3 = 1 [default]
aggregation: preserve Dirichlet points = 0 [default]
UseOnePtAggregationAlgorithm = 0 [default]
UsePreserveDirichletAggregationAlgorithm = 0 [default]
UseUncoupledAggregationAlgorithm = 1 [default]
UseMaxLinkAggregationAlgorithm = 1 [default]
UseIsolatedNodeAggregationAlgorithm = 1 [default]
UseEmergencyAggregationAlgorithm = 1 [default]
OnePt aggregate map name = [default]
Build (MueLu::AmalgamationFactory)
[empty list]
Nullspace factory (MueLu::NullspaceFactory)
Fine level nullspace = Nullspace
Build (MueLu::CoarseMapFactory)
Striding info = {} [default]
Strided block id = -1 [default]
Domain GID offsets = {0} [default]
[empty list]
sa: damping factor = 1.33333 [default]
sa: calculate eigenvalue estimate = 0 [default]
sa: eigenvalue estimate num iterations = 10 [default]
Transpose P (MueLu::TransPFactory)
[empty list]
Build (MueLu::CoordinatesTransferFactory)
write start = -1 [default]
write end = -1 [default]
transpose: use implicit = 0 [default]
Keep AP Pattern = 0 [default]
Keep RAP Pattern = 0 [default]
CheckMainDiagonal = 0 [default]
RepairMainDiagonal = 0 [default]
repartition: start level = 1
repartition: min rows per proc = 2000
repartition: max imbalance = 1.327
repartition: keep proc 0 = 1
repartition: print partition distribution = 0 [default]
repartition: remap parts = 1
repartition: remap num values = 4 [default]
repartition: rebalance P and R = 1 [default]
transpose: use implicit = 0 [default]
useSubcomm = 1 [default]
type = Interpolation
write start = -1 [default]
write end = -1 [default]
Build (MueLu::RebalanceTransferFactory)
repartition: rebalance P and R = 1 [default]
transpose: use implicit = 0 [default]
useSubcomm = 1 [default]
type = Restriction
write start = -1 [default]
write end = -1 [default]
Computing Ac (MueLu::RebalanceAcFactory)
useSubcomm = 1 [default]
Setup Smoother (MueLu::IfpackSmoother{type = Chebyshev})
chebyshev: degree = 2
chebyshev: ratio eigenvalue = 20 [unused]
chebyshev: min eigenvalue = 1
chebyshev: zero starting solution = 1
chebyshev: eigenvalue max iterations = 10
Level 3
Build (MueLu::RebalanceTransferFactory)
Build (MueLu::RepartitionFactory)
Computing Ac (MueLu::RAPFactory)
Prolongator smoothing (MueLu::SaPFactory)
Matrix filtering (MueLu::FilteredAFactory)
Build (MueLu::CoalesceDropFactory)
aggregation: drop tol = 0 [default]
aggregation: Dirichlet threshold = 0 [default]
aggregation: drop scheme = distance laplacian
lightweight wrap = 1
filtered matrix: use lumping = 1 [unused]
filtered matrix: reuse graph = 1 [default]
filtered matrix: reuse eigenvalue = 1 [default]
Build (MueLu::TentativePFactory)
Build (MueLu::UncoupledAggregationFactory)
aggregation: mode = old [default]
aggregation: max agg size = 2147483647 [default]
aggregation: min agg size = 2 [default]
aggregation: max selected neighbors = 0 [default]
aggregation: ordering = natural [default]
aggregation: enable phase 1 = 1 [default]
aggregation: enable phase 2a = 1 [default]
aggregation: enable phase 2b = 1 [default]
aggregation: enable phase 3 = 1 [default]
aggregation: preserve Dirichlet points = 0 [default]
UseOnePtAggregationAlgorithm = 0 [default]
UsePreserveDirichletAggregationAlgorithm = 0 [default]
UseUncoupledAggregationAlgorithm = 1 [default]
UseMaxLinkAggregationAlgorithm = 1 [default]
UseIsolatedNodeAggregationAlgorithm = 1 [default]
UseEmergencyAggregationAlgorithm = 1 [default]
OnePt aggregate map name = [default]
Build (MueLu::AmalgamationFactory)
[empty list]
Nullspace factory (MueLu::NullspaceFactory)
Fine level nullspace = Nullspace
Build (MueLu::CoarseMapFactory)
Striding info = {} [default]
Strided block id = -1 [default]
Domain GID offsets = {0} [default]
[empty list]
sa: damping factor = 1.33333 [default]
sa: calculate eigenvalue estimate = 0 [default]
sa: eigenvalue estimate num iterations = 10 [default]
Transpose P (MueLu::TransPFactory)
[empty list]
Build (MueLu::CoordinatesTransferFactory)
write start = -1 [default]
write end = -1 [default]
transpose: use implicit = 0 [default]
Keep AP Pattern = 0 [default]
Keep RAP Pattern = 0 [default]
CheckMainDiagonal = 0 [default]
RepairMainDiagonal = 0 [default]
repartition: start level = 1
repartition: min rows per proc = 2000
repartition: max imbalance = 1.327
repartition: keep proc 0 = 1
repartition: print partition distribution = 0 [default]
repartition: remap parts = 1
repartition: remap num values = 4 [default]
repartition: rebalance P and R = 1 [default]
transpose: use implicit = 0 [default]
useSubcomm = 1 [default]
type = Interpolation
write start = -1 [default]
write end = -1 [default]
Build (MueLu::RebalanceTransferFactory)
repartition: rebalance P and R = 1 [default]
transpose: use implicit = 0 [default]
useSubcomm = 1 [default]
type = Restriction
write start = -1 [default]
write end = -1 [default]
Computing Ac (MueLu::RebalanceAcFactory)
useSubcomm = 1 [default]
Setup Smoother (MueLu::AmesosSmoother{type = Superlu})
presmoother ->
[empty list]
--------------------------------------------------------------------------------
--- Multigrid Summary ---
@@ -77,7 +77,6 @@ Level 1
repartition: start level = 1
repartition: min rows per proc = 2000
repartition: max imbalance = 1.327
repartition: keep proc 0 = 1
repartition: print partition distribution = 0 [default]
repartition: remap parts = 1
repartition: remap num values = 4 [default]
@@ -100,212 +99,6 @@ Level 1
Computing Ac (MueLu::RebalanceAcFactory)
useSubcomm = 1 [default]
Setup Smoother (MueLu::Ifpack2Smoother{type = CHEBYSHEV})
chebyshev: degree = 2
chebyshev: ratio eigenvalue = 20
chebyshev: min eigenvalue = 1
chebyshev: zero starting solution = 1
chebyshev: eigenvalue max iterations = 10
chebyshev: min diagonal value = 2.22045e-16 [default]
chebyshev: assume matrix does not change = 0 [default]
Level 2
Build (MueLu::RebalanceTransferFactory)
Build (MueLu::RepartitionFactory)
Computing Ac (MueLu::RAPFactory)
Prolongator smoothing (MueLu::SaPFactory)
Matrix filtering (MueLu::FilteredAFactory)
Build (MueLu::CoalesceDropFactory)
aggregation: drop tol = 0 [default]
aggregation: Dirichlet threshold = 0 [default]
aggregation: drop scheme = distance laplacian
lightweight wrap = 1
filtered matrix: use lumping = 1 [unused]
filtered matrix: reuse graph = 1 [default]
filtered matrix: reuse eigenvalue = 1 [default]
Build (MueLu::TentativePFactory)
Build (MueLu::UncoupledAggregationFactory)
aggregation: mode = old [default]
aggregation: max agg size = 2147483647 [default]
aggregation: min agg size = 2 [default]
aggregation: max selected neighbors = 0 [default]
aggregation: ordering = natural [default]
aggregation: enable phase 1 = 1 [default]
aggregation: enable phase 2a = 1 [default]
aggregation: enable phase 2b = 1 [default]
aggregation: enable phase 3 = 1 [default]
aggregation: preserve Dirichlet points = 0 [default]
UseOnePtAggregationAlgorithm = 0 [default]
UsePreserveDirichletAggregationAlgorithm = 0 [default]
UseUncoupledAggregationAlgorithm = 1 [default]
UseMaxLinkAggregationAlgorithm = 1 [default]
UseIsolatedNodeAggregationAlgorithm = 1 [default]
UseEmergencyAggregationAlgorithm = 1 [default]
OnePt aggregate map name = [default]
Build (MueLu::AmalgamationFactory)
[empty list]
Nullspace factory (MueLu::NullspaceFactory)
Fine level nullspace = Nullspace
Build (MueLu::CoarseMapFactory)
Striding info = {} [default]
Strided block id = -1 [default]
Domain GID offsets = {0} [default]
[empty list]
sa: damping factor = 1.33333 [default]
sa: calculate eigenvalue estimate = 0 [default]
sa: eigenvalue estimate num iterations = 10 [default]