In the following, we give an overview of the different ways to increase solver performance, mainly by considering hardware upgrades. Note that not all solvers provide all capabilities and that feature names may differ; therefore, we explain each capability using a particular solver as an example. We nevertheless recommend reviewing the documentation of the specific solver of interest to see which capabilities that solver provides.
In the discussion we distinguish between a single computer and a computing grid. We also distinguish the case where the model consists of a single model instance, i.e. one solve statement, from the case where the model consists of several model instances. We highlight the case where the computer has several cores (CPUs), see GAMS option threads. However, we do not consider the strategy of solving a single model instance with different solver parameter settings in parallel, i.e. concurrent mode, because this may lead to confusion in this context. Instead we note that, if the solver parameters are well chosen, for example with the help of the modeler's insight and parameter tuning, concurrent mode optimization may yield only minor performance improvements.
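As a minimal sketch, the number of cores a solver may use can be set globally via the GAMS threads option or per model via the corresponding model attribute (the model name `mymodel` and objective variable `z` are hypothetical):

```gams
* Allow solvers to use up to 4 cores for all subsequent solves
option threads = 4;

* Alternatively, set it per model via the model attribute
mymodel.threads = 4;
solve mymodel using lp minimizing z;
```

A value of 0 lets GAMS use all available cores; whether the cores are actually exploited depends on the solver and algorithm, as discussed below.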
Here you can find our notes on how the architecture of a single computer affects performance. We recommend reading that link before continuing. In summary, a simplified guideline to increase computing power is as follows:
In the next section we assume that the computer(s) have a sufficient amount of memory and multiple cores. Please keep in mind that a strong and suitable algorithm may use few computer resources and outperform an unsuitable and weak algorithm that drains the computer of its resources. We suggest paying attention to the Task Manager (Windows), top (Unix), or Activity Monitor (Mac) in order to observe how a solver consumes the resources of a computer.
First, we consider how well a solver can take advantage of multiple cores.
For example, suppose we have a Linear Programming (LP) problem and use GAMS/GUROBI. The main algorithmic choice is between Simplex and parallel Barrier. The Simplex method is usually better on models with a dense problem matrix, while the parallel Barrier method is often more efficient on models with a sparse matrix. The Simplex algorithm does not take advantage of several cores, while parallel Barrier does. Hence, the fastest method depends on the model structure, the amount of available memory, and the number of cores.
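As a sketch, the algorithm can be selected through a GUROBI option file; the `method` option with value 2 selects the (parallel) barrier algorithm (the model name `mymodel` and objective `z` are hypothetical):

```gams
* Write a GUROBI option file selecting the parallel barrier method
$onecho > gurobi.opt
method 2
threads 4
$offecho

mymodel.optFile = 1;
solve mymodel using lp minimizing z;
```

Leaving `method` at its default lets GUROBI choose the algorithm itself, which is often a reasonable starting point.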
For example, suppose we have a Mixed Integer linear Programming (MIP) problem and use GAMS/CPLEX. The parallelmode option offers a deterministic (default) and an opportunistic mode. Deterministic mode traverses the Branch-and-Bound (BB) tree in a deterministic and reproducible manner, but its use of the cores may seem inefficient. Opportunistic mode can keep all cores busy, but the BB tree is likely traversed differently in each run; therefore, only the final globally optimal solution should have the same objective value.
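As a sketch, the mode can be set in a CPLEX option file; a value of -1 for `parallelmode` selects the opportunistic mode, 1 the deterministic mode (the model name `mymodel` and objective `z` are hypothetical):

```gams
* CPLEX option file: opportunistic parallel search on up to 8 cores
$onecho > cplex.opt
parallelmode -1
threads 8
$offecho

mymodel.optFile = 1;
solve mymodel using mip minimizing z;
```

If reproducible runs matter, e.g. for debugging or benchmarking, keep the default deterministic mode and accept the potentially lower core utilization.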
Next we mention some other solvers that support the use of multiple cores to some extent (the accepted option name in parentheses): CBC (threads), Ipopt (threads), KNITRO (threads), LINDO (IPM_NUM_THREADS), MOSEK (MSK_IPAR_NUM_THREADS), SULUM (numthreads), XPRESS (threads).
In conclusion, ensure that you have a sufficient amount of memory before increasing the number of cores used. Consider improving your model formulation and tuning your solver. If this does not help, you may consider a more powerful computer or even a grid computing environment.
You may use GAMS/CPLEX distributed MIP, i.e. let CPLEX distribute the BB tree search for the solution of a single problem over a number of computers.
We distinguish between two cases. In the first case we have several model instances that differ only in parameter values. In the second case the model instances also differ in structure (size or algebraic formulation).
In the first case we may use the Gather-Update-Solve-Scatter (GUSS) facility instead of a loop statement. GUSS keeps the instantiation of the GAMS model (as generated by the solve statement) in memory and only updates its data between consecutive solves. This can increase performance if generating a model instance is a bottleneck or if the solver can be "hot started".
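As a sketch, GUSS is driven by a scenario dictionary set that maps model symbols to scenario data; all names below (`mymodel`, `d`, `x`, `z`, the set `i`) are hypothetical:

```gams
Set scen "scenarios" / s1*s10 /;
Parameter
   d_s(scen,i)  "demand data per scenario"
   x_s(scen,i)  "collected solution levels per scenario";

* Dictionary: which symbols to update and which results to gather
Set dict / scen .scenario.''
           d    .param.   d_s
           x    .level.   x_s /;

* One solve statement processes all scenarios
solve mymodel min z using lp scenario dict;
```

Compared with a loop over scen with one solve per iteration, the model is generated only once and the solver is restarted with updated data, which can reduce the total time considerably when model generation dominates.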
Note, the following solvers cannot be used as subsolvers of GUSS: ALPHAECP, AMPL, BARON, BDMLP, BENCH, CONVERT, DECISC, DECISM, DICOPT, EXAMINER, GAMSCHK, JAMS, KESTREL, LINGO, LOGMIP, LS, MILES, MPECDUMP, MPSGE, MSNLP, NLPEC, OQNLP, PATHNLP, SBB, and XA.
In situations where the input data for the next model instance depends on the solution of the previous model instance, the GAMS API can be used to update a model instance that has been generated once. However, in this case multiple cores can only be utilized if several independent sequences of model instances can be identified.
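Such a sequential dependence can be sketched in plain GAMS as follows (the update rule and all names are hypothetical); the API approach realizes the same pattern without regenerating the model in every iteration:

```gams
Set k "iterations" / k1*k10 /;
loop(k,
   solve mymodel using lp minimizing z;
*  hypothetical update rule: data for the next solve
*  depends on the current solution levels
   d(i) = 0.9*d(i) + 0.1*x.l(i);
);
```

Because each solve needs the result of the previous one, the iterations of one such sequence cannot run in parallel; only independent sequences can be distributed over cores.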
If the model instances also differ in structure, it is possible to execute them asynchronously and distribute the workload over different cores. However, keep in mind that your computer needs sufficient memory to support this approach.
The methods from the previous section also apply here. Furthermore, the GAMS Grid Facility can be used, see GAMS User's Guide Appendix - The GAMS Grid Computing Facility and these presentation slides. The Grid Facility can also be used in combination with GUSS, see this example.
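As a sketch of the Grid Facility, solves are submitted asynchronously via the solveLink model attribute and collected later through handles (all names, including `mymodel`, `d_s`, and `zrep`, are hypothetical):

```gams
Set s "scenarios" / s1*s4 /;
Parameter h(s) "model instance handles", zrep(s) "collected objectives";

mymodel.solveLink = %solveLink.AsyncGrid%;   * submit solves asynchronously
loop(s,
   d(i) = d_s(s,i);                          * set scenario data
   solve mymodel using lp minimizing z;
   h(s) = mymodel.handle;                    * remember the instance handle
);

* Collect results as the instances finish
repeat
   loop(s$handleCollect(h(s)),
      zrep(s) = z.l;
      display$handleDelete(h(s)) 'trouble deleting handle';
      h(s) = 0;
   );
   display$sleep(card(h)*0.2) 'still waiting for instances';
until card(h) = 0;
```

The repeat loop polls the outstanding handles and sleeps briefly between passes; the submitted instances run in parallel on the available cores or grid nodes.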