Scaling the Hessian
Q: I am working on a model with a non-linear objective function (2-level nested CES), over 1600 non-linear variables, and 160 linear constraints. I have been using GAMS/MINOS and CONOPT to solve it. All the variables are scaled to 1. But both MINOS and CONOPT still stop short of an optimal solution due to slow convergence. The problem, I think, is that despite the scaling, the Hessian is still ill-conditioned, as evidenced by a condition number of around 38000. There is a discussion of how to scale the Hessian in the book “Practical Optimization” by Gill et al. My question is: does anyone have practical experience scaling a Hessian? Can it be done with GAMS/MINOS or CONOPT, or do you know of other algorithms that do it in a painless way? I feel comfortable using the base year values of the variables for the numerical calculations needed for the scaling.
(Michael Saunders): Models with linear constraints (LC models) should solve reliably if the objective is smooth and the constraint matrix is well scaled. Here are some things to think about:
It is good that you are concerned about scaling. Note that “All variables are around 1” should not be taken too literally. In general it means choosing reasonable units for everything, so that a TYPICAL variable of each type is around 1, and a TYPICAL coefficient in EACH ROW and EACH COLUMN of the CONSTRAINT MATRIX is around 1. Some variables want to be small or zero at a solution (that’s fine). Similarly, some matrix coefficients really do have to be small. The main thing is that LARGE variables and matrix entries should be avoided. For LC models, Scale option 2 is probably best in MINOS if there is doubt about the scaling. (Among other things, it allows for big right-hand sides.) Scale option 0 is better if you KNOW that a model is well scaled. In general, try both. To check, turn on the MINOS print file and look at the output following the EXIT message. See if “norm x” and “norm pi” are much different with and without scaling.
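To make "a typical coefficient in each row and each column is around 1" concrete, here is a minimal Python sketch of one common heuristic, iterated geometric-mean scaling. It is an illustration only and is not claimed to be what MINOS does internally under any Scale option. The matrix A and the helper names (geometric_scale, scaled, spread) are made up for the example.

```python
import math

def geometric_scale(A, passes=5):
    """Return (r, c) such that A[i][j] / (r[i] * c[j]) has nonzeros nearer 1."""
    m, n = len(A), len(A[0])
    r = [1.0] * m
    c = [1.0] * n
    for _ in range(passes):
        # Row pass: divide each row by sqrt(min * max) of its scaled nonzeros.
        for i in range(m):
            vals = [abs(A[i][j]) / (r[i] * c[j]) for j in range(n) if A[i][j] != 0]
            if vals:
                r[i] *= math.sqrt(min(vals) * max(vals))
        # Column pass: same idea, column by column.
        for j in range(n):
            vals = [abs(A[i][j]) / (r[i] * c[j]) for i in range(m) if A[i][j] != 0]
            if vals:
                c[j] *= math.sqrt(min(vals) * max(vals))
    return r, c

def scaled(A, r, c):
    """Apply the row and column scale factors to A."""
    return [[A[i][j] / (r[i] * c[j]) for j in range(len(c))] for i in range(len(r))]

def spread(A):
    """Ratio of the largest to the smallest nonzero magnitude in A."""
    vals = [abs(v) for row in A for v in row if v != 0]
    return max(vals) / min(vals)

# Hypothetical, badly scaled constraint matrix (zeros are structural).
A = [[1.0e4, 2.0e4, 0.0],
     [3.0,   0.0,   5.0e-3],
     [0.0,   4.0e2, 1.0]]

r, c = geometric_scale(A)
B = scaled(A, r, c)
print("spread before:", spread(A))
print("spread after: ", spread(B))
```

Dividing row i by r[i] corresponds to changing the units of that constraint, and dividing column j by c[j] corresponds to changing the units of that variable, which is the same thing as choosing reasonable units by hand.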
The size of the objective function can have an effect. MINOS allows for “large” objectives but not “tiny” ones. If all the shadow prices (the dual variables pi(i)) are small, scale the objective function up. In general you want a TYPICAL objective gradient to be around 1.
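As a rough illustration of scaling the objective up, the Python sketch below picks a factor sigma so that the median nonzero gradient entry at the starting point has magnitude about 1. The gradient values and the function name objective_scale are hypothetical, and using the median as the "typical" entry is an assumption of this sketch. Multiplying the objective by a positive constant does not change the minimizer, but it multiplies the shadow prices by the same constant.

```python
def objective_scale(grad, target=1.0):
    """Factor that brings the median nonzero |gradient entry| to `target`."""
    mags = sorted(abs(g) for g in grad if g != 0)
    if not mags:
        return 1.0  # nothing to scale against
    mid = len(mags) // 2
    if len(mags) % 2:
        typical = mags[mid]
    else:
        typical = 0.5 * (mags[mid - 1] + mags[mid])
    return target / typical

# Hypothetical objective gradient at the initial point: all entries tiny.
grad0 = [2.0e-6, -5.0e-6, 1.0e-6, 4.0e-6, 0.0]

sigma = objective_scale(grad0)
scaled_grad = [sigma * g for g in grad0]
print("scale factor:", sigma)
print("scaled gradient:", scaled_grad)
```

In a model one would then minimize sigma times the original objective and divide the reported objective value and shadow prices by sigma afterwards.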
Initialize nonlinear variables to realistic values. Does “Slow convergence” mean the linesearch keeps failing to find a better point? Non-smooth functions have that effect, so be sure that you bound the variables away from any singularities in the objective function.
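To see why bounds away from singularities matter, consider a single CES-like term x**rho with rho < 0. The Python sketch below is only an assumption about what such a term might look like (the value rho = -0.5 and the sample points are made up, not taken from the model in question). It shows how quickly the gradient grows as x approaches zero.

```python
def ces_term_grad(x, rho=-0.5):
    """Derivative of x**rho with respect to x: rho * x**(rho - 1)."""
    return rho * x ** (rho - 1.0)

# Hypothetical sample points, moving toward the singularity at x = 0.
for x in (1.0, 1.0e-1, 1.0e-3, 1.0e-6):
    print(f"x = {x:8.1e}   gradient = {ces_term_grad(x):12.4e}")
```

A lower bound that keeps x a modest distance from zero keeps the gradient within a few orders of magnitude of its typical size, so the linesearch is not asked to work on a function that is nearly non-smooth.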
If there are any “ifs” and “buts” in the function (unlikely in GAMS), be sure there are no discontinuities.
For NC models (nonlinear constraints), Scale option 2 uses the initial Jacobian (constraint gradients), so it may not be safe unless you take great care with the initial values. Scale option 0 or 1 is safer in general.
It is hard to say more about your particular model without seeing the MINOS print file (with Print level 1 and Solution Yes). But try Scale option 2 without too much of your own scaling.