Computational Resources
Since nonlinear optimization is an iterative process that
depends on many factors, it is difficult to estimate
how much computer time is necessary to find an optimal
solution satisfying one of the termination criteria.
You can use the MAXTIME=, MAXITER=, and MAXFU= options to
restrict the amount of CPU time, the number of iterations,
and the number of function calls in a single run of PROC NLMIXED.
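For example, the following hypothetical PROC NLMIXED run limits the optimization to one hour of CPU time (3,600 seconds), 200 iterations, and 1,000 function calls; the data set MYDATA and the variables Y and X are placeholders only:

   proc nlmixed data=mydata maxtime=3600 maxiter=200 maxfu=1000;
      /* simple normal-response model used only to illustrate the options */
      parms b0=0 b1=0 s2=1;
      mean = b0 + b1*x;
      model y ~ normal(mean, s2);
   run;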
In each iteration k, the NRRIDG technique uses a
symmetric Householder transformation to decompose the
n × n Hessian matrix H,

   H = V' T V ,   V: orthogonal , T: tridiagonal ,

to compute the (Newton) search direction s,

   s^(k) = -[H^(k)]^{-1} g^(k) ,   k = 1, 2, 3, ...
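To make the role of this linear system concrete, the following SAS/IML sketch (it assumes SAS/IML is available; the 3 x 3 Hessian and gradient values are arbitrary illustrations) computes a Newton direction by solving H*s = -g rather than forming the inverse of H explicitly:

   proc iml;
      /* arbitrary positive definite 3 x 3 Hessian and a gradient vector,
         chosen only for illustration */
      H = {4 1 0,
           1 3 1,
           0 1 2};
      g = {1, -2, 0.5};
      /* Newton search direction: solve H*s = -g instead of inverting H */
      s = -solve(H, g);
      print s;
   quit;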
The TRUREG and NEWRAP techniques use the Cholesky
decomposition to solve the same linear system while computing the
search direction. The QUANEW, DBLDOG, CONGRA, and NMSIMP techniques
do not need to invert or decompose a Hessian matrix; thus, they
require fewer computational resources than the other techniques.
The larger the problem, the more time is needed to compute function
values and derivatives. Therefore, you may want to compare
optimization techniques by the respective numbers of function,
gradient, and Hessian evaluations they require.
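For instance, you could fit the same model under two different optimization techniques and compare the evaluation counts reported in the output of each run; the model and data set below are hypothetical:

   proc nlmixed data=mydata technique=nrridg;
      parms b0=0 b1=0 s2=1;
      model y ~ normal(b0 + b1*x, s2);
   run;

   proc nlmixed data=mydata technique=quanew;
      parms b0=0 b1=0 s2=1;
      model y ~ normal(b0 + b1*x, s2);
   run;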
Finite difference approximations of the derivatives
are expensive because they require additional function or gradient
calls (a worked cost sketch follows this list):
- forward difference formulas
  - For first-order derivatives, n additional function calls are required.
  - For second-order derivatives based on function calls only, n + n^2/2
    additional function calls are required for a dense Hessian.
  - For second-order derivatives based on gradient calls, n additional
    gradient calls are required.
- central difference formulas
  - For first-order derivatives, 2n additional function calls are required.
  - For second-order derivatives based on function calls only, 2n + 2n^2
    additional function calls are required for a dense Hessian.
  - For second-order derivatives based on gradient calls, 2n additional
    gradient calls are required.
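As a rough illustration of these counts, for n = 10 parameters a dense Hessian from forward differences costs 10 + 100/2 = 60 additional function calls per iteration, whereas central differences cost 2(10) + 2(100) = 220. The following DATA step sketch tabulates the formulas above for a few arbitrary values of n:

   data fd_cost;
      /* n = number of parameters; the values used here are arbitrary */
      do n = 5, 10, 50;
         fwd_grad = n;              /* forward difference, first-order derivatives   */
         fwd_hess = n + n**2/2;     /* forward difference, dense Hessian (functions) */
         ctr_grad = 2*n;            /* central difference, first-order derivatives   */
         ctr_hess = 2*n + 2*n**2;   /* central difference, dense Hessian (functions) */
         output;
      end;
   run;

   proc print data=fd_cost noobs;
   run;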
Many applications need considerably more time to compute
second-order derivatives (the Hessian matrix) than to compute
first-order derivatives (the gradient). In such cases, a dual
quasi-Newton technique, which does not require second-order
derivatives, is recommended.