tlm_adjoint.optimization

Module Contents

tlm_adjoint.optimization.minimize_scipy(forward, M0, *, manager=None, **kwargs)

Provides an interface with scipy.optimize.minimize() for gradient-based optimization.

Note that the control is gathered onto the root process so that the serial scipy.optimize.minimize() function may be used.

All keyword arguments except for manager are passed to scipy.optimize.minimize().

Parameters:
  • forward – A callable which accepts one or more variable arguments, and which returns a variable defining the forward functional.

  • M0 – A variable or Sequence of variables defining the control, and the initial guess for the optimization.

  • manager – An EquationManager used to create an internal manager via EquationManager.new(). manager() is used if not supplied.

Returns:

A tuple (M, return_value). M is a variable or a Sequence of variables, depending on the type of M0, storing the result. return_value is the return value of scipy.optimize.minimize().
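For example, a minimal sketch assuming the Firedrake backend, with a purely illustrative forward problem (matching a constant target); the mesh, space, optimization method, and tolerance are assumptions made for the illustration:

    from firedrake import *
    from tlm_adjoint.firedrake import *

    mesh = UnitIntervalMesh(10)
    space = FunctionSpace(mesh, "Lagrange", 1)

    def forward(m):
        # A variable defining the forward functional J(m) = \int (m - 1)^2 dx
        return assemble(inner(m - Constant(1.0), m - Constant(1.0)) * dx)

    m0 = Function(space, name="m0")
    # Keyword arguments other than manager pass through to
    # scipy.optimize.minimize()
    m, result = minimize_scipy(forward, m0, method="L-BFGS-B",
                               options={"gtol": 1e-10})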

class tlm_adjoint.optimization.LBFGSHessianApproximation(m)

L-BFGS Hessian approximation.

Parameters:

m – Maximum number of vector pairs to retain in the L-BFGS Hessian approximation.

append(S, Y, S_inner_Y)

Add a step + gradient change pair.

Parameters:
  • S – A variable or a Sequence of variables defining the step.

  • Y – A variable or a Sequence of variables defining the gradient change.

  • S_inner_Y – The projection of the gradient change onto the step. A separate argument so that a value consistent with that used in the line search can be supplied.

inverse_action(X, *, H_0_action=None, theta=1.0)

Compute the action of the approximate Hessian inverse on some given direction.

Implements the L-BFGS Hessian inverse action approximation as in Algorithm 7.4 of

  • Jorge Nocedal and Stephen J. Wright, ‘Numerical optimization’, Springer, New York, NY, 2006, Second edition, doi: 10.1007/978-0-387-40065-5

Uses ‘theta scaling’ as in equation (3.7) of

  • Richard H. Byrd, Peihuang Lu, Jorge Nocedal, and Ciyou Zhu, ‘A limited memory algorithm for bound constrained optimization’, SIAM Journal on Scientific Computing, 16(5), 1190–1208, 1995, doi: 10.1137/0916069

Parameters:
  • X – A variable or a Sequence of variables defining the direction on which to compute the approximate Hessian inverse action.

  • H_0_action – A callable defining the action of the non-updated Hessian inverse approximation on some direction. Accepts one or more variables as arguments, defining the direction, and returns a variable or a Sequence of variables defining the action on this direction. Should correspond to a positive definite operator. An identity is used if not supplied.

Returns:

A variable or a Sequence of variables storing the result.
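The recursion can be illustrated with plain NumPy arrays, assuming real Euclidean inner products and an identity initial Hessian inverse approximation (inverse_action itself operates on variables in a more general inner product space, with a general H_0_action):

    import numpy as np

    def two_loop_recursion(x, S, Y, *, theta=1.0):
        # Approximate Hessian inverse action on x, computed from retained
        # step / gradient change pairs (S[i], Y[i]), ordered oldest first
        rho = [1.0 / np.dot(y, s) for s, y in zip(S, Y)]
        alpha = [0.0] * len(S)
        q = np.array(x, dtype=float)
        for i in reversed(range(len(S))):   # first loop: newest to oldest
            alpha[i] = rho[i] * np.dot(S[i], q)
            q -= alpha[i] * Y[i]
        r = q / theta                       # identity H_0, scaled by 1 / theta
        for i in range(len(S)):             # second loop: oldest to newest
            beta = rho[i] * np.dot(Y[i], r)
            r += (alpha[i] - beta) * S[i]
        return r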

tlm_adjoint.optimization.line_search(F, Fp, X, minus_P, *, c1=0.0001, c2=0.9, old_F_val=None, old_Fp_val=None, comm=None)

Line search using TAO.

Uses the PETSc.TAOLineSearch.Type.MORETHUENTE line search type, yielding a step which satisfies the Wolfe conditions (and the strong curvature condition). See

  • Jorge J. Moré and David J. Thuente, ‘Line search algorithms with guaranteed sufficient decrease’, ACM Transactions on Mathematical Software 20(3), 286–307, 1994, doi: 10.1145/192115.192132

Parameters:
  • F – A callable defining the functional. Accepts one or more variables as arguments, and returns the value of the functional.

  • Fp – A callable defining the functional gradient. Accepts one or more variables as inputs, and returns a variable or Sequence of variables storing the value of the gradient.

  • X – A variable or a Sequence of variables defining the starting point for the line search.

  • minus_P – A variable or a Sequence of variables defining the negative of the line search direction.

  • c1

    Armijo condition parameter. \(c_1\) in equation (3.6a) of

    • Jorge Nocedal and Stephen J. Wright, ‘Numerical optimization’, Springer, New York, NY, 2006, Second edition, doi: 10.1007/978-0-387-40065-5

  • c2

    Curvature condition parameter. \(c_2\) in equation (3.6b) of

    • Jorge Nocedal and Stephen J. Wright, ‘Numerical optimization’, Springer, New York, NY, 2006, Second edition, doi: 10.1007/978-0-387-40065-5

  • old_F_val – The value of F at the starting point of the line search.

  • old_Fp_val – The value of Fp at the starting point of the line search.

  • comm – A communicator.

Returns:

A tuple (alpha, new_F_val, new_Fp_val) where

  • alpha: The step size. The step is alpha multiplied by -minus_P (note that minus_P is the negative of the search direction).

  • new_F_val: The new value of F.

  • new_Fp_val: The new value of Fp.
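In scalar form, writing \(\phi(\alpha) = F(X + \alpha p)\) where \(p\) is the search direction (the negative of minus_P), the conditions satisfied by the returned step can be sketched as follows; the actual test is performed inside the PETSc Moré-Thuente implementation:

    def strong_wolfe(phi_0, dphi_0, phi_alpha, dphi_alpha, alpha,
                     *, c1=1.0e-4, c2=0.9):
        # dphi denotes the derivative of phi; dphi_0 < 0 for a
        # descent direction
        armijo = phi_alpha <= phi_0 + c1 * alpha * dphi_0    # (3.6a)
        curvature = abs(dphi_alpha) <= c2 * abs(dphi_0)      # strong (3.6b)
        return armijo and curvature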

tlm_adjoint.optimization.l_bfgs(F, Fp, X0, *, m=30, s_atol, g_atol, converged=None, max_its=1000, H_0_action=None, theta_scale=True, delta=1.0, M_action=None, M_inv_action=None, c1=0.0001, c2=0.9, comm=None)

Functional minimization using the L-BFGS algorithm.

Implements Algorithm 7.5 of

  • Jorge Nocedal and Stephen J. Wright, ‘Numerical optimization’, Springer, New York, NY, 2006, Second edition, doi: 10.1007/978-0-387-40065-5

in a more general inner product space.

By default uses ‘theta scaling’ to define the initial Hessian inverse approximation, based on the approach in equation (3.7) and point 7 on p. 1204 of

  • Richard H. Byrd, Peihuang Lu, Jorge Nocedal, and Ciyou Zhu, ‘A limited memory algorithm for bound constrained optimization’, SIAM Journal on Scientific Computing, 16(5), 1190–1208, 1995, doi: 10.1137/0916069

and with an initial value for the scaling parameter based on the discussion in ‘Implementation’ in section 6.1 of

  • Jorge Nocedal and Stephen J. Wright, ‘Numerical optimization’, Springer, New York, NY, 2006, Second edition, doi: 10.1007/978-0-387-40065-5

Precisely, the Hessian inverse approximation, before being updated, is scaled by \(1 / \theta\) where, on the first iteration,

\[\theta = \frac{\sqrt{\left| g_k^* M^{-1} g_k \right|}}{\delta}\]

and on later iterations

\[\theta = \frac{y_k^* H_0 y_k}{y_k^* s_k},\]

where \(g_k\), \(y_k\), and \(s_k\) are respectively the gradient, gradient change, and step, and where \(M^{-1}\) and \(H_0\) are defined by M_inv_action and H_0_action respectively.
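With Euclidean inner products and \(M = H_0 = I\) (assumptions made only for this illustration) the two updates reduce to:

    import numpy as np

    def theta_initial(g0, delta):
        # First iteration: theta = sqrt(|g_0^* M^{-1} g_0|) / delta
        return np.sqrt(abs(np.vdot(g0, g0))) / delta

    def theta_update(s, y):
        # Later iterations: theta = (y^* H_0 y) / (y^* s)
        return np.vdot(y, y) / np.vdot(y, s)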

The line search is performed using line_search().

Parameters:
  • F – A callable defining the functional. Accepts one or more variables as arguments, and returns the value of the functional.

  • Fp – A callable defining the functional gradient. Accepts one or more variables as inputs, and returns a variable or Sequence of variables storing the value of the gradient.

  • X0 – A variable or a Sequence of variables defining the initial guess for the parameters.

  • m – The maximum number of step + gradient change pairs to use in the Hessian inverse approximation.

  • s_atol – Absolute tolerance for the step change norm convergence criterion.

  • g_atol – Absolute tolerance for the gradient norm convergence criterion.

  • converged

    A callback which can be used to define custom convergence criteria. Takes the form

    def converged(it, F_old, F_new, X_new, G_new, S, Y):
    

    with

    • it: The iteration number.

    • F_old: The previous value of the functional.

    • F_new: The new value of the functional.

    • X_new: A variable or a Sequence of variables defining the new value of the parameters.

    • G_new: A variable or a Sequence of variables defining the new value of the gradient.

    • S: A variable or a Sequence of variables defining the step.

    • Y: A variable or a Sequence of variables defining the gradient change.

    Input variables should not be modified. Returns a bool indicating whether the optimization has converged. A sketch of such a callback is given below, after the return value description.

  • max_its – The maximum number of iterations.

  • H_0_action – A callable defining the action of the non-updated Hessian inverse approximation on some direction. Accepts one or more variables as arguments, defining the direction, and returns a variable or a Sequence of variables defining the action on this direction. Should correspond to a positive definite operator. An identity is used if not supplied.

  • theta_scale – Whether to apply ‘theta scaling’, discussed above.

  • delta – Controls the initial value of \(\theta\) in ‘theta scaling’. If None then on the first iteration \(\theta\) is set equal to one.

  • M_action

    A callable defining a primal space inner product,

    \[\left< x, y \right>_M = y^* M x,\]

    where \(x\) and \(y\) are degree of freedom vectors for primal space elements and \(M\) is a Hermitian and positive definite matrix. Accepts one or more variables as arguments, defining the direction, and returns a variable or a Sequence of variables defining the action of \(M\) on this direction. An identity is used if not supplied. Required if H_0_action or M_inv_action is supplied.

  • M_inv_action

    A callable defining a (conjugate) dual space inner product,

    \[\left< x, y \right>_{M^{-1}} = y^* M^{-1} x,\]

    where \(x\) and \(y\) are degree of freedom vectors for (conjugate) dual space elements and \(M\) is as for M_action. Accepts one or more variables as arguments, defining the direction, and returns a variable or a Sequence of variables defining the action of \(M^{-1}\) on this direction. H_0_action is used if not supplied.

  • c1

    Armijo condition parameter. \(c_1\) in equation (3.6a) of

    • Jorge Nocedal and Stephen J. Wright, ‘Numerical optimization’, Springer, New York, NY, 2006, Second edition, doi: 10.1007/978-0-387-40065-5

  • c2

    Curvature condition parameter. \(c_2\) in equation (3.6b) of

    • Jorge Nocedal and Stephen J. Wright, ‘Numerical optimization’, Springer, New York, NY, 2006, Second edition, doi: 10.1007/978-0-387-40065-5

  • comm – A communicator.

Returns:

A tuple (X, (it, F_calls, Fp_calls, hessian_approx)) with

  • X: The solution. A variable or a tuple of variables.

  • it: The number of iterations.

  • F_calls: The number of functional evaluations.

  • Fp_calls: The number of gradient evaluations.

  • hessian_approx: The LBFGSHessianApproximation.
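A converged callback might, for example, terminate once the functional stagnates; this is a sketch, with an illustrative tolerance:

    def converged(it, F_old, F_new, X_new, G_new, S, Y):
        # Only scalar functional values are inspected; input variables
        # are not modified, as required
        return abs(F_new - F_old) <= 1e-12 * max(abs(F_old), abs(F_new))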

tlm_adjoint.optimization.minimize_l_bfgs(forward, M0, *, m=30, manager=None, **kwargs)

Functional minimization using the L-BFGS algorithm.

Parameters:
  • forward – A callable which accepts one or more variable arguments, and which returns a variable defining the forward functional.

  • M0 – A variable or Sequence of variables defining the control, and the initial guess for the optimization.

  • manager – An EquationManager used to create an internal manager via EquationManager.new(). manager() is used if not supplied.

Remaining arguments and the return value are described in the l_bfgs() documentation.
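For example, reusing forward and m0 from the minimize_scipy sketch above (s_atol and g_atol have no default values and must be supplied; the tolerances here are illustrative):

    X, (it, F_calls, Fp_calls, hessian_approx) = minimize_l_bfgs(
        forward, m0, s_atol=1e-10, g_atol=1e-10)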

tlm_adjoint.optimization.minimize_tao(forward, M0, *, method=None, gatol, grtol, gttol=0.0, M_inv_action=None, pre_callback=None, post_callback=None, manager=None)

Functional minimization using TAO.

Parameters:
  • forward – A callable which accepts one or more variable arguments, and which returns a variable defining the forward functional.

  • M0 – A variable or Sequence of variables defining the control, and the initial guess for the optimization.

  • method – TAO type. Defaults to PETSc.TAO.Type.LMVM.

  • gatol – TAO gradient absolute tolerance.

  • grtol – TAO gradient relative tolerance.

  • gttol – TAO gradient norm change relative tolerance.

  • M_inv_action

    A callable defining a (conjugate) dual space inner product,

    \[\left< x, y \right>_{M^{-1}} = y^* M^{-1} x,\]

    where \(x\) and \(y\) are degree of freedom vectors for (conjugate) dual space elements and \(M\) is a Hermitian and positive definite matrix. Accepts one or more variables as arguments, defining the direction, and returns a variable or a Sequence of variables defining the action of \(M^{-1}\) on this direction.

  • pre_callback – A callable accepting a single petsc4py.PETSc.TAO argument. Used for detailed manual configuration. Called after all other configuration options are set.

  • post_callback – A callable accepting a single petsc4py.PETSc.TAO argument. Called after the petsc4py.PETSc.TAO.solve() method has been called.

  • manager – An EquationManager used to create an internal manager via EquationManager.new(). manager() is used if not supplied.

Returns:

A variable or a Sequence of variables storing the result.
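For example, again reusing forward and m0 from the minimize_scipy sketch above (gatol and grtol have no default values and must be supplied; the tolerances here are illustrative):

    m = minimize_tao(forward, m0, gatol=1e-10, grtol=1e-10)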