{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "09bb9c5a",
   "metadata": {},
   "source": [
    "# Time-independent example\n",
    "\n",
    "This notebook describes adjoint calculations, and Hessian calculations, using tlm_adjoint with the [Firedrake](https://firedrakeproject.org/) backend. A time-independent problem is considered, and importantly checkpointing is not used for the adjoint calculations. This notebook further describes how variables may be flagged to facilitate caching.\n",
    "\n",
    "The high-level algorithmic differentiation approach used by tlm_adjoint is based on the method described in:\n",
    "\n",
    "- P.&nbsp;E.&nbsp;Farrell, D. A. Ham, S. W. Funke, and M. E. Rognes, 'Automated derivation of the adjoint of high-level transient finite element programs', SIAM Journal on Scientific Computing 35(4), pp. C369&ndash;C393, 2013, doi: 10.1137/120873558\n",
    "\n",
    "The caching of data in tlm_adjoint uses an approach based on that described in:\n",
    "\n",
    "- J.&nbsp;R.&nbsp;Maddison and P. E. Farrell, 'Rapid development and adjoining of transient finite element models', Computer Methods in Applied Mechanics and Engineering, 276, 95&ndash;121, 2014, doi: 10.1016/j.cma.2014.03.010\n",
    "\n",
    "## Forward problem\n",
    "\n",
    "We consider the solution of a linear time-independent partial differential equation, followed by the calculation of the $L^2$-norm of the solution. Extra non-linearity is introduced by allowing the right-hand-side of the partial differential equation to depend non-linearly on the control. We assume real spaces and a real build of Firedrake throughout.\n",
    "\n",
    "Specifically we consider the solution $u \\in V_0$ of\n",
    "\n",
    "$$\n",
    "  \\forall \\zeta \\in V_0 \\qquad \\int_\\Omega \\nabla \\zeta \\cdot \\nabla u = \\int_\\Omega \\zeta m^2,\n",
    "$$\n",
    "\n",
    "where $V$ is a real $P_1$ continuous finite element space defining functions on the domain $\\Omega = \\left( 0, 1 \\right)^2$, with $m \\in V$, and where $V_0$ consists of the functions in $V$ which have zero trace. This corresponds to a discretization of the partial differential equation\n",
    "\n",
    "$$\n",
    "    -\\nabla^2 u = m^2 \\quad \\text{on } \\left( x, y \\right) \\in \\left( 0, 1 \\right)^2,\n",
    "$$\n",
    "\n",
    "subject to homogeneous Dirichlet boundary conditions.\n",
    "\n",
    "A simple implementation in Firedrake, with $m = x y$, takes the form:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "227a8702",
   "metadata": {},
   "outputs": [],
   "source": [
    "from firedrake import *\n",
    "\n",
    "mesh = UnitSquareMesh(10, 10)\n",
    "X = SpatialCoordinate(mesh)\n",
    "\n",
    "space = FunctionSpace(mesh, \"Lagrange\", 1)\n",
    "test = TestFunction(space)\n",
    "trial = TrialFunction(space)\n",
    "\n",
    "m = Function(space, name=\"m\")\n",
    "m.interpolate(X[0] * X[1])\n",
    "\n",
    "u = Function(space, name=\"u\")\n",
    "solve(inner(grad(trial), grad(test)) * dx == inner(m * m, test) * dx, u,\n",
    "      DirichletBC(space, 0.0, \"on_boundary\"))\n",
    "\n",
    "J_sq = assemble(inner(u, u) * dx)\n",
    "J = sqrt(J_sq)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fa57fc17",
   "metadata": {},
   "source": [
    "## Adding tlm_adjoint\n",
    "\n",
    "We first modify the code so that tlm_adjoint processes the calculations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2a3dd1f9",
   "metadata": {},
   "outputs": [],
   "source": [
    "from firedrake import *\n",
    "from tlm_adjoint.firedrake import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "\n",
    "mesh = UnitSquareMesh(10, 10)\n",
    "X = SpatialCoordinate(mesh)\n",
    "\n",
    "space = FunctionSpace(mesh, \"Lagrange\", 1)\n",
    "test = TestFunction(space)\n",
    "trial = TrialFunction(space)\n",
    "\n",
    "m = Function(space, name=\"m\")\n",
    "m.interpolate(X[0] * X[1])\n",
    "\n",
    "\n",
    "def forward(m):\n",
    "    u = Function(space, name=\"u\")\n",
    "    solve(inner(grad(trial), grad(test)) * dx == inner(m * m, test) * dx, u,\n",
    "          DirichletBC(space, 0.0, \"on_boundary\"))\n",
    "\n",
    "    J_sq = Functional(name=\"J_sq\")\n",
    "    J_sq.assign(inner(u, u) * dx)\n",
    "    J = np.sqrt(J_sq)\n",
    "    return J\n",
    "\n",
    "\n",
    "start_manager()\n",
    "J = forward(m)\n",
    "stop_manager()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c23c7102",
   "metadata": {},
   "source": [
    "The key changes here are:\n",
    "\n",
    "- To import tlm_adjoint with the Firedrake backend.\n",
    "- Controlling the 'manager' &ndash; an object tlm_adjoint uses to process equations.\n",
    "- Using a `Functional` to compute the square of the $L^2$-norm of the solution of the (discretized) partial differential equation. This facilitates the calculation of simple functionals e.g. using finite element assembly.\n",
    "- Taking the square root of the square of the $L^2$-norm using NumPy.\n",
    "\n",
    "Let's display the information recorded by tlm_adjoint:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a34fba5a",
   "metadata": {},
   "outputs": [],
   "source": [
    "manager_info()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "adc31bf5",
   "metadata": {},
   "source": [
    "We see that there are three records.\n",
    "\n",
    "- Equation 0, an `EquationSolver`. This records the solution of the finite element variational problem for `u`.\n",
    "- Equation 1, an `Assembly`. This records the calculation of the square of the $L^2$-norm.\n",
    "- Equation 2, a `FloatEquation`. This records the calculation of the square root of the square of the $L^2$-norm.\n",
    "\n",
    "## Computing derivatives using an adjoint\n",
    "\n",
    "The `compute_gradient` function can be used to compute derivatives using the adjoint method. Here we compute the derivative of the $L^2$-norm of the resulting solution, considered a function of the control defined by `m`, with respect to this control:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d0a15072",
   "metadata": {},
   "outputs": [],
   "source": [
    "dJ_dm = compute_gradient(J, m)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b7c4c3bc",
   "metadata": {},
   "source": [
    "Here each degree of freedom associated with `dJ_dm` contains the derivative of the functional with respect to the corresponding degree of freedom for the control. `dJ_dm` represents a 'dual space' object, defining a linear functional which, given a 'direction' $\\zeta \\in V$, can be used to compute the directional derivative with respect to $m$ with direction $\\zeta$.\n",
    "\n",
    "For example we can compute the directional derivative of the functional with respect to the control $m$ with direction equal to the unity valued function via:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0772420c",
   "metadata": {},
   "outputs": [],
   "source": [
    "one = Function(space, name=\"one\")\n",
    "one.interpolate(Constant(1.0))\n",
    "\n",
    "dJ_dm_one = assemble(action(dJ_dm, one))\n",
    "\n",
    "print(f\"{dJ_dm_one=}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1b088023",
   "metadata": {},
   "source": [
    "This result is the derivative of the $L^2$-norm of the solution with respect to the amplitude of a spatially constant perturbation to the control $m$. We can compare with the result from finite differencing:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b8e8f6ca",
   "metadata": {},
   "outputs": [],
   "source": [
    "def dJ_dm(J, m, *, eps=1.0e-7):\n",
    "    return (J(m + eps) - J(m - eps)) / (2.0 * eps)\n",
    "\n",
    "\n",
    "print(f\"dJ_dm_one approximation = {dJ_dm(lambda eps: float(forward(m + eps * one)), 0.0)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1819f89e",
   "metadata": {},
   "source": [
    "## Computing Hessian information using an adjoint of a tangent-linear\n",
    "\n",
    "### Single direction\n",
    "\n",
    "We next seek to compute the action of the Hessian of the functional on some direction $\\zeta \\in V$, using the adjoint method applied to tangent-linear and forward calculations. This can be handled directly, by configuring the relevant tangent-linear and computing the derivative using `compute_gradient`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14f090f0",
   "metadata": {},
   "outputs": [],
   "source": [
    "from firedrake import *\n",
    "from tlm_adjoint.firedrake import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "\n",
    "mesh = UnitSquareMesh(10, 10)\n",
    "X = SpatialCoordinate(mesh)\n",
    "\n",
    "space = FunctionSpace(mesh, \"Lagrange\", 1)\n",
    "test = TestFunction(space)\n",
    "trial = TrialFunction(space)\n",
    "\n",
    "m = Function(space, name=\"m\")\n",
    "m.interpolate(X[0] * X[1])\n",
    "\n",
    "\n",
    "def forward(m):\n",
    "    u = Function(space, name=\"u\")\n",
    "    solve(inner(grad(trial), grad(test)) * dx == inner(m * m, test) * dx, u,\n",
    "          DirichletBC(space, 0.0, \"on_boundary\"))\n",
    "\n",
    "    J_sq = Functional(name=\"J_sq\")\n",
    "    J_sq.assign(inner(u, u) * dx)\n",
    "    J = np.sqrt(J_sq)\n",
    "    return J\n",
    "\n",
    "\n",
    "zeta = Function(space, name=\"zeta\")\n",
    "zeta.interpolate(sin(pi * X[0]) * sin(pi * X[1]))\n",
    "configure_tlm((m, zeta))\n",
    "\n",
    "start_manager()\n",
    "J = forward(m)\n",
    "stop_manager()\n",
    "\n",
    "dJ_dm_zeta = var_tlm(J, (m, zeta))\n",
    "\n",
    "d2J_dm_zeta_dm = compute_gradient(dJ_dm_zeta, m)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5757194e",
   "metadata": {},
   "source": [
    "The `Hessian` class applies the same approach, but handles several of the steps for us:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "49588ff6",
   "metadata": {},
   "outputs": [],
   "source": [
    "from firedrake import *\n",
    "from tlm_adjoint.firedrake import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "\n",
    "mesh = UnitSquareMesh(10, 10)\n",
    "X = SpatialCoordinate(mesh)\n",
    "\n",
    "space = FunctionSpace(mesh, \"Lagrange\", 1)\n",
    "test = TestFunction(space)\n",
    "trial = TrialFunction(space)\n",
    "\n",
    "m = Function(space, name=\"m\")\n",
    "m.interpolate(X[0] * X[1])\n",
    "\n",
    "\n",
    "def forward(m):\n",
    "    u = Function(space, name=\"u\")\n",
    "    solve(inner(grad(trial), grad(test)) * dx == inner(m * m, test) * dx, u,\n",
    "          DirichletBC(space, 0.0, \"on_boundary\"))\n",
    "\n",
    "    J_sq = Functional(name=\"J_sq\")\n",
    "    J_sq.assign(inner(u, u) * dx)\n",
    "    J = np.sqrt(J_sq)\n",
    "    return J\n",
    "\n",
    "\n",
    "H = Hessian(forward)\n",
    "\n",
    "zeta = Function(space, name=\"zeta\")\n",
    "zeta.interpolate(sin(pi * X[0]) * sin(pi * X[1]))\n",
    "\n",
    "_, dJ_dm_zeta, d2J_dm_zeta_dm = H.action(m, zeta)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a78eac8e",
   "metadata": {},
   "source": [
    "### Multiple directions\n",
    "\n",
    "If we want to compute the Hessian action on *multiple* directions we can define multiple tangent-linears:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e23e1725",
   "metadata": {},
   "outputs": [],
   "source": [
    "from firedrake import *\n",
    "from tlm_adjoint.firedrake import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "\n",
    "mesh = UnitSquareMesh(10, 10)\n",
    "X = SpatialCoordinate(mesh)\n",
    "\n",
    "space = FunctionSpace(mesh, \"Lagrange\", 1)\n",
    "test = TestFunction(space)\n",
    "trial = TrialFunction(space)\n",
    "\n",
    "m = Function(space, name=\"m\")\n",
    "m.interpolate(X[0] * X[1])\n",
    "\n",
    "\n",
    "def forward(m):\n",
    "    u = Function(space, name=\"u\")\n",
    "    solve(inner(grad(trial), grad(test)) * dx == inner(m * m, test) * dx, u,\n",
    "          DirichletBC(space, 0.0, \"on_boundary\"))\n",
    "\n",
    "    J_sq = Functional(name=\"J_sq\")\n",
    "    J_sq.assign(inner(u, u) * dx)\n",
    "    J = np.sqrt(J_sq)\n",
    "    return J\n",
    "\n",
    "\n",
    "zeta_0 = Function(space, name=\"zeta_0\")\n",
    "zeta_0.interpolate(sin(pi * X[0]) * sin(pi * X[1]))\n",
    "configure_tlm((m, zeta_0))\n",
    "\n",
    "zeta_1 = Function(space, name=\"zeta_1\")\n",
    "zeta_1.interpolate(sin(pi * X[0]) * sin(2.0 * pi * X[1]))\n",
    "configure_tlm((m, zeta_1))\n",
    "\n",
    "start_manager()\n",
    "J = forward(m)\n",
    "stop_manager()\n",
    "\n",
    "dJ_dm_zeta_0 = var_tlm(J, (m, zeta_0))\n",
    "dJ_dm_zeta_1 = var_tlm(J, (m, zeta_1))\n",
    "\n",
    "d2J_dm_zeta_0_dm, d2J_dm_zeta_1_dm = compute_gradient((dJ_dm_zeta_0, dJ_dm_zeta_1), m)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "94aa1773",
   "metadata": {},
   "source": [
    "There are now calculations for two sets of tangent-linear variables, two sets of first order adjoint variables, and two sets of second order adjoint variables. However the two sets of first order adjoint variables have the same values &ndash; by default tlm_adjoint detects this and only computes them once.\n",
    "\n",
    "The above approach requires us to know the directions *before* the forward calculation. However some algorithms can generate the directions sequentially, and we do not know the next direction until a Hessian action on the previous direction has been computed. If possible we still want to avoid re-running the forward calculation each time we have a new direction.\n",
    "\n",
    "If we have sufficient memory available, and in particular so long as we do not need to use checkpointing for the adjoint calculations, we can make use of the `CachedHessian` class. This stores the forward solution and, by default, caches and reuses first order adjoint values. Here we do *not* need to configure the tangent-linear before the forward calculation &ndash; instead tlm_adjoint performs the tangent-linear calculations *after* the forward calculations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9daddace",
   "metadata": {},
   "outputs": [],
   "source": [
    "from firedrake import *\n",
    "from tlm_adjoint.firedrake import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "\n",
    "mesh = UnitSquareMesh(10, 10)\n",
    "X = SpatialCoordinate(mesh)\n",
    "\n",
    "space = FunctionSpace(mesh, \"Lagrange\", 1)\n",
    "test = TestFunction(space)\n",
    "trial = TrialFunction(space)\n",
    "\n",
    "m = Function(space, name=\"m\")\n",
    "m.interpolate(X[0] * X[1])\n",
    "\n",
    "\n",
    "def forward(m):\n",
    "    u = Function(space, name=\"u\")\n",
    "    solve(inner(grad(trial), grad(test)) * dx == inner(m * m, test) * dx, u,\n",
    "          DirichletBC(space, 0.0, \"on_boundary\"))\n",
    "\n",
    "    J_sq = Functional(name=\"J_sq\")\n",
    "    J_sq.assign(inner(u, u) * dx)\n",
    "    J = np.sqrt(J_sq)\n",
    "    return J\n",
    "\n",
    "\n",
    "start_manager()\n",
    "J = forward(m)\n",
    "stop_manager()\n",
    "\n",
    "H = CachedHessian(J)\n",
    "\n",
    "zeta_0 = Function(space, name=\"zeta_0\")\n",
    "zeta_0.interpolate(sin(pi * X[0]) * sin(pi * X[1]))\n",
    "\n",
    "zeta_1 = Function(space, name=\"zeta_1\")\n",
    "zeta_1.interpolate(sin(pi * X[0]) * sin(2.0 * pi * X[1]))\n",
    "\n",
    "_, dJ_dm_zeta_0, d2J_dm_zeta_0_dm = H.action(m, zeta_0)\n",
    "_, dJ_dm_zeta_1, d2J_dm_zeta_1_dm = H.action(m, zeta_1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dc0f8ae8",
   "metadata": {},
   "source": [
    "## Assembly and solver caching\n",
    "\n",
    "### Using an `EquationSolver`\n",
    "\n",
    "The calculation for the Hessian action includes four discrete Poisson equations: one for the original forward calculation, one for the tangent-linear calculation, and one each for first and second order adjoint calculations. In this self-adjoint problem the finite element matrix &ndash; a stiffness matrix &ndash; is the *same* across all four calculations. Hence we can cache and reuse it. Moreover we can cache and reuse linear solver data &ndash; for example we can cache and reuse the Cholesky factorization.\n",
    "\n",
    "tlm_adjoint can apply such caching automatically, but we must interact directly with the object tlm_adjoint uses to record the solution of finite element variational problems &ndash; the `EquationSolver` previously seen when we used `manager_info()`. This looks like:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "04d2a68c",
   "metadata": {},
   "outputs": [],
   "source": [
    "from firedrake import *\n",
    "from tlm_adjoint.firedrake import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "clear_caches()\n",
    "\n",
    "mesh = UnitSquareMesh(10, 10)\n",
    "X = SpatialCoordinate(mesh)\n",
    "\n",
    "space = FunctionSpace(mesh, \"Lagrange\", 1)\n",
    "test = TestFunction(space)\n",
    "trial = TrialFunction(space)\n",
    "\n",
    "m = Function(space, name=\"m\")\n",
    "m.interpolate(X[0] * X[1])\n",
    "\n",
    "\n",
    "def forward(m):\n",
    "    u = Function(space, name=\"u\")\n",
    "    eq = EquationSolver(\n",
    "        inner(grad(trial), grad(test)) * dx == inner(m * m, test) * dx, u,\n",
    "        DirichletBC(space, 0.0, \"on_boundary\"),\n",
    "        solver_parameters={\"ksp_type\": \"preonly\",\n",
    "                           \"pc_type\": \"cholesky\"})\n",
    "    eq.solve()\n",
    "\n",
    "    J_sq = Functional(name=\"J_sq\")\n",
    "    J_sq.assign(inner(u, u) * dx)\n",
    "    J = np.sqrt(J_sq)\n",
    "    return J\n",
    "\n",
    "\n",
    "zeta = Function(space, name=\"zeta\")\n",
    "zeta.interpolate(sin(pi * X[0]) * sin(pi * X[1]))\n",
    "configure_tlm((m, zeta))\n",
    "\n",
    "start_manager()\n",
    "J = forward(m)\n",
    "stop_manager()\n",
    "\n",
    "dJ_dm_zeta = var_tlm(J, (m, zeta))\n",
    "\n",
    "d2J_dm_zeta_dm = compute_gradient(dJ_dm_zeta, m)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "35c85f0d",
   "metadata": {},
   "source": [
    "The key changes here are:\n",
    "\n",
    "- The use of `clear_caches`. This ensures that any previously cached data is cleared, avoiding memory leaks if the code is run more than once.\n",
    "- The instantiation of an `EquationSolver`, and the call to its `solve` method.\n",
    "\n",
    "If we query the relevant tlm_adjoint caches we find:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4023b690",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(f\"{len(assembly_cache())=}\")\n",
    "print(f\"{len(linear_solver_cache())=}\")\n",
    "\n",
    "assert len(assembly_cache()) == 1\n",
    "assert len(linear_solver_cache()) == 1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8c2a42f1",
   "metadata": {},
   "source": [
    "and in particular we see that tlm_adjoint has cached data associated with a single matrix, and has cached a single assembled object (which turns out to be the matrix itself). The latter is a stiffness matrix, and the former stores its Cholesky factorization. The factorization is used four times: in the forward, tangent-linear, and first and second order adjoint calculations.\n",
    "\n",
    "### Flagging data for caching\n",
    "\n",
    "Now consider the slightly different calculation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e8fbfb7c",
   "metadata": {},
   "outputs": [],
   "source": [
    "from firedrake import *\n",
    "from tlm_adjoint.firedrake import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "clear_caches()\n",
    "\n",
    "mesh = UnitSquareMesh(10, 10)\n",
    "X = SpatialCoordinate(mesh)\n",
    "\n",
    "space = FunctionSpace(mesh, \"Lagrange\", 1)\n",
    "test = TestFunction(space)\n",
    "trial = TrialFunction(space)\n",
    "\n",
    "m = Function(space, name=\"m\")\n",
    "m.interpolate(X[0] * X[1])\n",
    "\n",
    "one = Constant(1.0, name=\"one\")\n",
    "\n",
    "\n",
    "def forward(m):\n",
    "    u = Function(space, name=\"u\")\n",
    "    eq = EquationSolver(\n",
    "        one * inner(grad(trial), grad(test)) * dx == inner(m * m, test) * dx, u,\n",
    "        DirichletBC(space, 0.0, \"on_boundary\"),\n",
    "        solver_parameters={\"ksp_type\": \"preonly\",\n",
    "                           \"pc_type\": \"cholesky\"})\n",
    "    eq.solve()\n",
    "\n",
    "    J_sq = Functional(name=\"J_sq\")\n",
    "    J_sq.assign(inner(u, u) * dx)\n",
    "    J = np.sqrt(J_sq)\n",
    "    return J\n",
    "\n",
    "\n",
    "zeta = Function(space, name=\"zeta\")\n",
    "zeta.interpolate(sin(pi * X[0]) * sin(pi * X[1]))\n",
    "configure_tlm((m, zeta))\n",
    "\n",
    "start_manager()\n",
    "J = forward(m)\n",
    "stop_manager()\n",
    "\n",
    "dJ_dm_zeta = var_tlm(J, (m, zeta))\n",
    "\n",
    "d2J_dm_zeta_dm = compute_gradient(dJ_dm_zeta, m)\n",
    "\n",
    "print(f\"{len(assembly_cache())=}\")\n",
    "print(f\"{len(linear_solver_cache())=}\")\n",
    "\n",
    "assert len(assembly_cache()) == 0\n",
    "assert len(linear_solver_cache()) == 0"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "78f750ac",
   "metadata": {},
   "source": [
    "The only difference is the introduction of the multiplication by `one` on the left-hand-side of the finite element variational problem. However we now find that no matrix or linear solver data has been cached. The issue is that tlm_adjoint does not know that it should cache the results of calculations involving `one`.\n",
    "\n",
    "#### The 'cache' flag\n",
    "\n",
    "To resolve this, we can flag `one` for caching using `cache=True`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "41227987",
   "metadata": {},
   "outputs": [],
   "source": [
    "from firedrake import *\n",
    "from tlm_adjoint.firedrake import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "clear_caches()\n",
    "\n",
    "mesh = UnitSquareMesh(10, 10)\n",
    "X = SpatialCoordinate(mesh)\n",
    "\n",
    "space = FunctionSpace(mesh, \"Lagrange\", 1)\n",
    "test = TestFunction(space)\n",
    "trial = TrialFunction(space)\n",
    "\n",
    "m = Function(space, name=\"m\")\n",
    "m.interpolate(X[0] * X[1])\n",
    "\n",
    "one = Constant(1.0, name=\"one\", cache=True)\n",
    "\n",
    "\n",
    "def forward(m):\n",
    "    u = Function(space, name=\"u\")\n",
    "    eq = EquationSolver(\n",
    "        one * inner(grad(trial), grad(test)) * dx == inner(m * m, test) * dx, u,\n",
    "        DirichletBC(space, 0.0, \"on_boundary\"),\n",
    "        solver_parameters={\"ksp_type\": \"preonly\",\n",
    "                           \"pc_type\": \"cholesky\"})\n",
    "    eq.solve()\n",
    "\n",
    "    J_sq = Functional(name=\"J_sq\")\n",
    "    J_sq.assign(inner(u, u) * dx)\n",
    "    J = np.sqrt(J_sq)\n",
    "    return J\n",
    "\n",
    "\n",
    "zeta = Function(space, name=\"zeta\")\n",
    "zeta.interpolate(sin(pi * X[0]) * sin(pi * X[1]))\n",
    "configure_tlm((m, zeta))\n",
    "\n",
    "start_manager()\n",
    "J = forward(m)\n",
    "stop_manager()\n",
    "\n",
    "dJ_dm_zeta = var_tlm(J, (m, zeta))\n",
    "\n",
    "d2J_dm_zeta_dm = compute_gradient(dJ_dm_zeta, m)\n",
    "\n",
    "print(f\"{len(assembly_cache())=}\")\n",
    "print(f\"{len(linear_solver_cache())=}\")\n",
    "\n",
    "assert len(assembly_cache()) == 2\n",
    "assert len(linear_solver_cache()) == 1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38406c1d",
   "metadata": {},
   "source": [
    "We now see that tlm_adjoint has cached linear solver data associated with a single matrix. However assembly of two objects has been cached &ndash; it turns out there are now *two* cached matrices.\n",
    "\n",
    "The extra cached matrix appears in the tangent-linear calculations, involving the tangent-linear variable associated with `one` &ndash; a tangent-linear right-hand-side term has been converted into a matrix multiply using a *different* matrix. However in the above calculation we know that this tangent-linear variable must be zero, since the calculation for `one` doesn't depend on the control variable. The extra term in the tangent-linear calculation is similarly also known to be zero.\n",
    "\n",
    "#### The 'static' flag\n",
    "\n",
    "We can let tlm_adjoint know that `one` does not change, and resolve this inefficiency, by instead using `static=True`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2642124c",
   "metadata": {},
   "outputs": [],
   "source": [
    "from firedrake import *\n",
    "from tlm_adjoint.firedrake import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "clear_caches()\n",
    "\n",
    "mesh = UnitSquareMesh(10, 10)\n",
    "X = SpatialCoordinate(mesh)\n",
    "\n",
    "space = FunctionSpace(mesh, \"Lagrange\", 1)\n",
    "test = TestFunction(space)\n",
    "trial = TrialFunction(space)\n",
    "\n",
    "m = Function(space, name=\"m\")\n",
    "m.interpolate(X[0] * X[1])\n",
    "\n",
    "one = Constant(1.0, name=\"one\", static=True)\n",
    "\n",
    "\n",
    "def forward(m):\n",
    "    u = Function(space, name=\"u\")\n",
    "    eq = EquationSolver(\n",
    "        one * inner(grad(trial), grad(test)) * dx == inner(m * m, test) * dx, u,\n",
    "        DirichletBC(space, 0.0, \"on_boundary\"),\n",
    "        solver_parameters={\"ksp_type\": \"preonly\",\n",
    "                           \"pc_type\": \"cholesky\"})\n",
    "    eq.solve()\n",
    "\n",
    "    J_sq = Functional(name=\"J_sq\")\n",
    "    J_sq.assign(inner(u, u) * dx)\n",
    "    J = np.sqrt(J_sq)\n",
    "    return J\n",
    "\n",
    "\n",
    "zeta = Function(space, name=\"zeta\")\n",
    "zeta.interpolate(sin(pi * X[0]) * sin(pi * X[1]))\n",
    "configure_tlm((m, zeta))\n",
    "\n",
    "start_manager()\n",
    "J = forward(m)\n",
    "stop_manager()\n",
    "\n",
    "dJ_dm_zeta = var_tlm(J, (m, zeta))\n",
    "\n",
    "d2J_dm_zeta_dm = compute_gradient(dJ_dm_zeta, m)\n",
    "\n",
    "print(f\"{len(assembly_cache())=}\")\n",
    "print(f\"{len(linear_solver_cache())=}\")\n",
    "\n",
    "assert len(assembly_cache()) == 1\n",
    "assert len(linear_solver_cache()) == 1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5c96f6ae",
   "metadata": {},
   "source": [
    "Here `static=True` leads to `one` being flagged as a variable whose value is never updated. From this tlm_adjoint can infer that the relevant associated tangent-linear variable is zero, and avoid adding the zero-valued tangent-linear term.\n",
    "\n",
    "The key difference between using `cache=True` and `static=True` is that in the former the value of the variable *may* be updated. So long as tlm_adjoint is aware of the update (which happens, for example, when tlm_adjoint records a calculation) then updating the value of a variable invalidates cache entries, and invalidated cache entries are cleared automatically."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}