{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "40c3ebad",
   "metadata": {},
   "source": [
    "# Getting started with tlm_adjoint\n",
    "\n",
    "This notebook introduces derivative calculations using tlm_adjoint.\n",
    "\n",
    "tlm_adjoint is primarily intended for first derivative calculations using the adjoint method, and Hessian information calculations using the adjoint method applied to tangent-linear and forward calculations. However the approach used by tlm_adjoint generalizes to higher order.\n",
    "\n",
    "The approach used by tlm_adjoint for higher order differentiation is described in:\n",
    "\n",
    "- James R. Maddison, Daniel N. Goldberg, and Benjamin D. Goddard, 'Automated calculation of higher order partial differential equation constrained derivative information', SIAM Journal on Scientific Computing, 41(5), pp. C417&ndash;C445, 2019, doi: 10.1137/18M1209465\n",
    "\n",
    "## A floating point example\n",
    "\n",
    "We consider a simple floating point calculation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0c79382a",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "x = 1.0\n",
    "y = 0.25 * np.pi\n",
    "z = x * np.sin(y * np.exp(y))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e82699c6",
   "metadata": {},
   "source": [
    "tlm_adjoint is designed for high-level algorithmic differentiation, and not this type of low-level floating point calculation. However it *can* still process simple floating point calculations, so to introduce the key ideas we do that here. We consider differentiating `z` with respect to `x` and `y` &ndash; being precise, we mean computing the derivative of the function used to compute `z` with respect to the variables defined by `x` and `y`.\n",
    "\n",
    "## Adding tlm_adjoint\n",
    "\n",
    "We first modify the code so that tlm_adjoint processes the calculations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dfb3955a",
   "metadata": {},
   "outputs": [],
   "source": [
    "from tlm_adjoint import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "\n",
    "\n",
    "def forward(x, y):\n",
    "    z = x * np.sin(y * np.exp(y))\n",
    "    return z\n",
    "\n",
    "\n",
    "x = Float(1.0, name=\"x\")\n",
    "y = Float(0.25 * np.pi, name=\"y\")\n",
    "\n",
    "start_manager()\n",
    "z = forward(x, y)\n",
    "stop_manager()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4bb98432",
   "metadata": {},
   "source": [
    "The key changes here are:\n",
    "\n",
    "- To import tlm_adjoint.\n",
    "- Controlling the 'manager' &ndash; an object tlm_adjoint uses to process equations. The manager is first reset using `reset_manager`. This clears any previous processing, and disables the manager. `start_manager` and `stop_manager` are then used to enable the manager just when it is needed.\n",
    "- Defining `x` and `y` to be of type `Float`. Calculations involving `x` and `y` are recorded by the manager. The result of the calculations &ndash; here `z` &ndash; will have the same type, and we can access its value with `float(z)`.\n",
    "\n",
    "Let's display the information recorded by tlm_adjoint:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ba11e403",
   "metadata": {},
   "outputs": [],
   "source": [
    "manager_info()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a0138110",
   "metadata": {},
   "source": [
    "The key feature here is that there are four `FloatEquation` records, corresponding to the four floating point calculations &ndash; evaluation using `np.exp` and `np.sin`, and two multiplications.\n",
    "\n",
    "## Computing derivatives using an adjoint\n",
    "\n",
    "The `compute_gradient` function can be used to differentiate `z` with respect to `x` and `y` using the adjoint method:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "66f6f440",
   "metadata": {},
   "outputs": [],
   "source": [
    "dz_dx, dz_dy = compute_gradient(z, (x, y))\n",
    "\n",
    "print(f\"{float(dz_dx)=}\")\n",
    "print(f\"{float(dz_dy)=}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1796fdfd",
   "metadata": {},
   "source": [
    "For a simple check of the result, we can compare with the result from finite differencing, here using second order centered finite differencing:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "28d0c8a0",
   "metadata": {},
   "outputs": [],
   "source": [
    "def dJ_dm(J, m, *, eps=1.0e-7):\n",
    "    return (J(m + eps) - J(m - eps)) / (2.0 * eps)\n",
    "\n",
    "\n",
    "print(f\"dz_dx approximation = {dJ_dm(lambda x: forward(x, float(y)), float(x))}\")\n",
    "print(f\"dz_dy approximation = {dJ_dm(lambda y: forward(float(x), y), float(y))}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4ef1b47b",
   "metadata": {},
   "source": [
    "## Computing derivatives using a tangent-linear\n",
    "\n",
    "A tangent-linear computes directional derivatives with respect to a given control and with a given direction.\n",
    "\n",
    "Here we consider the forward to be a function of `x` and `y`, computing a value of `z`, denoted $z \\left( x, y \\right)$. We consider the control $m = \\left( x, y \\right)^T$, and a direction $\\zeta = \\left( 2, 3 \\right)^T$. We can then use a tangent-linear to compute the directional derivative\n",
    "\n",
    "$$\n",
    "    \\frac{dz}{dm} \\zeta = 2 \\frac{dz}{dx} + 3 \\frac{dz}{dy},\n",
    "$$\n",
    "\n",
    "where vector derivatives are notated using row vectors.\n",
    "\n",
    "In most cases tlm_adjoint needs to be told what tangent-linear calculations to perform *ahead* of the forward calculations. `configure_tlm` provides this information to tlm_adjoint:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0220eb94",
   "metadata": {},
   "outputs": [],
   "source": [
    "from tlm_adjoint import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "\n",
    "\n",
    "def forward(x, y):\n",
    "    z = x * np.sin(y * np.exp(y))\n",
    "    return z\n",
    "\n",
    "\n",
    "x = Float(1.0, name=\"x\")\n",
    "y = Float(0.25 * np.pi, name=\"y\")\n",
    "\n",
    "m = (x, y)\n",
    "zeta = (Float(2.0, name=\"zeta_x\"), Float(3.0, name=\"zeta_y\"))\n",
    "configure_tlm((m, zeta))\n",
    "\n",
    "start_manager()\n",
    "z = forward(x, y)\n",
    "stop_manager()\n",
    "\n",
    "dz_dm_zeta = var_tlm(z, (m, zeta))\n",
    "\n",
    "print(f\"{float(dz_dm_zeta)=}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c00043a7",
   "metadata": {},
   "source": [
    "There are three new changes:\n",
    "\n",
    "- The control $m$ and direction $\\zeta$ are defined using `m` and `zeta` respectively.\n",
    "- Tangent-linear calculations are configured *before* the forward calculations are performed, using `configure_tlm`. Note that here, for this first derivative calculation, the argument is a single control-direction pair.\n",
    "- We access the tangent-linear variable, containing the value of $\\left( dz/dm \\right) \\zeta$, using `var_tlm`, and using the same control-direction pair.\n",
    "\n",
    "In fact more has happened here &ndash; if we now display the information recorded by tlm_adjoint:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5b2a2150",
   "metadata": {},
   "outputs": [],
   "source": [
    "manager_info()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "527f3b8b",
   "metadata": {},
   "source": [
    "we now see that there are *eight* `FloatEquation` records &ndash; the original four, and four new ones. The extra ones correspond to the tangent-linear calculations. tlm_adjoint has recorded the original forward calculations, and *also* recorded the tangent-linear calculations.\n",
    "\n",
    "## Computing second derivatives using an adjoint of a tangent-linear\n",
    "\n",
    "Since tlm_adjoint has recorded both forward and tangent-linear calculations, we can now compute second derivative information using an adjoint associated with a tangent-linear. Specifically we can compute a Hessian action on $\\zeta$,\n",
    "\n",
    "$$\n",
    "    H \\zeta = \\frac{d}{dm} \\left( \\frac{dz}{dm} \\zeta \\right)^T.\n",
    "$$\n",
    "\n",
    "The inner directional derivative appearing here is computed using the tangent-linear method, and the outer derivative is computed by applying the adjoint method to the tangent-linear and forward calculations. In code we simply use `compute_gradient` to compute the derivative of the tangent-linear result:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8612e275",
   "metadata": {},
   "outputs": [],
   "source": [
    "from tlm_adjoint import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "\n",
    "\n",
    "def forward(x, y):\n",
    "    z = x * np.sin(y * np.exp(y))\n",
    "    return z\n",
    "\n",
    "\n",
    "x = Float(1.0, name=\"x\")\n",
    "y = Float(0.25 * np.pi, name=\"y\")\n",
    "\n",
    "m = (x, y)\n",
    "zeta = (Float(2.0, name=\"zeta_x\"), Float(3.0, name=\"zeta_y\"))\n",
    "configure_tlm((m, zeta))\n",
    "\n",
    "start_manager()\n",
    "z = forward(x, y)\n",
    "stop_manager()\n",
    "\n",
    "dz_dm_zeta = var_tlm(z, (m, zeta))\n",
    "\n",
    "print(f\"{float(dz_dm_zeta)=}\")\n",
    "\n",
    "d2z_dm_zeta_dx, d2z_dm_zeta_dy = compute_gradient(dz_dm_zeta, m)\n",
    "\n",
    "print(f\"{float(d2z_dm_zeta_dx)=}\")\n",
    "print(f\"{float(d2z_dm_zeta_dy)=}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b4b0fac5",
   "metadata": {},
   "source": [
    "## Computing second derivatives using a tangent-linear of a tangent-linear\n",
    "\n",
    "We can also compute second derivative information using a tangent-linear associated with a tangent-linear. For example if we define $e_1 = \\left( 1, 0 \\right)^T$, then we can find the first component of the previously computed Hessian action on $\\zeta$ via\n",
    "\n",
    "$$\n",
    "    e_1^T H \\zeta = \\left[ \\frac{d}{dm} \\left( \\frac{dz}{dm} \\zeta \\right) \\right] e_1.\n",
    "$$\n",
    "\n",
    "That is, here we now want to compute a directional derivative of a directional derivative, and we compute this using a tangent-linear associated with the previous tangent-linear and forward calculations.\n",
    "\n",
    "tlm_adjoint handles this case by supplying more arguments to `configure_tlm`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5c02e84f",
   "metadata": {},
   "outputs": [],
   "source": [
    "from tlm_adjoint import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "\n",
    "\n",
    "def forward(x, y):\n",
    "    z = x * np.sin(y * np.exp(y))\n",
    "    return z\n",
    "\n",
    "\n",
    "x = Float(1.0, name=\"x\")\n",
    "y = Float(0.25 * np.pi, name=\"y\")\n",
    "\n",
    "m = (x, y)\n",
    "zeta = (Float(2.0, name=\"zeta_x\"), Float(3.0, name=\"zeta_y\"))\n",
    "e_1 = (Float(1.0, name=\"e_1_x\"), Float(0.0, name=\"e_1_y\"))\n",
    "configure_tlm((m, zeta), (m, e_1))\n",
    "\n",
    "start_manager()\n",
    "z = forward(x, y)\n",
    "stop_manager()\n",
    "\n",
    "dz_dm_zeta = var_tlm(z, (m, zeta))\n",
    "dz_dx = var_tlm(z, (m, e_1))\n",
    "d2z_dm_zeta_dx = var_tlm(z, (m, zeta), (m, e_1))\n",
    "\n",
    "print(f\"{float(dz_dm_zeta)=}\")\n",
    "print(f\"{float(dz_dx)=}\")\n",
    "print(f\"{float(d2z_dm_zeta_dx)=}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "88b3a4f1",
   "metadata": {},
   "source": [
    "The first control-direction pair supplied to `configure_tlm` indicates that we seek to compute directional derivatives of the forward with respect to the control defined by `m` with direction defined by `zeta`. The second control-direction pair indicates that we seek to compute directional deriatives of *these* directional derivatives, with respect to the control defined by `m` and with direction defined by `e_1`. As a side-effect we find that we also compute the directional derivatives of the forward with respect to the control defined by `m` with direction defined by `e_1`.\n",
    "\n",
    "We then access the tangent-linear variables using `var_tlm`, supplying two control-variable pairs to access a second order tangent-linear variable.\n",
    "\n",
    "As before, tlm_adjoint has not just performed the tangent-linear calculations &ndash; if we display the information recorded by tlm_adjoint:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cb6739a1",
   "metadata": {},
   "outputs": [],
   "source": [
    "manager_info()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1e4fde63",
   "metadata": {},
   "source": [
    "we now find that there are *sixteen* `FloatEquation` records, constituting the forward and all tangent-linear calculations.\n",
    "\n",
    "## Computing third derivatives using an adjoint of a tangent-linear of a tangent-linear\n",
    "\n",
    "We can now compute the derivative of the tangent-linear-computed second derivative by simply handing the second order tangent-linear variable to `compute_gradient`. This applies the adjoint method to the higher order tangent-linear calculations and the forward calculations, computing\n",
    "\n",
    "$$\n",
    "    \\frac{d}{dm} \\left( e_1^T H \\zeta \\right) = \\frac{d}{dm} \\left[ \\left[ \\frac{d}{dm} \\left( \\frac{dz}{dm} \\zeta \\right) \\right] e_1 \\right].\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c60bc4b8",
   "metadata": {},
   "outputs": [],
   "source": [
    "from tlm_adjoint import *\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "reset_manager()\n",
    "\n",
    "\n",
    "def forward(x, y):\n",
    "    z = x * np.sin(y * np.exp(y))\n",
    "    return z\n",
    "\n",
    "\n",
    "x = Float(1.0, name=\"x\")\n",
    "y = Float(0.25 * np.pi, name=\"y\")\n",
    "\n",
    "m = (x, y)\n",
    "zeta = (Float(2.0, name=\"zeta_x\"), Float(3.0, name=\"zeta_y\"))\n",
    "e_1 = (Float(1.0, name=\"e_1_x\"), Float(0.0, name=\"e_1_y\"))\n",
    "configure_tlm((m, zeta), (m, e_1))\n",
    "\n",
    "start_manager()\n",
    "z = forward(x, y)\n",
    "stop_manager()\n",
    "\n",
    "dz_dm_zeta = var_tlm(z, (m, zeta))\n",
    "dz_dx = var_tlm(z, (m, e_1))\n",
    "d2z_dm_zeta_dx = var_tlm(z, (m, zeta), (m, e_1))\n",
    "\n",
    "print(f\"{float(dz_dm_zeta)=}\")\n",
    "print(f\"{float(dz_dx)=}\")\n",
    "print(f\"{float(d2z_dm_zeta_dx)=}\")\n",
    "\n",
    "d3z_dm_zeta_dx_dx, d3z_dm_zeta_dx_dy = compute_gradient(d2z_dm_zeta_dx, m)\n",
    "\n",
    "print(f\"{float(d3z_dm_zeta_dx_dx)=}\")\n",
    "print(f\"{float(d3z_dm_zeta_dx_dy)=}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1a3e919",
   "metadata": {},
   "source": [
    "## Higher order\n",
    "\n",
    "The approach now generalizes.\n",
    "\n",
    "- Supplying further arguments to `configure_tlm` indicates directional derivatives of directional derivatives, defining a tangent-linear calculation of increasingly high order.\n",
    "- Supplying these arguments to `var_tlm` accesses the higher-order tangent-linear variables.\n",
    "- These higher order tangent-linear variables can be handed to `compute_gradient` to compute derivatives of the higher order derivatives using the adjoint method."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}