2nd Order Methods

Suppose we have a function $f: \mathbb{R^n} \to \mathbb{R}$. Using dual numbers we can input a dual vector $x + J\epsilon$ and get out a dual number. This induces a function $g: \mathbb{R}^n \to \mathbb{R}^n$ defined as $g(x) \to \text{Dual parts of }f(x + J\epsilon)$. If we now input $y + I\epsilon'$ into $g$, where $\epsilon_i\epsilon'_j \neq 0$ for all $1 \leq i, j \leq n$, the $\epsilon_i\epsilon'_j$ terms give us the Hessian coefficients.

In application, if we allow definitions `DualVector(::DualVector, ::AbstractMatrix)` and `Dual(::Dual, ::DualVector)`, this should have the same effect, and would open up discussion or examples concerning 2nd order methods in `DualArrays.jl`.

Example:
- `f(x) = x[1] * x[2]`
- `g(x) = f(DualVector(x, I(length(x)))).partials`
- `d = DualVector([1, 2], I(2))`
- In order to evaluate `g(d)`, we start with a `DualVector(d, I(2))`
- We then pass through `f` giving us `Dual(d[1], [1, 0]) * Dual(d[2], [0, 1])`
- The product rule makes this evaluate to `Dual(d[1] * d[2], [d[2], d[1]])` (Note: we can overload operations such that `[d[2], d[1]] = d[2] * [1,0] + d[1] * [0,1]` returns a `DualVector`. In general, for the latter argument to be a `DualVector` instead of a `Vector{Dual}` operations between `Dual` and `AbstractVector` also need to be overloaded)
- The jacobian of `[d[2], d[1]]` is actually a matrix of nested dual parts, so this is our Hessian. We verify from the definition of `d` that this is the correct Hessian, `[0 1;1 0]`.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

2nd Order Methods #19

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

2nd Order Methods #19

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions