Solving the Black-Scholes equation: the finite differences method

  • Black-Scholes equation solved by Finite Difference method;
  • Discussion of discretization methods;
  • Matrix equation and computational solution;
  • How to apply boundary conditions;
  • Improving calculation grids.

Hello, all!
I have already discussed three paths we can use to derive the Black-Scholes equation:
$\frac{\partial V}{\partial t} + rS \frac{\partial V}{\partial S}+ \frac{(\sigma S)^2}{2} \frac{\partial^2 V}{\partial S^2}=rV$
where $V$ is the derivative value, $S$ is the asset (or index) price modeled as a Geometric Brownian motion, and $r$ is the risk-free interest rate.
But how can we solve it? There are several papers describing solutions of the BS equation. Analytically, the simplest method is to propose a solution of the form $V=h(S)g(t)$, the well-known separation of variables. However, this method is not general: only some payoffs admit it. Other, more elaborate solutions have been proposed in the literature, based on Hermite polynomials or decomposition techniques, for instance, and general limitations of analytical solutions have also been discussed.
Discretization
The holy grail of quantitative finance (or at least one of them) would be the existence of analytical solutions for any financial instrument one could create. This would significantly reduce the computational cost required by numerical methods, such as Monte Carlo, the Finite Element Method or Finite Differences. For now, as far as I know, such a dream has not come true.
In this context, this post discusses the Finite Difference method, which is one of the numerical methods that can be applied to solve the BS equation. As much as possible, I will try to keep the discussion general while still keeping it simple.

First steps

Among the numerical methods, I personally find the finite differences method the most intuitive. It is based on a discretization of the derivatives, more specifically using Taylor expansions. For example, consider a time derivative:
$V(t+ \Delta t) \approx V(t) + \frac{\partial V}{\partial t}\Delta t + O(\Delta t^2)$
Rearranging, we obtain:
$\frac{\partial V(t)}{\partial t} \approx \frac{V(t+ \Delta t) - V(t)}{\Delta t}$
where $O(\Delta t^2)$ denotes an error of second order in the time step $\Delta t$. This is a forward approximation. One could also perform a backward approximation, obtaining:
$\frac{\partial V(t)}{\partial t} \approx \frac{V(t) - V(t- \Delta t)}{\Delta t}$
Or even a centralized approximation, which combines backward and forward approximation:
$\frac{\partial V(t)}{\partial t} \approx \frac{V(t+ \Delta t) - V(t- \Delta t)}{2 \Delta t}$
The centralized approximation is more accurate than the forward or backward ones; however, its use is sometimes less convenient. We will see an example of that in the next section.
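As a quick numerical illustration (a minimal sketch in Python; the test function and step size are arbitrary choices of mine, not from the discussion above), one can compare the three approximations against a known derivative:
```python
import numpy as np

# Test function with a known derivative: f(t) = sin(t), f'(t) = cos(t)
f = np.sin
t, dt = 1.0, 1e-3

forward  = (f(t + dt) - f(t)) / dt
backward = (f(t) - f(t - dt)) / dt
centered = (f(t + dt) - f(t - dt)) / (2 * dt)
exact    = np.cos(t)

# The centralized approximation has a second-order error, so it is far more accurate
print("forward error :", abs(forward - exact))
print("backward error:", abs(backward - exact))
print("centered error:", abs(centered - exact))
```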

Getting started

The outline of the method is simple:
  1. rewrite the differential equation using the finite approximations for the derivatives;
  2. arrange a matrix equation that can be solved by some numerical linear algebra method.
Therefore, let's first rewrite the equation using the discrete approximations. Before going further, it is worth mentioning that the BS equation is a backward parabolic differential equation, which means one defines a terminal condition instead of an initial condition (this is one of the differences from modeling physical phenomena, where one usually works with initial conditions, and it brings some interesting theoretical differences). Due to this characteristic of the BS equation, I will use a backward difference approximation for the time derivative instead of a centralized one. For the first derivative with respect to the asset price, I will keep things simple and use a one-sided (backward) approximation. Therefore:
$\frac{V(S_j, t_i) - V(S_j, t_{i-1})}{\Delta t}+ rS_j \frac{V(S_j, t_i) - V(S_{j-1}, t_i)}{\Delta S}+ \frac{(\sigma S_j)^2}{2} \frac{V(S_{j+1}, t_i) - 2V(S_j, t_i) + V(S_{j-1}, t_i)}{\Delta S^2}=rV(S_j, t_i)$
The second derivative is also approximated using a Taylor expansion, but this time with a second-order approximation (try to derive it yourself; if you get stuck, the derivation is provided at the end of this post).
Now, we reorganize the equation above as:
$\alpha_j V(S_j, t_i) + \beta_j V(S_j, t_{i-1}) + \gamma_j V(S_{j-1}, t_i) + \delta_j V(S_{j+1}, t_i) = 0$
where:
$\alpha_j = -r+ \frac{1}{\Delta t} + \frac{rS_j}{\Delta S} - \left(\frac{\sigma S_j}{\Delta S} \right)^2$
$\beta_j = - \frac{1}{\Delta t}$
$\gamma_j = -\frac{rS_j}{\Delta S}+\frac{1}{2} \left(\frac{\sigma S_j}{\Delta S} \right)^2$
$\delta_j = \frac{1}{2} \left(\frac{\sigma S_j}{\Delta S} \right)^2$
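For concreteness, these coefficients can be precomputed as arrays over the price grid. The sketch below uses illustrative parameter values of my own (rate, volatility and grid sizes); only the formulas come from the equations above:
```python
import numpy as np

# Illustrative (hypothetical) market and grid parameters
r, sigma = 0.05, 0.2
S_max, T = 300.0, 1.0
N, M = 40, 80                         # number of steps in S and t
dS, dt = S_max / N, T / M
S = np.linspace(0.0, S_max, N + 1)    # S_j = j * dS

# One coefficient value per price node S_j, exactly as defined above
alpha = -r + 1.0 / dt + r * S / dS - (sigma * S / dS) ** 2
beta  = -np.ones(N + 1) / dt
gamma = -r * S / dS + 0.5 * (sigma * S / dS) ** 2
delta = 0.5 * (sigma * S / dS) ** 2
```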
Once we have the partial differential equation in terms of finite differences, we can reorganize the elements in a matrix equation of the type:
$AX=B$
where $A$ contains the elements $\alpha, \beta, \gamma, \delta$, $X$ contains the unknown variables, and $B$ contains the boundary conditions. This equation can be solved numerically by linear algebra packages. In Python, one can use numpy.linalg.tensorsolve, for instance. In C++, one can use Boost, Arpack, among several other libraries.
Matrices $A$ and $X$ can be given by:
$A = \left[ \begin{array}{ccccccccccc}...\\ 0 & 0 & ... & \beta_j & ... & \gamma_j & \alpha_j & \delta_j & ... & 0 & 0\\  0 & 0 & 0 & ... & \beta_{j+n} & ... & \gamma_{j+n} & \alpha_{j+n }& \delta_{j+n} & ... & 0 \\ ... \end{array} \right]$
and:
$X= \left[ \begin{array}{c} V(S_0, t_0)\\V(S_1, t_0)\\...\\V(S_0,t_{1})\\V(S_1, t_1)\\...\\V(S_{j-1},t_i)\\V(S_j,t_i)\\V(S_{j+1},t_i)\\...\\V(S_N, t_M)  \end{array} \right]$
where $N$ and $M$ denote the last indices of the asset-price and time grids, respectively. We can now discuss how to fill matrix $B$, which corresponds to the boundary conditions.
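To see where the non-zero entries of a single PDE row of $A$ land, here is a small sketch using my own flattening convention, chosen to match the ordering of $X$ shown above ($S$ varying fastest within each time level):
```python
N, M = 40, 80                         # illustrative grid sizes

def idx(j, i):
    """Flat position of V(S_j, t_i) in X, with S varying fastest within a time level."""
    return i * (N + 1) + j

# The PDE row written at the grid point (S_j, t_i) has only four non-zero
# entries, one per unknown it couples; this is why A is sparse and banded.
j, i = 10, 5
print("alpha_j column:", idx(j, i))       # multiplies V(S_j,     t_i)
print("beta_j  column:", idx(j, i - 1))   # multiplies V(S_j,     t_{i-1})
print("gamma_j column:", idx(j - 1, i))   # multiplies V(S_{j-1}, t_i)
print("delta_j column:", idx(j + 1, i))   # multiplies V(S_{j+1}, t_i)
```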

Boundary conditions

It is worth remembering that one can solve the Black-Scholes equation for any derivative whose payoff occurs only at maturity, without intermediate exercise dates (some exceptions are possible). What is particular to a given financial instrument is its payoff function, which directly determines the boundary conditions one should use in the numerical solution of the problem.
In general, we will deal with two types of boundary conditions, named Dirichlet and Neumann. A Dirichlet condition is defined in terms of the variable itself, such as $V(S=0, T)=a$. A Neumann condition is defined in terms of the derivative of the unknown variable, such as $\frac{\partial V(S=0, T)}{\partial S}=b$. These conditions are called homogeneous when they equal zero, that is, when $a=0$ or $b=0$. Another interesting case is the Robin condition, which is given by a linear combination of a Dirichlet and a Neumann condition.
Let's start with the Dirichlet condition, which is easier to insert in the matrix equation. In matrix $A$, one inserts a row full of zeros except at the position that multiplies the corresponding variable in $X$. In matrix $B$, we simply insert the value of the condition at the corresponding position. The other elements of $B$ that are not associated with any boundary condition are zero. For instance, let's suppose that the conditions $V(S_0, t_0)=a$, $V(S_1, t_0)=b$, and $V(S_2, t_0)=c$ hold; therefore:
$\left[ \begin{array}{ccccccccccc}1 & 0 & 0 & 0 & ...\\ 0 & 1 & 0 & 0 & ... \\ 0 & 0 & 1 & 0 & ...\\ 0 & 0 & ... & \beta_j & ... & \gamma_j & \alpha_j & \delta_j & ... & 0 & 0\\  0 & 0 & 0 & ... & \beta_{j+n} & ... & \gamma_{j+n} & \alpha_{j+n} & \delta_{j+n} & ... & 0 \\ ... \end{array} \right] \left[ \begin{array}{c} V(S_0, t_0)\\V(S_1, t_0)\\V(S_2, t_0)\\...\\V(S_0,t_{1})\\V(S_1, t_1)\\...\\V(S_{j-1},t_i)\\V(S_j,t_i)\\V(S_{j+1},t_i)\\...\\V(S_N, t_M)  \end{array} \right] = \left[ \begin{array}{c} a\\b\\c\\0\\0\\...\\0\\0 \end{array} \right]$
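To make this concrete, here is a minimal sketch that assembles and solves the full system for a European call, assuming Dirichlet conditions of my own choosing: the call payoff as the terminal condition, and $V(0,t)=0$ and $V(S_{max},t)=S_{max}-Ke^{-r(T-t)}$ at the price boundaries. I use scipy.sparse instead of a dense numpy array only to keep memory modest. Note also that, since only $\beta_j$ involves the earlier time level, this particular discretization behaves like an explicit scheme, so $\Delta t$ has to be kept small (roughly $\Delta t \lesssim \Delta S^2/(\sigma S_{max})^2$) for the solution to remain stable.
```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

# Illustrative (hypothetical) parameters for a European call
r, sigma, K, T = 0.05, 0.2, 100.0, 1.0
S_max, N, M = 300.0, 40, 80                  # grid indices: j = 0..N in S, i = 0..M in t
dS, dt = S_max / N, T / M
S = np.linspace(0.0, S_max, N + 1)
t = np.linspace(0.0, T, M + 1)

def idx(j, i):
    """Flat position of V(S_j, t_i) in X, with S varying fastest."""
    return i * (N + 1) + j

n = (N + 1) * (M + 1)
A = lil_matrix((n, n))
B = np.zeros(n)

# Coefficients of the discretized BS equation
alpha = -r + 1.0 / dt + r * S / dS - (sigma * S / dS) ** 2
beta  = -1.0 / dt
gamma = -r * S / dS + 0.5 * (sigma * S / dS) ** 2
delta = 0.5 * (sigma * S / dS) ** 2

# PDE rows: the equation written at (S_j, t_i) couples t_i and t_{i-1},
# so it is used as the row that determines the unknown V(S_j, t_{i-1})
for i in range(1, M + 1):
    for j in range(1, N):
        row = idx(j, i - 1)
        A[row, idx(j, i)]     = alpha[j]
        A[row, idx(j, i - 1)] = beta
        A[row, idx(j - 1, i)] = gamma[j]
        A[row, idx(j + 1, i)] = delta[j]
        # B[row] stays 0

# Terminal condition (Dirichlet): call payoff at maturity
for j in range(N + 1):
    A[idx(j, M), idx(j, M)] = 1.0
    B[idx(j, M)] = max(S[j] - K, 0.0)

# Dirichlet boundaries in S for all earlier times:
# V(0, t) = 0 and V(S_max, t) = S_max - K * exp(-r * (T - t))
for i in range(M):
    A[idx(0, i), idx(0, i)] = 1.0            # B[idx(0, i)] stays 0
    A[idx(N, i), idx(N, i)] = 1.0
    B[idx(N, i)] = S_max - K * np.exp(-r * (T - t[i]))

X = spsolve(A.tocsc(), B)
V = X.reshape(M + 1, N + 1)                  # V[i, j] = V(S_j, t_i)
print("call value today near S = K:", np.interp(K, S, V[0]))
```
In practice one would exploit the sparsity further (each time level can actually be solved on its own), but the single global system above follows the matrix layout described in this post.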

Neumann conditions are more difficult to handle. In fact, there are several methods to deal with this type of boundary condition. In general, all of them are based on a finite difference approximation of the boundary condition, for instance:
$\frac{\partial V(S=S_x, t_0)}{\partial S}=g \to \frac{V(S_{4}, t_0) - V(S_{3}, t_0)}{\Delta S} \approx g$
where $S_x \in [S_3, S_{4}]$. The methods differ in the accuracy of the approximation and in where $S_x$ is placed within the discretized cell. Considering the example above, one could modify the matrix equation by replacing the elements of $A$ with the "modified" elements derived from the equation corresponding to the Neumann condition. Therefore:
$\left[ \begin{array}{ccccccccccc}1 & 0 & 0 & 0 & ...\\ 0 & 1 & 0 & 0 & ... \\ 0 & 0 & 1 & 0 & ...\\ 0 & 0 & 0 & -1 & 1 & 0 & ...\\ 0 & 0 & ... & \beta_j & ... & \gamma_j & \alpha_j & \delta_j & ... & 0 & 0\\  0 & 0 & 0 & ... & \beta_{j+n} & ... & \gamma_{j+n} & \alpha_{j+n} & \delta_{j+n} & ... & 0 \\ ... \end{array} \right] \left[ \begin{array}{c} V(S_0, t_0)\\V(S_1, t_0)\\V(S_2, t_0)\\ V(S_3, t_0) \\ V(S_4, t_0) \\...\\V(S_0,t_{1})\\V(S_1, t_1)\\...\\V(S_{j-1},t_i)\\V(S_j,t_i)\\V(S_{j+1},t_i)\\...\\V(S_N, t_M)  \end{array} \right] = \left[ \begin{array}{c} a\\b\\c\\ g \Delta S\\0\\0\\...\\0\\0 \end{array} \right]$
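As a minimal sketch of this row-replacement approach (the array size and the value of $g$ are placeholders of mine), the Neumann row $V(S_4, t_0) - V(S_3, t_0) = g\,\Delta S$ can be written as:
```python
import numpy as np

# Placeholder size and values, only to show how the Neumann row is written
n, dS, g = 10, 7.5, 0.2
A = np.zeros((n, n))
B = np.zeros(n)

# In the ordering of X above, V(S_3, t_0) and V(S_4, t_0) sit at flat
# positions 3 and 4, so the fourth row of the system encodes
# V(S_4, t_0) - V(S_3, t_0) = g * dS
A[3, 3] = -1.0
A[3, 4] = 1.0
B[3] = g * dS
```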
Another very common way to handle Neumann conditions is through the use of "ghost" points. These are auxiliary points located outside the domain of study, used only to enforce the boundary condition. Check this paper and references therein for very interesting examples (for elliptic rather than parabolic equations, but the main idea remains the same).

Improving calculation grids

So far, my description has assumed a uniform discretization in the asset price ($\Delta S$) and in time ($\Delta t$). However, this is usually not optimal in terms of either computational cost or accuracy. In real-world applications, one uses grids with variable spacing between the points, with selective refinement in certain situations that I will discuss in the next paragraphs. Before going further, it is worth mentioning that variable spacing can easily be included in the discretized BS equation by simply adding an index to $\Delta S_j$ and $\Delta t_i$.
In order to decide where the grid should be refined, one must estimate the greeks of the financial instrument. For completeness, the greeks are the derivatives of the price of the financial instrument (or portfolio) with respect to market variables. For instance, Delta is the derivative of the instrument price with respect to the asset price, Vega is the derivative with respect to the volatility, and Theta is the derivative with respect to time. For the sake of our discussion regarding grids, Delta and Theta are the relevant ones.
The rule of thumb relating the greeks and the numerical grid is: the larger the magnitude of a greek, the more refined the grid must be in the corresponding variable. For the simple case of European vanilla calls and puts, for instance, one must refine the time grid for short times to maturity. In addition, these instruments also require more refinement in the asset-price grid when the spot price is close to the strike price. In fact, this can be summarized in a generalized rule of thumb for grids:
Short times to maturity / Options at-the-money $\to$ refined grids.
The rule above also holds for other numerical methods, such as Monte Carlo or Finite Element methods.
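To illustrate, here is a sketch of one way to build such refined grids. The hyperbolic-sine clustering around the strike and the quadratic stretching toward maturity are choices of mine for this example, not the only options:
```python
import numpy as np

# Illustrative parameters (my own choices)
K, S_max, T = 100.0, 300.0, 1.0
N, M = 40, 80
c = 10.0                                   # clustering strength around the strike

# Price grid clustered around the strike K via a sinh transform
xi = np.linspace(np.arcsinh(-K / c), np.arcsinh((S_max - K) / c), N + 1)
S = K + c * np.sinh(xi)                    # S[0] = 0, S[-1] = S_max, dense near K

# Time grid clustered near maturity T (short times to maturity)
u = np.linspace(0.0, 1.0, M + 1)
t = T * (1.0 - (1.0 - u) ** 2)             # t[0] = 0, t[-1] = T, dense near T

dS_j = np.diff(S)                          # variable spacings Delta S_j
dt_i = np.diff(t)                          # variable spacings Delta t_i
print("dS range:", dS_j.min(), "-", dS_j.max())
print("dt range:", dt_i.min(), "-", dt_i.max())
```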

I hope you enjoyed the post. Leave your comments and share!

Live long and prosper!

#=================================
Diogo de Moura Pedroso

Appendix: Second order derivative

The second-order derivative in the finite differences method can be easily obtained from Taylor expansions. Consider:
$V(x_{i+1}) \approx V(x_i) + \frac{\partial V}{\partial x}\Delta x + \frac{1}{2} \frac{\partial^2 V}{\partial x^2}\Delta x^2 + O(\Delta x^3)$
and:
$V(x_{i-1}) \approx V(x_i) - \frac{\partial V}{\partial x}\Delta x + \frac{1}{2} \frac{\partial^2 V}{\partial x^2}\Delta x^2 + O(\Delta x^3)$
summing these equations one obtains:
$V(x_{i+1})+V(x_{i-1}) \approx 2V(x_i) + \frac{\partial^2 V}{\partial x^2}\Delta x^2$
Then, finally:
$\frac{\partial^2 V}{\partial x^2} \approx \frac{V(x_{i+1})-2V(x_i)+V(x_{i-1})}{\Delta x^2}$
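A quick numerical check of this formula (a minimal sketch; the test function is an arbitrary choice of mine):
```python
import numpy as np

# Test function with a known second derivative: f(x) = exp(x), f''(x) = exp(x)
f = np.exp
x, dx = 1.0, 1e-3

second = (f(x + dx) - 2 * f(x) + f(x - dx)) / dx**2
print("approximation:", second)
print("exact        :", np.exp(x))
print("error        :", abs(second - np.exp(x)))
```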
