Adjoint Greeks V – Source Code Transformation

In my last post on automatic differentiation we saw that the operator overloading approach, in some situations, quite naturally comes with a dramatic performance penalty.

An alternative approach that might do better is called source code transformation. Here the code is differentiated at compile time. While there are quite a few operator overloading tools that work on existing code bases more or less out of the box, this does not seem to be the case for source code transformation tools.

I looked around for quite a while but did not really find a mature framework for C++. There is one under development (ADIC 2.0) that should be mentioned. There is also a quite old version of the same project for C (ADIC 1.x).

There is a Fortran framework though, called OpenAD/F. Quite interestingly, this framework separates the abstract AD engine (“OpenAD”) from the language-specific parts (“/F”), using an XML representation for the computational core to be differentiated. So in principle you can implement a front end for C++ replacing the Fortran front end and reuse the algorithms of OpenAD. This is actually what ADIC does.

Still I have this faint idea of using the intermediate representation (IR) of the LLVM compiler infrastructure to interface with OpenAD. There are already backends that can translate IR back to C, so the “only” thing left to do would roughly be the transformation of IR to XAIF and back, plus the implementation of a runtime support library. Very roughly. I certainly do not have enough experience in either AD or the IR to really assess this, nor the time to delve into it. Anyway.

I played around with OpenAD/F some months ago. I solved a toy PDE for American option pricing with Fortran code written from scratch. The results were promising in terms of performance. So here is the plan for benefiting from that in a production setting:

The idea is that a typical core pricing algorithm is actually not too big and can effectively be excavated from the rest of the code.

If this is the case we could reimplement this computational core in Fortran, apply OpenAD/F to differentiate it and use the differentiated code via a plain interface from QuantLib pricing engines to retrieve adjoint greeks.

This post aims to present a minimal example for that procedure to prove the technical feasibility. Later I will try to apply this to the Bermudan Swaption example I mentioned above.

The libraries are organized as the usual QuantLib shared object library and a Fortran shared object library that contains the differentiated code for the numerical cores. The minimal example library is called simplelibad (simple lib ad). An application will then typically be a usual C++ program linking against QuantLib and the new AD library.

You can find the example code here. It comes with a make file that by default builds a differentiated version of the library functions. You can also run the target plain to build only the original functions without AD, which seems useful for testing the ported code against the original one before doing anything fancy.

Actually only one function is provided in our simplelib, which computes a discount factor from a given continuously compounded zero yield and a day count fraction. The code looks as follows (sorry, no syntax highlighting)

subroutine discount(zeroyield, dcf, df)
  implicit none
  double precision:: zeroyield, dcf, df
  !$openad INDEPENDENT(zeroyield)
  df = exp(-zeroyield * dcf)
  !$openad DEPENDENT(df)
end subroutine discount

To keep the interface simple no return parameter is present; the result is written to the last argument instead.

There are two OpenAD-specific lines starting with !$openad which declare the active variables. The make file invokes a wrapper script openad that is shipped with OpenAD as a simple driver for the actual tools

openad -c -m f simplelib.f90

This produces the differentiated version of the above code. The option -m f sets the mode to forward. The resulting code is then compiled into an object file

gfortran -g -O3 -o simplelib.pre.xb.x2w.w2f.post.o -c simplelib.pre.xb.x2w.w2f.post.f90 -fpic

The same is done with some additional runtime support files provided by OpenAD

gfortran -g -O3 -o w2f__types.o -c w2f__types.f90 -fpic
gfortran -g -O3 -o OAD_active.o -c OAD_active.f90 -fpic

Finally we need a driver that acts as the interface to QuantLib later on

subroutine discount_ad(zeroyield, dcf, df, ddf)
  use OAD_active
  implicit none
  external discount
  double precision:: zeroyield, dcf, df, ddf
  type(active):: zeroyield_ad, res
  zeroyield_ad%v = zeroyield
  zeroyield_ad%d = 1.0
  call discount(zeroyield_ad, dcf, res)
  df = res%v
  ddf = res%d
end subroutine discount_ad

This returns the derivative together with the original result of the computation and does nothing more than invoke the differentiated code. We compile the driver as well

gfortran -g -O3 -o driver_simplelib.o -c driver_simplelib.f90 -fpic

and then everything is linked into a shared object library simplelibad

gfortran -shared -g -O3 -o libsimplelibad.so w2f__types.o OAD_active.o driver_simplelib.o simplelib.pre.xb.x2w.w2f.post.o

In the minimal example here we actually do not use any QuantLib classes, but directly talk to the AD library from our example application, like this

#include <iostream>
#include <cmath>

extern "C" void discount_ad_(double *zeroyield, double *dcf, double *df,
                             double *ddf);

int main() {

    double zeroyield = 0.02;
    double dcf = 10.0;
    double df, ddf;

    discount_ad_(&zeroyield, &dcf, &df, &ddf);

    std::cout << "result1 = " << df
              << " should be: " << std::exp(-zeroyield * dcf) << std::endl;

    std::cout << "result2 = " << ddf
              << " should be: " << -dcf * std::exp(-zeroyield * dcf)
              << std::endl;
}

The application can be compiled like any other C++ code; it just needs to be linked against simplelibad. Running the code yields the expected output

result1 = 0.818731 should be: 0.818731
result2 = -8.18731 should be: -8.18731

Now of course the real work starts … hope to be back with a meaningful example soon.


Smile Dynamics by Densities

The term smile dynamics refers to a rule describing how an implied volatility smile behaves when the underlying moves. This rule can be estimated from actual historical data, implied by a pricing model, or set up from scratch.

Today I am looking at some common specifications, but from a slightly different angle. I am not looking at the effect on the implied volatility smile, but rather on the underlying’s density in its natural pricing measure.

This can be done using the Breeden-Litzenberger theorem, which links the density \phi of the underlying to non-deflated call prices c for options struck at strike k by

\phi(k) = \frac{\partial^2 c}{\partial k^2} (k)

The idea is to have a more natural look at such dynamics – while implied volatility may be something we think we have an intuition for, a density seems to be a somewhat more grounded description of what is actually going on.

I am looking at an underlying from the interest rate world with forward level 3\% and options on this underlying expiring in 5 years. Think of these as European swaptions, for example. I am assuming an implied volatility smile generated by the SABR model, in terms of lognormal volatilities. The observations easily translate to shifted lognormal volatility smiles, as they are currently used for some currencies due to the low rates.

The question is now, what happens to a smile if the underlying’s forward level rises to say 3.5\% or 4\% ? Nobody really “knows” this of course. But you have to make an assumption whenever you look at effects resulting from rate shifts. This assumption can be crucial for the effectiveness of your delta hedges. It is also of paramount importance for the specification of stress tests, where bigger rate shifts are applied. This will become clear shortly.

I am looking at five different rules for the smile dynamics:

  • Sticky strike dynamics, meaning that after the rate shift the volatilities remain the same for each fixed strike
  • Sticky absolute moneyness assigning the original volatility at strike k-s to the strike k if the rate shift is s
  • Sticky model keeping the (SABR) model parameters constant, i.e. the volatilities after the shift are computed with the same SABR parameters as before, except for the forward, which is now f+s instead of f before
  • Sticky approximate basispoint volatility keeping the approximate basispoint volatility, defined as the product f\cdot\sigma of the forward f and the implied lognormal volatility \sigma, constant. At the same time the volatility smile is moved in absolute moneyness space, meaning that the (lognormal) volatility at strike k is f\cdot\sigma_{k-s} / (f+s) after the rate shift.
  • Sticky basispoint volatility keeping the exact basispoint volatility constant, defined as the volatility parameter of the normal Black model, in our case matching a price produced by the lognormal Black model. Again, the price at k-s is relevant for strike k after a shift by s.

Obviously, one would need some extrapolation rule for the absolute moneyness and approximate basispoint volatility dynamics, for strikes below the shift size. We just ignore this here and draw the pictures starting at s instead; this is interesting enough for the moment. Also note that under the sticky basispoint volatility dynamics the implied basispoint volatility is zero for strikes below s.

Let’s start with a SABR smile given by parameters \alpha=0.2, \beta=1.0, \nu=0.0 (, \rho=0.0). This is simply a Black76 model, so we have a flat, lognormal smile. It shouldn’t come as a surprise that in this case three of the five dynamics produce the same result. For sticky strike, sticky moneyness and sticky model we have the following underlying densities for the original rate level, the shift to 3.50\% and the shift to 4.00\%.

dyn_1_1

The densities keep their lognormal nature, and since the log volatility remains the same, the effective absolute variance of the underlying rates gets bigger when rates rise. Under big upshifts coming from a low initial rate this can lead to much too fat, unrealistic densities.

The sticky approximate basispoint variant produces a different result

dyn_1_2

keeping the lognormal nature, but adjusting the log volatility down for higher rates such that the absolute variance of the rates stays closer to constant (though not perfectly so) than in the other alternatives, which is probably more realistic under big upshifts.

I do not have a picture for the last dynamics (sticky exact basispoint), but it is clear that this dynamics actually just translates the density by s, preserving its shape exactly.

The second example is again without stochastic volatility (i.e. a pure CEV example, \nu=0), but now with \alpha=0.015, \beta=0.3, producing a skew

dyn_2_0

Sticky strike,

dyn_2_1

and sticky model

dyn_2_2

lead to densities again with slightly increasing absolute variance, while sticky approximate basispoint

dyn_2_3

roughly keeps the absolute variances constant. However, there are some artefacts for low strikes: formerly rather high volatilities belonging to low strikes are now applied to higher strikes. This is an obvious problem with this dynamics.

Sticky absolute moneyness has similar problems

dyn_2_4

at the same time leading to higher absolute variances for higher rates.

Again, sticky exact basispoint volatility will just translate the density.

Now let’s add stochastic volatility and look at \alpha=0.015, \beta=0.30, \nu=0.50, \rho=0.2. The smile now looks like this

dyn_3_0

Sticky strike,

dyn_3_1

sticky model,

dyn_3_2

sticky approximate basispoint

dyn_3_3

show a similar qualitative behaviour as for the CEV case, the approximate basispoint dynamics again with the artefacts for low strikes. Similarly for sticky absolute moneyness

dyn_3_4

while elsewhere keeping the absolute variance of rates roughly the same across the shifts.

Sticky exact basispoint will again translate the original density by s.

Which kind of dynamics matches reality best is in the end subject to a statistical check on historical data (with all the difficulties involved in that procedure), or to what traders consider the most appropriate dynamics. If I had to choose from the alternatives above, sticky (exact) basispoint or sticky model seem most reasonable.
