Adjoint Greeks IV – Exotics

Today I feature a sequel to the adjoint greeks drama (see my prior posts on this).

Before I start I would like to point you to a new and excellent blog authored by my colleague Matthias You will want to follow his posts, I am certain about that.

I am still on my way to convert the essential parts of the library to a template version. Since this is boring work I created a small friend who helps me. No I did not go mad. It is a semi-intelligent emacs-lisp script that does some regular expression based search-and-replacements which speeds up the conversion a lot. Emacs is so cool.

Today I am going to calculate some derivatives of a bermudan swaption in the Gsr / Hull White model. This is the first post-vanilla application and admittedly I am glad that it works at last.

One point I have to make today is that AD is slow. Or to put it differently, doubles can be tremendously fast. Lookit here:

void multiply(T *a, T *b, T *s) {
    for (int i = 0; i < N; ++i) {
        for (int k = 0; k < N; ++k) {
            for (int j = 0; j < N; ++j) {
                s[i * N + j] += a[i * N + k] * b[k * N + j];

This code multiplies two matrices a and b and stores the result in s. It is the same algorithm as implemented in QuantLib for the Matrix class.

When feeding two 1000 x 1000 double matrices the code runs 1.3s on my laptop if compiled with gcc 4.8.2 using -O1. With -O3 I get 950ms. With clang 3.7.0 (fresh from the trunk) I get 960ms (-O1) and 630ms (-O3). With T=CppAD::AD on the other hand the running time goes up to at least (-O3) 10.9s (gcc) and 14.3s (clang). Code optimization seems to be a delicate business.

These timings refer to the situation where we use the AD type without taping, i.e. only as a wrapper for the double in it. This seems to indicate that at least for specific procedures it is not advisable to globally replace doubles by their active counterpart if one is not really interested in taping their operations and calculating the derivatives. Performance may break down dramatically.

What is the reason behind the huge difference ? I am not really the right person to analyze this in great detail. The only thing I spotted when looking into the generated assembler code is that with double there are SIMD (single instruction multiple data) instructions for adding and multiplying in the nested loops (like addpd and mulpd, it’s a long time since I programmed in assembler and it was on a 6510 so I am not really able to read a modern x86 assembler file).

With CppAD::AD there doesn’t seem to be such instructions around. So part of the perfomance loss may be due to the inability of streaming AD calculations. For sure this is not the only point here.

The second point to make today is that AD has pitfalls that may in the end lead to wrong results, if one uses the AD framework blindly. Let’s come back to our specific example. The underlying source code can be found here if you are interested or want to run it by yourself.

It is a bermudan swaption, ten years with yearly exercise dates. The model for pricing will be the Gsr or Hull White model. We just want to compute the bucket vegas of the bermudan, i.e. the change in its NPV when the implied market volatility of the canonical european swaptions used for the model calibration is increased by one percent.

The rate level is set to 3\% and the strike is out of the money at 5\%. The volatilities are produced by SABR parameters \alpha=3\%, \beta=60\%, \nu=12\%, \rho=30\%. All of this is arbitrary, unrealistic and only for explanatory purposes …

The naive AD way of doing this would be to declare the input implied volatilities as our quantities of interest


and then go through the whole model calibration

                  basketAD, methodAD, ecAD);

and pricing

   yAD[0] = swaptionAD->NPV();

to get the derivative d bermudan / d impliedVol

    CppAD::ADFun<Real> f(inputVolAD, yAD);
    std::vector<Real> vega(sigmasAD.size()), w(1, 1.0);
    vega = f.Reverse(1, w);

When I first wrote about AD I found it extremely attractive and magical that you could compute your sensitivities like this. That you can actually differentiate a zero search (like in the case of yield curve bootstrapping) or an optimization (like the Levenberg-Marquardt algorithm which is used here).

However there are dark sides to this simplicity, too. Performance is one thing. The calibration step and the pricing including the gradient calculation takes

AD model calibration = 1.32s
AD pricing+deltas    = 7.11s

We can also do it differently, namely calibrate the model in an ordinary way (using ordinary doubles), then compute the sensitivity of the bermudan to the model’s sigma and additionally compute the sensitivity of the calibration instruments to the model’s sigma. Putting everything together yields the bermudan’s bucketed vega again. I will demonstrate how below. First I report the computation time for this approach:

model calibration = 0.40s
AD pricing+deltas = 5.95s
additional stuff  = 0.97s

This leaves us with a performance gain of around 15 percent (7.32s vs 8.43s). This is not really dramatic, still significant. And there is another good reason to separate the calibration from the greek calculation which I will come to below.

Note also, that the pricing which takes 5.95s with AD (including derivatives) is much faster without AD where it only consumes 0.073s. This is a factor of 80 which is much worse than the theoretical factor of 4 to 5 mentioned in earlier posts (remember, we saw 4.5 for plain vanilla interest rate swap npv and delta computation). This is again due to optimization issues obviously.

The background here is that the swaption pricing engine uses cubic spline interpolation and closed-form integration of the resulting cubic polynominals against the normal density for the roll back. Again a lot of elementary calculations, not separated by OO-things that would hinder the compiler from low level optimizations. You would surely need quite a number of sensitivities to still get a performance gain.

But lets follow the path to compute the bucket vegas further down. First I print the bucket vegas coming from the naive AD way above – direct differentiation by the input implied volatilities with the operation sequence going through the whole model calibration and the pricing.

vega #0: 0.0153%
vega #1: 0.0238%
vega #2: 0.0263%
vega #3: 0.0207%
vega #4: 0.0209%
vega #5: 0.0185%
vega #6: 0.0140%
vega #7: 0.0124%
vega #8: 0.0087%

This looks plausible. We started the calibration from a constant sigma function at 1\%. This is actually far away from the target value (which is around 0.40\%), so the optimizer can walk around before he settles at the minimum. But we could have started with a sigma very close to the optimal solution. What would happen then ? With the target value as an initial guess (which is unlikely to have in reality, sure) we get

vega #0: 0.0000%
vega #1: 0.0238%
vega #2: 0.0263%
vega #3: 0.0207%
vega #4: 0.0209%
vega #5: 0.0448%
vega #6: 0.0000%
vega #7: 0.0000%
vega #8: 0.0000%

Some vegas are zero now. vega #5 is even completely different from the value before. This is a no go, because in a productive application you wouldn’t notice if some deals are contributing a zero sensitivity or a false value.

What is happening here is that the function we differentiate depends on more input variables than only the primary variables of interest (the implied vol), like the initial guess for the optimization. These might alter the derivative drastically even if the function value (which is the model’s sigma function on an intermediate level, or ultimately the bermudan’s npv) stays the same.

For example if the initial guess is so good that the optimizers tolerance is already satisfied with it, the output will stay the same, no matter if the input is pertubed by an infinitesimal shift dx. Here pertubed does not really mean bumped and dx really is infinitesimal small. It is only a concept to get a better intuition of what is going on during the process of automatic differentiation.

Another example is the bisection method for zero searching. This method will always yield zero AD derivatives in the sense if x is an input (e.g. a vector of swap quotes) and y is the output (e.g. a vector of zero rates) linked by a relation

f(x,y) = 0

the iff x is pertubed by dx then the checks in the bisection algorithm will yield exactly the same results for x+dx as for x. Therefore the computed y will be exactly the same, thus the derivative zero.

I don’t know if this explanation is good, but this is how I picture it for myself. It seems to be like this: If it feels too magical what you do with AD, you better don’t do it.

What is the way out ? We just avoid all optimization and zero searches, if possible. And it is for interest rate deltas and vegas: Just calibrate your curve in the usual way, then apply AD to compute sensitivities to zero rates (which then does not involve a zero search any more).

If you want market rate deltas, compute the Jacobian matrix of the market instruments in the curve by the zero rates as well, invert it and multiply it with the zero delta vector. You do not even have to invert it, but just only have to solve one linear equation system. We make this procedure explicit with our example above.

The first step would be to compute d bermudan NPV / d model’s sigma. This is done on the already calibrated model. Technically we separate the calibration step (done with the usual doubles) and the AD step (done on a copy of the model feeded with the model parameters from the first model, but not itself calibrated again). This is what we get for d bermudan / d sigma

vega #0: 1.6991
vega #1: 1.2209
vega #2: 0.8478
vega #3: 0.5636
vega #4: 0.4005
vega #5: 0.2645
vega #6: 0.1573
vega #7: 0.0885
vega #8: 0.0347

Next we compute the Jacobian d helperNpv / d sigma

297.4%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%
179.7%	183.3%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%
126.9%	129.4%	132.2%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%
90.8%	92.6%	94.6%	96.7%	0.0%	0.0%	0.0%	0.0%	0.0%
64.9%	66.2%	67.6%	69.1%	71.2%	0.0%	0.0%	0.0%	0.0%
47.5%	48.4%	49.5%	50.6%	52.1%	53.3%	0.0%	0.0%	0.0%
31.9%	32.6%	33.3%	34.0%	35.0%	35.9%	36.3%	0.0%	0.0%
19.2%	19.6%	20.0%	20.4%	21.0%	21.5%	21.8%	22.3%	0.0%
8.7%	8.9%	9.0%	9.2%	9.5%	9.8%	9.9%	10.1%	10.5%

We also use AD for this, CppAD comes with a driver routine that reads

helperVega = f.Jacobian(sigmas);

This is a lower triangular matrix, because the ith calibration helper depends only on the sigma function up to its expiry time. The inverse of this matrix is also interesting, although we wouldn’t need it in its full beauty for our vega calculation, representing d sigma / d helperNpv

33.6%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%
-33.0%	54.6%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%
0.0%	-53.4%	75.6%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%
0.0%	0.0%	-74.0%	103.5%	0.0%	0.0%	0.0%	0.0%	0.0%
0.0%	0.0%	0.0%	-100.4%	140.4%	0.0%	0.0%	0.0%	0.0%
0.0%	0.0%	0.0%	0.0%	-137.2%	187.5%	0.0%	0.0%	0.0%
0.0%	0.0%	0.0%	0.0%	0.0%	-185.1%	275.3%	0.0%	0.0%
0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	-269.1%	448.1%	0.0%
0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	-431.6%	953.1%

This says how the different sigma steps go up when the helper belonging to the step goes up (these are the positive diagonal elements) and how the next sigma step would need to go down when the same helper as before goes up, but the next helper stays the same. The contra-movement is roughly by the same amount.

Next we need d bermudan / d sigma. AD delivers

vega #1: 1.6991
vega #2: 1.2209
vega #3: 0.8478
vega #4: 0.5636
vega #5: 0.4005
vega #6: 0.2645
vega #7: 0.1573
vega #8: 0.0885
vega #9: 0.0347

And the last ingredient would be the calibration helpers’ vegas, which can be computed analytically (this is the npv change when shifting the input market volatility by one percent up), d helperNpv / d marketVol

vega #1: 0.0904%
vega #2: 0.1117%
vega #2: 0.1175%
vega #3: 0.1143%
vega #4: 0.1047%
vega #5: 0.0903%
vega #6: 0.0720%
vega #7: 0.0504%
vega #8: 0.0263%

Now multiplying everything together gives

d bermudan / d impliedVol = d bermudan / d sigma x d sigma / d helperNpv x d helperNpv / d impliedVol

which is

vega #0: 0.0153%
vega #1: 0.0238%
vega #2: 0.0263%
vega #3: 0.0207%
vega #4: 0.0209%
vega #5: 0.0185%
vega #6: 0.0140%
vega #7: 0.0124%
vega #8: 0.0087%

same as above. Partly because I just copy-pasted it here. But it actually comes out of the code also, just try it.

Adjoint Greeks IV – Exotics

Gaussian Models

This is going to be a guided tour through some example code I wrote to illustrate the usage of the Markov Functional and Gsr (a.k.a. Hull White) model implementations.

Hull White / Gsr is without any doubt the bread and butter model for rates. It calibrates to a given series of vanilla instruments, it has a parameter (the mean reversion) to control intertemporal correlations (which is important both for bermudan pricing and time travelling), but you can not alter its “factory settings” regarding the smile. At least it is not flat but a skew. Not unrealistic from a qualitative standpoint, but you would have to be lucky to match the market skew decently of course. Markov Functional on the other hand mimics any given smile termstructure exactly as long as it is arbitrage free.

Recently I added Hagan’s internal adjusters to the Gsr implementation, trying to make up for the comparative disadvantage. I coming back to them at the end of this article. Internal adjusters here is in distinction to external adjusters of course, which I am working on as well. More on them later.

Let’s delve into the example. I do not reproduce the whole code here, just the main points of interest. You can find the full example in one of the recent releases of QuantLib under Examples / Gaussian1dModels.

First we set the global evaluation date.

  Date refDate(30, April, 2014);
  Settings::instance().evaluationDate() = refDate;

The rate curves will be flat, but we assume basis spreads to demonstrate that the models can handle them in a decent way.

  Real forward6mLevel = 0.025;
  Real oisLevel = 0.02;

I will omit the code to set up the quotes and termstructures here. Swaption volatilities are chosen flat as well

  Real volLevel = 0.20;

Now we set up a deal that we can price later on. The underlying is a standard vanilla spot starting swap with 4\% fixed rate against Euribor 6M.

  Real strike = 0.04;
  boost::shared_ptr<NonstandardSwap> underlying =
      VanillaSwap::Payer, 1.0, fixedSchedule, strike, 
      floatingSchedule, euribor6m, 0.00, Actual360()));

Of course there is a reason that we use a NonstandardSwap instead of a VanillaSwap. You will see later.

We define a bermudan swaption on that underlying with yearly exercise dates, where the notification of a call should be given two TARGET days before the next accrual start period.

 boost::shared_ptr<Exercise> exercise =
                            exerciseDates, false);
 boost::shared_ptr<NonstandardSwaption> swaption =
                            underlying, exercise);

To set up the Gsr model we need to define the grid on which the model volatility is piecewise constant. Since we want to match the market quotes for european calls later on we chose the grid points identical to the exercise dates, except that we do not need a step at the last exercise date obviously. The initial model volatility is set to 1\%

        std::vector<Date> stepDates(exerciseDates.begin(),
                                    exerciseDates.end() - 1);
        std::vector<Real> sigmas(stepDates.size() + 1, 0.01);

The reversion speed is 1\% as well.

        Real reversion = 0.01;

And we are ready to define the model!

  boost::shared_ptr<Gsr> gsr = boost::make_shared<Gsr>(
        yts6m, stepDates, sigmas, reversion);

We will need a swaption engine for calibration

 boost::shared_ptr<PricingEngine> swaptionEngine =
                                    gsr, 64, 7.0, true,
                                    false, ytsOis);

Normally it is enough to pass the model gsr. The 64 and 7.0 are the default parameters for the numerical integration scheme that is used by the engine as well as true and false indicating that the payoff should be extrapolated outside the integration domain in a non-flat manner (this is not really important here). The last parameter denotes the discounting curve that should be used for the swaption valuation. Note that this is different from the model’s “main” yield curve, which is the Eurior 6M forward curve (see above).

We set up a second engine for our instrument we want to price.

 nonstandardSwaptionEngine =
       gsr, 64, 7.0, true, false, Handle<Quote>(), ytsOis);

On top of the parameters from above, we have an empty quote here. This can be used to introduce a flat credit termstructure into the pricing. We will see later how to use this in exotic bond valuations. For the moment it is just empty, so ignored.

Now we assign the engine to our bermudan swaption


How do we calibrate our Gsr model to price this swaption ? Actually there are some handy methods thanks to the fact that we chose an engine which implements the BasketGeneratingEngine interface, so we can just say

std::vector<boost::shared_ptr<CalibrationHelper> > basket =
     swaption->calibrationBasket(swapBase, *swaptionVol,

to get a coterminal basket of at the money swaptions fitting the date schedules of our deal. The swapBase here encodes the conventions for standard market instruments. The last parameter Naive tells the engine just to take the exercise dates of the deal and the maturity date of the underlying and create at the money swaptions from it using the standard market conventions.

We can do more involved things and we will below: as soon as the deal specifics are not matching the standard market swaption conventions, we can choose an adjusted basket of calibration instruments! This can be a little thing like five instead of two notification dates for a call, different day count conventions on the legs, a non-yearly fixed leg payment frequency, or bigger things like a different Euribor index, an amortizing notional schedule and so on. I wrote a short note some time ago on this which you can get here if you are interested.

In any case the naive basket looks like this:

Expiry              Maturity            Nominal             Rate          Pay/Rec     Market ivol   
April 30th, 2015    May 6th, 2024       1.000000            0.025307      Receiver    0.200000      
May 3rd, 2016       May 6th, 2024       1.000000            0.025300      Receiver    0.200000      
May 3rd, 2017       May 6th, 2024       1.000000            0.025303      Receiver    0.200000      
May 3rd, 2018       May 6th, 2024       1.000000            0.025306      Receiver    0.200000      
May 2nd, 2019       May 6th, 2024       1.000000            0.025311      Receiver    0.200000      
April 30th, 2020    May 6th, 2024       1.000000            0.025300      Receiver    0.200000      
May 3rd, 2021       May 6th, 2024       1.000000            0.025306      Receiver    0.200000      
May 3rd, 2022       May 6th, 2024       1.000000            0.025318      Receiver    0.200000      
May 3rd, 2023       May 6th, 2024       1.000000            0.025353      Receiver    0.200000      

The calibration of the model to this basket is done via

   gsr->calibrateVolatilitiesIterative(basket, method, ec);

where method and ec are an optimization method (Levenberg-Marquardt in our case) and end criteria for the optimization. I should note that the calibration method is not the default one defined in CalibratedModel, which does a global optimization on all instruments, but a serialized version calibrating one step of the sigma function to one instrument at a time, which is much faster.

Here is the result of the calibration.

Expiry              Model sigma   Model price         market price        Model ivol    Market ivol   
April 30th, 2015    0.005178      0.016111            0.016111            0.199999      0.200000      
May 3rd, 2016       0.005156      0.020062            0.020062            0.200000      0.200000      
May 3rd, 2017       0.005149      0.021229            0.021229            0.200000      0.200000      
May 3rd, 2018       0.005129      0.020738            0.020738            0.200000      0.200000      
May 2nd, 2019       0.005132      0.019096            0.019096            0.200000      0.200000      
April 30th, 2020    0.005074      0.016537            0.016537            0.200000      0.200000      
May 3rd, 2021       0.005091      0.013253            0.013253            0.200000      0.200000      
May 3rd, 2022       0.005097      0.009342            0.009342            0.200000      0.200000      
May 3rd, 2023       0.005001      0.004910            0.004910            0.200000      0.200000      

and the price of our swaption, retrieved in the QuantLib standard way,

Real npv = swaption->NPV();

is around 38 basispoints.

Bermudan swaption NPV (ATM calibrated GSR) = 0.003808

Now let’s come back to what I mentioned above. Actually the european call rights are not exactly matching the atm swaptions we used for calibration. Namely our underlying swap is not atm, but has a fixed rate of 4\%. So we should use an apdapted basket. Of course in this case you can guess what one should take, but I will use the general machinery to make it trustworthy. The adapted basket can be retrieved by

  basket = swaption->calibrationBasket(
       swapBase, *swaptionVol,

with the parameter MaturityStrikeByDeltaGamma indicating that the market swaptions for calibration are chosen from the set of all possible market swaptions (defined by the swapBase, remember ?) by an optimization of the nominal, strike and maturity as the remaining free parameters such that the zeroth, first and second order derivatives of the exotic’s underlying by the model’s state variable (evaluated at some suitable central point) are matched.

To put it differently, per expiry we seek a market underlying that in all states of the world (here for all values of the state variable of our model) has the same value as the exotic underlying we wish to price. To get this we match the Taylor expansions up to order two of our exotic and market underlying.

Let’s see what this gets us in our four percent strike swaption case. The calibration basket becomes:

Expiry              Maturity            Nominal             Rate          Pay/Rec     Market ivol   
April 30th, 2015    May 6th, 2024       0.999995            0.040000      Payer       0.200000      
May 3rd, 2016       May 6th, 2024       1.000009            0.040000      Payer       0.200000      
May 3rd, 2017       May 6th, 2024       1.000000            0.040000      Payer       0.200000      
May 3rd, 2018       May 7th, 2024       0.999953            0.040000      Payer       0.200000      
May 2nd, 2019       May 6th, 2024       0.999927            0.040000      Payer       0.200000      
April 30th, 2020    May 6th, 2024       0.999996            0.040000      Payer       0.200000      
May 3rd, 2021       May 6th, 2024       1.000003            0.040000      Payer       0.200000      
May 3rd, 2022       May 6th, 2024       0.999997            0.040000      Payer       0.200000      
May 3rd, 2023       May 6th, 2024       1.000002            0.040000      Payer       0.200000      

As you can see the calibrated rate for the market swaption is 4\% as expected. What you can also see is that payer swaptions were generated. This is because always out of the money options are chosen to be calibration instruments for the usual reason. The nominal is slightly different from 1.0, but practically did not change. This is more to prove that some numerical procedure worked for you here.

Recalibrating the model to the new basket gives

Expiry              Model sigma   Model price         market price        Model ivol    Market ivol   
April 30th, 2015    0.006508      0.000191            0.000191            0.200000      0.200000      
May 3rd, 2016       0.006502      0.001412            0.001412            0.200000      0.200000      
May 3rd, 2017       0.006480      0.002905            0.002905            0.200000      0.200000      
May 3rd, 2018       0.006464      0.004091            0.004091            0.200000      0.200000      
May 2nd, 2019       0.006422      0.004766            0.004766            0.200000      0.200000      
April 30th, 2020    0.006445      0.004869            0.004869            0.200000      0.200000      
May 3rd, 2021       0.006433      0.004433            0.004433            0.200000      0.200000      
May 3rd, 2022       0.006332      0.003454            0.003454            0.200000      0.200000      
May 3rd, 2023       0.006295      0.001973            0.001973            0.200000      0.200000      

indeed different, and the option price

Bermudan swaption NPV (deal strike calibrated GSR) = 0.007627

almost doubled from 38 to 76 basispoints. Well actually it more than doubled. Whatever. Puzzle: We did not define a smile for our market swaption surface. So it shouldn’t matter which strike we choose for the calibration instrument, should it ?

There are other applications of the delta-gamma-method. For example we can use an amortizing nominal going linear from 1.0 to 0.1. The calibration basket then becomes

Expiry              Maturity            Nominal             Rate          Pay/Rec     Market ivol   
April 30th, 2015    August 5th, 2021    0.719236            0.039997      Payer       0.200000      
May 3rd, 2016       December 6th, 2021  0.641966            0.040003      Payer       0.200000      
May 3rd, 2017       May 5th, 2022       0.564404            0.040005      Payer       0.200000      
May 3rd, 2018       September 7th, 2022 0.486534            0.040004      Payer       0.200000      
May 2nd, 2019       January 6th, 2023   0.409763            0.040008      Payer       0.200000      
April 30th, 2020    May 5th, 2023       0.334098            0.039994      Payer       0.200000      
May 3rd, 2021       September 5th, 2023 0.255759            0.039995      Payer       0.200000      
May 3rd, 2022       January 5th, 2024   0.177041            0.040031      Payer       0.200000      
May 3rd, 2023       May 6th, 2024       0.100000            0.040000      Payer       0.200000      

First of all, the nominal of the swaptions is adjusted to the amortizing schedule, being some average over the coming periods respectively. Furthermore, the effective maturity is reduced.

As a side note. The nominal is of course not relevant at all for the calibration step. It does not matter, if you calibrate to a swaption with nominal 1.0 or 0.1 or 100000000.0. But it is a nice piece of information as in the last example anyhow, to see if it is plausible what happens.

Now consider a callable bond. You can set this up as a swap, too, with one zero leg and final notional exchange. The NonStandardSwap allows for all this. The exercise has to be extended to carry a rebate payment reflecting the notional reimbursement in case of exercise. This is handled by the RebatedExercise extension. The delta-gamma calibration basket now looks as follows.

Expiry              Maturity            Nominal             Rate          Pay/Rec     Market ivol   
April 30th, 2015    April 5th, 2024     0.984093            0.039952      Payer       0.200000      
May 3rd, 2016       April 5th, 2024     0.985539            0.039952      Payer       0.200000      
May 3rd, 2017       May 6th, 2024       0.987068            0.039952      Payer       0.200000      
May 3rd, 2018       May 7th, 2024       0.988455            0.039952      Payer       0.200000      
May 2nd, 2019       May 6th, 2024       0.990023            0.039952      Payer       0.200000      
April 30th, 2020    May 6th, 2024       0.991622            0.039951      Payer       0.200000      
May 3rd, 2021       May 6th, 2024       0.993111            0.039951      Payer       0.200000      
May 3rd, 2022       May 6th, 2024       0.994190            0.039952      Payer       0.200000      
May 3rd, 2023       May 6th, 2024       0.996715            0.039949      Payer       0.200000      

The notionals are slightly below 1.0 (as well as the maturities and strikes not exactly matching the bermudan swaption case). This is expected however, since the market swaptions are discounted on OIS level, while for the bond we chose to use the 6m curve as a benchmark discounting curve. The effect is small however. Put 6m as discounting to cross check this.

What is more interesting is to assume a positive credit spread. Let’s set this to 100 basispoints for example. The spread is interpreted as an option adjusted spread, continuously compounded with Actual365Fixed day count convention. The calibration basket gets

Expiry              Maturity            Nominal             Rate          Pay/Rec     Market ivol   
April 30th, 2015    February 5th, 2024  0.961289            0.029608      Payer       0.200000      
May 3rd, 2016       March 5th, 2024     0.965356            0.029605      Payer       0.200000      
May 3rd, 2017       April 5th, 2024     0.969520            0.029608      Payer       0.200000      
May 3rd, 2018       April 8th, 2024     0.973629            0.029610      Payer       0.200000      
May 2nd, 2019       April 8th, 2024     0.978124            0.029608      Payer       0.200000      
April 30th, 2020    May 6th, 2024       0.982682            0.029612      Payer       0.200000      
May 3rd, 2021       May 6th, 2024       0.987316            0.029609      Payer       0.200000      
May 3rd, 2022       May 6th, 2024       0.991365            0.029603      Payer       0.200000      
May 3rd, 2023       May 6th, 2024       0.996646            0.029586      Payer       0.200000      

Look what the rate is doing. It is adjusted by roughly the credit spread. Again this is natural, since the hedge swaption for the bond’s call right would have roughly 100 basispoints margin on the float side. Here it is coming automatically out of our optimization procedure.

Let’s come to our final example. The underlying is a swap exchanging a CMS10y rate against Euribor 6M. To make the numbers a bit nicer I changed the original example code to include a 10 basispoint margin on the Euribor leg.

We start with the underlying price retrieved from a replication approach. I am using the LinearTsrPricer here, with the same mean reversion as for the Gsr model above. The pricing is

Underlying CMS     Swap NPV = 0.004447
           CMS     Leg  NPV = -0.231736
           Euribor Leg  NPV = 0.236183

so 44.5 basispoints. Now we consider a bermudan swaption (as above, with yearly exercises) on this underlying. A naively calibrated Gsr model yields

Float swaption NPV (GSR) = 0.004291
Float swap     NPV (GSR) = 0.005250

The npv of the option is 42.9 basispoints. The underlying price, which can be retrieved as an additional result from the engine as follows


is 52.5 basispoints. Please note that the option has its first exercise in one year time, so the first year’s coupons are not included in the exercised into deal. This is why the underlying price is higher than the option value.

What do we see here: The Gsr model is not able to price the underlying swap correctly, the price is around 8 basispoints higher than in the analytical pricer. This is because of the missing smile fit (in our example the fit to a flat smile, which the Gsr can not do). The Markov Functional model on the other hand can exactly do this. We can calibrate the numeraire of the model such that the market swaption surface is reproduced on the fixing dates of the CMS coupons for swaptions with 10y maturity. The goal is to get a better match with the replication price. Let’s go: The model is set up like this

  boost::shared_ptr<MarkovFunctional> markov =
          yts6m, reversion, markovStepDates, 
          markovSigmas, swaptionVol,
          cmsFixingDates, tenors, swapBase,

It is not that different from the Gsr model construction. We just have to provide the CMS coupons’ fixing dates and tenors (and the conventions of the swaptions behind), so that we can calibrate to the corresponding smiles. The last parameter is optional and overwrites some numerical parameter with a more relaxed value, so that the whole thing works a bit faster in our example. Ok, what does the Markov model spit out:

Float swaption NPV (Markov) = 0.003549
Float swap NPV (Markov)     = 0.004301

The underlying is now matched much better than in the Gsr model, it is up to 1.5 basispoints accurate. A perfect match is not expected from theory, because the dynamics of the linear TSR model is not the same as in the Markov model, of course.

The option price, accordingly, is around 7.5 basispoints lower compared to the Gsr model. This is around the same magnitude of the underlying mismatch in the Gsr model.

To complete the picture, the Markov model also has a volatility function that can be calibrated to a second instrument set like coterminal swaptions to approximate call rights. It is rather questionable if a call right of a CMS versus Euribor swap is well approximated by a coterminal swaption. Actually I tried to use the delta-gamma-method to search for any representation of such a call right in the Markov model. The following picture is from a different case (it is taken from the paper I mentioned above), but showing what is going on in principle

Screenshot from 2015-03-29 18:12:23

The exotic underlying is actually well matched by a market swaption’s underlying around the model’s state y=0, which is the expansion point for the method, so it does what it is supposed to do. But the global match is poor, so there does not seem to be a good reason to include additional swaptions into the calibration to represent call rights. They do not hurt, but do not specifically represent the call rights, so just add some more market information to the model.

Anyhow, we can do it, so we do it. If we just take atm coterminals and calibrate the Markov model’s volatility function to them, we get as a calibration result

Expiry              Model sigma   Model price         market price        Model ivol    Market ivol   
April 30th, 2015    0.010000      0.016111            0.016111            0.199996      0.200000      
May 3rd, 2016       0.012276      0.020062            0.020062            0.200002      0.200000      
May 3rd, 2017       0.010534      0.021229            0.021229            0.200001      0.200000      
May 3rd, 2018       0.010414      0.020738            0.020738            0.200001      0.200000      
May 2nd, 2019       0.010361      0.019096            0.019096            0.199998      0.200000      
April 30th, 2020    0.010339      0.016537            0.016537            0.200002      0.200000      
May 3rd, 2021       0.010365      0.013253            0.013253            0.199998      0.200000      
May 3rd, 2022       0.010382      0.009342            0.009342            0.200001      0.200000      
May 3rd, 2023       0.010392      0.004910            0.004910            0.200001      0.200000      

I am not going into details about the volatility function here, but note, that the first step is fixed (at its initial value of 1\%) and the step after the last expiry date matters. In addition a global calibration to all coterminals simultaneously is necessary, the iterative approach will not work for the model.

The pricing results for the underlying does not change that much, the fit is still good as desired:

Float swap NPV (Markov) = 0.004331

There is one last thing I want to mention and which is not yet part of the library or the example code. Hagan introduced a technique called “internal adjusters” to make the Gsr model work in situations like the one we have here, namely that the underlying is not matched well. He mentions this approach in his paper on callable range accrual notes where he uses his LGM (same as Gsr or Hull White) model for pricing and observes that he does not calibrate to underlying Libor caplets or floorlets very well. He suggests to introduce an adjusting factor to be multiplied with the model volatility in case we are evaluating such a caplet or floorlet during the pricing of the exotic. So the missing model fit is compensated by using a modified model volatility “when needed” (and only then, i.e. when evaluating the exotic coupon we want to match).

This sounds like a dirty trick, destroying the model in a way and introducing arbitrage. On the other hand mispricing the market vanillas introduces arbitrage in a much more obvious and undesirable way. So why not. If Pat Hagan says we can do it, it is safe I guess. We should say however that Hagan’s original application was restricted to adjust Libor volatilities. Here we make up for a completely wrong model smile. And we could even go a step further and match e.g. market CMS spread coupon prices in a Gsr model, although the model does not even allow for rate decorrelation. So one should be careful, how far one wants to go with this trick.

I added these adjusters to the Gsr model. If you don’t specify or calibrate them, they are not used though, so nothing changes from an end user perspective. In our example we would set up a calibration basket like this

 std::vector<boost::shared_ptr<CalibrationHelperBase> > 
 for (Size i = 0; i < leg0.size(); ++i) {
     boost::shared_ptr<CmsCoupon> coupon =
     if (coupon->fixingDate() > refDate) {
         boost::shared_ptr<AdjusterHelper> tmp =
                 swapBase, coupon->fixingDate(), 

The adjuster helper created here corresponds to the CMS coupons of our trade. We set the linear TSR pricer (cmsPricer) to produce the reference results and the Gsr pricing engine in order to be able to calibrate the adjusters to match the reference prices. Now we can say

                                   method, ec);

like before for the volatilities. What we get is

Expiry              Adjuster      Model price         Reference price     
April 30th, 2015    1.0032        2447560.9183        2447560.9183        
May 3rd, 2016       1.0353        2402631.1363        2402631.1363        
May 3rd, 2017       1.0640        2378624.8507        2378624.8507        
May 3rd, 2018       1.0955        2324333.7739        2324333.7739        
May 2nd, 2019       1.1239        2295880.1247        2295880.1247        
April 30th, 2020    1.1643        2261229.3425        2261229.3425        
May 3rd, 2021       1.1948        2228406.7519        2228406.7519        
May 3rd, 2022       1.2214        2196901.0808        2196901.0808        
May 3rd, 2023       1.2732        2177227.7967        2177227.7967        

The prices here are with reference to a notional of one hundred million. The adjuster values needed to match the reference prices from the replication pricer are not too far from one, which is good, because it means that the model is not bent too much in order to produce the CMS coupon prices.

The pricing in the new model is as follows

GSR (adjusted) option value     = 0.003519
GSR (adjusted) underlying value = 0.004452

The underlying match is (by construction) very good thanks to the adjusters. Also the option value is adjusted down as desired, we are now very close to the markov model’s value. Needless to say that this does not work out that well all the time. Also as an important disclaimer, an option on CMS10Y against Euribor6M has features of a spread option, which is highly correlation sensitive. Neither the (adjusted) Gsr nor the Markov model can produce correlations other than one for the spread components, so they will always underprice the option in this sense.

Enough for today, look at the example and play around with the code. Or read the paper.

Gaussian Models