Adjoint Greeks III – Vanilla Swaps

Time for a post because next week I will be in Dublin over the weekend. Three days with my beloved wife, just the two of us. Without our four little ones who will have a great time too, with their granny. But also without my laptop, which makes me a bit nervous to be honest. Anyway. Good news from my adjoint greeks project: I reached the first bigger milestone, which is to compute the delta vector for a plain vanilla swap.

Or rather for a whole portfolio of vanilla swaps. The setting should be realistic and allow for first performance checks, so I look at portfolios of 1, 10, 100, 1000, 5000 and 10000 swaps with random maturities in a horizon of 10, 20, 30, 50 and 70 years, giving 30 test cases for all combinations in total.

The underlying curve (only one, but it would work with separate discounting and forwarding curves as well) consists of 10 deposits, 5 forward rate agreements and as many swaps as the horizon suggests with yearly spacing, i.e. we have 25, 35, 45, 65 and 85 points on the curve for each of the above horizon scenarios respectively.

The goal is to

  • compute the NPV of the swap portfolio
  • compute bucket deltas with respect to all instruments in the underlying curve

both in the traditional way (shifting the market quotes of the curve instruments and revaluing the portfolio on the new curve) and with AD.

You can see the whole example code here https://github.com/pcaspers/quantlib/blob/adjoint/QuantLib/Examples/AdjointSwapDeltas/adjoint.cpp. Note that no special engine is used to generate the adjoint deltas. We just do a vanilla swap pricing (almost) as always, collecting the market variables of interest in a vector xAD and the result (the NPV) in yAD and then say

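// stop recording and store the operation sequence xAD -> yAD
CppAD::ADFun<Real> f(xAD, yAD);
// w is the weight on the single dependent variable, the NPV
std::vector<Real> deltasAD(xAD.size()), w(1, 1.0);
// one first order reverse sweep gives the derivatives w.r.t. all of xAD
deltasAD = f.Reverse(1, w);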
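
If you wonder what such a reverse sweep does under the hood, here is a minimal hand-rolled sketch. It is not CppAD and does not claim to mirror its internals: Tape, Node and toyGradient are hypothetical names, and the toy function y = sum_i exp(-r_i * t_i) only stands in for a real swap NPV. The point it tries to make is that one backward pass over the recorded operations yields the derivatives with respect to all inputs at once.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One recorded operation: up to two argument indices together with the
// local partial derivatives of the result with respect to them.
struct Node {
    std::size_t arg[2];
    double partial[2];
};

// Hypothetical minimal tape, for illustration only (not CppAD).
struct Tape {
    std::vector<Node> nodes;
    std::vector<double> values;

    std::size_t push(double value, std::size_t a0, double p0,
                     std::size_t a1, double p1) {
        Node n = {{a0, a1}, {p0, p1}};
        nodes.push_back(n);
        values.push_back(value);
        return nodes.size() - 1;
    }
    // Independent variable: refers to itself with zero partials.
    std::size_t input(double value) {
        const std::size_t self = nodes.size();
        return push(value, self, 0.0, self, 0.0);
    }
    // c * a for a constant c
    std::size_t scale(std::size_t a, double c) {
        return push(c * values[a], a, c, a, 0.0);
    }
    // exp(-a)
    std::size_t expNeg(std::size_t a) {
        const double e = std::exp(-values[a]);
        return push(e, a, -e, a, 0.0);
    }
    // a + b
    std::size_t add(std::size_t a, std::size_t b) {
        return push(values[a] + values[b], a, 1.0, b, 1.0);
    }
    // Reverse sweep: seed the output adjoint with 1 and walk the tape
    // backwards, pushing each adjoint on to the arguments of its node.
    std::vector<double> reverse(std::size_t y) const {
        std::vector<double> adjoint(nodes.size(), 0.0);
        adjoint[y] = 1.0;
        for (std::size_t i = y + 1; i-- > 0;)
            for (int k = 0; k < 2; ++k)
                adjoint[nodes[i].arg[k]] += nodes[i].partial[k] * adjoint[i];
        return adjoint;
    }
};

// Records y = sum_i exp(-r[i] * t[i]), runs a single reverse sweep and
// returns dy/dr[i] for all i.
std::vector<double> toyGradient(const std::vector<double>& r,
                                const std::vector<double>& t) {
    assert(!r.empty() && r.size() == t.size());
    Tape tape;
    std::vector<std::size_t> x;
    for (std::size_t i = 0; i < r.size(); ++i)
        x.push_back(tape.input(r[i]));
    std::size_t y = tape.expNeg(tape.scale(x[0], t[0]));
    for (std::size_t i = 1; i < r.size(); ++i) {
        const std::size_t term = tape.expNeg(tape.scale(x[i], t[i]));
        y = tape.add(y, term);
    }
    const std::vector<double> adjoint = tape.reverse(y);
    std::vector<double> gradient;
    for (std::size_t i = 0; i < x.size(); ++i)
        gradient.push_back(adjoint[x[i]]);
    return gradient;
}
```

Calling toyGradient(r, t) returns one entry per input, each equal to -t_i * exp(-r_i * t_i), no matter how many inputs there are: the tape is traversed exactly once.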

Let’s start looking at one single swap. The delta pillars are simply numbered from 1 to 45, meaning overnight, tomnext, spotnext, spotweek, 1m, … , 6m, 1m-7m fra, 2m-8m, … , 5m-11m, 1y swap, 2y, … , 30y. The random swap below apparently has a maturity between 23y and 24y from today, since the last nonzero deltas sit on pillars 38 (23y) and 39 (24y). The notional of the swap is 100 million (in general the portfolio’s total notional is 100 million in all cases regardless of the number of swaps in it).

The finite difference deltas in the column labeled double are computed with a step size of 1E-6 here. The AD values are in the next column and the rightmost column displays the difference between these two. The CppAD framework once more does not disappoint us and gives reliable results. Note that the deltas are expressed w.r.t. a one basis point (1E-4) shift.
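
For reference, the bump and revalue side of such a comparison can be sketched as follows. This is not the code from the example: bumpDeltas and toyNpv are hypothetical names, the one-sided difference is an assumption (the post only fixes the step size of 1E-6), and the toy NPV is a sum of discounted notionals rather than a swap. Each quote is shifted, the portfolio is repriced, and the difference quotient is scaled to a one basis point move.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

const double basisPoint = 1.0e-4;

// Bump and revalue: one extra pricing per quote (one-sided difference,
// which is an assumption of this sketch).
std::vector<double> bumpDeltas(
    const std::function<double(const std::vector<double>&)>& npv,
    std::vector<double> quotes, double step = 1.0e-6) {
    assert(step > 0.0);
    const double base = npv(quotes);
    std::vector<double> deltas(quotes.size());
    for (std::size_t i = 0; i < quotes.size(); ++i) {
        const double saved = quotes[i];
        quotes[i] = saved + step;                          // shift quote i
        deltas[i] = (npv(quotes) - base) / step * basisPoint;
        quotes[i] = saved;                                 // restore it
    }
    return deltas;
}

// Toy stand-in for a pricer: a notional of 100 million discounted with
// rate r[i] over i + 1 years, summed over all pillars.
double toyNpv(const std::vector<double>& r) {
    const double notional = 1.0e8;
    double sum = 0.0;
    for (std::size_t i = 0; i < r.size(); ++i)
        sum += std::exp(-r[i] * (i + 1.0));
    return notional * sum;
}
```

In this toy setting the exact delta for pillar i is -notional * t * exp(-r_i * t) * 1E-4 with t = i + 1, so bumpDeltas(toyNpv, quotes) can be checked against it; the small remaining gap is the truncation error of the difference quotient.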

results:                double     AD<double>     difference
   NPV            -64037022.66   -64037022.66           0.00
   Delta #    1          53.94          53.93           0.00
   Delta #    2          17.98          17.98           0.00
   Delta #    3           0.00          -0.00           0.00
   Delta #    4           0.00           0.00          -0.00
   Delta #    5           0.00           0.00          -0.00
   Delta #    6           2.28           2.28          -0.00
   Delta #    7       -2439.17       -2439.17           0.00
   Delta #    8           0.00          -0.00           0.00
   Delta #    9           0.00           0.00          -0.00
   Delta #   10           0.00          -0.00           0.00
   Delta #   11           0.00           0.00          -0.00
   Delta #   12           7.13           7.13          -0.00
   Delta #   13         182.54         182.54          -0.00
   Delta #   14           0.00          -0.00           0.00
   Delta #   15           0.00           0.00          -0.00
   Delta #   16          36.26          36.26           0.00
   Delta #   17         495.51         495.51          -0.00
   Delta #   18         731.54         731.54          -0.00
   Delta #   19         994.67         994.68          -0.00
   Delta #   20        1238.63        1238.63          -0.00
   Delta #   21        1488.41        1488.42          -0.00
   Delta #   22        1745.32        1745.33          -0.01
   Delta #   23        2013.84        2013.85          -0.01
   Delta #   24        2234.96        2234.97          -0.01
   Delta #   25        2535.77        2535.79          -0.01
   Delta #   26        2769.90        2769.92          -0.02
   Delta #   27        3040.39        3040.41          -0.02
   Delta #   28        3321.43        3321.45          -0.02
   Delta #   29        3543.09        3543.12          -0.03
   Delta #   30        3861.57        3861.65          -0.08
   Delta #   31        4109.57        4109.51           0.06
   Delta #   32        4364.59        4364.68          -0.09
   Delta #   33        4659.00        4659.04          -0.05
   Delta #   34        4958.68        4958.73          -0.05
   Delta #   35        5166.84        5166.90          -0.06
   Delta #   36        5529.58        5529.64          -0.06
   Delta #   37        5745.63        5745.69          -0.07
   Delta #   38       60419.04       60419.23          -0.19
   Delta #   39      167048.46      167051.01          -2.55
   Delta #   40           0.00           0.00           0.00
   Delta #   41           0.00           0.00           0.00
   Delta #   42           0.00           0.00           0.00
   Delta #   43           0.00           0.00           0.00
   Delta #   44           0.00           0.00           0.00
   Delta #   45           0.00           0.00           0.00

Now let’s look at timings, in a bigger example with a 70y horizon (85 curve pillars) and 10000 swaps in the portfolio. On my laptop,

maximum maturity           70 years
portfolio size          10000 swaps
delta vector size          85 pillars

timings (ms)            double     AD<double>         factor  eff#NPVs
   pricing                1070           2690
   deltas                99030           2700
   total                100100           5390        18.5714   4.57692

This says that the pricing takes 1070ms with standard double computations and 2690ms with AD. Approximately, because you cannot really trust single timings, in particular not for small problems.

The delta computation under the naive bump and revalue approach takes another fatty 99 seconds, while CppAD only consumes 2.7 seconds on top of the 2.7 seconds from the pricing. Obviously the AD pricing step does more work than a plain double computation, because it records the operations that are later used in the reverse sweep, so ultimately we have to look at the sum of pricing and delta computation. For this we get a speed up of 18.5x with AD.

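The last number eff#NPVs is the effective number of NPV calculations during the AD run, which is the total time of the AD calculation divided by the average time for one pricing with classic doubles. The printed value is consistent with taking this average as the total double time divided by the number of pillars (100100ms / 85 ≈ 1178ms, and 5390 / 1178 ≈ 4.58). From theory we know that this should be bounded by 5, here it is 4.6.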
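
The two derived columns can be reproduced with a few lines. This is a sketch with hypothetical helper names (speedUpFactor, effectiveNpvs), and it rests on one assumption: that the average single pricing time is the total double time divided by the number of pillars, which is what matches the printed numbers.

```cpp
#include <cassert>

// Speed up of AD over bump and revalue: total double time over total AD time.
double speedUpFactor(double totalDoubleMs, double totalAdMs) {
    assert(totalAdMs > 0.0);
    return totalDoubleMs / totalAdMs;
}

// Effective number of NPV calculations in the AD run.
// ASSUMPTION: the average pricing time is totalDoubleMs / pillars.
double effectiveNpvs(double totalDoubleMs, double totalAdMs, int pillars) {
    assert(pillars > 0 && totalDoubleMs > 0.0);
    const double averagePricingMs = totalDoubleMs / pillars;
    return totalAdMs / averagePricingMs;
}
```

With the totals from the table, speedUpFactor(100100.0, 5390.0) comes out at about 18.57 and effectiveNpvs(100100.0, 5390.0, 85) at about 4.58.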

The magic of AD is that with bigger problems, the 4.6 stays constant, not the speed up factor. If we look at 115 pillars instead of the 85 for example,

maximum maturity          100 years
portfolio size          10000 swaps
delta vector size         115 pillars

timings (ms)            double     AD<double>         factor  eff#NPVs
   pricing                1610           4060
   deltas               204630           4100
   total                206240           8160        25.2745   4.55004

Note that swaps with longer terms are generated for this example, so a single pricing takes longer than before. Otherwise we stay at roughly 4.6 times one NPV computation, now for 115 bucket deltas, and get a speed up of over 25!

It does not matter how many greeks you want, you just get them for free. I thought I had already understood that, but seeing it in reality is quite cool. Let your problem grow as it wants, your complexity (measured in NPV calculations) stays constant. In other circumstances you are glad to be linear. Sometimes you are lucky-logarithmic. Here complexity is constant. Crazy.

Running all the examples mentioned above and plotting the effective number of NPV calculations against the problem size (defined as the product of the delta vector size and the portfolio size) we get this picture.

[Figure adjointeffnpv: effective number of NPV calculations plotted against problem size]

The effective number of NPV calculations seems to stabilize slightly above 4.5. Cough, probably we should produce more points first … but with a sharp eye …

Next steps: Volatility term structures (which I already scratched involuntarily when converting the ibor coupon pricers), Black pricers, Hull White and Markov functional model. The latter two are both slow and totally and utterly inaccessible for bump and revalue sensitivities. And of course filling in all the holes I left open so far. Still, help on the conversion process is highly appreciated (including not only pull requests you can send, but also expensive gifts, plain money (no bitcoins please), or a too-good-to-reject QuantLib job offer).
