XVA for Bermudan Swaptions

Welcome back. Sorry for the hiatus which was longer than planned. Anyway and finally here is the (presumably) last episode in the series of posts on proxy pricing of exotics for XVA simulations. In my last posts I wrote a bit on the general design principles I came up with and how I used it to implement the FX TaRF. Today I will describe how to do similar things for bermudan swaptions using the same design principles.

Some general thoughts beforehand though. Reflections on what we are doing here at all. What we have is some model M in which we want to compute some XVA or potential exposure figure \Xi for our exotic deal I. Possibly and probably together with a whole portfolio of other deals which interact in a non linear way, i.e.

\Xi( \sum_i I_i ) \neq \sum_i \Xi( I_i)

To do this we need NPVs \nu of each I_i at some time t conditional on some model state m(t) in our model M:

\nu = \nu(I_i,t,m(t))

M might be operated in a real world measure (e.g. for exposure measurement) or a risk neutral measure (e.g. for CVA calculation), dependent on the nature of the figure to compute. The model state m(t) implies a market data scenario like certain discount and forwarding term structures and swaption or cap volatility surfaces and the like.

Ideally the NPV \nu is computed in the same pricing model we use for our regular PL computation or trading activities, calibrated to the market data scenario created by M in the way we would do it for usual pricing. That means in general we have an “outer” model M and an “inner” model N_i for each instrument I_i. Even if the instruments all allow for the same model for the inner evaluation, they will probably be different in their calibrated parameters:

For example if we have a portfolio composed of single currency swaps, callable swaps and swaptions we could choose the Hull White model to evaluate the callable swaps and swaptions and price the swaps directly on the yield curves. But each callable swap / swaption would in general require a different set of calibration instruments, so we would have a set of local, inconsistent models for each individual swaption valuation. We could try to use a global model in which all instruments can be priced consistently and maybe use this model as the outer model for XVA calculation at the same time, provided that the XVA figure \Xi allows for risk neutral valuation at all.

This could be a Libor market model which is one of the most flexible models in terms of global calibration. The global calibration alone would be a delicate task however. If more than one currency is involved and maybe even other asset classes the task gets even more demanding. And what do you do if you want a real world exposure figure? This approach seems quite heavy, not impossible, but heavy.

So it seems reasonable to stick with the concept of separate outer and inner models in general. The question is then how to efficiently compute the inner single NPVs \nu(I_i,t,m(t)) conditional on market scenarios generated by the outer model. Full pricing is not an option even for relatively simple inner models like Hull White one factor models.

Another idea could be to compute some reference NPVs in time direction t assuming some arbitrary market data scenarios (e.g. simply using today’s market data rolled forward), together with a “rich” set of sensitivities which allow to estimate NPVs under arbitrary market scenarios by Taylor approximation and interpolating in time direction (or use a theta sensitivity). Maybe automatic differentiation comes into play here.

The idea I followed in the previous posts is different, more simple, and goes like this. During an initial single deal pricing performed by a monte carlo simulation we collect path data and build a regression model of conditional NPVs on the model state, pricing time and possibly more ancillary variables (like the accumulated amount in the case of the FX TaRF) just as it is done in the Longstaff Schwartz method for early exercise pricing (or almost like this, see below). Then, when asked for an NPV at a pricing time t under a given market scenario m(t) (and possibly more conditions like the already accumulated amount of a target structure) we imply the model state belonging to the market scenario and use our regression model to produce the desired NPV very efficiently.

The limitation of this method is that the given market data can not be replicated in full beauty in general. In case of the FX TaRF we could replicate any given FX spot rate, but not its volatility which is fixed and implied by the original pricing model for example. In case of bermudan swaptions in the Hull White model we will see in this post that we can replicate one given reference rate, i.e. the general level of the yield curve, but not the shape of the curve, the spread to other yield curves involved and in particular not the volatility structure, which are all fixed again and implied by the model used for initial pricing.

Actually what can be replicated is exactly corresponding to what is explicitly modelled as a stochastic quantity in the model: Since for the TaRF we used a Black model, possibly with local volatility but not with a stochastic factor for the volatility, the volatility is fixed. The FX spot on the other hand as the explicitly modelled quantity can be replicated to be any desired value.

In case of the Hull White model we can imply the level of the yield curve, since this is modelled through the one factor in the model, but we can not match a given shape, since the model does not have the flexibility change the shape of the curve (the shape does change in fact, but in a fixed manner, not driven by stochastic factors in the model). The same with the volatility structure, it is implied by the initial pricing model again.

If one accepts these limitations the method offers a very fast and handy way to approximate the desired scenario NPVs however.

Let’s come to an example how this works for bermudan swaptions. I’d like to explain this going through an example I wrote. We start like this

// Original evaluation date and rate level

Real rateLevelOrig = 0.02;
Date refDateOrig(12, January, 2015);
Settings::instance().evaluationDate() = refDateOrig;

// the yield term structure for the original pricing
// this must _not_ be floating, see the warning in
// the proxy engine's header.

Handle<YieldTermStructure> ytsOrig(boost::make_shared<FlatForward>(
         refDateOrig, rateLevelOrig, Actual365Fixed()));

The original pricing date is 12-Jan-2015 and the yield term structure on this date is at 2\% flat. We set QuantLib’s original pricing date to this date and construct the yield term structure.

One important point is that the yield term structure has a fixed reference date, i.e. it does not change when the evaluation date changes. The reason is that this yield term structure will be linked to the initial pricing model which in turn will be used in the proxy pricing engine later on, which finally relies on the fact that the model does not change when shifting the evaluation date for scenario pricing.

It is like having a fixed frame, as seen from the outer model described above, within which we do local pricings at different times and under different market scenarios, but relying on the frame surronding it to not change. Likewise the rates describing the yield term structure should not change, nor should the initial pricing model be recalibrated or change its parameters during the scenario pricings. The pricing model will be a Hull White model

// the gsr model in T-forward measure, T=50 chosen arbitrary here
// the first model uses the fixed yts, used for the mc pricing
// generating the proxy

boost::shared_ptr<Gsr> gsrFixed = boost::make_shared<Gsr>(
            ytsOrig, stepDates, sigmas, reversion, 50.0);

The monte carlo pricing engine will be set up like this

boost::shared_ptr<PricingEngine> mcEngine =
          .withSteps(1) // the gsr model allows for large steps

We do not need a fine time grid since the stochastic process belonging to the GSR model allows for exact evolution over arbitrary large time intervals. We use 10000 steps to calibrate the Longstaff-Schwartz regression models and also for the final pricing. The seed is 42, why not. The last attribute says that we want not only an usual pricing of the swaption, but also generate a proxy information object which can be used for scenario pricing later on. Since the generation consumes some additional time, it is optional to do it.

Now we can do a good old pricing on our instrument swaption2 (I skipped how this was created).

  Real npvOrigMc = swaption2->NPV();
  Real errorOrigMc = swaption2->errorEstimate();

What I also did not show is that I created a reference integral engine to verify the prices of the mc engine and later on of the proxy engine also. The reference engine relies on a floating yield term structure, both with respect to the reference date and the rate level. The same holds for the proxy engine, so we can move forward in time, change the rate level and compare the pricings in the two engines.

The result of the inital pricing is

Bermudan swaption proxy pricing example
Pricing results on the original reference date (January 12th, 2015):
Integral engine npv = 0.0467257 (timing: 97723mus)
MC       engine npv = 0.0459135 error estimate 0.000809767 (timing: 5.55552e+06mus)

The reference NPV from the integral engine is 467bp, slightly higher than the monte carlo price 459bp. This is perfectly expected since the regression model based approximate exercise decisions are sub-optimal, so the option NPV will be underestimated systematically. Together will the error estimate (one standard deviation) of 8bp this all looks quite ok.

One word about the computation times. The monte carlo simulation (10000 paths both for calibration and pricing) takes 5.5 seconds. This is somewhat in the expected region and I am completely relying on the standard monte carlo framework of the library here. The integral engine on the other hand takes 97 milli seconds, which is quite slow. This can easily be brought down to 5 milliseconds just by using less integration points and covered standard deviations (here I used 129 points and 7 standard deviations), without loosing too much accuracy. This seems quite ok, since neither the GSR model nor the integral engine are much optimized for speed currently.

To create the proxy engine we can say

  boost::shared_ptr<PricingEngine> proxyEngine =
              swaption2->proxy(), rateLevelRef, maturityRef, 64, 7.0, true);

where we just pass the proxy result from the pricing above together with quotes representing the rate level of the scenario pricing and the maturity to which the rate level belongs. Remember that we can only prescribe the level, but not the shape of the yield curve in the scenario pricing. So I allowed to chose a maturity, e.g. 10 years, and prescribe the (continuously compounded) zero yield w.r.t. this maturity. The proxy engine will then imply the Hull White’s model state such that this rate level is matched for the given maturity. The shape of the yield curve is implied by the initial model though and can not be changed. I am repeating myself, ain’t I ? Do I ? Repeat ? God.

The next two parameters refer to the way we do the scnenario pricing between two of the original structure’s exexercise dates: We determine the next available exercise date (if any) and build a grid on the exercise date covering n standard deviations (here 5.0) of the state variable around its mean, all this conditional on the pricing time and the model state (which is implied by the rate level as explained above). We then compute the NPV on each of these grid points using the regression model from the initial monte carlo pricing. Finally we interpolate between the points using cubic splines and integrate the resulting function against the state variable’s density (which is possibly in closed form by the way).

Let’s see what we get under some scenarios for the evaluation date and rates. One scenario is simply created by e.g. saying

 Settings::instance().evaluationDate() = Date(12, June, 2015);

which moves the evaluation date half a year ahead to 12-Jun-2015 and requires the 10.5y zero rate (which corresponds to the maturity of the swaption as the reference point) to be 2%. For the scenarios in the example we get

Pricing results on June 12th, 2015, reference rate 0.02 with maturity 10.5
Integral engine npv = 0.0444243
Proxy    engine npv = 0.0430997 (timing: 127mus)

Pricing results on June 11th, 2019, reference rate 0.025 with maturity 6.5
Integral engine npv = 0.0372685
Proxy    engine npv = 0.0361514 (timing: 170mus)

Pricing results on June 11th, 2019, reference rate 0.02 with maturity 6.5
Integral engine npv = 0.0223442
Proxy    engine npv = 0.0223565 (timing: 159mus)

Pricing results on June 11th, 2019, reference rate 0.015 with maturity 6.5
Integral engine npv = 0.0128876
Proxy    engine npv = 0.0141377 (timing: 169mus)

Pricing results on January 11th, 2020, reference rate 0.02 with maturity 6
Integral engine npv = 0.0193142
Proxy    engine npv = 0.0194446 (timing: 201mus)

Pricing results on January 11th, 2021, reference rate 0.02 with maturity 5
Integral engine npv = 0.0145542
Proxy    engine npv = 0.0137726 (timing: 208mus)

Pricing results on June 11th, 2024, reference rate 0.02 with maturity 1.5
Integral engine npv = 0.00222622
Proxy    engine npv = 0.00292998 (timing: 282mus)

which looks not too bad. The proxy engine’s computation time is also not bad, around 100-300 microseconds. Again, this can be made faster by using less integration points, but already seems competitive enough for XVA simulations.

The proxy regression model deserves some comments. It is not just the Longstaff-Schwartz regression model, since this only looks at states where the exercise value is positive. This is essential for a good fit of the regression function (which I chose to be simply a quadratic function), since left from the exercise point the option value (which equals the continuation value in this area) flattens and would destroy the global fit.

What I did until now is just calibrate two separate models, one in the region corresponding to positive exercise values and another in the complementary region. This is similar to the regression model for the FX TaRF, where I also used two quadratic polynomials, if you remember (if not, you can read about it here or more detailled in the paper).

This solution is not perfect and can be improved probably. The following picture shows the regression models from the example. The different exercise dates are distinguished by color (0 denotes the first exercise date, in dark color, 1 the second, and so on until 9 which denotes the 10th exercise date, in bright color).


What you can see is the discontinuity at the cutoff point between exercise and non-exercise area. It is exactly this jump that one should work on I guess. But not now.

In technical terms I relied on Klaus’ implementation of the Longstaff Schwartz algorithm in longstaffschwarzpathpricer.hpp. This needed some amendments to allow for non-american call rights which it was designed for I believe. Apart of this I mainly inserted a hook in the pricer

virtual void post_processing(const Size i,
                             const std::vector<StateType> &state,
                             const std::vector<Real> &price,
                             const std::vector<Real> &exercise) {}

with an empty default implementation. This function is called during the calibration phase, passing the collected data to a potential consumer, which can do useful things with it like building a regression model for proxy pricing like in our use case here. To do this I derived from Klaus’ class as follows

class LongstaffSchwartzProxyPathPricer
    : public LongstaffSchwartzPathPricer<typename SingleVariate<>::path_type> {
        const TimeGrid &times,
        const boost::shared_ptr<EarlyExercisePathPricer<PathType> > &pricer,
        const boost::shared_ptr<YieldTermStructure> &termStructure);

    const std::vector<boost::function1<Real, StateType> > basisSystem() const {
        return v_;
    const std::vector<Array> coefficientsItm() const { return coeffItm_; }
    const std::vector<Array> coefficientsOtm() const { return coeffOtm_; }
    const StateType cutoff() const { return cutoff_; }

which offers inspectors for the two regression models coefficients, the cutoff point separating them and the underlying function system. The implementation of the hook function is simple

void LongstaffSchwartzProxyPathPricer::post_processing(
    const Size i, const std::vector<StateType> &state,
    const std::vector<Real> &price, const std::vector<Real> &exercise) {

    std::vector<StateType> x_itm, x_otm;
    std::vector<Real> y_itm, y_otm;

    cutoff_ = -QL_MAX_REAL;

    for (Size j = 0; j < state.size(); ++j) {
        if (exercise[j] > 0.0) {
        } else {
                cutoff_ = state[j];

    if (v_.size() <= x_itm.size()) {
        coeffItm_[i - 1] =
            GeneralLinearLeastSquares(x_itm, y_itm, v_).coefficients();
    } else {
        // see longstaffschwartzpricer.hpp
        coeffItm_[i - 1] = Array(v_.size(), 0.0);

    if (v_.size() <= x_otm.size()) {
        coeffOtm_[i - 1] =
            GeneralLinearLeastSquares(x_otm, y_otm, v_).coefficients();
    } else {
        // see longstaffschwartzpricer.hpp
        coeffOtm_[i - 1] = Array(v_.size(), 0.0);

and does nothing more than setting up the two data sets on which we estimate the regression models.

The coefficients are read in the monte carlo engine and serve as the essential part in the proxy object which can then be passed to the proxy engine. And we are done. You can find the full example code here.

XVA for Bermudan Swaptions