Deconstructing the Gaussian copula, part II

July 9, 2009 in Finance, Math, Risk

(Parts I, II-and-a-half, and III of this series are also available.)

Recently, I addressed a great deal of misinformation regarding the Gaussian copula and its role in the 2008 crisis. I would like to follow that up with a succinct description of the copula and its use in CDO pricing. (This may seem a defense of the math behind the process, but you know I'm just setting it up for a fall.)

Introduction

David Li's contribution to quantitative finance was the rapidly-standardized "single factor Gaussian copula" CDO pricing framework. The real crux of the problem was the "single factor" part - not the Gaussian copula itself (though we won't pull any punches here). In an extraordinarily broad sense, a copula is a mathematical function that describes how two or more random variables interact. "Correlation" is a simple way of describing the copula, which should give the function some intuitive grounding. But let's back up a second and figure out why we even need a copula in the first place.

Aside: Why Copulas?

If you try to model the behavior of many random variables, you need a multivariate distribution. The most mathematically friendly distributions come from the Gaussian family, including the familiar bell (or normal) curve, which is why such models are prevalent in all manner of statistics. For most purposes, the model is not only easy to work with but asymptotically correct (which is a nice feature, to put it mildly). However, there are some areas where the model is chosen for pragmatic reasons rather than justified ones - finance being prime among them. Indeed, financial distributions do not behave normally, and only recently have tools been developed that can describe them - and even then, large joint distributions are daunting.

So, it is unsurprising that the Gaussian copula arose as a natural choice for modeling the joint distribution inherent to CDOs - which are essentially just collections of many intercorrelated credits.

But I'm getting ahead of myself. (This is much easier to discuss than to write about, I think, because in conversation you can gauge your audience's comfort with each boldfaced section before moving on. I hope, brave reader, that you are still there.) Let's talk about CDOs.

CDOs

A CDO is nothing more than a collection of various bonds, all held together in a basket. The principal risk of a CDO is default: the chance that one or more of the bonds will not survive to maturity. To isolate this risk, it is instructive to think of the CDO as a basket of sold CDS contracts, rather than a basket of purchased bonds (and indeed, "synthetic CDOs" are nothing more than CDS portfolios and have rapidly gained market share from bond portfolios). Thus, the buyer of a CDO needs to answer two questions regarding the basket:

  1. Will any of the credits default?
  2. When will all of those defaults occur?

The first point is obvious; the second gets at the heart of the problem. Both the timing and the correlation of defaults matter. If the CDO basket is composed disproportionately of financial companies, then default by one may imply a greater likelihood of default for the others; a more diversified basket may not exhibit such dependencies.

This issue is compounded by the introduction of tranches - a staple of the CDO industry. Again, it is helpful to consider a CDO as a basket of sold CDS. The most junior (or "equity") tranche has, by definition, sold insurance on the first few issuers to default - say, the first 3. The next tranche does not experience a loss until the 4th issuer defaults. The key here is that when a portfolio is tranched, investors have not sold CDS on specific issuers by name, but rather by the order in which defaults occur. They cannot know ahead of time which issuers they are effectively on the hook for.
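(For the code-inclined, here is a minimal sketch of that loss allocation, counting whole defaults and ignoring recovery to match the "first 3" example above; the function name and attachment points are hypothetical:)

```python
def tranche_defaults_absorbed(n_defaults: int, attach: int, detach: int) -> int:
    """Defaults absorbed by a tranche responsible for defaults number
    attach+1 through detach (whole defaults, no recovery - a simplification)."""
    return max(0, min(n_defaults, detach) - attach)

# Equity takes the first 3 defaults; the next tranche takes defaults 4-6:
for n in range(8):
    print(n, tranche_defaults_absorbed(n, 0, 3), tranche_defaults_absorbed(n, 3, 6))
```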

Bathtub Correlation

To understand why tranching compounds the correlation problem, think of the CDO as a rectangular bathtub interspersed with mines that represent each issuer's default. The CDO investors are aboard a boat on one side of the bathtub and need to cross to the other side. If the boat hits a mine, that issuer defaults, and the explosion will damage the boat. The equity tranche has an extremely thin hull and will sink quickly; the senior tranche has a thick hull and can withstand many blasts without taking damage. Finally, the boat moves across the bathtub via geometric Brownian motion - which is to say, randomly.

In a low-correlation world, the mines are dispersed uniformly at random across the bathtub; hitting one mine does not imply or necessitate hitting any other. With high correlation, the mines cluster somewhere in the water; hitting one mine makes it relatively certain that another will be hit.

As a consequence, equity investors prefer high correlation. Hitting a few mines or many makes no difference to them - they are wiped out either way. They therefore prefer the mines to be clustered, as this leaves more clear paths across the bathtub. In contrast, senior investors prefer low correlation - they can withstand glancing off a few mines, but hitting a cluster would wipe them out.

From this intuitive example, it should be clear that not only the timing of the defaults, but also their expected clustering (i.e. correlation) is important when valuing a CDO tranche.
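(If you would like to see the bathtub in numbers, here is a toy Monte Carlo that jumps ahead to the single-factor construction described later in this post. Every input - ten names, a 10% default probability, the correlation levels - is made up for illustration. Notice that raising correlation increases both the chance of a clean crossing, which equity loves, and the chance of a cluster of hits, which senior fears:)

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def default_counts(rho, n_names=10, p=0.10, n_sims=200_000):
    """Count defaults per scenario: name i defaults when its latent variable
    X_i = sqrt(rho)*M + sqrt(1-rho)*Z_i falls below the threshold Phi^-1(p)."""
    m = rng.standard_normal((n_sims, 1))        # shared "market" draw per scenario
    z = rng.standard_normal((n_sims, n_names))  # idiosyncratic draw per name
    x = np.sqrt(rho) * m + np.sqrt(1.0 - rho) * z
    return (x < norm.ppf(p)).sum(axis=1)

for rho in (0.05, 0.50, 0.95):
    d = default_counts(rho)
    print(f"rho={rho:.2f}  P(no defaults)={np.mean(d == 0):.3f}  "
          f"P(6+ defaults)={np.mean(d >= 6):.4f}")
```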

Correlation in the Gaussian Copula

Let us first draw the connection I've sketched out already: CDOs are composed of many issuers that may interact with each other, and a multivariate normal distribution is a common method of describing such behavior. So far, so good.

Like any Gaussian multivariate model, the Gaussian copula takes as parameters the correlation of every pair of variables under consideration. (In other words, to make the model work, you need to "explain" to it how every issuer interacts with every other issuer - these are the parameters.) Thus, the number of parameters increases with the square of the number of variables being considered - specifically, there are \frac{N(N-1)}{2} parameters. If you had a CDO of 100 names, you would need to estimate 4,950 parameters to describe their behavior! It doesn't take a degree in statistics to appreciate the flimsiness of a model that relies on such assumptions - there are simply too many to estimate reliably. Clearly, the traditional model won't do.
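(If you prefer the pair count in code rather than symbols, here is the formula above as a one-liner - the function name is mine:)

```python
def n_correlation_pairs(n: int) -> int:
    """Distinct pairwise correlation parameters among n names: n(n-1)/2."""
    return n * (n - 1) // 2

print(n_correlation_pairs(100))  # 4950
```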

Enter David Li, whose principal contribution to this field was to boil 4,950 parameters down to just one.

Shocking! Dastardly! The decision that caused the 2008 crisis! Well, not really. Though I am full of doubts about the validity of the Gaussian copula for this task in the first place, I do not think that the compression of its parameter space is the chief culprit by any means.

What Li suggested amounted to this: instead of modeling the intricate inter-corporate correlation structure - in which financials are highly correlated to each other but bear little resemblance to utilities, which are in turn very similar to one another - why not just model everything at the average correlation of the CDO names? Actually, he just said that one correlation level would be enough to describe the CDO price - he did not say it was the average (I added that to make the notion more tolerable at first glance). He didn't care if you chose a higher or lower correlation than any pair in the whole CDO exhibited; his claim was that there was some single number that would get the model to output a price that matched the market.

Before we get up in arms about this, let's remember that most financial instruments are priced this way. One or more variables of the equation are left free to change, such that at some level the model will output the "correct" (or market-observed) price. With options, this free variable is called implied volatility; with swaps, it is the fixed rate; with bonds, the yield. I particularly like the last example because most people assume this is limited to derivatives. It's not - "real" securities exhibit this too: for stocks, it's called a P/E ratio.

So, we've boiled correlation down to one parameter which can take any value, but which forces all issuers to have the same correlation to each other AND (this is a much more important caveat) to exhibit a Gaussian dependence structure.

Now What? This Is Getting Boring.

Ok, let's price a CDO.

If I have CDS prices for all the issuers in my CDO, I can back out the probability of each issuer defaulting. (That's a whole other lecture, but please take my word that if we have the price of default insurance, we can calculate the probability of default. Otherwise I'll go on for another 2000 words...) This answers my first question: will defaults occur? Combine that with a correlation number and I can answer the second question: when will all the defaults occur? So now I can price the CDO, right? Unfortunately, no.
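(I promised not to give that lecture, but for the impatient, a common back-of-the-envelope - sometimes called the "credit triangle" - approximates the default intensity as the spread divided by the loss given default. To be clear, this shortcut is my illustration, not the full curve bootstrap a dealer would run:)

```python
import math

def default_prob_from_spread(spread_bps: float, recovery: float, t_years: float) -> float:
    """Credit-triangle approximation: intensity lam ~ spread / (1 - recovery),
    so P(default by t) = 1 - exp(-lam * t). A sketch, not a full bootstrap."""
    lam = (spread_bps / 10_000.0) / (1.0 - recovery)
    return 1.0 - math.exp(-lam * t_years)

# e.g. a 200bp CDS with 40% recovery implies roughly a 15% five-year default probability:
print(default_prob_from_spread(200, 0.40, 5.0))  # ~0.153
```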

The default probabilities backed out of the CDS data are conditional default probabilities, meaning they have the market's 4,950 correlation factors baked into them. Company A may be doing fine, but it is highly correlated to company B, which is not so healthy. The result is that company A's CDS will exhibit a relatively high default probability, even though that is more B's fault than A's.

In statistics, we like to deal with independent or unconditional probabilities, because the math becomes dramatically easier. So the conditional probabilities extracted from the CDS are not so useful, and must be transformed into independent probabilities. To achieve this goal, we do something that I think is very clever:

We set up a model in which defaults are driven by a shared "market factor" and an idiosyncratic factor, similar to a regression with one explanatory variable and an error term - hence the name "single factor model." Now, I know I just said there are two factors, but one is specific to each individual issuer, so it doesn't count as one of the model factors - if this troubles you, chalk it up to statistical nuance. Anyway, the two drivers are weighted by a correlation term; as correlation increases the market factor dominates, and as it decreases the idiosyncratic factor dominates.
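(I promised no equations, so here is the standard textbook construction in code instead. Each issuer's latent default driver is the shared market factor weighted by the square root of the correlation, plus its own independent noise; the quick simulation just verifies that any two issuers then share the single correlation parameter. The numbers are arbitrary:)

```python
import numpy as np

rng = np.random.default_rng(1)
rho, n_sims = 0.30, 500_000

m = rng.standard_normal(n_sims)  # the shared market factor
x1 = np.sqrt(rho) * m + np.sqrt(1 - rho) * rng.standard_normal(n_sims)
x2 = np.sqrt(rho) * m + np.sqrt(1 - rho) * rng.standard_normal(n_sims)

print(np.corrcoef(x1, x2)[0, 1])  # ~0.30, the single correlation parameter
```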

Now, suppose for a moment we knew the value of the [random] market factor. In that case, default would be driven solely by the idiosyncratic factor, since the market factor is fixed. The idiosyncratic factor is, by definition, independent across all issuers. Therefore, by conditioning on a given level of the market factor, we have artificially created a scenario in which defaults are independent across issuers. More specifically, we have generated a set of conditionally-independent default probabilities. Now, repeat the process for every issuer and every market factor level. The result is a complete picture of how every issuer behaves in every possible situation, and by averaging over the market factor, the unconditional probabilities can be extracted.
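(For the code-inclined, a sketch of that transformation using the standard textbook formula - the inputs p and rho are made up for illustration:)

```python
import numpy as np
from scipy.stats import norm

def conditional_pd(p: float, rho: float, m):
    """P(default | market factor M = m) in the one-factor Gaussian model.
    Once m is fixed, defaults are independent across issuers."""
    c = norm.ppf(p)  # latent default threshold implied by the unconditional probability
    return norm.cdf((c - np.sqrt(rho) * m) / np.sqrt(1.0 - rho))

# Averaging the conditional probability over the Gaussian market factor
# recovers the unconditional probability we started with:
p, rho = 0.05, 0.30
m_grid = np.linspace(-8.0, 8.0, 4001)
dm = m_grid[1] - m_grid[0]
print(np.sum(conditional_pd(p, rho, m_grid) * norm.pdf(m_grid)) * dm)  # ~0.05
```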

(If that isn't quite clear, suffice it to say there's a bit of math behind it. Interestingly, the math is surprisingly simple, but with the exception of the number of parameters in a Gaussian model, I have promised not to write out any equations in this post, so in the absence of symbols I hope you will accept my reasoning.)

So now, we have the probability of every issuer independently defaulting at any given time - with that information, it is relatively straightforward to figure out the expected loss on the portfolio. In fact, it's mainly arithmetic at this point: the value of the portfolio is just the probability-weighted average payoff of all the issuers.
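(That arithmetic, sketched with made-up numbers - each name pays its notional if it survives and its recovery value if it defaults, and the portfolio's expected value is the sum of those probability-weighted payoffs:)

```python
notional, recovery = 1.0, 0.40
default_probs = [0.02, 0.05, 0.03, 0.10]  # hypothetical per-name probabilities

expected_payoff = sum(notional * ((1 - p) + p * recovery) for p in default_probs)
expected_loss = notional * len(default_probs) - expected_payoff
print(expected_payoff, expected_loss)  # 3.88 and 0.12 of a 4.00 total notional
```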

And that's really it - that's how the Gaussian copula is used to price a CDO, or a collection of sold CDS on many issuers. We calculate the default probabilities from the CDS, then we use the Gaussian copula to tell us how they relate to each other. You'll notice that I never actually mentioned the copula when discussing the probability model - that's because you don't really need it. It happens that the copula math simplifies nicely into something that is almost, but not quite, entirely unlike a copula (hey! a Douglas Adams reference!). However, the copula-based approach is more informative, even if copula-specific math per se doesn't enter the picture.

And why is this so bad?

A few of the modeling decisions I've described above are unquestionably poor ones, though it may not be obvious how to improve them. Here is my brief rundown:

  • The Gaussian dependence structure - what's wrong with it? What alternatives are there? Why are they better?
  • The single factor - is it really sufficient to describe the behavior?
  • The single correlation number - is it sufficient to describe the behavior? Can we reliably estimate more relationships? Is correlation the right metric in the first place?

I'll attempt to answer all these and more in part III...

Comments

Mike July 15, 2009 at 6:20 pm

It’s interesting to me how a similar move, breaking risk into systematic and idiosyncratic risks, was essential to the initial discovery of the CAPM model of equity returns. I like the feedback loop: the move was a response to complexity in an exponential number of parameters (like the CDS), but then that model influenced how people invested, and now there’s talk that the way people invested (asset allocations on betas) has mucked up the returns it was supposed to model, as many financial observers are concluding.

Performativity always wins in the end.

I’m excited for part III.


J July 15, 2009 at 7:49 pm

Great point – I completely agree. I hadn’t thought of it in those terms, but the CAPM would be an excellent distillation for a SFGC student.


R July 16, 2009 at 12:32 am

Just a random comment. I read this earlier when I was trying to explain tranching and correlation to a friend. The “bathtub” analogy didn’t go over so well, but I came up with another that did. I’m offering it here because it might be more accessible to most people’s experience:

Think of a freeway with random cars traveling along it (the “mines”). Add another car (the “boat”) trying to pass through at high speed — a police chase, or a video game like SpyHunter.

Then as above, if the cars are fairly evenly dispersed (low correlation), a collision with one is unlikely to result in a second collision. But if the cars are traveling in clusters (high correlation), one collision will most likely cause a multi-car pileup.

And if the “boat” car is a lightweight sports car, a single collision will wipe it out (equity investors). But if it’s a Hummer or Range Rover, it might handle several collisions (senior investors) and therefore prefers low correlation.

Just thought this might be helpful! (NB: I’m a humanities grad student, not a finance guy, but I’ve done a lot of reading and think I understand this…)


J July 16, 2009 at 1:05 am

Thank you, this is another great illustration of the idea.


R July 16, 2009 at 12:37 am

One other thing that bothers me about the correlation description: you say explicitly that high correlation (clustered mines) “leaves more clear paths across the bathtub”.

Is this actually true for CDOs in real life? The implication would be that the odds of any single (first) default are lower for a highly-correlated basket than for a diversified one.

Why and how would this be the case?


J July 16, 2009 at 1:02 am

The question is very sensitive to how it is phrased. First, the probability of any single mine getting hit is the same regardless of correlation. You can see this because the path across the bathtub is random, so the location of any single mine has no bearing on its likelihood of getting hit. However, with low correlation there is a greater chance of more than one mine getting hit: in the extreme low-correlation case of pure independence, missing one mine has no bearing on whether you hit any other; but with high correlation, missing one mine makes you very likely to miss the others.

So, it’s not that the *first* mine is more or less likely to be hit; it’s that under high correlation, conditional on missing the first mine, the remaining mines become less likely to be hit.

For an example, consider a CDO made up of 100 banks. This is high correlation – they’re either all defaulting or all surviving. You want to own the equity here because if they all survive, which is a distinct possibility, you will get a huge payoff. But if you have a diversified portfolio, it’s less conceivable that every name will survive (since there is more idiosyncratic risk), and since there are fewer “clear paths across the bathtub”, you would rather have a stronger tranche.


Sandrew July 16, 2009 at 9:24 am

One nit:

“A CDO is nothing more than a collection of various bonds… Both the timing and the correlation of defaults matter… This issue is compounded by the introduction of tranches.”

Tranching does not COMPOUND the issue, it IS the issue. Correlation does not matter until you introduce tranching. If you have a 0-100 “CDO”, which is the simple case you describe of holding a basket of bonds (or CDS if synthetic) sans tranching, that is identical to holding the same bonds (CDS) individually, with no basket to hold them.


J July 17, 2009 at 8:17 am

That’s absolutely correct – a “full” CDO is nothing more than a bond portfolio, and we hardly need a correlation assumption to price it. However, from a risk management perspective (and this falls even outside the realm of the SFGC), the portfolio dynamics will depend on the correlation of its components. A more diverse portfolio will exhibit less volatility than one composed of very similar credits. Essentially, the expected value – or price – of two portfolios could be identical, but second-moment information would affect an investor’s preference for one or the other.


Sandrew July 23, 2009 at 9:39 am

Thanks for the response, J. I agree that, for a full CDO, joint default distributions matter for risk purposes if not for pricing.

BTW, some of us are still eagerly awaiting part III.

Which reminds me… I wonder if in part III you could address the issue of static recovery assumptions. This is more than the issue that market recovery inputs are difficult to observe (absent transparency into fair spreads for fixed- or zero-recovery swaps). Even if you have such data, it seems a problem - particularly in the pricing of senior tranches - that the standard SFGC model ignores recovery dynamics. The simplest illustration of this problem is a super-senior 60-100 tranche where all recovery rates are 40%. No correlation level will allocate losses to this tranche, since even if all reference credits default simultaneously, the portfolio loses only 60% - exactly the attachment point - and this tranche will walk away with its full principal.

I believe there are a few papers out there that describe proposed extensions to the SFGC model to introduce dynamic recovery. I’m not that familiar with these methods, so if someone could deconstruct them, that’d be rad.


Jia March 29, 2013 at 11:55 am

What is the meaning of dependence structure? How does one estimate the factors?

