(Parts I, II and II and a half of this series are also available.)

In the first two parts of this series, I respectively addressed some misperceptions about the Gaussian copula and described its common use in CDO pricing. Part III focuses more on the model components and the intuition driving them.

I am a staunch supporter of a “models are just the tool” viewpoint, an opinion more elaborately and memorably stated by George Box as, “All models are wrong, but some are useful.” With that in mind, what you will find here is not a campaign against the Gaussian copula itself; merely its blind application to certain problems in finance. I find it as difficult to blame this model alone for the 2008 recession as I find it hard to blame the sinking of the Titanic on its hull design (new research actually suggests the rivets were more at fault) – while it certainly contributed to a general sense of invincibility and well-more-than-advisable risk taking, it is naive to think that in the absence of this notorious model, 2008 would have turned out just fine.

As I recently (and strangely, given my campaigns against it) stated about VaR, the Gaussian copula does exactly what it is supposed to do – the error lies in its interpretation and its application in the first place. I join Paul Wilmott in his crusade for fewer equations and more common sense among quantitative financiers: getting the number is good but explaining it is better.

Gaussian Dependence

Copulas are nothing more than descriptions of how two or more random variables relate to each other. To be more specific, copulas refer to the co-behaviors of uniform random variables only; but any continuous distribution may be transformed to the uniform case via its CDF, and that is the appeal of copula models: they describe dependence without regard to the marginal distributions. The Gaussian copula, we may conclude, doesn’t necessarily have anything to do with normal distributions as we typically think of them (i.e. in the “normal distributions are useless in finance” sense)! Rather, it describes the sort of dependence that arises when a bunch of normally-distributed variables are correlated with each other.

Gaussian dependence isn’t as easy to describe as a Gaussian distribution is. For the latter case, just think of a bell curve. The former is more difficult to identify, so here’s a picture of two uniform random variables with a Gaussian dependence structure:

Gaussian Simulation
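
For readers who would rather generate such a picture than squint at one, here is a minimal Python sketch of sampling from a bivariate Gaussian copula. The function name, the 0.7 correlation and the seed are illustrative choices of mine, not the settings behind the plot above.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(n, rho, seed=0):
    """Draw n pairs (u1, u2) with uniform marginals and Gaussian dependence."""
    rng = random.Random(seed)
    phi = NormalDist().cdf  # standard normal CDF
    scale = math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n):
        # Two standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)
        # Pushing each normal through its own CDF removes the bell-curve
        # marginals and leaves only the dependence structure (the copula).
        pairs.append((phi(z1), phi(z2)))
    return pairs


# Assumed illustration values: 20,000 points, correlation 0.7.
pairs = sample_gaussian_copula(20_000, rho=0.7)
```

Scatter-plotting `pairs` with any plotting tool should give a cloud that is uniform along each axis but concentrated around the diagonal.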

A first observation is that the dependence is regular (meaning even) and smooth. It lacks any significant clustering. More importantly, it lacks a property called tail dependence. Tail dependence is the tendency for extreme observations to occur in all random variables at once. Strictly speaking, it measures the probability of observing a tail event in one variable given a tail event in another. As you move further out in the tail, that conditional probability converges to a positive limit for structures exhibiting tail dependence; for the Gaussian copula with any correlation below 1, it converges to zero. It is extremely surprising and counter-intuitive to learn that the Gaussian copula lacks tail dependence. In plain English, this means that tail events in the Gaussian copula are asymptotically independent of each other – and that is the chief problem with using Gaussian dependence in finance.
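
The vanishing tail dependence can be seen numerically. The sketch below, with a correlation of 0.7, thresholds and sample size that are purely my own assumptions, estimates the chance that the second variable lands in its upper tail given that the first one did, at progressively more extreme cutoffs.

```python
import math
import random
from statistics import NormalDist


def upper_tail_profile(rho, quantiles, n=400_000, seed=1):
    """Estimate P(U2 > q | U1 > q) under a Gaussian copula for each q."""
    rng = random.Random(seed)
    # U > q is the same event as Z > inv_cdf(q), so we stay on the normal scale.
    cutoffs = [NormalDist().inv_cdf(q) for q in quantiles]
    hits_one = [0] * len(cutoffs)
    hits_both = [0] * len(cutoffs)
    scale = math.sqrt(1.0 - rho * rho)
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)
        for i, c in enumerate(cutoffs):
            if z1 > c:
                hits_one[i] += 1
                if z2 > c:
                    hits_both[i] += 1
    return {q: hits_both[i] / hits_one[i] for i, q in enumerate(quantiles)}


# Assumed illustration values: correlation 0.7, three cutoffs.
profile = upper_tail_profile(0.7, [0.9, 0.99, 0.999])
```

The estimates in `profile` should shrink as the cutoff moves outward, which is the asymptotic independence described above showing up in a finite sample.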

In finance, extreme events co-occur all the time, as recent memory bears witness. If risk management is the process of ascertaining, measuring, and avoiding those situations, then doesn’t it seem a little odd to use a model which is explicitly unable to account for them? Tail dependence is a necessary condition for a dependence model in finance. The Student t copula exhibits it and is only marginally more difficult to implement than a Gaussian copula; but simplicity is king and there was obviously a decision made at some point that tail events didn’t require consideration, anyway. It brings to mind my favorite metaphor for VaR: an airbag that always works, except in a crash.

Correlation

Another element of the Gaussian model which does not carry well to finance is the idea that linear correlation is a sufficient statistic for the dependence distribution. Consider these two plots, each of which shows two variables that, by construction, have a correlation of 0.7 to each other. First, a Gaussian dependence structure (this looks different than the above plot because the former was the copula itself, as indicated by the uniform marginals, whereas this is a full copula-derived multivariate distribution):

Gaussian copula with 0.7 correlation

Next, a dependence structure exhibiting lower tail dependence (this is from a Clayton copula and is a stylized depiction of behaviors more characteristic of finance). You can plainly see the impact of the tail dependence, in contrast to the Gaussian plot above:

Clayton copula with 0.7 correlation

The two distributions are very obviously different, and yet if you merely measured their correlation you’d describe them in exactly the same way. Correlation alone is insufficient to describe more complex dependence structures such as those observed in finance. And yet, it is the only parameter describing dependence in a multivariate Gaussian distribution.
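
A rough way to put numbers on the contrast is sketched below. One caveat up front: rather than matching the two structures on linear correlation as the plots do, this sketch matches them on Kendall’s tau, which has closed forms for both copulas; that substitution, the 2% cutoff and all names are my own assumptions.

```python
import math
import random
from statistics import NormalDist

RHO = 0.7
# Kendall's tau implied by a Gaussian copula with correlation RHO.
TAU = 2.0 / math.pi * math.asin(RHO)
# Clayton parameter with the same Kendall's tau: tau = theta / (theta + 2).
THETA = 2.0 * TAU / (1.0 - TAU)
_PHI = NormalDist().cdf


def gaussian_pair(rng):
    z1 = rng.gauss(0.0, 1.0)
    z2 = RHO * z1 + math.sqrt(1.0 - RHO * RHO) * rng.gauss(0.0, 1.0)
    return _PHI(z1), _PHI(z2)


def clayton_pair(rng):
    # Conditional-inversion sampling for the bivariate Clayton copula.
    u1 = 1.0 - rng.random()  # in (0, 1], avoids dividing by zero
    w = 1.0 - rng.random()
    u2 = (u1 ** -THETA * (w ** (-THETA / (1.0 + THETA)) - 1.0) + 1.0) ** (-1.0 / THETA)
    return u1, u2


def joint_lower_tail(sampler, q, n=200_000, seed=2):
    """Estimate P(U2 < q | U1 < q) by simulation."""
    rng = random.Random(seed)
    one = both = 0
    for _ in range(n):
        u1, u2 = sampler(rng)
        if u1 < q:
            one += 1
            if u2 < q:
                both += 1
    return both / one


gauss_tail = joint_lower_tail(gaussian_pair, 0.02)
clayton_tail = joint_lower_tail(clayton_pair, 0.02)
```

Despite sharing the same overall strength of association, `clayton_tail` should come out well above `gauss_tail`: a crash in one variable is far more likely to be accompanied by a crash in the other.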

Financial covariates tend to resemble the second plot – when a large negative event occurs in one, it more than likely will occur in the other. This, by the way, accounts for some of the skewness in financial distributions – it is possible to have two perfectly normal distributions whose combination is nonetheless skewed if the dependence structure exhibits tail effects like this.
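
The skewness claim is easy to check by simulation. In this sketch (the Clayton parameter of 2, the sample size and the helper names are assumptions of mine), both variables are exactly standard normal on their own, yet their sum is skewed when they are tied together by a lower-tail-dependent copula.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()


def _clamp(u, eps=1e-12):
    # Keep u strictly inside (0, 1) so inv_cdf is always defined.
    return min(max(u, eps), 1.0 - eps)


def clayton_normal_sum(rng, theta):
    """Sum of two standard normals joined by a Clayton copula."""
    u1 = 1.0 - rng.random()
    w = 1.0 - rng.random()
    u2 = (u1 ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return _STD.inv_cdf(_clamp(u1)) + _STD.inv_cdf(_clamp(u2))


def gaussian_normal_sum(rng, rho):
    """Sum of two standard normals joined by a Gaussian copula."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return z1 + z2


def skewness(xs):
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


rng = random.Random(3)
clayton_sums = [clayton_normal_sum(rng, theta=2.0) for _ in range(100_000)]
gauss_sums = [gaussian_normal_sum(rng, rho=0.7) for _ in range(100_000)]
clayton_skew = skewness(clayton_sums)
gauss_skew = skewness(gauss_sums)
```

The Gaussian-dependence sum should show skewness near zero, while the Clayton-dependence sum should be visibly negative.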

Again, we have a call for clarity: it is imperative for the underlying dynamic of any model to resemble the behaviors of the system in question.

The Single Correlation Factor

In a CDO pricing framework based on the Gaussian copula, not only is correlation the sole determinant of the dependence structure, it is assumed to be the same for every name in the basket. This has caused much alarm. Certainly, using more factors would provide a more accurate model – allowing different industries to have different correlations, for example. Unfortunately, this comes at the cost of model integrity.

It is very important, where possible, for a model to have no more than one unobserved input for every output. Think of a Black-Scholes option: future volatility cannot be known, so we plug in whatever value gets the model to spit out the current market price of the option (a “the market is always right” approach). If there were two volatilities (say, a short term value and a long term value), we would be unable to create a consistent model, for there would likely be an infinite number of volatility pairs that would satisfy the market price. For every additional parameter, we need one more output metric to match. If we could match an option’s price and also its delta, just for argument’s sake, then there is probably a unique combination of two volatilities for that output space.
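
Here is a small sketch of that argument. The Black-Scholes call formula and the bisection are standard; the “two-volatility” model, which simply averages a short-term and long-term variance, is a hypothetical toy of mine meant only to show how a second free input destroys uniqueness.

```python
import math
from statistics import NormalDist

_N = NormalDist().cdf


def bs_call(spot, strike, t, r, vol):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(spot / strike) + (r + 0.5 * vol * vol) * t) / (vol * math.sqrt(t))
    d2 = d1 - vol * math.sqrt(t)
    return spot * _N(d1) - strike * math.exp(-r * t) * _N(d2)


def implied_vol(price, spot, strike, t, r, lo=1e-4, hi=5.0, tol=1e-10):
    """One unknown, one price: bisection finds the single matching vol."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, t, r, mid) > price:
            hi = mid  # model too expensive, so vol is too high
        else:
            lo = mid
    return 0.5 * (lo + hi)


def two_vol_call(spot, strike, t, r, short_vol, long_vol):
    """Hypothetical toy model: price off the average of two variances."""
    blended = math.sqrt(0.5 * short_vol ** 2 + 0.5 * long_vol ** 2)
    return bs_call(spot, strike, t, r, blended)


market_price = bs_call(100.0, 100.0, 1.0, 0.02, 0.25)
recovered = implied_vol(market_price, 100.0, 100.0, 1.0, 0.02)

# Two quite different volatility pairs that the toy model cannot tell apart.
price_a = two_vol_call(100.0, 100.0, 1.0, 0.02, 0.20, 0.30)
price_b = two_vol_call(100.0, 100.0, 1.0, 0.02, 0.10, math.sqrt(0.12))
```

With one input, `recovered` pins down the volatility that produced the price. With two, `price_a` and `price_b` agree even though the inputs do not, so the market price alone cannot choose between them.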

This is why using multiple correlations is problematic not just from a fitting standpoint, but from a model integrity standpoint – if you take the thousands of necessary pairwise correlations and estimate just a handful of them incorrectly, the model could deliver completely spurious results.

(For a very concrete example of this, consider pricing a mezzanine or senior CDO tranche, which requires two correlation inputs. Without knowledge of the corresponding equity tranche price – and consequently the attachment point correlation – this becomes a very difficult puzzle indeed).

However, in my mind this is one of the more minor problems. That’s not to say it isn’t an issue, but I’d much rather have a single-parameter tail-dependent model than a multi factor Gaussian one. Why? Because it’s more important to me that a model captures downside risk in some regard than that it captures the distribution’s central dynamics more faithfully.

Correlation (again)

We’ve discussed why correlation is insufficient to describe the CDO dynamics, and also why a single-factor model may lack fidelity. But in some ways, the entire discussion is slightly off base. Correlation (as I’ve alluded before) is an implied measure – it is whatever plug gets the model to output the “right” price.

There is a raging debate about how similar correlation is to Black-Scholes volatility, but I think for the purposes of this exercise we can highlight their similarities (though I will not necessarily agree with that under more rigorous terms). Both are plug values; both have intuitively “correct” ranges but cannot be directly measured or observed; both are the single unobserved input in the simplest pricing models of their respective derivatives.

Because of this, a lot of our reasoning on the problems with correlation goes backwards, since we begin with the premise that correlation is arbitrary and/or unmeasurable, and therefore conclude that a correlation-based model must fail. However, in practice we actually start with a tranche price, and work out the implied correlation value from that price. So I don’t really care if my correlation comes out to 60% or 70% because I’m not going to read too much into that figure – it’s just a parameter that will keep my model ticking consistently with the market, all else equal.
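
To make “working out the implied correlation” concrete, here is a stripped-down sketch. It assumes a large homogeneous pool under the one-factor Gaussian copula, uses the tranche’s expected loss as a stand-in for its price (ignoring discounting and premium legs), and takes a 3% default probability, 40% recovery and a 0%-3% equity tranche as made-up inputs.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def expected_tranche_loss(rho, p_default=0.03, recovery=0.4,
                          attach=0.0, detach=0.03, steps=2000):
    """Expected tranche loss as a fraction of tranche notional."""
    c = _STD.inv_cdf(p_default)
    lo, hi = -8.0, 8.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        m = lo + i * h  # value of the common market factor
        # Default probability of every name, conditional on the factor.
        p_m = _STD.cdf((c - math.sqrt(rho) * m) / math.sqrt(1.0 - rho))
        pool_loss = (1.0 - recovery) * p_m
        tranche_loss = min(max(pool_loss - attach, 0.0), detach - attach)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid rule
        total += weight * tranche_loss * _STD.pdf(m)
    return total * h / (detach - attach)


def implied_correlation(target_loss, lo=1e-4, hi=0.9999, tol=1e-9):
    """Bisect for the correlation that reproduces the observed loss level."""
    # Equity-tranche expected loss falls as correlation rises.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_tranche_loss(mid) > target_loss:
            lo = mid  # model loss too high, so correlation is too low
        else:
            hi = mid
    return 0.5 * (lo + hi)


observed = expected_tranche_loss(0.3)  # pretend this came from the market
backed_out = implied_correlation(observed)
```

The number in `backed_out` is just whatever makes the model agree with the “market” figure, which is exactly the sense in which correlation is a plug.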

“But wait,” you say, “that’s the dumbest thing I’ve ever heard!” What if the market price is arbitrarily high and implies a correlation greater than 1 (or just 1, since the input is bounded)? Then that’s great, you get the price right in that instant, but the second you try to measure any sort of risk or even price it the next day, you’ll fail because a correlation of 1 doesn’t reflect reality at all. Moreover, take this to its logical conclusion: why not have a model whose sole input and output is just the price? In this scenario, you would see a tranche trading at 20, and set your “model” to 20 (the implied price). Tomorrow, your model still says 20 – so when the actual tranche trades for 19, you need to adjust your “model parameter” (i.e. price) down. Obviously, this is a ridiculous situation, and it speaks to the critical need for any model to balance a reasonable representation (even if a simple one) of reality with an acceptable range of input parameters.

To reiterate, this is why I would prefer a simplistic one-factor tail dependent model to a multifactor Gaussian one.

Other Copula Models

All of this must raise the question, why are we stuck using the Gaussian copula?

And like so much else, the answer is: because it’s easy.

As mentioned, the Student t copula exhibits tail dependence and is only slightly harder to build than the Gaussian variety. So why not use it? The dark secret (unless you read part II) is that single factor Gaussian copula models are really just massive simplifications of copula-derived mathematics. The engine itself relies on arithmetic and an integral – nothing that would suggest a copula model on the surface. It is the mathematically friendly properties of the Gaussian distribution that make this possible (though frankly, it seems to me a t implementation shouldn’t be much farther off). More obscure copulas, like those in the Archimedean family, don’t necessarily follow “real world” behaviors in high dimensions, as it pertains to finance.
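
For reference, the “arithmetic and an integral” can be written out in the standard textbook form of the one-factor model. The notation is mine and may not match part II: each name i has a latent variable driven by a common factor M and an idiosyncratic term, p is the unconditional default probability, and N is the number of names.

```latex
X_i = \sqrt{\rho}\, M + \sqrt{1-\rho}\, \varepsilon_i ,
\qquad M,\ \varepsilon_i \sim \mathcal{N}(0,1) \ \text{independent}

p(m) = \Phi\!\left( \frac{\Phi^{-1}(p) - \sqrt{\rho}\, m}{\sqrt{1-\rho}} \right)

\Pr(k \text{ defaults}) = \int_{-\infty}^{\infty} \binom{N}{k}\, p(m)^{k} \bigl(1 - p(m)\bigr)^{N-k} \varphi(m)\, dm
```

Conditional on the factor, defaults are independent, so the inner term is plain binomial arithmetic and the only real work is the one-dimensional integral over m.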

Moreover, like all problems of this ilk, CDOs suffer from a massive curse of dimensionality. In such situations, familiarity is key – in fact, it is sometimes the only hope of finding answers in the massive cosmos of sparse data.

Also, Gaussian copulas have a nice property – they are easy to explain (keep in mind, lately such explanations aren’t much at all). In particular, the error rates are easy to quantify – we can be 99.975% sure of an outcome. Knowing a concrete chance of failure, even if that probability is completely bogus, makes the model easy to accept. More complicated copula structures, by contrast, are harder to work with (read: make it harder for risk managers to promise certain error rates within certain error bounds).

Finally, more complicated does not necessarily mean better. Even after all I’ve written, a pinch of common sense applied to a single factor Gaussian model might do more wonders than a more advanced model in the hands of a naive user.

Here endeth the lesson.

Bell curves in action

August 6, 2009 in Math

An exhibit at MOMA invites visitors to mark their heights on a wall. A normal distribution results:

Well, not quite. The distribution is actually slightly negatively skewed by the confounding presence of children, who are obviously shorter than adults – you can see this in the great number of names well below the central band which are not mirrored by names higher up. Rest assured, however, that the ex-children distribution is itself Gaussian.

(Parts II, II and a half, and III of this series are also available.)

Newsweek has a new article about Paul Wilmott called “Revenge of the Nerd” which I really enjoyed, with two caveats.

In its opening the article compares quants to aeronautical engineers who design faulty planes (CDOs). The author observes:

Yet while aeronautical engineers who willfully designed a faulty plane might be on trial for criminal negligence, Wall Street’s math gurus are, for the most part, still employed. Strangely, the banks need quants more than ever right now.

But where is the logical conclusion of the aeronautical metaphor, that other engineers would be needed to fix the planes? In that framework, there’s no contradiction.

But that point is minor. Here’s what really bothered me (my comments are inserted in bold):

In 2000, the CDO market was jump-started by David X. Li, who, while working at JPMorgan, created the Gaussian copula function (no, he didn’t), a formula for determining the correlation between the default rates of different securities (no, it’s not). In theory, the model tells you the odds that, if one CDO goes bad, others will too (no, it doesn’t). The apparent genius of the Gaussian copula is its abstraction (true, but not in the way the author means). Rather than relying on the immense amount of data used to figure the odds that a CDO might default (there is no such data; issuers default, not CDOs), Li appeared to have discovered a law of correlation (no, he didn’t). That is, you didn’t need the data; the correlation was just there. Armed with it, quants could price CDOs much faster, and traders could buy and sell them at record speeds. Gaussian was rocket fuel for the CDO market (“Gaussian” is an adjective, not a noun). The global volume of CDO deals went from $157 billion in 2004 to $520 billion in 2006. As more banks got in on the game, the once large profit margins started to shrink. In order for banks to make the same kind of returns, they had to pack more and more loans into a CDO, essentially making bigger bombs. Li was on his way to a Nobel Prize when the world blew up (no, he wasn’t).

I have no problem with the simplification of difficult topics (in fact, I encourage it). I also have no problem with bashing Gaussian copulas as applied to CDOs (the argument was featured in my thesis years ago). But I have severe issues and frustrations with poor reporting of false information. Will 99% of this paragraph’s readers realize it’s incorrect or even care? Of course not! But it doesn’t make it all right.

Deep breath. Ready to go deeper?

This paragraph seems to be lifted largely from a recent Wired article (my response to that article would evolve into my “models are just the tool” tirade). Anyway:

“David X. Li…created the Gaussian copula function.” The Gaussian copula is rooted in research from more than 250 years ago. In fact, Gauss – a prodigious mathematician whose influence extends far beyond the bell curve – died in 1855! It’s unclear when the first bivariate extensions were arrived at, but Wikipedia notes that it must have been developed by 1872. The copula itself would not be described until 1959, but almost immediately mathematicians used it to decompose the multivariate normal distribution into a pairing of Gaussian marginals and something the new vocabulary termed a Gaussian copula. All David Li did was pair the copula and CDO pricing for the first time.

“…a formula for determining the correlation between the default rates of different securities.” Copulas describe the dependence structure of random variables. Correlation is a way of condensing the information contained in the copula down to a single number. The sentence as written suggests that the copula is used to measure correlation when in fact it is the other way around. In fact, you cannot even create a Gaussian copula until after you decide what correlation to use.

“In theory, the model tells you the odds that, if one CDO goes bad, others will too.” I assume the author meant to write “if one issuer goes bad” rather than “if one CDO goes bad”, because the Gaussian copula as applied to CDOs describes the issuers within the CDO, not CDOs to each other. In this framework, the sentence is correct: copulas describe the dependence structure, which essentially means “how one issuer relates to other issuers.” In this case, the thing being measured is default probability.

“The apparent genius of the Gaussian copula is its abstraction.” This is a true statement as it stands: the brilliance of the copula function is that it abstracts the dependence structure from the marginal distributions, meaning the dependence of, for example, two dice numbered 1-6 has the same copula as the behavior of two dice numbered 2-7. Before the development of the copula, the 1-6 dice would have a completely different function than the 2-7 dice, because one would have to account for the marginal differences while defining their dependence.
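
The dice example can be checked with a rank-based measure, which depends only on the ordering of outcomes and so serves as a rough proxy for “same copula”. In this sketch the way the two dice are linked (the second copies the first 60% of the time) and the extra squared relabelling are my own inventions for illustration.

```python
import math
import random


def average_ranks(values):
    """Ranks from 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def spearman(xs, ys):
    # Rank correlation: Pearson correlation applied to the ranks.
    return pearson(average_ranks(xs), average_ranks(ys))


rng = random.Random(4)
die_a = [rng.randint(1, 6) for _ in range(5_000)]
# Assumed link: the second die copies the first 60% of the time.
die_b = [a if rng.random() < 0.6 else rng.randint(1, 6) for a in die_a]

rho_1_to_6 = spearman(die_a, die_b)
rho_2_to_7 = spearman([a + 1 for a in die_a], [b + 1 for b in die_b])
rho_squared = spearman([a * a for a in die_a], [b * b for b in die_b])
```

All three values coincide, because relabelling the faces in an order-preserving way changes the marginals but not how the dice move together.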

However, the abstraction the author is referring to is that “you don’t need the data, you only need the correlation” (see below). The Gaussian copula as Li implemented it boils all correlation down to a single number, enabling such an abstraction.

“Rather than relying on the immense amount of data used to figure the odds that a CDO might default…” The available data consists of CDS spreads and bond z-spreads, which may be used to imply a default probability for each issuer. However, to figure out if a CDO will default, one must evaluate the probability of multiple firms defaulting within a given time frame. This is the correlation parameter. Thus, the data alone does not tell you about the likelihood of CDO default.

The default probabilities extracted from historical data are not independent, and so cannot simply be added (or multiplied, to be more precise) together. Moreover, the correlation which may be measured in CDS is the correlation of changes in default probability, and the jump to correlations of actual bankruptcy events is much more difficult, not least because there are relatively few historical defaults, compared to the number of issuers.
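
As a sketch of both steps: a default probability can be backed out of a CDS spread with the common flat-hazard-rate shortcut (hazard ≈ spread / (1 − recovery)), and the joint default probability of two issuers then depends heavily on the dependence assumed. The 200bp spread, 40% recovery, 5-year horizon and 0.5 correlation below are made-up inputs, and the shortcut itself is an approximation rather than a full CDS pricing model.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()


def default_prob_from_spread(spread, recovery=0.4, years=5.0):
    """Implied default probability under a flat hazard rate (approximation)."""
    hazard = spread / (1.0 - recovery)
    return 1.0 - math.exp(-hazard * years)


def joint_default_prob(p1, p2, rho, n=200_000, seed=5):
    """Monte Carlo P(both default) under a Gaussian copula with correlation rho."""
    rng = random.Random(seed)
    # An issuer defaults when its latent normal falls below its threshold.
    c1, c2 = _STD.inv_cdf(p1), _STD.inv_cdf(p2)
    scale = math.sqrt(1.0 - rho * rho)
    both = 0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)
        if z1 < c1 and z2 < c2:
            both += 1
    return both / n


p = default_prob_from_spread(0.02)  # 200bp spread, assumed
independent = p * p                 # naive multiplication
correlated = joint_default_prob(p, p, rho=0.5)
```

The gap between `independent` and `correlated` is the part that the individual default probabilities alone cannot supply.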

This isn’t to say that the data can’t be used – in fact the data must be used – but the key realization is that without a model (I struggle to think of one capable of handling such data that isn’t a copula), the data yields no worthwhile insights. Merely having the data is not enough to price a CDO.

“Li appeared to have discovered a law of correlation.” As I’ve mentioned, Li did not “discover” anything. He merely applied an existing model to a new dataset.

“You didn’t need the data; the correlation was just there.” Of course you need the data – the correlation is meaningless without the default probabilities extracted from the data. What the author presumably means is that your correlation number does not have to represent the “true” level of correlation observed in your data (which, as I’ve stated, is a nearly impossible thing to observe in the first place).

But having said that, this is probably the one thing the author has correct. After some futile efforts, researchers stopped measuring correlation and started holding a finger in the air to determine the “right” level. Similar to implied volatility in option pricing, correlation was unobservable and the “right” correlation was whatever level made the model price come out the same as the market price. Unfortunately, in a space where traders became so dependent on their models, the chain was circular: markets were informed solely by correlation-based models, which were themselves calibrated to the market.

The critique is not limited to the use of a Gaussian copula, however.

“Gaussian was rocket fuel for the CDO market.” Another true statement, but one which reveals the author’s unfamiliarity: “Gaussian” is an adjective used to describe a type of model, derived from a person’s name. This is like saying “Newtonian revolutionized the world of physics” when you want to talk about a model of gravitational acceleration or “Darwinian turned the study of biology upside down.”

“Li was on his way to a Nobel Prize when the world blew up.” No, he most emphatically was not. This is a repeat of one of Felix’s statements from the Wired article. Even if the model had been perfectly accurate, do today’s financial journalists think pricing a financial derivative is worthy of a Nobel prize? Black/Scholes/Merton didn’t win a Nobel prize for their option pricing model, they won it for the research they did into the economics of asset pricing. The option model was just a nice benefit on the side.

A fundamental issue with this paragraph, on top of all these highlights, is that not once does it explain the actual problem. If you read the paragraph, and I asked you why did they blow up, could you tell me? I’m sure you’d say something about the correlation not being reflective of the data. And I’d respond, well then why didn’t we just start using the data, or start using the right correlation?

I’ll try to answer these questions soon in part II.