Deconstructing the Gaussian copula, part I

June 5, 2009 in Finance, Math

(Parts II, II and a half, and III of this series are also available.)

Newsweek has a new article about Paul Wilmott called "Revenge of the Nerd" which I really enjoyed, with two caveats.

In its opening the article compares quants to aeronautical engineers who design faulty planes (CDOs). The author observes:

Yet while aeronautical engineers who willfully designed a faulty plane might be on trial for criminal negligence, Wall Street's math gurus are, for the most part, still employed. Strangely, the banks need quants more than ever right now.

But where is the logical conclusion of the aeronautical metaphor, that other engineers would be needed to fix the planes? In that framework, there's no contradiction.

But that point is minor. Here's what really bothered me (my comments are inserted in bold):

In 2000, the CDO market was jump-started by David X. Li, who, while working at JPMorgan, created the Gaussian copula function (no, he didn't), a formula for determining the correlation between the default rates of different securities (no, it's not). In theory, the model tells you the odds that, if one CDO goes bad, others will too (no, it doesn't). The apparent genius of the Gaussian copula is its abstraction (true, but not in the way the author means). Rather than relying on the immense amount of data used to figure the odds that a CDO might default (there is no such data; issuers default, not CDOs), Li appeared to have discovered a law of correlation (no, he didn't). That is, you didn't need the data; the correlation was just there. Armed with it, quants could price CDOs much faster, and traders could buy and sell them at record speeds. Gaussian was rocket fuel for the CDO market ("Gaussian" is an adjective, not a noun). The global volume of CDO deals went from $157 billion in 2004 to $520 billion in 2006. As more banks got in on the game, the once large profit margins started to shrink. In order for banks to make the same kind of returns, they had to pack more and more loans into a CDO, essentially making bigger bombs. Li was on his way to a Nobel Prize when the world blew up (no, he wasn't).

I have no problem with the simplification of difficult topics (in fact, I encourage it). I also have no problem with bashing Gaussian copulas as applied to CDOs (the argument was featured in my thesis years ago). But I have severe issues and frustrations with poor reporting of false information. Will 99% of this paragraph's readers realize it's incorrect or even care? Of course not! But that doesn't make it all right.

Deep breath. Ready to go deeper?

This paragraph seems to be lifted largely from a recent Wired article (my response to that article would evolve into my "models are just the tool" tirade). Anyway:

"David X. Li...created the Gaussian copula function." The Gaussian copula is rooted in research from more than 250 years ago. In fact, Gauss - a prodigal mathematician whose influence extends far beyond the bell curve - died in 1855! It's unclear when the first bivariate extensions were arrived at, but Wikipedia notes that it must have been developed by 1872. The copula itself would not be described until 1959, but almost immediately mathematicians used it to decompose the multivariate normal distribution into a pairing of Gaussian marginals and something the new vocabulary termed a Gaussian copula. All David Li did was pair the copula and CDO pricing for the first time.

"...a formula for determining the correlation between the default rates of different securities." Copulas describe the dependence structure of random variables. Correlation is a way of condensing the information contained in the copula down to a single number. The sentence as written suggests that the copula is used to measure correlation when in fact it is the other way around. In fact, you can not even create a Gaussian copula until after you decide what correlation to use.

"In theory, the model tells you the odds that, if one CDO goes bad, others will too." I assume the author meant to write "if one issuer goes bad" rather than "if one CDO goes bad", because the Gaussian copula as applied to CDO's describes the issuers within the CDO, not CDOs to each other. In this framework, the sentence is correct: copulas describe the dependence structure, which essentially means "how one issuer relates to other issuers." In this case, the thing being measured is default probability.

"The apparent genius of the Gaussian copula is its abstraction." This is a true statement as it stands: the brilliance of the copula function is that it abstracts the dependence structure from the marginal distributions, meaning the dependence of, for example, two dice numbered 1-6 has the same copula as the behavior of two dice numbered 2-7. Before the development of the copula, the 1-6 dice would have a completely different function than the 2-7 dice, because one would have to account for the marginal differences while defining their dependence.

However, the abstraction the author is referring to is that "you don't need the data, you only need the correlation" (see below). The Gaussian copula as Li implemented it boils all correlation down to a single number, enabling such an abstraction.
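The dice example above can be sketched in a few lines of Python (standard library only). This is an illustration under stated assumptions: the dependence comes from a Gaussian copula with an arbitrary correlation of 0.6, and the helper names are made up for the sketch. The copula step and the marginal step are kept deliberately separate.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def copula_pair(rho, rng):
    """Dependence step: one (u1, u2) draw from a bivariate Gaussian copula."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)


def die_face(u, lowest_face):
    """Marginal step: map a uniform u to a fair six-sided die whose
    faces run from lowest_face to lowest_face + 5."""
    return lowest_face + min(5, int(u * 6))


rng = random.Random(7)
draws = [copula_pair(0.6, rng) for _ in range(10000)]  # 0.6 is arbitrary

# The same dependence draws, fed through two different sets of marginals.
dice_1_to_6 = [(die_face(u1, 1), die_face(u2, 1)) for u1, u2 in draws]
dice_2_to_7 = [(die_face(u1, 2), die_face(u2, 2)) for u1, u2 in draws]

# How often do the two dice in a pair show the same face?
match_1_to_6 = sum(a == b for a, b in dice_1_to_6) / len(draws)
match_2_to_7 = sum(a == b for a, b in dice_2_to_7) / len(draws)

# The two rates are identical by construction; independent dice would match 1/6 of the time.
print(match_1_to_6, match_2_to_7)
```

Changing the faces only touches `die_face`, and changing the dependence only touches `copula_pair`. That separation is the abstraction the copula provides.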

"Rather than relying on the immense amount of data used to figure the odds that a CDO might default..." The available data consists of CDS spreads and bond z-spreads, which may be used to imply a default probability for each issuer. However, to figure out if a CDO will default, one must evaluate the probability of multiple firms defaulting within a given time frame. This is the correlation parameter. Thus, the data alone does not tell you about the likelihood of CDO default.

The default probabilities extracted from historical data are not independent, and so cannot simply be added (or multiplied, to be more precise) together. Moreover, the correlation which may be measured in CDS is the correlation of changes in default probability, and the jump to correlations of actual bankruptcy events is much more difficult, not least because there are relatively few historical defaults compared to the number of issuers.

This isn't to say that the data can't be used - in fact the data must be used - but the key realization is that without a model (I struggle to think of one capable of handling such data that isn't a copula), the data yields no worthwhile insights. Merely having the data is not enough to price a CDO.
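To see why the individual default probabilities cannot simply be multiplied, consider a rough Python sketch (standard library only) of the joint default probability of two issuers under a one-factor Gaussian copula. The one-factor form, the 5% default probabilities, the integration grid, and the function name are all assumptions chosen for illustration, not a description of any production CDO model.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def joint_default_probability(p1, p2, rho, steps=2000):
    """P(both issuers default) under a one-factor Gaussian copula.

    Requires 0 <= rho < 1. Conditional on a shared market factor m the
    issuers are independent, so we average the product of their
    conditional default probabilities over m.
    """
    c1 = STD_NORMAL.inv_cdf(p1)  # default threshold for issuer 1
    c2 = STD_NORMAL.inv_cdf(p2)  # default threshold for issuer 2
    lo, hi = -8.0, 8.0           # the normal density is negligible beyond this
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        m = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid rule
        cond1 = STD_NORMAL.cdf((c1 - math.sqrt(rho) * m) / math.sqrt(1.0 - rho))
        cond2 = STD_NORMAL.cdf((c2 - math.sqrt(rho) * m) / math.sqrt(1.0 - rho))
        total += weight * cond1 * cond2 * STD_NORMAL.pdf(m)
    return total * h


p = 0.05  # hypothetical default probability for each issuer
print("independent product:", p * p)
for rho in (0.0, 0.2, 0.5, 0.9):
    print("rho =", rho, "joint:", joint_default_probability(p, p, rho))
```

With `rho = 0` the result collapses to the plain product. As `rho` rises, the joint probability climbs toward the single-issuer probability. The same two inputs from the data give very different answers depending on a parameter the data does not directly provide.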

"Li appeared to have discovered a law of correlation." As I've mentioned, Li did not "discover" anything. He merely applied an existing model to a new dataset.

"You didn't need the data; the correlation was just there." Of course you need the data - the correlation is meaningless without the default probabilities extracted from the data. What the author presumably means is that your correlation number does not have to represent the "true" level of correlation observed in your data (which, as I've stated, is a nearly impossible thing to observe in the first place).

But having said that, this is probably the one thing the author gets right. After some futile efforts, researchers stopped measuring correlation and started holding a finger in the air to determine the "right" level. Similar to implied volatility in option pricing, correlation was unobservable and the "right" correlation was whatever level made the model price come out the same as the market price. Unfortunately, in a space where traders became so dependent on their models, the chain was circular: markets were informed solely by correlation-based models, which were themselves calibrated to the market.

The critique is not limited to the use of a Gaussian copula, however.
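The "whatever level matches the market" calibration described above can be sketched as a simple root search, in the same spirit as backing out implied volatility. This is a toy under loud assumptions: the "model price" here is just the probability that two issuers both default under a one-factor Gaussian copula, the 5% default probability is hypothetical, and the "market quote" is itself produced by the model.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def model_price(rho, p=0.05, steps=2000):
    """Toy 'model price': probability that two issuers, each with default
    probability p, both default under a one-factor Gaussian copula."""
    c = STD_NORMAL.inv_cdf(p)
    lo, hi = -8.0, 8.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        m = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid rule
        cond = STD_NORMAL.cdf((c - math.sqrt(rho) * m) / math.sqrt(1.0 - rho))
        total += weight * cond * cond * STD_NORMAL.pdf(m)
    return total * h


def implied_correlation(market_price, tol=1e-8):
    """Bisect for the rho at which model_price(rho) equals market_price.

    Relies on model_price increasing with rho on [0, 0.99].
    """
    lo, hi = 0.0, 0.99
    if not model_price(lo) <= market_price <= model_price(hi):
        raise ValueError("no correlation in [0, 0.99] reproduces this price")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if model_price(mid) < market_price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


market_quote = model_price(0.3)  # stand-in for an observed market price
print(implied_correlation(market_quote))
```

Note that nothing in `implied_correlation` looks at default data. It only asks which input makes the model agree with the quote. When the quote was itself set by traders running the same model, the loop closes on itself.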

"Gaussian was rocket fuel for the CDO market." Another true statement, but one which reveals the author's unfamiliarity: "Gaussian" is an adjective used to describe a type of model. It's a person's name. This is like saying "Newtonian revolutionized the world of physics" when you want to talk about a model of gravitational acceleration or "Darwinian turned the study of biology upside down."

"Li was on his way to a Nobel Prize when the world blew up." No, he most emphatically was not. This is a repeat of a one of Felix's statements from the Wired article. Even if the model had been perfectly accurate, do today's financial journalists think pricing a financial derivative is worthy of a Nobel prize? Black/Scholes/Merton didn't win a Nobel prize for their option pricing model, they won it for the research they did into the economics of asset pricing. The option model was just a nice benefit on the side.

A fundamental issue with this paragraph, on top of all these highlights, is that not once does it explain the actual problem. If you read the paragraph, and I asked you why did they blow up, could you tell me? I'm sure you'd say something about the correlation not being reflective of the data. And I'd respond, well then why didn't we just start using the data, or start using the right correlation?

I'll try to answer these questions soon in part II.
