Deconstructing the Gaussian copula, part I

June 5, 2009 in Finance, Math

(Parts II, II and a half, and III of this series are also available.)

Newsweek has a new article about Paul Wilmott called "Revenge of the Nerd," which I really enjoyed, with two caveats.

In its opening the article compares quants to aeronautical engineers who design faulty planes (CDOs). The author observes:

Yet while aeronautical engineers who willfully designed a faulty plane might be on trial for criminal negligence, Wall Street's math gurus are, for the most part, still employed. Strangely, the banks need quants more than ever right now.

But the author misses the logical conclusion of the aeronautical metaphor: other engineers would be needed to fix the planes. In that framework, there's no contradiction in banks needing quants more than ever.

But that point is minor. Here's what really bothered me (my comments are inserted in parentheses):

In 2000, the CDO market was jump-started by David X. Li, who, while working at JPMorgan, created the Gaussian copula function (no, he didn't), a formula for determining the correlation between the default rates of different securities (no, it's not). In theory, the model tells you the odds that, if one CDO goes bad, others will too (no, it doesn't). The apparent genius of the Gaussian copula is its abstraction (true, but not in the way the author means). Rather than relying on the immense amount of data used to figure the odds that a CDO might default (there is no such data; issuers default, not CDOs), Li appeared to have discovered a law of correlation (no, he didn't). That is, you didn't need the data; the correlation was just there. Armed with it, quants could price CDOs much faster, and traders could buy and sell them at record speeds. Gaussian was rocket fuel for the CDO market ("Gaussian" is an adjective, not a noun). The global volume of CDO deals went from $157 billion in 2004 to $520 billion in 2006. As more banks got in on the game, the once large profit margins started to shrink. In order for banks to make the same kind of returns, they had to pack more and more loans into a CDO, essentially making bigger bombs. Li was on his way to a Nobel Prize when the world blew up (no, he wasn't).

I have no problem with the simplification of difficult topics (in fact, I encourage it). I also have no problem with bashing Gaussian copulas as applied to CDOs (the argument was featured in my thesis years ago). But I have severe issues with the reporting of false information. Will 99% of this paragraph's readers realize it's incorrect, or even care? Of course not. But that doesn't make it all right.

Deep breath. Ready to go deeper?

This paragraph seems to be lifted largely from a recent Wired article (my response to that article would evolve into my "models are just the tool" tirade). Anyway:

"David X. Li...created the Gaussian copula function." The Gaussian copula is rooted in research from more than 250 years ago. In fact, Gauss - a prodigal mathematician whose influence extends far beyond the bell curve - died in 1855! It's unclear when the first bivariate extensions were arrived at, but Wikipedia notes that it must have been developed by 1872. The copula itself would not be described until 1959, but almost immediately mathematicians used it to decompose the multivariate normal distribution into a pairing of Gaussian marginals and something the new vocabulary termed a Gaussian copula. All David Li did was pair the copula and CDO pricing for the first time.

"...a formula for determining the correlation between the default rates of different securities." Copulas describe the dependence structure of random variables. Correlation is a way of condensing the information contained in the copula down to a single number. The sentence as written suggests that the copula is used to measure correlation when in fact it is the other way around. In fact, you can not even create a Gaussian copula until after you decide what correlation to use.

"In theory, the model tells you the odds that, if one CDO goes bad, others will too." I assume the author meant to write "if one issuer goes bad" rather than "if one CDO goes bad", because the Gaussian copula as applied to CDO's describes the issuers within the CDO, not CDOs to each other. In this framework, the sentence is correct: copulas describe the dependence structure, which essentially means "how one issuer relates to other issuers." In this case, the thing being measured is default probability.

"The apparent genius of the Gaussian copula is its abstraction." This is a true statement as it stands: the brilliance of the copula function is that it abstracts the dependence structure from the marginal distributions, meaning the dependence of, for example, two dice numbered 1-6 has the same copula as the behavior of two dice numbered 2-7. Before the development of the copula, the 1-6 dice would have a completely different function than the 2-7 dice, because one would have to account for the marginal differences while defining their dependence.

However, the abstraction the author is referring to is that "you don't need the data, you only need the correlation" (see below). The Gaussian copula as Li implemented it boils all correlation down to a single number, enabling such an abstraction.

"Rather than relying on the immense amount of data used to figure the odds that a CDO might default..." The available data consists of CDS spreads and bond z-spreads, which may be used to imply a default probability for each issuer. However, to figure out if a CDO will default, one must evaluate the probability of multiple firms defaulting within a given time frame. This is the correlation parameter. Thus, the data alone does not tell you about the likelihood of CDO default.

The default probabilities extracted from historical data are not independent, and so cannot simply be added (or multiplied, to be more precise) together. Moreover, the correlation that can be measured in CDS is the correlation of changes in default probability; the jump to correlations of actual bankruptcy events is much more difficult, not least because there are relatively few historical defaults compared to the number of issuers.

This isn't to say that the data can't be used - in fact the data must be used - but the key realization is that without a model (I struggle to think of one capable of handling such data that isn't a copula), the data yields no worthwhile insights. Merely having the data is not enough to price a CDO.

"Li appeared to have discovered a law of correlation." As I've mentioned, Li did not "discover" anything. He merely applied an existing model to a new dataset.

"You didn't need the data; the correlation was just there." Of course you need the data - the correlation is meaningless without the default probabilities extracted from the data. What the author presumably means is that your correlation number does not have to represent the "true" level of correlation observed in your data (which, as I've stated, is a nearly impossible thing to observe in the first place).

But having said that, this is probably the one thing the author has correct. After some futile efforts, researchers stopped measuring correlation and started holding a finger in the air to determine the "right" level. Similar to implied volatility in option pricing, correlation was unobservable and the "right" correlation was whatever level made the model price come out the same as the market price. Unfortunately, in a space where traders became so dependent on their models, the chain was circular: markets were informed solely by correlation-based models, which were themselves calibrated to the market.
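The calibration loop can be sketched on a toy two-name example: bisect for the rho that makes the model's joint default probability match a quoted "market" number, the same move as backing implied volatility out of an option price. Everything here (the two-name setup, the 5% probabilities, the target) is illustrative, not an actual dealer's pricing model:

```python
import math
from statistics import NormalDist

ND = NormalDist()

def joint_prob(t1, t2, rho, steps=2000):
    """P(Z1 <= t1, Z2 <= t2) for standard bivariate normals with
    correlation rho, via one-dimensional trapezoidal integration."""
    lo = -8.0
    h = (t1 - lo) / steps
    s = math.sqrt(1.0 - rho * rho)
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * ND.cdf((t2 - rho * x) / s) * ND.pdf(x)
    return total * h

def implied_correlation(market_prob, p1, p2, tol=1e-6):
    """Bisect for the rho that reproduces a quoted joint default
    probability, exactly as implied vol is solved from a market price."""
    t1, t2 = ND.inv_cdf(p1), ND.inv_cdf(p2)
    lo, hi = 0.0, 0.999
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if joint_prob(t1, t2, mid) < market_prob:
            lo = mid  # model price too low: more correlation needed
        else:
            hi = mid
    return (lo + hi) / 2.0
```

The circularity is visible in the interface: the only input the "market" supplies is a price, and the only output the model returns is the rho that reproduces it.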

The critique is not limited to the use of a Gaussian copula, however.

"Gaussian was rocket fuel for the CDO market." Another true statement, but one which reveals the author's unfamiliarity: "Gaussian" is an adjective used to describe a type of model. It's a person's name. This is like saying "Newtonian revolutionized the world of physics" when you want to talk about a model of gravitational acceleration or "Darwinian turned the study of biology upside down."

"Li was on his way to a Nobel Prize when the world blew up." No, he most emphatically was not. This is a repeat of a one of Felix's statements from the Wired article. Even if the model had been perfectly accurate, do today's financial journalists think pricing a financial derivative is worthy of a Nobel prize? Black/Scholes/Merton didn't win a Nobel prize for their option pricing model, they won it for the research they did into the economics of asset pricing. The option model was just a nice benefit on the side.

A fundamental issue with this paragraph, on top of all these highlights, is that not once does it explain the actual problem. If you read the paragraph and I asked you why CDOs blew up, could you tell me? I'm sure you'd say something about the correlation not being reflective of the data. And I'd respond: well then, why didn't we just start using the data, or start using the right correlation?

I'll try to answer these questions soon in part II.

Mike July 15, 2009 at 6:35 pm

“…without the default probabilities extracted from the data”

Two additional questions:

- What’s your take on endogenous/exogenous PDs? Statistical measures, or Merton model?

- Do you think there was a bank run on the CDS market? If so (or even not), do you think there should be an issuer of last resort, someone with the credibility to backstop the systematic CDS market, like the Fed?

J July 15, 2009 at 7:48 pm

Very interesting questions –

To the first, briefly, I think that modelling default probabilities exogenously is far more appealing from a conceptual standpoint – it happens to be one of the features I like in the SFGC model – but we must realize that without the feedback mechanisms that would be otherwise captured by an endogenous model, we will systemically underestimate risk. For this reason I’m partial to models that incorporate a jump component; seems like a relatively simple way of capturing the chaotic distribution exogenously (assuming your joint distribution is well specified, of course!). At the risk of contradicting myself, the Merton model is a wonderful tool for illustrative purposes, and a deceptively cruel thing to implement. It’s too limited, in much the same way the SFGC is. In particular, leverage kills it… so like so many models, it fails right when you need it to work.

And to the second, I am actually of the rare opinion that the CDS markets remained liquid – and displayed the voice of the broader market – while the cash markets fell apart. In my mind, this led to the rise of the negative basis trade. As people fled the markets, the potentially unlimited OTC contracts were the ones that could be generated as needed; finding buyers for cash products that were perceived as doomed was another story, and bargain prices resulted. I firmly believe that if banks like AIG had been forced to post collateral, much of the systemic impact would have been avoided. Lehman may still have gone – even AIG may have gone down – but they would not have so grossly levered themselves as to take down everyone standing nearby. We all know that a catastrophe will destroy an insurance company; our goal should not be to save these companies per se, but to make sure that whatever insurance they do sell is done so with proper risk management in place. Nonetheless, because CDS insurance differs from real insurance in that one security may be insured multiple times, I am strongly in favor of a clearinghouse solution supported by a diverse group of banks. I do not see a need for a government backstop, I think it can be handled privately and the diversity of the backing group would provide assurance that the counterparty will always stand.

My apologies for the length of these responses… I’m too used to the blog, I guess.

Benedict@Large July 16, 2009 at 1:52 am

A bit over my head, but:

“Unfortunately, in a space where traders became so dependent on their models, the chain was circular …”

This would be Soros’ reflexivity, no?

richl July 21, 2009 at 11:06 am

I like that you focus on the fact that it is not the model itself that is wrong, but how that model was abused. Sort of like blaming the gun manufacturer because someone used their gun to commit suicide.

Gavin Radzick August 15, 2013 at 6:58 pm

Thank you for your comments on this article and its reference to the Wired article (which I have read).
What are your thoughts on arguments from Mandelbrot that VAR, CAPM, MPT, Black-Scholes and models based on normal distributions are fundamentally flawed? Although he never offered a solution, his encouragement to throw out the drawing board and start over seems justified.
