Posts tagged as:

chart

Very amusing… and true:

I especially love “The HDR Hole.” Presumably the y-axis is measured in percent of personal potential… there must be all sorts of Bayesian self-reflection stuff going on there.

(Via DataViz)

{ 0 comments }

Chart Wars

January 8, 2010 in Data

Alex Lundry, Vice President and Director of Research of the consulting firm Target Point, has published a talk called Chart Wars which is simply brilliant: an excellent and brief (5 minutes!) overview of how easy it is to manipulate infographics and which tricks to be wary of. His specific focus is a chart (which was covered on TGR previously) whose designs – and it went through many iterations – were politically motivated. While there is no doubt about which charts are clearer, his implicit question – which charts are right? – resonates philosophically.

Here’s the video of his talk:

(Via Information Aesthetics)

{ 0 comments }

Overcharting: airfare edition

November 28, 2009 in Data

Nate Silver writes about the dropping cost of air fares – yes, you read that correctly – over at Five Thirty Eight. His writing, as always, is excellent – I only want to point out a chart he uses and how it can be dangerous to draw conclusions at a glance (or, if you prefer, how similar charts can be used to mislead people).

Here’s the chart in question, showing the cumulative percent change in inflation-adjusted air fares since 1995:

At a glance, the chart is convincing: fares are off about 15% since 1995. But how meaningful is that number?

The chart exhibits a very noisy pattern. Just a year ago, Nate could have written an article about fares being unchanged over more than a decade, and he could have noted a steady rise in price following 9/11! It should be clear that the point in time at which the measurement is made is extremely important.

Additionally, the reference or base year matters a lot as well, from a perceptual standpoint. If the y-axis were zeroed on 1996 or 2004, a very different chart would result. Sure, the shape would be the same, but the present chart is almost entirely in negative territory; a different base year would put more points in positive territory. This makes me wonder if 1995 wasn’t just another spike like 1996, 2001, 2006 and 2008. I believe the dataset only goes back to 1995, so this is far from an accusation of cherry-picking data, but it’s possible that a 1994 base would reveal a very different story – either higher or lower.

Finally, people frequently make the mistake with charts like these of observing the gap area (the grey vertical bars) and attributing meaning to it across its entire length. In this case, that means looking at the two lines and making a statement like “the top 25 airports continued to outpace the rest of the airports in the last decade.” In reality, however, the two groups are almost exactly the same from 2003 to 2009. There is a one-time structural break following 9/11 and lasting about a year or two, during which time the top 25 markets experienced greater price drops than the rest. After that, the price changes are in lockstep. If both time series were zeroed on 2003, the lines would move in tandem following that date. I see this mistake frequently in interpreting the difference between two stocks – a divergence in prices, no matter how stable, always seems to imply a persistent difference even if the split was a one-time event.
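To make the base-year point concrete, here is a minimal sketch using made-up index levels (these are not the actual airfare figures; the growth factors, years, and the names `build` and `rebase` are all illustrative assumptions). The two hypothetical series differ in exactly one year and move in lockstep afterwards.

```python
def build(factors, start=100.0):
    """Turn year-over-year growth factors into index levels."""
    levels = [start]
    for f in factors:
        levels.append(levels[-1] * f)
    return levels


def rebase(series, years, base_year):
    """Cumulative percent change of each point relative to base_year."""
    base = series[years.index(base_year)]
    return [round(100.0 * (v / base - 1.0), 1) for v in series]


# Hypothetical data: identical growth except for a one-time break in 2002.
years = [2000, 2001, 2002, 2003, 2004, 2005]
top25 = build([0.98, 0.85, 1.02, 1.03, 0.95])
rest = build([0.98, 0.95, 1.02, 1.03, 0.95])

# Zeroed on 2000, the gap opened in 2002 shows up in every later year.
print("base 2000:", rebase(top25, years, 2000), rebase(rest, years, 2000))

# Zeroed on 2002, the two series are indistinguishable from then on.
print("base 2002:", rebase(top25, years, 2002), rebase(rest, years, 2002))
```

With the 2000 base, the gap between the lines persists through 2005 even though nothing has differed since 2002. With the 2002 base, the post-2002 values coincide.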

My thoughts here have absolutely nothing to do with Nate’s post – please read it as I haven’t covered his reasoning at all – I merely want to take advantage of his graph to demonstrate these potential pitfalls. How’s that for some Saturday afternoon reading material?

{ 0 comments }

Great expectations

November 27, 2009 in Economics, Math

I’ve previously covered the danger of attributing meaning to a forecast which is obviously based on little or no information. In that case, it was the manufacturing survey, which one might dismiss as a more obscure measure. Recently, however, Ken Houghton has written a pair of posts on inflation forecasts that bring me back to that argument.

In his first, he presents a study that seems to show that, indeed, inflation expectations tend to assume that the future will look just like the present:

Again, this does not surprise me, as the future expectation of a random walk is its present value. In the second post, the time series of inflation vs expectations is presented:

With the additional dimension of time, I can see a simple heuristic for inflation expectations: consumers think that inflation will stay at roughly the same level that it is on any given day, with some slight reversion to the Fed target, unless inflation is currently below the target, in which case they think it will rapidly bounce back to – or above – that level.

You can see the inflationary spikes in 2006 echoed in the 2007 forecast; the sharp 2008 increase and subsequent fall are mirrored in the 2009 forecast for the time they remain above the target, at which point they halt their slide.

These charts tell me two things. First, that consumers have very little insight into future inflation levels, to the point that they are unwilling to even choose a simple number like 3% and prefer instead to say that the future level will be similar to today’s. Second, that consumers have blind faith in the Fed’s ability to keep inflation at or above its target level – even in the face of evidence against that power.
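The random-walk remark earlier in this post can be checked with a quick simulation. This is only a sketch under stated assumptions: a driftless walk with Gaussian steps, and an arbitrary starting level, step size, and horizon (none of these come from the inflation data). The function name `simulate_future` is hypothetical.

```python
import random


def simulate_future(present, steps, n_paths, step_sd=0.5, seed=0):
    """Simulate driftless random walks and return their final values."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    finals = []
    for _ in range(n_paths):
        x = present
        for _ in range(steps):
            x += rng.gauss(0.0, step_sd)  # zero-mean shock each period
        finals.append(x)
    return finals


present = 3.0  # assumed current level, e.g. 3% inflation
finals = simulate_future(present, steps=12, n_paths=20000)
mean_future = sum(finals) / len(finals)

# Individual paths wander widely, but their average stays near the start.
print("present:", present)
print("average simulated future value:", round(mean_future, 2))
print("range of outcomes:", round(min(finals), 2), "to", round(max(finals), 2))
```

The best single guess for such a process is today's value, even though any one path can end up far from it. That is consistent with survey respondents simply reporting the current level.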

{ 0 comments }

Data intervention

October 16, 2009 in Data

The always-excellent How I Met Your Mother addresses a major social problem:

YouTube Preview Image

(via FlowingData)

{ 0 comments }

Radial clustering

September 14, 2009 in Data

Finally, a radial visualization which serves a purpose rather than just looking cool. Getting Genetics Done has a tutorial on using clustering functions in R. In it, they show how this analysis:

is much better represented like this:

There’s nothing wrong with making a chart which looks good – in fact it’s encouraged – so long as the visual niceties enhance the message of the graphic. Radial graphics are all the rage these days, but they rarely help with information communication (and in many cases they detract!). It’s nice to see a truly constructive application of the technique.

(via Revolutions)

{ 0 comments }

How to fix a broken pie chart

September 8, 2009 in Data

Datavisualization.ch has a helpful step-by-step on how to turn this (from a Mashable post):

into this:

Of course, the motivation is worth more than the mechanics.

{ 0 comments }

Twitterverse demographics

August 28, 2009 in Internet

I spoke too soon – another post from ReadWriteWeb manages to frustrate yet again. In an article claiming that teenage use of Twitter is on the rise, they present this chart:

Let’s do what RWW did not and actually think about what this graph is showing. For each age group, their use of Twitter is plotted over time, relative to their use of the internet as a whole. In other words, this is a visualization of the relative composition of the Twitterverse. If all age groups used Twitter similarly to their overall internet consumption, then all the lines would be at 100.

I do find it amusing that RWW has an almost clichéd “statistics can be misleading” section in its article, which fails to note the single most important caveat (unsurprisingly, given their misinterpretation of the chart): increased participation by any one age group must be offset by decreasing participation by another. So the rise in the “12-24” line is exactly offset by declines in the adult groups. Kind of a different headline, isn’t it: “Adults Abandon Twitter!” And yet, it’s based on the exact same information.

At this time we should note that just two days ago, the Times ran an article called “Who’s Driving Twitter’s Popularity? Not Teens.”

The key here is that we don’t know whether teens are using Twitter more or adults are using it less. All we know is that if you look at the Twitter userbase, teenagers form a greater percent of the community than they used to – even though the absolute number of teenage Twitterers could be static or even dropping (if adult use was falling off at a greater rate).
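A small sketch of the composition argument, with invented user counts (the numbers and the `shares` helper are purely illustrative, not taken from the chart's underlying data):

```python
def shares(counts):
    """Percent of the total user base contributed by each group."""
    total = sum(counts.values())
    return {group: 100.0 * n / total for group, n in counts.items()}


# Hypothetical user counts at two points in time.
before = {"teens": 100, "adults": 900}
after = {"teens": 90, "adults": 510}  # teens fall a little, adults fall a lot

for label, counts in (("before", before), ("after", after)):
    pct = shares(counts)
    print(label, counts, {g: round(p, 1) for g, p in pct.items()})
```

In this made-up case the number of teenage users drops, yet their share of the user base rises because adult use falls faster. A chart of shares alone cannot distinguish this from genuine teen growth.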

What’s much more interesting is that for the first time, teens are using Twitter disproportionately – they are a larger demographic of the Twitterverse than the internet generally. But again this gives us no context, and that fact could arise from their increased participation or adult accounts going stagnant.

It’s interesting and informative to note that young people are a steadily growing percentage of the Twitterverse. It is a mistake to make assumptions about their number from the graph, however.

I fully expect an article from RWW examining the “massive rise” in “2-11” Tweets – who are these tweeting toddlers? What do they tweet about? And most importantly, how can your marketing strategy take advantage of this trend?

Update: I am not surprised to learn that this graph comes from Silicon Alley Insider’s Chart of the Day column. I cringe at the thought of that site’s influence.

{ 1 comment }

A post on Junk Charts sent me reading about Stevens’ power law, which supplies a quantification of a problem I’ve discussed before: the danger of representing single-dimensional data with two-dimensional graphics.

Stevens’ law measures the amount by which humans over- or under-perceive a stimulus, relative to its actual intensity. For example, the exponent for “visual length” is 1, meaning that humans accurately gauge the true difference between lines of various lengths. However, the exponent for “visual area” is just 0.7, meaning we systematically underestimate differences in area: a shape with twice the area of another looks only about 1.6 times as large!

This follows from the arguments laid out previously – area increases with the square of the one-dimensional metric; therefore, as we look to that single measurement’s representation in a two-dimensional graph (say, the radius of a circle), we fail to account for the compounding effect of squaring it as it grows. This leads to an underperception of relative differences in area. Using a single-dimensional metric, like pure length in a bar chart, is much more appealing because our perception of variation will scale linearly with the actual measurements.
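For a rough feel of the numbers, Stevens’ power law is usually written as perceived magnitude = k × (actual magnitude)^a, so the perceived ratio between two stimuli is the actual ratio raised to the power a. The sketch below takes the values of 1 for length and 0.7 for area quoted above as given; the function name and the sample ratios are my own illustrative choices.

```python
LENGTH_EXPONENT = 1.0  # value quoted in the post for visual length
AREA_EXPONENT = 0.7    # value quoted in the post for visual area


def perceived_ratio(actual_ratio, exponent):
    """Perceived ratio between two stimuli under a power law.

    The constant k cancels when comparing two stimuli of the same kind.
    """
    return actual_ratio ** exponent


for actual in (2.0, 4.0, 10.0):
    by_length = perceived_ratio(actual, LENGTH_EXPONENT)
    by_area = perceived_ratio(actual, AREA_EXPONENT)
    print(f"actual x{actual:g}: length reads as x{by_length:.2f}, "
          f"area reads as x{by_area:.2f}")
```

Under these assumptions the shortfall is not a fixed percentage: it widens as the true ratio grows, which is one more reason to prefer bar length for comparisons.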

{ 0 comments }

Untangling charts

August 6, 2009 in Data, Politics

House Minority Leader Boehner recently released this “infographic” (I use the term loosely) in order to demonstrate his frustration with the House Democrats’ health proposal:

The chart really is an absolute nightmare: the colors, layout, and hidden connections contribute to an absolutely impossible-to-read image, which is exactly what Rep. Boehner wants.

Recently, Robert Palmer, a graphic designer in California, took it upon himself to untangle this mess. Here is his version of the chart (click to zoom):

Now, neither chart makes a strong case for or against the policy itself; both attempt merely to show all the affected parties. But the fact that Robert Palmer was able to lay out an extraordinarily clear picture of all participants demonstrates that Rep. Boehner’s chart was intentionally obfuscated in order to mislead and confuse. The only other explanations are that whoever put it together a) didn’t understand the layout or b) didn’t understand how to present it. Ignorance, in this case, is not bliss.

When we are handed data or statistics, we have an enormous power to construct convincing arguments and clear presentations of otherwise complicated ideas. To abuse those tools (and the public’s faith in those tools) by using them to construct a bad analysis is a poor policy choice – not only is it easily falsifiable, but it erodes the ability to effectively communicate at all.

Lies, damn lies and statistics… the two charts above claim to show the exact same situation. Undoubtedly, there are many more graphics that could be constructed – are any of them actually “right”? Hard to say, but I feel that the first chart is “wrong” without question because it breaks every rule of effective design. The tax may well be a bureaucratic nightmare, as Rep. Boehner claims. And Palmer’s chart does not show a lack of bureaucracy; it merely lays out the connections clearly. But by constructing a graphic which willfully corrupts its own message, Rep. Boehner undermines his argument: if his chart shows a tangled mess but Palmer can untangle it, then the public will conclude that Boehner was wrong. He would have done better to have shown Palmer’s chart in the first place and claimed that there are too many connections on it – that way any rebuttal would live only in the realm of opinion, not demonstrable fact.

{ 0 comments }

Mapping Seinfeld

August 6, 2009

Posted as a public service following this announcement (click to zoom):

via Daily Fill.

0 comments Read the full post →

The 100 users of Twitter

July 31, 2009

An interesting visualization of Twitter as 100 people is a good take on a popular infographic meme, but reveals a few inconvenient truths about these sorts of images.
Firstly, although I am (not so) secretly pleased to see this illustration of Twitter’s non-inclusive communicative nature, let’s not forget that Twitter, like so many other social phenomena, [...]

0 comments Read the full post →

Evaluating returns to social media

July 21, 2009

A collusion by wetpaint and the Altimeter Group has resulted in a fanciful study on social media. Normally, a paper like this wouldn’t be worth addressing, but the amount of attention being paid to its questionable conclusion warrants a closer look. And that conclusion is:
[T]his landmark study has found that the most valuable brands in the [...]

5 comments Read the full post →

Misreading misleading charts: entrepreneur edition

June 18, 2009

Paul Kedrosky writes about a study on the rate of entrepreneurship among various age groups, which includes the following piece of junk (ch)art:

Why is this chart 3D? It contains information in only two spatial dimensions (time and rate), with a third dimension coded by color. To make the chart itself is a purely superfluous move [...]

0 comments Read the full post →

Misreading misleading charts: VIX edition

June 16, 2009

Get your tin foil hats back out! Zero Hedge can’t seem to keep their manipulation theories under control (I addressed one here in one of TGR’s most popular posts) and today’s example is too egregious to pass up.
In this post, Zero Hedge reviews groundbreaking “research” from Innovative Quant Solutions “on the very relevant topic of [...]

0 comments Read the full post →

Truth in advertising?

June 15, 2009

I find this graph very interesting, not just because of any implied political statements, but for how it highlights the absurdity of economic forecasting and the potentially misguided trust we place in such numbers.
The blue lines were circulated by Obama’s economic team when they were pitching the stimulus bill in order to illustrate its beneficial [...]

0 comments Read the full post →

Illustrating the importance of data visualization

June 12, 2009

Andrew Gelman discusses research on attitudes toward gay marriage, by state, and notes this graph in particular, which shows the change in opinion over the last 15 years:

Critically, he points out that the states which experienced the greatest change in attitude were the ones that already were most receptive. A naive analysis of the data [...]

0 comments Read the full post →

Critiquing the Crimson

June 9, 2009

The Harvard Crimson has published its annual senior survey, which is making headlines in part because very few seniors are going into finance. Selected results were presented in an interesting visualization (the image below links to a full size pdf):

Now that my brother has graduated after successfully steering the Crimson’s business operations to one of [...]

0 comments Read the full post →

Breaking down labor mobility

June 7, 2009

Great graphic from the NYT (click to zoom):
(via LL)

0 comments Read the full post →

Shades of bullishness

May 29, 2009

FT Alphaville has a post up regarding new research from Citi on how analysts make recommendations. It is accompanied by this graph:

The graph shows the average recommendation across all analyst-covered stocks, for the last 15 years. A stock gets a 1 if every analyst recommends buying it; a 5 is given to a universal sell; [...]

0 comments Read the full post →

Charting value (maybe)

May 19, 2009

Silicon Alley Insider presented this as its Chart of the Day today, saying it indicates the success of Microsoft’s “Laptop Hunter” ads:

First of all, it takes some digging to learn what this scale even means, which brings us to a violation of charting rule #1: do not use a misleading axis! The true scale goes [...]

1 comment Read the full post →

Presidents vs Pirates

April 15, 2009

Two excellent graphs making the rounds – the first showing Obama the Pirate-Slayer:

And the second with historical context (that being the First and Second Barbary Wars):

0 comments Read the full post →

Spurious correlation: fruit edition

April 3, 2009

Who says correlation doesn’t imply causality? Borrowed from In the Pipeline.

0 comments Read the full post →

Industrial production in perspective

April 3, 2009

This post by Krugman inspired me to take a look at how the industrial production index has fared lately. At first glance, the recent drop is pretty massive:

But looking at it on a log scale tells a very different story:
Krugman makes the point that the plunge paused in 1931 before resuming, and that the worst [...]

0 comments Read the full post →

The graph is half full

March 31, 2009

The Big Picture has a post which borrows two graphs from Credit Suisse that are meant to illustrate the performance of the S&P 500 in the 100 days following a “major trough.” I re-borrow them here:

It looks like the top graph represents a collection of bear market bottoms, which are easily identifiable by the characteristic [...]

0 comments Read the full post →

Misreading misleading charts

March 26, 2009

This chart caught my eye because it is potentially misleading (click to zoom). It shows the year-over-year change in hotel occupancy rates from 2001 to 2009.

My first impression, on viewing the small chart, was that we haven’t hit the low of the last recession. But (as the box very clearly points out) the low of the last [...]

1 comment Read the full post →

Ken Lewis makes dubious claims

March 9, 2009

BOA chairman Ken Lewis has written an opinion for the WSJ (“Some Myths About Banks”) containing the following “myth” and rebuttal:
The banks are insolvent. In the past 18 months, we’ve seen fewer than 50 bank failures. That compares to about 2,000 failures or closings of commercial banks or savings institutions between 1986 and 1991. There may be [...]

0 comments Read the full post →