A Data Visualization Manifesto

May 31, 2010 in Data

Words of wisdom from Andrew Gelman:

What harm is done, if any, by having ambiguous labels, uninformative orderings of variables, inconsistent scaling of axes, and all the rest? From a psychological or graphical perception perspective, maybe these create no problem at all. Perhaps such glitches (from my perspective) are either irrelevant to the general message of the graph or, from the other direction, force the reader to look at the graph and read the surrounding text more clearly to figure out what's going on. After all, a graph isn't a TV show, readers aren't passive, so maybe it's actually good to make them work to figure out what's going on.

At a statistical level, though, I think the details are very important, because they connect the data being graphed with the underlying questions being studied. For example, if you want to compare unemployment rates for different industries, you want them on the same scale. If you're not interested in an alphabetical ordering, you don't want to put it on a graph. If you want to convey something beyond simply that big cars get worse gas mileage, you'll want to invert the axes on your parallel coordinate plot. And so forth. When I make a graph, I typically need to go back and forth between the form of the plot, its details, and the questions I'm studying.
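
To make the first two of those examples concrete, here is a minimal sketch in Python. It is my own illustration, not Gelman's code: the industry figures are invented, and the helper names `shared_limits` and `order_by_latest` are hypothetical. It computes a single axis range for every panel and orders the panels by a quantity of interest rather than alphabetically.

```python
# Hypothetical unemployment rates (percent), invented purely for illustration.
rates = {
    "Construction": [9.0, 17.0, 20.0],
    "Education": [3.0, 4.0, 5.0],
    "Manufacturing": [5.0, 10.0, 12.0],
}


def shared_limits(series_by_group, pad=0.05):
    """Return one (low, high) axis range that covers every group's values."""
    values = [v for series in series_by_group.values() for v in series]
    low, high = min(values), max(values)
    margin = (high - low) * pad  # a little breathing room at both ends
    return (low - margin, high + margin)


def order_by_latest(series_by_group):
    """Order groups by their most recent value, highest first."""
    return sorted(
        series_by_group,
        key=lambda name: series_by_group[name][-1],
        reverse=True,
    )


limits = shared_limits(rates)        # same scale for every industry
panel_order = order_by_latest(rates)  # ordering tied to the question, not the alphabet
```

With a plotting library such as matplotlib, the same `limits` tuple could be passed to each panel (for example via `ax.set_ylim(*limits)`), and the panels drawn in `panel_order`.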
