Posts tagged as:


"Big Data" is meaningless

January 20, 2012

Roger Ehrenberg gets it: Every so often a term becomes so beloved by media that it moves from “instructive” to “hackneyed” to “worthless,” and Big Data is one of those terms.... Every business generates data, but it is a far smaller number that view data as a strategic asset that is actively managed for the benefit […]

0 comments Read the whole post →

...and not a drop of value

January 5, 2012

Bryce Roberts gets it: Here’s the thing. Data, big, medium or small, has no value in and of itself. The value of data is unlocked through context and presentation.

0 comments Read the whole post →

"Highly skilled, nerdy-cool"

September 15, 2011

More good news for data scientists, this time from Fortune: The unemployment rate in the U.S. continues to be abysmal (9.1% in July), but the tech world has spawned a new kind of highly skilled, nerdy-cool job that companies are scrambling to fill: data scientist.

0 comments Read the whole post →

Syncing settings across computers

September 15, 2011

Using Dropbox and shell scripts to automatically sync settings and configurations.

8 comments Read the whole post →

"The application of data is what is fascinating"

September 15, 2011

My friend Darren Herman recently tweeted a statement I couldn't agree more with (I'm linking to his blog post rather than the tweet itself; as we all know, attempting to take advantage of Twitter's disastrous data model is like trying to catch water in a sieve): “The data itself isn’t overly interesting. The application of data is what […]

0 comments Read the whole post →

Google Correlate

September 6, 2011

For some time, we ran a popular series on TGR called "Trends" -- you can see 'em all right here. We used Google Trends and Google Insight to uncover interesting behavioral relationships. Now Google has gone and stolen our thunder, releasing Google Correlate to the world. Google Correlate lets you directly compare the search histories […]

0 comments Read the whole post →

Data science in the mainstream

August 14, 2011

AOL Jobs has posted an article titled "Data Scientist: The Hottest Job You Haven't Heard Of" -- except, of course, that you have. But you TGR readers would make up a very small fraction of AOL's traffic (trust me -- it doesn't take a data scientist to figure that one out), so let's take this […]

2 comments Read the whole post →

And eat it, too

July 28, 2011

Mark at Epic Graphic presents a metaphor for the data/knowledge process: While I love the idea, I think it's missing the most important thing -- the recipe! I'm most interested in how we get from data (raw ingredients) to information (consumable product). Do we follow a specific process - taken straight from a cookbook, for example? […]

0 comments Read the whole post →

QOTD: hindsight edition

July 10, 2011

Kaiser Fung writes on uncertainty and thinking probabilistically about events that have already transpired. The full post is worth a read, but this line sticks out for me: The fact that you won the lottery does not change the fact that economically, it was silly to play the lottery in the first place. This fallacy pops […]

0 comments Read the whole post →

Data science vs business intelligence

June 30, 2011

Steve Miller has written a nice two-part piece on data science for Information Management. Part 1 overviews the topic, including links to many pieces that have been profiled on TGR. Part 2 is a more direct comparison of data science and "business intelligence," a somewhat lackluster (but growing) field of data analytics. One quote stood […]

2 comments Read the whole post →

Information economics

April 25, 2011

An excellent article in the NYT suggests that "information economics" is starting to have a demonstrably positive impact on businesses that harness data well. The most important observation, in my opinion, comes a bit earlier in the article: In a modern economy, information should be the prime asset — the raw material of new products […]

5 comments Read the whole post →

Holographic GapMinding

November 30, 2010

Hans Rosling -- whose lectures are always fascinating -- is hosting a new documentary for the BBC called "The Joy of Stats." A 5 minute clip has been released on YouTube showing a faux-holographic version of Hans' GapMinder visualization package. The graphic overlay is very well done and lets Hans describe the data in an […]

0 comments Read the whole post →

Visualizing politics through time

November 22, 2010

We love choropleths here at TGR, and here's a really great set -- David Sparks has mapped US presidential voting patterns through time to create an excellent visualization of ebbing (and sometimes volatile) political attitudes: Best of all, he did it with R. Please see David's website for more details. Some of his other projects […]

0 comments Read the whole post →

Modeling how cats drink

November 11, 2010

I thought this was fascinating -- scientists have modeled how cats drink. Naturally, once you have a model, you want to see how well it fits the data. For example, is there an optimal lapping speed? After calculation of things like the Froude number and the aspect ratio, they were able to figure out how […]

1 comment Read the whole post →

Google Refine

November 11, 2010

Google has launched a new open-source project called Refine (formerly Metaweb's Freebase Gridworks) which allows users to easily clean up and transform large datasets. There is nothing more painful than cleaning data at the command line - I'd even go so far as to say it's impossible to do a good job. Sorry, R. Excel […]

0 comments Read the whole post →

Chicken soup for the global economy

November 8, 2010

Just replace "technology" with "stress": (via Dilbert)

1 comment Read the whole post →

Breaking up is hard to do (especially on Christmas)

November 2, 2010

David McCandless's TED talk on data visualization is excellent -- you can catch it here -- and Mathias Mikkelsen has highlighted a single analysis that investigates when people are most likely to break up (according to Facebook) (Update: original here): What makes the chart so appealing is how easy it is to understand, despite the […]

0 comments Read the whole post →

The data supply chain

October 27, 2010

Pete Warden has written a post on extracting value from data. Early on, he compares the data itself to raw minerals - it's difficult to sell it at a premium because the eventual buyer will have to invest time and money extracting value from the commodity. Now, data may not be commoditized (yet) but I […]

0 comments Read the whole post →

Tower graphics

October 14, 2010

Max Gadney writes on the rise of "tower graphics" - those giant infographics popping up all over the net which require scrolling endlessly to follow their narratives. He notes: Every time I try to hate these, I imagine people who are just interested in the facts finding them easy to use. (albeit hard to search […]

0 comments Read the whole post →

The language of statistics

June 24, 2010

Joseph Rickert has written a piece calling R "the language of statistics," which I feel is a deserved title. As he puts it: I don’t just mean that R “is spoken” by many or even most statisticians. R’s superiority for statistics is deeper than that. R is a language with syntax and structure that have […]

0 comments Read the whole post →

Twitter's firehose problem

June 22, 2010

Esquire confirms what we already knew: Twitter is a waste of time. The information "firehose" has more in common with the Deepwater site, spewing redundant and useless information at a constant pace. In that regard, truth be told, it's not much different from any other communications service - except that alternatives have either explicit or implicit filtering […]

1 comment Read the whole post →

Sweating the small stuff

June 9, 2010

An excellent (and humorous) TED talk by Rory Sutherland on the importance of detail and clarity:

0 comments Read the whole post →

Off the grid 2: here there be tourists

June 9, 2010

Eric Fischer has updated the Geotagger's World Atlas (previously covered on TGR here) by overlaying an analysis of photographers on the geo-located pictures. The result is even more stunning, capturing the different behaviors of locals (blue) and tourists (red): He drew conclusions by examining other photos by the same photographer. If they had taken photos […]

0 comments Read the whole post →

What is data science?

June 3, 2010

The latest in a series of articles on the topic, Mike Loukides of O'Reilly Radar asks, "What is data science?": We've all heard it: according to Hal Varian, statistics is the next sexy job. Five years ago, in What is Web 2.0, Tim O'Reilly said that "data is the next Intel Inside." But what does that statement […]

1 comment Read the whole post →

A Data Visualization Manifesto

May 31, 2010

Words of wisdom from Andrew Gelman: What harm is done, if any, by having ambiguous labels, uninformative orderings of variables, inconsistent scaling of axes, and all the rest? From a psychological or graphical perception perspective, maybe these create no problem at all. Perhaps such glitches (from my perspective) are either irrelevant to the general message […]

0 comments Read the whole post →

Where do R commands come from?

May 13, 2010

Ever wondered why R commands have those funny and sometimes confusing abbreviations? I admit I always found "c" (which [c]ombines elements) confusing... especially when I was starting out, and would bind it to test variables. In the spirit of upholding my end of TGR's bargain (in which I provide items of nerdy interest and you […]

1 comment Read the whole post →

Precision Information Environments

May 11, 2010

The last time I posted a video for all the futurists out there, we'd never even heard of an "iPad." It's amazing how that device has made clips like these seem so much closer to reality. This one is based on research from the Pacific Northwest National Laboratory on a class of emergency management interfaces called PIEs: Precision Information Environments. […]

0 comments Read the whole post →

Data, data, everywhere

May 7, 2010

Doug Glanville on baseball scouting, but he could have been writing about any modern data-driven industry: But when all is said and done, if you don’t have instincts for what is happening, a perpetual stream of information just becomes a time-stealing vortex, and useless at best — even though you may know a lot more […]

0 comments Read the whole post →

The mathematician's lens

January 25, 2010

A beautiful article in the NYTimes contrasts abstract mathematics with the chilling reality of the Mexican drug cartel wars: I was born in Mexico City, in a world that seems less and less familiar to me. I live now in the opposite corner of the continent. I am training to be a political scientist at […]

0 comments Read the whole post →

Data wars

January 11, 2010

The NYT writes about the military's data problem: Air Force drones collected nearly three times as much video over Afghanistan and Iraq last year as in 2007 — about 24 years’ worth if watched continuously. That volume is expected to multiply in the coming years as drones are added to the fleet and as some […]

0 comments Read the whole post →