I really love the latest post on Lessons from my Twenties, called Stat Is Magic. Sometimes, things are better left as magic.

# statistics

## Bayes, prior to reading

August 16, 2011: I may have to go pick up this book, which was reviewed in the NYT last week, if only because it opens with a favorite quote from Keynes. Titled The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy (Wow, titles are getting […]

## UCF cheating scandal

November 18, 2010: A major cheating scandal at UCF was discovered - and resolved - through a relatively simple statistical analysis of midterm results. The team was able to identify students who cheated on their midterm exam with high confidence. Professor Richard Quinn's announcement of those findings was captured in this video of his lecture: (Via The Daily […]

## Tiny, Large, Very, Nice, Dumbest.

November 12, 2010: Here's a great analysis from Ben Blatt of the Harvard Sports Analysis Collective. He looked at three well-known sports writers -- Bill Simmons, Rick Reilly and Jason Whitlock -- and performed a lexical analysis to create a statistical representation of their writing styles. What can you do with that analysis? Well, you can see what […]

## The data supply chain

October 27, 2010: Pete Warden has written a post on extracting value from data. Early on, he compares the data itself to raw minerals - it's difficult to sell it at a premium because the eventual buyer will have to invest time and money extracting value from the commodity. Now, data may not be commoditized (yet) but I […]

## World Statistics Day

October 21, 2010: World Statistics Day was yesterday, October 20th. Here's how the United States marked the occasion: In order to celebrate WSD, U.S. associations and federal statistical agencies will conduct a breakfast briefing and open house on Capitol Hill to celebrate the contributions of statistics toward informing public policy and improving human welfare. Party hats ON, people! If […]

## Statistical literacy

October 13, 2010: Wired has put together a list of 7 essential skills you didn't learn in college but will need to navigate the 21st century. Skill number 1: statistical literacy. (Skill number 7 is domestic tech -- could that be the new home ec?) (via Kottke)

## Risk & risk management

June 30, 2010: An overview of financial risk and the risk management process.

## The language of statistics

June 24, 2010: Joseph Rickert has written a piece calling R "the language of statistics," which I feel is a deserved title. As he puts it: I don’t just mean that R “is spoken” by many or even most statisticians. R’s superiority for statistics is deeper than that. R is a language with syntax and structure that have […]

## What is data science?

June 3, 2010: The latest in a series of articles on the topic, Mike Loukides of O'Reilly Radar asks, "What is data science?": We've all heard it: according to Hal Varian, statistics is the next sexy job. Five years ago, in What is Web 2.0, Tim O'Reilly said that "data is the next Intel Inside." But what does that statement […]

## A Data Visualization Manifesto

May 31, 2010: Words of wisdom from Andrew Gelman: What harm is done, if any, by having ambiguous labels, uninformative orderings of variables, inconsistent scaling of axes, and all the rest? From a psychological or graphical perception perspective, maybe these create no problem at all. Perhaps such glitches (from my perspective) are either irrelevant to the general message […]

## Beware statisticians bearing gifts

May 24, 2010: The NYT is running a great article about the influx of data in today's world. The prime argument borrows from Einstein's quote, "Not everything that can be counted counts, and not everything that counts can be counted." I think this speaks volumes and should be heeded by the sites that persist in churning out infographics […]

## The revolution will be translated

March 9, 2010: From an NYT article on Google's translation services, this excerpt sums up the most critical transition in machine learning that has happened thus far: Creating a translation machine has long been seen as one of the toughest challenges in artificial intelligence. For decades, computer scientists tried using a rules-based approach - teaching the computer the […]

## The mathematician's lens

January 25, 2010: A beautiful article in the NYTimes contrasts abstract mathematics with the chilling reality of the Mexican drug cartel wars: I was born in Mexico City, in a world that seems less and less familiar to me. I live now in the opposite corner of the continent. I am training to be a political scientist at […]

## Never more true than today

January 10, 2010: In his Chart Wars talk, Alex Lundry mentions a quote which he attributes to H. G. Wells: Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write. However, that statement was actually made by Samuel Wilks, who was paraphrasing a line in Wells' book Mankind in the […]

## More mainstream Bayesians

December 20, 2009: The NYT recently ran an article on the math behind the recent and controversial mammogram advisory change. Unsurprisingly, it is heavily centered on a Bayesian argument. Of course, the key point here is not that the statistics dictated the change, but that budgets and political agendas dictated an acceptable level, which the statistics subsequently informed: […]

## Professor Risk

December 13, 2009: David Spiegelhalter is the Professor of the Public Understanding of Risk at Cambridge University. He has recently produced the following video to encourage better practices in the casual perception of risky behaviors: I think it's a brilliant video and would love to have been one of Professor Spiegelhalter's students. I firmly believe that the study […]

## Ten statisticians every psychologist should know

November 11, 2009: Psychologist Daniel Wright has published a list of ten statisticians every psychologist should know. The list is comprised of The Founding Fathers: 1. Karl Pearson - who established statistics as an academic discipline 2. Ronald Fisher - who developed much of statistics' mathematical foundation, including ANOVA and maximum likelihood, and the importance of p-values 3. […]

## Living in a Bayesian world

October 30, 2009: Increasingly, I've noted in my discussions with statisticians and practitioners a reliance on Bayesian methods. Bayesian statistics rely on an understanding of the uncertainty of a hypothesis. For example, Bayesian hypotheses are literally updated as new information becomes available. Bayesian analyses will also rely heavily on conditional probabilities, or the understanding of likelihoods that depend […]
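
To make the idea of updating a hypothesis concrete, here is a minimal sketch of Bayes' rule applied one observation at a time. The coin example, the 0.8 and 0.5 likelihoods, and the `bayes_update` helper are all illustrative assumptions and do not come from the original post.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Return P(H | data) from P(H), P(data | H), and P(data | not H)."""
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence

# Hypothetical setup: H = "the coin lands heads 80% of the time";
# the alternative is a fair coin. Start undecided between the two.
belief = 0.5
for flip in "HHTH":
    if flip == "H":
        belief = bayes_update(belief, 0.8, 0.5)  # P(heads | H), P(heads | fair)
    else:
        belief = bayes_update(belief, 0.2, 0.5)  # P(tails | H), P(tails | fair)

print(f"Belief that the coin is biased after four flips: {belief:.3f}")
```

Each pass through the loop uses the previous posterior as the new prior, which is the sense in which the hypothesis is updated as new information arrives.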

## Suspicious poll distributions

September 25, 2009: I've covered Benford's method for first-digit fraud analysis before, and now Nate Silver has applied a similar method to polling results. He looked at the last digit of various polls (e.g. a 48% McCain, 49% Obama, 3% undecided poll would be recorded as an 8 and a 9) and compiled histograms of their frequencies. Following […]
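
A rough sketch of the last-digit tally described above. The poll numbers are made up, the function names are hypothetical, and comparing the counts against a uniform distribution with a chi-square statistic is an assumption of this sketch, not necessarily the test used in the original analysis.

```python
from collections import Counter

def last_digit_counts(percentages):
    """Count how often each final digit 0-9 appears in whole-number results."""
    counts = Counter(int(p) % 10 for p in percentages)
    return [counts.get(d, 0) for d in range(10)]

def chi_square_uniform(counts):
    """Chi-square statistic against equal expected counts in every bin."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# Made-up (candidate A, candidate B) results; undecideds are ignored,
# matching the "recorded as an 8 and a 9" convention in the excerpt.
polls = [(48, 49), (51, 44), (46, 47), (52, 43), (45, 50)]
digits = [value for poll in polls for value in poll]

counts = last_digit_counts(digits)
stat = chi_square_uniform(counts)
print(counts, stat)
```

A large statistic relative to a chi-square distribution with 9 degrees of freedom would suggest the trailing digits are less evenly spread than chance alone would produce.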

## Lottery math is not so easy

September 23, 2009: Carl Bialik has written about lottery coincidences in his WSJ print column and on The Numbers Guy blog, inspired of course by the recent consecutive draws in the Bulgarian lottery. Addressing my recent confusion, he sheds a little light on why likelihood estimates varied so much: The probability of Bulgaria's repeated winning numbers became a […]

## Adventures in probability

September 17, 2009: Calculating the probability of the Bulgarian lottery drawing the exact same numbers in consecutive weeks.
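
As a sketch of that calculation, assume a 6-of-42 game with independent, equally likely draws (the game format and the `weeks` figure are assumptions here, not taken from the excerpt). The chance that one specific draw repeats the previous one is one over the number of possible combinations; the chance of seeing a repeat somewhere over many draws is larger.

```python
import math

NUMBERS_IN_POOL = 42   # assumed pool size
NUMBERS_DRAWN = 6      # assumed numbers per draw

# Number of distinct, unordered winning combinations.
combinations = math.comb(NUMBERS_IN_POOL, NUMBERS_DRAWN)

# Probability that a given draw exactly matches the one before it.
p_repeat = 1 / combinations

def p_any_consecutive_repeat(draws):
    """Chance that at least one draw in a run repeats its predecessor."""
    return 1 - (1 - p_repeat) ** (draws - 1)

weeks = 52 * 50  # hypothetical: one draw a week for fifty years
print(combinations, p_repeat, p_any_consecutive_repeat(weeks))
```

The gap between the single-draw figure and the long-run figure is one reason likelihood estimates for the same coincidence can differ so much: the answer depends on which question is being asked.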

## Junk Maths

September 10, 2009: Via Andrew Gelman, I've learned that the BBC has a radio programme (as they would write) called More or Less which is dedicated to statistics. The first bit of the most recent one is called "Junk Maths" (and again, I wish I could have taken a class called "maths") with the following synopsis: Spurious […]

## Modelling interactions

August 18, 2009: Andrew Gelman's latest post highlights the importance of interactions. He includes this breakdown of where people fall depending on political party, ideology, and income: Consider the income dimension. Among liberals, the income curve is flat no matter whether the person is a Democrat, Independent or Republican. For conservatives, however, income has a large effect - […]

## Deconstructing the Gaussian copula, part III

August 11, 2009: The intuition behind copula models: dependence, correlation, single factors and more.

## Statistics: desired and feared?

August 10, 2009: My former department chair, Xiao-Li Meng, has published an excellent article on the emergent role of statistics and the challenge of teaching the science to non-statisticians. He addresses the negative perception of the field, often ingrained by a poor high school experience and summed up in a dismissive scoff that "the best speaker in statistics" […]

## Dronish number nerds

August 6, 2009: It's still not too late for Stats 101: The NYTimes published an article this morning titled "For Today's Graduate, Just One Word: Statistics." Of course I love to see articles like this, cognizant of the massive amounts of data we are faced with and acknowledging the efforts of the people trying to sort it all out: In […]

## Photo finish: the Netflix prize

July 28, 2009: A month ago, the million-dollar Netflix prize was finally won by a coalition of leading teams called BellKor's Pragmatic Chaos, who blended their respective methods into a super-algorithm that finally crossed the 10% improvement barrier. ...or was it? The 10% mark sent the competition into a final, 30-day countdown, during which time other teams could […]