June 24, 2009 in Data,Math

I had a great conversation last night which at one point veered into the pros and cons of various rating systems. In particular, we discussed the "star+comment" system used by Yelp, in which a reviewer assigns between 1 and 5 stars along with a text comment of arbitrary length.

Yelp does some clever things with their rankings, rather than just naively displaying restaurants with higher average ratings above ones with lower ratings. Most notably, I believe, they use a Bayesian process to assess the accuracy of the mean review. Thus, a 4 star rating based on 100 reviews could be presented above a 5 star rating based on 5 reviews, since there is more uncertainty about the veracity of the 5 stars. On top of this, they take into account the people who have left comments (presumably adjusting for other reviews each person has given) as well as the content of the review comments.
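I don't know what Yelp actually computes, but one common way to get this behavior is a "Bayesian average" that pulls each restaurant's mean toward a site-wide prior, with less pull as reviews accumulate. Here is a minimal sketch in Python; the function name, the prior mean of 3.5 stars, and the prior weight of 20 pseudo-reviews are all assumptions I picked for illustration.

```python
def bayesian_average(mean_stars, n_reviews, prior_mean=3.5, prior_weight=20):
    """Shrink a restaurant's mean rating toward a site-wide prior.

    prior_mean and prior_weight are made-up values for illustration:
    the prior acts like `prior_weight` extra reviews at `prior_mean` stars.
    """
    total_stars = prior_weight * prior_mean + n_reviews * mean_stars
    return total_stars / (prior_weight + n_reviews)


# The example from the text: 4 stars from 100 reviews vs. 5 stars from 5 reviews.
well_reviewed = bayesian_average(4.0, 100)  # 470 / 120, roughly 3.92
barely_reviewed = bayesian_average(5.0, 5)  # 95 / 25 = 3.8

print(well_reviewed > barely_reviewed)  # True
```

With these particular numbers the 4 star restaurant comes out ahead. A smaller prior weight would trust the 5 reviews more and could flip the order, so the choice of prior really matters.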

Here's a feature I'd like to see: adjust the rating to account for how Yelp predicts I would rate that restaurant. Let's say I'm looking at a certain restaurant, which has 4 stars. If in the past I have tended to disagree with the people who have reviewed this restaurant, then perhaps it should be presented as a 3 or 2 star choice to me. Or perhaps I rate Italian restaurants very highly but hate sushi; even the highest-rated sushi place on Yelp should be given a low rating when I view it. Or perhaps I like small restaurants, or cheap restaurants - give those categories a ratings boost when I view them.
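To make the first idea concrete, here is a rough sketch of how a personalized rating might be computed from past disagreement. It is only an illustration under my own assumptions: the data layout (a dict of user to restaurant to stars), the function name, and the simple "average offset" rule are all invented here, and a real system would surely be more careful.

```python
def predict_rating(ratings, me, restaurant):
    """Guess how `me` would rate `restaurant`.

    For each reviewer of the restaurant, measure how far my past ratings
    sat above or below theirs on places we both rated, then shift their
    rating of this restaurant by that offset. Average the shifted ratings.
    """
    mine = ratings[me]
    estimates = []
    for user, theirs in ratings.items():
        if user == me or restaurant not in theirs:
            continue
        shared = [r for r in theirs if r in mine and r != restaurant]
        if not shared:
            continue  # no shared history, so nothing to learn from this reviewer
        offset = sum(mine[r] - theirs[r] for r in shared) / len(shared)
        estimates.append(theirs[restaurant] + offset)
    if not estimates:
        return None  # fall back to the public rating in this case
    predicted = sum(estimates) / len(estimates)
    return min(5.0, max(1.0, predicted))  # keep the result on the 1-5 scale


# Toy data: I consistently rate lower than both reviewers of restaurant "C".
ratings = {
    "me": {"A": 2, "B": 3},
    "alice": {"A": 4, "B": 5, "C": 4},
    "bob": {"A": 3, "B": 4, "C": 4},
}

print(predict_rating(ratings, "me", "C"))  # 2.5
```

Restaurant "C" has a public average of 4 stars, but since I have rated two stars below alice and one star below bob in the past, it would be shown to me as roughly a 2.5. The cuisine and price preferences could be handled the same way, with offsets computed per category instead of per reviewer.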

There are a few caveats to this process. First, it requires me to have a reliable ratings history; that history is simply how Yelp learns who I am. Second, the change doesn't have to be dramatic - even a subtle shift in presented ratings could make a big difference to me. Finally, there are systemic effects at work. If a restaurant is dirty, or its staff is rude, then everyone will feel that way whether or not they've agreed in the past. These have to be accounted for.

On the whole this should be a relatively easy thing to implement for anyone with a reliable ratings history - and Yelp has plenty of those. For all I know, this would be a case of overfitting and have little real impact - but I think it's intriguing enough to try.

