The idea behind user-generated reviews is that with a large enough sample size, you will eventually reach the truth about the quality of a product or establishment. However, what isn't often noticed is the strange way that accuracy varies with sample size.

For example, a friend of mine in San Francisco complains that the Yelp reviews for restaurants in a certain neighborhood are always 4 stars. Not 3.5 and not 4.5, but just 4. The reason is that every establishment there has hundreds of reviews, and at that sample size you're actually just getting an over-representation of the legion of vocal supporters of their favorite local establishment.

But four hours away, in South Lake Tahoe, there is the opposite problem. Since there are only a few reviews for most establishments, they're all the random stinkers from patrons who were burned by their experience.

The ideal sample size for reviews, then, is actually somewhere in the middle. If you have too many reviews, a biased sub-group will be over-represented, and if you have too few, the noise from a handful of outlier experiences never gets smoothed out.
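You can see both failure modes in a toy simulation. This is a hypothetical model, not anything from Yelp's data: suppose some fraction of reviewers are "fans" who always leave 5 stars, while ordinary patrons rate uniformly from 1 to 5 (true mean 3.0). With few reviews, averages swing wildly; with many, they settle tightly, but around the fan-inflated mean rather than the true one.

```python
import random
import statistics

def simulate_rating(n_reviews, fan_fraction=0.3, seed=None):
    """Average star rating under a hypothetical biased-reviewer model.

    'Fans' (a fan_fraction share of reviewers) always give 5 stars;
    everyone else rates uniformly from 1 to 5, so the unbiased mean
    would be 3.0 but the biased long-run mean is pulled up toward 3.6.
    """
    rng = random.Random(seed)
    ratings = []
    for _ in range(n_reviews):
        if rng.random() < fan_fraction:
            ratings.append(5)                            # vocal supporter
        else:
            ratings.append(rng.choice([1, 2, 3, 4, 5]))  # ordinary patron
    return statistics.mean(ratings)

# South Lake Tahoe case: few reviews per place, so averages are noisy.
small_samples = [simulate_rating(5, seed=s) for s in range(200)]

# San Francisco case: hundreds of reviews, so averages cluster tightly,
# but around the biased mean (~3.6 here), not the true mean of 3.0.
large_samples = [simulate_rating(500, seed=s) for s in range(200)]

print("spread with 5 reviews:  ", round(statistics.pstdev(small_samples), 2))
print("spread with 500 reviews:", round(statistics.pstdev(large_samples), 2))
print("typical large-sample avg:", round(statistics.mean(large_samples), 2))
```

The specific numbers (30% fans, uniform patron ratings) are made up for illustration; the point is only that more reviews shrink the noise without touching the bias.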


posted by phil on Wednesday Jan 23, 2013 3:19 AM

1 Comment

I find that the time-consuming process of actually reading the comments helps to figure out what's really going on.
