Recently I was looking for a good new restaurant. Google Maps showed me two options: restaurant A with 10 reviews, all 5 stars, and restaurant B with 200 reviews and an average rating of 4. I was tempted to choose restaurant A, but the low number of reviews concerned me. On the other hand, the many reviews of restaurant B gave me confidence in its 4-star rating, but promised nothing excellent. So I wanted to compare the restaurants while discounting for the number of reviews (or the lack thereof). Thanks to Bayes, there is a way.


The Bayesian framework allows us to assume something about the initial distribution of ratings and then update that initial belief based on observed data.

#### Set initial beliefs / prior

Initially, we know nothing about the probabilities of each rating (from 1 to 5 stars). So, before any reviews, all ratings are equally likely. That means we start from the uniform distribution, which can be expressed as a **Dirichlet** distribution (a generalization of the Beta). Our average rating will be just (1+2+3+4+5)/5 = 3, which is where most of the probability is concentrated.

```python
import numpy as np

# prior prob. estimates sampled from the uniform (flat Dirichlet)
sample_size = 10000

p_a = np.random.dirichlet(np.ones(5), size=sample_size)
p_b = np.random.dirichlet(np.ones(5), size=sample_size)

# prior ratings' means based on sampled probs
ratings_support = np.array([1, 2, 3, 4, 5])

prior_reviews_mean_a = np.dot(p_a, ratings_support)
prior_reviews_mean_b = np.dot(p_b, ratings_support)
```
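To confirm that the flat prior really centers the expected rating at 3, we can average the sampled prior means. This is a small check of the sampling above, with a seeded generator (my own addition, for reproducibility):

```python
import numpy as np

rng = np.random.default_rng(0)
ratings_support = np.array([1, 2, 3, 4, 5])

# draw rating-probability vectors from the flat Dirichlet prior
p = rng.dirichlet(np.ones(5), size=10000)
prior_means = p @ ratings_support

# the Monte Carlo average should sit very close to the prior mean of 3
print(prior_means.mean())
```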

#### Update beliefs

To update the initial beliefs, we multiply the **prior beliefs** by the **likelihood** of observing the data under those beliefs. The observed data is naturally described by a **Multinomial** distribution (a generalization of the Binomial). It turns out that the Dirichlet is a **conjugate prior** to the Multinomial likelihood. In other words, our **posterior distribution is also a Dirichlet** distribution, with parameters incorporating the observed data.

```python
# observed data: counts of 1- to 5-star reviews
reviews_a = np.array([0, 0, 0, 0, 10])
reviews_b = np.array([21, 5, 10, 79, 85])

# posterior estimates of ratings probabilities given the observed reviews
sample_size = 10000

p_a = np.random.dirichlet(reviews_a + 1, size=sample_size)
p_b = np.random.dirichlet(reviews_b + 1, size=sample_size)

# calculate posterior ratings' means
posterior_reviews_mean_a = np.dot(p_a, ratings_support)
posterior_reviews_mean_b = np.dot(p_b, ratings_support)
```

The posterior average rating of A is now somewhere in the middle, **between the prior 3 and the observed 5**. But the average rating of B didn't change much, because the large number of reviews outweighed the initial beliefs.
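In symbols, the conjugate update amounts to adding the observed review counts to the prior parameters:

```latex
p \sim \mathrm{Dirichlet}(\alpha_1, \dots, \alpha_5), \qquad
(n_1, \dots, n_5) \sim \mathrm{Multinomial}(N, p)
```

```latex
p \mid n \sim \mathrm{Dirichlet}(\alpha_1 + n_1, \dots, \alpha_5 + n_5)
```

This is why the code passes `reviews + 1` to `np.random.dirichlet`: the uniform prior has every αᵢ = 1, and each nᵢ is the count of i-star reviews.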

#### So, which one is better?

Back to our original question: "better" means the probability that the **average rating of A is bigger than the average rating of B**, i.e., P(E(A|data) > E(B|data)). In my case, I obtain a probability of 85% that restaurant A is better than restaurant B.

```python
# P(E(A|data) - E(B|data) > 0)
posterior_rating_diff = posterior_reviews_mean_a - posterior_reviews_mean_b
p_posterior_better = np.mean(posterior_rating_diff > 0)
```
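As a sanity check on the sampling, the posterior mean rating of each restaurant can also be computed in closed form, since the mean of a Dirichlet(α) is α divided by the sum of α. A small sketch, using the same review counts as above:

```python
import numpy as np

ratings_support = np.array([1, 2, 3, 4, 5])
reviews_a = np.array([0, 0, 0, 0, 10])
reviews_b = np.array([21, 5, 10, 79, 85])

# posterior Dirichlet parameters: uniform prior (all ones) + observed counts
alpha_a = reviews_a + 1
alpha_b = reviews_b + 1

# E[p] = alpha / alpha.sum(), so the expected rating is a dot product
mean_a = np.dot(alpha_a / alpha_a.sum(), ratings_support)  # ~4.33
mean_b = np.dot(alpha_b / alpha_b.sum(), ratings_support)  # ~3.99
```

So A's expected rating is pulled down from 5 toward the prior mean of 3, yet still edges out B.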

The Bayesian update allows us to incorporate prior beliefs, which are especially valuable when the number of reviews is small. However, when the number of reviews is large, the initial beliefs do not significantly affect the posterior.
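To see the pull of the prior fade with more data, we can compute the closed-form posterior mean for a hypothetical restaurant with n five-star reviews and nothing else (the counts here are made up for illustration):

```python
import numpy as np

ratings_support = np.array([1, 2, 3, 4, 5])

def posterior_mean(n_five_star):
    # uniform prior (alpha = 1 per star) plus n five-star reviews
    alpha = np.ones(5)
    alpha[4] += n_five_star
    return np.dot(alpha / alpha.sum(), ratings_support)

# the prior drags 10 reviews well below 5, but barely moves 1000
print(posterior_mean(10))    # ~4.33
print(posterior_mean(1000))  # ~4.99
```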

The code is available on my GitHub, and I am going to restaurant A.

*A Bayesian Way of Choosing a Restaurant* was originally published in Towards Data Science on Medium.