Comparing audience and critic ratings using data from Rotten Tomatoes
For the past century or so, cinema has played the part of one of society’s most widely consumed forms of entertainment, with a myriad of movies exploding onto the scene each year with as much vigour as popcorn kernels in a hot pan.
The ever-growing array of cinematic options has made the evening entertainment selection process all the more bewildering, with many viewers employing the assistance of recommendations when making their choices. Such recommendations may come from user-generated websites, reviews published by critics, or platforms that combine the two.
But does the opinion of the esteemed movie critic represent that of the casual cinema goer? Can we rely on the critic’s voice to influence our choice of entertainment?
This article aims to use data from Rotten Tomatoes, a website that’s offered film and television reviews since 1998, to analyse the similarities and differences between the movie rating behaviour of film critics and audience members.
The website displays two separate average ratings (on a percentage scale) for each movie, each representing the share of positive reviews it received: one based on the reviews from a selection of critics, and another generated by the site’s users.
Using a data set composed of the average critic rating, average audience rating and a variety of attributes of over 17,000 movies from Rotten Tomatoes, this article will employ a number of analytical methods to compare the behaviour of the two groups.
We will begin by making a broad assessment of the extent of the differences in the two groups’ respective rating behaviour. We will then dive deeper into an analysis of the drivers of said differences. Finally, we will implement a simple linear model in an attempt to identify the main determinants of each group’s rating tendencies.
To make a general comparison of the respective rating behaviour of the audience and critics, let’s start by looking at the distribution of each group’s ratings.
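As a sketch of how such a comparison might be produced, the snippet below overlays the two distributions as histograms. The synthetic data and the column names `critic_rating` and `audience_rating` are illustrative assumptions, not the data set’s actual schema:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripted plotting
import matplotlib.pyplot as plt

# Synthetic stand-in for the Rotten Tomatoes data set;
# column names are assumptions, not the real schema.
rng = np.random.default_rng(0)
movies = pd.DataFrame({
    "critic_rating": rng.uniform(0, 100, 1000),
    "audience_rating": np.clip(rng.normal(65, 15, 1000), 0, 100),
})

# Overlaid histograms make the shape difference easy to see.
fig, ax = plt.subplots()
ax.hist(movies["critic_rating"], bins=20, alpha=0.5, label="Critics")
ax.hist(movies["audience_rating"], bins=20, alpha=0.5, label="Audience")
ax.set_xlabel("Rating (%)")
ax.set_ylabel("Number of movies")
ax.legend()
fig.savefig("rating_distributions.png")
```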
We can identify differing patterns in the two groups’ ratings in the above histogram.
Both distributions exhibit a left skew, indicating that, in both cases, the movies in our data set are concentrated towards the higher (and therefore more positive) end of the rating spectrum.
However, it’s also fair to say that the two distributions are noticeably different in shape.
The audience ratings are more smoothly distributed without distinct peaks, and are generally clustered around the mid to high rating range with very few falling towards the lower end of the scale.
The critics’ ratings are far more widely spread across the whole spectrum, meaning that there are almost as many movies falling towards the lower end as there are towards the top. The distribution also exhibits peaks at both extremes, although one would assume that these are cases of movies that have received a small number of reviews.
This suggests that we can expect the audience to be more generous in their ratings while the critics, on the other hand, are more, as the name might suggest, critical.
Next, let’s investigate whether there’s any obvious correlation between the two groups’ ratings. With each point as an individual movie, we can plot the ratings of the audience vs the critics.
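A minimal sketch of that plot, paired with a Pearson correlation coefficient to quantify what the scatter shows visually, might look as follows. The data and column names are again illustrative assumptions rather than the real data set:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

# Synthetic data in which audience scores loosely track critic scores.
rng = np.random.default_rng(1)
critic = rng.uniform(0, 100, 500)
audience = np.clip(critic * 0.5 + 35 + rng.normal(0, 15, 500), 0, 100)
movies = pd.DataFrame({"critic_rating": critic, "audience_rating": audience})

# Pearson correlation summarises the strength of the linear relationship.
r = movies["critic_rating"].corr(movies["audience_rating"])

fig, ax = plt.subplots()
ax.scatter(movies["critic_rating"], movies["audience_rating"], s=5, alpha=0.4)
ax.set_xlabel("Critic rating (%)")
ax.set_ylabel("Audience rating (%)")
ax.set_title(f"Pearson r = {r:.2f}")
fig.savefig("ratings_scatter.png")
```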
The above scatter plot indicates that there is in fact some evidence of a positive correlation between the two groups’ ratings, suggesting that a movie rated highly by the audience tends to also receive a positive rating from the critics.
However, the correlation displayed in the plot can hardly be described as strong, with a large number of movies falling a considerable distance away from the central line.
Both of the above charts provide evidence that there is enough of a contrast between the two groups’ ratings to justify diving deeper into the data.
With that in mind, let’s move on to some further analysis and try to determine the drivers of the differing behaviour of the audience and critics by exploring other aspects of the data set.
When thinking about what might influence a viewer’s enjoyment of a movie, an attribute that springs to mind might be its genre. Let’s compare the average rating for each individual genre of movie present in our data set for each of the two groups.
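A per-genre comparison like this is a straightforward group-by and mean in pandas. The rows and column names below are a tiny illustrative stand-in, not the real data:

```python
import pandas as pd

# Illustrative rows; the real data set lists each movie's genres.
movies = pd.DataFrame({
    "genre": ["Documentary", "Documentary", "Kids & Family",
              "Kids & Family", "Drama", "Drama"],
    "critic_rating": [85, 90, 55, 60, 70, 68],
    "audience_rating": [70, 72, 75, 80, 69, 71],
})

# Average rating per genre, one column per group, sorted for readability.
genre_means = (movies
               .groupby("genre")[["critic_rating", "audience_rating"]]
               .mean()
               .sort_values("critic_rating", ascending=False))
print(genre_means)
```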
The chart above suggests that the critics and audience reacted differently to some of the movie genres.
The critics responded more favourably towards Documentary and Classic movies, while the audience exhibited a greater preference for the Faith & Spirituality and Kids & Family genres.
From a theoretical perspective, this makes sense. One could argue that the “Classics” of the film world fulfil the necessary criteria to be considered of high cinematic quality, while perhaps being less accessible to the more casual movie goer.
“Family friendly” movies on the other hand may serve the purpose of entertaining viewers seeking a simple movie to watch with their children without employing the various cinematic techniques required to achieve the accolade of movie greatness.
There were, however, genres that the two groups responded similarly towards. It would appear that Musicals and Dramas, for example, are universally average in popularity amongst the audience and critics alike.
Let’s conduct a similar analysis of the year in which movies were released. Could it be that the groups’ respective ratings follow a similar, or indeed contrasting, pattern of change over time?
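One way to build such a chart is to group by release year, averaging each group’s ratings and counting the movies per year on a secondary axis. The synthetic data and column names below are assumptions for illustration only:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

# Synthetic stand-in: critic ratings drift downwards over release year.
rng = np.random.default_rng(2)
years = rng.integers(1920, 2020, 2000)
movies = pd.DataFrame({
    "year": years,
    "critic_rating": np.clip(90 - (years - 1920) * 0.3
                             + rng.normal(0, 10, 2000), 0, 100),
    "audience_rating": np.clip(70 + rng.normal(0, 10, 2000), 0, 100),
})

# Average rating and movie count per release year.
by_year = movies.groupby("year").agg(
    critic_mean=("critic_rating", "mean"),
    audience_mean=("audience_rating", "mean"),
    n_movies=("year", "size"),
)

fig, ax = plt.subplots()
by_year[["critic_mean", "audience_mean"]].plot(ax=ax)
ax2 = ax.twinx()  # secondary axis for the movie-count line
by_year["n_movies"].plot(ax=ax2, color="grey", linestyle="--")
ax.set_xlabel("Release year")
ax.set_ylabel("Average rating (%)")
ax2.set_ylabel("Number of movies")
fig.savefig("ratings_over_time.png")
```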
The graph above suggests that the audience’s average rating remains fairly constant over time, perhaps displaying a slightly decreasing trend. Such a decreasing trend is, however, exhibited to a much greater extent in the behaviour of the critics, with their line showing a sharp descent over the course of the 100 years covered.
Does this mean that, from a critic’s perspective, there are fewer acclaimed movies around these days than there were in the early stages of the previous century?
Well, not necessarily. It’s worth paying attention to the third line on the graph, which shows us that the further on in time we move, the greater the number of movies we’re analysing becomes.
This could be the result of the fact that the creator of this data set had a preference for including more new movies than old ones, or that the availability of data was greater for more recent years.
However, it’s more likely to be down to the fact that there are simply far more movies produced each year in current times than there were, say, 60 years ago.
With the advent of concepts such as independent cinema paired with the adoption of movie streaming platforms, it’s fair to assume that the barriers to entry into the film industry have decreased significantly.
This means that the distribution of cinema has likely become more “diluted” over time, with an ever wider array of films released each year to cater to a broader audience.
While this doesn’t mean that quality cinema has disappeared, an increased number of films designed to appease viewers searching for an easy fix of entertainment, as opposed to impressing critics, inevitably drags down the average of the critics’ ratings without significantly harming the general opinion of the audience. This could potentially explain the behaviour exhibited above.
Let’s move onto the final part of the analysis, in which we’ll attempt to identify the key determinants of each group’s rating behaviour through the implementation of a simple linear model.
We can use movies’ ages, runtimes, content ratings and genres as explanatory variables, with the critics’ and audience’s average ratings as response variables in two separate models that we can then compare.
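A minimal sketch of such a model follows, using one-hot encoding for the categorical variables and ordinary least squares via `numpy.linalg.lstsq` (the article’s actual code may well use a dedicated library such as statsmodels or scikit-learn instead). The toy rows and column names are illustrative assumptions:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the movie attributes used as explanatory variables;
# column names are illustrative assumptions, not the real schema.
movies = pd.DataFrame({
    "age": [50, 10, 30, 5, 70, 20],
    "runtime": [120, 95, 110, 88, 100, 130],
    "content_rating": ["G", "R", "PG", "R", "G", "PG"],
    "genre": ["Classics", "Horror", "Drama", "Horror", "Classics", "Drama"],
    "critic_rating": [88, 45, 70, 40, 92, 72],
})

# One-hot encode the categorical variables (dropping one level each to
# avoid collinearity), then fit ordinary least squares.
X = pd.get_dummies(movies[["age", "runtime", "content_rating", "genre"]],
                   drop_first=True).astype(float)
X.insert(0, "intercept", 1.0)
y = movies["critic_rating"].to_numpy(dtype=float)

coefs, *_ = np.linalg.lstsq(X.to_numpy(), y, rcond=None)
for name, beta in zip(X.columns, coefs):
    print(f"{name:>25}: {beta:+.2f}")
```

The same pipeline would be run a second time with the audience’s average rating as the response, allowing the two sets of coefficients to be compared side by side.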
For readers unfamiliar with the fundamentals of linear regression, the coefficient of a variable describes the estimated change in the response associated with an incremental change in that variable, holding the other variables constant.
In the context of this model, this means that, for example, a movie being a Documentary is estimated to increase the critics’ average rating by 27.9 points, all else being equal.
In terms of an interpretation of the models’ results, we can see that, in both cases, movies’ genres had a much greater influence on each group’s rating behaviour than the other variables we included in the model.
With some exceptions aside, many of the genres had similar effects on both groups’ behaviour in terms of both direction and magnitude, with Documentary and Animation movies having the largest positive influence on both groups’ ratings.
Differences lie in genres such as Horror, which had a much greater negative impact on the audience’s ratings than on the critics’. Westerns, while having a fairly weak influence for both groups, displayed effects running in opposite directions: positive for the critics and negative for the audience.
In support of the graph from the previous section displaying average rating over time, the age variable produced a positive coefficient for both groups, with a stronger effect amongst the critics. The model, however, estimated its impact to be minimal compared with the genre variables, as was the case with runtime.
Another interesting difference to note is the effect of movies’ content ratings. While displaying an almost negligible impact on the critics’ behaviour, the effect was more pronounced among the audience, with a stricter content rating associated with a lower average score.
This also fits with theory discussed previously in the analysis. Since we identified that the audience responds more favourably to Kids & Family movies, it makes sense that the level of strictness of a movie’s content rating (and hence, by inference, the level of explicit content it contains) is correlated negatively with enjoyment for this group.
This article used data from Rotten Tomatoes to compare the rating behaviour of movie critics with that of regular users of the website. After employing a number of techniques to analyse the data, we identified the following key takeaways:
- There was a noticeable difference in the respective patterns of the two groups’ behaviour, with the audience more likely to rate generously than the critics
- The two groups responded differently to the different genres present in our data set
- The critics’ average rating decreased significantly as movies’ release years increased, while the audience ratings remained more constant
- Genres were the principal determinants of rating behaviour in both groups according to our model, though the audience displayed more sensitivity towards movies’ content ratings
It is of course worth touching upon the limitations of an analysis of this kind. Cinema is, at its core, a form of art.
While we can measure certain aspects to form analytical arguments, there will always be elements of cinema that are impossible to measure quantitatively: characteristics of a film that influence how we feel about it that we can’t describe with numbers.
However, I hope that an attempt to explore an interest in film using a data-centric approach nevertheless made for an interesting read!
If you’re interested in taking a look at the Python code I used to analyse the data set and build the visualisations for this article, feel free to check out this repository on my GitHub. Acknowledgement of the data set’s creator is included in the repository’s README.