23
Feb
Everyone’s a Critic: Unpacking Rotten Tomatoes’ Ratings Data
Everyone has had this experience: you’re trying to decide what movie to watch, so you see what Rotten Tomatoes has to say. What you find fairly often, however, is confusing: This Means War receives a 25% rating from critics and a 72% rating from audiences, while 8 Mile nets a 76% rating from critics and a 54% user rating. What is a moviegoer to think? That critics take themselves too seriously, and that they habitually undervalue an awesome shoot ‘em up or heartwarming romantic comedy? That the American public is so easy to please that 76% of them actually liked Transformers: Revenge of the Fallen? Let’s turn to the data to find out.

First, a few comments on the raw data. I started with two lists: (1) the 250 highest-rated movies on IMDb (The Shawshank Redemption at the top, Three Colors: Red at the bottom), and (2) the 1,000 highest-grossing films of all time in the U.S., which runs from Avatar ($760M) to Easy A ($58M). With these two lists in hand, I used JavaScript/jQuery to pull down oodles of data from the Rotten Tomatoes API. Importantly, I also pulled down information for similarly titled movies (e.g., “Heat” gave me “Heathers”, “In the Heat of the Night”, “Red Heat”, and so on). This resulted in roughly 9,000 unique movies, of which about 22% (or about 2,000) had enough reviews to merit inclusion in the analysis.
So, what types of movies did we end up with? The bar chart below summarizes them by genre. Note that a single movie can actually be tagged with multiple genres. Star Wars IV, for example, is tagged as Action & Adventure, Mystery & Suspense, and Science Fiction & Fantasy:

Great — this seems to jibe with my intuition about the types of movies that would both have a meaningful number of reviews and appear on our “raw” and “similar title” lists. In other words, this looks like the distribution you might see at your local AMC theater over the course of a year.
On to the ratings themselves. For those who don’t know, Rotten Tomatoes takes letter and number grades found in the wild, converts them to a 10-point system, and then averages the results. While it’s probably the best possible system for rating aggregation, there are some perils: if 100% of critics are slightly positive about a ho-hum movie, that movie gets a score of, say, 75%. Yet, if 75% of critics absolutely love a movie (Napoleon Dynamite, perhaps) while 25% despise it, the movie still receives a 75%. We won’t sweat this too hard, though. Let’s look at how ratings are distributed:
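To make that peril concrete, here is a minimal Python sketch that takes the averaging scheme described above at face value. The two sets of grades are made up for illustration, and `average_score` is a hypothetical helper, not anything Rotten Tomatoes publishes.

```python
from statistics import mean, pstdev

# Hypothetical grades on a 10-point scale for two made-up films.
lukewarm = [7.5] * 20                # all 20 critics are mildly positive
divisive = [10.0] * 15 + [0.0] * 5   # 75% love it, 25% despise it

def average_score(grades):
    """Average 10-point grades and express the result as a percentage."""
    return mean(grades) * 10

for name, grades in [("lukewarm", lukewarm), ("divisive", divisive)]:
    # pstdev shows the disagreement that the single averaged score hides.
    print(f"{name}: score {average_score(grades):.0f}%, spread {pstdev(grades):.2f}")
```

Both films come out at 75%, even though the spread of grades is zero for the first and large for the second.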

The distribution reveals some interesting, but perhaps unsurprising, conclusions. For starters, user scores have a much tighter distribution. No shock there, since there are so many more data points that go into the user score (435,000 for Scarface) than the critics’ score (55). Another takeaway is that no matter how bad a movie is, there will always be a few viewers willing to give it a favorable rating. Similarly, no matter how good a movie is, you can’t please all of the people, though The Shawshank Redemption (98%), The Godfather (97%), and Cidade de Deus (97%) gave it their best. (Oddly enough, there were two movies with a 100% user rating: Off The Hook and Red Hook Summer, though I exclude these because they have only 44 and 21 user ratings, respectively.)
Now, on to the question that piqued my interest from the very beginning: how do critics’ and users’ ratings vary by genre? Note that the blue bars span the 20th to 80th percentiles of rating deltas by genre. In other words, for drama movies, 20% of movies were preferred by critics by at least 10 points, while 20% of movies were preferred by users by at least ~15 points.
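For anyone who wants to reproduce this kind of chart, here is a rough Python sketch of the percentile calculation. The film records are invented, the sign convention (user score minus critic score) is my assumption, and `delta_percentiles_by_genre` is a hypothetical helper rather than the code behind the original chart.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical records: (title, genres, critic score, user score).
movies = [
    ("Film A", ["Drama"], 80, 70),
    ("Film B", ["Drama"], 65, 60),
    ("Film C", ["Drama"], 50, 50),
    ("Film D", ["Drama", "Comedy"], 55, 60),
    ("Film E", ["Drama", "Comedy"], 40, 55),
]

def delta_percentiles_by_genre(movies):
    """Return {genre: (20th percentile, 80th percentile)} of user - critic."""
    deltas = defaultdict(list)
    for _title, genres, critic, user in movies:
        for genre in genres:  # a movie counts once for every genre tag it has
            deltas[genre].append(user - critic)
    result = {}
    for genre, values in deltas.items():
        if len(values) < 2:   # too few points to compute percentiles
            continue
        # n=5 yields the 20th, 40th, 60th, and 80th percentile cut points.
        cuts = quantiles(values, n=5, method="inclusive")
        result[genre] = (cuts[0], cuts[-1])
    return result

for genre, (p20, p80) in delta_percentiles_by_genre(movies).items():
    print(f"{genre}: 20th = {p20:+.1f}, 80th = {p80:+.1f}")
```

With this convention, a negative 20th percentile means critics liked the bottom fifth of that genre more than users did, and a positive 80th percentile means users liked the top fifth more.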

The data bear out exactly what we’ve always thought: the average moviegoer is more forgiving of horror, comedy, and romance movies (hey, we just want to be entertained!) while smug, self-impressed critics have a soft spot for documentaries, classics, and “special interest” films.
With such an intriguing mountain of data in hand, why stop at genre? Do user/critic deltas show similar patterns based on other variables? Of course they do. Let’s look at MPAA rating first. For starters, here are average scores by movie rating drawn from 132 G, 422 PG, 507 PG-13, 628 R, 2 NC-17, and 260 unrated films.

Most interesting to me is the “critics’ valley” you see for PG-13 films. This, I believe, is where uninspired, dirty-yet-still-teen-friendly films go to be lambasted by critics. In any case, here is the 20th/80th percentile chart for ratings deltas, which tells the story even more clearly:

Critics, it turns out, are suckers for G and unrated films, while users tend to rank PG, R, and especially PG-13 films higher than critics.
One last factor to examine is decade of release. In the raw data, we have movies from the 1920’s (8), 30’s (30), 40’s (44), 50’s (64), 60’s (79), 70’s (121), 80’s (227), 90’s (422), 00’s (787), and the 10’s (193). There is a pretty severe bias in the data, particularly for earlier films: the only movies from the 1920’s to make it to Rotten Tomatoes (with a significant number of user and critic reviews) are those excellent enough to have survived 80 years of history. With that in mind, you can probably ignore the first half of this otherwise interesting graph of average rating by decade:

Ah, how we all pine for the 50’s. And the rating deltas:

Again, intuition is confirmed: critics, who have patience for the slower-moving films of years past, are more likely to give films made through the 70’s a higher rating. Once the Brat Pack, Steve Martin, and John Rambo started to grace the silver screen in the 1980’s, though, user ratings started to drift significantly higher.
What about Best Picture nominees? These, at least, should be nearly as well-liked by users, right? Not so. Of the 173 Best Picture nominees in my dataset, nearly 80% were rated higher by critics than by users.

For good measure, let’s take one last look at the user/critic rating deltas. This time, the movies are segmented by the average critics’ score.

In other words, if critics rated a movie 90+, users gave it a higher score only 20% of the time, while 20% of the time users gave it at least a 15-point haircut. One other conclusion: it seems that critics and users can only agree on mediocre movies.
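The segmentation itself is simple to sketch in Python. The score pairs below are invented, the 10-point band width is my assumption, and `critic_band` and `share_users_higher` are hypothetical helper names.

```python
from collections import defaultdict

def critic_band(score, width=10):
    """Label the band a critic score falls in, e.g. 93 -> '90+', 55 -> '50-59'."""
    low = min(score // width * width, 100 - width)  # fold 100 into the top band
    return f"{low}+" if low == 100 - width else f"{low}-{low + width - 1}"

def share_users_higher(pairs):
    """Return {band: share of films where the user score beats the critic score}."""
    counts = defaultdict(lambda: [0, 0])  # band -> [users higher, total films]
    for critic, user in pairs:
        band = critic_band(critic)
        counts[band][0] += user > critic
        counts[band][1] += 1
    return {band: higher / total for band, (higher, total) in counts.items()}

# Hypothetical (critic score, user score) pairs.
pairs = [(95, 90), (92, 94), (98, 80), (91, 85), (100, 88), (55, 56), (52, 50)]
print(share_users_higher(pairs))
```

In this toy data, users score the film higher in one of the five films rated 90+ by critics, and in one of the two films in the 50-59 band.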
Let’s close with one final interesting question. What are the 10 most critically-acclaimed, user-despised films of all time? And, conversely, what do moviegoers love despite widespread panning from critics? (Remember that I don’t have data for EVERY movie in the Rotten Tomatoes database — there may be some even more controversial movies out there.)
Movies that Critics Loved and Users Hated
- So Much So Fast (2006), +60
- The Salt of Life (2011), +59
- Fast Company (1978), +58
- Spy Kids (2001), +54
- Ghost of Frankenstein (1942), +50
- Antz (1998), +48
- The Flamingo Kid (1984), +47
- Caged Heat (Caged Females) (Renegade Girls) (1974), +45
- Rudyard Kipling’s The Jungle Book (1994), +44
- Heat (1972), +43 -and- Chicken Run (2000), +43
Movies that Users Loved and Critics Hated
- Pretty Village, Pretty Flame (1996), +92
- The Cross (The Cross: The Arthur Blessitt Story) (2009), +79
- Step Up (2006), +65
- Big Momma’s House 2 (2006), +63
- Twins of Evil (1971), +62
- Seven Days In Utopia (2011), +60
- You Got Served (2004), +60
- John Q (2001), +60
- Sister Act 2: Back in the Habit (1993), +59
- I Now Pronounce You Chuck and Larry (2007), +59 -and- P.S. I Love You (2007), +59 -and- Raise Your Voice (2004), +59
Telling, perhaps — I’ve seen a majority of the user-loved movies, and only one or two of the critically-preferred ones. Of course, I don’t pretend to have sophisticated taste. If it has Jason Statham, then I — like many of my fellow citizens — will probably like it better than a critic.
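A footnote for the data-minded: both lists above run past ten entries because of ties at the cutoff. One way to build such a list is sketched below in Python. The first four deltas come from the critics' list above, "Film X" is a made-up filler entry, and `top_with_ties` is a hypothetical helper.

```python
def top_with_ties(deltas, n=10):
    """Return the n largest deltas, plus any films tied with the n-th one."""
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) <= n:
        return ranked
    cutoff = ranked[n - 1][1]  # delta of the last film that makes the cut
    return [item for item in ranked if item[1] >= cutoff]

deltas = {
    "Caged Heat (1974)": 45,
    "Rudyard Kipling's The Jungle Book (1994)": 44,
    "Heat (1972)": 43,
    "Chicken Run (2000)": 43,
    "Film X": 30,  # hypothetical filler entry, not from the dataset
}

for title, delta in top_with_ties(deltas, n=3):
    print(f"- {title}, +{delta}")
```

Asking for the top three here returns four films, because Heat and Chicken Run share the cutoff value.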
dfkoz posted this