Thursday, April 05, 2012

"So... what did you think?" or: how I put entirely too much thought into my movie ratings system

Why do we grade movies?  Some would argue (and have!) that critics should forgo ratings entirely, because any good review will convey to the reader why and how much the reviewer liked the movie.  Yet somehow it's become expected that if we're going to share our thoughts on a movie, we should provide a score as well, if only as a quick reference for those who are paying attention.  This seems especially important for bite-sized assessments posted on social media.  After all, sites like Twitter impose a character limit, and since one can't cram much detail into 140 characters, some kind of grade can succinctly sum up what a full review would otherwise convey.

But which grading scale to use?  As a sometimes blogger and would-be critic, that's a question I've struggled with for years.  In my early years, I employed letter grades, until it hit me that letter grades carry different weight with different readers.  For example, those junior achiever types like I was in my school days tend to look down on anything less than an A, so when one rates a movie a B, it somehow seems subpar even if it's perfectly serviceable entertainment.

After that, I moved on to the four-star scale, which seems to be the most popular right now.  However, the problem with this scale, aside from its ubiquity, is that there isn't really a consensus on what the different star ratings actually mean.  Back when he was still co-hosting At the Movies, Roger Ebert stated that anything receiving three stars or more would constitute a "thumbs up," which denoted a positive review.  By contrast, the four-star scale used by The Chicago Reader is closer to the following:

0 stars – Worthless
1 star – Has redeeming facet
2 stars – Recommended
3 stars – A must see
4 stars – Masterpiece

Because of how ill-defined my four-star system was, back in 2007 I decided to switch to a Sicinski-esque 10-point scale.  At the time, this suited my needs best, since it required little explanation.  If you rate a movie "on a scale of 1 to 10, 10 being the best," people will have a pretty solid idea of how you feel.  It's simple math, really- if 10 is the best a movie can be, and 1 (or zero, in my case) is the worst, then logically speaking a rating of 5 would mean average, a 6 or higher above average, and a 4 or lower below.  So you can see how I found that pretty useful.

But the other day, I looked at my lists of the movies I’ve seen and graded between 2007 and 2011, and it hit me that maybe the 10-point scale isn’t right for me after all.  It’s not that it’s not an accurate scale, but something about it just feels… wrong.  Look at the chart below:

Notice something a little off?  If 5 is supposed to be the "average" or "mediocre" rating, then my ratings distribution looks a little asymmetrical.  My movie intake has tapered off since 2007, but the constant is that 6 ratings- given to movies about which I'm positive but hardly passionate- outnumber the rest.  And as you can see, the other ratings tend to follow a pretty similar pattern as well, bottoming out with both 10s and everything 3 and below.

Of course, if I actually saw every movie out there the chart would look rather different.  I dare say that the distribution would shift to the left, perhaps even left of the "average" rating of 5.  But the truth is that I'm like most paying moviegoers in that I tend to stick to movies that I think I or my family will enjoy.  Granted, there are some real stinkers out there, and when they're pitched loudly enough to kids, chances are I won't be able to avoid them.  But now that my kid is old enough that he doesn't feel like he has to see every new animated family movie out there, this is less of a problem than it used to be.

Another problem with the 10-point scale is that it devotes as many points to bad movies as good ones.  Unless you see a lot of bad movies- which thankfully I don’t anymore- it’s pretty hard to tell the difference between movies that deserve a 1, a 2, or a 3.  If something’s really offensively awful, it deserves one of these, but considering how pissed off I am after I watch a really terrible movie, the last thing on my mind is trying to pinpoint just how bad it really is.

This is my big problem with the Ebert scale.  Ebert has traditionally ranked movies out of four stars, including half-star ratings, which makes for a total of nine possible ratings.  Of these, six are negative.  Now, there's obviously going to be a difference between a 0-star rating (given to the worst of the worst) and a 2 ½-star rating (given to movies that aren't quite good enough to recommend).  But why would one need six different ratings with which to categorize movies that (a) aren't good, and that (b) one wouldn't recommend to others?  Seems excessive to me, having to come up with a bunch of different ways to tell people a movie sucks.

Meanwhile, that leaves only three positive ratings, which basically boil down to (a) good, (b) very good, and (c) great.  Sorry, but that isn't enough for me.  If I'm praising a movie, I want people to know whether I think it's a flat-out masterpiece or something that's just really enjoyable, which to my eyes is a fairly substantial difference.

So what is my alternative?  In the last week or so I’ve been toying with the idea of a modified Reader-style system, which would incorporate half-star ratings but would otherwise have similar emotional responses applied to the ratings.  Here’s what I’m looking at right now:

0 stars – Simultaneously terrible and offensive.  I’m angry at myself for seeing this. (0 to 2 on the current scale)
½ star – Terrible, but not offensively so.  Mostly just a waste of time. (3)
1 star – Not recommended.  See if you must, but don’t say I didn’t warn you. (4)
1 ½ stars – Not quite recommended, but has some redeeming facet that could make it worth seeing. (5)
2 stars – Pretty good, not bad, I can’t complain. (6)
2 ½ stars – Well worth your time. (7)
3 stars – A must-see.  A contender for my yearly top-10 list. (8)
3 ½ stars – A near masterpiece.  One of the best films of the year. (9)
4 stars – A masterpiece.  See it, like, now.  Very rare. (10)
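For anyone who keeps their ratings in a spreadsheet or script, the parenthetical mappings above amount to a simple lookup table.  Here's a minimal sketch of how one might automate the conversion; the dictionary and function names are my own invention, not anything official.

```python
# Sketch: converting old 0-10 ratings to the new star scale described
# above. Stars are stored as half-star units (1 unit = half a star)
# so the table stays in whole numbers.
OLD_TO_NEW_HALF_STARS = {
    0: 0, 1: 0, 2: 0,  # 0 stars
    3: 1,              # 1/2 star
    4: 2,              # 1 star
    5: 3,              # 1 1/2 stars
    6: 4,              # 2 stars
    7: 5,              # 2 1/2 stars
    8: 6,              # 3 stars
    9: 7,              # 3 1/2 stars
    10: 8,             # 4 stars
}

def to_stars(old_rating: int) -> str:
    """Format an old 0-10 rating as a star label, e.g. 7 -> '2 1/2 stars'."""
    halves = OLD_TO_NEW_HALF_STARS[old_rating]
    whole, half = divmod(halves, 2)
    parts = []
    if whole:
        parts.append(str(whole))
    if half:
        parts.append("1/2")
    if not parts:
        parts.append("0")
    unit = "star" if halves in (1, 2) else "stars"
    return " ".join(parts) + " " + unit
```

Running each old rating through a function like this is all it takes to regrade an entire back catalog at once, which is exactly what the converted 2007-2011 chart below reflects.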

I went back and forth about including the half-star rating, largely because I wasn’t sure I needed another negative rating in there.  However, I decided to include it since I thought it was necessary to distinguish between your garden-variety bad movies and the true crimes against cinema.

So if I switch my 2007-2011 ratings over to the new scale, this is what the distribution looks like.

Looks better to me.  What do you think?


Kristen Lopez said...

I do the A, B, C system. Allows me a little more leeway if a movie is on the fringe and gives more diversity in the ratings.

Yodelling Llama said...

Good piece. I do think ratings systems, to the extent they're explained, can be useful also, even with regard to long form criticism because it signals to the reader which *reviews* are worth reading. Certainly if a writer is amazing, any review is worth reading. But many reviewers aren't great writers, but merely capable writers, and so all I'm looking for is (1) did this reviewer think the movie is good, and (2) if so, why. I don't necessarily care to read a middling-to-negative review, and the ratings can be a useful shorthand.

Also, your new system looks an awful lot like Stephen Whitty's rating system, which I always found a rather helpful shorthand. [By the by, using this system, I find 3 1/2 star movies are almost invariably and universally regarded as awesome--the consensus rating--but 4 star movies are almost always controversial, with some people finding them "masterpieces," and others finding them aggressively overrated. Not sure why.]

Also, I think you used "qualitative" early in the piece, when you meant "quantitative."

Paul C. said...

The reason I described ratings systems as "qualitative" rather than "quantitative" is that they are systems meant to assess a film's intangible qualities. I tend to think of a "quantitative" system more as something that counts concrete items or specific occurrences. However, looking back on the piece neither seems like a good fit in this context, so I've simply removed it altogether.