I Teach for Free, They Pay Me to Grade

Over at Unqualified Offerings, Thoreau has a bit of a rant about what students perceive as grading on a “curve”:

Moreover, many students have only the foggiest idea of what a curve is. Many (though probably not all) of their high schools had fixed grading scales with fixed percentages for each letter grade. The A/A- range is 90% or above, or 85% or above, or whatever. The B+/B/B- range is whatever percentage range below that. And so forth. If we set the grade markers anywhere below the ranges they saw in high school, that constitutes “a curve” in their eyes. We could base those ranges on the class average, base them on our own expectations of what we want from them on the test, or base it on some calculation involving tonight’s Powerball drawing. If the ranges are below what they saw in high school, they call it a “curve.” Besides, I submit that it is impossible to interpret a percentage cutoff without knowing two other things: The difficulty of the questions, and the partial credit policy. Without that, I simply cannot say if a given percentage is a reasonable level for an A, B, or whatever.

This fits fairly well with my experience, though there’s another population of students who assign a mystical importance to the idea of the “curve,” as if it will magically transform B- grades into A grades because reasons. Actually, my experience is that it really doesn’t make that much difference, at least in the intro classes.

Several years ago, I had a student who was really upset about ending up with a B+ rather than an A-, and because he had been a hard worker without whining during the term, I went back over the grades multiple times. I also went around the department and consulted everybody as to what they do to set grades, and tried all the different algorithms that were described to me.

In the end, every single one of the grading methods I tried put him at a B+, just short of an A-. Fixed percentage cut-offs, fixing the class average at a B and scaling up and down as appropriate, graphing the total point scores and looking for natural break points, fixing the high and low ends, and spacing the rest of the class between them– all of those gave the same result, in the end.
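To make the comparison concrete, here is a rough sketch of two of those schemes side by side. The cutoffs, the 85% target for the class average, the sample scores, and the function names are all assumptions made up for illustration, not the actual numbers or procedures anybody in the department uses.

```python
from statistics import mean

# Hypothetical fixed cutoffs (minimum percent, letter); not taken from the post.
FIXED_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]

# Assumed target: pin the class average to the middle of the B band.
TARGET_MEAN = 85.0


def fixed_grade(score):
    """Letter grade from fixed percentage cutoffs."""
    for floor, letter in FIXED_CUTOFFS:
        if score >= floor:
            return letter
    return "F"


def mean_anchored_grade(score, class_scores):
    """Letter grade after shifting every score so the class mean hits TARGET_MEAN."""
    shift = TARGET_MEAN - mean(class_scores)
    return fixed_grade(score + shift)


# Made-up class with an average in the mid-80s, as described in the post.
scores = [68, 76, 81, 83, 85, 87, 88, 93, 97]
for s in scores:
    print(s, fixed_grade(s), mean_anchored_grade(s, scores))
```

When the class average already sits close to the target, the shift is less than a point and the two columns of letters match, which is the situation described above.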

This shouldn’t be terribly surprising– we have an idea of what we expect and what the students expect, after all, and pitch our assessments with that in mind. When making up test and homework questions, I shoot for something where even the deeply confused students ought to get the first part right, while only the best students will get the last part right. That typically ends up with an average in the 80-ish range, and if you call that a “B,” most classes very naturally fall into a roughly normal distribution about that (to within our ability to resolve the distribution given the small numbers we work with, anyway). Sometimes we get a strongly bimodal distribution– all high 80s and low 70s– but not all that often.

As a general matter, whenever I teach the intro classes, the best students will end up with 90+% of the possible points, which sets the threshold for A grades, and most of the rest fall in the 80% kind of range. The exact details of the grade assignment don’t end up mattering all that much, so I tend to default to fixed percentage ranges– 85% is a B, 95% an A, 75% a C– and use the graph-and-check-break-points method to look at the borderline cases.
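The break-point check is normally done by eye on a plot, but the idea can be sketched in a few lines: sort the point totals and look for the widest gaps between neighbors. The function name and the sample totals below are hypothetical, and the gaps it reports are only candidates for grade boundaries, not a rule for setting them.

```python
def largest_gaps(totals, n_breaks=3):
    """Return the n_breaks widest gaps between adjacent sorted totals.

    Each entry is (gap, lower_score, upper_score), widest gap first.
    """
    ordered = sorted(totals)
    gaps = [(hi - lo, lo, hi) for lo, hi in zip(ordered, ordered[1:])]
    return sorted(gaps, reverse=True)[:n_breaks]


# Made-up totals for a small class.
totals = [95, 94, 88, 87, 86, 79, 78, 70]
for gap, lo, hi in largest_gaps(totals):
    print(f"gap of {gap} points between {lo} and {hi}")
```

A borderline student sitting just below a wide gap is a much easier call than one in the middle of a tight cluster.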

This term, we made a more radical departure, using standards-based grading. We assessed students on specific skills we wanted them to acquire over the course of the term– “I can solve projectile motion problems,” “I have a qualitative understanding of angular momentum,” etc. Each standard got a grade of 0, 1, or 2, and we averaged all the scores then multiplied by two to map to a 4-point scale. And the end result? A grade distribution that was basically indistinguishable from the usual. The average may have been slightly higher than usual– between B and B+ instead of B- and B– but it shook out about the same as always.
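The arithmetic of that mapping is simple enough to write down. This is a minimal sketch assuming every standard counts equally; the function name and the example scores are invented for illustration.

```python
def standards_to_four_point(scores):
    """Average per-standard scores (each 0, 1, or 2) and double the result.

    An average of 2 maps to 4.0 and an average of 1 maps to 2.0.
    """
    if not scores:
        raise ValueError("need at least one standard")
    if any(s not in (0, 1, 2) for s in scores):
        raise ValueError("each standard is scored 0, 1, or 2")
    return 2 * sum(scores) / len(scores)


# A hypothetical student who fully met three standards and partly met one.
print(standards_to_four_point([2, 2, 1, 2]))
```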

So, in the end, I don’t think “curves” or the lack thereof are really that big a deal. It’s easy to get sucked into a spiral of angst over the choice of grading methods, but at least for intro classes in the sciences, with lots of relatively objective graded work, it doesn’t make all that much difference in the end.

6 thoughts on “I Teach for Free, They Pay Me to Grade”

  1. I’ll admit that I’m not quite understanding what the arguments/objections here are. I will agree that in common parlance, adjusting the cutoffs to be anything but a fixed 90/80/70/etc (unfairly disparaged as “what they saw in high school”) is referred to as a “curve”. I’ll disagree that students uniformly object to applying a “curve” to grades – they’re certainly not objecting to those situations where “they are the only thing standing between the students and a complete blood bath”.

    Instead, students usually complain about those situations where their grades are based on (and specifically made lower because of) what the other students in the classes are doing, rather than some objective measure of performance. For example, the classic “bell curving”, where 68% of people get Cs, 14% of students get Bs, 14% get Ds, 2% get As and 2% get Fs – just like last year’s class, even if this year’s section is all future Nobel Prize winners, and last year’s class was all future Wendy’s employee of the month winners(*). Or where an A is based off of the top performer’s grade, with fixed percentages from there, so that one overachiever changes everyone else’s grades. With those schemes, your grade is not dependent just on how you did, it’s also dependent on the vagaries of who else happened to end up in your class. Eliminate that, and I’m guessing you’d avoid much of the grumbling about “curves”.

    You’ll still get grumbling about people’s grades being too low, and claims that what they did is worth being bumped to the higher level, but that’s always going to happen, curve or no curve.

    *) Name the song reference to get extra credit and throw the curve.

  2. I’ve never seen anyone, faculty or student, refer to a “curve” as anything other than setting the grade boundaries relative to the class as a whole.

    Sometimes this is what RM describes above– a fixed number of A’s, B’s, etc. Sometimes it’s more touchy-feely, where the grades are histogrammed, and fall (hopefully) into a few clusters. (We did that. I always hated it as a TA.) Sometimes it’s just scaled to the highest performer in the class or the class average.

    And in my experience, the following are the great causes of score angst, in no particular order:

    1) Lack of clarity in the grading procedure. This is a double-edged sword, because the more details you give, the more people are going to game the system if at all possible.

    2) Curves as described above. Yes, curving in that way might result in a nearly indistinguishable distribution of grades, but the students lack that perspective since they don’t have the full set of information going back over years or decades that the faculty have… and the basic fact of the matter is that you are assigning them grades based on factors completely out of their control, i.e., the performance of other students.

    Plus also, they don’t trust you– every single one of them has been scarred by a bad grading system in the past, and that is the one they remember.

    3) Stupidity.

    4) Med school requirements.

  3. For the life of me, I cannot see any possible reason for there to be coupling between students’ grades during an assessment of individual understanding. That bridge the civil engineer will design, the cement job on the oil well, the compounding by the pharmacist, the call by a radiologist about whether a spot is cancerous … these are not committee/team decisions. I want the person to have shown that they understand the matter at hand, independent of whether some other person understood it.

    In a technical field, there is simply no reason not to use grading based on objectives for the course (with the grading scheme/rubric available for students to review after, unless you plan to reuse those exact questions in a future class). Unfortunately, this doesn’t happen often enough, because it can open the door to all F’s, or worse for many administrators, all A’s.

  4. >In a technical field, there is simply no reason not to use grading based on objectives for the course (with the grading scheme/rubric available for students to review after, unless you plan to reuse those exact questions in a future class).

    That is what the final is for, really. Most of my professors weighted the final heavily. Some had a policy that an A on the final was an A for the class. Personally, I’ve seen my class grade go up a full point after the final.

    The other issue with technical degrees is that prerequisites actually matter, a lot. Foundation classes set the nail and the following classes hammer it home so to speak.

    Which points up that for technical degrees, what often matters is that you passed. Anyone who’s done so has absorbed a vast body of information. And really that’s the point, not to get high grades but to actually learn.

  5. >this doesn’t happen often enough, because it can open the door to all F’s, or worse for many administrators, all A’s.

    Either of those outcomes would demonstrate that something has gone seriously wrong. Individual students might fail, but if an entire class fails, that suggests something is wrong with the instructor, or that expectations are generally too high. On the opposite end, if everybody gets an A then expectations for the class are too low.

    I would argue that the results Chad reports are what professors should aspire to. If the professor is doing his job properly, then the differences among the various grading schemes should be small. This isn’t always true in practice (at my undergraduate alma mater, one department was notorious for making in-class exams so difficult that a student earning 30 percent on those exams was likely to get an A, because few students ever did significantly better than that), so props to Chad for getting this right.

  6. The method of getting the answer matters, not just the answer itself. Therefore, although all methods may give the same answer, only standards-based grading receives full marks.
