specifications grading

what are grades?

Grades serve a few purposes:

Grades give feedback. They allow you to know what you know and what you don’t know. This is how most professors think of grades most of the time.
Grades allow evaluation and comparison of students. This is how most students think of grades most of the time.
Grades are motivators. Hope for a good grade, and fear of a bad one, is the predominant way professors get their students to do the work necessary to learn.

These purposes hold for any assignment and for overall course grades, but they are present in different mixtures. A grade on a homework assignment is mostly feedback, an exam grade is mostly evaluation, and a final grade is a strong motivator.

averages

Most courses are graded based on the accumulation of points. Each assignment is worth some number of points, and the percentages are averaged together (possibly with some weighting) to compute a final average, which is then converted to a letter grade. The primary advantage of this system is that it’s easy to administer and describe. However, it has several major shortcomings:

comparing unlike things: Averages ignore the fact that some things just aren’t comparable. Exams and homework play very different roles, and the meaning of grades is different in each one. So what sense does it make to add those scores together?
penalizing failure: One of the best ways to learn is to try, fail, and try again. In an average-based system, any mistake a student makes hurts her grade. So a student may not want to take that risk — a risk that may be necessary to truly understand the material.
eliding important differences: Under the points-accumulation system, a student who does poorly throughout the semester but by the end of the semester has learned the material thoroughly may get the same final grade as a student who muddled through the whole time. For example, the following three functions have the same average, but the one on the left is the ideal for a course and the other two are almost an anti-ideal.
not incentivizing complete work: Let’s say you’re the general of an army fleeing a much-larger and better-equipped enemy force. You need to cross a large river. You task your engineers with building ten bridges across the river, but their skills and the available time and materials mean they cannot complete the task. Would you rather they built ten partial bridges (say each bridge is 80% complete) or eight full bridges? It seems obvious that you want as many full bridges as you can get. Even six complete bridges (“only 60% average”) would be preferable to ten 80%-complete bridges (“80% average”). Yet an average-based system ranks the ten partial bridges higher.
incentivizing the wrong things: Many students have told me over the years that their strategy for getting a good grade in one of my classes is: “bank” points early, then check out later in the semester. With an average/points based system, this is a reasonable strategy. In terms of learning, though, it’s just about dead wrong. In almost every course I teach, the last few weeks of material are the most interesting and most critical for future courses to build upon. So a student who banks enough points early doesn’t get the main point of the course. A grading system that regularly gives Bs or even As to students who don’t get the main point is not doing its job.
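To make the arithmetic behind two of these shortcomings concrete, here is a small Python sketch. The three score trajectories are hypothetical numbers chosen for illustration, not data from any course; the bridge figures come from the example in the list above. The helper name `usable` is likewise an assumption of this sketch.

```python
from statistics import mean

# Hypothetical percentage scores on five assignments, in chronological order.
improving = [40, 55, 70, 85, 100]  # struggles early, masters the material by the end
flat = [70, 70, 70, 70, 70]        # muddles through the whole time
declining = [100, 85, 70, 55, 40]  # banks points early, then checks out

# An average cannot tell these three students apart.
for name, scores in [("improving", improving), ("flat", flat), ("declining", declining)]:
    print(name, mean(scores))

# The bridge example: average completion versus bridges you can actually cross.
ten_partial = [0.8] * 10           # ten bridges, each 80% complete
six_full = [1.0] * 6 + [0.0] * 4   # six complete bridges, four never started


def usable(bridges):
    """Count the bridges that are fully complete."""
    return sum(1 for completion in bridges if completion >= 1.0)


print(mean(ten_partial), usable(ten_partial))
print(mean(six_full), usable(six_full))
```

All three trajectories average 70, and the ten partial bridges have the higher average completion (0.8 versus 0.6) while providing no usable crossing at all.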

For these reasons, I have moved away from a points-average system to specifications grading.

how specifications grading works (in my courses)

a sample syllabus: MA 425 (real analysis)

For each non-failing letter grade (A, B, C, D), there is a list of specifications. A student who meets every specification on the list for a grade receives that grade.

For example, in my MA 225 course, some of the specifications are:

For a C
- get at least 50% of the points on each of two exams
- submit a proof by contradiction
For a B
- get 50% of the points on one exam, 70% on another, including 70% on the final exam
- submit a proof by complete induction
For an A
- get 70% of the points on one exam, 85% on another, including 80% on the final exam
- submit a proof that something is unique

These specifications are cumulative: to get a B, a student must complete both the C and B specifications; to get an A, the student must complete the C, B, and A specifications.
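The cumulative rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the spec identifiers (`exams_c`, `uniqueness_proof`, and so on) are hypothetical labels for the sample specifications above, D is left out because no D specifications are listed here, and the function `course_grade` is not part of any real gradebook.

```python
# Specifications per tier, lowest tier first. The identifiers are hypothetical
# labels for the sample specifications; each one is simply "met" or "not met".
TIERS = [
    ("C", {"exams_c", "proof_by_contradiction"}),
    ("B", {"exams_b", "proof_by_complete_induction"}),
    ("A", {"exams_a", "uniqueness_proof"}),
]


def course_grade(met):
    """Return the highest tier whose specifications, together with those of
    every lower tier, are all contained in `met`. Return None if even the
    lowest tier listed here is not fully met."""
    grade = None
    for letter, specs in TIERS:
        if not specs <= met:
            break  # cumulative: a gap at this tier blocks every higher tier
        grade = letter
    return grade


# A student who meets the C and A specifications but skips the B ones.
print(course_grade({"exams_c", "proof_by_contradiction",
                    "exams_a", "uniqueness_proof"}))
```

In this sketch the student who skips the B specifications ends up with a C, even though the A specifications are met, because tiers cannot be skipped.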

pass/fail (with multiple tries) where appropriate, points where appropriate

The principle of binary specifications — you meet them, or you don’t — also holds at the assignment scale. In the math major courses I teach, the vast majority of the coursework consists of writing proofs. In a very basic philosophical sense, a given proof is either correct, or it is not correct. Writing “8/10” on a proof doesn’t really reflect that the proof is “80% correct”; instead (when I write it), this usually means something like “there were a number of mistakes, but they were minor and I’m willing to accept them”. Whether this constitutes useful feedback to the student is unclear. It is also difficult for points to distinguish between a proof that captures the content correctly, but is incorrect for formal reasons, and a proof that has good formal properties but is lacking on the content side.

Having adopted a specifications grading scheme, I am freed from the need to report a numerical grade for each assignment, which opens up other possibilities.

In MA 225 (starting Fall 2015) and 425 (starting Fall 2018), I give each proof a mark of S (“satisfactory”) if it is nearly perfect. If not completely correct, the S proof has at most minor flaws of formatting or language. S proofs are the kind of proof that nearly every professor would recognize as correct. Only S proofs count toward the final grade.

As with any pass/fail scheme, this might seem unduly harsh (if not impossible for a student to succeed in). To reward incremental progress, I have another possible mark: P (“progressing”), which indicates that the proof is incorrect, but can be salvaged. P proofs can be resubmitted without penalty.

For proofs that have no hope of being corrected, I mark U (“unsatisfactory”) — these are not counted against the student, except as missed opportunities.

One of the specifications for the course is then that the student must get a certain number of S marks during the semester. In MA 225, the number of Ss required for a course grade of A is around half of the total proof opportunities. That may seem low if you’re used to thinking in terms of averages, but getting an S mark requires the proof to be nearly perfect.

The S/P/U system is not appropriate for all assignments, though: quizzes and exams are graded using points.
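The bookkeeping for S/P/U marks on proofs might look like the following Python sketch. It assumes that a resubmitted proof is judged by its most recent mark; the student record, the proof names, and the function names are all hypothetical, and the required number of S marks is passed in as a parameter rather than taken from any actual syllabus.

```python
from enum import Enum


class Mark(Enum):
    S = "satisfactory"    # nearly perfect; counts toward the final grade
    P = "progressing"     # incorrect but salvageable; may be resubmitted
    U = "unsatisfactory"  # cannot be salvaged; a missed opportunity, no penalty


def count_satisfactory(history):
    """Count proofs whose most recent mark is S.

    `history` maps a proof name to the list of marks it received, in
    submission order (one entry per submission or resubmission).
    """
    return sum(1 for marks in history.values() if marks and marks[-1] is Mark.S)


def meets_s_requirement(history, required):
    """Return True if the student has earned at least `required` S marks."""
    return count_satisfactory(history) >= required


# Hypothetical student record, for illustration only.
history = {
    "proof 1": [Mark.P, Mark.S],  # resubmitted and accepted
    "proof 2": [Mark.U],
    "proof 3": [Mark.S],
    "proof 4": [Mark.P],          # not yet resubmitted
}
print(count_satisfactory(history))
```

Note that P and U marks never subtract anything here: only the count of S marks matters, which mirrors the idea that failed attempts are not held against the student.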

results

Since implementing the specifications and S/P/U systems in MA 225, I have seen a dramatic rise in the number of students who complete the course successfully. This isn’t because the scheme makes getting an A easier; subjectively, the quality of student work seems to have jumped. Because even a C student has to get some near-perfect proofs, this increase in quality is perhaps most dramatic among students who wind up getting a C.

references

I learned about specifications grading through a reading circle offered by the Office of Faculty Development; we read the book Specifications Grading by Linda Nilson. Robert Talbert’s thoughts have also been very useful to me.

Andrew Cooper

senior lecturer of mathematics