Tom Goldstein has predicted that Justice Stevens will retire at the end of the term. He’s getting his own sitcom, so it must be true. In honor of Stevens’ looming retirement and the attendant circus, this week’s installment of the 10th Justice considers how FantasySCOTUS users predicted Justice John Paul Stevens would vote in the 14 cases that have been decided this term.

For this post, we will be using outcome percentages and standardized majority ratios (SMR), along with their respective confidence intervals. Confidence intervals are analogous to the margins of error reported in polls. For outcome percentages, the confidence interval determines how far the percentage needs to be from 50% before we can say definitively which outcome users predict. For SMRs, the confidence interval determines how far the SMR needs to be from 1 for the difference to be statistically significant.
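As a rough illustration of the outcome-percentage check described above, the sketch below computes a margin of error for a share of predictions using the standard normal approximation for a proportion. That formula is an assumption made for illustration; this post does not say which interval formula FantasySCOTUS uses. The function names and the prediction counts in the usage note are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def outcome_interval(votes_for, total, level=0.95):
    """Return (low, high) for the share of users predicting an outcome.

    Assumption: normal-approximation interval for a proportion. The post
    does not state the formula actually used, so this is only a sketch.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 < level < 1:
        raise ValueError("level must be between 0 and 1")
    share = votes_for / total
    # Two-sided critical value for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sqrt(share * (1 - share) / total)
    return share - margin, share + margin


def is_determinative(votes_for, total, level=0.95):
    """True when the whole interval lies on one side of 50%."""
    low, high = outcome_interval(votes_for, total, level)
    return low > 0.5 or high < 0.5
```

With hypothetical counts, 60 of 100 predictions for reversal clears 50% at the 95% level but not at the 99% level, which is the sense in which a result can be significant at one confidence level and not at another.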

All confidence intervals depend on a confidence level, which is the likelihood that the true value lies within the interval. Confidence levels are indicated directly next to the outcome CIs, while the SMR intervals assume a confidence level of 95%. The information for both metrics and their confidence intervals is contained in tables for the cases, grouped according to some properties observed in their statistics.
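The SMR reading used throughout the sets below reduces to asking whether the 95% interval around the SMR contains 1. The sketch below takes the interval bounds as given, since the SMR calculation itself is not spelled out in this post. The function name, the labels it returns, and the bounds in the usage note are hypothetical.

```python
def classify_smr(lower, upper):
    """Classify an SMR from the bounds of its 95% confidence interval.

    Assumption: the bounds are supplied by the caller; how the SMR and
    its interval are computed is not described in the post.
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if upper < 1.0:
        # Significantly below 1: likely to withhold his vote from the majority.
        return "below 1"
    if lower > 1.0:
        # Significantly above 1: likely to join the majority.
        return "above 1"
    # Interval contains 1: no significant difference.
    return "not significant"
```

For example, a hypothetical interval of (0.4, 0.8) would be read as significantly below 1, (1.2, 1.9) as significantly above 1, and (0.8, 1.3) as not significantly different from 1.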

First set:

These five cases can be understood as the result of properly measured statistics. In all of the cases, the outcome was correctly predicted by a majority of FantasySCOTUS members at a 99% confidence level. As shown by the widths of the confidence intervals, the cases vary in the number of predictions they received. However, the most interesting aspect is that Stevens’ SMR in each case telegraphed the possible outcome. Citizens United fell along partisan lines, but Stevens’ SMR in the case indicated that he was likely to withhold his vote from the majority (the difference below 1 is statistically significant), and given the tone of his dissent, that was certainly the case. The other four cases, with SMRs significantly above 1, indicated that Stevens was likely to “defect” to the “conservative” majority. The outcomes support the inference drawn from the statistics, since all four of those cases were unanimous decisions.

Second set:

Both NRG and Florida v. Powell are interesting cases in that the majority of users predicted the outcome, but the margin was too close for either case’s outcome to be stated determinatively. Both cases also share another trait: Stevens’ SMR is not significantly different from 1, meaning that his decision falls along partisan lines. In NRG, however, Stevens was the lone dissenter, possibly indicating that the other liberal Justices “defected” into the majority while Stevens remained as the loyal opposition, which only one user predicted. Powell reinforces the defection idea, because Stevens is joined by Breyer in that case, which indicates that Ginsburg and Sotomayor “defected” from partisan lines. Overall, the two cases were very close, so more predictions could have refined the information shown in our statistics.

Third set:

These cases share some statistical traits with the second set of cases. Two of the cases are barely significant at the 90% confidence level, while Kucana is not statistically significant. All three cases were unanimous decisions for reversal. The 49% for reversal in Kucana is not troublesome due to its lack of statistical significance. However, Stevens’ SMR for all three cases indicates that he was well within the interval for falling along partisan lines. The main piece of information gleaned from the SMRs and Stevens’ presence in the majority with the other Justices is that for some cases facing the Supreme Court, there are no real distinctions in the law between a “liberal” and “conservative” ideology. In these cases, the majority consists of Justices following clear procedure and precedent instead of partisan ideologies.

Fourth set:

The main thread connecting these cases is that Stevens’ SMR was wrong. In all of the above cases, the SMR was significantly above 1, at levels indicating that predictions strongly leaned toward Stevens joining the other Justices in the majority. In three of the four cases, the majority of users predicted the wrong outcome at a statistically significant level. Although it is easy to assume that predictions are inherently flawed, another possible, and likely, explanation is that the assumptions underlying the SMR are wrong, such as the dominant ideology in a specific case being “liberal” instead of “conservative.” This idea is further supported by SC v. NC, in which Stevens joins the majority along with Breyer.

When the necessary assumptions of a model are violated, it opens a statistical Pandora’s Box, and the results are subject to unpredictability and distortion. Finally, odd things just happen in Wood, where users predicted the correct outcome, but no one could have expected Kennedy to join Stevens in the minority.

Overall, the predictions were correct in the majority of the cases, and did a decent job of pinning down how Stevens would vote in a given case. It is important to recognize the complexity of a Supreme Court decision, observers’ thought processes, and how Justices make decisions. Even Justices who have served as long as John Paul Stevens can still surprise us with every new decision. Situations such as the last four cases discussed are important for realizing the limitations of models and statistics for these purposes. While there are occasional failures, it is important not to discard out of hand a model that does predict the outcome most of the time. Rather, situations that perplex the model should lead to its refinement, so that it overcomes those limitations and becomes more reliable.

Many thanks to Corey Carpenter for his fantastic assistance with this column.