A couple of months ago, the “Hey Bill” section of this site hosted an extended discussion of elitism in higher education which eventually meandered its way to the legal profession and the selection of Supreme Court Justices. I have zero interest in that discussion.[*] What did interest me was the small tangent devoted to the Supreme Court Clerks who serve the Justices. Bill referenced a biography of Harry Blackmun (an excellent read regardless of your interest in legal theory), who allegedly preferred to take clerks from non-elite law schools.[†]
Around the time of the discussion here, current Justice Antonin Scalia was making news for his response to a question posed to him during an appearance at American University’s Washington College of Law. A law student had asked what her chances were of becoming a clerk to the Court. Scalia said, simply:
“By and large, I’m going to be picking from the law schools that basically are the hardest to get into. They admit the best and the brightest, and they may not teach very well, but you can’t make a sow’s ear out of a silk purse. If they come in the best and the brightest, they’re probably going to leave the best and the brightest, OK?”
Supreme Court Justices are, despite the short terms each year, incredibly busy people. Scalia would rather make 2-3 additional law school appearances or, say, go hunting with highly placed politicos, than he would sort through hundreds of applications and hold countless interviews to hire someone who, at the very end of the day, is probably fungible with half of the applicants.[‡] There’s little or no benefit in digging another layer, and there’s quite a fair deal of pain in doing so. Scalia would rather use a proxy, in this case admission to (and success at) one of the top law schools in the country.
And finally, we come to the point of the article: proxies. Proxies are a form of shortcut, of amassing important information in as small a package as possible, and, like most everything else in life, they can be wonderfully useful, or downright horrific. Proxies allow us to judge nebulous concepts like popularity, to complete simple tasks like grocery shopping and speed up the mundane activities that might otherwise consume our days. They also serve a nefarious role in genocides, racism and (sure, why not go back to the source) elitism. Like most forms of data or analysis, they are only as useful and good as the people who create and rely on them, a demanding standard and one of which we should always be mindful.
A good proxy meets some very basic rules:
1.) A proxy is an easily digested piece of information, usually with some independent significance,
2.) which accurately represents a significant amount of the broader idea or thing’s (hereinafter “referent”) substance, with…
3.) a large decrease in the amount of explanation, discovery, research or investigation required to otherwise understand the referent’s substance…
4.) without inserting confusing, incorrect or useless information into the discussion.
Bad proxies tend to violate at least one of those four rules. Some of the worst violations occur when the independent significance of the proxy somehow becomes flipped with the referent’s substance. Take, for instance, the ever fun-to-discuss topics of abortion and political parties. A long, long time ago (like the ’50s), there were maybe 5-7 issues that defined political parties, at least platform-wide. The parties were formed and congealed around these issues, and members of those parties tended to agree with most every stance that the larger party took on its platform. Thus, when someone said that they were a “Republican” or a “Democrat,” those terms were proxies for a basic set of beliefs. Republicans hated commies and Democrats loved weed, or at least this is what I’ve been led to believe by the movies of the era. I would do more research, but the basic premise is there, and frankly, this article is all about being “lazy” when it suits you to get to a much larger payoff in the end.
Unfortunately, Vietnam happened, and Roe v. Wade happened, and Woodstock and the Iran-Contra scandal and Murphy Brown and Monica Lewinsky and “The Pet Goat” and Bubblepalooza and everything else happened. So now the political scene is some sort of Bruegel painting, with basket-headed wraiths fencing fat dudes on kegs, people starting bonfires in the background while small children throw up the 16th century equivalent of gang signs in the fore. There are a million different issues and no one feels the same about any three of them, let alone the whole group.
And we still have two parties. Worse, people still treat the monikers for these loose groupings as if their worth as proxies never changed, and in fact, have so strengthened the bond between the proxy and the referent that they’ve effectively switched their places. For example: abortion. Let’s say you’re pro-choice, just for argument’s sake. Most people would consider you, thus, a Democrat. They have used your belief as a proxy for what they believe is a greater referent: political affiliation. Similarly, most people consider anyone who is pro-life a Republican.
See what has happened here: the terms “Republican” and “Democrat,” formerly proxies for the underlying beliefs of those parties, have become referents for which the underlying ideas have become (particularly unsatisfying and frequently incorrect) proxies. This happens far more often than one might think, especially over the passage of time. Proxies are easier than referents (or else they wouldn’t be needed or used), and so they become more widely adopted. Eventually, as with the true origins of all fairy tales and children’s stories, the referents fade away, leaving only the proxies behind, to become the new referents for proxies that look strikingly similar to the original referents.
At this point, some of you will note I’ve spent over 1300 words talking about NOT baseball. Please forgive, and move on to the next paragraph.
Proxies obviously have a gigantic role in baseball. The most famous baseball proxies are naturally among the worst: ERA as proxy for pitchers’ performance, Batting Average for hitters’, stolen base numbers for speed or baserunning ability. Fielding percentage, W-L record, SAVES!!! Basically, every old-school statistic is an exercise in terrible proxying.
Still, we are moving forward and I tend to think of proxies as one of the two major goals of all work being done in baseball right now by statisticians and sabermetricians, whether publicly or privately. The first goal is a fundamental understanding of the underlying referent, which can best be seen in the incredibly vast array of fielding research done over the past few years. The introduction of new cameras will only speed this work, allowing those interested to figure out what a good play looks like, what a shortstop’s expected range is, and whether positioning matters more than skill. However, once all that work is done, or has reached a certain point, the second tier goal will be the creation of proxies, to allow simple folk like me to understand the work behind the curtain. Metrics like UZR, +/-, Fielding Win/Loss Shares, etc. are just the current exemplars of what I expect to be many more attempts at good fielding proxies.
Of course the real problem is that, just like in any other field, the movements in these areas will generally be judged by their proxies, not their underlying work. If your fielding system spits out a single number (be it a UZR Rating, a +/-, or what have you) that doesn’t tend to jibe with the CW, you’ve lost half your intended audience before you can even mount a defense. Such are the vagaries of a proxied world.
Still, the proxies ARE getting better, and continue to do so. OBP improves upon BA. OPS improves upon OBP, OPS+ improves upon that, then EqA, wOBA and the lot. Recognize first that they all measure similar but distinct things. They have independent significance, and this is the first goal discussed above, where people attempt to measure an underlying referent. However, they also all exist as a chain of proxies for approximately the same thing: hitting performance. So while Batting Average may be a PERFECT representation of what it is measuring (number of hits a player gets per AB), and OBP, OPS, OPS+, EqA and wOBA might all be PERFECT representations of what they are measuring, that does not mean that they are equals as proxies. Somewhat more confusingly, the chain of proxies is not standard, depending on your intended referent.
As a very simplistic example, take Batting Average and On-Base Percentage. I think we can all agree that Batting Average pales in comparison to a stat as simple as OBP when serving as a proxy for batting performance. However, if you were looking for a quick and easy way to judge who’s going to be voted on an All-Star team, or who’s going to hit leadoff for any team before 1980, or who teams are going to grow overly attached to as players, Batting Average is a FAR BETTER proxy than OBP. This may sound stupid, because obviously we would prefer to create great proxies for positive concepts, but if you look around long enough, you’ll find that baseball (and life) is an equal opportunity game in this regard, and that great proxies for horribly negative or useless concepts also exist.
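For anyone who wants to see the gap between the two proxies in numbers, here is a minimal sketch in Python using the standard definitions of Batting Average (hits per at-bat) and On-Base Percentage. The two stat lines are invented purely for illustration and do not belong to any real player; the function and variable names are likewise my own.

```python
def batting_average(hits, at_bats):
    """Hits per at-bat; returns 0.0 for a hitter with no at-bats."""
    return hits / at_bats if at_bats else 0.0


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Standard OBP: times on base over the plate appearances that count."""
    denominator = at_bats + walks + hit_by_pitch + sac_flies
    if denominator == 0:
        return 0.0
    return (hits + walks + hit_by_pitch) / denominator


# Hypothetical season lines (made up for this example, not real players).
slap_hitter = dict(hits=180, walks=20, hit_by_pitch=2, at_bats=600, sac_flies=4)
patient_hitter = dict(hits=150, walks=100, hit_by_pitch=5, at_bats=540, sac_flies=5)

for name, line in (("slap hitter", slap_hitter), ("patient hitter", patient_hitter)):
    avg = batting_average(line["hits"], line["at_bats"])
    obp = on_base_percentage(**line)
    print(f"{name}: BA {avg:.3f}, OBP {obp:.3f}")
```

With these made-up lines, the slap hitter wins on Batting Average and loses badly on OBP, which is the whole argument in miniature: which hitter looks "better" depends on which proxy you pick and what referent you had in mind.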
Even as the referent stays the same, the value of various proxies may differ. That is, depending on where you are, or more importantly, WHEN you are, the relative merits of different proxies representing the same referent will change.
For instance, we’ve reached the halfway point of the season. Perhaps the most famous of “bad” proxies looms over this mid-season break: a team’s record. No doubt you’ve been constantly reminded that past performance is a much better indicator of future success than past success is. The concept is simple, and indeed, whatever measurement of past performance you use, be it run differential, expected runs created/saved, etc., the predictions are generally better when run with the measurements of performance, rather than the resulting record (presumably skewed by luck and any other number of things).
It’s important to note, though, that while a team’s record may be a poor proxy for future success, trumped by Pythag expectations and all their variants, a team’s record is an astoundingly GOOD proxy for a team’s place in the standings. In fact, it’s a near one-to-one representation. If you see a team with a good record, you can be reasonably sure they are doing well in the standings. I know; I’m setting the world afire with my deep and hard-hitting analysis here.
But the point remains the same. Listen to the talk over the break about how the Rays have the third best run differential in the league, and a strong second half is coming. But don’t forget that they have only 73 games to make up 4 games just to qualify as the Wild Card. And as time keeps going, any gap that remains unbridged in the ACTUAL RECORD makes run differential a less and less useful proxy for playoff worthiness, and actual record a better and better one.
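To make the shrinking value of run differential a little more concrete, here is a rough sketch in Python. It uses the basic Pythagorean expectation (runs scored squared over the sum of runs scored squared and runs allowed squared) and then, as my own simplifying assumption, projects a final win total by adding banked wins to the Pythagorean rate applied to the remaining games. The team numbers are hypothetical, the 162-game season length is assumed, and the exponent is left as a parameter because the variants differ on it.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Basic Pythagorean expectation; variants mostly change the exponent."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def projected_final_wins(wins, losses, runs_scored, runs_allowed, season_games=162):
    """Banked wins plus the Pythagorean rate over the games still to play.

    This blend is an illustrative assumption, not an established projection
    system: wins already in the standings count in full, and run differential
    only gets a say over what is left of the schedule.
    """
    remaining = season_games - wins - losses
    return wins + remaining * pythagorean_win_pct(runs_scored, runs_allowed)


# Hypothetical team with the same run totals at two points in the season.
midseason = projected_final_wins(wins=45, losses=44, runs_scored=450, runs_allowed=400)
late = projected_final_wins(wins=75, losses=77, runs_scored=450, runs_allowed=400)

print(f"Pythagorean win pct: {pythagorean_win_pct(450, 400):.3f}")
print(f"Projected wins with 73 games left: {midseason:.1f}")
print(f"Projected wins with 10 games left: {late:.1f}")
```

The fewer games that remain, the less room the run differential has to move the final total, so the actual record does more and more of the talking.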
San Francisco and Colorado may have massive run differential advantages over the rest of the NL when it comes to Wild Card standings, but the race in actual record remains much tighter.[§] With so many teams behind them within range, and the statistical likelihood that 1-2 will grossly outperform their run differential, the weight of past performance as a proxy is substantively weakened. The baseball season acts as one large time period, and proxies gain and lose meaning even within the same season. Run differential is but one example, though a telling one.
Proxies can be incredibly useful, and I challenge you to go one waking hour without at least four or five of them speeding your progress. However, they must always be kept in context. After all, if there’s one thing you can be sure of, it’s that the proxy’s worth is constantly changing, whether it’s because a better proxy is being created and used somewhere, or because the referent is changing, or simply because the context of the referent has made a different proxy more valuable. And always remember, you may not be able to make a sow’s ear out of a silk purse, but you can definitely make a playoff team out of a bunch of guys with a mediocre run differential. Ask these guys.
[*] That’s probably neither entirely true nor entirely fair. I believe, as a lawyer, that the legal profession is a sort of world to itself and that people from outside the profession have an incredibly hard time understanding the inner workings and puzzle piece mechanics of how everything and everyone from a contract lawyer’s temporary paralegal to the nomination of a Supreme Court Justice fits together. It’s a stupid, irrational, complicated mess that is duplicated in no other profession. People are constantly trying to “fix” it, with one of two ultimately failed strategies: either have a host of non-lawyers set up some form of committee to enact legislation, or gather a group of lawyers with “pure hearts” to come up with some binding solution. The first is rife with misunderstanding and waste, the second with self-interest and bias.
Bill’s suggestion that the Senate refuse to confirm Supreme Court nominees from Ivy League or Top Ten law schools is a lark, and a silly one. I doubt he intended it as some national movement, though, so it’s not a giant issue. And yes, this extended endnote was basically a Dibble-esque rant about how non-lawyers simply can’t understand the game. Except I’m right, and listening to non-lawyers discuss the legal field is the only thing worse than actually being a member of same.
[†] I say “allegedly” because he frequently hired clerks from elite schools and perhaps his most famous clerk, Harold Koh, was the son of a Korean legal scholar who taught at Yale and saw Harold go Hopkins School-Harvard-Oxford-Harvard Law before the clerkship. Koh was, until recently, the Dean of Yale Law School, and was confirmed two weeks ago as the Legal Adviser for the Department of State. Not exactly a humble upbringing, nor a shining example of Blackmun’s willingness to look for diamonds in the rough.
[‡] I actually like Scalia, even as a Justice, despite having what I assume are directly contradictory beliefs in all arenas, from politics to legal philosophy to the use of the designated hitter.
[§] Not shockingly, now that interleague play is over, 8/14 AL teams have positive run differentials, and 6/16 NL teams do. The NL boasts the two worst offenders, and is -140 as a league, despite having a team that is 20 runs ahead of second place.