Striking out

by Russ Roberts on April 8, 2008

in Data, Sports

Sometimes I get depressed about the quality of statistical work in economics. Then I read something from another social science. Here is a recent study where psychologists find that having the initial "K" increases your chance of striking out when playing professional baseball. Why? Well, it’s obvious, isn’t it? The letter "K" is used when keeping score in baseball to represent striking out. So it’s obvious now, isn’t it? Still don’t get it? Neither do I. But hey, it’s in the data. Between 1913 and 2006, players with first or last initial "K" struck out 18.8% of the time, compared to 17.2% for the fortunate players unhandicapped by their initials. Here is the "explanation" of the authors:

Despite a universal desire to avoid striking out, K-initialed players strike out more often. For those players, we argue that the explicitly negative performance outcome may feel implicitly positive. Even Karl “Koley” Kolseth would find a strikeout aversive, but on the whole, he might find it a little less aversive than players who do not share his initials, and avoid it less enthusiastically.

But why? Why would having the initial "K" make striking out more pleasant? I just don’t get it. The authors go on to "test" their theory by looking at grades of a sample of MBA students:

The MBA students in our sample are well aware of a direct connection between academic performance and successful job placement. Nevertheless, despite the pervasive desire to achieve high grades, students with an unconsciously-driven fondness for C’s and D’s were slightly less successful at achieving their conscious goal.

That is, Charles Darwin received poorer grades than Alan Alda. But it turns out that Alan Alda didn’t do better than the non-ABCD initialed:

Interestingly, A- or B-initialed students did not perform better than students whose initials were grade-irrelevant. There are two possible explanations for this. First, students with grade-irrelevant initials may already be maximally motivated to succeed. Second, because performance is determined by motivation and ability, any increased motivation to succeed that arises from having initials that match positive performance outcomes may not necessarily translate into increased performance.

There is, of course, a third explanation: there is no real relationship and the authors have been fooled by randomness. Yes, their results are statistically significant. But how many relationships did they explore before finding the ones that were statistically significant? And how many relationships are there to explore? To really test the theory, you’d have to look at baseball players with the initial "E" and see if they commit more errors than others. You’d have to look at guards in the NBA to see if those with initials "A" have more assists. Centers whose initials include an "R" should be better rebounders. You’d have to look and see whether students with the initials IC were more likely to take an "incomplete" in a class.
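To put a rough number on that worry, here is a minimal sketch in Python. It assumes a researcher checks each of the 26 initials once at the usual 5% threshold, that the checks are independent, and that no initial has any real effect. The function names and the count of 26 tests are illustrative assumptions, not anything taken from the paper.

```python
import random


def familywise_error_rate(num_tests, alpha=0.05):
    """Chance that at least one of num_tests independent null tests looks significant."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulated_error_rate(num_tests, alpha=0.05, trials=20000, seed=0):
    """Monte Carlo check: under the null, each p-value is uniform on [0, 1)."""
    rng = random.Random(seed)
    studies_with_a_finding = 0
    for _ in range(trials):
        # One "study": draw a p-value per test and see whether any falls below alpha.
        if any(rng.random() < alpha for _ in range(num_tests)):
            studies_with_a_finding += 1
    return studies_with_a_finding / trials


def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test cutoff that keeps the overall false-alarm rate at or below alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    letters = 26  # hypothetical: one test per initial
    print(f"analytic:  {familywise_error_rate(letters):.3f}")
    print(f"simulated: {simulated_error_rate(letters):.3f}")
    print(f"Bonferroni cutoff: {bonferroni_threshold(letters):.5f}")
```

Under those assumptions, roughly three out of four such searches turn up at least one "significant" initial even when nothing is going on. A Bonferroni-style cutoff, the threshold divided by the number of tests, is one common way to correct for that.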

I guess Rabbi Jonathan Sacks, the Chief Rabbi of England, should have been a football player. Or maybe he just gets fired more often than the average Briton because it doesn’t bother him as much as someone with a different last name.

Did Kafka know baseball scoring? Does this explain why he found success in life so difficult? Is this why he named a character "K"?

Do players whose initials are a backwards "K" strike out looking more than the average?

Comments

  • Yeah, you are right, this was a very good post. Thanks for it.


  • Dang... I just don't think I can get enough sports. Shhh, my wife is coming. lol Hey, thanks for the post and for satisfying my need to "feed on sports" info. Kenney

  • I agree, this is the way we should all run our lives, sports, personal or otherwise.


  • Wow. I am amazed at the effort that we as a society go to, to prove a point by a point (% point, that is). I once had a boss who asked me to prepare a report to prove the worth of his proposal to top management by using statistics gathered from manufacturing records. I told him, tell me what you want to see and I will make it happen. The figures were all true, just presented in a different manner.

  • bee

    A fine example of junk science. This paper would be an F in a methods class. An example of spurious correlation. I guess if there is a consensus then it is correct.

  • Paris

    Did they make some sort of Bonferroni adjustment for multiple comparisons?


    If they think they have discovered a "significant" correlation, they should test the hypothesis prospectively.


    Assuming this wasn't an April Fools study.

  • Are you sure this wasn't an April Fools study?

  • brian

    "The progressive thing to do for the sake of equity would be to allow players with K's in their names to have more strikes before being out.


    Posted by: Justin Ross"


    Assuming you say this in jest, am I to understand that you oppose handicaps in golf because they are progressive?

  • brian

    Skepticism is always good, but one should examine the evidence at least before concluding that it's bunk!


    I learned about these findings years ago. This paper is a new study that came out last year, replicating the results of the first one. This result has been replicated time and time again from different data sets, so it deserves some attention.


    It's possible there's a different explanation, but the fact that people with the letter K in their initials strike out more often has been shown many many times. Just as the result that people with C's and D's in their initials get more C's and D's has been shown many different times.

  • Christopher W.

    Would this work in reverse? Would pitchers with a K be better hurlers? Worked for Kevin Brown. Not so much for Knolan Ryan. :)

  • Mike

    It's funny these guys are both from management schools. I'm a development econ student currently taking an international finance class in the business school, and I'm doing a regression on foreign exchange rates for a project we're working on.


    The instructions for the regression analysis are ridiculous. They ignore autocorrelation and multicollinearity effects, and when I brought this up to my groupmates and the professor, it was clear that none of them knew anything about stats past how to run a regression in Excel. Meanwhile, I've only got two methods classes under my belt compared to my instructor's PhD.


    It makes me think I could make a fortune in the finance world as one of the only competent statisticians.

  • Bill

    Unfortunately, that was from a business school not a school of social science. It could easily have come from a medical school. I stopped paying attention to reports of medical findings because most of them are innumerate as well.


    My question is how do referees let this through? Now THAT's scary.


    Several flaws in the report are pretty obvious. First, "Kingman," a player from the 1970s, alone accounts for nearly a third of the deviation from the null hypothesis. A cursory review from a similar (but not identical) data set suggests they used a binomial distribution with all batters in the same letter category having the same mean strikeout rate. Since there is significant variation among individual players, the null is almost certain to be false once you partition your data set into the smaller subsets. Kingman easily skews "K". In fact, I found huge deviations for every single letter. Their model was wrong; every subset was skewed because there are not enough individual players for the differences to wash out.


    For grades, there was no difference between A&B, nor between C&D. That should have been the end of it. Besides, the absolute difference between AB and CD was about 0.02 with a mean around 3.4. That's one letter grade in 50 for a poor CD? That's a tiny effect even before being swamped by different standards in different classes and schools. Besides, with an average around 3.4, just how many grades of C, let alone D, could have been in that data set? I'd have expected A-B to be the much bigger contributor, but it's not there.

  • The progressive thing to do for the sake of equity would be to allow players with K's in their names to have more strikes before being out.

  • Mesa Econoguy

    Russ’ kurtosis precludes that.

  • noahpoah

    It just occurred to me how odd it is that Russell Roberts, or R.R., which is to say, R-squared, isn't a bigger fan of regression.

  • dave smith

    I wonder if the predicted values from their regression went anywhere near the mean of the actual data.....




    ....sarcasm, of course, as this is a property of all regressions.

  • Grant

    After some initial bewilderment, my first thought was that people from different ethnic groups and cultures were more likely to have certain initials than the rest of the population.


    Did they control for race and culture at all? I'm assuming they at least controlled for gender?

  • mpk...HA...i love it; perfect point.

  • mpkomara

    Am I an asshole for suggesting that Latin American baseball players are less likely to have a K in their name than descendants of Eastern European families, and perhaps it is a cultural phenomenon that the former group are less likely to strike out than the latter? (I have two K's in my last name, and that's only one out away from retiring the side.)

  • Methinks

    My question is, how in the heck did anyone even think to ask such an asinine question?


    Two words:


    Grant money

  • save_the_rustbelt

    I'm reminded of the old joke about the economist who drowned in a river with an average depth of six inches.


    Economists (yes even here) torque around numbers with the best of them, and even engage in selectivity designed to mislead.

  • Marcus

    I'm wondering. How do they explain variations from the mean for other letters?


  • Matt C.

    My question is, how in the heck did anyone even think to ask such an asinine question? Are psychologists really that short on topics about which they can write? Honestly, who even consciously or unconsciously automatically associates initials with a scoring metric?

  • tw

    Truly a study devoid of merit and one that wasted resources. I suppose now Dave Kingman will go on the ESPN lecture tour and claim that his massive strikeout ratio wasn't really his fault...he was inherently doomed from birth.

  • dave smith

    Av300, I'll second the point about not being discouraged by your small sample size. Just put yourself in the data 1,000 times and you'll have a big sample.

  • Jack

    And he teaches at Yale? Really? Somebody needs to point him to Andrew Gelman's papers and others in that literature (Bonferroni bounds, Hal White's data snooping tests....)

  • John S.

    This was discussed on various blogs related to baseball and statistics about six months ago. For some interesting insights, see this post: http://www.hardballtimes.com/main/blog_article/...>

    They found that yes indeed, batters with an initial K struck out slightly more often than average. But there were eight other initials that were even worse.

  • "Centers whose initials include an "R" should be better rebounders."


    Dennis Rodman... although he was a forward. Coincidentally, his middle name is Keith.

  • Methinks

    Don't be put off by your small sample size, Avatar!


    These great researchers certainly wouldn't let a thing like that stand in their way!

  • I wonder what happens when you throw Kevin Mitchell out. But seriously, what percentage of players had K initials? The smaller the percentage, the less statistically significant the difference.

  • Oops, "do include" should be "do not include".

  • My last name starts with a "K" and I never liked striking out. And I was generally more patient at the plate and took more walks than my teammates, but my initials do include "P" or "W".


    Of course, I'm a pretty small sample size.

  • Randy

    Noticed that the authors are from schools of management. Wondering if they are in training to be pointy-haired bosses or being paid to turn out pointy-haired bosses.

  • PaulD

    This is an obvious example of data mining. Although it is obvious in this example, there are many similar examples that are not obvious to others. Just pick up books or articles on picking stocks, and one can find all sorts of examples of data mining.

  • Stretch

    Were pitchers over-represented in this population? And if so, would a K named pitcher strike more batters out?


    The sad part is I'm sure these guys take themselves way too seriously.


    "These findings provide striking evidence that unconscious wants can insidiously undermine conscious pursuits."


    I can't decide whether to laugh or cry.

  • marysienka

    Articles like this make me chuckle. Thanks, Russ!

  • Methinks

    These guys need to be introduced to the word "spurious" and forced to pay back the federal grant they undoubtedly received to engage in this nonsense.

