My original plan was to look only at future innings in which the PR (or the original hitter, if he wasn't pinch run for) came to bat. This would reduce the variance inevitably created by considering the opponents' runs, or runs in innings when this lineup spot didn't come to bat. Then, by finding the probability of this spot returning to the plate in each situation, I could isolate the negative side of the PR effect and combine it with the estimate of the positive side that I already have. I was hoping this would let me separate the effect of pinch running from that of the defensive replacement that surely accompanies it, but the probability of returning to bat depends on the opponent's run scoring, which in turn depends a bit on the defense. The main problem, though, would be connecting these separate analyses of the good and the bad of pinch running with a sensible variance estimate that allows for the dependence between the two models.

So I scrapped this idea and decided to start by just fitting a multiple logistic regression with win/loss as the binary response, hoping that the noisiness of the data would be offset by the fact that I had the data for games all the way back to 1952. The predictors I considered were:

- I.PR: an indicator of whether there was a pinch runner,
- lead: the score difference between the two teams (either -1, 0, or 1),
- inn: top 8th, bottom 8th, top 9th, or bottom 9th (I lumped extra innings in with the 9th inning),
- outs: the number of outs when the runner first got on base (this is categorical because there's no reason outs would be related linearly to the log odds of winning),
- I.2nd: an indicator of whether the runner was on 2nd base (the alternative is the reference level, 1st base),
- player: identity of the original runner (for parsimony I'm lumping all pinch runners together as being fast guys),
- opp: the opposing team. This doesn't make much difference: unless you control for which pitcher you're facing, the variation in opposition even within the same team is pretty big, because of different pitchers across different generations.

The main effect of player measures mostly how good his team is, particularly the players hitting immediately after him. A player effect of zero means his team is about average. The main effect of I.PR measures how much pinch running for the average player increases the log odds of winning. Note that this "average" is not the same as the other one: this one refers more to the speed of the player himself. It also incorporates how good a hitter he is, because he might return to the plate in extra innings, but as we will see, it's mostly about his speed, just as the player main effect is mostly about his team rather than himself.

But the interaction of I.PR and player provides information about pinch running for this player relative to the average; if it's positive, a PR should be employed for this player at least as often as the average player; if it's negative, the PR should be employed less often than the average. In fact if the sum of the interaction coefficient and the I.PR coefficient is negative, the log odds of winning the game are decreased when the player is pinch run for.
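
Written out, the model described so far looks roughly like the following. This is only a sketch: the Greek symbols are notation introduced here for illustration, lead is treated as categorical by assumption, and any interaction terms beyond I.PR × player are omitted.

$$
\log\frac{P(\text{win})}{1 - P(\text{win})}
= \beta_0
+ \beta_{\text{PR}}\,\text{I.PR}
+ \beta_{\text{lead}}
+ \beta_{\text{inn}}
+ \beta_{\text{outs}}
+ \beta_{\text{2nd}}\,\text{I.2nd}
+ \gamma_{\text{player}}
+ \alpha_{\text{opp}}
+ \delta_{\text{player}}\,\text{I.PR}
$$

Here $\beta_{\text{lead}}$, $\beta_{\text{inn}}$, and $\beta_{\text{outs}}$ stand for the coefficient of whichever level applies. In this notation the PR effect for a given player on the log-odds scale is $\beta_{\text{PR}} + \delta_{\text{player}}$, which is the sum referred to above.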

(Aside: the minor selection bias I mentioned in part 2 is exacerbated here. To estimate the player:I.PR interaction, we need all four pairwise combinations of win/loss with PR/no PR for each player, so any player who is lacking one of these combinations is deleted. By far the most likely to be missing are the two involving PR. About 55% of these deletions were loss/PR - a higher percentage of losses than the true PR population contains - so the PR main effect in my analysis has a positive bias. But my goal is to look at player-specific PR effects, and those are based on the sum of the PR main effect with the interaction term, a sum which should be invariant to any bias in the main effect - if the main effect is too high, the interaction estimate will just be lower to balance it out. I wouldn't expect the parameter estimates for the effects not involving PR to be biased.)
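
The deletion rule in the aside can be sketched in a few lines of Python. This is an illustration rather than the code used for the analysis: the `(player, won, pinch_run)` record layout, the function name, and the toy data are all hypothetical.

```python
from collections import defaultdict

def players_with_all_combos(rows):
    """Return the players observed in all four (won, pinch_run) combinations.

    `rows` is an iterable of (player, won, pinch_run) tuples; this layout is
    an assumption made for the example, not the actual data format.
    """
    seen = defaultdict(set)
    for player, won, pinch_run in rows:
        seen[player].add((bool(won), bool(pinch_run)))
    return {player for player, combos in seen.items() if len(combos) == 4}

# Toy data (made up): "A" has all four combinations, "B" has no loss/PR game.
rows = [
    ("A", True, True), ("A", True, False), ("A", False, True), ("A", False, False),
    ("B", True, True), ("B", True, False), ("B", False, False),
]
kept = players_with_all_combos(rows)
```

With this toy data only player "A" survives the filter, and "B" is dropped for lacking a loss/PR game, the most common kind of deletion mentioned above.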

I don't want to consider just one player at a time because the variance of the interaction coefficient estimates is too large to make an informed conclusion. But what I can do is average the interaction effects of many players together. Without looking at the data, I picked a list of 25 players who I thought were good hitters, but in general pretty slow runners:

- Berkman, Lance
- Bonds, Barry
- Cabrera, Miguel
- Dunn, Adam
- Giambi, Jason
- Guerrero, Vladimir
- Gwynn, Tony
- Helton, Todd
- Holliday, Matt
- Howard, Ryan
- Jones, Chipper
- Kent, Jeff
- Lee, Carlos
- McGriff, Fred
- McGwire, Mark
- Ordonez, Magglio
- Ortiz, David
- Palmeiro, Rafael
- Piazza, Mike
- Ramirez, Manny
- Rodriguez, Ivan
- Sheffield, Gary
- Sosa, Sammy
- Thomas, Frank
- Youkilis, Kevin
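
To see why averaging helps, here is a minimal Python sketch of the standard error of an average of coefficient estimates. It assumes the estimates and their covariance matrix are available from the fitted model; the function name and all numbers are made up for illustration.

```python
import math

def average_effect(estimates, cov):
    """Mean of coefficient estimates and the standard error of that mean.

    Var(mean) = (sum of every entry of the covariance matrix) / n^2,
    so correlation between the estimates is accounted for.
    """
    n = len(estimates)
    mean = sum(estimates) / n
    var_mean = sum(sum(row) for row in cov) / (n * n)
    return mean, math.sqrt(var_mean)

# Made-up numbers for three players, purely for illustration.
estimates = [0.30, -0.10, 0.40]
cov = [
    [0.25, 0.01, 0.01],
    [0.01, 0.25, 0.01],
    [0.01, 0.01, 0.25],
]
mean, se = average_effect(estimates, cov)
```

With these made-up inputs each individual estimate has a standard error of 0.5, while the average has a standard error of 0.3.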

The table below is based on the average of the 25 aforementioned players. It gives the probabilities of winning in each of the 66 different situations (don't worry, I used a loop in R to make the html code so I didn't have to type it all). The situation column is as follows: lead, inning, base, outs. The p-values are for the 2-sided test between the two probability estimates.

| situation | P(win) no PR | P(win) PR | p-value |
|---|---|---|---|
| -1,t8,1,0 | 0.325 | 0.379 | 0.217 |
| -1,t8,1,1 | 0.232 | 0.277 | 0.226 |
| -1,t8,1,2 | 0.145 | 0.177 | 0.236 |
| -1,t8,2,0 | 0.399 | 0.457 | 0.210 |
| -1,t8,2,1 | 0.290 | 0.341 | 0.221 |
| -1,t8,2,2 | 0.156 | 0.190 | 0.235 |
| -1,b8,1,0 | 0.442 | 0.501 | 0.206 |
| -1,b8,1,1 | 0.332 | 0.386 | 0.217 |
| -1,b8,1,2 | 0.218 | 0.261 | 0.228 |
| -1,b8,2,0 | 0.521 | 0.580 | 0.200 |
| -1,b8,2,1 | 0.401 | 0.459 | 0.210 |
| -1,b8,2,2 | 0.233 | 0.278 | 0.227 |
| -1,t9,1,0 | 0.216 | 0.259 | 0.227 |
| -1,t9,1,1 | 0.148 | 0.180 | 0.234 |
| -1,t9,1,2 | 0.089 | 0.110 | 0.240 |
| -1,t9,2,0 | 0.275 | 0.325 | 0.222 |
| -1,t9,2,1 | 0.190 | 0.229 | 0.230 |
| -1,t9,2,2 | 0.096 | 0.118 | 0.240 |
| -1,b9,1,0 | 0.296 | 0.348 | 0.219 |
| -1,b9,1,1 | 0.209 | 0.251 | 0.228 |
| -1,b9,1,2 | 0.129 | 0.158 | 0.236 |
| -1,b9,2,0 | 0.367 | 0.423 | 0.213 |
| -1,b9,2,1 | 0.263 | 0.311 | 0.223 |
| -1,b9,2,2 | 0.139 | 0.170 | 0.236 |
| 0,t8,1,0 | 0.551 | 0.662 | 0.009 |
| 0,t8,1,1 | 0.468 | 0.583 | 0.011 |
| 0,t8,1,2 | 0.401 | 0.516 | 0.013 |
| 0,t8,2,0 | 0.628 | 0.729 | 0.007 |
| 0,t8,2,1 | 0.543 | 0.654 | 0.009 |
| 0,t8,2,2 | 0.421 | 0.537 | 0.013 |
| 0,b8,1,0 | 0.723 | 0.806 | 0.006 |
| 0,b8,1,1 | 0.652 | 0.749 | 0.007 |
| 0,b8,1,2 | 0.588 | 0.694 | 0.008 |
| 0,b8,2,0 | 0.782 | 0.851 | 0.005 |
| 0,b8,2,1 | 0.716 | 0.801 | 0.006 |
| 0,b8,2,2 | 0.608 | 0.712 | 0.008 |
| 0,t9,1,0 | 0.565 | 0.674 | 0.009 |
| 0,t9,1,1 | 0.482 | 0.597 | 0.011 |
| 0,t9,1,2 | 0.415 | 0.530 | 0.013 |
| 0,t9,2,0 | 0.641 | 0.740 | 0.007 |
| 0,t9,2,1 | 0.557 | 0.667 | 0.009 |
| 0,t9,2,2 | 0.435 | 0.551 | 0.012 |
| 0,b9,1,0 | 0.739 | 0.819 | 0.006 |
| 0,b9,1,1 | 0.670 | 0.764 | 0.007 |
| 0,b9,1,2 | 0.607 | 0.711 | 0.008 |
| 0,b9,2,0 | 0.796 | 0.861 | 0.005 |
| 0,b9,2,1 | 0.733 | 0.814 | 0.006 |
| 0,b9,2,2 | 0.627 | 0.728 | 0.007 |
| 1,t8,1,0 | 0.796 | 0.879 | 0.001 |
| 1,t8,1,1 | 0.742 | 0.843 | 0.001 |
| 1,t8,1,2 | 0.731 | 0.835 | 0.001 |
| 1,t8,2,0 | 0.843 | 0.909 | 0.001 |
| 1,t8,2,1 | 0.795 | 0.879 | 0.001 |
| 1,t8,2,2 | 0.747 | 0.846 | 0.001 |
| 1,b8,1,0 | 0.917 | 0.954 | 0.001 |
| 1,b8,1,1 | 0.891 | 0.938 | 0.001 |
| 1,b8,1,2 | 0.885 | 0.935 | 0.001 |
| 1,b8,2,0 | 0.939 | 0.966 | 0.001 |
| 1,b8,2,1 | 0.917 | 0.954 | 0.001 |
| 1,b8,2,2 | 0.894 | 0.940 | 0.001 |
| 1,t9,1,0 | 0.877 | 0.930 | 0.001 |
| 1,t9,1,1 | 0.840 | 0.907 | 0.001 |
| 1,t9,1,2 | 0.832 | 0.902 | 0.001 |
| 1,t9,2,0 | 0.907 | 0.948 | 0.001 |
| 1,t9,2,1 | 0.876 | 0.929 | 0.001 |
| 1,t9,2,2 | 0.843 | 0.909 | 0.001 |
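
The 66 situations in the table above can be enumerated with a short loop. The original table was generated in R; the Python sketch below is an illustration, and the exclusion rule in it is an assumption inferred from the rows that appear in the table.

```python
from itertools import product

leads = [-1, 0, 1]
innings = ["t8", "b8", "t9", "b9"]
bases = [1, 2]
outs = [0, 1, 2]

# Assumption: the only excluded combination is leading in the bottom of the
# 9th (or later), since the home team would not bat with a lead.
situations = [
    (lead, inn, base, out)
    for lead, inn, base, out in product(leads, innings, bases, outs)
    if not (lead == 1 and inn == "b9")
]

# Labels in the same "lead,inning,base,outs" form as the situation column.
labels = [",".join(str(x) for x in s) for s in situations]
```

The full cross product has 3 × 4 × 2 × 3 = 72 combinations, and dropping the 6 that involve leading in the bottom of the 9th leaves 66.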

A couple of interesting things I notice:

- the win probability estimates when trailing by one or when tied are higher with a runner on 1st and no out than with a runner on 2nd and one out. Newsflash: bunting is dumb in general, even when you only need one run.
- some of the PR effects are shockingly large compared to what I had estimated in part 2 for the change in probability of the run scoring. I can only say that I double checked these estimates, and that the estimates from part 2 hadn't allowed for the slowness of the hitter. The other difference I can think of is that (for convenience) my code counted the runner as having scored even if he'd been erased by a fielder's choice and a subsequent runner scored. If this was more common in non-PR situations, it could have led to an understatement of the true PR effect in part 2.
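
To put the size of the PR effects on both scales, the sketch below converts a few rows of the table into differences in probability and in log odds. The three rows are copied from the table; the choice of rows and the variable names are only for illustration.

```python
import math

def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1 - p))

# (situation, P(win) without PR, P(win) with PR), copied from the table.
rows = [
    ("-1,t8,1,0", 0.325, 0.379),
    ("0,t8,1,0", 0.551, 0.662),
    ("1,t8,1,0", 0.796, 0.879),
]

# For each row: (situation, change in probability, change in log odds).
effects = [
    (situation, p_pr - p_no_pr, logit(p_pr) - logit(p_no_pr))
    for situation, p_no_pr, p_pr in rows
]

for situation, diff_prob, diff_logit in effects:
    print(f"{situation}: {diff_prob:+.3f} in probability, {diff_logit:+.2f} in log odds")
```

For these three rows the change is about 0.05 to 0.11 in probability, and roughly 0.24, 0.47, and 0.62 on the log-odds scale.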

I think my next entry will be about late-inning defensive replacements.

The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at "www.retrosheet.org".

I am shocked. And, to be honest, pretty disappointed. It's always so satisfying when conventional wisdom is wrong. When it turns out to be right, it's just deflating.
