Title :
Modeling the Probability of a Strikeout for a Batter/Pitcher Matchup
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of California, Irvine, Irvine, CA, USA
Abstract :
We analyze models for predicting the probability of a strikeout for a batter/pitcher matchup in baseball using player descriptors that can be estimated accurately from small samples. We start with the log5 model, which has been used extensively for describing matchups in sports. Log5 is a special case of a logit model, and we use constrained logistic regression over nearly one million matchup observations to assess the use of the log5 explanatory variables for this application. We also show that a batter/pitcher ground ball rate interaction variable is significant for the prediction of strikeout probability, and we provide physical justification for the inclusion of this variable in the model. We quantify the differences among the models and show that batters control the majority of the variance in predicted strikeout rate.
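The abstract's statement that log5 is a special case of a logit model can be illustrated with a small sketch. The abstract does not give the formula, so the code below assumes the commonly cited three-rate form of log5 (batter strikeout rate, pitcher strikeout rate, league strikeout rate). The function names and the example rates are hypothetical and are not taken from the paper. Under that assumption, log5 is the logit model whose weights on the batter, pitcher, and league log-odds are fixed at 1, 1, and -1.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(z: float) -> float:
    """Inverse of logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def log5_strikeout(batter_k: float, pitcher_k: float, league_k: float) -> float:
    """Assumed three-rate log5 form (not quoted from the paper)."""
    hit = batter_k * pitcher_k / league_k
    miss = (1.0 - batter_k) * (1.0 - pitcher_k) / (1.0 - league_k)
    return hit / (hit + miss)


def logit_strikeout(batter_k: float, pitcher_k: float, league_k: float,
                    w_batter: float = 1.0, w_pitcher: float = 1.0,
                    w_league: float = -1.0) -> float:
    """Logit model on the same three variables.

    The default weights (1, 1, -1) reproduce log5; a fitted logistic
    regression would be free to choose other weights.
    """
    z = (w_batter * logit(batter_k)
         + w_pitcher * logit(pitcher_k)
         + w_league * logit(league_k))
    return inv_logit(z)


# Hypothetical rates, chosen only for illustration.
p_log5 = log5_strikeout(0.25, 0.25, 0.20)
p_logit = logit_strikeout(0.25, 0.25, 0.20)
print(round(p_log5, 4), round(p_logit, 4))
```

With the default weights the two functions agree up to floating-point error. Changing `w_batter`, `w_pitcher`, or `w_league` shows how a logit model with different coefficients would depart from log5 on the same explanatory variables.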
Keywords :
logistics; probability; regression analysis; sport; baseball; batter-pitcher ground ball rate interaction variable; batter-pitcher matchup; constrained logistic regression; log5 explanatory variables; log5 model; logit model; player descriptors; strikeout probability modeling; Computational modeling; Data models; Equations; Mathematical model; Predictive models; Reliability; log5; modeling; prediction; sports analytics
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2015.2416735