A random walk picture of basketball scoring and lead-change dynamics
By analyzing recently available play-by-play data from all regular-season games from multiple seasons of the National Basketball Association (NBA), we present evidence that basketball scoring during a game is well described by a continuous-time anti-persistent random walk. The time intervals between successive scoring events follow an exponential distribution, with essentially no correlations between different scoring intervals. We also argue that the heterogeneity of team strengths plays a minor role in understanding the statistical properties of basketball scoring.
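The scoring model described above can be illustrated with a minimal simulation sketch. It assumes that "anti-persistent" means the team that just scored is less likely to score next, and it treats every scoring event as a single unit step. The function name `simulate_scoring` and the parameters `rate` and `q_alternate` are hypothetical names chosen for this illustration, and the numerical values in the usage lines are arbitrary, not fitted to the data.

```python
import random


def simulate_scoring(rate, q_alternate, duration, rng):
    """Simulate one game as a continuous-time anti-persistent random walk.

    rate        -- assumed scoring events per unit time (exponential waiting times)
    q_alternate -- assumed probability that the next score goes to the OTHER team;
                   values above 0.5 make the walk anti-persistent
    duration    -- length of the game in the same time units
    rng         -- a random.Random instance, passed in for reproducibility

    Returns a list of (time, team) pairs with team equal to +1 or -1.
    """
    events = []
    t = rng.expovariate(rate)        # waiting time to the first score
    team = rng.choice((+1, -1))      # the first scorer is picked at random
    while t < duration:
        events.append((t, team))
        # Anti-persistence: switch the scoring team with probability q_alternate.
        if rng.random() < q_alternate:
            team = -team
        t += rng.expovariate(rate)   # independent exponential waiting time
    return events


# Illustrative values only: a 2880-second game with about one score every 30 s.
rng = random.Random(0)
game = simulate_scoring(rate=1 / 30, q_alternate=0.65, duration=2880.0, rng=rng)
score_difference = sum(team for _, team in game)
```

Because the waiting times are drawn independently of which team scores, the sketch also reflects the stated absence of correlations between scoring intervals. Team-strength heterogeneity is left out entirely.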
As intriguing applications of this random-walk picture, we show that: (i) the distribution of times when the last lead change occurs, (ii) the distribution of times when the score difference is maximal, and (iii) the distribution for the fraction of game time that one team is leading are all given by the celebrated arcsine law, a beautiful and surprising property of random walks. We also use the random-walk picture to construct a criterion for when a lead of a specified size is "safe" as a function of the time remaining in the game. This prediction generally agrees with comprehensive data on more than 1.25 million scoring events in roughly 40,000 games across four professional or semiprofessional team sports, and is more accurate than popular heuristics that are currently used in sports analytics.
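The arcsine law for the time of the last lead change can be checked numerically with a short sketch. As a simplification, it uses a plain symmetric walk with unit steps in discrete time, without anti-persistence or exponential waiting times, and it takes the last return of the score difference to zero as a proxy for the last lead change. The function names `last_lead_change_fraction` and `arcsine_cdf` are hypothetical, introduced only for this illustration.

```python
import math
import random


def last_lead_change_fraction(n_steps, rng):
    """Fraction of the game elapsed at the last tie of a symmetric unit-step walk."""
    position = 0
    last_zero = 0
    for step in range(1, n_steps + 1):
        position += rng.choice((-1, 1))
        if position == 0:
            last_zero = step         # score difference is back to zero
    return last_zero / n_steps


def arcsine_cdf(x):
    """Cumulative arcsine law on [0, 1]: F(x) = (2 / pi) * arcsin(sqrt(x))."""
    return (2.0 / math.pi) * math.asin(math.sqrt(x))


# Compare the empirical distribution with the arcsine law at a few points.
rng = random.Random(1)
samples = [last_lead_change_fraction(200, rng) for _ in range(4000)]
for x in (0.1, 0.5, 0.9):
    empirical = sum(s <= x for s in samples) / len(samples)
    print(f"x={x:.1f}  empirical={empirical:.3f}  arcsine={arcsine_cdf(x):.3f}")
```

The arcsine density is largest near 0 and 1, so under this idealized model the last lead change is most likely to fall either very early or very late in the game rather than in the middle.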