Predicting Probability of Winning Free-Throw % in Basketball?

Developer https://www.devze.com 2023-01-19 08:27 Source: Internet

My actual problem is a bit more general than this, but here is a specific example. In basketball, you calculate free-throw percentage as:

Free-Throw Percentage (FT%) = Free-Throws Made (FTM) / Free-Throws Attempted (FTA)

I have two teams, and for each team I have the mean and variance of the team's FTM and FTA, so I can model each as a normal random variable (obviously FTM and FTA will be correlated within a team). I can then easily compute, for example, the probability that one team will make more free throws than the other.
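For reference, here is a minimal sketch of that "easy" calculation. It relies on the fact that the difference of two normal variables is itself normal. The function name `prob_more_made` and its `cov12` parameter are illustrative only, and the default assumes the two teams' FTM totals are independent.

```python
from statistics import NormalDist


def prob_more_made(mean1, var1, mean2, var2, cov12=0.0):
    """P(FTM_1 > FTM_2) when both totals are modeled as normal.

    cov12 is the covariance between the two teams' FTM
    (assumed 0, i.e. independent teams, unless you know otherwise).
    """
    # FTM_1 - FTM_2 is normal with this mean and variance.
    diff_mean = mean1 - mean2
    diff_var = var1 + var2 - 2.0 * cov12
    # Probability that the difference is positive.
    return 1.0 - NormalDist(diff_mean, diff_var ** 0.5).cdf(0.0)


# Made-up example: team 1 averages 20 made, team 2 averages 18, variance 16 each.
p = prob_more_made(20.0, 16.0, 18.0, 16.0)
```

The same trick does not carry over to FT%, because a ratio of normals is not normal.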

My question is... how can I find the probability that one team will shoot a higher free-throw percentage than the other? Why is this so hard to compute? Any ideas?

Thanks in advance! :-)

It turns out that the ratio of two normally distributed variables (such as FTM and FTA in your model) has a distribution that is rather complicated to describe! The simplest (or perhaps least intractable!) case is when both means are 0, in which case the ratio follows a Cauchy distribution. That distribution is tough to work with, because the integrals defining its mean and variance do not converge. But FTA and FTM have nonzero means, so even this is an oversimplification. So I don't think you're going to find any simple expression for the probability you're trying to calculate.

Another way to look at it might be: who cares if the math is intractable? Just simulate it! Perform N trials, generating properly distributed (and correlated) values for each team's FTM and FTA, and keep track of how many times Team 1 has a better FT% than Team 2. N might not need to be very large, depending on how accurate your estimate needs to be: the standard error of the estimated proportion p is sqrt(p(1-p)/N), so it shrinks as 1/sqrt(N).
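A minimal sketch of such a simulation, using only the Python standard library. The `TeamStats` container, the function names, and the example numbers are all made up for illustration. It assumes a bivariate normal model for (FTM, FTA) within each team, independence between teams, and it simply redraws any trial with non-positive attempts.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class TeamStats:
    """Hypothetical container for one team's per-game summary statistics."""
    mean_ftm: float
    sd_ftm: float
    mean_fta: float
    sd_fta: float
    rho: float  # correlation between FTM and FTA


def draw_ft_pct(team, rng):
    """Draw one correlated (FTM, FTA) pair and return FTM / FTA."""
    while True:
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        fta = team.mean_fta + team.sd_fta * z1
        # Mix z1 into FTM so that corr(FTM, FTA) equals rho.
        ftm = team.mean_ftm + team.sd_ftm * (
            team.rho * z1 + math.sqrt(1.0 - team.rho ** 2) * z2
        )
        if fta > 0:  # redraw if the normal model produced no attempts
            return ftm / fta


def prob_higher_ft_pct(team1, team2, n_trials=100_000, seed=0):
    """Estimate P(FT%_1 > FT%_2) and the standard error of that estimate."""
    rng = random.Random(seed)
    wins = sum(
        draw_ft_pct(team1, rng) > draw_ft_pct(team2, rng)
        for _ in range(n_trials)
    )
    p = wins / n_trials
    return p, math.sqrt(p * (1.0 - p) / n_trials)


# Made-up inputs: both teams attempt about 24 per game, team A makes more.
team_a = TeamStats(mean_ftm=18.0, sd_ftm=4.0, mean_fta=24.0, sd_fta=5.0, rho=0.8)
team_b = TeamStats(mean_ftm=16.0, sd_ftm=4.0, mean_fta=24.0, sd_fta=5.0, rho=0.8)
p_hat, std_err = prob_higher_ft_pct(team_a, team_b)
```

Quadrupling `n_trials` roughly halves `std_err`, which is the 1/sqrt(N) behavior in practice.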

I'd also suggest modeling FTM with something other than a normal distribution. A binomial distribution, with parameters n=mean(FTA) and p=mean(FTM)/mean(FTA), seems like a better fit. With two normal distributions, there's a nonzero probability that FTM > FTA, which doesn't make sense.
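Under the binomial model the probability can even be computed exactly by summing over all outcomes, at least as a rough sketch. The version below assumes each team's attempts are fixed at n (mean FTA rounded to an integer, which ignores game-to-game variation in FTA) and that the teams are independent. The function names are illustrative.

```python
from math import comb


def binom_pmf(n, p):
    """Probabilities of making k = 0..n free throws out of n attempts."""
    return [comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(n + 1)]


def prob_higher_ft_pct_binomial(n1, p1, n2, p2):
    """Return (P(FT%_1 > FT%_2), P(tie)) for independent binomial FTM.

    n1, n2 are integer attempt counts (assumption: round(mean FTA));
    p1, p2 are mean(FTM) / mean(FTA) for each team.
    """
    pmf1 = binom_pmf(n1, p1)
    pmf2 = binom_pmf(n2, p2)
    win = tie = 0.0
    for k1, q1 in enumerate(pmf1):
        for k2, q2 in enumerate(pmf2):
            # Compare k1/n1 with k2/n2 by cross-multiplying (integers only).
            if k1 * n2 > k2 * n1:
                win += q1 * q2
            elif k1 * n2 == k2 * n1:
                tie += q1 * q2
    return win, tie


# Made-up inputs: 24 attempts each, 75% versus about 67% shooters.
p_win, p_tie = prob_higher_ft_pct_binomial(24, 0.75, 24, 16 / 24)
```

Because FT% is discrete here, ties have nonzero probability, so the function reports them separately rather than folding them into either side.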

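Use the Geary–Hinkley transformation. It maps the ratio of two correlated normal variables to a variable that is approximately standard normal, provided the denominator (here FTA) is very unlikely to be zero or negative.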
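A rough sketch of how that could be applied here. For z = FTM/FTA, the transformation says t = (mean_fta * z - mean_ftm) / sqrt(sd_fta^2 * z^2 - 2 * rho * sd_ftm * sd_fta * z + sd_ftm^2) is approximately standard normal, which gives an approximate CDF for each team's FT%. The comparison between teams is then done by numerical integration, which is my own addition and assumes the teams are independent. The names, grid bounds, and example numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def ft_pct_cdf(z, mean_ftm, sd_ftm, mean_fta, sd_fta, rho):
    """Approximate P(FTM / FTA <= z) via the Geary-Hinkley transformation."""
    t = (mean_fta * z - mean_ftm) / sqrt(
        sd_fta ** 2 * z ** 2 - 2.0 * rho * sd_ftm * sd_fta * z + sd_ftm ** 2
    )
    return _PHI(t)


def prob_higher_ft_pct_gh(team1, team2, lo=-1.0, hi=3.0, steps=4000):
    """Approximate P(FT%_1 > FT%_2) as the integral of F2(z) dF1(z).

    Each team is a tuple (mean_ftm, sd_ftm, mean_fta, sd_fta, rho).
    The grid [lo, hi] is an assumption; it should cover essentially
    all of the probability mass of both ratios.
    """
    h = (hi - lo) / steps
    total = 0.0
    prev = ft_pct_cdf(lo, *team1)
    for i in range(steps):
        right = lo + (i + 1) * h
        cur = ft_pct_cdf(right, *team1)
        # Mass of team 1's FT% in this cell times P(team 2 is below it).
        total += ft_pct_cdf(right - h / 2.0, *team2) * (cur - prev)
        prev = cur
    return total


# Made-up inputs: (mean_ftm, sd_ftm, mean_fta, sd_fta, rho) for each team.
team_a = (18.0, 4.0, 24.0, 5.0, 0.8)
team_b = (16.0, 4.0, 24.0, 5.0, 0.8)
p_approx = prob_higher_ft_pct_gh(team_a, team_b)
```

The approximation is only as good as the condition that FTA stays positive, so it is worth checking the result against a simulation when mean FTA is within a few standard deviations of zero.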