Probability theory and project planning [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

This question does not appear to be about programming within the scope defined in the help center.

Closed 5 years ago.

I'm managing a project that has to be estimated from rough requirements and specifications. Because of that, the estimates for specific features and tasks are ranges of values rather than single values (for example, between 10 and 20 instead of exactly 17).

I'm curious: if I want an idea of the approximate probability of finishing some task within the lowest estimate, how should I approach this? For the sake of the discussion, please disregard factors like my estimation skills, the platform used, etc.

I was thinking about using a Poisson distribution with λ = (low + high) / 2, assuming that the probability of each of the proposed values follows the law of rare events / normal distribution. This doesn't account for the fact that going outside my estimation limits is more unlikely than likely, but still...
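To make the idea concrete, here is a minimal sketch of that calculation. The range 10 to 20 is the example from above; the function name `poisson_cdf` is just illustrative, and treating the task duration as a Poisson count is the assumption under discussion, not an established model.

```python
import math

def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam), summed term by term."""
    term = math.exp(-lam)      # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i        # P(X = i) derived from P(X = i - 1)
        total += term
    return total

low, high = 10, 20             # example estimate range
lam = (low + high) / 2         # proposed rate parameter

p_within_low = poisson_cdf(low, lam)          # finish within the lowest estimate
p_beyond_high = 1.0 - poisson_cdf(high, lam)  # overshoot the highest estimate

print(f"P(X <= {low}) = {p_within_low:.3f}")
print(f"P(X > {high}) = {p_beyond_high:.3f}")
```

Note that λ fixes both the mean and the variance here, so the low and high bounds only enter through their midpoint; the width of the range has no separate influence on the result.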

What do you think about that, and which approach would you choose for such an experiment?

Evidence Based Scheduling

Basically, the idea is to observe how long your team takes to complete similar tasks, and use that to estimate how long another task of the same kind might take.
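As a rough sketch of how that could look in code: record the estimated and actual time for past tasks, then resample the actual/estimate ratios to simulate new work. The history data, the new estimates and the helper names below are all hypothetical, and this is only one simple way to apply the idea, not a reference implementation of any particular tool.

```python
import random

# Hypothetical history: (estimated hours, actual hours) for finished tasks.
history = [(8, 10), (16, 15), (4, 9), (12, 12), (6, 11), (10, 14)]
ratios = [actual / estimate for estimate, actual in history]

def simulate_totals(estimates, ratios, rounds=10_000, seed=0):
    """Scale each estimate by a randomly drawn historical ratio, many times."""
    rng = random.Random(seed)
    totals = [
        sum(e * rng.choice(ratios) for e in estimates)
        for _ in range(rounds)
    ]
    totals.sort()
    return totals

def chance_within(totals, budget):
    """Share of simulated outcomes that finish within the budget."""
    return sum(t <= budget for t in totals) / len(totals)

new_estimates = [10, 20, 5]    # hypothetical estimates for upcoming tasks
totals = simulate_totals(new_estimates, ratios)

p_on_estimate = chance_within(totals, sum(new_estimates))
p95_total = totals[int(0.95 * len(totals))]   # total not exceeded in ~95% of runs

print(f"chance of hitting the raw estimate: {p_on_estimate:.2f}")
print(f"95th percentile total: {p95_total:.1f} hours")
```

The more history you record, the better the resampled ratios reflect how your team actually estimates.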

I recommend reading Waltzing With Bears by Tom DeMarco and Tim Lister - it goes into schedule estimating in some depth.

As a rule of thumb I would say the probability of finishing any project within the lowest estimated time is approximately zero. This is both from the analysis they give in the book, and from personal experience.

I don't think you have the information to make that call. To do so you'd need to know whether the probability curve was normalised (probably) and whether it was skewed (almost certainly), plus the associated statistics (mean, standard deviation and so on).

If you had those I don't think you'd be asking. In addition, your skill in estimating, the assumptions you've made, their accuracy and so on are all factors, most of which are very hard to quantify.

It's why evidence based scheduling is good - you don't have to understand exactly why things take a certain amount of time, you just know that they do.

A couple of simple things I'd say you should think about:

1) In my experience the realistic chances of it being your lowest estimate are roughly zero. Shit happens on software projects, most people aren't that good at estimating and things will go wrong. If you want a good estimate then go with that.

2) Think very carefully about what you want the number for. If you're going to give it to a client or most managers then:

(a) they won't remember the caveats, they won't remember the top end of the range and they won't remember the probabilities or the theory. They'll remember the nice low number you gave them and the rest is just "wah wah wah".

(b) clients and managers want certainty, so you need to give them something you're certain about. If you assume that your estimate is normally distributed and you have your best case and worst case values, then giving them the average of the two means you will miss your deadline 50% of the time. From a manager's perspective that's bad. If you want to hit your deadline 95% of the time then you need to be giving roughly the mean + 2 standard deviations (strictly about 1.65 for a one-sided 95%, so 2 is a slightly conservative round number). Again, if you want a rough estimate then your worst case is probably the easiest number to grab.
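A quick way to check those numbers, as a sketch only: the best and worst case values are made up, and the standard deviation is an assumption (here the best-to-worst range is taken to span the mean plus or minus 2 standard deviations), since a range alone doesn't tell you what it is.

```python
from statistics import NormalDist

best, worst = 10.0, 20.0          # hypothetical best and worst case, in days
mean = (best + worst) / 2
sigma = (worst - best) / 4        # assumption: range covers mean +/- 2 sigma

duration = NormalDist(mu=mean, sigma=sigma)

p_hit_average = duration.cdf(mean)     # chance of finishing by the average
p_hit_worst = duration.cdf(worst)      # chance of finishing by the worst case
deadline_95 = duration.inv_cdf(0.95)   # one-sided 95% quantile

print(f"P(done by {mean:.0f} days) = {p_hit_average:.2f}")
print(f"P(done by {worst:.0f} days) = {p_hit_worst:.3f}")
print(f"deadline hit 95% of the time: {deadline_95:.1f} days")
```

Everything here hinges on the choice of sigma. If your worst case is really only one standard deviation out, the same code will show a noticeably lower chance of hitting it.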

Generally, under-promise and over-deliver. Be the guy who never misses deadlines and often delivers early. That doesn't involve changing the way you work, you just have to manage expectations.

I suggest using Three Point Estimating. Assign a Minimum, Most Likely, and Maximum time and a type of random distribution (PERT, Triangle, Beta, etc., depending on the characteristics or historical data) to each of the tasks within the project. Run a Monte Carlo simulation a number of times (e.g. 5000 times) and see what adds up. You can also go further by incorporating elements of risk (and also correlation between risks if you wish) to get a better picture of what might happen. A tool such as Palisade @Risk might be able to help you.
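If you want to try this without a dedicated tool, a minimal sketch with the triangular distribution from Python's standard library could look like the following. The task figures are hypothetical, tasks are assumed to be independent and done one after another, and risk events and correlations are left out.

```python
import random

# Hypothetical tasks: (minimum, most likely, maximum) duration in days.
tasks = [(2, 4, 9), (5, 8, 15), (1, 2, 6)]

def simulate_project(tasks, runs=5000, seed=42):
    """Draw each task from a triangular distribution and sum per run."""
    rng = random.Random(seed)
    totals = [
        sum(rng.triangular(lo, hi, mode) for lo, mode, hi in tasks)
        for _ in range(runs)
    ]
    totals.sort()
    return totals

def percentile(sorted_values, p):
    """Value at fraction p (0..1) of an already sorted list."""
    index = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[index]

totals = simulate_project(tasks)
most_likely_sum = sum(mode for _, mode, _ in tasks)
p_most_likely = sum(t <= most_likely_sum for t in totals) / len(totals)

print(f"median total: {percentile(totals, 0.50):.1f} days")
print(f"95th percentile: {percentile(totals, 0.95):.1f} days")
print(f"chance of finishing within {most_likely_sum} days: {p_most_likely:.2f}")
```

Swapping in a PERT or Beta distribution only changes the sampling line inside `simulate_project`; the rest of the simulation stays the same.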

Poisson has been done so many times, with the same low rate of success. I second evidence based scheduling, because it's self-correcting and works on actual data.