开发者

Best algorithm for optimizing the decisions in a simulation

开发者 https://www.devze.com 2023-02-13 05:59 出处:网络
I\'m looking for the best algorithm to 开发者_开发百科optimise the decisions made in a simultaion to find a fast result in a reasonable amount of time. The simultaion does a number of \"ticks\" and oc

I'm looking for the best algorithm to 开发者_开发百科optimise the decisions made in a simultaion to find a fast result in a reasonable amount of time. The simultaion does a number of "ticks" and occasionaly needs to make a decision. Eventually a goal state is reached. ( It would be possible to never reach a goal state if you make very bad decisions )

There are many many goal states. I want to find the goal state with the least number of ticks ( a tick equates roughly to a second in real life." I basically want to decide which decisions to make to get to the goal in as few seconds as possible,

Some points about the problem domain:

  • Straight off the bat I can generate a series of choices that will lead to a solution. It won't be optimal.
  • I have a reasonable heuristic function to determine what would be a good decision
  • I have a reasonable function to determine the minimum possible time cost from a node to a goal.

Algorithms:

  • I need to process this problem for about 10 seconds and then give the best answer I can.
  • I believe A* would find me the optimal soluton. The problem is that the decision tree will be so large that I won't be able to calculate it quick enough.
  • IDA* would give me a good first few choices in 10 seconds but I need a path all the way to a goal.

At the moment I am thinking that I will start off with the known non optimal path to a goal and then perhaps use Simulated Anealing and attempt to improve it over 10 seconds.

What would be a good algorithm to research to try to solve this sort of problem?


Have a look at limited discrepancy search, repeating with increasingly loose limits on the maximum discrepancy search, or beam search.

If you have a good heuristic you should be able to use it to compare individual choices - for the limited discrepancy search, and compare partial solutions, for the beam search.

See if you can place an upper bound on how good any extension of a partial solution is. Then you can prune out partial solutions that can't possibly be extended to beat the result from the heuristic, or the best result found so far in a series of iterative searches with increasing depth.


Let's get a few facts out.

1) The only way to know for sure which decision is the best is to test every possible decision and evaluate the outcome based on some criteria.

2) We are highly unlikely to have the time to decide to go through every possible decision, so we have to limit how far in the future we will evaluate the decision.

3) We are highly unlikely to make the best move ~ever~. Not just often, but ever. Unless you have only a couple of decisions, chances are every time you make a decision, there was a better one you didn't get to.

4) We can use how our previous decisions worked out to our advantage.

Put all this together... Let's say when we have a decision, we evaluate what happens 30 ticks into the future, in 30 ticks we can check to see if what actually happened matches what we simulated 30 ticks ago. If it was, we know that decision leads to predictable outcomes and we should use that decision less. If we didn't, or if it turns out better than we hoped, we should use that decision more.

Ideally, you would use your logic in a ... simulation of your simulation ... for purposes of evaluating it. Then when you get to the 'real' simulation, you have a better chance at picking your better decisions earlier. Of course, give a higher weight to the results of your actual simulation results as opposed to your simulated simulation results.

0

精彩评论

暂无评论...
验证码 换一张
取 消