开发者

How to use transposition tables with MTD(f)

开发者 https://www.devze.com 2023-01-08 08:52 出处:网络
I\'m writing an AI for a card game and after some testing I\'ve discovered that using MTD(f) on my alpha beta algorithm - a series of zero-window searches - is faster than just using alpha-beta by its

I'm writing an AI for a card game and after some testing I've discovered that using MTD(f) on my alpha beta algorithm - a series of zero-window searches - is faster than just using alpha-beta by itself.

The MTD(f) algorithm is described well here http://people.csail.mit.edu/plaat/mtdf.html

The problem I have is that for each pass in the MTD(f) search (for each guess) I don't reuse any of the previous positions I have stored even though the write up on the link suggests that I should (in fact clearing the table between iterations speeds up the algorithm).

My problem is that when I store a position and a value in my transposition table I also store the alpha and beta values for which it is valid. Therefore a second pass through the tree with a different guess (and therefore alpha and beta) can't possibly reuse any information. Is this what is to be expected or am I missing something fundamental here?

For instance if for alpha=3 beta=4 we come to a result of 7 (obviously a cut-off) 开发者_如何学JAVAshould I store that in the table as valid for alpha=3 to beta=6? Or beta=7?


Your problem comes down to the conceptual understanding of how to use a transposition table along side an alpha beta search. This was a huge problem I ran into as well, and after looking around I found this discussion which explained the concept to me more naturally than any paper I had read on the topic.

Basically you cannot treat all alpha-beta results the same because when a cutoff occurs, the result only represents a bound, and not the true minimax value. It has been proven that using bounds will still always give you the same best next state, but possibly without having the exact score. When you store the state from a cutoff, you need to treat it as a bound and try to improve upon it on the next pass. This will often evaluate the same node multiple times, but it will continually improve upon the actual score as needed.

Here is a good example of a more complete implementation of the concepts listed in the previously linked article. Scroll to page 14.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号