
Find optimal/good-enough strategy and AI for the game 'Proximity'?

'Proximity' is a strategy game of territorial domination similar to Othello, Go and Risk. Two players; played on a 10x12 hex grid. The game was invented by Brian Cable in 2007.

Seems to be a worthy game for discussing a) an optimal algorithm and then b) how to build an AI.

Strategies are going to be probabilistic or heuristic-based, due to the randomness factor and the insane branching factor (20^120), so it will be kind of hard to compare them objectively. A compute-time limit of 5 seconds max per turn seems reasonable => this rules out all brute-force attempts. (Play the game's AI on Expert level to get a feel; it does a very good job based on some simple heuristics.)

Game: Flash version here, iPhone version iProximity here, and many copies elsewhere on the web. Rules: here.

Object: to have control of the most armies after all tiles have been placed. You start with an empty hex board. Each turn you receive a randomly numbered tile (value between 1 and 20 armies) to place on any vacant board space. If this tile is adjacent to any ALLY tiles, it will strengthen each of those tiles' defenses by +1 (up to a max value of 20). If it is adjacent to any ENEMY tiles, it will take control of them IF its number is higher than the number on the enemy tile.
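To make the rule concrete, here is a minimal sketch of placement in Python. It assumes an "odd-r" offset hex layout and that a flipped tile keeps its printed value; the real game's grid orientation and details may differ.

```python
# Board: dict mapping (row, col) -> (owner, value), owner 0 or 1.

ROWS, COLS = 10, 12
MAX_VALUE = 20

def neighbors(r, c):
    """Six hex neighbours under an odd-r offset layout (an assumption;
    the actual game's grid orientation may differ)."""
    if r % 2 == 0:
        deltas = [(0, -1), (0, 1), (-1, -1), (-1, 0), (1, -1), (1, 0)]
    else:
        deltas = [(0, -1), (0, 1), (-1, 0), (-1, 1), (1, 0), (1, 1)]
    for dr, dc in deltas:
        nr, nc = r + dr, c + dc
        if 0 <= nr < ROWS and 0 <= nc < COLS:
            yield nr, nc

def vacant_cells(board):
    """All empty cells on the 10x12 board."""
    return [(r, c) for r in range(ROWS) for c in range(COLS)
            if (r, c) not in board]

def place(board, cell, player, tile_value):
    """Place a tile: reinforce adjacent allies (+1, capped at 20) and flip
    adjacent enemies with a strictly lower number (a flipped tile is assumed
    to keep its printed value)."""
    board = dict(board)                                     # work on a copy
    board[cell] = (player, tile_value)
    for n in neighbors(*cell):
        if n in board:
            owner, v = board[n]
            if owner == player:
                board[n] = (owner, min(v + 1, MAX_VALUE))   # reinforce ally
            elif tile_value > v:
                board[n] = (player, v)                      # capture weaker enemy
    return board
```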

Thoughts on strategy: here are some initial thoughts; setting the computer AI to Expert will probably teach a lot (a rough evaluation-function sketch follows the list):

  1. minimizing your perimeter seems to be a good strategy, to prevent flips and minimize worst-case damage
  2. like in Go, leaving holes inside your formation is lethal, only more so with the hex grid because you can lose armies on up to 6 tiles in one move
  3. low-numbered tiles are a liability, so place them away from your main territory, near the board edges and scattered. You can also use low-numbered tiles to plug holes in your formation, or make small gains along the perimeter which the opponent will not tend to bother attacking.
  4. a triangle formation of three pieces is strong since they mutually reinforce, and also reduce the perimeter
  5. Each tile can be flipped at most 6 times, i.e. when its neighbor tiles are occupied. Control of a formation can flow back and forth. Sometimes you lose part of a formation and plug any holes to render that part of the board 'dead' and lock in your territory/ prevent further losses.
  6. Low-numbered tiles are obvious-but-low-valued liabilities, but high-numbered tiles can be bigger liabilities if they get flipped (which is harder). One lucky play with a 20-army tile can cause a swing of 200 (from +100 to -100 armies). So tile placement will have both offensive and defensive considerations.
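As mentioned above, points 1-6 can be folded into an evaluation function. The sketch below is illustrative only: the terms and weights are guesses, not tuned values, and it reuses the neighbors helper from the rules sketch above.

```python
def evaluate(board, player):
    """Score a position for `player`: net armies minus a risk term for
    tiles that still have open sides (and so can still be flipped)."""
    score = 0.0
    for cell, (owner, value) in board.items():
        sign = 1 if owner == player else -1
        score += sign * value                     # the actual objective: army count
        open_sides = sum(1 for n in neighbors(*cell) if n not in board)
        # Points 1, 2 and 6: an exposed tile is a liability in proportion
        # to its value; a sealed-off tile ("dead" area, point 5) risks nothing.
        score -= sign * 0.5 * value * (open_sides / 6.0)
    return score
```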

Comments 1, 2 and 4 seem to resemble a minimax strategy where we minimize the maximum expected possible loss (modified by some probabilistic consideration of the value β the opponent can draw from 1..20, i.e. a structure which can only be flipped by a β=20 tile is 'nearly impregnable'). I'm not clear what the implications of comments 3, 5 and 6 are for optimal strategy. Interested in comments from Go, Chess or Othello players.
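A small worked example of that probabilistic consideration, assuming tile values are drawn uniformly from 1..20 (the distribution isn't stated in the rules):

```python
def flip_chance(b):
    """P(a uniform draw from 1..20 is strictly greater than b),
    i.e. the chance that the opponent's next tile can flip a tile holding b."""
    return max(0, 20 - b) / 20.0

# flip_chance(10) == 0.50, flip_chance(19) == 0.05, flip_chance(20) == 0.0,
# which is one way to quantify "nearly impregnable".
```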

(The sequel, ProximityHD for Xbox Live, allows 4-player cooperative or competitive local multiplayer. It also increases the branching factor, since you now have 5 tiles in your hand at any given time, of which you can play only one. Reinforcement of ally tiles is increased to +2 per ally.)


A former member of the U of A GAMES group here.

That branching factor is insane. Far worse than Go.

Basically, you're hooped.

The problem with this game is that it is not deterministic due to the selection of a random tile. This actually adds another layer of nodes between each existing layer of nodes in the tree. You'll be interested in my publications on *-Minimax to learn about techniques for searching in stochastic domains.
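To illustrate the tree structure being described (not the *-Minimax pruning itself), here is a bare-bones expectimax sketch in which a chance node over the 20 possible tile values sits between every pair of move layers. It reuses place, vacant_cells and evaluate from the earlier sketches and assumes a uniform draw.

```python
def expectimax(board, player, depth):
    cells = vacant_cells(board)
    if depth == 0 or not cells:
        return evaluate(board, player)
    total = 0.0
    for value in range(1, 21):                    # chance node: the random draw
        best = float("-inf")
        for cell in cells:                        # max node: where to place it
            child = place(board, cell, player, value)
            # Negamax convention: the opponent's score is the negative of ours.
            best = max(best, -expectimax(child, 1 - player, depth - 1))
        total += best
    return total / 20.0                           # expectation over the 20 draws
```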

In order to complete one-ply searches before the end of this century, you're going to need some very aggressive forward-pruning techniques. Throw "provably best move" out the window early and concentrate on building good move ordering.
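One possible (and entirely illustrative) forward-pruning scheme in that spirit: rank vacant cells with a cheap immediate-gain estimate and only expand the top k, discarding the rest. The weights and the cutoff k are guesses; neighbors and vacant_cells come from the first sketch.

```python
def ordered_moves(board, player, value, k=8):
    """Rank candidate placements by immediate gain and keep only the best k."""
    def immediate_gain(cell):
        gain = 0
        for n in neighbors(*cell):
            if n not in board:
                continue
            owner, v = board[n]
            if owner == player:
                gain += 1            # reinforcement of an ally
            elif value > v:
                gain += 2 * v        # flipping an enemy tile swings 2*v armies
        return gain
    return sorted(vacant_cells(board), key=immediate_gain, reverse=True)[:k]
```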


For general algorithms, I would suggest you check the research done by the University of Alberta GAMES group: http://games.cs.ualberta.ca Many of the algorithms there are guaranteed to find optimal policies. However, I doubt you're really interested in finding the optimal; aim for the "good enough" unless you want to sell that game in Korea :D

From your description, I understand the game to be two-player with full observability (i.e. no hidden units and such) and fully deterministic (i.e. players' action outcomes do not require rolling); in that case you should take a look at the real-time bounded-search minimax derivatives proposed by the U Alberta guys. Being able to bound the depth of the value-function backups would also be a nice way to add a "difficulty level" to your game. They have been doing some work (a bit fishy imo) on sampling the search space to improve value-function estimates.
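A hypothetical way to turn that bound into a difficulty setting, wiring together the earlier sketches; the depth and beam numbers are arbitrary placeholders.

```python
DIFFICULTY = {
    "easy":   {"depth": 1, "beam": 4},
    "normal": {"depth": 2, "beam": 8},
    "expert": {"depth": 3, "beam": 16},
}

def choose_move(board, player, value, level="normal"):
    cfg = DIFFICULTY[level]
    candidates = ordered_moves(board, player, value, k=cfg["beam"])
    # Assumes the board is not full, so there is at least one candidate.
    return max(candidates,
               key=lambda cell: -expectimax(place(board, cell, player, value),
                                            1 - player, cfg["depth"] - 1))
```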

About the "strategy" section you describe: in the framework I am mentioning, you will have to encode that knowledge as an evaluation function. Look at the work of Michael Buro and others (also in the U Alberta group) for examples of such knowledge engineering.

Another possibility would be to pose the problem as a Reinforcement Learning problem, where adversary moves are compiled away as "afterstates". Look that up in the Sutton & Barto book: http://webdocs.cs.ualberta.ca/~sutton/book/the-book.html However, the value function for an RL problem resulting from such a compilation might prove a bit difficult to solve optimally; the number of states will blow up like an H-bomb. That said, if you see how to use a factored representation, things become much easier. And your "strategy" could perhaps be encoded as a shaping function, which would speed up the learning process considerably.
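A minimal sketch of the afterstate idea with a shaping term, using a tiny factored (linear) representation built from the strategy heuristics. The features, step size and update rule are purely illustrative, not a full agent; neighbors comes from the first sketch.

```python
def features(board, player):
    """Tiny factored representation: my armies, their armies, my open sides."""
    mine = sum(v for (o, v) in board.values() if o == player)
    theirs = sum(v for (o, v) in board.values() if o != player)
    exposed = sum(1 for cell, (o, _) in board.items() if o == player
                  for n in neighbors(*cell) if n not in board)
    return [mine, theirs, exposed]

def state_value(weights, board, player):
    return sum(w * f for w, f in zip(weights, features(board, player)))

def td_update(weights, afterstate, next_afterstate, player, reward,
              alpha=1e-4, gamma=1.0):
    """Move V(afterstate) toward reward + gamma * V(next_afterstate).
    `reward` can carry a shaping bonus (e.g. armies gained this turn)."""
    target = reward + gamma * state_value(weights, next_afterstate, player)
    error = target - state_value(weights, afterstate, player)
    return [w + alpha * error * f
            for w, f in zip(weights, features(afterstate, player))]
```

The shaping bonus in the update is where the hand-written strategy knowledge from the question could be injected.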

EDIT: Damn English prepositions

