I know the title of the question is a bit vague, but bear with me. Here is the problem: every time I write a game or a bot for a game, I use a state machine, decision tree, or behaviour tree. The problem with these techniques is that they require me to pre-program every condition the character/bot is going to encounter, so any time the user does something unexpected that I did not write a condition for, the bot loses.
Right now I am working on a StarCraft bot (BWAPI). Using state machines, I am thinking of having a single state machine for each unit and one master commanding the soldiers what to do, but it still requires me to pre-program everything, and for a game like StarCraft that is impossible. The only way I can think of to make it learn is to use GP to evolve these state machines.
Say there is a bridge on the map. If 20 marines try to go through the bridge at the same time, there will be a big traffic jam. What techniques can I use so that the bot can learn from a mistake, so I don't have to pre-program a condition that says "go through the bridge one by one"?
EDIT: Just because a question has the words "StarCraft" or "bot" in it does not automatically make it a non-SO question. This question also applies to robotics.
To get anywhere, you first need to define an empirical measurement of fitness for your bots. It's going to have to be something much clearer than "big traffic jam".
How do you measure that?
What is winning? Are there numerical indicators that your bot is "winning"? Solve this issue first; then, when you have a real way of rating one bot against any number of others, plug it in as a fitness function for a GP algorithm.
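As an illustration, here is a minimal sketch of what such a fitness function might look like. The metrics (kills, losses, resources, game length) and their weights are assumptions invented for the example, not anything prescribed by BWAPI or the GP literature:

```python
# Hypothetical fitness function for a GP-evolved StarCraft bot.
# All weights and metric names here are illustrative assumptions.

def fitness(game_result):
    """Score one finished game; higher is better."""
    score = 0.0
    score += 1000.0 if game_result.won else 0.0      # winning dominates
    score += 2.0 * game_result.enemy_units_killed    # reward aggression
    score -= 1.0 * game_result.own_units_lost        # punish losses
    score += 0.01 * game_result.resources_gathered   # reward economy
    score -= 0.001 * game_result.game_length_frames  # prefer faster wins
    return score
```

The point of writing it down like this is that a GP algorithm can only compare candidates numerically; any behaviour you care about (like not jamming the bridge) has to show up somewhere in this number.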
You're asking two different questions here.
The first is "what is an AI that can learn"? That's the general question machine learning research attempts to answer. There are dozens of different tools for implementing machine learning, but no silver bullet for your application. You'll have to think a lot harder about exactly what the AI should "learn" -- what will its inputs be, and what will it output?
The second is "how can 20 marines cross a bridge simultaneously without turning into a clusterwhat?" You're describing group pathfinding, which is a different area of AI called heuristic search. Solving a path for multiple units (agents) simultaneously has its own set of algorithms, but it is generally a much better understood problem than your first. Off the top of my head, you might solve for one subset of the units at a time (its size depending on the width of the bridge) and move the members of each group simultaneously. You could also try solving for each individual marine, starting with the one closest to the other side of the bridge. A sketch of one such approach follows.
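Here is a minimal sketch of that second idea, known as prioritized cooperative pathfinding: units are planned one at a time in priority order, and each planned path is written into a reservation table of (cell, time) pairs that later units must route around. The grid representation and unit ordering are assumptions for the example:

```python
# A minimal sketch of prioritized cooperative pathfinding on a grid.
# Earlier units reserve (cell, time) slots; later units avoid them.
from collections import deque

def plan_unit(start, goal, passable, reserved, max_t=200):
    """Space-time BFS: states are (cell, time); waiting is a legal move."""
    frontier = deque([(start, 0, [start])])
    seen = {(start, 0)}
    while frontier:
        cell, t, path = frontier.popleft()
        if cell == goal:
            return path
        x, y = cell
        for nxt in [(x, y), (x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]:
            state = (nxt, t + 1)
            if (t + 1 <= max_t and passable(nxt)
                    and state not in reserved and state not in seen):
                seen.add(state)
                frontier.append((nxt, t + 1, path + [nxt]))
    return None  # no conflict-free path within the time horizon

def plan_group(units, passable):
    """units: list of (start, goal) pairs, ordered by priority."""
    reserved, paths = set(), []
    for start, goal in units:
        path = plan_unit(start, goal, passable, reserved)
        paths.append(path)
        if path:
            for t, cell in enumerate(path):
                reserved.add((cell, t))
    return paths
```

For brevity this uses BFS instead of A*, ignores head-on swap conflicts, and doesn't keep a unit's goal cell reserved after it arrives; a real implementation would handle all three.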
Using Google search to look up either will expose you to a lot more information than I can cram into an answer, especially given that your question is rather open-ended (w.r.t. learning systems).
Since you're using finite state machines already, try having a look at KB-NEAT.
It's a neuroevolution technique; that is, the creation of neural networks through evolution.
Also have a look at rtNEAT, which may come in handy. An ordinary implementation of NEAT uses a generational approach: it runs a number of games, say a hundred, selects the best candidates, and creates offspring from these. (Others mentioned fitness; it is always used in an evolutionary approach, and thus also here.) rtNEAT allows the evolution to occur in real time, that is, while playing a game. (That requires a more sophisticated fitness calculation, as it happens mid-game, when you still do not know the outcome.)
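To make the generational loop concrete, here is a minimal sketch using the third-party neat-python package (my choice for the example; the links below point to other implementations). Each genome is turned into a network, scored over a few simulated games, and assigned a fitness; the library then handles speciation and reproduction. The config file path and simulate_game() are hypothetical stand-ins:

```python
# Minimal generational NEAT loop, assuming the neat-python package.
import neat

def simulate_game(net):
    # Hypothetical stand-in: a real version would feed per-frame game
    # observations into net.activate(...) and return the outcome score.
    return 0.0

def eval_genomes(genomes, config):
    for genome_id, genome in genomes:
        net = neat.nn.FeedForwardNetwork.create(genome, config)
        # Fitness = average score over a few games played by this brain.
        genome.fitness = sum(simulate_game(net) for _ in range(3)) / 3.0

config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                     neat.DefaultSpeciesSet, neat.DefaultStagnation,
                     "neat-config.ini")    # hypothetical config file
population = neat.Population(config)
winner = population.run(eval_genomes, 100)  # evolve 100 generations
```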
The implementation is actually not that hard, but the key to this technique is the genetic history, which is vital to the evolutionary process. (It is also what makes this technique so effective compared to earlier attempts at neuroevolution, where crossover required the two parent networks' structures to match up, which may not be the case.)
Oh, and your problem can be solved either by a planner at the higher level, or the units may learn it themselves. An input that includes the nearest friendly units and obstacles could, with the right fitness, learn that it's counterproductive to clog the pipe, so to speak. This is called emergent behaviour, and it has been shown that the above techniques are capable of evolving such behaviour autonomously.
Here's an implementation that I find very nice to base your work on:
http://nn.cs.utexas.edu/?windowsneat
The above uses generations; I have seen no implementation of rtNEAT. But you can have a look at the book "Adaptation in Natural and Artificial Systems" by John Holland. Admittedly, it may be hard to read, as it's very mathematical, but you can skip most of it and look at the proposed algorithms. It's general to all evolutionary algorithms, of which neuroevolution is a subfield, and it has an algorithm that is essentially what rtNEAT uses. (And if you are unfamiliar with the genetics terminology used in evolutionary algorithms, it nicely defines what a gene, allele, chromosome, phenotype, and genome are, terms you'll find used in the NEAT publications. NEAT uses "genome" to describe things in general; a genome is just a set of chromosomes that together describe the phenotype, as the encoding is a bit more involved than in plain Genetic Algorithms and Genetic Programming.)
The homepage of the technique is here:
http://www.cs.ucf.edu/~kstanley/neat.html
Here are the publications in chronological order:
http://nn.cs.utexas.edu/keyword?stanley:ec02
http://nn.cs.utexas.edu/keyword?stanley:ieeetec05
http://nn.cs.utexas.edu/?kbneat
(KB-NEAT already uses rtNEAT in the above publication.)
The point is that you can basically take what you have, put it into a neuroevolutionary technique, and evolve from there. It's a mix of domain-specific AI and machine-learning AI.
One note: the evolution part is processor-intensive, at least without rtNEAT. rtNEAT is instead time-intensive, as you need to play a lot against it before it learns. (KB-NEAT gives it a base of intelligence, though, obviously.) However, once evolved, it is very, very fast, as the response of a neural network is very fast to compute. (It's a rather small graph, and there's no search involved.)
A second note: you need to think hard about input and output. Output may be easy, as it is whatever the game allows your units to do. But input is what you want them to see, and you cannot include everything; that would make the problem too hard for the evolution to solve, at least in realistic time. (Although it will, theoretically, converge to the optimal solution given infinite time.)
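To illustrate the kind of design choice involved, here is a sketch of one possible fixed-length encoding for a single marine's brain. The chosen features (own health; distance and direction to the nearest friend, foe, and obstacle) and the output scheme are assumptions invented for the example:

```python
# Hypothetical input/output encoding for one unit's neural network.
# Which features to include, and how to scale them, is exactly the
# design decision discussed above; these are example choices only.
import math

def encode_inputs(unit, nearest_friend, nearest_foe, nearest_obstacle):
    """Return a fixed-length, roughly [0, 1]-scaled observation vector."""
    def rel(other):
        dx, dy = other.x - unit.x, other.y - unit.y
        dist = math.hypot(dx, dy)
        return [min(dist / 512.0, 1.0),        # normalized distance
                math.atan2(dy, dx) / math.pi]  # direction in [-1, 1]
    return ([unit.hp / unit.max_hp]
            + rel(nearest_friend) + rel(nearest_foe) + rel(nearest_obstacle))

def decode_outputs(outputs):
    """Map network outputs to a discrete command (example scheme)."""
    actions = ["move_north", "move_south", "move_east", "move_west",
               "attack_nearest", "hold"]
    return actions[max(range(len(actions)), key=lambda i: outputs[i])]
```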
A third note: you can evolve several brains for the units, and even have different brains for each unit type. The sky is the limit. Maybe you want a brain for each technology level of yours or the enemy's. This takes extra time to evolve, of course, but the brains are small in memory, so the number of them is not a problem.
A fourth note: this is a black-box technique. You cannot convert the brain back to an FSM, I'm afraid. The encoding in a neural network is not human-understandable, so it cannot be known exactly how it works. There is the danger that you'll end up with something you like but cannot understand why it works, and you cannot easily share that knowledge with other agents. (Although you can, of course, use it as a base to evolve new behaviour for those agents.)
Doesn't the StarCraft AI already implement the movement of units? E.g. just select 12 marines and then tell them to move across the bridge like a human player would. Of course, that AI is highly flawed and nowhere near as good as the swarm AI of StarCraft II. However, I think there are many other issues in designing an AI before you need to worry about micromanaging each individual unit.
For example, knowing where/when to position your units is arguably more important than figuring out how to do it with 100% efficiency.
I personally think the best way to approach this game is using state machines. There are key stages in a game that have proper responses (e.g. are lurkers out? are science vessels out? etc.). But instead of making each individual unit an SM, focus on a larger entity such as a control group of units. I believe this simplifies things greatly, and later on you can improve your AI's micro if need be. A sketch of such a control-group state machine follows.
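For example, here is a minimal control-group state machine. The states, transitions, and scouting triggers (following the lurker/science-vessel examples above) are illustrative assumptions, not a complete strategy:

```python
# Minimal control-group state machine; states and triggers are
# illustrative assumptions following the examples in the answer.
from enum import Enum, auto

class GroupState(Enum):
    GATHER = auto()   # assemble units at a rally point
    ADVANCE = auto()  # move toward the objective
    ENGAGE = auto()   # fight visible enemies
    RETREAT = auto()  # fall back when outmatched

class ControlGroup:
    def __init__(self, units):
        self.units = units
        self.state = GroupState.GATHER

    def update(self, intel):
        """intel: hypothetical per-frame scouting summary."""
        if self.state == GroupState.GATHER and len(self.units) >= 12:
            self.state = GroupState.ADVANCE
        elif self.state == GroupState.ADVANCE and intel.enemies_visible:
            self.state = GroupState.ENGAGE
        elif self.state == GroupState.ENGAGE and intel.lurkers_out:
            # No detection in this group: retreating is the proper response.
            self.state = GroupState.RETREAT
        elif self.state == GroupState.RETREAT and not intel.enemies_visible:
            self.state = GroupState.GATHER
```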
I think actually making your characters learn is a bad idea. They will develop consciousness, break out of the computer and try to strangle you in your sleep with a mouse-cable.
Ok, kidding. I think it's just over-engineering. Evolving a character will not necessarily make it react robustly to all eventualities, but it is likely to make it very hard to work on, to extend, and to keep an overview of its attributes.
You might be interested in a goal-based approach:
http://www.media.mit.edu/~jorkin/gdc2006_orkin_jeff_fear.pdf
It's a bit of an effort to begin with, but in the long run it makes it much cleaner and easier to extend your dudes' behaviour. A tiny sketch of the planning idea follows.
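As a rough illustration of the goal-based (GOAP-style) idea from the linked slides: actions declare preconditions and effects over a symbolic world state, and a planner searches for an action sequence that reaches the goal. The actions and state keys here are invented for the example:

```python
# Tiny GOAP-style forward planner (brute-force BFS; the actions and
# world-state facts are invented for illustration).
from collections import deque

ACTIONS = {
    # name: (preconditions, effects) over a dict of boolean facts
    "find_ammo":    ({},                      {"has_ammo": True}),
    "load_weapon":  ({"has_ammo": True},      {"weapon_loaded": True}),
    "attack_enemy": ({"weapon_loaded": True}, {"enemy_dead": True}),
}

def plan(state, goal):
    """BFS over world states; returns a list of action names or None."""
    start = frozenset(state.items())
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        current, steps = frontier.popleft()
        facts = dict(current)
        if all(facts.get(k) == v for k, v in goal.items()):
            return steps
        for name, (pre, eff) in ACTIONS.items():
            if all(facts.get(k) == v for k, v in pre.items()):
                nxt = frozenset({**facts, **eff}.items())
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None

print(plan({}, {"enemy_dead": True}))
# -> ['find_ammo', 'load_weapon', 'attack_enemy']
```

The appeal over a state machine is that you only author actions and goals; the sequencing falls out of the search, so unexpected situations don't each need a hand-written transition.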
Instead of giving you a suggestion for specific algorithms or techniques, I think it'd help to start by addressing specifically the problem that you're trying to solve. I play StarCraft quite often, so I know what you mean by "big traffic jam." However, "big traffic jam" is just a label, and meaningless in a machine learning context.
A bot, in your particular domain, could be said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.
We now have to define E, T, and P in a way that's meaningful for each problem you're likely to encounter in the game. For example, the class of tasks might include moving units from one area to another as a group. That area might have some features indicating that it's narrow, and therefore impossible to move the units in the optimal group size. So the performance measure might be the time it takes, or it might be the unit flux (the movement of some quantity of marines per unit time per unit area). You'd of course want to measure this flux through a volume, which is just the normalized sum of flux measurements. With experience, you'd be maximizing this flux.
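As a concrete sketch of that performance measure P, here is one way to log unit flux through a chokepoint. The chokepoint geometry (a horizontal line across the bridge) and the frame-based sampling are assumptions for the example:

```python
# Hypothetical measurement of unit flux through a chokepoint: how many
# marines cross a line per second. The geometry is an assumption.

class FluxMeter:
    def __init__(self, crossing_y, fps=24):
        self.crossing_y = crossing_y  # horizontal line across the bridge
        self.fps = fps
        self.last_side = {}           # unit id -> was it below the line?
        self.crossings = 0
        self.frames = 0

    def update(self, units):
        """Call once per game frame with the current unit positions."""
        self.frames += 1
        for u in units:
            below = u.y < self.crossing_y
            if u.id in self.last_side and self.last_side[u.id] != below:
                self.crossings += 1   # this unit crossed the line
            self.last_side[u.id] = below

    def flux(self):
        """Crossings per second; the quantity to maximize with experience."""
        seconds = self.frames / self.fps
        return self.crossings / seconds if seconds else 0.0
```

A traffic jam then stops being a vague label and becomes a measurably low flux, which any learning method can be told to improve.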
Once you understand that problem a little better, you can then begin coming up with designs that best embody all of your requirements.