reinforcement-learning
What machine learning algorithm should I use for Connect 4?
I have an AI that is good at playing Connect 4 (using minimax). Now I want to use some machine learning algorithm to learn from this AI that I have, and I would like to do that by just letting them pl…
2023-04-07 01:56 Category: Q&A
XOR Hebbian test/example neural network
I just finished writing some code that runs a Hebbian learning feedforward neural network. I've done a backpropagation neural network before and the first thing I did to make sure it w…
2023-04-02 12:13 Category: Q&A
Are neural networks really abandonware?
I am planning to use neural networks for approximating a value function in a reinforcement learning algorithm. I want to do that to introduce some generalization and flexibility on how I represent sta…
2023-03-24 20:10 Category: Q&A
How to Learn the Reward Function in a Markov Decision Process
What's the appropriate way to update your R(s) function during Q-learning? For example, say an agent visits state s1 five times, and receives rewards [0,0,1,1,0]. Should I calcula…
2023-03-20 18:05 Category: Q&A
C++ Reinforcement learning and smart pointers
I am doing my Master's project on robotic sensorimotor online learning using reinforcement learning methods (Q, SARSA, TD(λ), Actor-Critic, R, etc.). I am currently designing the framework on which both…
2023-03-18 18:05 Category: Q&A
How to train an artificial neural network to play Diablo 2 using visual input?
I'm currently trying to get an ANN to play a video game and I was hoping to get some help from the wonderful community here.
2023-03-16 09:31 Category: Q&A
SARSA algorithm
I am having trouble understanding the SARSA algorithm: http://en.wikipedia.org/wiki/SARSA In particular, when updating the Q value, what is gamma? And what values are used fo…
2023-03-07 07:43 Category: Q&A
Reducing the number of Markov states in reinforcement learning
I've started toying with reinforcement learning (using the Sutton book). What I fail to fully understand is the paradox between having to reduce the Markov state space while on the other hand not making a…
2023-02-11 10:47 Category: Q&A
TD(λ) in Delphi/Pascal (Temporal Difference Learning)
I have an artificial neural network which plays Tic-Tac-Toe, but it is not complete yet. What I have so far:
2023-02-07 19:24 Category: Q&A
Implementing HexQ Algorithm [closed]
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
2023-01-18 15:01 Category: Q&A