Q Learning

STSTMCXVI • Created February 8, 2015

1,209 views

Instructions

Q-Learning (still a work in progress). Run it in turbo mode (Shift + Green Flag). It runs without errors. The purple maze sprite can be placed on the grid if needed.

The Q-Learning algorithm goes as follows:

1. Set the gamma parameter and the environment rewards in matrix R.
2. Initialize matrix Q to zero.
3. For each episode:
   Select a random initial state.
   Do while the goal state hasn't been reached:
      Select one among all possible actions for the current state.
      Using this possible action, consider going to the next state.
      Get the maximum Q value for this next state based on all possible actions.
      Compute: Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]
      Set the next state as the current state.
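The steps above can be sketched in Python. This is only an illustration, not the Scratch project's own code: the five-cell corridor, the reward of 100 for entering the goal, gamma = 0.8, the episode count, and the names step, build_rewards, q_learning and greedy_policy are all assumptions made for the example.

```python
import random

GAMMA = 0.8          # discount factor (assumed value)
N_STATES = 5         # hypothetical corridor with cells 0..4
GOAL = 4             # goal cell at the right end
ACTIONS = (-1, +1)   # action 0 = step left, action 1 = step right


def step(state, action):
    """Deterministic transition; walls clamp the move to the corridor."""
    return min(max(state + ACTIONS[action], 0), N_STATES - 1)


def build_rewards():
    """Step 1: R[s][a] is 100 when the move lands on the goal, else 0."""
    return [[100.0 if step(s, a) == GOAL else 0.0
             for a in range(len(ACTIONS))]
            for s in range(N_STATES)]


def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    R = build_rewards()
    # Step 2: initialize matrix Q to zero.
    Q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    # Step 3: run the episodes.
    for _ in range(episodes):
        state = rng.randrange(N_STATES)           # random initial state
        while state != GOAL:                      # until the goal is reached
            action = rng.randrange(len(ACTIONS))  # pick any possible action
            next_state = step(state, action)      # consider the next state
            # Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]
            Q[state][action] = R[state][action] + GAMMA * max(Q[next_state])
            state = next_state                    # next state becomes current
    return Q


def greedy_policy(Q):
    """Best action index per state according to the learned Q matrix."""
    return [max(range(len(ACTIONS)), key=lambda a: Q[s][a])
            for s in range(N_STATES)]


if __name__ == "__main__":
    Q = q_learning()
    for s, row in enumerate(Q):
        print(s, [round(q, 2) for q in row])
    print("greedy actions:", greedy_policy(Q))
```

Because the sketch picks actions uniformly at random and overwrites Q directly (no learning rate), it follows the update rule exactly as written above. In this toy corridor the learned values fall by a factor of gamma for each step away from the goal, so the greedy policy in every non-goal cell is to move right.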

Notes & Credits

Q-learning is a model-free reinforcement learning technique.

Project Details

Project ID: 47028822

Created: February 8, 2015

Last Modified: April 4, 2015

Shared: February 8, 2015

Comments: Allowed