I'm doing this for some homework. If you'd like to help, comment the AI score you got after the project ends! The "times" parameter is a number between 1 and 10. Sak_Dai and ayten02 also helped with the projects, so thanks to them!

Additional details:
1. The shortest path through the maze is calculated first. Let's call its length x.
2. The AI (a Q-learning model) trains on times*100 trials, each starting from a random position and trying to reach the goal.
3. After the times*100 trials, the AI goes through the maze one last time, and we count the steps it took. Let's call this y.
4. The formula for the "AI score" is 100(y/x-1), where 0 means the AI did perfectly (it used exactly as many steps as the shortest path).
5. Our experiment will test the AI under different numbers of training iterations.
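For anyone curious how the five steps fit together, here is a rough Python sketch. It is not the project's actual code. The maze layout, the reward values, the learning settings (alpha, gamma, epsilon), the 200-step cap, and all function names are made up for illustration, and it assumes the final run begins at the maze's start square.

```python
import random
from collections import deque

# Hypothetical 5x5 maze: '#' is a wall, 'S' the start, 'G' the goal.
MAZE = [
    "S....",
    ".##.#",
    ".#...",
    ".#.#.",
    "...#G",
]
ROWS, COLS = len(MAZE), len(MAZE[0])
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
OPEN = [(r, c) for r in range(ROWS) for c in range(COLS) if MAZE[r][c] != "#"]
START = next(p for p in OPEN if MAZE[p[0]][p[1]] == "S")
GOAL = next(p for p in OPEN if MAZE[p[0]][p[1]] == "G")


def step(pos, action):
    """Move one cell; hitting a wall or the edge leaves the position unchanged."""
    r, c = pos[0] + MOVES[action][0], pos[1] + MOVES[action][1]
    if 0 <= r < ROWS and 0 <= c < COLS and MAZE[r][c] != "#":
        return (r, c)
    return pos


def shortest_path_length(start, goal):
    """Step 1: breadth-first search gives x, the fewest steps to the goal."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        pos = queue.popleft()
        if pos == goal:
            return dist[pos]
        for a in range(4):
            nxt = step(pos, a)
            if nxt not in dist:
                dist[nxt] = dist[pos] + 1
                queue.append(nxt)
    raise ValueError("goal is unreachable")


def train(times, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=200, seed=0):
    """Step 2: times*100 epsilon-greedy Q-learning trials from random starts."""
    rng = random.Random(seed)
    q = {(p, a): 0.0 for p in OPEN for a in range(4)}
    for _ in range(times * 100):
        pos = rng.choice(OPEN)
        for _ in range(max_steps):
            if pos == GOAL:
                break
            if rng.random() < epsilon:
                a = rng.randrange(4)  # explore
            else:
                a = max(range(4), key=lambda act: q[(pos, act)])  # exploit
            nxt = step(pos, a)
            reward = 1.0 if nxt == GOAL else -0.01  # assumed reward scheme
            best_next = 0.0 if nxt == GOAL else max(q[(nxt, b)] for b in range(4))
            q[(pos, a)] += alpha * (reward + gamma * best_next - q[(pos, a)])
            pos = nxt
    return q


def final_run(q, max_steps=200):
    """Step 3: follow the learned policy greedily from the start; returns y."""
    pos, steps = START, 0
    while pos != GOAL and steps < max_steps:
        pos = step(pos, max(range(4), key=lambda act: q[(pos, act)]))
        steps += 1
    return steps


def ai_score(y, x):
    """Step 4: percentage of extra steps over the shortest path; 0 is perfect."""
    return 100 * (y / x - 1)


if __name__ == "__main__":
    x = shortest_path_length(START, GOAL)
    for times in range(1, 11):  # Step 5: try every allowed value of "times"
        y = final_run(train(times))
        print(times, y, ai_score(y, x))
```

As a quick check of the formula: if the shortest path is x = 8 steps and the AI needs y = 12, the score is 100(12/8-1) = 50, meaning the AI took 50% more steps than necessary. In this sketch an AI that never reaches the goal is stopped at the step cap, so its score is large but finite.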