I changed the algorithm a little bit to make it Q-learning. It's far slower lol. Surprisingly, the simpler algorithm of just storing cell states and subtracting a bit every move actually does better than Q-learning, which uses a more advanced update rule over state-action pairs. Ideally I should make it repeat the same maze multiple times, maybe with different starting positions and whatnot, but it works. I changed the algorithm a bit and turned it into a Q-learning model. I was a little surprised that @Sak_Dai's algorithm learns faster. I've made the learning rate and other parameters adjustable.
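For anyone wondering what the state-action update looks like, here's a rough Python sketch of the standard tabular Q-learning rule. It's not the project's actual code: the names `q_update`, `ACTIONS`, `alpha` (learning rate) and `gamma` (discount factor), the reward values, and the `(row, col)` state format are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical action set for a grid maze (an assumption, not from the project).
ACTIONS = ("up", "down", "left", "right")


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q maps (state, action) pairs to estimated returns; unseen pairs default to 0.
    alpha is the learning rate and gamma the discount factor.
    """
    # Value of the best action available from the next state.
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    # Target: immediate reward plus discounted best future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Q-table keyed by ((row, col), action); every entry starts at 0.0.
q = defaultdict(float)

# One step: move right from (0, 0) to (0, 1) with an assumed step penalty of -1.
first = q_update(q, (0, 0), "right", -1.0, (0, 1))

# Another step: move right from (0, 1) into an assumed goal cell worth +10.
second = q_update(q, (0, 1), "right", 10.0, (0, 2))
```

Since there's one value per state-action pair instead of one per cell, the table has several times as many entries to fill in before the estimates settle, which might be part of why it learns slower on a small maze.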
Yeah… Turbo mode might help. Turbo mode really matters, doesn't it. I just realized this is my first time implementing a reinforcement learning algorithm. I've done so many AI projects and not a single one of this type lol