Monday, July 19, 2021

Convergence

Will the Q-learning algorithm converge to an estimate equal to the true Q function?

Yes, under certain conditions.

  1. Assume the system is a deterministic MDP.
  2. Assume the immediate reward values are bounded; that is, there exists some positive constant c such that |r(s, a)| < c for all states s and actions a.
  3. Assume the agent selects actions in such a way that it visits every possible state-action pair infinitely often.
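
To make the three conditions concrete, here is a minimal sketch on a made-up example. Everything in it is an assumption for illustration and not from the post: a four-state chain MDP, the discount factor GAMMA = 0.9, the helper names, and the choice to sample state-action pairs uniformly at random as a finite stand-in for "visited infinitely often". It uses the deterministic update Q(s, a) <- r + GAMMA * max Q(s', a') and compares the learned table with a reference table obtained by repeated Bellman backups.

```python
import random

# Hypothetical toy problem: a chain of 4 states, actions 0 = left, 1 = right.
N_STATES, N_ACTIONS = 4, 2
GAMMA = 0.9  # assumed discount factor, 0 <= GAMMA < 1


def step(s, a):
    """Deterministic transition and bounded reward (condition 1 and 2)."""
    s_next = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    reward = 1.0 if s_next == N_STATES - 1 else 0.0  # |r(s, a)| < c with c = 2
    return s_next, reward


def zero_table():
    return [[0.0] * N_ACTIONS for _ in range(N_STATES)]


def true_q(sweeps=2000):
    """Reference Q table from repeated synchronous Bellman backups."""
    q = zero_table()
    for _ in range(sweeps):
        new_q = zero_table()
        for s in range(N_STATES):
            for a in range(N_ACTIONS):
                s_next, r = step(s, a)
                new_q[s][a] = r + GAMMA * max(q[s_next])
        q = new_q
    return q


def q_learning(n_updates=20000, seed=0):
    """Tabular Q-learning with the deterministic update rule."""
    rng = random.Random(seed)
    q = zero_table()
    for _ in range(n_updates):
        # Uniform sampling visits every pair again and again (condition 3).
        s = rng.randrange(N_STATES)
        a = rng.randrange(N_ACTIONS)
        s_next, r = step(s, a)
        q[s][a] = r + GAMMA * max(q[s_next])
    return q


def max_abs_error(q1, q2):
    """Largest absolute difference between two Q tables."""
    return max(
        abs(q1[s][a] - q2[s][a])
        for s in range(N_STATES)
        for a in range(N_ACTIONS)
    )


if __name__ == "__main__":
    print("max |Q_hat - Q|:", max_abs_error(q_learning(), true_q()))
```

In this sketch the largest error in the table cannot grow, and it shrinks by at least a factor of GAMMA over each stretch of updates in which every state-action pair has been visited at least once, which is why the "infinitely often" condition matters. If some pair were never sampled, its entry would stay at its initial value and the table need not match the true Q function.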
