Off-policy and on-policy reinforcement learning with the Tsetlin Machine

SUMMARY

The Tsetlin Machine (TM) is a novel supervised learning algorithm that combines learning automata and propositional logic to describe frequent data patterns. The authors demonstrate that bootstrapping is viable for TM learning when no prelabelled training set is available. A key challenge they address is mapping the intrinsically continuous nature of state-value learning in reinforcement learning onto the propositional nature of the TM, which they do by leveraging probabilistic updates. On-policy, this mechanism learns significantly slower than neural networks. The authors introduce on-policy learning with the TM in Section 3.2, covering multi-step approaches in Section . . .
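
To make the idea of bootstrapping with probabilistic updates more concrete, here is a minimal Python sketch. It is not the authors' mechanism, which this summary does not spell out. It assumes a standard one-step temporal-difference target and, purely for illustration, converts that continuous target into a binary training label by sampling. The function names, the discount factor `gamma`, and the value bounds `low` and `high` are all hypothetical.

```python
import random


def td_target(reward, next_value, gamma=0.9, done=False):
    """One-step bootstrapped target: r + gamma * V(s').

    The target is built from the learner's own estimate of the next
    state, so no prelabelled training set is required.
    """
    if done:
        return reward
    return reward + gamma * next_value


def stochastic_label(target, low, high, rng):
    """Turn a continuous target into a 0/1 label (illustrative assumption).

    The target is clipped to [low, high] and rescaled to a probability p;
    the label is 1 with probability p and 0 otherwise.
    """
    clipped = min(max(target, low), high)
    p = (clipped - low) / (high - low)
    return 1 if rng.random() < p else 0


rng = random.Random(0)
target = td_target(reward=1.0, next_value=2.0, gamma=0.5)  # 1.0 + 0.5 * 2.0
labels = [stochastic_label(target, low=0.0, high=4.0, rng=rng) for _ in range(10_000)]
mean_label = sum(labels) / len(labels)
```

Averaged over many samples, the binary labels approach the rescaled target (here 0.5), which is one way a learner that only sees propositional, yes/no feedback could still track a continuous value estimate.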
