Discrete Upper Confidence Bound Algorithm in HOL

Abstract

This project formally verifies the Discrete Upper Confidence Bound (UCB) algorithm in Isabelle/Higher-order Logic (HOL), focusing on its probabilistic guarantees and regret bounds. The work extends Isabelle/HOL's probabilistic framework and explores the verification of discrete-time bandit models following [1]. It contributes to the formal verification of probabilistic algorithms in reinforcement learning.
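For readers unfamiliar with the algorithm, the following is a minimal illustrative sketch in Python, not the formalized Isabelle development. It assumes the classical UCB1 index (empirical mean plus an exploration bonus of sqrt(2 ln t / n)) and Bernoulli rewards. The exact index and reward model verified in this entry may differ, and the names `ucb_select` and `run_ucb` are hypothetical.

```python
import math
import random


def ucb_select(counts, sums, t):
    """Pick an arm at round t (1-based) using the UCB1 index.

    Assumption: index = empirical mean + sqrt(2 * ln(t) / n_arm).
    """
    # Play every arm once before using the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm

    def index(arm):
        mean = sums[arm] / counts[arm]
        bonus = math.sqrt(2.0 * math.log(t) / counts[arm])
        return mean + bonus

    # max() returns the first arm with the largest index on ties.
    return max(range(len(counts)), key=index)


def run_ucb(means, horizon, seed=0):
    """Simulate UCB1 on Bernoulli arms with the given (hypothetical) means.

    Returns the pull counts per arm and the pseudo-regret, i.e. the sum
    over arms of (best mean - arm mean) * number of pulls.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, horizon + 1):
        arm = ucb_select(counts, sums, t)
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    best = max(means)
    regret = sum((best - means[a]) * counts[a] for a in range(k))
    return counts, regret


if __name__ == "__main__":
    pulls, regret = run_ucb([0.2, 0.5, 0.8], horizon=1000)
    print("pulls per arm:", pulls)
    print("pseudo-regret:", regret)
```

The pseudo-regret computed here is the quantity that regret bounds of this kind control: it grows only with the number of pulls of suboptimal arms.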

License

BSD License

Topics

Computer science/Machine learning

Session Discrete-UCB

MSc_Project_Discrete_Prop15_1
Discrete_UCB_Step1
Discrete_UCB_Step2
Discrete_UCB_Step3
