Discrete Upper Confidence Bound Algorithm in HOL

Arjan Faber 📧

October 25, 2025

Abstract

This project formally verifies the Discrete Upper Confidence Bound (UCB) algorithm in Isabelle/Higher-order Logic (HOL), focusing on its probabilistic guarantees and regret bounds. The work extends Isabelle/HOLs probabilistic framework and explores verification of discrete-time bandit models following [1]. This research advances the formal verification of probabilistic algorithms in reinforcement learning.

License

BSD License

Topics

Session Discrete-UCB