Hédi Hadiji

Maitre de Conférence

L2S CentraleSupélec

Thompson Sampling For Combinatorial Bandits: Polynomial Regret and Mismatched Sampling Paradox

We show that the lower bound for Linear Bandit for ellipsoidal action set is $\Omega(\min(d \sigma \sqrt{T} + d \|\theta\|_{A}, \|\theta\|_{A} T))$ and that for any algorithm hard …

Raymond Zhang

• Jun 30, 2025 • 1 min read