Odalric-Ambrym Maillard
Publications
From Optimality to Robustness: Dirichlet Sampling Strategies in Stochastic Bandits. NeurIPS 2021 - 35th International Conference on Neural Information Processing Systems, Dec 2021, Sydney, Australia
Conference paper
hal-03421252v2
Indexed Minimum Empirical Divergence for Unimodal Bandits. NeurIPS 2021 - International Conference on Neural Information Processing Systems, Dec 2021, Virtual-only Conference, United States
Conference paper
hal-03446617v1
Reinforcement Learning in Parametric MDPs with Exponential Families. International Conference on Artificial Intelligence and Statistics, 2021, San Diego, United States. pp.1855-1863
Conference paper
hal-03472116v1
Stochastic bandits with groups of similar arms. NeurIPS 2021 - Thirty-fifth Conference on Neural Information Processing Systems, Dec 2021, Sydney, Australia
Conference paper
hal-03427597v1
Optimal Thompson Sampling strategies for support-aware CVaR bandits. 38th International Conference on Machine Learning, Jul 2021, Virtual, United States
Conference paper
hal-03447244v1
Sub-sampling for Efficient Non-Parametric Bandit Exploration. NeurIPS 2020, Dec 2020, Vancouver, Canada
Conference paper
hal-02977552v1
Restarted Bayesian Online Change-point Detector achieves Optimal Detection Delay. International Conference on Machine Learning, Jul 2020, Vienna, Austria
Conference paper
hal-03021712v1
Tightening Exploration in Upper Confidence Reinforcement Learning. International Conference on Machine Learning, Jul 2020, Vienna, Austria
Conference paper
hal-03000664v1
Model-Based Reinforcement Learning Exploiting State-Action Equivalence. ACML 2019, Proceedings of Machine Learning Research, Nov 2019, Nagoya, Japan. pp.204-219
Conference paper
hal-02378887v1
Learning Multiple Markov Chains via Adaptive Allocation. Advances in Neural Information Processing Systems 32 (NIPS 2019), Dec 2019, Vancouver, Canada
Conference paper
hal-02387345v1
Regret Bounds for Learning State Representations in Reinforcement Learning. Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada
Conference paper
hal-02375715v1
Budgeted Reinforcement Learning in Continuous State Space. Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada
Conference paper
hal-02375727v1
Sequential change-point detection: Laplace concentration of scan statistics and non-asymptotic delay bounds. Algorithmic Learning Theory, 2019, Chicago, United States. pp.1-23
Conference paper
hal-02351665v1
Practical Open-Loop Optimistic Planning. European Conference on Machine Learning, Sep 2019, Würzburg, Germany
Conference paper
hal-02375697v1
Boundary Crossing for General Exponential Families. Algorithmic Learning Theory, Oct 2017, Kyoto, Japan. pp.1-34
Conference paper
hal-01615427v1
Spectral Learning from a Single Trajectory under Finite-State Policies. International Conference on Machine Learning, Jul 2017, Sydney, Australia
Conference paper
hal-01590940v1
Efficient tracking of a growing number of experts. Algorithmic Learning Theory, Oct 2017, Tokyo, Japan. pp.1-23
Conference paper
hal-01615424v1
Optimal Strategies for Graph-Structured Bandits. 2020
Preprint, working paper
hal-02891139v2
Forced-exploration free Strategies for Unimodal Bandits. 2020
Preprint, working paper
hal-02883907v1