Odalric-Ambrym Maillard

69
Documents

Publications

Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits

Lilian Besson , Emilie Kaufmann , Odalric-Ambrym Maillard , Julien Seznec
Journal of Machine Learning Research, 2022
Journal article hal-02006471v3

Reinforcement Learning for crop management

Romain Gautron , Odalric-Ambrym Maillard , Philippe Preux , Marc Corbeels , Régis Sabbadin
Computers and Electronics in Agriculture, 2022, 200, pp.107182. ⟨10.1016/j.compag.2022.107182⟩
Journal article hal-03834290v1

Collaborative Algorithms for Online Personalized Mean Estimation

Mahsa Asadi , Aurélien Bellet , Odalric-Ambrym Maillard , Marc Tommasi
Transactions on Machine Learning Research Journal, 2022
Journal article hal-03905917v1

Local Dvoretzky-Kiefer-Wolfowitz confidence bands

Odalric-Ambrym Maillard
Mathematical Methods of Statistics, 2022, ⟨10.3103/S1066530721010038⟩
Journal article hal-03780573v1

Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs

Mohammad Sadegh Talebi , Odalric-Ambrym Maillard
Journal of Machine Learning Research, in press, pp.1-36
Journal article hal-01737142v1

Boundary Crossing Probabilities for General Exponential Families

Odalric-Ambrym Maillard
Mathematical Methods of Statistics, 2018, 27, pp.1-31. ⟨10.3103/S1066530718010015⟩
Journal article hal-01737150v1

Streaming kernel regression with provably adaptive mean, variance, and regularization

Audrey Durand , Odalric-Ambrym Maillard , Joelle Pineau
Journal of Machine Learning Research, 2018, 1, pp.1 - 48
Journal article hal-01927007v1

The Non-stationary Stochastic Multi-armed Bandit Problem

Robin Allesiardo , Raphaël Féraud , Odalric-Ambrym Maillard
International Journal of Data Science and Analytics, 2017, 3 (4), pp.267-283. ⟨10.1007/s41060-017-0050-5⟩
Journal article hal-01575000v1

Concentration inequalities for sampling without replacement

Rémi Bardenet , Odalric-Ambrym Maillard
Bernoulli, 2015, 21 (3), pp.1361-1385. ⟨10.3150/14-BEJ605⟩
Journal article hal-01216652v1

Latent Bandits

Odalric-Ambrym Maillard , Shie Mannor
JFPDA, 2014, pp.05
Journal article hal-00990804v1

Sub-sampling for Multi-armed Bandits

Akram Baransi , Odalric-Ambrym Maillard , Shie Mannor
Proceedings of the European Conference on Machine Learning, 2014, pp.13
Journal article hal-01025651v2

Kullback-Leibler Upper Confidence Bounds for Optimal Sequential Allocation

Olivier Cappé , Aurélien Garivier , Odalric-Ambrym Maillard , Rémi Munos , Gilles Stoltz
Annals of Statistics, 2013, 41 (3), pp.1516-1541. ⟨10.1214/13-AOS1119⟩
Journal article hal-00738209v2

Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits

Dorian Baudry , Fabien Pesquerel , Rémy Degenne , Odalric-Ambrym Maillard
NeurIPS 2023 - Thirty-seventh Conference on Neural Information Processing Systems, Dec 2023, New Orleans (Louisiana), United States
Conference paper hal-04337742v1

Risk-aware linear bandits with convex loss

Patrick Saux , Odalric-Ambrym Maillard
International Conference on Artificial Intelligence and Statistics (AISTATS), Apr 2023, Valencia, Spain
Conference paper hal-04044440v1

Farm-gym: A modular reinforcement learning platform for stochastic agronomic games

Odalric-Ambrym Maillard , Timothée Mathieu , Debabrota Basu
AIAFS 2023 - Artificial Intelligence for Agriculture and Food Systems, Feb 2023, Washington DC, United States
Conference paper hal-03960683v1

Learning crop management by reinforcement: gym-DSSAT

Romain Gautron , Emilio J Padrón , Philippe Preux , Julien Bigot , Odalric-Ambrym Maillard
AIAFS 2023 - 2nd AAAI Workshop on AI for Agriculture and Food Systems, Feb 2023, Washington DC, United States
Conference paper hal-03976393v1

Risk-aware linear bandits with convex loss

Patrick Saux , Odalric-Ambrym Maillard
European Workshop on Reinforcement Learning, Sep 2022, Milan, Italy
Conference paper hal-03776680v1

IMED-RL: Regret optimal learning of ergodic Markov decision processes

Fabien Pesquerel , Odalric-Ambrym Maillard
NeurIPS 2022 - Thirty-sixth Conference on Neural Information Processing Systems, Nov 2022, New Orleans, United States
Conference paper hal-03825423v1

Indexed Minimum Empirical Divergence for Unimodal Bandits

Hassan Saber , Pierre Ménard , Odalric-Ambrym Maillard
NeurIPS 2021 - International Conference on Neural Information Processing Systems, Dec 2021, Virtual-only Conference, United States
Conference paper hal-03446617v1

Optimal Thompson Sampling strategies for support-aware CVaR bandits

Dorian Baudry , Romain Gautron , Emilie Kaufmann , Odalric-Ambrym Maillard
38th International Conference on Machine Learning, Jul 2021, Virtual, United States
Conference paper hal-03447244v1

Stochastic bandits with groups of similar arms

Fabien Pesquerel , Hassan Saber , Odalric-Ambrym Maillard
NeurIPS 2021 - Thirty-fifth Conference on Neural Information Processing Systems, Dec 2021, Sydney, Australia
Conference paper hal-03427597v1

Improved Exploration in Factored Average-Reward MDPs

Sadegh Talebi , Anders Jonsson , Odalric-Ambrym Maillard
24th International Conference on Artificial Intelligence and Statistics, 2021, San Diego (virtual), United States
Conference paper hal-03780564v1

Reinforcement Learning in Parametric MDPs with Exponential Families

Sayak Ray Chowdhury , Aditya Gopalan , Odalric-Ambrym Maillard
International Conference on Artificial Intelligence and Statistics, 2021, San Diego, United States. pp.1855-1863
Conference paper hal-03472116v1

From Optimality to Robustness: Dirichlet Sampling Strategies in Stochastic Bandits

Dorian Baudry , Patrick Saux , Odalric-Ambrym Maillard
NeurIPS 2021 - 35th International Conference on Neural Information Processing Systems, Dec 2021, Sydney, Australia
Conference paper hal-03421252v2

Learning Value Functions in Deep Policy Gradients using Residual Variance

Yannis Flet-Berliac , Reda Ouhamma , Odalric-Ambrym Maillard , Philippe Preux
ICLR 2021 - International Conference on Learning Representations, May 2021, Vienna / Virtual, Austria
Conference paper hal-02964174v3

Robust-Adaptive Interval Predictive Control for Linear Uncertain Systems

Edouard Leurent , Denis Efimov , Odalric-Ambrym Maillard
CDC 2020 - 59th IEEE Conference on Decision and Control, Dec 2020, Jeju Island / Virtual, South Korea
Conference paper hal-02942414v1

Robust-Adaptive Control of Linear Systems: beyond Quadratic Costs

Edouard Leurent , Denis Efimov , Odalric-Ambrym Maillard
NeurIPS 2020 - 34th Conference on Neural Information Processing Systems, Dec 2020, Vancouver / Virtual, Canada
Conference paper hal-03004060v1

Sub-sampling for Efficient Non-Parametric Bandit Exploration

Dorian Baudry , Emilie Kaufmann , Odalric-Ambrym Maillard
NeurIPS 2020, Dec 2020, Vancouver, Canada
Conference paper hal-02977552v1

Monte-Carlo Graph Search: the Value of Merging Similar States

Edouard Leurent , Odalric-Ambrym Maillard
ACML 2020 - 12th Asian Conference on Machine Learning, Nov 2020, Bangkok / Virtual, Thailand. pp.577 - 602
Conference paper hal-03004124v2

Tightening Exploration in Upper Confidence Reinforcement Learning

Hippolyte Bourel , Odalric-Ambrym Maillard , Mohammad Sadegh Talebi
International Conference on Machine Learning, Jul 2020, Vienna, Austria
Conference paper hal-03000664v1

Restarted Bayesian Online Change-point Detector achieves Optimal Detection Delay

Réda Alami , Odalric-Ambrym Maillard , Raphael Féraud
International Conference on Machine Learning, Jul 2020, Vienna, Austria
Conference paper hal-03021712v1

Model-Based Reinforcement Learning Exploiting State-Action Equivalence

Mahsa Asadi , Mohammad Sadegh Talebi , Hippolyte Bourel , Odalric-Ambrym Maillard
ACML 2019, Proceedings of Machine Learning Research, Nov 2019, Nagoya, Japan. pp.204 - 219
Conference paper hal-02378887v1

Learning Multiple Markov Chains via Adaptive Allocation

Mohammad Sadegh Talebi , Odalric-Ambrym Maillard
Advances in Neural Information Processing Systems 32 (NIPS 2019), Dec 2019, Vancouver, Canada
Conference paper hal-02387345v1

Budgeted Reinforcement Learning in Continuous State Space

Nicolas Carrara , Edouard Leurent , Romain Laroche , Tanguy Urvoy , Odalric-Ambrym Maillard
Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada
Conference paper hal-02375727v1

Sequential change-point detection: Laplace concentration of scan statistics and non-asymptotic delay bounds

Odalric-Ambrym Maillard
Algorithmic Learning Theory, 2019, Chicago, United States. pp.1 - 23
Conference paper hal-02351665v1

Practical Open-Loop Optimistic Planning

Edouard Leurent , Odalric-Ambrym Maillard
European Conference on Machine Learning, Sep 2019, Würzburg, Germany
Conference paper hal-02375697v1

Regret Bounds for Learning State Representations in Reinforcement Learning

Ronald Ortner , Matteo Pirotta , Ronan Fruit , Alessandro Lazaric , Odalric-Ambrym Maillard
Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada
Conference paper hal-02375715v1

Approximate Robust Control of Uncertain Dynamical Systems

Edouard Leurent , Yann Blanco , Denis Efimov , Odalric-Ambrym Maillard
Proc. MLITS Workshop at NeurIPS, Dec 2018, Montreal, Canada
Conference paper hal-01931744v2

Efficient tracking of a growing number of experts

Jaouad Mourtada , Odalric-Ambrym Maillard
Algorithmic Learning Theory, Oct 2017, Kyoto, Japan. pp.1 - 23
Conference paper hal-01615424v1

Boundary Crossing for General Exponential Families

Odalric-Ambrym Maillard
Algorithmic Learning Theory, Oct 2017, Kyoto, Japan. pp.1 - 34
Conference paper hal-01615427v1

Spectral Learning from a Single Trajectory under Finite-State Policies

Borja Balle , Odalric-Ambrym Maillard
International Conference on Machine Learning, Jul 2017, Sydney, Australia
Conference paper hal-01590940v1

Selecting Near-Optimal Approximate State Representations in Reinforcement Learning

Ronald Ortner , Odalric-Ambrym Maillard , Daniil Ryabko
International Conference on Algorithmic Learning Theory (ALT), Oct 2014, Bled, Slovenia. pp.140-154
Conference paper hal-01057562v1

Optimal Regret Bounds for Selecting the State Representation in Reinforcement Learning

Odalric-Ambrym Maillard , Phuong Nguyen , Ronald Ortner , Daniil Ryabko
ICML - 30th International Conference on Machine Learning, 2013, Atlanta, United States. pp.543-551
Conference paper hal-00778586v1

Competing with an Infinite Set of Models in Reinforcement Learning

Phuong Nguyen , Odalric-Ambrym Maillard , Daniil Ryabko , Ronald Ortner
AISTATS, 2013, Arizona, United States. pp.463-471
Conference paper hal-00823230v1

Selecting the State-Representation in Reinforcement Learning

Odalric-Ambrym Maillard , Rémi Munos , Daniil Ryabko
Neural Information Processing Systems, Dec 2011, Granada, Spain
Conference paper hal-00639483v1

A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences

Odalric-Ambrym Maillard , Rémi Munos , Gilles Stoltz
24th Annual Conference on Learning Theory : COLT'11, Jul 2011, Budapest, Hungary. pp.18
Conference paper inria-00574987v2

Finite-Sample Analysis of Bellman Residual Minimization

Odalric-Ambrym Maillard , Rémi Munos , Alessandro Lazaric , Mohammad Ghavamzadeh
Asian Conference on Machine Learning, 2010, Japan
Conference paper hal-00830212v1

Compressed Least-Squares Regression

Odalric-Ambrym Maillard , Rémi Munos
NIPS 2009, Dec 2009, Vancouver, Canada
Conference paper inria-00419210v2

Bandits Corrupted by Nature: Lower Bounds on Regret and Robust Optimistic Algorithm

Debabrota Basu , Odalric-Ambrym Maillard , Timothée Mathieu
2022
Preprint, working paper hal-03611816v1

Optimal Strategies for Graph-Structured Bandits

Hassan Saber , Pierre Ménard , Odalric-Ambrym Maillard
2020
Preprint, working paper hal-02891139v2

Forced-exploration free Strategies for Unimodal Bandits

Hassan Saber , Pierre Ménard , Odalric-Ambrym Maillard
2020
Preprint, working paper hal-02883907v1

Active Roll-outs in MDP with Irreversible Dynamics

Odalric-Ambrym Maillard , Timothy Mann , Ronald Ortner , Shie Mannor
2019
Preprint, working paper hal-02177808v1

Upper Confidence Reinforcement Learning exploiting state-action equivalence

Odalric-Ambrym Maillard , Mahsa Asadi
2018
Preprint, working paper hal-01945034v1

Random Shuffling and Resets for the Non-stationary Stochastic Bandit Problem

Robin Allesiardo , Raphaël Féraud , Odalric-Ambrym Maillard
2016
Preprint, working paper hal-01400320v1

Low-rank Bandits with Latent Mixtures

Aditya Gopalan , Odalric-Ambrym Maillard , Mohammadi Zaki
2016
Preprint, working paper hal-01400318v1

Hierarchical Optimistic Region Selection driven by Curiosity

Odalric-Ambrym Maillard
2012
Preprint, working paper hal-00740418v1

Online allocation and homogeneous partitioning for piecewise constant mean-approximation

Odalric-Ambrym Maillard , Alexandra Carpentier
2012
Preprint, working paper hal-00742893v1

APPRENTISSAGE SÉQUENTIEL : Bandits, Statistique et Renforcement.

Odalric-Ambrym Maillard
Machine Learning [cs.LG]. Université des Sciences et Technologie de Lille - Lille I, 2011. English.
Thesis tel-00845410v1

Mathematics of Statistical Sequential Decision Making

Odalric-Ambrym Maillard
Mathematics [math]. Université de Lille, Sciences et Technologies, 2019
Habilitation thesis (HDR) tel-02162189v3