Number of documents: 48

Journal articles (8 documents)

  • Audrey Durand, Odalric-Ambrym Maillard, Joelle Pineau. Streaming kernel regression with provably adaptive mean, variance, and regularization. Journal of Machine Learning Research, Microtome Publishing, 2018, 1, pp.1 - 48. ⟨hal-01927007⟩
  • Odalric-Ambrym Maillard. Boundary Crossing Probabilities for General Exponential Families. Mathematical Methods of Statistics, Allerton Press, Springer (link), 2018, 27. ⟨hal-01737150⟩
  • Mohammad Talebi, Odalric-Ambrym Maillard. Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs. Journal of Machine Learning Research, Microtome Publishing, In press, pp.1-36. ⟨hal-01737142⟩
  • Robin Allesiardo, Raphaël Féraud, Odalric-Ambrym Maillard. The Non-stationary Stochastic Multi-armed Bandit Problem. International Journal of Data Science and Analytics, Springer Verlag, 2017, 3 (4), pp.267-283. ⟨10.1007/s41060-017-0050-5⟩. ⟨hal-01575000⟩
  • Rémi Bardenet, Odalric-Ambrym Maillard. Concentration inequalities for sampling without replacement. Bernoulli, Bernoulli Society for Mathematical Statistics and Probability, 2015, 21 (3), pp.1361-1385. ⟨10.3150/14-BEJ605⟩. ⟨hal-01216652⟩
  • Odalric-Ambrym Maillard, Shie Mannor. Latent Bandits. JFPDA, 2014, pp.05. ⟨hal-00990804⟩
  • Akram Baransi, Odalric-Ambrym Maillard, Shie Mannor. Sub-sampling for Multi-armed Bandits. Proceedings of the European Conference on Machine Learning, 2014, pp.13. ⟨hal-01025651v2⟩
  • Olivier Cappé, Aurélien Garivier, Odalric-Ambrym Maillard, Rémi Munos, Gilles Stoltz. Kullback-Leibler Upper Confidence Bounds for Optimal Sequential Allocation. Annals of Statistics, Institute of Mathematical Statistics, 2013, 41 (3), pp.1516-1541. ⟨hal-00738209v2⟩

Conference papers (21 documents)

  • Edouard Leurent, Denis Efimov, Odalric-Ambrym Maillard. Robust-Adaptive Interval Predictive Control for Linear Uncertain Systems. 59th IEEE CDC 2020 - Conference on Decision and Control, Dec 2020, Jeju Island / Virtual, South Korea. ⟨hal-02942414⟩
  • Edouard Leurent, Denis Efimov, Odalric-Ambrym Maillard. Robust-Adaptive Control of Linear Systems: beyond Quadratic Costs. NeurIPS 2020 - 34th Conference on Neural Information Processing Systems, Dec 2020, Vancouver / Virtual, Canada. ⟨hal-03004060⟩
  • Odalric-Ambrym Maillard, Hippolyte Bourel, Mohammad Talebi. Tightening Exploration in Upper Confidence Reinforcement Learning. International Conference on Machine Learning, Jul 2020, Vienna, Austria. ⟨hal-03000664⟩
  • Edouard Leurent, Odalric-Ambrym Maillard. Monte-Carlo Graph Search: the Value of Merging Similar States. ACML 2020 - 12th Asian Conference on Machine Learning, Nov 2020, Bangkok / Virtual, Thailand. pp.577 - 602. ⟨hal-03004124⟩
  • Dorian Baudry, Emilie Kaufmann, Odalric-Ambrym Maillard. Sub-sampling for Efficient Non-Parametric Bandit Exploration. NeurIPS 2020, Dec 2020, Vancouver, Canada. ⟨hal-02977552⟩
  • Nicolas Carrara, Edouard Leurent, Romain Laroche, Tanguy Urvoy, Odalric-Ambrym Maillard, et al. Budgeted Reinforcement Learning in Continuous State Space. Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada. ⟨hal-02375727⟩
  • Ronald Ortner, Matteo Pirotta, Ronan Fruit, Alessandro Lazaric, Odalric-Ambrym Maillard. Regret Bounds for Learning State Representations in Reinforcement Learning. Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada. ⟨hal-02375715⟩
  • Edouard Leurent, Odalric-Ambrym Maillard. Practical Open-Loop Optimistic Planning. European Conference on Machine Learning, Sep 2019, Würzburg, Germany. ⟨hal-02375697⟩
  • Odalric-Ambrym Maillard. Sequential change-point detection: Laplace concentration of scan statistics and non-asymptotic delay bounds. Algorithmic Learning Theory, 2019, Chicago, United States. pp.1 - 23. ⟨hal-02351665⟩
  • Mahsa Asadi, Mohammad Talebi, Hippolyte Bourel, Odalric-Ambrym Maillard. Model-Based Reinforcement Learning Exploiting State-Action Equivalence. ACML 2019, Proceedings of Machine Learning Research, Nov 2019, Nagoya, Japan. pp.204 - 219. ⟨hal-02378887⟩
  • Mohammad Talebi, Odalric-Ambrym Maillard. Learning Multiple Markov Chains via Adaptive Allocation. Advances in Neural Information Processing Systems 32 (NIPS 2019), Dec 2019, Vancouver, Canada. ⟨hal-02387345⟩
  • Jaouad Mourtada, Odalric-Ambrym Maillard. Efficient tracking of a growing number of experts. Algorithmic Learning Theory, Oct 2017, Kyoto, Japan. pp.1 - 23. ⟨hal-01615424⟩
  • Odalric-Ambrym Maillard. Boundary Crossing for General Exponential Families. Algorithmic Learning Theory, Oct 2017, Kyoto, Japan. pp.1 - 34. ⟨hal-01615427⟩
  • Borja Balle, Odalric-Ambrym Maillard. Spectral Learning from a Single Trajectory under Finite-State Policies. International Conference on Machine Learning, Jul 2017, Sydney, Australia. ⟨hal-01590940⟩
  • Ronald Ortner, Odalric-Ambrym Maillard, Daniil Ryabko. Selecting Near-Optimal Approximate State Representations in Reinforcement Learning. International Conference on Algorithmic Learning Theory (ALT), Oct 2014, Bled, Slovenia. pp.140-154. ⟨hal-01057562⟩
  • Odalric-Ambrym Maillard, Phuong Nguyen, Ronald Ortner, Daniil Ryabko. Optimal Regret Bounds for Selecting the State Representation in Reinforcement Learning. ICML - 30th International Conference on Machine Learning, 2013, Atlanta, United States. pp.543-551. ⟨hal-00778586⟩
  • Phuong Nguyen, Odalric-Ambrym Maillard, Daniil Ryabko, Ronald Ortner. Competing with an Infinite Set of Models in Reinforcement Learning. AISTATS, 2013, Arizona, United States. pp.463-471. ⟨hal-00823230⟩
  • Odalric-Ambrym Maillard, Rémi Munos, Gilles Stoltz. A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences. 24th Annual Conference on Learning Theory : COLT'11, Jul 2011, Budapest, Hungary. pp.18. ⟨inria-00574987v2⟩
  • Odalric-Ambrym Maillard, Rémi Munos, Daniil Ryabko. Selecting the State-Representation in Reinforcement Learning. Neural Information Processing Systems, Dec 2011, Granada, Spain. ⟨hal-00639483⟩
  • Odalric-Ambrym Maillard, Rémi Munos, Alessandro Lazaric, Mohammad Ghavamzadeh. Finite-Sample Analysis of Bellman Residual Minimization. Asian Conference on Machine Learning, 2010, Japan. ⟨hal-00830212⟩
  • Odalric-Ambrym Maillard, Rémi Munos. Compressed Least-Squares Regression. NIPS 2009, Dec 2009, Vancouver, Canada. ⟨inria-00419210v2⟩

Poster communications (1 document)

  • Réda Alami, Odalric-Ambrym Maillard, Raphaël Féraud. Memory Bandits: Towards the Switching Bandit Problem Best Resolution. MLSS 2018 - Machine Learning Summer School, Aug 2018, Madrid, Spain. ⟨hal-01879251⟩

Other publications (2 documents)

Preprints, Working Papers, ... (9 documents)

  • Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, Philippe Preux. Is Standard Deviation the New Standard? Revisiting the Critic in Deep Policy Gradients. 2020. ⟨hal-02964174⟩
  • Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard. Optimal Strategies for Graph-Structured Bandits. 2020. ⟨hal-02891139v2⟩
  • Edouard Leurent, Yann Blanco, Denis Efimov, Odalric-Ambrym Maillard. Approximate Robust Control of Uncertain Dynamical Systems. 2019. ⟨hal-01931744v2⟩
  • Odalric-Ambrym Maillard, Timothy Mann, Ronald Ortner, Shie Mannor. Active Roll-outs in MDP with Irreversible Dynamics. 2019. ⟨hal-02177808⟩
  • Odalric-Ambrym Maillard, Mahsa Asadi. Upper Confidence Reinforcement Learning exploiting state-action equivalence. 2018. ⟨hal-01945034⟩
  • Robin Allesiardo, Raphaël Féraud, Odalric-Ambrym Maillard. Random Shuffling and Resets for the Non-stationary Stochastic Bandit Problem. 2016. ⟨hal-01400320⟩
  • Aditya Gopalan, Odalric-Ambrym Maillard, Mohammadi Zaki. Low-rank Bandits with Latent Mixtures. 2016. ⟨hal-01400318⟩
  • Odalric-Ambrym Maillard. Hierarchical Optimistic Region Selection driven by Curiosity. 2012. ⟨hal-00740418⟩
  • Odalric-Ambrym Maillard, Alexandra Carpentier. Online allocation and homogeneous partitioning for piecewise constant mean-approximation. 2012. ⟨hal-00742893⟩

Reports (3 documents)

  • Odalric-Ambrym Maillard, Rémi Munos. Adaptive Bandits: Towards the best history-dependent strategy. [Technical Report] 2011, pp.14. ⟨inria-00574999⟩
  • Odalric-Ambrym Maillard, Rémi Munos. Brownian Motions and Scrambled Wavelets for Least-Squares Regression. [Technical Report] 2010, pp.13. ⟨inria-00483017⟩
  • Odalric-Ambrym Maillard, Rémi Munos. Linear regression with random projections. [Technical Report] 2010, pp.22. ⟨inria-00483014v2⟩

Theses (1 document)

  • Odalric-Ambrym Maillard. APPRENTISSAGE SÉQUENTIEL : Bandits, Statistique et Renforcement. Machine Learning [cs.LG]. Université des Sciences et Technologie de Lille - Lille I, 2011. English. ⟨tel-00845410⟩

Habilitation à diriger des recherches (2 documents)

  • Odalric-Ambrym Maillard. Mathematics of Statistical Sequential Decision Making. Statistics [math.ST]. Université de Lille Nord de France, 2019. ⟨tel-02077035⟩
  • Odalric-Ambrym Maillard. Mathematics of Statistical Sequential Decision Making. Mathematics [math]. Université de Lille, Sciences et Technologies, 2019. ⟨tel-02162189⟩

Lectures (1 document)

  • Odalric-Ambrym Maillard. Basic Concentration Properties of Real-Valued Distributions. Doctoral. France. 2017. ⟨cel-01632228⟩