Number of documents: 79

link to my webpage


http://people.isir.upmc.fr/sigaud
http://people.isir.upmc.fr/sigaud/en


Journal articles (17 documents)

  • Pierre Fournier, Cédric Colas, Mohamed Chetouani, Olivier Sigaud. CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments. IEEE Transactions on Cognitive and Developmental Systems, Institute of Electrical and Electronics Engineers, Inc, 2021, 13 (2), pp.239-248. ⟨10.1109/TCDS.2019.2933371⟩. ⟨hal-02370859⟩
  • Marwen Belkaid, Elise Bousseyrol, Romain Durand-de Cuttoli, Malou Dongelmans, Etienne Duranté, et al. Mice adaptively generate choice variability in a deterministic task. Communications Biology, Nature Publishing Group, 2020, 3, pp.34. ⟨10.1038/s42003-020-0759-x⟩. ⟨hal-02485779⟩
  • Olivier Sigaud, Freek Stulp. Policy search in continuous action domains: An overview. Neural Networks, Elsevier, 2019, 113, pp.28-40. ⟨10.1016/j.neunet.2019.01.011⟩. ⟨hal-02182466⟩
  • Stéphane Doncieux, David Filliat, Natalia Díaz-Rodríguez, Timothy Hospedales, Richard Duro, et al. Open-Ended Learning: A Conceptual Framework Based on Representational Redescription. Frontiers in Neurorobotics, Frontiers, 2018, 12, pp.59. ⟨10.3389/fnbot.2018.00059⟩. ⟨hal-01889947⟩
  • Luka Peternel, Olivier Sigaud, Jan Babič. Unifying Speed-Accuracy Trade-Off and Cost-Benefit Trade-Off in Human Reaching Movements. Frontiers in Human Neuroscience, Frontiers, 2017, 11, pp.615. ⟨10.3389/fnhum.2017.00615⟩. ⟨hal-01679624⟩
  • Florian Lesaint, Olivier Sigaud, Jeremy Clark, Shelly Flagel, Mehdi Khamassi. Experimental predictions drawn from a computational model of sign-trackers and goal-trackers. Journal of Physiology - Paris, Elsevier, 2015, 109 (1-3), pp.78-86. ⟨10.1016/j.jphysparis.2014.06.001⟩. ⟨hal-01219979⟩
  • Freek Stulp, Olivier Sigaud. Many regression algorithms, one unified model — A review. Neural Networks, Elsevier, 2015, 69, pp.60-79. ⟨10.1016/j.neunet.2015.05.005⟩. ⟨hal-01162281v2⟩
  • Florian Lesaint, Olivier Sigaud, Shelly Flagel, Terry Robinson, Mehdi Khamassi. Modelling Individual Differences in the Form of Pavlovian Conditioned Approach Responses: A Dual Learning Systems Approach with Factored Representation. PLoS Computational Biology, Public Library of Science, 2014, 10 (2), pp.e1003466. ⟨10.1371/journal.pcbi.1003466⟩. ⟨hal-00947727⟩
  • Serena Ivaldi, Salvatore Anzalone, Woody Rousseau, Olivier Sigaud, Mohamed Chetouani. Robot initiative in a team learning task increases the rhythm of interaction but not the perceived engagement. Frontiers in Neurorobotics, Frontiers, 2014, 8, ⟨10.3389/fnbot.2014.00005⟩. ⟨hal-02423102⟩
  • Didier Marin, Olivier Sigaud. A machine learning approach to reaching tasks. Computer Methods in Biomechanics and Biomedical Engineering, Taylor & Francis, 2012, 15 (sup1), pp.151-152. ⟨10.1080/10255842.2012.713684⟩. ⟨hal-00743364⟩
  • Olivier Sigaud. Les systèmes de classeurs : un état de l'art. Revue des Sciences et Technologies de l'Information - Série RIA : Revue d'Intelligence Artificielle, Lavoisier, 2007, 21 (1), pp.75-106. ⟨10.3166/ria.21.75-106⟩. ⟨hal-01185706⟩
  • Fabien Flacher, Olivier Sigaud. GACS : une approche ascendante pour la coordination spatiale. Revue des Sciences et Technologies de l'Information - Série RIA : Revue d'Intelligence Artificielle, Lavoisier, 2006, 20 (1), pp.7-30. ⟨hal-01185691⟩
  • Thomas Degris, Olivier Sigaud, Sidney Wiener, Angelo Arleo. Rapid response of head direction cells to reorienting visual cues: A computational model. Neurocomputing, Elsevier, 2004, 58-60, pp.675-682. ⟨10.1016/j.neucom.2004.01.113⟩. ⟨hal-01185700⟩
  • Pierre Gérard, Olivier Sigaud. Apprentissage par renforcement indirect dans les systèmes de classeurs. JEDAI - Journal électronique d'intelligence artificielle, AFIA, 2004, 4, pp.19. ⟨hal-01185705⟩
  • Fabien Flacher, Olivier Sigaud. Coordination spatiale émergente par champs de potentiel. Revue des Sciences et Technologies de l'Information - Série TSI : Technique et Science Informatiques, Lavoisier, 2003, 22 (2), pp.171-195. ⟨10.3166/tsi.22.171-195⟩. ⟨hal-01184953⟩
  • Pierre Gérard, Wolfgang Stolzmann, Olivier Sigaud. YACS : Yet a new Learning Classifier System using Anticipation. Soft Computing, Springer Verlag, 2002, 6 (3-4), pp.216-228. ⟨10.1007/s005000100117⟩. ⟨hal-01184970⟩
  • Pierre Gérard, Olivier Sigaud. Généralisation et apprentissage latent dans les systèmes de classeurs. Extraction de connaissances et apprentissage, 2001, 1 (3), pp.87-114. ⟨hal-01184979⟩

Conference papers (44 documents)

  • Ahmed Akakzia, Cédric Colas, Pierre-Yves Oudeyer, Mohamed Chetouani, Olivier Sigaud. Grounding Language to Autonomously-Acquired Skills via Goal Generation. ICLR 2021 - Ninth International Conference on Learning Representation, May 2021, Vienna / Virtual, Austria. ⟨hal-03121146⟩
  • Guillaume Matheron, Nicolas Perrin, Olivier Sigaud. Understanding Failures of Deterministic Actor-Critic with Continuous Action Spaces and Sparse Rewards. Artificial Neural Networks and Machine Learning – ICANN 2020, Sep 2020, Bratislava, Slovakia. pp.308-320, ⟨10.1007/978-3-030-61616-8_25⟩. ⟨hal-03080925⟩
  • Guillaume Matheron, Nicolas Perrin, Olivier Sigaud. PBCS: Efficient Exploration and Exploitation Using a Synergy Between Reinforcement Learning and Motion Planning. Artificial Neural Networks and Machine Learning – ICANN 2020, Sep 2020, Bratislava, Slovakia. pp.295-307, ⟨10.1007/978-3-030-61616-8_24⟩. ⟨hal-03080918⟩
  • Cédric Colas, Pierre Fournier, Olivier Sigaud, Mohamed Chetouani, Pierre-Yves Oudeyer. CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning. ICML 2019 - Thirty-sixth International Conference on Machine Learning, Jun 2019, Long Beach, United States. ⟨hal-01934921v2⟩
  • Thomas Pierrot, Guillaume Ligner, Scott Reed, Olivier Sigaud, Nicolas Perrin, et al. Learning Compositional Neural Programs with Recursive Tree Search and Planning. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), Dec 2019, Vancouver, Canada. ⟨hal-03080949⟩
  • Cédric Colas, Olivier Sigaud, Pierre-Yves Oudeyer. GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms. International Conference on Machine Learning (ICML), Jul 2018, Stockholm, Sweden. ⟨hal-01890151⟩
  • Alexandre Péré, Sébastien Forestier, Olivier Sigaud, Pierre-Yves Oudeyer. Unsupervised Learning of Goal Spaces for Intrinsically Motivated Goal Exploration. ICLR2018 - 6th International Conference on Learning Representations, Apr 2018, Vancouver, Canada. ⟨hal-01891758⟩
  • Alexis Ducarouge, Olivier Sigaud. The Successor Representation as a model of behavioural flexibility. Journées Francophones sur la Planification, la Décision et l'Apprentissage pour la conduite de systèmes (JFPDA 2017), Jul 2017, Caen, France. ⟨hal-01576352⟩
  • Pierre Fournier, Olivier Sigaud, Mohamed Chetouani. Combining artificial curiosity and tutor guidance for environment exploration. Workshop on Behavior Adaptation, Interaction and Learning for Assistive Robotics at IEEE RO-MAN 2017, Aug 2017, Lisbon, Portugal. ⟨hal-01581363⟩
  • Anis Najar, Olivier Sigaud, Mohamed Chetouani. Social-Task Learning for HRI. International Conference on Social Robotics, Oct 2015, Paris, France. pp.472-481, ⟨10.1007/978-3-319-25554-5_47⟩. ⟨hal-02422990⟩
  • Anis Najar, Olivier Sigaud, Mohamed Chetouani. Socially Guided XCS. GECCO Companion '15 Proceedings of the Companion Publication of the 2015 Annual Conference on Genetic and Evolutionary Computation, Jul 2015, Madrid, Spain. pp.1021-1028, ⟨10.1145/2739482.2768452⟩. ⟨hal-02423004⟩
  • Thibaut Munzer, Freek Stulp, Olivier Sigaud. Non-linear regression algorithms for motor skill acquisition: a comparison. 9èmes Journées Francophones de Planification, Décision et Apprentissage, May 2014, Liège, Belgium. ⟨hal-01090848⟩
  • Alain Droniou, Serena Ivaldi, Olivier Sigaud. Learning a Repertoire of Actions with Deep Neural Networks. Joint International Conference on Development and Learning and on Epigenetic Robotics (ICDL-EpiRob), Oct 2014, Italy. 6 p. ⟨hal-01065741⟩
  • Didier Marin, Olivier Sigaud. Towards fast and adaptive optimal control policies for robots: A direct policy search approach. Robotica 2012, 2012, Guimaraes, Portugal. pp.21-26. ⟨hal-00703755⟩
  • Alain Droniou, Serena Ivaldi, Olivier Sigaud. Comparaison expérimentale d'algorithmes de régression pour l'apprentissage de modèles cinématiques du robot humanoïde iCub. Conférence Francophone sur l'Apprentissage Automatique - CAp 2012, Laurent Bougrain, May 2012, Nancy, France. 16 p. ⟨hal-00745471⟩
  • Alain Droniou, Serena Ivaldi, Patrick Stalph, Martin Butz, Olivier Sigaud. Learning Velocity Kinematics: Experimental Comparison of On-line Regression Algorithms. Robotica, Apr 2012, Guimaraes, Portugal. pp.15-20. ⟨hal-00719975⟩
  • Freek Stulp, Olivier Sigaud. Adaptation de la matrice de covariance pour l'apprentissage par renforcement direct. Journées Francophones sur la planification, la décision et l'apprentissage pour le contrôle des systèmes - JFPDA 2012, May 2012, Villers-lès-Nancy, France. 12 p. ⟨hal-00736310⟩
  • Alain Droniou, Serena Ivaldi, Olivier Sigaud. Comparaison expérimentale d'algorithmes de régression pour l'apprentissage de modèles cinématiques du robot humanoïde iCub. Conférence Francophone sur l'Apprentissage Automatique, May 2012, Nancy, France. pp.95-110. ⟨hal-00719977⟩
  • Alain Droniou, Serena Ivaldi, Vincent Padois, Olivier Sigaud. Autonomous Online Learning of Velocity Kinematics on the iCub: a Comparative Study. IEEE/RSJ International Conference on Intelligent Robots and Systems, Oct 2012, Vilamoura, Portugal. To appear. ⟨hal-00719964⟩
  • Didier Marin, Olivier Sigaud. Reaching optimally over the workspace: a machine learning approach. The Fourth IEEE RAS/EMBS International Conference on Biomedical Robotics and Biomechatronics, Jun 2012, Roma, Italy. pp.1128-1133, ⟨10.1109/BioRob.2012.6290743⟩. ⟨hal-00743371⟩
  • Serena Ivaldi, Damien Gerardeaux-Viret, Alain Droniou, Salvatore Anzalone, Olivier Sigaud, et al. Social Coordination Assessment: Distinguishing between Shape and Timing. Multimodal Pattern Recognition of Social Signals in Human-Computer-Interaction, Nov 2012, Tsukuba, Japan. pp.9-18, ⟨10.1007/978-3-642-37081-6_2⟩. ⟨hal-02423165⟩
  • Salvatore Anzalone, Serena Ivaldi, Olivier Sigaud, Mohamed Chetouani. Multimodal People Engagement with iCub. Biologically Inspired Cognitive Architectures, Oct 2012, Palermo, Italy. pp.59-64, ⟨10.1007/978-3-642-34274-5_16⟩. ⟨hal-02423157⟩
  • Didier Marin, Jérémie Decock, Lionel Rigoux, Olivier Sigaud. Learning Cost-Efficient Control Policies with XCSF: Generalization Capabilities and Further Improvement. GECCO 2011, 2011, Dublin, Ireland. pp.1235-1242. ⟨hal-00703760⟩
  • Didier Marin, Olivier Sigaud. Apprentissage par renforcement appliqué au contrôle moteur : reproduction du principe d'isochronie. JFPDA 2010, 2010, Besançon, France. pp.1-11. ⟨hal-00703784⟩
  • Lionel Rigoux, Olivier Sigaud. Un modèle computationnel de l'automatisation motrice. Deuxième conférence française de Neurosciences Computationnelles, "Neurocomp08", Oct 2008, Marseille, France. ⟨hal-00331620⟩
  • Thomas Degris, Olivier Sigaud, Pierre-Henri Wuillemin. Exploiting Additive Structure in Factored MDPs for Reinforcement Learning. European Workshop on Reinforcement Learning, Jun 2008, Villeneuve d’Ascq, France. pp.15-26, ⟨10.1007/978-3-540-89722-4_2⟩. ⟨hal-01302178⟩
  • Thomas Degris, Olivier Sigaud, Pierre-Henri Wuillemin. Apprentissage par renforcement exploitant la structure additive des MDP factorisés. JFPDA 2007 - 2e Journées Francophones Planification, Décision, Apprentissage pour la conduite de système, Jul 2007, Grenoble, France. pp.49-60. ⟨hal-01305984⟩
  • Thomas Degris, Olivier Sigaud, Pierre-Henri Wuillemin. Apprentissage de la structure des processus de décision markoviens factorisés pour l'apprentissage par renforcement. JFPDA 2006 - 1ères Journées Francophones sur la Planification, Décision, Apprentissage pour la conduite de systèmes, May 2006, Toulouse, France. pp.89-96. ⟨hal-01336934⟩
  • Thomas Degris, Olivier Sigaud, Pierre-Henri Wuillemin. Learning the Structure of Factored Markov Decision Processes in Reinforcement Learning Problems. The 23rd International Conference on Machine Learning, Jun 2006, Pittsburgh, Pennsylvania, United States. pp.257-264, ⟨10.1145/1143844.1143877⟩. ⟨hal-01336925⟩
  • Thomas Degris, Olivier Sigaud, Pierre-Henri Wuillemin. Chi-square Tests Driven Method for Learning the Structure of Factored MDPs. The 22nd conference on Uncertainty in Artificial Intelligence, Jul 2006, Massachusetts Institute of Technology, Cambridge, MA, United States. pp.122-129. ⟨hal-01351133⟩
  • Fabien Flacher, Olivier Sigaud. GACS, an Evolutionary Approach to the Spatial Coordination of Agents. AAMAS 2005 - 4th international joint conference on Autonomous agents and multiagent systems, Jul 2005, Utrecht, Netherlands. pp.1109-1110, ⟨10.1145/1082473.1082647⟩. ⟨hal-01416706⟩
  • Thierry Gourdin, Olivier Sigaud. Towards a Reinforcement Learning Module for Navigation in Video Games. ECML 2005 Workshop on Reinforcement Learning in Non-Stationary Environments, Oct 2005, Porto, Portugal. pp.1-12. ⟨hal-01491687⟩
  • Zahia Guessoum, Lilia Rejeb, Olivier Sigaud. Using XCS to build adaptive agents. 4th Symposium on Adaptive Agents and Multi-Agent Systems (AAMAS-4), AISB convention, Mar 2004, Leeds, United Kingdom. pp.101-106. ⟨hal-01496303⟩
  • Vincent Labbé, Olivier Sigaud, Philippe Codognet. Anticipation of Periodic Movements in Real Time 3D Environments. ABiALS 2004 - 2nd Workshop of Anticipatory Behavior in Adaptive Learning Systems, Jul 2004, Los Angeles, CA, United States. pp.51-61. ⟨hal-01498522⟩
  • Thierry Gourdin, Olivier Sigaud, Pierre-Henri Wuillemin. Improving MACS thanks to a comparison with 2TBNs. GECCO 2004 - Genetic and Evolutionary Computation Conference, Jun 2004, Seattle, WA, United States. pp.810-823, ⟨10.1007/978-3-540-24855-2_95⟩. ⟨hal-01501406⟩
  • Fabien Flacher, Olivier Sigaud. BASC, a Bottom-up Approach to automated design of Spatial Coordination. SAB 2004 - 8th International Conference on Simulation of Adaptive Behavior, Jul 2004, Los Angeles, CA, United States. pp.435-444. ⟨hal-01498529⟩
  • Samuel Landau, Olivier Sigaud. A Michigan style architecture for learning finite state controllers: a first step. 7th International Workshop on Learning Classifier Systems, Jun 2004, Seattle, WA, United States. ⟨hal-01498518⟩
  • Pierre Gérard, Olivier Sigaud. Designing Efficient Exploration with MACS: Modules and Function Approximation. Genetic and Evolutionary Computation Conference 2003 (GECCO03), Jul 2003, Chicago, IL, United States. pp.1882-1893, ⟨10.1007/3-540-45110-2_85⟩. ⟨hal-01532220⟩
  • Olivier Sigaud, Pierre Gérard. Apprentissage par renforcement indirect dans les systèmes de classeurs. Troisièmes Journées Nationales sur Processus Décisionnel de Markov et Intelligence Artificielle - PDMIA’03, 2003, Caen, France. ⟨hal-01532006⟩
  • Fabien Flacher, Olivier Sigaud. Spatial Coordination through Social Potential Fields and Genetic Algorithms. 7th International Conference on Simulation of Adaptive Behavior, 2002, Edinburgh, United Kingdom. pp.389-390. ⟨hal-01548174⟩
  • Pierre Gérard, Olivier Sigaud. Adding a Generalization Mechanism to YACS. GECCO 2001 - 3rd Annual Conference on Genetic and Evolutionary Computation, Jul 2001, San Francisco, CA, United States. pp.951-957. ⟨hal-01571789⟩
  • Olivier Sigaud, Pierre Gérard. The use of roles in a multiagent adaptive simulation. 14th European Conference in Artificial Intelligence, Workshop on Balancing reactivity and Social Deliberation in Multiagent Systems, Aug 2000, Berlin, Germany. ⟨hal-01573199⟩
  • Pierre Gérard, Olivier Sigaud. YACS : Combining Anticipation and Dynamic Programming in CLassifier Systems. IWLCS 2000 - 3rd International Workshop on Learning Classifier Systems, Sep 2000, Paris, France. pp.52-69, ⟨10.1007/3-540-44640-0_5⟩. ⟨hal-01571787⟩
  • Olivier Sigaud, Pierre Gérard. Using Classifier Systems as Adaptive Expert Systems for Control. IWLCS 2000 - 3rd International Workshop on Learning Classifier Systems, Sep 2000, Paris, France. pp.138-157, ⟨10.1007/3-540-44640-0_10⟩. ⟨hal-01571784⟩

Books (1 document)

  • Martin V. Butz, Olivier Sigaud, Pierre Gérard. Anticipatory Behavior in Adaptive Learning Systems. Springer, 2684, 2003, Lecture Notes in Computer Science, ⟨10.1007/b11711⟩. ⟨hal-01532241⟩

Book sections (6 documents)

  • Isabelle Bloch, Régis Clouard, Marinette Revenu, Olivier Sigaud. Intelligence artificielle et reconnaissance des formes, vision, apprentissage pour la robotique. L'Intelligence Artificielle : Frontières et Applications, Volume 3, série : Panorama de l'Intelligence Artificielle, Editions CEPADUES, 30 pp, 2014, 9782364930438. ⟨hal-00995039⟩
  • Martin Butz, Olivier Sigaud, Giovanni Pezzulo, Gianluca Baldassarre. Anticipations, Brains, Individual and Social Behavior: An Introduction to Anticipatory Systems. Anticipatory Behavior in Adaptive Learning Systems: From Brains to Individual and Social Behavior, 4520, Springer-Verlag, pp.1-18, 2007, Lecture Notes in Computer Science, 978-3-540-74261-6. ⟨10.1007/978-3-540-74262-3_1⟩. ⟨hal-01311570⟩
  • Martin V. Butz, Olivier Sigaud, Pierre Gérard. Internal Models and Anticipations in Adaptive Learning Systems. Anticipatory Behavior in Adaptive Learning Systems, 2684, Springer, pp.86-109, 2003, Lecture Notes in Computer Science, ⟨10.1007/978-3-540-45002-3_6⟩. ⟨hal-01532236⟩
  • Martin V. Butz, Olivier Sigaud, Pierre Gérard. Anticipatory Behavior: Exploiting Knowledge about the Future to Improve Current Behavior. Anticipatory Behavior in Adaptive Learning Systems, 2684, Springer, pp.1-10, 2003, Lecture Notes in Computer Science, ⟨10.1007/978-3-540-45002-3_1⟩. ⟨hal-01532240⟩
  • Olivier Sigaud, Fabien Flacher. Vers une approche dynamique de la sélection de l'action. Approche dynamique de la cognition artificielle, Hermès, pp.163-178, 2002. ⟨hal-01548147⟩
  • Olivier Sigaud, Pierre Gérard. Being Reactive by Exchanging Roles: an Empirical Study. Balancing reactivity and Social Deliberation in Multiagent Systems, 2103, Springer-Verlag, pp.150-172, 2000, Lecture Notes in Computer Science, ⟨10.1007/3-540-44568-4_10⟩. ⟨hal-01571768⟩

Directions of work or proceedings (2 documents)

  • Olivier Sigaud, Olivier Buffet. Markov Decision Processes in Artificial Intelligence. Olivier Sigaud and Olivier Buffet. ISTE Ltd and John Wiley & Sons Inc, pp.455, 2010, 978-1-84821-167-4. ⟨inria-00432735⟩
  • Martin Butz, Olivier Sigaud, Giovanni Pezzulo, Gianluca Baldassarre. Anticipatory Behavior in Adaptive Learning Systems: From Brains to Individual and Social Behavior. 4520, Springer-Verlag, 2007, Lecture Notes in Computer Science, ⟨10.1007/978-3-540-74262-3⟩. ⟨hal-01311568⟩

Preprints, Working Papers, ... (9 documents)

  • Cédric Colas, Tristan Karch, Olivier Sigaud, Pierre-Yves Oudeyer. Intrinsically Motivated Goal-Conditioned Reinforcement Learning: a Short Survey. 2021. ⟨hal-03099891⟩
  • Alexandre Chenu, Nicolas Perrin, Stephane Doncieux, Olivier Sigaud. Selection-Expansion: A Unifying Framework for Motion-Planning and Diversity Search Algorithms. 2021. ⟨hal-03196479⟩
  • Cédric Colas, Ahmed Akakzia, Pierre-Yves Oudeyer, Mohamed Chetouani, Olivier Sigaud. Language-Conditioned Goal Generation: a New Approach to Language Grounding in RL. 2021. ⟨hal-03099887⟩
  • Geoffrey Cideron, Thomas Pierrot, Nicolas Perrin, Karim Beguir, Olivier Sigaud. QD-RL: Efficient Mixing of Quality and Diversity in Reinforcement Learning. 2020. ⟨hal-03083159⟩
  • Thomas Pierrot, Nicolas Perrin, Feryal Behbahani, Alexandre Laterre, Olivier Sigaud, et al. Learning Compositional Neural Programs for Continuous Control. 2020. ⟨hal-03083161⟩
  • Aloïs Pourchot, Nicolas Perrin, Olivier Sigaud. Importance mixing: Improving sample reuse in evolutionary policy search methods. 2019. ⟨hal-02397754⟩
  • Thomas Pierrot, Nicolas Perrin, Olivier Sigaud. First-order and second-order variants of the gradient descent in a unified framework. 2019. ⟨hal-02397757⟩
  • Cédric Colas, Olivier Sigaud, Pierre-Yves Oudeyer. How Many Random Seeds? Statistical Power Analysis in Deep Reinforcement Learning Experiments. 2018. ⟨hal-01890154⟩
  • Freek Stulp, Olivier Sigaud. Policy Improvement Methods: Between Black-Box Optimization and Episodic Reinforcement Learning. 2012. ⟨hal-00738463⟩