Number of documents

83

Matthieu Geist


Olivier Pietquin   

Journal articles (8 documents)

  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Bridging the Gap Between Imitation Learning and Inverse Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems, IEEE, 2017, 28 (8), pp.1814-1826. ⟨10.1109/TNNLS.2016.2543000⟩. ⟨hal-01629654⟩
  • Matthieu Geist, Olivier Pietquin. An algorithmic Survey of Parametric Value Function Approximation. IEEE Transactions on Neural Networks and Learning Systems, IEEE, 2013, 24 (6), pp.845-867. ⟨10.1109/TNNLS.2013.2247418⟩. ⟨hal-00869725⟩
  • Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. Classification structurée pour l'apprentissage par renforcement inverse. Revue des Sciences et Technologies de l'Information - Série RIA : Revue d'Intelligence Artificielle, Lavoisier, 2013, 27 (2), pp.155-169. ⟨10.3166/ria.27.155-169⟩. ⟨hal-00869723⟩
  • Lucie Daubigney, Matthieu Geist, Senthilkumar Chandramohan, Olivier Pietquin. A Comprehensive Reinforcement Learning Framework for Dialogue Management Optimisation. IEEE Journal of Selected Topics in Signal Processing, 2012, 6 (8), pp.891-902. ⟨10.1109/JSTSP.2012.2229257⟩. ⟨hal-00771646⟩
  • Olivier Pietquin, Matthieu Geist, Senthilkumar Chandramohan, Hervé Frezza-Buet. Sample-Efficient Batch Reinforcement Learning for Dialogue Management Optimization. ACM Transactions on Speech and Language Processing, Association for Computing Machinery, 2011, 7 (3), art. 7, pp.1-21. ⟨10.1145/1966407.1966412⟩. ⟨hal-00617517⟩
  • Matthieu Geist, Olivier Pietquin. Kalman Temporal Differences. Journal of Artificial Intelligence Research, Association for the Advancement of Artificial Intelligence, 2010, 39, pp.483-532. ⟨hal-00858687⟩
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Différences temporelles de Kalman: Cas déterministe. Revue des Sciences et Technologies de l'Information - Série RIA : Revue d'Intelligence Artificielle, Lavoisier, 2010, 24 (4), pp.423-443. ⟨10.3166/ria.24.423-443⟩. ⟨hal-00512093⟩
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. From Supervised to Reinforcement Learning: a Kernel-based Bayesian Filtering Framework. International Journal On Advances in Software, IARIA, 2009, 2 (1), pp.101-116. ⟨hal-00429891⟩

Conference papers (71 documents)

  • Matthieu Geist, Bilal Piot, Olivier Pietquin. Is the Bellman residual a bad proxy? NIPS 2017 - Advances in Neural Information Processing Systems, Dec 2017, Long Beach, United States. pp.1-13. ⟨hal-01629739⟩
  • Julien Pérolat, Bilal Piot, Matthieu Geist, Bruno Scherrer, Olivier Pietquin. Softened approximate policy iteration for Markov games. ICML 2016 - 33rd International Conference on Machine Learning, Jun 2016, New York City, United States. ⟨hal-01393328⟩
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Batch Policy Iteration Algorithms for Continuous Domains. European Workshop on Reinforcement Learning (EWRL), 2016, Barcelona, Spain. ⟨hal-01629651⟩
  • Layla El Asri, Bilal Piot, Matthieu Geist, Romain Laroche, Olivier Pietquin. Score-based Inverse Reinforcement Learning. International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2016), May 2016, Singapore, Singapore. ⟨hal-01406886⟩
  • Thibaut Munzer, Bilal Piot, Matthieu Geist, Olivier Pietquin, Manuel Lopes. Inverse Reinforcement Learning in Relational Domains. International Joint Conferences on Artificial Intelligence, Jul 2015, Buenos Aires, Argentina. ⟨hal-01154650⟩
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Méthode de minimisation du résidu de Bellman boostée qui tient compte des démonstrations expertes. 9èmes Journées Francophones de Planification, Décision et Apprentissage (JFPDA'14), May 2014, Liège, Belgium. ⟨hal-01104789⟩
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Boosted and Reward-regularized Classification for Apprenticeship Learning. AAMAS 2014 : 13th International Conference on Autonomous Agents and Multiagent Systems, May 2014, Paris, France. pp.1249-1256. ⟨hal-01107837⟩
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Difference of Convex Functions Programming for Reinforcement Learning. Advances in Neural Information Processing Systems (NIPS 2014), Dec 2014, Montreal, Canada. ⟨hal-01104419⟩
  • Bilal Piot, Olivier Pietquin, Matthieu Geist. Predicting when to laugh with structured classification. InterSpeech 2014, Sep 2014, Singapore, Singapore. pp.1786-1790. ⟨hal-01104739⟩
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Boosted Bellman Residual Minimization Handling Expert Demonstrations. European Conference, ECML PKDD 2014, Sep 2014, Nancy, France. pp.549-564, ⟨10.1007/978-3-662-44851-9_35⟩. ⟨hal-01060953⟩
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Apprentissage par démonstrations : vaut-il la peine d'estimer une fonction de récompense? Journées Francophones de Planification, Décision et Apprentissage (JFPDA), Jul 2013, Lille, France. ⟨hal-00916941⟩
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Learning from demonstrations: Is it worth estimating a reward function? 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2013), Oct 2013, Princeton, New Jersey, United States. ⟨hal-00916938⟩
  • Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. A cascaded supervised learning approach to inverse reinforcement learning. Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD 2013), Sep 2013, Prague, Czech Republic. pp.1-16, ⟨10.1007/978-3-642-40988-2_1⟩. ⟨hal-00869804⟩
  • Radoslaw Niewiadomski, Jennifer Hofmann, Jérôme Urbain, Tracey Platt, Johannes Wagner, et al. Laugh-aware virtual agent and its impact on user amusement. AAMAS '13, May 2013, Saint Paul, Minnesota, United States. pp.619-626. ⟨hal-00869751⟩
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Model-free POMDP optimisation of tutoring systems with echo-state networks. SIGDial 2013, Aug 2013, Metz, France. pp.102-106. ⟨hal-00869773⟩
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Optimisation par essaims particulaires de stratégies de dialogue. Journées Francophones de Planification, Décision et Apprentissage (JFPDA), Jul 2013, Lille, France. ⟨hal-00918425⟩
  • Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. Apprentissage par renforcement inverse en cascadant classification et régression. Journées Francophones de Planification, Décision et Apprentissage (JFPDA), Jul 2013, Lille, France. ⟨hal-00916942⟩
  • Matthieu Geist, Edouard Klein, Bilal Piot, Yann Guermeur, Olivier Pietquin. Around Inverse Reinforcement Learning and Score-based Classification. 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2013), Oct 2013, Princeton, New Jersey, United States. ⟨hal-00916936⟩
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Random Projections: a Remedy for Overfitting Issues in Time Series Prediction with Echo State Networks. ICASSP 2013, May 2013, Vancouver, Canada. pp.3253-3257, ⟨10.1109/ICASSP.2013.6638259⟩. ⟨hal-00869814⟩
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Classification régularisée par la récompense pour l'Apprentissage par Imitation. Journées Francophones de Planification, Décision et Apprentissage (JFPDA), Jul 2013, Lille, France. ⟨hal-00916940⟩
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Learning from Demonstrations: Is It Worth Estimating a Reward Function? Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD 2013), Sep 2013, Prague, Czech Republic. pp.17-32, ⟨10.1007/978-3-642-40988-2_2⟩. ⟨hal-00869801⟩
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Particle Swarm Optimisation of Spoken Dialogue System Strategies. Interspeech 2013, Aug 2013, Lyon, France. pp.1-5. ⟨hal-00916935⟩
  • Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin. Regroupement non-supervisé d'utilisateurs par leur comportement pour les systèmes de dialogue parlé. Journées Francophones sur la planification, la décision et l'apprentissage pour le contrôle des systèmes - JFPDA 2012, May 2012, Villers-lès-Nancy, France. 16 p. ⟨hal-00736205⟩
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Off-policy Learning in Large-scale POMDP-based Dialogue Systems. ICASSP 2012, Mar 2012, Kyoto, Japan. pp.4989-4992. ⟨hal-00684819⟩
  • Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin. Co-adaptation in Spoken Dialogue Systems. IWSDS 2012, Nov 2012, Paris, France. pp.1. ⟨hal-00778752⟩
  • Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. Structured Classification for Inverse Reinforcement Learning. EWRL 2012, Jun 2012, Edinburgh, United Kingdom. pp.1-14. ⟨hal-00749524⟩
  • Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin. Clustering Behaviors Of Spoken Dialogue Systems Users. ICASSP 2012, Mar 2012, Kyoto, Japan. pp.4981-4984. ⟨hal-00685009⟩
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Apprentissage off-policy appliqué à un système de dialogue basé sur les PDMPO. RFIA 2012 (Reconnaissance des Formes et Intelligence Artificielle), Jan 2012, Lyon, France. ISBN 978-2-9539515-2-3. ⟨hal-00656496⟩
  • Julien Oster, Matthieu Geist, Olivier Pietquin, Gari Clifford. Filtering of pathological ventricular rhythms during MRI scanning. BSI2012, Jul 2012, Como, Italy. pp.97-100. ⟨hal-00749457⟩
  • Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin. Behavior Specific User Simulation in Spoken Dialogue Systems. 10th ITG Conference on Speech Communication, Sep 2012, Braunschweig, Germany. pp.1-4. ⟨hal-00749421⟩
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Optimisation d'un tuteur intelligent à partir d'un jeu de données fixé. JEP 2012, Jun 2012, Grenoble, France. pp.241-248. ⟨hal-00749498⟩
  • Edouard Klein, Matthieu Geist, Bilal Piot, Olivier Pietquin. Inverse Reinforcement Learning through Structured Classification. NIPS 2012, Dec 2012, Lake Tahoe, Nevada, United States. pp.1-9. ⟨hal-00778624⟩
  • Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. Classification structurée pour l'apprentissage par renforcement inverse. Conférence Francophone sur l'Apprentissage Automatique - CAp 2012, May 2012, Nancy, France. pp.1-16. ⟨hal-00701947⟩
  • Olivier Pietquin, Matthieu Geist, Senthilkumar Chandramohan. Sample Efficient On-line Learning of Optimal Dialogue Policies with Kalman Temporal Differences. IJCAI 2011, Jul 2011, Barcelona, Spain. pp.1878-1883. ⟨hal-00618252⟩
  • Lucie Daubigney, Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin. Gestion de l'incertitude pour l'optimisation en ligne d'un gestionnaire de dialogues parlés à grande échelle basé sur les POMDP. JFPDA 2011, Jun 2011, Rouen, France. pp.1-7. ⟨hal-00652511⟩
  • Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin. User Simulation in Dialogue Systems using Inverse Reinforcement Learning. Interspeech 2011, Aug 2011, Florence, Italy. pp.1025-1028. ⟨hal-00652446⟩
  • Hadrien Glaude, Fadi Akrimi, Matthieu Geist, Olivier Pietquin. A Non-Parametric Approach to Approximate Dynamic Programming. ICMLA 2011, Dec 2011, Honolulu, Hawaii, United States. pp.1-6. ⟨hal-00652438⟩
  • Edouard Klein, Matthieu Geist, Olivier Pietquin. Batch, Off-policy and Model-Free Apprenticeship Learning. IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT 2011), Jun 2011, Barcelona, Spain. 6 p. ⟨hal-00596370⟩
  • Matthieu Geist, Olivier Pietquin. Parametric value function approximation: A unified view. ADPRL 2011, Apr 2011, Paris, France. pp.9-16, ⟨10.1109/ADPRL.2011.5967355⟩. ⟨hal-00618112⟩
  • Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin. Apprentissage par Renforcement Inverse pour la Simulation d'Utilisateurs dans les Systèmes de Dialogue. JFPDA 2011, Jun 2011, Rouen, France. pp.1-7. ⟨hal-00652753⟩
  • Lucie Daubigney, Milica Gašić, Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin, et al. Uncertainty management for on-line optimisation of a POMDP-based large-scale spoken dialogue system. Interspeech 2011, Aug 2011, Florence, Italy. pp.1301-1304. ⟨hal-00652194⟩
  • Olivier Pietquin, Lucie Daubigney, Matthieu Geist. Optimization of a Tutoring System from a Fixed Set of Data. SLaTE 2011, Aug 2011, Venice, Italy. pp.1-4. ⟨hal-00652324⟩
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Apprentissage par renforcement pour la personnalisation d'un logiciel d'enseignement des langues. EIAH 2011, May 2011, Mons, Belgium. pp.1-5. ⟨hal-00652516⟩
  • Matthieu Geist, Olivier Pietquin. Kalman filtering & colored noises: the (autoregressive) moving-average case. MLASA 2011, Dec 2011, Honolulu, United States. pp.1-4. ⟨hal-00660607⟩
  • Edouard Klein, Matthieu Geist, Olivier Pietquin. Reducing the dimensionality of the reward space in the Inverse Reinforcement Learning problem. MLASA 2011, Dec 2011, Honolulu, United States. pp.1-4. ⟨hal-00660612⟩
  • Jérémy Fix, Matthieu Geist, Olivier Pietquin, Hervé Frezza-Buet. Dynamic Neural Field Optimization using the Unscented Kalman Filter. CCMB 2011, Apr 2011, Paris, France. 7 p., ⟨10.1109/CCMB.2011.5952113⟩. ⟨hal-00618117⟩
  • Edouard Klein, Matthieu Geist, Olivier Pietquin. Apprentissage par imitation dans un cadre batch, off-policy et sans modèle. JFPDA 2011, Jun 2011, Rouen, France. pp.1-9. ⟨hal-00652762⟩
  • Edouard Klein, Matthieu Geist, Olivier Pietquin. Batch, Off-policy and Model-free Apprenticeship Learning. EWRL 2011, Sep 2011, Athens, Greece. pp.1-12. ⟨hal-00660623⟩
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Astuce du Noyau & Quantification Vectorielle. RFIA'10, Jan 2010, Caen, France. 8 p. ⟨hal-00553114⟩
  • Matthieu Geist, Olivier Pietquin. Revisiting natural actor-critics with value function approximation. BNAIC 2010, Oct 2010, Luxembourg, Luxembourg. 1 page. ⟨hal-00553175⟩
  • Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin. Optimizing Spoken Dialogue Management with Fitted Value Iteration. Interspeech 2010, Sep 2010, Makuhari, Japan. pp.86-89. ⟨hal-00553184⟩
  • Matthieu Geist, Olivier Pietquin. Managing Uncertainty within the KTD Framework. Active Learning and Experimental Design workshop in conjunction with AISTATS 2010, May 2010, Sardinia, Italy. pp.157-168. ⟨hal-00599636⟩
  • Matthieu Geist, Olivier Pietquin. Statistically Linearized Recursive Least Squares. MLSP 2010, Aug 2010, Kittilä, Finland. pp.272-276, ⟨10.1109/MLSP.2010.5589236⟩. ⟨hal-00553168⟩
  • Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin. Sparse Approximate Dynamic Programming for Dialog Management. SIGDial 2010, Sep 2010, Tokyo, Japan. pp.107-115. ⟨hal-00553180⟩
  • Matthieu Geist, Olivier Pietquin. Managing Uncertainty within Value Function Approximation in Reinforcement Learning. Active Learning and Experimental Design workshop (collocated with AISTATS 2010), May 2010, Sardinia, Italy. ⟨hal-00554398⟩
  • Matthieu Geist, Olivier Pietquin. Revisiting natural actor-critics with value function approximation. 5èmes Journées Francophones de Planification, Décision et Apprentissage pour la conduite de systèmes (JFPDA'10), Jun 2010, Besançon, France. ⟨hal-00554346⟩
  • Matthieu Geist, Olivier Pietquin. Statistically Linearized Least-Squares Temporal Differences. 5èmes Journées Francophones de Planification, Décision et Apprentissage pour la conduite de systèmes (JFPDA'10), Jun 2010, Besançon, France. ⟨hal-00554338⟩
  • Matthieu Geist, Olivier Pietquin. Revisiting Natural Actor-Critics with Value Function Approximation. MDAI 2010, Oct 2010, Perpignan, France. pp.207-218, ⟨10.1007/978-3-642-16292-3_21⟩. ⟨hal-00553870⟩
  • Matthieu Geist, Olivier Pietquin. Gestion de l'incertitude dans le cadre de l'approximation de la fonction de valeur pour l'apprentissage par renforcement. CAP 2010, May 2010, Clermont-Ferrand, France. pp.101-112. ⟨hal-00553895⟩
  • Matthieu Geist, Olivier Pietquin. Eligibility Traces through Colored Noises. ICUMT 2010, Oct 2010, Moscow, Russia. pp.458-465, ⟨10.1109/ICUMT.2010.5676597⟩. ⟨hal-00553910⟩
  • Matthieu Geist, Olivier Pietquin. Statistically linearized least-squares temporal differences. ICUMT 2010, Oct 2010, Moscow, Russia. pp.450-457, ⟨10.1109/ICUMT.2010.5676598⟩. ⟨hal-00553913⟩
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Différences Temporelles de Kalman : le cas stochastique. JFPDA 2009, Jun 2009, Paris, France. 13 p. ⟨hal-00437006⟩
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Différences Temporelles de Kalman. JFPDA 2009, Jun 2009, Paris, France. 20 p. ⟨hal-00437002⟩
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Tracking in Reinforcement Learning. 16th International Conference on Neural Information Processing - ICONIP 2009, Dec 2009, Bangkok, Thailand. pp.502-511, ⟨10.1007/978-3-642-10677-4_57⟩. ⟨hal-00439316⟩
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Kernelizing Vector Quantization Algorithms. ESANN'2009, Apr 2009, Bruges, Belgium. pp.541-546. ⟨hal-00429892⟩
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Kalman Temporal Differences: the deterministic case. ADPRL 2009, Mar 2009, Nashville, TN, United States. pp.185-192, ⟨10.1109/ADPRL.2009.4927543⟩. ⟨hal-00380870⟩
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Kalman Temporal Differences: Uncertainty and Value Function Approximation. NIPS Workshop on Model Uncertainty and Risk in Reinforcement Learning, Dec 2008, Vancouver, Canada. ⟨hal-00351298⟩
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Filtrage bayésien de la récompense. JFPDA 2008, Jun 2008, Metz, France. pp.113-122. ⟨hal-00351343⟩
  • Matthieu Geist, Olivier Pietquin. Bayesian Reward Filtering. EWRL 2008, Jun 2008, Lille, France. pp.96-109, ⟨10.1007/978-3-540-89722-4_8⟩. ⟨hal-00351282⟩
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Online Bayesian Kernel Regression from Nonlinear Mapping of Observations. MLSP 2008, Oct 2008, Cancun, Mexico. pp.309-314, ⟨10.1109/MLSP.2008.4685498⟩. ⟨hal-00335052⟩
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. A Sparse Nonlinear Bayesian Online Kernel Regression. AdvComp 2008, Oct 2008, Valencia, Spain. pp.199-204, ⟨10.1109/ADVCOMP.2008.7⟩. ⟨hal-00327081⟩

Patents (1 document)

  • Gari Clifford, Julien Oster, Olivier Pietquin, Matthieu Geist. Periodic Artifact Reduction from Biomedical Signals. France, Patent No. WO/2013/052944. 2013. ⟨hal-00869739⟩

Other publications (1 document)

  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Apprentissage off-policy appliqué à un système de dialogue basé sur les PDMPO. Actes du 18ème congrès francophone sur la Reconnaissance de Formes et l'Intelligence Artificielle (RFIA 2012), 2012, pp.1-8. ⟨hal-00656997⟩

Preprints, Working Papers, ... (1 document)

  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Difference of Convex Functions Programming Applied to Control with Expert Data. 2017. ⟨hal-01629653⟩

Reports (1 document)

  • Filip Jurcicek, Milica Gašić, Steve Young, Ghislain Putois, Romain Laroche, et al. Online adaptation of dialogue systems. 2011. ⟨hal-00652841⟩