Number of documents

114

Matthieu Geist


Journal articles: 13 documents

  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Bridging the Gap Between Imitation Learning and Inverse Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems, IEEE, 2017, 28 (8), pp.1814 - 1826. 〈10.1109/TNNLS.2016.2543000〉. 〈hal-01629654〉
  • Bruno Scherrer, Matthieu Geist. Recherche locale de politique dans un espace convexe. Revue des Sciences et Technologies de l'Information - Série RIA : Revue d'Intelligence Artificielle, Lavoisier, 2015, 29 (6), pp.685-704. 〈10.3166/RIA.29.685-706〉. 〈hal-01275247〉
  • Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, Boris Lesner, Matthieu Geist. Approximate Modified Policy Iteration and its Application to the Game of Tetris. Journal of Machine Learning Research, Journal of Machine Learning Research, 2015, 16, pp.1629−1676. 〈hal-01091341〉
  • Matthieu Geist. Soft-max boosting. Machine Learning, Springer Verlag, 2015, 100 (2), pp.305-332. 〈http://link.springer.com/article/10.1007/s10994-015-5491-2〉. 〈10.1007/s10994-015-5491-2〉. 〈hal-01258816〉
  • Matthieu Geist, Bruno Scherrer. Off-policy Learning with Eligibility Traces: A Survey. Journal of Machine Learning Research, Journal of Machine Learning Research, 2014, 15 (1), pp.289-333. 〈hal-00921275〉
  • Matthieu Geist, Olivier Pietquin. An algorithmic Survey of Parametric Value Function Approximation. IEEE Transactions on Neural Networks and Learning Systems, IEEE, 2013, 24 (6), pp.845-867. 〈10.1109/TNNLS.2013.2247418〉. 〈hal-00869725〉
  • Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. Classification structurée pour l'apprentissage par renforcement inverse. Revue des Sciences et Technologies de l'Information - Série RIA : Revue d'Intelligence Artificielle, Lavoisier, 2013, 27 (2), pp.155-169. 〈10.3166/ria.27.155-169〉. 〈hal-00869723〉
  • Hervé Frezza-Buet, Matthieu Geist. A C++ Template-Based Reinforcement Learning Library: Fitting the Code to the Mathematics. Journal of Machine Learning Research, Journal of Machine Learning Research, 2013, 14 (1), pp.625-628. 〈hal-00914768〉
  • Lucie Daubigney, Matthieu Geist, Senthilkumar Chandramohan, Olivier Pietquin. A Comprehensive Reinforcement Learning Framework for Dialogue Management Optimisation. IEEE Journal of Selected Topics in Signal Processing, 2012, 6 (8), pp.891-902. 〈10.1109/JSTSP.2012.2229257〉. 〈hal-00771646〉
  • Olivier Pietquin, Matthieu Geist, Senthilkumar Chandramohan, Hervé Frezza-Buet. Sample-Efficient Batch Reinforcement Learning for Dialogue Management Optimization. ACM - Transactions on Speech and Language Processing, Association for Computing Machinery, 2011, 7 (3), pp.art. 7 (1-21). 〈10.1145/1966407.1966412〉. 〈hal-00617517〉
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Différences temporelles de Kalman: Cas déterministe. Revue des Sciences et Technologies de l'Information - Série RIA : Revue d'Intelligence Artificielle, Lavoisier, 2010, 24 (4), pp.423-443. 〈10.3166/ria.24.423-443〉. 〈hal-00512093〉
  • Matthieu Geist, Olivier Pietquin. Kalman Temporal Differences. Journal of Artificial Intelligence Research, Association for the Advancement of Artificial Intelligence, 2010, 39, pp.483-532. 〈hal-00858687〉
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. From Supervised to Reinforcement Learning: a Kernel-based Bayesian Filtering Framework. International Journal On Advances in Software, IARIA, 2009, 2 (1), pp.101-116. 〈hal-00429891〉

Conference papers: 91 documents

  • Deepika Singh, Erinc Merdivan, Ismini Psychoula, Johannes Kropf, Sten Hanke, et al.. Human activity recognition using recurrent neural networks. International Cross-Domain Conference for Machine Learning and Knowledge Extraction (CD-MAKE), 2017, Reggio di Calabria, Italy. 〈hal-01629704〉
  • Anush Manukyan, Miguel Olivares-Mendez, Holger Voos, Matthieu Geist. Real time degradation identification of UAV using machine learning techniques. International Conference on Unmanned Aircraft Systems (ICUAS), 2017, Miami, United States. 〈hal-01629680〉
  • Matthieu Geist, Bilal Piot, Olivier Pietquin. Faut-il minimiser le résidu de Bellman ou maximiser la valeur moyenne ?. Journées Francophones sur la Planification, la Décision et l'Apprentissage pour la conduite de systèmes (JFPDA 2017), Jul 2017, Caen, France. Actes des Journées Francophones sur la Planification, la Décision et l'Apprentissage pour la conduite de systèmes (JFPDA 2017). 〈hal-01576347〉
  • Erinc Merdivan, Mohammad Loghmani, Matthieu Geist. Reconstruct & Crush Network. Advances in Neural Information Processing Systems, 2017, Long Beach, United States. 〈hal-01629742〉
  • Matthieu Geist, Bilal Piot, Olivier Pietquin. Is the Bellman residual a bad proxy?. NIPS 2017 - Advances in Neural Information Processing Systems, Dec 2017, Long Beach, United States. pp.1-13. 〈hal-01629739〉
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Batch Policy Iteration Algorithms for Continuous Domains. European Workshop on Reinforcement Learning (EWRL), 2016, Barcelona, Spain. 〈hal-01629651〉
  • Julien Pérolat, Bilal Piot, Matthieu Geist, Bruno Scherrer, Olivier Pietquin. Softened Approximate Policy Iteration for Markov Games. ICML 2016 - 33rd International Conference on Machine Learning, Jun 2016, New York City, United States. 〈hal-01393328〉
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Imitation Learning Applied to Embodied Conversational Agents. JMLR Workshop and Conference Proceedings. 4th Workshop on Machine Learning for Interactive Systems (MLIS 2015), Jul 2015, Lille, France. 43, Proceedings of the 4th Workshop on Machine Learning for Interactive Systems. 〈http://mlis-workshop.org/2015/〉. 〈hal-01225816〉
  • Matthieu Geist. A multiplicative UCB strategy for Gamma rewards. European Workshop on Reinforcement Learning, 2015, Lille, France. 〈hal-01258820〉
  • Thibaut Munzer, Bilal Piot, Matthieu Geist, Olivier Pietquin, Manuel Lopes. Inverse Reinforcement Learning in Relational Domains. International Joint Conferences on Artificial Intelligence, Jul 2015, Buenos Aires, Argentina. 〈hal-01154650〉
  • Bruno Scherrer, Matthieu Geist. Local Policy Search in a Convex Space and Conservative Policy Iteration as Boosted Policy Search. ECMLPKDD 2014, Sep 2014, Nancy, France. Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 8726, pp.35 - 50, 2014, Lecture Notes in Computer Science. 〈10.1007/978-3-662-44845-8_3〉. 〈hal-01086345〉
  • Bruno Scherrer, Matthieu Geist. Quand l'optimalité locale implique une garantie globale : recherche locale de politique dans un espace convexe et algorithme d'itération sur les politiques conservatif vu comme une montée de gradient fonctionnel. 9èmes Journées Francophones de Planification, Décision et Apprentissage (JFPDA'14), May 2014, Liège, Belgium. 2014. 〈hal-01104776〉
  • Bilal Piot, Olivier Pietquin, Matthieu Geist. Predicting when to laugh with structured classification. InterSpeech 2014, Sep 2014, Singapore, Singapore. Proceedings of the Annual Conference of the International Speech Communication Association, pp.1786-1790, 2014, 〈http://www.isca-speech.org/archive/archive_papers/interspeech_2014/i14_1786.pdf〉. 〈hal-01104739〉
  • Bruno Scherrer, Matthieu Geist. Local Policy Search in a Convex Space and Conservative Policy Iteration as Boosted Policy Search. ECML, Sep 2014, Nancy, France. pp.35 - 50, 2014, 〈10.1007/978-3-662-44845-8_3〉. 〈hal-01091079〉
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Difference of Convex Functions Programming for Reinforcement Learning. Advances in Neural Information Processing Systems (NIPS 2014), Dec 2014, Montreal, Canada. 〈http://nips.cc/Conferences/2014/〉. 〈hal-01104419〉
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Méthode de minimisation du résidu de Bellman boostée qui tient compte des démonstrations expertes. 9èmes Journées Francophones de Planification, Décision et Apprentissage (JFPDA'14), May 2014, Liège, Belgium. 2014. 〈hal-01104789〉
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Boosted and Reward-regularized Classification for Apprenticeship Learning. AAMAS 2014 : 13th International Conference on Autonomous Agents and Multiagent Systems, May 2014, Paris, France. ACM, pp.1249-1256. 〈hal-01107837〉
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Boosted Bellman Residual Minimization Handling Expert Demonstrations. European Conference, ECML PKDD 2014, Sep 2014, Nancy, France. 8725, pp.549-564, 2014, Lecture Notes in Computer Science. 〈10.1007/978-3-662-44851-9_35〉. 〈hal-01060953〉
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Random Projections: a Remedy for Overfitting Issues in Time Series Prediction with Echo State Networks. ICASSP 2013, May 2013, Vancouver, Canada. ICASSP 2013, pp.3253-3257, 2013, 〈10.1109/ICASSP.2013.6638259〉. 〈hal-00869814〉
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Learning from Demonstrations: Is It Worth Estimating a Reward Function?. Hendrik Blockeel, Kristian Kersting, Siegfried Nijssen, Filip Železný. Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD 2013), Sep 2013, Prague, Czech Republic. Springer, Lecture Notes in Computer Science, 8188, pp.17-32, 2013, Machine Learning and Knowledge Discovery in Databases. 〈10.1007/978-3-642-40988-2_2〉. 〈hal-00869801〉
  • Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. A cascaded supervised learning approach to inverse reinforcement learning. Hendrik Blockeel, Kristian Kersting, Siegfried Nijssen, Filip Železný. Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD 2013), Sep 2013, Prague, Czech Republic. Springer, Lecture Notes in Computer Science, 8188, pp.1-16, 2013, Machine Learning and Knowledge Discovery in Databases. 〈10.1007/978-3-642-40988-2_1〉. 〈hal-00869804〉
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Classification régularisée par la récompense pour l'Apprentissage par Imitation. Journées Francophones de Planification, Décision et Apprentissage (JFPDA), Jul 2013, Lille, France. 〈hal-00916940〉
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Particle Swarm Optimisation of Spoken Dialogue System Strategies. Interspeech 2013, Aug 2013, Lyon, France. pp.1-5, 2013. 〈hal-00916935〉
  • Matthieu Geist, Edouard Klein, Bilal Piot, Yann Guermeur, Olivier Pietquin. Around Inverse Reinforcement Learning and Score-based Classification. 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2013), Oct 2013, Princeton, New Jersey, United States. 〈hal-00916936〉
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Learning from demonstrations: Is it worth estimating a reward function?. 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2013), Oct 2013, Princeton, New Jersey, United States. 〈hal-00916938〉
  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Apprentissage par démonstrations : vaut-il la peine d'estimer une fonction de récompense?. Journées Francophones de Planification, Décision et Apprentissage (JFPDA), Jul 2013, Lille, France. 〈hal-00916941〉
  • Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. Apprentissage par renforcement inverse en cascadant classification et régression. Journées Francophones de Planification, Décision et Apprentissage (JFPDA), Jul 2013, Lille, France. 〈hal-00916942〉
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Optimisation par essaims particulaires de stratégies de dialogue. Journées Francophones de Planification, Décision et Apprentissage (JFPDA), Jul 2013, Lille, France. 〈hal-00918425〉
  • Radoslaw Niewiadomski, Jennifer Hofmann, Jérôme Urbain, Tracey Platt, Johannes Wagner, et al.. Laugh-aware virtual agent and its impact on user amusement. AAMAS '13, May 2013, Saint Paul, Minnesota, United States. pp.619-626, 2013. 〈hal-00869751〉
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Model-free POMDP optimisation of tutoring systems with echo-state networks. SIGDial 2013, Aug 2013, Metz, France. pp.102-106, 2013. 〈hal-00869773〉
  • Edouard Klein, Matthieu Geist, Bilal Piot, Olivier Pietquin. Inverse Reinforcement Learning through Structured Classification. NIPS 2012, Dec 2012, Lake Tahoe, Nevada, United States. pp.1-9, 2012. 〈hal-00778624〉
  • Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin. Co-adaptation in Spoken Dialogue Systems. IWSDS 2012, Nov 2012, Paris, France. pp.1, 2012. 〈hal-00778752〉
  • Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin. Regroupement non-supervisé d'utilisateurs par leur comportement pour les systèmes de dialogue parlé. Olivier Buffet. Journées Francophones sur la planification, la décision et l'apprentissage pour le contrôle des systèmes - JFPDA 2012, May 2012, Villers-lès-Nancy, France. 16 p, 2012. 〈hal-00736205〉
  • Matthieu Geist, Bruno Scherrer, Alessandro Lazaric, Mohammad Ghavamzadeh. Un sélecteur de Dantzig pour l'apprentissage par différences temporelles. Olivier Buffet. Journées Francophones sur la planification, la décision et l'apprentissage pour le contrôle des systèmes - JFPDA 2012, May 2012, Villers-lès-Nancy, France. 13 p, 2012. 〈hal-00736229〉
  • Bruno Scherrer, Victor Gabillon, Mohammad Ghavamzadeh, Matthieu Geist. Approximations de l'Algorithme Itérations sur les Politiques Modifié. Olivier Buffet. Journées Francophones sur la planification, la décision et l'apprentissage pour le contrôle des systèmes - JFPDA 2012, May 2012, Villers-lès-Nancy, France. 1 p, 2012, 〈http://icml.cc/2012/papers/608.pdf〉. 〈hal-00736226〉
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Apprentissage off-policy appliqué à un système de dialogue basé sur les PDMPO. RFIA 2012 (Reconnaissance des Formes et Intelligence Artificielle), Jan 2012, Lyon, France. pp.978-2-9539515-2-3, 2012. 〈hal-00656496〉
  • Jérémy Fix, Matthieu Geist. Optimisation de contrôleurs par essaim particulaire. Conférence Francophone sur l'Apprentissage Automatique - CAp 2012, May 2012, Nancy, France. pp.1-14, 2012. 〈hal-00701945〉
  • Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. Classification structurée pour l'apprentissage par renforcement inverse. Conférence Francophone sur l'Apprentissage Automatique - CAp 2012, May 2012, Nancy, France. pp.1-16, 2012. 〈hal-00701947〉
  • Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, Matthieu Geist. Approximate Modified Policy Iteration. 29th International Conference on Machine Learning - ICML 2012, Jun 2012, Edinburgh, United Kingdom. 2012. 〈hal-00758882〉
  • Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin. Clustering Behaviors Of Spoken Dialogue Systems Users. ICASSP 2012, Mar 2012, Kyoto, Japan. IEEE, pp.4981-4984, 2012. 〈hal-00685009〉
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Off-policy Learning in Large-scale POMDP-based Dialogue Systems. ICASSP 2012, Mar 2012, Kyoto, Japan. IEEE, pp.4989-4992, 2012. 〈hal-00684819〉
  • Julien Oster, Matthieu Geist, Olivier Pietquin, Gary Clifford. Filtering of pathological ventricular rhythms during MRI scanning. BSI2012, Jul 2012, Como, Italy. pp.97-100, 2012. 〈hal-00749457〉
  • Matthieu Geist, Bruno Scherrer, Alessandro Lazaric, Mohammad Ghavamzadeh. A Dantzig Selector Approach to Temporal Difference Learning. John Langford and Joelle Pineau. ICML-12, Jun 2012, Edinburgh, United Kingdom. Omnipress, pp.1399-1406, 2012. 〈hal-00749480〉
  • Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. Structured Classification for Inverse Reinforcement Learning. EWRL 2012, Jun 2012, Edinburgh, United Kingdom. 24, pp.1-14, 2012. 〈hal-00749524〉
  • Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin. Behavior Specific User Simulation in Spoken Dialogue Systems. 10th ITG Conference on Speech Communication, Sep 2012, Braunschweig, Germany. pp.1-4, 2012. 〈hal-00749421〉
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Optimisation d'un tuteur intelligent à partir d'un jeu de données fixé. JEP 2012, Jun 2012, Grenoble, France. 1 (JEP), pp.241-248, 2012. 〈hal-00749498〉
  • Jérémy Fix, Matthieu Geist. Monte-Carlo Swarm Policy Search. Rutkowski, Leszek and Korytkowski, Marcin and Scherer, Rafal and Tadeusiewicz, Ryszard and Zadeh, Lotfi and Zurada, Jacek. Symposium on Swarm Intelligence and Differential Evolution, Apr 2012, Zakopane, Poland. Springer Berlin / Heidelberg, 7269, pp.75-83, 2012, Lecture Notes in Computer Science. 〈10.1007/978-3-642-29353-5_9〉. 〈hal-00695540〉
  • Edouard Klein, Matthieu Geist, Olivier Pietquin. Apprentissage par imitation dans un cadre batch, off-policy et sans modèle. JFPDA 2011, Jun 2011, Rouen, France. pp.1-9, 2011. 〈hal-00652762〉
  • Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin. Apprentissage par Renforcement Inverse pour la Simulation d'Utilisateurs dans les Systèmes de Dialogue. JFPDA 2011, Jun 2011, Rouen, France. pp.1-7, 2011. 〈hal-00652753〉
  • Lucie Daubigney, Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin. Gestion de l'incertitude pour l'optimisation en ligne d'un gestionnaire de dialogues parlés à grande échelle basé sur les POMDP. JFPDA 2011, Jun 2011, Rouen, France. pp.1-7, 2011. 〈hal-00652511〉
  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Apprentissage par renforcement pour la personnalisation d'un logiciel d'enseignement des langues. EIAH 2011, May 2011, Mons, Belgium. pp.1-5, 2011. 〈hal-00652516〉
  • Olivier Pietquin, Lucie Daubigney, Matthieu Geist. Optimization of a Tutoring System from a Fixed Set of Data. SLaTE 2011, Aug 2011, Venice, Italy. pp.1-4, 2011. 〈hal-00652324〉
  • Lucie Daubigney, Milica Gašić, Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin, et al.. Uncertainty management for on-line optimisation of a POMDP-based large-scale spoken dialogue system. Interspeech 2011, Aug 2011, Florence, Italy. pp.1301-1304, 2011. 〈hal-00652194〉
  • Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefevre, Olivier Pietquin. User Simulation in Dialogue Systems using Inverse Reinforcement Learning. Interspeech 2011, Aug 2011, Florence, Italy. pp.1025-1028, 2011. 〈hal-00652446〉
  • Hadrien Glaude, Fadi Akrimi, Matthieu Geist, Olivier Pietquin. A Non-Parametric Approach to Approximate Dynamic Programming. ICMLA 2011, Dec 2011, Honolulu, Hawaii, United States. pp.1-6, 2011. 〈hal-00652438〉
  • Remi Chou, Yvo Boers, Martin Podt, Matthieu Geist. Performance evaluation for particle filters. FUSION 2011, Jul 2011, Chicago, United States. pp.1-7, 2011. 〈hal-00652168〉
  • Edouard Klein, Matthieu Geist, Olivier Pietquin. Reducing the dimentionality of the reward space in the Inverse Reinforcement Learning problem. MLASA 2011, Dec 2011, Honolulu, United States. pp.1-4, 2011. 〈hal-00660612〉
  • Edouard Klein, Matthieu Geist, Olivier Pietquin. Batch, Off-policy and Model-free Apprenticeship Learning. EWRL 2011, Sep 2011, Athens, Greece. pp.1-12, 2011. 〈hal-00660623〉
  • Matthieu Geist, Olivier Pietquin. Kalman filtering & colored noises: the (autoregressive) moving-average case. MLASA 2011, Dec 2011, Honolulu, United States. pp.1-4, 2011. 〈hal-00660607〉
  • Olivier Pietquin, Matthieu Geist, Senthilkumar Chandramohan. Sample Efficient On-line Learning of Optimal Dialogue Policies with Kalman Temporal Differences. IJCAI 2011, Jul 2011, Barcelona, Spain. pp.1878-1883, 2011. 〈hal-00618252〉
  • Jérémy Fix, Matthieu Geist, Olivier Pietquin, Hervé Frezza-Buet. Dynamic Neural Field Optimization using the Unscented Kalman Filter. CCMB 2011, Apr 2011, Paris, France. 7 p., 2011, 〈10.1109/CCMB.2011.5952113〉. 〈hal-00618117〉
  • Matthieu Geist, Olivier Pietquin. Parametric value function approximation: A unified view. ADPRL 2011, Apr 2011, Paris, France. pp.9-16, 2011, 〈10.1109/ADPRL.2011.5967355〉. 〈hal-00618112〉
  • Edouard Klein, Matthieu Geist, Olivier Pietquin. Batch, Off-policy and Model-Free Apprenticeship Learning. IJCAI Workshop on Agents Learning Interactively from Human Teachers (ALIHT 2011), Jun 2011, Barcelona, Spain. 6 p., 2011. 〈hal-00596370〉
  • Bruno Scherrer, Matthieu Geist. Moindres carrés récursifs pour l'évaluation off-policy d'une politique avec traces d'éligibilité. 6ème Journées Francophones de Planification, Décision et Apprentissage pour la conduite de systèmes - JFPDA 2011, Jun 2011, Rouen, France. 2011. 〈hal-00644874〉
  • Bruno Scherrer, Matthieu Geist. Recursive Least-Squares Learning with Eligibility Traces. European Workshop on Reinforcement Learning (EWRL 11), Sep 2011, Athens, Greece. 2011. 〈hal-00644511〉
  • Matthieu Geist, Bruno Scherrer. l1-penalized projected Bellman residual. European Workshop on Reinforcement Learning (EWRL 11), Sep 2011, Athens, Greece. 2011. 〈hal-00644507〉
  • Matthieu Geist, Olivier Pietquin. Revisiting natural actor-critics with value function approximation. BNAIC 2010, Oct 2010, Luxembourg, Luxembourg. 1 page, 2010. 〈hal-00553175〉
  • Matthieu Geist, Olivier Pietquin. Statistically Linearized Recursive Least Squares. MLSP 2010, Aug 2010, Kittilä, Finland. pp.272-276, 2010, 〈10.1109/MLSP.2010.5589236〉. 〈hal-00553168〉
  • Matthieu Geist, Olivier Pietquin. Gestion de l'incertitude dans le cadre de l'approximation de la fonction de valeur pour l'apprentissage par renforcement. CAP 2010, May 2010, Clermont-Ferrand, France. PUG, pp.101-112, 2010. 〈hal-00553895〉
  • Matthieu Geist, Olivier Pietquin. Statistically Linearized Least-Squares Temporal Differences. 5èmes Journées Francophones de Planification, Décision et Apprentissage pour la conduite de systèmes (JFPDA'10), Jun 2010, Besançon, France. 〈hal-00554338〉
  • Matthieu Geist. Statistical Linearization for Value Function Approximation in Reinforcement Learning. NIPS Workshop on Learning and Planning from Batch Time Series Data (OPT 2010), Dec 2010, Vancouver, Canada. pp.1-6, 2010. 〈hal-00554324〉
  • Matthieu Geist, Olivier Pietquin. Revisiting natural actor-critics with value function approximation. 5èmes Journées Francophones de Planification, Décision et Apprentissage pour la conduite de systèmes (JFPDA'10), Jun 2010, Besançon, France. 〈hal-00554346〉
  • Matthieu Geist, Olivier Pietquin. Eligibility Traces through Colored Noises. ICUMT 2010, Oct 2010, Moscow, Russia. pp.458-465, 2010, 〈10.1109/ICUMT.2010.5676597〉. 〈hal-00553910〉
  • Matthieu Geist, Olivier Pietquin. Revisiting Natural Actor-Critics with Value Function Approximation. V. Torra and Y. Narukawa and M. Daumas. MDAI 2010, Oct 2010, Perpignan, France. Springer Verlag - Heidelberg Berlin, 6408, pp.207-218, 2010, Lecture Notes in Computer Science (LNCS). 〈10.1007/978-3-642-16292-3_21〉. 〈hal-00553870〉
  • Matthieu Geist, Olivier Pietquin. Statistically linearized least-squares temporal differences. ICUMT 2010, Oct 2010, Moscow, Russia. pp.450-457, 2010, 〈10.1109/ICUMT.2010.5676598〉. 〈hal-00553913〉
  • Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin. Optimizing Spoken Dialogue Management with Fitted Value Iteration. Interspeech 2010, Sep 2010, Makuhari, Japan. ISCA, pp.86-89, 2010. 〈hal-00553184〉
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Astuce du Noyau & Quantification Vectorielle. RFIA'10, Jan 2010, Caen, France. 8 p., 2010. 〈hal-00553114〉
  • Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin. Sparse Approximate Dynamic Programming for Dialog Management. SIGDial 2010, Sep 2010, Tokyo, Japan. ACL, pp.107-115, 2010. 〈hal-00553180〉
  • Matthieu Geist, Olivier Pietquin. Managing Uncertainty within Value Function Approximation in Reinforcement Learning. Active Learning and Experimental Design workshop (collocated with AISTATS 2010), May 2010, Sardinia, Italy. 〈hal-00554398〉
  • Matthieu Geist, Olivier Pietquin. Managing Uncertainty within the KTD Framework. Active Learning and Experimental Design workshop in conjunction with AISTATS 2010, May 2010, Sardinia, Italy. 16, pp.157-168, 2011. 〈hal-00599636〉
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Différences Temporelles de Kalman. JFPDA 2009, Jun 2009, Paris, France. (20 p.), 2009. 〈hal-00437002〉
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Différences Temporelles de Kalman : le cas stochastique. JFPDA 2009, Jun 2009, Paris, France. (13 p.), 2009. 〈hal-00437006〉
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Kalman Temporal Differences: the deterministic case. ADPRL 2009, Mar 2009, Nashville, TN, United States. pp.185-192, 2009, 〈10.1109/ADPRL.2009.4927543〉. 〈hal-00380870〉
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Kernelizing Vector Quantization Algorithms. ESANN'2009, Apr 2009, Bruges, Belgium. pp.541-546, 2009. 〈hal-00429892〉
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Tracking in Reinforcement Learning. 16th International Conference on Neural Information Processing - ICONIP 2009, Dec 2009, Bangkok, Thailand. Springer, 5863, pp.502-511, 2009, Lecture Notes in Computer Science (LNCS). 〈10.1007/978-3-642-10677-4_57〉. 〈hal-00439316〉
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. A Sparse Nonlinear Bayesian Online Kernel Regression. AdvComp 2008, Oct 2008, Valencia, Spain. pp.199-204, 2008, 〈10.1109/ADVCOMP.2008.7〉. 〈hal-00327081〉
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Online Bayesian Kernel Regression from Nonlinear Mapping of Observations. MLSP 2008, Oct 2008, Cancun, Mexico. pp.309-314, 2008, 〈10.1109/MLSP.2008.4685498〉. 〈hal-00335052〉
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Filtrage bayésien de la récompense. JFPDA 2008, Jun 2008, Metz, France. pp.113-122, 2008. 〈hal-00351343〉
  • Matthieu Geist, Olivier Pietquin. Bayesian Reward Filtering. EWRL 2008, Jun 2008, Lille, France. 5323, pp.96-109, 2008, Lecture Notes in Computer Science (LNCS). 〈10.1007/978-3-540-89722-4_8〉. 〈hal-00351282〉
  • Matthieu Geist. Kalman Temporal Differences. Cross-border workshop of PhD students in fundamental and applied mathematics (LMAM - UPVM), Dec 2008, Metz, France. 〈hal-00351297〉
  • Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Kalman Temporal Differences: Uncertainty and Value Function Approximation. NIPS Workshop on Model Uncertainty and Risk in Reinforcement Learning, Dec 2008, Vancouver, Canada. 〈hal-00351298〉

Book chapter: 1 document

  • Deepika Singh, Erinc Merdivan, Sten Hanke, Johannes Kropf, Matthieu Geist, et al.. Convolutional and Recurrent Neural Networks for Activity Recognition in Smart Environment. A. Holzinger; R. Goebel; M. Ferri; V. Palade. Towards Integrative Machine Learning and Knowledge Extraction, 10344, Springer, pp.194-205, 2017, Lecture Notes in Computer Science. 〈hal-01629732〉

Patent: 1 document

  • Gari Clifford, Julien Oster, Olivier Pietquin, Matthieu Geist. PERIODIC ARTIFACT REDUCTION FROM BIOMEDICAL SIGNALS. France, Patent no. WO/2013/052944. 2013. 〈hal-00869739〉

Other publication: 1 document

  • Lucie Daubigney, Matthieu Geist, Olivier Pietquin. Apprentissage off-policy appliqué à un système de dialogue basé sur les PDMPO. (Reference to be deleted). 2012, pp.1-8. 〈hal-00656997〉

Preprint, working paper: 2 documents

  • Bilal Piot, Matthieu Geist, Olivier Pietquin. Difference of Convex Functions Programming Applied to Control with Expert Data. 2017. 〈hal-01629653〉
  • Bruno Scherrer, Matthieu Geist. Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee. 2013. 〈hal-00829548〉

Reports: 3 documents

  • Matthieu Geist, Bruno Scherrer. Off-policy Learning with Eligibility Traces: A Survey. [Research Report] 2013, pp.43. 〈hal-00644516v2〉
  • Bruno Scherrer, Victor Gabillon, Mohammad Ghavamzadeh, Matthieu Geist. Approximate Modified Policy Iteration. [Research Report] 2012. 〈hal-00697169v2〉
  • Filip Jurcicek, Milica Gašić, Steve Young, Ghislain Putois, Romain Laroche, et al.. Online adaptation of dialogue systems. 2011. 〈hal-00652841〉

Thesis: 1 document

  • Matthieu Geist. Optimisation des chaînes de production dans l'industrie sidérurgique : une approche statistique de l'apprentissage par renforcement. Mathematics [math]. Université de Metz, 2009. French. 〈tel-00441557〉

HDR (habilitation thesis): 1 document

  • Matthieu Geist. Contrôle optimal et apprentissage automatique, applications aux interactions homme-machine. Machine Learning [stat.ML]. Université de Lille 1 - Sciences et Technologies, 2016. 〈tel-01629638〉