Search - HAL open archive

88 results


Building Controllers for Tetris. Christophe Thiery, Bruno Scherrer. International Computer Games Association Journal, 2009, 32, pp. 3-11. Journal article. inria-00418954v1
Classification-based Policy Iteration with a Critic. Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, Bruno Scherrer. 2011. Report. hal-00590972v1
Guide pratique pour la conception de systèmes de culture légumiers économes en produits phytopharmaceutiques. Marine Launais, Ludovic Bzdrenga, Vianney Estorgues, Vincent V. Faloya, Benoit B. Jeannequin, et al. 178 p., 2014. Book. hal-02800645v1
Simulations de carrières et retraites à points dans 3 cadres macro-économiques: modèle du gouvernement Philippe (âge-pivot bloqué), modèle du gouvernement Philippe corrigé (âge-pivot glissant), modèle Destinie2 (avec revalorisation de la fonction publique). Bruno Scherrer. [Research Report] INRIA. 2020. Report. hal-03137362v1
Abstraction Pathologies In Markov Decision Processes. Manel Tagorti, Bruno Scherrer, Olivier Buffet, Joerg Hoffmann. ICAPS'13 workshop on Heuristics and Search for Domain-independent Planning (HSDIP), Jun 2013, Rome, Italy. Conference paper. hal-00907315v1
Approximate dynamic programming for two-player zero-sum Markov games. Julien Perolat, Bruno Scherrer, Bilal Piot, Olivier Pietquin. International Conference on Machine Learning (ICML 2015), Jul 2015, Lille, France. Conference paper. hal-01153270v3
Sur l'utilisation de politiques non-stationnaires pour les processus de décision Markoviens à horizon infini. Bruno Scherrer, Boris Lesner. JFPDA - 8èmes Journées Francophones sur la Planification, la Décision et l'Apprentissage pour la conduite de systèmes - 2013, Jul 2013, Lille, France. Conference paper. hal-00921291v1
Optimal control subsumes harmonic control. Amine Boumaza, Bruno Scherrer. [Research Report] 2006, pp. 8. Report. inria-00119243v1
On the Use of Non-Stationary Strategies for Solving Two-Player Zero-Sum Markov Games. Julien Pérolat, Bilal Piot, Bruno Scherrer, Olivier Pietquin. 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016), May 2016, Cadiz, Spain. Conference paper. hal-01291495v1
Abstraction Pathologies In Markov Decision Processes. Manel Tagorti, Bruno Scherrer, Olivier Buffet, Joerg Hoffmann. 8èmes Journées Francophones sur la Planification, la Décision et l'Apprentissage pour la conduite de systèmes, Jul 2013, Lille, France. Conference paper. hal-00907295v1
Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning. Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, et al. NeurIPS - 34th Conference on Neural Information Processing Systems, Dec 2020, Vancouver / Online, Canada. Conference paper. hal-03137351v1
A Theory of Regularized Markov Decision Processes. Matthieu Geist, Bruno Scherrer, Olivier Pietquin. ICML 2019 - Thirty-sixth International Conference on Machine Learning, Jun 2019, Long Island, United States. Conference paper. hal-02273741v1
Contributions algorithmiques au contrôle optimal stochastique à temps discret et horizon infini. Bruno Scherrer. Optimization and Control [math.OC]. Université de Lorraine (Nancy), 2016. Habilitation thesis (HDR). tel-01400208v1
Performance Bounds for Lambda Policy Iteration and Application to the Game of Tetris. Bruno Scherrer. Journal of Machine Learning Research, 2013, 14, pp. 1175-1221. Journal article. hal-00759102v2
Improved and generalized upper bounds on the complexity of policy iteration. Bruno Scherrer. Mathematics of Operations Research, 2016, 41 (3), pp. 758-774. ⟨10.1287/moor.2015.0753⟩. Journal article. hal-00829532v4
Recherche locale de politique dans un espace convexe. Bruno Scherrer, Matthieu Geist. Revue des Sciences et Technologies de l'Information - Série RIA : Revue d'Intelligence Artificielle, 2015, 29 (6), pp. 685-704. ⟨10.3166/RIA.29.685-706⟩. Journal article. hal-01275247v1
On the Use of Non-Stationary Policies for Infinite-Horizon Discounted Markov Decision Processes. Bruno Scherrer. [Research Report] 2012. Report. hal-00682172v2
Improvements on Learning Tetris with Cross Entropy. Christophe Thiery, Bruno Scherrer. International Computer Games Association Journal, 2009, 32. Journal article. inria-00418930v1
Least-Squares λ Policy Iteration : optimisme et compromis biais-variance pour le contrôle optimal. Christophe Thiery, Bruno Scherrer. Journées Francophones de Planification, Décision et Apprentissage pour la conduite de systèmes, Jun 2010, Besançon, France. Conference paper. inria-00520843v1
Biasing Approximate Dynamic Programming with a Lower Discount Factor. Marek Petrik, Bruno Scherrer. Twenty-Second Annual Conference on Neural Information Processing Systems - NIPS 2008, Dec 2008, Vancouver, Canada. Conference paper. inria-00337652v1
How to Combine Tree-Search Methods in Reinforcement Learning. Yonathan Efroni, Gal Dalal, Bruno Scherrer, Shie Mannor. AAAI 19 - Thirty-Third AAAI Conference on Artificial Intelligence, Jan 2019, Honolulu, Hawaii, United States. Conference paper. hal-02273713v1
Classification-based Policy Iteration with a Critic. Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, Bruno Scherrer. International Conference on Machine Learning (ICML), Jun 2011, Seattle, United States. pp. 1049-1056. Conference paper. hal-00644935v1
Improved and Generalized Upper Bounds on the Complexity of Policy Iteration. Bruno Scherrer. Neural Information Processing Systems (NIPS) 2013, Dec 2013, South Lake Tahoe, United States. Conference paper. hal-00921261v1
A Dantzig Selector Approach to Temporal Difference Learning. Matthieu Geist, Bruno Scherrer, Alessandro Lazaric, Mohammad Ghavamzadeh. ICML-12, Jun 2012, Edinburgh, United Kingdom. pp. 1399-1406. Conference paper. hal-00749480v1
Auto-organisation modulaire d'une architecture intelligente. Bruno Scherrer. Valgo numéro 01-02, La revue en ligne de l'Association des Connexionnistes en THèse, Association des Connexionnistes en THèse, Oct 2001, Montélimar, France, 8 p. Conference paper. inria-00099399v1
Approximate Modified Policy Iteration. Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, Matthieu Geist. 29th International Conference on Machine Learning - ICML 2012, Jun 2012, Edinburgh, United Kingdom. Conference paper. hal-00758882v1
Momentum in Reinforcement Learning. Nino Vieillard, Bruno Scherrer, Olivier Pietquin, Matthieu Geist. AISTATS 2020 - 23rd International Conference on Artificial Intelligence and Statistics, Aug 2020, Palermo / Virtual, Italy. Conference paper. hal-03137343v1
Navigation, fonctions harmoniques et contrôle optimal stochastique. Amine Boumaza, Bruno Scherrer. Cinquièmes Journées Nationales sur Processus Décisionnel de Markov et Intelligence Artificielle - PDMIA 2005, Jun 2005, Lille, France. Conference paper. inria-00000644v1
Tight Performance Bounds for Approximate Modified Policy Iteration with Non-Stationary Policies. Boris Lesner, Bruno Scherrer. 2013. Preprint, working paper. hal-00815996v1
Convergence of Online and Approximate Multiple-Step Lookahead Policy Iteration. Yonathan Efroni, Gal Dalal, Bruno Scherrer, Shie Mannor. EWRL 2018 - 14th European workshop on Reinforcement Learning, Oct 2018, Lille, France. Conference paper. hal-01927977v1
