66 results

Second-Order Kernel Online Convex Optimization with Adaptive Sketching. International Conference on Machine Learning, 2017, Sydney, Australia
Conference paper
hal-01537799v1

Regret Minimization in MDPs with Options without Prior Knowledge. NIPS 2017 - Neural Information Processing Systems, Dec 2017, Long Beach, United States. pp.1-36
Conference paper
hal-01649082v1

Bayesian Multi-Task Reinforcement Learning. ICML - 27th International Conference on Machine Learning, Jun 2010, Haifa, Israel. pp.599-606
Conference paper
inria-00475214v1

Risk-Aversion in Multi-armed Bandits. NIPS - Twenty-Sixth Annual Conference on Neural Information Processing Systems, Dec 2012, Lake Tahoe, United States
Conference paper
hal-00772609v1

Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning. ICML 2018 - The 35th International Conference on Machine Learning, Jul 2018, Stockholm, Sweden. pp.1578-1586
Conference paper
hal-01941206v1

Distributed adaptive sampling for kernel matrix approximation. International Conference on Artificial Intelligence and Statistics, 2017, Fort Lauderdale, United States
Conference paper
hal-01482760v1

Classification-based Policy Iteration with a Critic. 2011
Report
hal-00590972v1

Least-squares methods for policy iteration. Reinforcement Learning: State of the Art, Springer, pp.75-109, 2011
Book chapter
hal-00830122v1

Sparse Multi-task Reinforcement Learning. NIPS - Advances in Neural Information Processing Systems 26, Dec 2014, Montreal, Canada
Conference paper
hal-01073513v1

Learning with stochastic inputs and adversarial outputs. Journal of Computer and System Sciences, 2012, 78 (5), pp.1516-1537. ⟨10.1016/j.jcss.2011.12.027⟩
Journal article
hal-00772046v1

Maximum Entropy Semi-Supervised Inverse Reinforcement Learning. International Joint Conference on Artificial Intelligence, Jul 2015, Buenos Aires, Argentina
Conference paper
hal-01146187v1

Truthful Learning Mechanisms for Multi–Slot Sponsored Search Auctions with Externalities. Artificial Intelligence, 2015, 227, pp.93-139. ⟨10.1016/j.artint.2015.05.012⟩
Journal article
hal-01237670v1

Rotting bandits are not harder than stochastic ones. International Conference on Artificial Intelligence and Statistics, 2019, Naha, Japan
Conference paper
hal-01936894v2

Exploration–Exploitation in MDPs with Options. AISTATS 2017 - 20th International Conference on Artificial Intelligence and Statistics, Apr 2017, Fort Lauderdale, United States
Conference paper
hal-01493567v2

Improved Learning Complexity in Combinatorial Pure Exploration Bandits. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), May 2016, Cadiz, Spain
Conference paper
hal-01322198v1

A Truthful Learning Mechanism for Contextual Multi-Slot Sponsored Search Auctions with Externalities. EC - 13th ACM Conference on Electronic Commerce, Jun 2012, Valencia, Spain
Conference paper
hal-00772624v1

Word-order biases in deep-agent emergent communication. ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Jul 2019, Florence, Italy
Conference paper
hal-02274157v1

Risk-Aversion in Multi-armed Bandits. [Research Report] 2012
Report
hal-00750298v1

Pack only the essentials: Adaptive dictionary learning for kernel ridge regression. Adaptive and Scalable Nonparametric Methods in Machine Learning at Neural Information Processing Systems, 2016, Barcelona, Spain
Conference paper
hal-01482756v1

Active Learning for Accurate Estimation of Linear Models. ICML 2017 - 34th International Conference on Machine Learning, Aug 2017, Sydney, Australia. pp.36
Conference paper
hal-01538762v1

Thompson Sampling for Linear-Quadratic Control Problems. AISTATS 2017 - 20th International Conference on Artificial Intelligence and Statistics, Apr 2017, Fort Lauderdale, United States
Conference paper
hal-01493564v1

Finite-Sample Analysis of Least-Squares Policy Iteration. [Technical Report] 2010
Report
inria-00528596v1

Analysis of a Classification-based Policy Iteration Algorithm. ICML - 27th International Conference on Machine Learning, Jun 2010, Haifa, Israel. pp.607-614
Conference paper
inria-00482065v3

Transfer in Reinforcement Learning: a Framework and a Survey. Marco Wiering, Martijn van Otterlo. Reinforcement Learning - State of the art, 12, Springer, pp.143-173, 2012, ⟨10.1007/978-3-642-27645-3_5⟩
Book chapter
hal-00772626v1

Regret Bounds for Reinforcement Learning with Policy Advice. ECML/PKDD - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Sep 2013, Prague, Czech Republic
Conference paper
hal-00924021v1

Un sélecteur de Dantzig pour l'apprentissage par différences temporelles. Journées Francophones sur la planification, la décision et l'apprentissage pour le contrôle des systèmes - JFPDA 2012, May 2012, Villers-lès-Nancy, France. 13 p
Conference paper
hal-00736229v1

Linear Thompson Sampling Revisited. AISTATS 2017 - 20th International Conference on Artificial Intelligence and Statistics, Apr 2017, Fort Lauderdale, United States
Conference paper
hal-01493561v1

Parallel Higher Order Alternating Least Square for Tensor Recommender System. AAAI 2017 - Thirty-First AAAI Conference on Artificial Intelligence, Feb 2017, San Francisco, United States
Conference paper
hal-01628298v1

Efficient second-order online kernel learning with adaptive embedding. Neural Information Processing Systems, 2017, Long Beach, United States
Conference paper
hal-01643961v1

Finite-sample analysis of Lasso-TD. International Conference on Machine Learning, 2011, United States
Conference paper
hal-00830149v1