58 results
On the Complexity of A/B Testing
Conference on Learning Theory, Jun 2014, Barcelona, Spain. pp.461-481
Conference paper
hal-00990254v2
General parallel optimization without a metric
Algorithmic Learning Theory, 2019, Chicago, United States
Conference paper
hal-02047225v2
Optimistic PAC Reinforcement Learning: the Instance-Dependent View
Algorithmic Learning Theory (ALT), Feb 2023, Singapore (SG), Singapore
Conference paper
hal-04306228v1
Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits
Journal of Machine Learning Research, 2022
Journal article
hal-02006471v3
A kernel-based approach to non-stationary reinforcement learning in metric spaces
International Conference on Artificial Intelligence and Statistics, Apr 2021, San Diego / Virtual, United States
Conference paper
hal-03289026v1
Monte-Carlo Tree Search by Best Arm Identification
NIPS 2017 - 31st Annual Conference on Neural Information Processing Systems, Dec 2017, Long Beach, United States. pp.1-23
Conference paper
hal-01535907v2
Non-Asymptotic Sequential Tests for Overlapping Hypotheses and application to near optimal arm identification in bandit models
Sequential Analysis, 2021
Journal article
hal-02123833v2
What Doubling Tricks Can and Can't Do for Multi-Armed Bandits
2018
Preprint, working paper
hal-01736357v1
Aggregation of Multi-Armed Bandits Learning Algorithms for Opportunistic Spectrum Access
IEEE WCNC - IEEE Wireless Communications and Networking Conference, Apr 2018, Barcelona, Spain. ⟨10.1109/wcnc.2018.8377070⟩
Conference paper
hal-01705292v1
Analyse non asymptotique d'un test séquentiel de détection de rupture et application aux bandits non stationnaires
GRETSI 2019 - XXVIIème Colloque francophone de traitement du signal et des images, Aug 2019, Lille, France
Conference paper
hal-02152243v1
Sub-sampling for Efficient Non-Parametric Bandit Exploration
NeurIPS 2020, Dec 2020, Vancouver, Canada
Conference paper
hal-02977552v1
Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs
NeurIPS 2022 - 36th Conference on Neural Information Processing Systems, Nov 2022, New Orleans, United States
Conference paper
hal-03825101v1
Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs
EWRL 2022 - European Workshop on Reinforcement Learning, Sep 2022, Milan, Italy
Conference paper
hal-03767412v1
Kernel-based reinforcement Learning: A finite-time analysis
International Conference on Machine Learning, Jul 2021, Vienna / Virtual, Austria
Conference paper
hal-02541790v2
On Multi-Armed Bandit Designs for Dose-Finding Trials
Journal of Machine Learning Research, 2021
Journal article
hal-02533297v1
Thompson Sampling: an asymptotically optimal finite time analysis
International Conference on Algorithmic Learning Theory, Nov 2012, Lyon, France. pp.199-213
Conference paper
hal-02286442v1
Fixed-confidence guarantees for Bayesian best-arm identification
International Conference on Artificial Intelligence and Statistics, 2020, Palermo, Italy
Conference paper
hal-02330187v2
On the Complexity of Best Arm Identification in Multi-Armed Bandit Models
Journal of Machine Learning Research, 2016, 17, pp.1-42
Journal article
hal-01024894v2
A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players
AISTATS 2020 - 23rd International Conference on Artificial Intelligence and Statistics, Aug 2020, Palermo, Italy
Conference paper
hal-02006069v3
Thompson sampling for one-dimensional exponential family bandits
Advances in Neural Information Processing Systems, 2013, United States
Conference paper
hal-00923683v1
A Spectral Algorithm with Additive Clustering for the Recovery of Overlapping Communities in Networks
Theoretical Computer Science, 2018, 742, pp.3-26. ⟨10.1016/j.tcs.2017.12.028⟩
Journal article
hal-01163147v3
An ε-Best-Arm Identification Algorithm for Fixed-Confidence and Beyond
Advances in Neural Information Processing Systems (NeurIPS), Dec 2023, New Orleans, United States
Conference paper
hal-04306214v1
On Explore-Then-Commit Strategies
NIPS, Dec 2016, Barcelona, Spain
Conference paper
hal-01322906v2
Fast active learning for pure exploration in reinforcement learning
International Conference on Machine Learning, Jul 2021, Vienna, Austria
Conference paper
hal-02906985v3
Maximin Action Identification: A New Bandit Framework for Games
29th Annual Conference on Learning Theory (COLT), Jun 2016, New York, United States
Conference paper
hal-01273842v2
Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits
Machine Learning, 2019, 108 (11), pp.1919-1949. ⟨10.1007/s10994-019-05799-x⟩
Journal article
hal-01338733v3
On Bayesian index policies for sequential resource allocation
Annals of Statistics, 2018, 46 (2), pp.842-865. ⟨10.1214/17-AOS1569⟩
Journal article
hal-01251606v3
Near-Optimal Collaborative Learning in Bandits
NeurIPS 2022 - 36th Conference on Neural Information Processing Systems, Dec 2022, New Orleans, United States
Conference paper
hal-03825099v1
Dealing with Unknown Variances in Best-Arm Identification
Algorithmic Learning Theory (ALT), Feb 2023, Singapore (SG), Singapore
Conference paper
hal-04306221v1
Episodic reinforcement learning in finite MDPs: Minimax lower bounds revisited
Algorithmic Learning Theory, Mar 2021, Paris / Virtual, France
Conference paper
hal-03289004v1