David Defour
76
Documents
Publications
Using scheduling entropy amplification in CUDA/OpenMP code to exhibit non-reproducibility issues
15th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC-2022), Dec 2022, Penang, Malaysia
Conference paper
hal-03832904v1

Shadow computation with BFloat16 to estimate the numerical accuracy of summations
IEEE 28th Symposium on Computer Arithmetic (ARITH), Jun 2021, Virtual Conference, France
Conference paper
hal-03159965v2

Custom-Precision Mathematical Library Explorations for Code Profiling and Optimization
2020 IEEE 27th Symposium on Computer Arithmetic (ARITH), 2020, Los Alamitos, United States. pp.121-124, ⟨10.1109/ARITH48897.2020.00026⟩
Conference paper
hal-02563852v1

Automatic Exploration of Reduced Floating-Point Representations in Iterative Methods
25th International Conference Euro-Par 2019 Parallel Processing, Aug 2019, Göttingen, Germany. pp.481-494, ⟨10.1007/978-3-030-29400-7_34⟩
Conference paper
hal-02564972v1

VeriTracer: Context-enriched tracer for floating-point arithmetic analysis
2018 IEEE 25th Symposium on Computer Arithmetic (ARITH), Jun 2018, Amherst, United States. pp.61-68, ⟨10.1109/ARITH.2018.8464687⟩
Conference paper
hal-01989607v1

Towards a Reproducible Solution of Linear Systems
Supercomputing Conference 2017 - Computational Reproducibility at Exascale Workshop, Nov 2017, Denver, United States
Conference paper
hal-01633980v1

Asynchronous Power Flow on Graphic Processing Units
PDP: Parallel, Distributed and Network-Based Processing, Mar 2017, St Petersburg, Russia
Conference paper
lirmm-01475578v1

Reproducible and Accurate Algorithms for Numerical Linear Algebra
PP: Parallel Processing for Scientific Computing, Apr 2016, Paris, France
Conference paper
lirmm-01268048v1

Towards Fast, Accurate and Reproducible LU Factorization
SCAN 2016, 17th International Symposium on Scientific Computing, Computer Arithmetic and Validated Numerics, Sep 2016, Uppsala, Sweden. pp.59-60
Conference paper
hal-01539343v1

Hierarchical Approach for Deriving a Reproducible LU factorization on GPUs
The Numerical Reproducibility at Exascale (NRE16) workshop held as part of the Supercomputing Conference (SC16), Nov 2016, Salt Lake City, UT, United States
Conference paper
hal-01382645v1

Reproducible Triangular Solvers for High-Performance Computing
2015 12th International Conference on Information Technology - New Generations, Apr 2015, Las Vegas, NV, United States. pp.353-358, ⟨10.1109/ITNG.2015.63⟩
Conference paper
hal-01116588v2

An efficient midpoint-radius implementation to handle symmetric fuzzy intervals
RAIM: Rencontres Arithmétiques de l’Informatique Mathématique, Apr 2015, Rennes, France
Conference paper
hal-01140504v1

ExBLAS: Reproducible and Accurate BLAS Library
NRE: Numerical Reproducibility at Exascale, Nov 2015, Austin, TX, United States
Conference paper
hal-01202396v3

Réduction d'argument basée sur les triplets pythagoriciens pour l'évaluation de fonctions trigonométriques
ComPAS: Conférence en Parallélisme, Architecture et Système, Jun 2015, Lille, France
Conference paper
lirmm-01136772v1

Reproducible floating-point atomic addition in data-parallel environment
ACSIS, Sep 2015, Lodz, Poland. pp.721-728, ⟨10.15439/2015F86⟩
Conference paper
hal-01267755v1

Measuring predictability of Nvidia’s GPU warp and block schedulers: Application to the summation problem
MCSoC: Embedded Multicore/Many-core Systems-on-Chip, Sep 2015, Turin, Italy. pp.17-24, ⟨10.1109/MCSoC.2015.9⟩
Conference paper
hal-01267747v1

Reproducibility and Accuracy for High-Performance Computing
RAIM: Rencontres Arithmétiques de l’Informatique Mathématique, Apr 2015, Rennes, France
Conference paper
hal-01140531v1

Range Reduction Based on Pythagorean Triples for Trigonometric Function Evaluation
ASAP: Application-specific Systems, Architectures and Processors, Jul 2015, Toronto, Canada. pp.74-81, ⟨10.1109/ASAP.2015.7245712⟩
Conference paper
hal-01134232v2

Impact des schedulers sur la prédictibilité dans les GPU
ComPAS: Conférence en Parallélisme, Architecture et Système, Apr 2014, Neuchâtel, Switzerland
Conference paper
hal-00951916v1

FuzzyGPU: a fuzzy arithmetic library for GPU
PDP: Parallel, Distributed and Network-Based Processing, Feb 2014, Torino, Italy. pp.624-631, ⟨10.1109/PDP.2014.16⟩
Conference paper
lirmm-01206375v1

A Reproducible Accurate Summation Algorithm for High-Performance Computing
EX: Exascale Applied Mathematics Challenges and Opportunities, Jul 2014, Chicago, United States
Conference paper
hal-01267825v1

Reproducible and Accurate Matrix Multiplication for High-Performance Computing
SCAN: Scientific Computing, Computer Arithmetic and Validated Numerics, Sep 2014, Würzburg, Germany. pp.42-43
Conference paper
hal-01215627v1

Power Flow Analysis under Uncertainty using Symmetric Fuzzy Arithmetic
PES General Meeting 2014 | Conference & Exposition, Jul 2014, National Harbor, MD, United States. pp.1-5, ⟨10.1109/PESGM.2014.6939274⟩
Conference paper
lirmm-01206373v1

Reproducible and Accurate Matrix Multiplication
SCAN: Scientific Computing, Computer Arithmetic and Validated Numerics, Sep 2014, Würzburg, Germany. pp.126-137, ⟨10.1007/978-3-319-31769-4_11⟩
Conference paper
hal-01539180v1

A Pseudo-Random Bit Generator Based on Three Chaotic Logistic Maps and IEEE 754-2008 Floating-Point Arithmetic
Theory and Applications of Models of Computation, Apr 2014, Chennai, India. pp.229-247, ⟨10.1007/978-3-319-06089-7_16⟩
Conference paper
hal-00985357v1

GPUburn: A System to Test and Mitigate GPU Hardware Failures
Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS), Jul 2013, Samos, Greece. pp.263-270, ⟨10.1109/SAMOS.2013.6621133⟩
Conference paper
hal-00827588v1

Regularity versus Load-Balancing on GPU for treefix computations
ICCS: International Conference on Computational Science, Jun 2013, Barcelona, Spain. pp.309-318
Conference paper
hal-00768293v1

Températures, erreurs matérielles et GPU
ComPAS: Conférence en Parallélisme, Architecture et Système, Jan 2013, Grenoble, France. pp.1-11
Conference paper
hal-00785386v1

Implementing LNS using filtering units of GPUs
International Conference on Acoustics Speech and Signal Processing (ICASSP), Mar 2010, Dallas, TX, United States. pp.1542-1545, ⟨10.1109/ICASSP.2010.5495516⟩
Conference paper
hal-00423434v1

Using Graphics Processors for Parallelizing Hash-based Data Carving
42nd Hawaii International Conference on System Sciences, Jan 2009, Waikoloa, United States. 10 p.
Conference paper
hal-00350962v1

Étude comparée et simulation d'algorithmes de branchements pour le GPGPU
Toulouse'2009, Sep 2009, Toulouse, France. pp.10
Conference paper
hal-00397697v2

Power Consumption of GPUs from a Software Perspective
9th International Conference on Computational Science, May 2009, Baton Rouge, Louisiana, United States. pp.914-923, ⟨10.1007/978-3-642-01970-8_92⟩
Conference paper
hal-00348672v2

Dynamic detection of uniform and affine vectors in GPGPU computations
Euro-Par 2009, Aug 2009, Delft, Netherlands. pp.46-55, ⟨10.1007/978-3-642-14122-5_8⟩
Conference paper
hal-00396719v1

Fonctions élémentaires sur GPU exploitant la localité de valeurs
SYMPosium en Architectures nouvelles de machines, 2008, Fribourg, Switzerland. 12 p.
Conference paper
hal-00202906v1

A GPU interval library based on Boost.Interval
8th Conference on Real Numbers and Computers, Jul 2008, Santiago de Compostela, Spain. pp.61-71
Conference paper
hal-00263670v2

Graphic processors to speed-up simulations for the design of high performance solar receptors
IEEE 18th International Conference Application-specific Systems, Architectures and Processors, 2007, Montréal, Canada. pp.377-382
Conference paper
hal-00135126v3

Caractéristiques arithmétiques des processeurs graphiques
SympA: Symposium en Architecture de Machines, Oct 2006, Perpignan, France. pp.86-95
Conference paper
hal-00069622v1

Implementation of float-float operators on graphics hardware
Real Numbers and Computers 7, Jul 2006, Nancy, France. pp.23-32
Conference paper
hal-00021443v1

InterFLOP, Interoperable Tools for Computing, Debugging, Validation and Optimization of Floating-Point Programs
ISC-HPC 2021 DIGITAL, Jun 2021, Online, France
Conference poster
hal-03245586v1

Error-free Tables for Trigonometric Function Evaluation
ARCHI: Architecture des systèmes matériels et logiciels embarqués, et méthodes de conception associées, Jun 2015, Lille, France. 8th edition of the Archi thematic school, 2015
Conference poster
lirmm-01273490v1

ExBLAS: Reproducible and Accurate BLAS Library
RAIM: Rencontres Arithmétiques de l’Informatique Mathématique, Apr 2015, Rennes, France. 7ème Rencontre Arithmétique de l'Informatique Mathématique, 2015
Conference poster
hal-01140280v1

Simulation temps réel de réseaux électriques à l’aide des architectures multicœurs
UPVD Magazine Hors-Série recherche, 3, pp.42-44, 2014
Book chapter
hal-01267852v1

Optimiser la représentation des flottants
Book chapter
hal-01267953v1

Interval Arithmetic in CUDA
Wen-mei W. Hwu. GPU Computing Gems Jade Edition, Morgan Kaufmann, pp.99-107, 2011, ISBN 978-0-12-385963-1
Book chapter
hal-00813423v1

FP-ANR: A representation format to handle floating-point cancellation at run-time
2017
Preprint, working paper
lirmm-01549601v3

Reproducible and Accurate Matrix Multiplication for GPU Accelerators
2015
Preprint, working paper
hal-01102877v1

Numerical Reproducibility for the Parallel Reduction on Multi- and Many-Core Architectures
2015
Preprint, working paper
hal-00949355v4

Linear circuit analysis based on parallel asynchronous fixed-point method
2015
Preprint, working paper
hal-01142496v1

Barra: a Parallel Functional Simulator for GPGPU
2009
Preprint, working paper
hal-00359342v4

Fonctions élémentaires : algorithmes et implémentations efficaces pour l'arrondi correct en double précision
Modeling and simulation. École normale supérieure de Lyon - ENS Lyon, 2003. In French.
Thesis
tel-00006022v1

Contribution au calcul sur GPU: considérations arithmétiques et architecturales
Hardware Architecture [cs.AR]. Université de Perpignan, 2014
Habilitation thesis (HDR)
tel-01206379v1