GPU-accelerated generation of correctly rounded elementary functions

From MaRDI portal
Publication:3176315

DOI: 10.1145/2935746; zbMATH Open: 1391.65133; arXiv: 1211.3056; OpenAlex: W2963116348; Wikidata: Q113310168; Scholia: Q113310168; MaRDI QID: Q3176315; FDO: Q3176315


Authors: Pierre Fortin, Mourad Gouicem, Stef Graillat


Publication date: 20 July 2018

Published in: ACM Transactions on Mathematical Software

Abstract: The IEEE 754-2008 standard recommends the correct rounding of some elementary functions. This requires solving the Table Maker's Dilemma, which implies a huge amount of CPU computation time. In this paper we consider accelerating such computations, namely Lefèvre's algorithm, on Graphics Processing Units (GPUs), which are massively parallel architectures with partial SIMD (Single Instruction, Multiple Data) execution. We first propose an analysis of the Lefèvre hard-to-round argument search using the concept of continued fractions. We then propose a new parallel search algorithm that is much more efficient on GPUs thanks to its more regular control flow. We also present an efficient hybrid CPU-GPU deployment of the generation of the polynomial approximations required in Lefèvre's algorithm. In the end, we obtain overall speedups of up to 53.4x on one GPU over a sequential CPU execution, and up to 7.1x over a multi-core CPU, which enables much faster solving of the Table Maker's Dilemma for the double precision format.


Full work available at URL: https://arxiv.org/abs/1211.3056




