Document type :
Conference paper with proceedings
DOI :
Title :
An Efficient Task-Based Execution Model for Stochastic Linear Solver on Multi-core and Many-Core Systems
Author(s) :
Ye, Fan [Author]
Southern University of Science and Technology [SUSTech]
Calvin, Christophe [Author]
Département de Modélisation des Systèmes et Structures [DM2S]
Petiton, Serge [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Conference title :
2015 IEEE 18th International Conference on Computational Science and Engineering (CSE)
City :
Porto
Country :
Portugal
Start date of the conference :
2015-10-21
Publisher :
IEEE
HAL domain(s) :
Computer Science [cs]
English abstract : [en]
Monte Carlo methods are a wide range of computational algorithms that depend on repeated random sampling to obtain numerical results. They are of great interest in parallel computing because the samples are very often independent of one another, which exposes abundant parallelism. Such parallelism is well suited to modern processors with a large number of cores. In this study, we revisit the Monte Carlo technique for solving linear systems. The conventional implementation of this method, in spite of its abundant parallelism, still exhibits some fundamental bottlenecks that limit performance: (a) a relatively large amount of time spent in random number generation, (b) serialized selection of new states, (c) lack of vectorization, which leads to low SIMD efficiency on processors with wide vector units, and (d) variable results due to the stochastic nature of the algorithm. We propose an efficient task-based execution model for tackling these problems. It provides a new perspective for interpreting the theory, so that we can bypass routines that are unavoidable in the conventional implementation of the Monte Carlo method, such as random number generation. The new model also exploits the salient architectural features of modern multi-core systems, such as wide vector units and hardware support for irregular memory access. Our work builds on the latest research on task-based scheduling. It shows very promising performance on both multi-core and many-core systems. Compared with an optimized conventional parallel implementation, we achieved significant speedups (up to 3.68x) on test matrices.
Language :
English
Peer reviewed article :
Yes
Audience :
International
Popular science :
No
Collections :
Source :