Document type:
Conference paper with proceedings
DOI:
Title:
An Efficient Task-Based Execution Model for Stochastic Linear Solver on Multi-core and Many-Core Systems
Author(s):
Ye, Fan [Author]
Southern University of Science and Technology [SUSTech]
Calvin, Christophe [Author]
Département de Modélisation des Systèmes et Structures [DM2S]
Petiton, Serge [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Conference title:
2015 IEEE 18th International Conference on Computational Science and Engineering (CSE)
City:
Porto
Country:
Portugal
Conference start date:
2015-10-21
Publisher:
IEEE
HAL discipline(s):
Computer Science [cs]
Abstract in English: [en]
Monte Carlo methods are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. They are of great interest in parallel computing because the samples are very often independent of one another, which exposes abundant parallelism. Such parallelism is well suited to modern processors with large numbers of cores. In this study, we revisit the Monte Carlo technique for solving linear systems. The conventional implementation of this method, in spite of its abundant parallelism, still exhibits some fundamental bottlenecks that limit performance: (a) a relatively large amount of time spent in random number generation, (b) serialized selection of new states, (c) a lack of vectorization, which leads to low SIMD efficiency on processors with wide vector units, and (d) variable results due to the stochastic nature of the algorithm. We propose an efficient task-based execution model for tackling these problems. It provides a new perspective for interpreting the theory, so that we can bypass routines that are inevitable in conventional implementations of the Monte Carlo method, such as random number generation. The new model also exploits the salient architectural features of modern multi-core systems, such as wide vector units and hardware support for irregular memory access. Our work builds on the latest research on task-based scheduling. It shows very promising performance on both multi-core and many-core systems. Compared with an optimized conventional parallel implementation, we achieved significant speedups (up to 3.68x) on test matrices.
Language:
English
Peer reviewed:
Yes
Audience:
International
Popular science:
No
Collections:
Source: