06 December 2011, 10:00 - 11:00 Room/Building: 475/PCRI-N
Abstract:
As the scientific software community looks toward platforms in the 50-100 Pflop/s neighborhood, the characteristics of the design space it confronts (forced reductions in processor clock speeds, new power constraints, anemic improvement in communication latencies, exponential escalation in the number of computing elements, and a revolutionary increase in component heterogeneity) raise a host of difficult and unsolved problems. To create software capable of extracting a significant percentage of theoretical peak performance on systems at extreme scale, the numerical library community will need groundbreaking innovations in their algorithms, in their ability to control massive parallelism in multiple dimensions of the software architecture, and in the simultaneous and thorough application of autotuning techniques at several levels of that same architecture.
In this presentation I will present DAGuE, a generic framework for architecture-aware scheduling and management of micro-tasks on distributed many-core heterogeneous architectures. Targeted applications are expressed as a Directed Acyclic Graph (DAG) of tasks with labeled edges designating data dependencies. DAGs are represented in a compact, problem-size-independent format that can be queried on demand to discover data dependencies, in a fully distributed fashion. DAGuE assigns computation threads to the cores, overlaps communications with computations, and uses a dynamic, fully distributed scheduler based on cache awareness, data locality, and task priority. I will demonstrate the efficiency of this approach using both micro-benchmarks, to analyze the performance of the individual components, and Linear Algebra factorizations as a use case.
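DAGuE's actual input language (its job-description format) is not reproduced here; as a rough illustration of the idea of a compact, problem-size-independent task graph that is queried on demand, the hypothetical C sketch below encodes the dependencies of a 2-D wavefront of tasks algebraically in the task indices. All names (task_id, successors, in_degree) are invented for this sketch and are not part of the DAGuE API; the sequential "scheduler" stands in for the distributed, locality- and priority-aware runtime described above.

/*
 * Hypothetical sketch (not the DAGuE API): a 2-D wavefront of tasks T(i,j),
 * where T(i,j) feeds T(i+1,j) and T(i,j+1).  The DAG is never materialized;
 * successors and in-degrees are computed on demand from the task indices,
 * which is what makes the representation independent of the problem size.
 */
#include <stdio.h>

#define N 4  /* problem size: an N x N grid of tasks */

typedef struct { int i, j; } task_id;

/* Query the successors of a task algebraically instead of storing edges. */
static int successors(task_id t, task_id out[2])
{
    int n = 0;
    if (t.i + 1 < N) out[n++] = (task_id){ t.i + 1, t.j };  /* data flows down  */
    if (t.j + 1 < N) out[n++] = (task_id){ t.i, t.j + 1 };  /* data flows right */
    return n;
}

/* Number of inputs a task waits for (its in-degree), also computed on demand. */
static int in_degree(task_id t)
{
    return (t.i > 0) + (t.j > 0);
}

int main(void)
{
    /* Toy sequential scheduler: release a task once all its inputs complete. */
    /* A real runtime would do this per core, with priorities, cache/locality */
    /* heuristics, and communication overlapped with computation.             */
    int remaining[N][N];
    task_id ready[N * N];
    int nready = 0;

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            remaining[i][j] = in_degree((task_id){ i, j });

    ready[nready++] = (task_id){ 0, 0 };  /* the only task with no inputs */

    while (nready > 0) {
        task_id t = ready[--nready];
        printf("execute T(%d,%d)\n", t.i, t.j);

        task_id succ[2];
        int ns = successors(t, succ);
        for (int s = 0; s < ns; s++)
            if (--remaining[succ[s].i][succ[s].j] == 0)
                ready[nready++] = succ[s];
    }
    return 0;
}

Because the edges are derived from the indices rather than enumerated, the memory needed to describe the graph does not grow with the number of tasks, which is the property the abstract refers to as a compact, problem-size-independent representation.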