Document type :
Conference paper with proceedings
Title :
Self-configuration of the Number of Concurrently Running MapReduce Jobs in a Hadoop Cluster
Author(s) :
Zhang, Bo [Author]
Self-adaptation for distributed services and large software systems [SPIRALS]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Křikava, Filip [Author]
Self-adaptation for distributed services and large software systems [SPIRALS]
Rouvoy, Romain [Author]
Self-adaptation for distributed services and large software systems [SPIRALS]
Seinturier, Lionel [Author]
Institut universitaire de France [IUF]
Self-adaptation for distributed services and large software systems [SPIRALS]
Conference title :
ICAC 2015
City :
Grenoble
Country :
France
Start date of the conference :
2015-07-07
English keyword(s) :
Hadoop Cluster
MapReduce
Performance Optimization
HAL domain(s) :
Computer Science [cs]
Computer Science [cs]/Information Retrieval [cs.IR]
English abstract : [en]
There is a trade-off between the number of concurrently running MapReduce jobs and their corresponding map and reduce tasks within a node in a Hadoop cluster. Leaving this trade-off statically configured to a single value can significantly degrade job response times and leave resource usage suboptimal. To overcome this problem, we propose an approach based on a feedback control loop that dynamically adjusts the Hadoop resource manager configuration according to the current state of the cluster. A preliminary assessment based on workloads synthesized from real-world traces shows that system performance can be improved by about 30% compared to the default Hadoop setup.
Language :
English
Peer reviewed article :
Yes
Audience :
International
Popular science :
No
Files
- https://hal.inria.fr/hal-01143157/document
- Open access
- icac15-paper.pdf
- Open access