Dynamic Hadoop Slot Allocation for MapReduce Clusters in Cloud
Abstract— MapReduce is a popular computing paradigm for large-scale data processing in cloud computing. However, slot-based MapReduce systems can suffer from poor performance due to unoptimized resource allocation. This paper identifies the problems in slot allocation and optimizes it in three key aspects. First, because map and reduce slots are pre-configured as distinct and are not fungible, slots can be severely under-utilized: map slots might be fully utilized while reduce slots are empty, and vice versa. An alternative technique called Dynamic Hadoop Slot Allocation (DHSA) is proposed, which allocates slots to map or reduce tasks according to their needs while keeping the slot-based model. Second, speculative execution can tackle the straggler problem and has been shown to improve the performance of a single job, but at the expense of cluster efficiency. Speculative Execution Performance Balancing is proposed to balance the performance tradeoff between a single job and a batch of jobs. Third, delay scheduling has been shown to improve data locality, but this improvement comes at a high cost. Alternatively, PreScheduling is proposed to improve data locality without impacting fairness. Finally, these techniques are combined into a step-by-step slot allocation system called DynamicMR, which can substantially improve the performance of MapReduce workloads and is further applied to data distribution using a Two-Phase Top-Down Specialization scheduling approach on real Big Data in the cloud.
Keywords: Hadoop Fair Scheduler (HFS), Slot PreScheduling, Delay Scheduler, Dynamic Hadoop Slot Allocation (DHSA), Two Phase Top Down Specialization (TDS).
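The under-utilization problem and the slot-borrowing idea described in the abstract can be illustrated with a minimal sketch. It is not the DHSA algorithm itself: the function names, the `SlotPlan` type, and the single-node view are assumptions made for illustration, and the sketch ignores fairness pools, data locality, and the dependency of reduce tasks on map output.

```python
from dataclasses import dataclass


@dataclass
class SlotPlan:
    """Outcome of one allocation round (hypothetical helper type)."""
    running_maps: int
    running_reduces: int
    idle_slots: int


def static_allocation(map_slots, reduce_slots, pending_maps, pending_reduces):
    """Static model: each task type may only use slots of its own type."""
    running_maps = min(map_slots, pending_maps)
    running_reduces = min(reduce_slots, pending_reduces)
    idle = (map_slots - running_maps) + (reduce_slots - running_reduces)
    return SlotPlan(running_maps, running_reduces, idle)


def dynamic_allocation(map_slots, reduce_slots, pending_maps, pending_reduces):
    """Dynamic model (assumed simplification): slots stay typed, but an idle
    slot of one type may be borrowed by a pending task of the other type."""
    # Step 1: every task type first fills the slots of its own type.
    running_maps = min(map_slots, pending_maps)
    running_reduces = min(reduce_slots, pending_reduces)
    idle_map = map_slots - running_maps
    idle_reduce = reduce_slots - running_reduces

    # Step 2: leftover tasks borrow idle slots of the other type.
    # At most one of these two values can be non-zero in a given round.
    maps_on_reduce_slots = min(idle_reduce, pending_maps - running_maps)
    reduces_on_map_slots = min(idle_map, pending_reduces - running_reduces)

    running_maps += maps_on_reduce_slots
    running_reduces += reduces_on_map_slots
    idle = (map_slots + reduce_slots) - running_maps - running_reduces
    return SlotPlan(running_maps, running_reduces, idle)


if __name__ == "__main__":
    # Map-heavy phase: 10 pending map tasks, no reduce tasks yet.
    print(static_allocation(4, 4, 10, 0))
    print(dynamic_allocation(4, 4, 10, 0))
```

In the map-heavy case above, the static model leaves all four reduce slots idle, whereas the borrowing model lets map tasks occupy them. The configured slot counts never change, which is the sense in which the slot-based model is kept.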
International Journal for Trends in Technology & Engineering © 2015 IJTET JOURNAL