KISTI Institutional Repository: A New Efficient Resource Management Framework for Iterative MapReduce Processing in Large-Scale Data Analysis

This item is licensed under the Korea Open Government License.

Title: A New Efficient Resource Management Framework for Iterative MapReduce Processing in Large-Scale Data Analysis

Abstract: Hadoop, one of the most popular MapReduce frameworks, has been actively studied as a means of analyzing large-scale data efficiently. Many large-scale data analysis applications, such as data clustering, must apply the same map and reduce functions repeatedly. However, Hadoop cannot deliver optimal performance for such iterative MapReduce jobs because it produces a result from a single pass of the map and reduce functions. To solve this problem, we propose a new efficient resource management framework for iterative MapReduce processing in large-scale data analysis. First, we design an iterative job state machine for managing iterative MapReduce jobs. Second, we propose an invariant data caching mechanism that reduces the I/O cost of data accesses. Third, we propose an iterative resource management technique for efficiently managing the resources of a Hadoop cluster. Fourth, we devise a stop-condition check mechanism that prevents unnecessary computation. Finally, we demonstrate the performance advantage of the proposed framework by comparing it with existing frameworks.
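To make the idea of iterative MapReduce concrete, the following is a minimal in-memory sketch, not the authors' implementation and not Hadoop API code. It runs a one-dimensional k-means clustering job as repeated map and reduce passes, reads the invariant input once and reuses it across iterations, and stops as soon as the centroids no longer move. All names (`map_phase`, `reduce_phase`, `run_iterative_job`, `load_points`, `tolerance`) are hypothetical and chosen only for illustration. The resource management and job state-machine components described in the abstract are not modeled here.

```python
from collections import defaultdict


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every input point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        yield nearest, p


def reduce_phase(pairs, centroids):
    """Reduce: average the points assigned to each centroid."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # A centroid with no assigned points keeps its previous position.
    return [
        sum(groups[i]) / len(groups[i]) if groups[i] else c
        for i, c in enumerate(centroids)
    ]


def run_iterative_job(load_points, centroids, max_iterations=50, tolerance=1e-6):
    """Repeat map and reduce until the centroids stop moving.

    load_points is a hypothetical loader standing in for an expensive read
    from distributed storage. It is called once, and the result is cached
    because the input points do not change between iterations.
    """
    cached_points = list(load_points())  # invariant data: read once, reuse
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        pairs = map_phase(cached_points, centroids)
        new_centroids = reduce_phase(pairs, centroids)
        # Stop-condition check: largest centroid movement in this iteration.
        shift = max(abs(a - b) for a, b in zip(new_centroids, centroids))
        centroids = new_centroids
        if shift <= tolerance:
            break  # converged, so further iterations would be wasted work
    return centroids, iteration


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
    final_centroids, iterations = run_iterative_job(lambda: data, [0.0, 5.0])
    print(final_centroids, iterations)  # prints [2.0, 11.0] 3
```

In this toy run the job converges in three iterations instead of running all fifty, and the input is loaded only once. A plain sequence of independent MapReduce jobs would instead reread the same input on every pass, which is the I/O cost that caching invariant data is meant to avoid.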

Keywords: large-scale data analysis; iterative data processing framework; MapReduce; Hadoop

URI: https://repository.kisti.re.kr/handle/10580/14718
http://www.ndsl.kr/ndsl/search/detail/article/articleSearchResultDetail.do?cn=NART78519451
