KISTI Institutional Repository: Improving I/O efficiency in Hadoop-based Massive Data Analysis programs

This item is licensed under the Korea Open Government License.

dc.contributor.author: 이경하

dc.contributor.author: 서영균

dc.contributor.author: 강우람

dc.date.accessioned: 2019-08-28T07:42:14Z

dc.date.available: 2019-08-28T07:42:14Z

dc.date.issued: 2018-12-02

dc.identifier.issn: 1058-9244

dc.identifier.uri: https://repository.kisti.re.kr/handle/10580/14739

dc.description.abstract: Apache Hadoop has been a popular parallel processing tool in the era of big data.
While practitioners have rewritten many conventional analysis algorithms to run on Hadoop,
the I/O inefficiency of Hadoop-based programs has been repeatedly reported in the literature.
In this article, we address the problem of I/O inefficiency in Hadoop-based massive data analysis
by introducing our efficient modification of Hadoop.
We first incorporate a columnar data layout into the conventional Hadoop framework without any modification
of the Hadoop internals. We also add an indexing capability to Hadoop that saves many I/Os
when processing not only selection predicates but also star-join queries, which are frequently used in
many analysis tasks.

dc.language: eng

dc.relation.ispartofseries: Scientific Programming

dc.title: Improving I/O efficiency in Hadoop-based Massive Data Analysis programs

dc.subject.keyword: Parallel processing

dc.subject.keyword: MapReduce

dc.subject.keyword: Data layout

dc.subject.keyword: Bitmap index

Appears in Collections: 7. KISTI Research Outcomes > Journal Articles

Files in This Item: There are no files associated with this item.
