
Korea Open Government License (KOGL): This item is licensed under the Korea Open Government License.

dc.contributor.author
김정림
dc.contributor.author
박상현
dc.contributor.author
서동민
dc.contributor.author
유석종
dc.date.accessioned
2019-08-28T07:42:15Z
dc.date.available
2019-08-28T07:42:15Z
dc.date.issued
2018-10-10
dc.identifier.issn
1932-6203
dc.identifier.uri
https://repository.kisti.re.kr/handle/10580/14747
dc.identifier.uri
http://www.ndsl.kr/ndsl/search/detail/article/articleSearchResultDetail.do?cn=NART90641578
dc.description.abstract
As networks grow in size, analyzing large-scale network data is becoming increasingly
important. Network clustering algorithms are useful for such analysis. Most active research
on conventional network clustering algorithms targets a single-machine environment rather
than a parallel one. However, these algorithms cannot analyze large-scale network data
because of memory limitations. As a solution, we propose a network clustering algorithm
for large-scale network data analysis using Apache Spark, changing the paradigm of the
conventional clustering algorithm to improve its efficiency in the Apache Spark environment.
We also apply optimization approaches such as a Bloom filter and shuffle selection to reduce
memory usage and execution time. By evaluating the proposed algorithm using the average
normalized cut, we confirmed that it can analyze diverse large-scale network datasets such
as biological, co-authorship, internet topology, and social networks. Experimental results
show that the proposed algorithm produces more accurate clusters than the comparison
algorithms while using less memory. Furthermore, we confirm the effectiveness of the
proposed optimization approaches and the scalability of the proposed algorithm. In addition,
we validate that the clusters found by the proposed algorithm can represent biologically
meaningful functions.
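The abstract names the average normalized cut as its evaluation measure but does not define it. As a rough illustration only, the sketch below uses one common textbook definition: for each cluster, the number of edges leaving the cluster divided by the sum of the degrees of its nodes, averaged over all clusters (lower is better). The function name `average_normalized_cut`, the edge-list input, and the `cluster_of` mapping are assumptions made for this example; the paper's exact formulation may differ.

```python
from collections import defaultdict


def average_normalized_cut(edges, cluster_of):
    """Mean of cut(C) / vol(C) over clusters, for an undirected, unweighted graph.

    Assumed definitions (not taken from the paper):
      cut(C) = number of edges with exactly one endpoint in C
      vol(C) = sum of the degrees of the nodes in C
    """
    cut = defaultdict(int)
    vol = defaultdict(int)
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        # Each edge adds one to the degree of both endpoints.
        vol[cu] += 1
        vol[cv] += 1
        if cu != cv:
            # The edge crosses a cluster boundary on both sides.
            cut[cu] += 1
            cut[cv] += 1
    # Clusters with no incident edges are skipped to avoid division by zero.
    scores = [cut[c] / vol[c] for c in set(cluster_of.values()) if vol[c] > 0]
    return sum(scores) / len(scores) if scores else 0.0


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
cluster_of = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
score = average_normalized_cut(edges, cluster_of)
```

In this toy graph each cluster has one outgoing edge and a total degree of 7, so both clusters score 1/7 and so does the average.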
dc.language
eng
dc.relation.ispartofseries
PLOS ONE
dc.title
CASS: A distributed network clustering algorithm based on structure similarity for large-scale network
dc.citation.endPage
22
dc.citation.number
10
dc.citation.startPage
1
dc.citation.volume
13
dc.subject.keyword
network analysis
dc.subject.keyword
algorithm
dc.subject.keyword
clustering
Appears in Collections:
7. KISTI Research Outcomes > Journal Articles
Files in This Item:
There are no files associated with this item.
