KISTI Institutional Repository: Deep learning model for unstructured knowledge classification using structural features

This item is licensed under the Korea Open Government License.

Title: Deep learning model for unstructured knowledge classification using structural features

Abstract: Automatic text classification is widely used as a basic method for analyzing data. Classification methods such as support vector machines (SVMs) long delivered the highest performance, but the recent adoption of deep learning has driven many advances in text classification. This study presents a deep learning-based classification model for national research and development (R&D) information, which is characterized by complex structural features, long texts, and a large number of classification classes. In addition to the word–sentence structure of a simple document, the model takes the higher-level structure of items into account, which increases the number of stacked layers in the deep model. In experiments on 180,000 datasets and 366 classification schemes, the model improved performance by 22.7% over a conventional SVM and by 15.7% over a conventional model that uses structured modeling of the word–sentence structure. The improvement comes from the multi-layered stacking method, which enhances learning by stacking 5 to 10 times the depth of the conventional model and by effectively combining the features of heterogeneous items. Although datasets with complex structures are of limited availability, the model proposed here for national R&D information is equally applicable to datasets with similar structures.

Keywords: Structured document; Text classification; Deep learning model; Deep model architecture
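
The word, sentence, and item levels described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the record does not give the architecture, so the learned encoders are replaced here by plain mean pooling, the embeddings are deterministic random stand-ins, the classifier weights are untrained, and every name (`embed_word`, `encode_item`, the `title` and `goal` items, `DIM`) is a hypothetical chosen for illustration. It only shows how lower-level vectors can be stacked into a document vector that combines heterogeneous items.

```python
import math
import random

DIM = 8  # hypothetical embedding size, chosen only for this sketch


def embed_word(word, dim=DIM):
    """Toy embedding: a RNG seeded by the word stands in for learned embeddings."""
    rng = random.Random(word)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def mean_pool(vectors, dim=DIM):
    """Average equal-length vectors; return a zero vector for an empty list."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def encode_sentence(sentence):
    """Level 1: words -> sentence vector."""
    return mean_pool([embed_word(w) for w in sentence.lower().split()])


def encode_item(text):
    """Level 2: sentences -> item vector (naive split on periods)."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return mean_pool([encode_sentence(s) for s in sentences])


def encode_document(items):
    """Level 3: concatenate item vectors so each item keeps its own features."""
    vec = []
    for name in sorted(items):  # fixed order keeps the layout stable
        vec.extend(encode_item(items[name]))
    return vec


def classify(doc_vec, weights):
    """Linear layer plus softmax; `weights` holds one row per class."""
    scores = [sum(w * x for w, x in zip(row, doc_vec)) for row in weights]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical two-item record; real R&D records would have many more items.
record = {
    "title": "Deep model for text classification.",
    "goal": "Classify research records. Use structural features.",
}
doc_vec = encode_document(record)

# Untrained toy weights for 3 classes (the study reports 366).
rng = random.Random(0)
weights = [[rng.uniform(-0.1, 0.1) for _ in doc_vec] for _ in range(3)]
probs = classify(doc_vec, weights)
```

Concatenating the item vectors, rather than averaging them, is one simple way to keep heterogeneous items distinguishable at the top level; a trained model would replace each pooling step with learned layers.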

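KISTI National Science and Technology Data Division, Digital Curation Center, Data Standardization Team
Korea Institute of Science and Technology Information, 245 Daehak-ro, Yuseong-gu, Daejeon 34141
Tel 042) 869-1004,1234 FAX 042) 869-1091

The KISTI Institutional Repository was built through the National Library of Korea's OAK distribution project.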