Background/Objectives: Subject classification at the level of individual papers is essential for delivering scholarly information services. To date, however, topic classification has mostly been journal-based, and few services classify subjects at the article level. Methods/Statistical analysis: In this paper, we classify topics using unsupervised learning methods. The algorithms used are three well-known ones: the Hierarchical Dirichlet Process (HDP), Latent Dirichlet Allocation (LDA), and Latent Semantic Indexing (LSI). Findings: The results confirm that the classification algorithm should be chosen according to the characteristics of the data and the purpose of the classification. LSI suits more intuitive data sets; LDA, which classifies a wide variety of keywords, is advantageous for accommodating new terms; and HDP appears advantageous for a more detailed classification system. A limitation of this study is that algorithms such as LDA are sensitive to keywords and require careful keyword refinement. Improvements/Applications: Once reliability is improved on the basis of the major classification, the approach can serve as a paper-level subject classification, making it possible to provide the subject classification service needed by institutions and researchers in various fields.
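To make the article-level idea concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling on keyword lists, using only the Python standard library. It is not the authors' implementation: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the four toy keyword lists are all assumptions made for illustration. Each paper receives a topic distribution, and its dominant topic is taken as its subject label.

```python
import random
from collections import defaultdict


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Hypothetical helper for illustration. Returns one topic distribution
    (a list of floats summing to 1) per document.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [defaultdict(int) for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions.
    thetas = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        thetas.append([(doc_topic[d][t] + alpha) / denom for t in range(num_topics)])
    return thetas


# Toy keyword lists standing in for refined paper keywords (made up for this sketch).
papers = [
    ["topic", "model", "classification", "keyword"],
    ["protein", "cell", "gene", "cell"],
    ["keyword", "classification", "topic", "service"],
    ["gene", "protein", "expression", "cell"],
]
theta = lda_gibbs(papers, num_topics=2)
labels = [max(range(2), key=lambda t: row[t]) for row in theta]
```

Because the counts are built directly from the keywords, adding or removing a single keyword changes the sampled assignments, which is consistent with the abstract's remark that LDA is sensitive to keywords and needs careful keyword refinement. Unlike LDA, HDP would not require `num_topics` to be fixed in advance.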
Keyword
Science and technology information; thesis data; academic papers; subject classification; information service
Journal Title
Journal of Advanced Research in Dynamical and Control Systems