Research on knowledge management is growing in importance with the recent attention to Big Data. One of the most fundamental steps in knowledge management is terminology extraction. Terms are often expressed in various forms, and these variations become an obstacle that causes knowledge systems to extract unnecessary terms. To solve this problem, we propose a term normalization method that finds the normalized form (the original, standard form defined in dictionaries) of a variant term. The method employs two characteristics of terms: appearance similarity, which measures how similar the terms themselves are, and context similarity, which measures how many clue words they share. Through experiments, we show that both similarities have a positive influence on term normalization.
Keywords
Term normalization; Knowledge acquisition; Text mining; Appearance similarity; Context similarity
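As an illustration of how the two similarities named in the abstract might be combined, the following is a minimal sketch and not the method evaluated in this paper. It assumes that appearance similarity can be approximated by a character-level matching ratio, that context similarity can be approximated by the Jaccard overlap of clue-word sets, and that the two scores are mixed with a fixed weight. The function names, the `weight` parameter, and the toy dictionary are all hypothetical.

```python
from difflib import SequenceMatcher


def appearance_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (assumed measure, not the paper's)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def context_similarity(ctx_a: set, ctx_b: set) -> float:
    """Jaccard overlap of two clue-word sets (assumed measure, not the paper's)."""
    union = ctx_a | ctx_b
    if not union:
        return 0.0
    return len(ctx_a & ctx_b) / len(union)


def normalize(variant: str, variant_ctx: set, dictionary: dict, weight: float = 0.5) -> str:
    """Return the dictionary term with the highest combined score.

    `dictionary` maps each standard term to its set of clue words.
    `weight` is a hypothetical mixing factor between the two similarities.
    """
    def score(term: str) -> float:
        return (weight * appearance_similarity(variant, term)
                + (1 - weight) * context_similarity(variant_ctx, dictionary[term]))

    return max(dictionary, key=score)


# Toy dictionary: standard terms with illustrative clue words.
toy_dictionary = {
    "database": {"query", "table", "sql"},
    "data bus": {"cpu", "signal", "hardware"},
}
best = normalize("data-base", {"sql", "query", "index"}, toy_dictionary)
```

In this toy case the variant "data-base" resembles both candidates in form, and the shared clue words favor "database" further, which is the kind of complementary effect the abstract attributes to the two similarities.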