This item is licensed under a Creative Commons License (CC_BY).

dc.contributor.author
Min Song
dc.date.accessioned
2018-10-12T04:51:11Z
dc.date.available
2018-10-12T04:51:11Z
dc.date.issued
2014-03-30
dc.identifier.issn
2287-4577
dc.identifier.uri
https://repository.kisti.re.kr/handle/10580/8646
dc.description.abstract
This paper proposes a novel knowledge extraction system, TAKES (Two-step Approach for Knowledge Extraction System), which integrates advanced techniques from Information Retrieval (IR), Information Extraction (IE), and Natural Language Processing (NLP). TAKES adopts a novel query expansion technique based on keyphrase extraction to collect promising documents, and a Conditional Random Field (CRF)-based machine learning technique to extract important biological entities and relations. TAKES is applied to biological knowledge extraction, specifically to retrieving promising documents that contain Protein-Protein Interactions (PPIs) and to extracting PPI pairs. It consists of two major components: DocSpotter, which queries and retrieves promising documents for extraction, and FCRF, a CRF-based entity extraction component. The paper investigates research problems in knowledge extraction systems and reports a series of experiments that test the author's hypotheses. The findings are as follows. First, using three different test collections to measure the performance of the query expansion technique, the author verified that DocSpotter is robust and highly accurate when compared to Okapi BM25 and SLIPPER. Second, the author verified that the relation extraction algorithm, FCRF, is highly accurate in terms of F-measure when compared to four other competitive extraction algorithms: Support Vector Machine, Maximum Entropy, Single POS HMM, and Rapier.
dc.format
application/pdf
dc.language.iso
eng
dc.relation.ispartofseries
Journal of Information Science Theory and Practice
dc.title
TAKES: Two-step Approach for Knowledge Extraction in Biomedical Digital Libraries
dc.type
Article
dc.rights.license
CC_BY
dc.identifier.doi
10.1633/JISTaP.2014.2.1.1
dc.citation.endPage
21
dc.citation.number
1
dc.citation.startPage
6
dc.citation.volume
2
dc.identifier.bibliographicCitation
vol. 2, no. 1, pp. 6-21
dc.subject.keyword
Semantic Query Expansion
dc.subject.keyword
Information Extraction
dc.subject.keyword
Information Retrieval
dc.subject.keyword
Text Mining
dc.rights.holder
KISTI
Appears in Collections:
8. KISTI Publications > JISTaP > Vol. 2 - No. 1
Files in This Item:
E1JSCH_2014_v2n1_6.pdf (270.22 kB)
