한국과학기술정보연구원 Korea Institute of Science and Technology Information
Publication Year
2014-12
Description
funder: 미래창조과학부 (Ministry of Science, ICT and Future Planning); funder: KA; agency: 한국과학기술정보연구원 (Korea Institute of Science and Technology Information)
Abstract
Traditional information extraction (IE) maps text into a knowledge base (KB) by extracting arguments for known relations. State-of-the-art approaches to IE employ supervised machine learning classifiers and hence need large volumes of training data annotated by domain experts. This annotation process is expensive and time-consuming, and it relies heavily on the availability of those experts.
In this work, IITD builds an alternative paradigm for IE: a rule-based system whose rules operate over Open Information Extraction (Open IE), a domain-independent semantic processing of each sentence. This model lets domain experts rapidly define rules in a simple rule language, and so makes the best use of expert time. The rules are easy for humans to understand and can be read even by people who are not NLP experts. Just a few hours of rule engineering achieves good performance on the TAC KBP slot filling task. A few additional modifications are enough to meet the target F1 score of 0.2 on the task.
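The rule-based paradigm described above can be illustrated with a minimal sketch. It assumes that Open IE output is available as (argument 1, relation phrase, argument 2) tuples and that a rule is a regular expression over the relation phrase that maps a matching tuple to a KB slot. The `OpenIETuple` and `Rule` classes, the `extract` function, and the three example rules are hypothetical and are not the rule language used in the report; the slot names follow TAC KBP naming conventions for illustration only.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

Fact = Tuple[str, str, str]  # (entity, slot, filler)


@dataclass(frozen=True)
class OpenIETuple:
    """Hypothetical container for one Open IE extraction."""
    arg1: str
    rel: str
    arg2: str


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a regex over the relation phrase, mapped to a KB slot."""
    slot: str
    rel_pattern: str
    subject_from: str = "arg1"  # which argument holds the KB entity

    def apply(self, t: OpenIETuple) -> Optional[Fact]:
        # The rule fires only if the whole relation phrase matches the pattern.
        if re.fullmatch(self.rel_pattern, t.rel, flags=re.IGNORECASE) is None:
            return None
        if self.subject_from == "arg1":
            return (t.arg1, self.slot, t.arg2)
        return (t.arg2, self.slot, t.arg1)


# Illustrative rules only; a domain expert would write and refine these.
RULES = [
    Rule("org:founded_by", r"(was |is )?(founded|co-founded|established) by"),
    Rule("org:founded_by", r"(founded|co-founded|established)", subject_from="arg2"),
    Rule("per:city_of_birth", r"was born in"),
]


def extract(tuples: List[OpenIETuple], rules: List[Rule] = RULES) -> List[Fact]:
    """Apply every rule to every Open IE tuple and collect the KB facts."""
    facts = []
    for t in tuples:
        for r in rules:
            fact = r.apply(t)
            if fact is not None:
                facts.append(fact)
    return facts
```

Under these assumptions, adding coverage for a new relation means adding one more `Rule` entry rather than annotating training data, which is the trade-off the abstract emphasizes.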
In addition, we deliver a suite of tools associated with this task. The suite includes a pipeline for entity extraction, open information extraction, and relation and event extraction, as well as code for the TAC KBP temporal slot filling task.