We present the design, preparation, results and analysis of the Cancer Genetics (CG) and Pathway Curation (PC) main tasks of the BioNLP Shared Task 2013. These two information extraction tasks target the recognition of events, represented as structured n-ary associations of given physical entities. The CG task focuses on the cancer literature, emphasizing the extraction of information relating physical entities and physiological and pathological processes across multiple levels of biological organization, from the molecular to the whole organism. The PC task targets reactions relevant to the curation, evaluation and maintenance of detailed biomolecular pathway models, selecting documents by relevance to specific reactions and defining its extraction targets and their semantics with reference to pathway model standards and ontologies. Six participating groups submitted final results to the CG task and two groups to the PC task. The best-performing systems achieved F-scores of 55.4% on the CG task and 52.8% on the PC task, a level of performance broadly comparable with the state of the art for previously proposed event extraction tasks. These results demonstrate that both tasks are feasible for current event extraction methods and that event extraction generalizes well to higher levels of biological organization and is applicable to the analysis of scientific texts on cancer as well as to supporting pathway curation efforts. In this paper, we extend previously reported results for these two tasks through additional analyses of the task data and system predictions. The CG and PC tasks continue as open challenges for all interested parties, with data, tools and resources available from http://2013.bionlp-st.org.
Natural Language Processing; Bioinformatics; Pathway Curation