Ontology design patterns to disambiguate relations between genes and gene products in GENIA

Robert Hoehndorf, Axel-Cyrille Ngonga Ngomo, Sampo Pyysalo, Tomoko Ohta, Anika Oellrich, Dietrich Rebholz-Schuhmann

Annotated reference corpora such as the GENIA corpus play an important role in biomedical information extraction. Semantic annotation of the natural language texts in these reference corpora using formal ontologies and logic is challenging because natural language and its semantics are used ambiguously. Providing formal definitions and axioms for the relations used in these annotations would offer the means to develop consistent and verifiable annotation guidelines, allow annotations to be verified automatically, and enable the discovery of new information through deductive inference.

We developed a formal ontology of relations based on the relations used in the recent GENIA corpus annotations. For this purpose, we selected existing axiom systems based on the desired properties of the relations within the domain and provided new axioms for several relations. To apply this ontology of relations to the semantic annotation of natural language texts, we developed and implemented two ontology design patterns. We provide an implementation of the ontology of relations in the Web Ontology Language (OWL). By combining the implementation of the design patterns with that of the relation ontology, we also provide a software application that converts annotated GENIA abstracts into OWL ontologies. In this way, we make these ontologies amenable to automated verification, deductive inference and other knowledge-based applications.

Documentation, implementation and examples are available from http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/.

Contact: rh497@cam.ac.uk
