Abstract
The co-occurrence of terms in a text corpus may indicate the presence of a relation between the referents of these terms. We expect co-occurrence-based methods to identify association relations that cannot be found using static patterns. We developed a new method to identify associations between ontological categories in text using the co-occurrence of terms that designate these categories. We use the taxonomic structure of the ontologies to cumulate the number of co-occurrences of terms designating categories. Based on these cumulated values, we designed a novel family of statistical tests to identify associated categories. These tests take both co-occurrence specificity and relevance into consideration. We applied our method to a 2.2 GB text corpus containing full-text articles and used Gene Ontology's biological process ontology and the Celltype Ontology. The software and results can be found at http://bioonto.de/pmwiki.php/Main/ExtractingBiologicalRelations.
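As an illustration of the cumulation step only, the following minimal sketch sums co-occurrence counts over the taxonomic structure of two ontologies. It is not the authors' implementation: it assumes that the cumulated count for a pair of categories is the sum of the direct counts over all pairs of their subcategories (each category included), and the function names, category names, and counts are hypothetical toy values. The statistical tests themselves are not specified in the abstract and are not sketched here.

```python
def descendants_inclusive(children, node):
    """Return the node plus every category below it in an is-a hierarchy.

    `children` maps a category to its direct subcategories. The `seen` set
    keeps the traversal correct when a category has several parents.
    """
    seen = {node}
    stack = [node]
    while stack:
        current = stack.pop()
        for child in children.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def cumulated_count(direct_counts, children_a, children_b, cat_a, cat_b):
    """Sum direct co-occurrence counts over all subcategory pairs.

    Assumption (not taken from the paper): cumulation means adding the
    counts of every pair (a, b) where a is cat_a or below it and b is
    cat_b or below it.
    """
    below_a = descendants_inclusive(children_a, cat_a)
    below_b = descendants_inclusive(children_b, cat_b)
    return sum(direct_counts.get((a, b), 0) for a in below_a for b in below_b)


# Hypothetical toy hierarchies and counts, for illustration only.
process_children = {"cell differentiation": ["neuron differentiation"]}
celltype_children = {"cell": ["neuron", "hepatocyte"]}
direct = {
    ("neuron differentiation", "neuron"): 5,
    ("cell differentiation", "hepatocyte"): 2,
    ("cell differentiation", "cell"): 1,
}

total = cumulated_count(
    direct, process_children, celltype_children, "cell differentiation", "cell"
)
specific = cumulated_count(
    direct, process_children, celltype_children, "neuron differentiation", "neuron"
)
```

Under this assumption, a general category pair inherits the counts of its more specific pairs, which is why a test on cumulated values would need to weigh specificity against relevance, as the abstract notes.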
Original language | English |
---|---|
Title of host publication | Proceedings of the Third International Symposium on Semantic Mining in Biomedicine (SMBM 2008), Turku, Finland |
Editors | Tapio Salakoski, Dietrich Rebholz-Schuhmann, Sampo Pyysalo |
Publisher | Turun tietotekniikan tutkimus- ja koulutuskeskus |
Pages | 53-60 |
Number of pages | 8 |
Publication status | Published - 01 Sept 2008 |
Event | Third International Symposium on Semantic Mining in Biomedicine (SMBM 2008), Turku, Finland. Duration: 01 Sept 2008 → 03 Sept 2008 |
Conference
Conference | Third International Symposium on Semantic Mining in Biomedicine (SMBM 2008), Turku, Finland |
---|---|
Country/Territory | Finland |
City | Turku |
Period | 01 Sept 2008 → 03 Sept 2008 |