Abstract
The co-occurrence of terms in a text corpus may indicate the presence of a relation between the referents of these terms. We expect co-occurrence-based methods to identify association relations that cannot be found using static patterns. We developed a new method to identify associations between ontological categories in text using the co-occurrence of terms that designate these categories. We use the taxonomic structure of the ontologies to cumulate the number of co-occurrences of terms designating categories. Based on these cumulated values, we designed a novel family of statistical tests to identify associated categories. These tests take both co-occurrence specificity and relevance into consideration. We applied our method to a 2.2 GB text corpus containing full-text articles, using the Gene Ontology's biological process ontology and the Celltype Ontology. The software and results can be found at http://bioonto.de/pmwiki.php/Main/ExtractingBiologicalRelations.
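The cumulation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each co-occurrence of two terms is also counted for every pair of their taxonomic ancestors, and the function names, category names, and counts below are hypothetical toy values. The statistical tests themselves are not specified in the abstract and are not sketched here.

```python
from collections import defaultdict
from itertools import product


def ancestors_or_self(category, parents):
    """Return the category plus every ancestor reachable via is-a links."""
    seen = set()
    stack = [category]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def cumulate(direct_counts, parents_a, parents_b):
    """Propagate direct co-occurrence counts to all pairs of ancestors.

    Assumption: a co-occurrence of (a, b) also counts once for every pair
    (a', b') where a' is a or an ancestor of a, and likewise for b'.
    """
    cumulated = defaultdict(int)
    for (a, b), n in direct_counts.items():
        pairs = product(ancestors_or_self(a, parents_a),
                        ancestors_or_self(b, parents_b))
        for pair in pairs:
            cumulated[pair] += n
    return dict(cumulated)


# Toy is-a fragments (child -> list of parents), made up for illustration.
process_parents = {
    "T cell activation": ["lymphocyte activation"],
    "B cell activation": ["lymphocyte activation"],
}
cell_parents = {
    "T cell": ["lymphocyte"],
    "B cell": ["lymphocyte"],
}

# Made-up direct co-occurrence counts of terms designating the categories.
direct_counts = {
    ("T cell activation", "T cell"): 5,
    ("B cell activation", "B cell"): 3,
    ("lymphocyte activation", "lymphocyte"): 2,
}

totals = cumulate(direct_counts, process_parents, cell_parents)
print(totals[("lymphocyte activation", "lymphocyte")])
```

In this sketch the pair of general categories receives the counts of its more specific descendants in addition to its own, which is the kind of cumulated value a test of association could then be applied to.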
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the Third International Symposium on Semantic Mining in Biomedicine (SMBM 2008), Turku, Finland |
| Editors | Tapio Salakoski, Dietrich Rebholz-Schuhmann, Sampo Pyysalo |
| Publisher | Turun tietotekniikan tutkimus- ja koulutuskeskus |
| Pages | 53-60 |
| Number of pages | 8 |
| Publication status | Published - 01 Sept 2008 |
| Event | Third International Symposium on Semantic Mining in Biomedicine (SMBM 2008), Turku, Finland, 01 Sept 2008 → 03 Sept 2008 |
Conference
| Conference | Third International Symposium on Semantic Mining in Biomedicine (SMBM 2008), Turku, Finland |
|---|---|
| Country/Territory | Finland |
| City | Turku |
| Period | 01 Sept 2008 → 03 Sept 2008 |