Abstract
Inductive Logic Programming (ILP) systems construct models for data using domain-specific background
information. When using these systems, it is typically assumed that sufficient human expertise
is at hand to rule out irrelevant background information. Such irrelevant information can, and
typically does, hinder an ILP system’s search for good models. Here, we provide evidence that if
expertise is available that can provide a partial-ordering on sets of background predicates in terms
of relevance to the analysis task, then this can be used to good effect by an ILP system. In particular,
using data from biochemical domains, we investigate an incremental strategy of including sets
of predicates in decreasing order of relevance. Results obtained suggest that: (a) the incremental
approach identifies, in substantially less time, a model that is comparable in predictive accuracy to
that obtained with all background information in place; and (b) the incremental approach using the
relevance ordering performs better than one that does not (that is, one that adds sets of predicates
randomly). For a practitioner concerned with the use of ILP, the implications of these findings are twofold:
(1) when not all background information can be used at once (either due to limitations of the
ILP system, or the nature of the domain) expert assessment of the relevance of background predicates
can assist substantially in the construction of good models; and (2) good “first-cut” results
can be obtained quickly by a simple exclusion of information known to be less relevant.
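
The incremental strategy described above can be illustrated with a minimal sketch. It assumes a hypothetical `learn` callable that stands in for an ILP system and returns a model together with a score; the function name `incremental_search`, the `min_gain` parameter, and the rule of stopping once the score no longer improves are all assumptions made for illustration, not details taken from the paper.

```python
from typing import Callable, Sequence, Set, Tuple

# Hypothetical learner interface: given the background predicates currently
# allowed, return (model, score). This is NOT the API of any real ILP system.
Learner = Callable[[Set[str]], Tuple[object, float]]


def incremental_search(
    ordered_sets: Sequence[Set[str]],
    learn: Learner,
    min_gain: float = 0.0,
) -> Tuple[object, float, Set[str]]:
    """Add predicate sets in decreasing order of relevance.

    `ordered_sets` is assumed to be sorted most-relevant-first by an expert.
    The search stops as soon as adding the next set fails to improve the
    score by more than `min_gain` (an assumed stopping rule).
    """
    background: Set[str] = set()
    best_model: object = None
    best_score = float("-inf")
    best_background: Set[str] = set()

    for predicate_set in ordered_sets:
        # Grow the background knowledge by the next most relevant set.
        background = background | predicate_set
        model, score = learn(background)
        if score > best_score + min_gain:
            best_model, best_score = model, score
            best_background = set(background)
        else:
            # Less relevant predicates did not help; keep the earlier model.
            break

    return best_model, best_score, best_background
```

Passing the same sets in a shuffled order gives the random-ordering baseline that the abstract compares against.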
Original language | English |
---|---|
Pages (from-to) | 369-383 |
Number of pages | 15 |
Journal | Journal of Machine Learning Research |
Volume | 4 |
Publication status | Published - Jul 2003 |
Keywords
- ILP
- relevance of background predicates
- expert-assistance