Discrimination of fish populations using parasites: Random Forests on a 'predictable' host-parasite system

A. Perez-Del-Olmo, F. E. Montero, M. Fernandez, J. Barrett, J. A. Raga, A. Kostadinova

Research output: Contribution to journal › Article › peer-review



We address the effect of spatial scale and temporal variation on model generality when forming predictive models for fish assignment by applying a new data mining approach, Random Forests (RF), to variable biological markers (parasite community data). Models were implemented for a fish host-parasite system sampled along the Mediterranean and Atlantic coasts of Spain and were validated using independent datasets. We considered 2 basic classification problems in evaluating the importance of variations in parasite infracommunities for assignment of individual fish to their populations of origin: a multiclass task (2–5 population models, using 2 seasonal replicates from each of the populations) and a 2-class task (using 4 seasonal replicates each from 1 Atlantic and 1 Mediterranean population). The main results are that (i) RF are well suited for multiclass population assignment using parasite communities in non-migratory fish; (ii) RF provide an efficient means for model cross-validation on the baseline data, which allows sample size limitations in parasite tag studies to be tackled effectively; (iii) the performance of RF depends on the complexity and spatial extent/configuration of the problem; and (iv) the development of predictive models is strongly influenced by seasonal change, which stresses the importance of both temporal replication and model validation in parasite tagging studies.
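The two validation ideas in the abstract, internal cross-validation on the baseline data and validation on an independent dataset, can be illustrated with a small sketch. It is not the authors' implementation or data: the population names, the mean abundances in `MEANS`, the sample sizes and every function name are hypothetical. The forest is a deliberately minimal pure-Python stand-in (bootstrap samples, a random subset of features tried at each split, majority vote), and the internal check is assumed here to be an out-of-bag estimate, in which each fish is scored only by trees that did not see it during training.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, rng, n_try, depth=0, max_depth=6):
    """Grow one tree; each split considers only n_try randomly chosen features."""
    if len(set(labels)) == 1 or depth >= max_depth:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):
        # Candidate thresholds exclude the maximum so both sides stay non-empty.
        for t in sorted(set(r[f] for r in rows))[:-1]:
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # every sampled feature was constant
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = best
    grow = lambda idx: build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                  rng, n_try, depth + 1, max_depth)
    return (f, t, grow(left), grow(right))

def predict_tree(node, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(node, tuple):
        f, t, low, high = node
        node = low if row[f] <= t else high
    return node

def fit_forest(rows, labels, n_trees=50, seed=0):
    """Fit the forest and return (trees, out-of-bag accuracy)."""
    rng = random.Random(seed)
    n = len(rows)
    n_try = max(1, int(len(rows[0]) ** 0.5))
    trees, oob_votes = [], [Counter() for _ in range(n)]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        tree = build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng, n_try)
        trees.append(tree)
        for i in set(range(n)) - set(idx):  # fish this tree never saw
            oob_votes[i][predict_tree(tree, rows[i])] += 1
    scored = [(v.most_common(1)[0][0], labels[i]) for i, v in enumerate(oob_votes) if v]
    return trees, sum(p == y for p, y in scored) / len(scored)

def predict_forest(trees, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in trees).most_common(1)[0][0]

def simulate_fish(means_by_population, n_per_population, rng):
    """Synthetic parasite counts per fish (hypothetical values, not real data)."""
    rows, labels = [], []
    for population, means in means_by_population.items():
        for _ in range(n_per_population):
            rows.append([max(0, round(rng.gauss(m, 1.5))) for m in means])
            labels.append(population)
    return rows, labels

# Hypothetical mean abundances of four parasite species in three populations.
MEANS = {"Atlantic_A": [8, 2, 5, 1],
         "Atlantic_B": [4, 6, 5, 1],
         "Mediterranean_A": [1, 3, 9, 4]}

rng = random.Random(42)
train_rows, train_labels = simulate_fish(MEANS, 40, rng)  # baseline sample
test_rows, test_labels = simulate_fish(MEANS, 20, rng)    # stand-in for an independent sample

trees, oob_acc = fit_forest(train_rows, train_labels)
test_acc = sum(predict_forest(trees, r) == y
               for r, y in zip(test_rows, test_labels)) / len(test_rows)
print(f"out-of-bag accuracy: {oob_acc:.2f}, independent-sample accuracy: {test_acc:.2f}")
```

In this toy setting both samples come from the same simulated distribution, so the two accuracies should be close. The abstract's point (iv) is that real seasonal replicates need not behave this way, which is why validation on a separately collected sample matters.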
Original language: English
Pages (from-to): 1833-1847
Number of pages: 15
Issue number: 12
Early online date: 06 Jul 2010
Publication status: Published - 01 Oct 2010


  • predictive models
  • random forests
  • fish population discrimination
  • parasites as tags
  • Boops boops
  • Mediterranean
  • north-east Atlantic

