Rough Set Theory as a Data Mining Technique: A Case Study in Epidemiology and Cancer Incidence Prediction

Zaineb Chelly Dagdia, Christine Zarges, Benjamin Schannes, Martin Micalef, Lino Galiana, Benoît Rolland, Olivier de Fresnoye, Mehdi Benchoufi

Research output: Chapter in Book/Report/Conference proceeding - Conference Proceeding (Non-Journal item)

5 Citations (Scopus)
306 Downloads (Pure)

Abstract

A big challenge in epidemiology is to perform data pre-processing, specifically feature selection, on large-scale data sets with a high-dimensional feature set. In this paper, this challenge is tackled by using a recently established distributed and scalable version of Rough Set Theory (RST). The paper considers epidemiological data that has been collected from three international institutions for the purpose of cancer incidence prediction. The concrete data set used aggregates about 5 495 risk factors (features), spanning 32 years and 38 countries. Detailed experiments demonstrate that RST is relevant to real-world big data applications as it can offer insights into the selected risk factors, speed up the learning process, ensure the performance of the cancer incidence prediction model without huge information loss, and simplify the learned model for epidemiologists. Code related to this paper is available at: https://github.com/zeinebchelly/Sp-RST.
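
To make the idea of rough-set feature selection concrete, the following is a minimal single-machine sketch of the classic dependency-degree criterion with a greedy (QuickReduct-style) search. It is not the distributed Sp-RST method from the paper or its repository; the function names, the toy risk factors, and the labels are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Share of rows in the positive region: blocks whose rows all share one label."""
    if not rows:
        return 0.0
    consistent = sum(
        len(block)
        for block in partition(rows, attrs)
        if len({labels[i] for i in block}) == 1
    )
    return consistent / len(rows)


def quick_reduct(rows, labels, all_attrs):
    """Greedily add the attribute that yields the highest dependency degree
    until the subset is as informative as the full attribute set."""
    target = dependency(rows, labels, all_attrs)
    selected = []
    best = dependency(rows, labels, selected)
    while best < target:
        candidates = [a for a in all_attrs if a not in selected]
        scores = {a: dependency(rows, labels, selected + [a]) for a in candidates}
        choice = max(candidates, key=scores.get)
        selected.append(choice)
        best = scores[choice]
    return selected


# Toy, made-up data: three hypothetical risk factors and an incidence label.
rows = [
    {"smoking": "high", "alcohol": "low", "region": "A"},
    {"smoking": "high", "alcohol": "high", "region": "B"},
    {"smoking": "low", "alcohol": "low", "region": "A"},
    {"smoking": "low", "alcohol": "high", "region": "B"},
]
labels = ["high", "high", "low", "low"]

reduct = quick_reduct(rows, labels, ["smoking", "alcohol", "region"])
print(reduct)
```

In this toy table a single attribute already separates the labels, so the greedy search stops after one step. On real data with thousands of features the cost of repeatedly partitioning the rows grows quickly, which is the kind of scalability problem a distributed version of RST is meant to address.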

Original language: English
Title: Machine Learning and Knowledge Discovery in Databases
Subtitle: European Conference, ECML PKDD 2018, Proceedings
Editors: Ulf Brefeld, Alice Marascu, Fabio Pinelli, Edward Curry, Brian MacNamee, Neil Hurley, Elizabeth Daly, Michele Berlingerio
Publisher: Springer Nature
Pages: 440-455
Number of pages: 16
ISBN (Electronic): 978-3-030-10997-4
ISBN (Print): 978-3-030-10996-7
Publication status: Published - 18 Jan 2019
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases - Croke Park Conference Centre, Dublin, Ireland
Duration: 10 Sep 2018 to 14 Sep 2018
http://www.ecmlpkdd2018.org

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11053 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
Abbreviated title: ECML-PKDD
Country/Territory: Ireland
City: Dublin
Period: 10 Sep 2018 to 14 Sep 2018
