Abstract
A major challenge in epidemiology is data pre-processing, specifically feature selection, on large-scale data sets with a high-dimensional feature set. This paper tackles the challenge using a recently established distributed and scalable version of Rough Set Theory (RST). It considers epidemiological data collected from three international institutions for the purpose of cancer incidence prediction. The concrete data set aggregates about 5,495 risk factors (features), spanning 32 years and 38 countries. Detailed experiments demonstrate that RST is relevant to real-world big data applications: it can offer insights into the selected risk factors, speed up the learning process, preserve the performance of the cancer incidence prediction model without huge information loss, and simplify the learned model for epidemiologists. Code related to this paper is available at: https://github.com/zeinebchelly/Sp-RST.
Original language | English |
---|---|
Title | Machine Learning and Knowledge Discovery in Databases |
Subtitle | European Conference, ECML PKDD 2018, Proceedings |
Editors | Ulf Brefeld, Alice Marascu, Fabio Pinelli, Edward Curry, Brian MacNamee, Neil Hurley, Elizabeth Daly, Michele Berlingerio |
Publisher | Springer Nature |
Pages | 440-455 |
Number of pages | 16 |
ISBN (Electronic) | 978-3-030-10997-4 |
ISBN (Print) | 978-3-030-10996-7 |
Digital Object Identifiers (DOIs) | |
Status | Published - 18 Jan 2019 |
Event | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases - Croke Park Conference Centre, Dublin, Ireland. Duration: 10 Sep 2018 → 14 Sep 2018. http://www.ecmlpkdd2018.org |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11053 LNAI |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases |
---|---|
Abbreviated title | ECML-PKDD |
Country/Territory | Ireland |
City | Dublin |
Period | 10 Sep 2018 → 14 Sep 2018 |
Internet address |
Fingerprint
Dive into the research topics of 'Rough Set Theory as a Data Mining Technique: A Case Study in Epidemiology and Cancer Incidence Prediction'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Optimised Framework based on Rough Set Theory for Big Data Pre-processing in Certain and Imprecise Contexts - RoSTBiDFramework
Zarges, C. (Principal Investigator)
Horizon 2020 - European Commission
01 Mar 2017 → 28 Feb 2019
Project: Externally funded research