Searching parsimonious solutions with GA-PARSIMONY and XGBoost in high-dimensional databases
- Martinez-de-Pison, F.J. 1
- Fraile-Garcia, E. 1
- Ferreiro-Cabello, J. 1
- Gonzalez, R. 1
- Pernia, A. 1
1: Universidad de La Rioja
ISSN: 2194-5357
ISBN: 978-3-319-47363-5
Year of publication: 2017
Volume: 527
Pages: 201-210
Type: Book chapter
Abstract
eXtreme Gradient Boosting (XGBoost) has become one of the most successful techniques in machine learning competitions. It is computationally efficient and scalable, supports a wide variety of objective functions, and includes several mechanisms to avoid overfitting and improve accuracy. Because it has so many tuning parameters, soft computing (SC) is an alternative to classical hyper-parameter tuning methods for finding precise and robust models. In this context, we present a preliminary study in which an SC methodology, named GA-PARSIMONY, is used to find accurate and parsimonious XGBoost solutions. The methodology was designed to optimize the search for parsimonious models through feature selection, parameter tuning, and model selection. In this work, different experiments are conducted with four complexity metrics on six high-dimensional datasets. Although XGBoost performs well with high-dimensional databases, preliminary results indicated that GA-PARSIMONY with feature selection slightly improved the testing error. Therefore, choosing solutions with fewer inputs, among those with similar cross-validation errors, can help to obtain more robust models with better generalization capabilities. © Springer International Publishing AG 2017.
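
As a rough illustration of the idea described in the abstract (not the authors' implementation), the sketch below runs a minimal genetic search over an XGBoost feature mask and a few hyperparameters on a synthetic dataset and, among individuals with similar cross-validation error, prefers the one that uses fewer inputs. All names, ranges, and settings are illustrative assumptions.

# Minimal illustrative sketch (assumed names and settings, not the authors' code):
# a tiny genetic search over an XGBoost feature mask and a few hyperparameters
# that, among individuals with similar CV error, prefers the smaller feature subset.
import random
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor

random.seed(0)
np.random.seed(0)
X, y = make_regression(n_samples=300, n_features=50, n_informative=10, random_state=0)
N_FEAT = X.shape[1]

def random_individual():
    # Chromosome: a few XGBoost hyperparameters plus a binary feature mask.
    return {"max_depth": random.randint(2, 8),
            "learning_rate": 10 ** random.uniform(-2, -0.3),
            "n_estimators": random.randint(50, 300),
            "mask": np.random.rand(N_FEAT) < 0.5}

def evaluate(ind):
    # Returns (CV RMSE, number of selected inputs); the input count plays
    # the role of the complexity metric in this toy example.
    cols = np.flatnonzero(ind["mask"])
    if cols.size == 0:
        return float("inf"), N_FEAT
    model = XGBRegressor(max_depth=ind["max_depth"],
                         learning_rate=ind["learning_rate"],
                         n_estimators=ind["n_estimators"],
                         verbosity=0)
    rmse = -cross_val_score(model, X[:, cols], y, cv=3,
                            scoring="neg_root_mean_squared_error").mean()
    return rmse, cols.size

def parsimony_key(ind):
    # Parsimony principle: among near-equal errors (rounded), rank simpler models first.
    rmse, n_inputs = evaluate(ind)
    return (round(rmse, 1), n_inputs)

def breed(a, b):
    # Uniform crossover on hyperparameters and mask, plus a small mask mutation.
    child = {k: random.choice([a[k], b[k]])
             for k in ("max_depth", "learning_rate", "n_estimators")}
    child["mask"] = np.where(np.random.rand(N_FEAT) < 0.5, a["mask"], b["mask"])
    child["mask"] = np.logical_xor(child["mask"], np.random.rand(N_FEAT) < 0.05)
    return child

population = [random_individual() for _ in range(12)]
for generation in range(5):
    population.sort(key=parsimony_key)
    elite = population[:4]
    population = elite + [breed(*random.sample(elite, 2)) for _ in range(8)]

best = min(population, key=parsimony_key)
print("Best individual (CV RMSE, n_inputs):", evaluate(best))

The actual GA-PARSIMONY methodology is more elaborate (it reranks similar-error individuals with dedicated complexity metrics and combines feature selection, parameter tuning, and model selection in one search); this sketch only conveys the parsimony-first selection step.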
EXtreme Gradient Boosting (XGBoost) has become one of the most successful techniques in machine learning competitions. It is computationally efficient and scalable, it supports a wide variety of objective functions and it includes different mechanisms to avoid overfitting and improve accuracy. Having so many tuning parameters, soft computing (SC) is an alternative to search precise and robust models against classical hyper-tuning methods. In this context, we present a preliminary study in which a SC methodology, named GA-PARSIMONY, is used to find accurate and parsimonious XGBoost solutions. The methodology was designed to optimize the search of parsimonious models by feature selection, parameter tuning and model selection. In this work, different experiments are conducted with four complexity metrics in six high dimensional datasets. Although XGBoost performs well with high-dimensional databases, preliminary results indicated that GAPARSIMONY with feature selection slightly improved the testing error. Therefore, the choice of solutions with fewer inputs, between those with similar cross-validation errors, can help to obtain more robust solutions with better generalization capabilities. © Springer International Publishing AG 2017.