Model selection in omnivariate decision trees using Structural Risk Minimization

Date

2011-12-01

Publisher

Elsevier Science Inc

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

As opposed to trees that use a single type of decision node, an omnivariate decision tree contains nodes of different types. We propose to use Structural Risk Minimization (SRM) to choose between node types in omnivariate decision tree construction, so that the complexity of a node matches the complexity of the data reaching that node. To apply SRM for model selection, one needs the VC-dimension of the candidate models. In this paper, we first derive the VC-dimension of the univariate model, and estimate the VC-dimension of all three models (univariate, linear multivariate, and quadratic multivariate) experimentally. Second, we compare SRM with other model selection techniques, including Akaike's Information Criterion (AIC), Bayesian Information Criterion (BIC), and cross-validation (CV), on standard datasets from the UCI and Delve repositories. We see that SRM induces omnivariate trees that have a small percentage of multivariate nodes close to the root, and that these trees generalize better than, or at least as accurately as, those constructed using other model selection techniques.
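The node-level selection described in the abstract can be illustrated with a minimal sketch. It assumes the classical VC generalization bound (training error plus a confidence term that grows with the VC-dimension h and shrinks with the sample size N), which is not necessarily the exact form used in the paper. The function names, the training errors, and the VC-dimension values below are hypothetical and chosen only for illustration; they are not results from the paper.

```python
import math


def vc_generalization_bound(train_error, vc_dim, n, eta=0.05):
    """Classical VC upper bound on generalization error.

    Assumed form (holds with probability 1 - eta):
        E_gen <= E_train + sqrt((h * (ln(2N / h) + 1) - ln(eta / 4)) / N)
    """
    capacity = vc_dim * (math.log(2.0 * n / vc_dim) + 1.0)
    confidence = max(0.0, capacity - math.log(eta / 4.0)) / n
    return train_error + math.sqrt(confidence)


def select_node_type(candidates, n, eta=0.05):
    """Pick the candidate node type with the smallest bound.

    candidates maps a node-type name to (training error, VC-dimension),
    both measured or estimated on the n instances reaching the node.
    """
    bounds = {
        name: vc_generalization_bound(err, h, n, eta)
        for name, (err, h) in candidates.items()
    }
    best = min(bounds, key=bounds.get)
    return best, bounds


# Hypothetical numbers: more complex nodes fit the training data better
# but carry a larger VC-dimension.
candidates = {
    "univariate": (0.20, 3),
    "linear": (0.05, 11),
    "quadratic": (0.04, 66),
}

for n in (50, 1000):
    best, _ = select_node_type(candidates, n)
    print(f"N={n}: selected node type = {best}")
```

With these illustrative inputs, the simpler univariate node wins when few instances reach the node, while the linear node wins when many do, which is the kind of complexity matching the abstract describes.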

Description

The authors thank the three anonymous referees and the editor for their constructive comments, pointers to related literature, and pertinent questions, which allowed us to better situate our work, organize the manuscript, and improve the presentation. This work has been supported by the Turkish Scientific Technical Research Council TUBITAK EEEAG 107E127.

Keywords

Classification, Machine learning, Model selection, VC-dimension, Structural risk minimization, Decision tree

Source

Information Sciences

WoS Q Value

N/A

Scopus Q Value

Q1

Volume

181

Issue

23

Citation

Yıldız, O. T. (2011). Model selection in omnivariate decision trees using structural risk minimization. Information Sciences, 181(23), 5214-5226. doi:10.1016/j.ins.2011.07.028