3 results
Search Results
Showing 1 - 3 of 3
Publication
Calculating the VC-dimension of decision trees (IEEE, 2009)
Aslan, Özlem; Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem
We propose an exhaustive search algorithm that calculates the VC-dimension of univariate decision trees with binary features. The VC-dimension of a univariate decision tree with binary features depends on (i) the VC-dimension values of the left and right subtrees, (ii) the number of inputs, and (iii) the number of nodes in the tree. From a training set of example trees whose VC-dimensions are calculated by exhaustive search, we fit a general regressor to estimate the VC-dimension of any binary tree. These VC-dimension estimates are then used to obtain VC-generalization bounds for complexity control using SRM in decision trees, i.e., pruning. Our simulation results show that SRM-pruning using the estimated VC-dimensions finds trees that are as accurate as those pruned using cross-validation.

Publication
Regularizing soft decision trees (Springer, 2013)
Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem
We recently proposed a new decision tree family called soft decision trees, where a node chooses both its left and right children with different probabilities as given by a gating function, unlike a hard decision node, which chooses one of the two. In this paper, we extend the original algorithm by introducing local dimension reduction via L-1 and L-2 regularization for feature selection and smoother fitting. We compare our novel approach with the standard decision tree algorithms over 27 classification data sets. We see that both regularized versions have similar generalization ability with less complexity in terms of number of nodes, where L-2 seems to work slightly better than L-1.

Publication
Statistical tests using hinge/ε-sensitive loss (Springer-Verlag, 2013)
Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem
Statistical tests used in the literature to compare algorithms use the misclassification error, which is based on the 0/1 loss, for classification and the square loss for regression. Kernel-based support vector machine classifiers (regressors), however, are trained to minimize the hinge (ε-sensitive) loss, and hence they should be assessed and compared not in terms of the 0/1 (square) loss but with the loss measure they are trained to minimize. We discuss how the paired t test can use the hinge (ε-sensitive) loss and show in our experiments that, by doing so, we can detect differences that the test on error cannot detect, indicating higher power in distinguishing between the behavior of kernel-based classifiers (regressors). Such tests can be generalized to compare L > 2 algorithms.
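The idea in the last abstract, running a paired t test on hinge loss rather than on 0/1 error, can be illustrated with a minimal sketch. This is not the authors' code: the function names (`hinge_loss`, `zero_one_loss`, `paired_t_statistic`) are hypothetical, labels are assumed to be in {-1, +1}, and each list passed to the test is assumed to hold one loss value per cross-validation fold for each of two classifiers.

```python
import math
from statistics import mean, stdev


def hinge_loss(y_true, scores):
    """Mean hinge loss; labels in {-1, +1}, scores are real-valued outputs."""
    return mean(max(0.0, 1.0 - y * s) for y, s in zip(y_true, scores))


def zero_one_loss(y_true, scores):
    """Mean 0/1 loss; an instance is an error when the score has the wrong sign."""
    return mean(1.0 if y * s <= 0 else 0.0 for y, s in zip(y_true, scores))


def paired_t_statistic(losses_a, losses_b):
    """Paired t statistic over per-fold losses of two classifiers.

    Under the null hypothesis the statistic follows a t distribution
    with K - 1 degrees of freedom, where K is the number of folds.
    """
    diffs = [a - b for a, b in zip(losses_a, losses_b)]
    k = len(diffs)
    if k < 2:
        raise ValueError("need at least two folds")
    d_mean = mean(diffs)
    d_std = stdev(diffs)  # sample standard deviation (divides by K - 1)
    if d_std == 0.0:
        # Convention assumed for this sketch: identical losses give 0,
        # a constant nonzero difference gives a signed infinity.
        return 0.0 if d_mean == 0.0 else math.copysign(math.inf, d_mean)
    return d_mean / (d_std / math.sqrt(k))
```

Two classifiers that agree on the sign of every prediction have identical 0/1 error on each fold, so the test on error sees no difference, while their hinge losses can still differ through the margins. The resulting statistic would be compared against a t critical value with K - 1 degrees of freedom.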












