Search Results
Showing 1 - 3 of 3
Publication
Design and analysis of classifier learning experiments in bioinformatics: survey and case studies
(IEEE Computer Soc, 2012-12) İrsoy, Ozan; Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem
In many bioinformatics applications, it is important to assess and compare the performances of algorithms trained from data, to be able to draw conclusions that are unaffected by chance and are therefore significant. Both the design of such experiments and the analysis of the resulting data using statistical tests should be done carefully for the results to carry significance. In this paper, we first review the performance measures used in classification, the basics of experiment design and statistical tests. We then give the results of our survey over 1,500 papers published in the last two years in three bioinformatics journals (including this one). Although the basics of experiment design are well understood, such as resampling instead of using a single training set and the use of different performance metrics instead of error, only 21 percent of the papers use any statistical test for comparison. In the third part, we analyze four different scenarios which we encounter frequently in the bioinformatics literature, discussing the proper statistical methodology as well as showing an example case study for each. With the supplementary software, we hope that the guidelines we discuss will play an important role in future studies.

Publication
Incremental construction of classifier and discriminant ensembles
(Elsevier Science Inc, 2009-04-15) Ulaş, Aydın; Semerci, Murat; Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem
We discuss approaches to incrementally construct an ensemble. The first constructs an ensemble of classifiers choosing a subset from a larger set, and the second constructs an ensemble of discriminants, where a classifier is used for some classes only. We investigate criteria including accuracy, significant improvement, diversity, correlation, and the role of search direction. For discriminant ensembles, we test subset selection and trees. Fusion is by voting or by a linear model. Using 14 classifiers on 38 data sets, incremental search finds small, accurate ensembles in polynomial time. The discriminant ensemble uses a subset of discriminants and is simpler, interpretable, and accurate. We see that an incremental ensemble has higher accuracy than bagging and the random subspace method, and it has comparable accuracy to AdaBoost, but with fewer classifiers.

Publication
Cost-conscious comparison of supervised learning algorithms over multiple data sets
(Elsevier Sci Ltd, 2012-04) Ulaş, Aydın; Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem
In the literature, there exist statistical tests to compare supervised learning algorithms on multiple data sets in terms of accuracy, but they do not always generate an ordering. We propose Multi2Test, a generalization of our previous work, for ordering multiple learning algorithms on multiple data sets from "best" to "worst", where our goodness measure is composed of a prior cost term in addition to generalization error. Our simulations show that Multi2Test generates orderings using pairwise tests on error and different types of cost using time and space complexity of the learning algorithms.












