Search Results
Showing 1 - 10 of 65
Publication: Calculating the VC-dimension of decision trees (IEEE, 2009)
Aslan, Özlem; Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem
We propose an exhaustive search algorithm that calculates the VC-dimension of univariate decision trees with binary features. The VC-dimension of the univariate decision tree with binary features depends on (i) the VC-dimension values of the left and right subtrees, (ii) the number of inputs, and (iii) the number of nodes in the tree. From a training set of example trees whose VC-dimensions are calculated by exhaustive search, we fit a general regressor to estimate the VC-dimension of any binary tree. These VC-dimension estimates are then used to get VC-generalization bounds for complexity control using SRM in decision trees, i.e., pruning. Our simulation results show that SRM-pruning using the estimated VC-dimensions finds trees that are as accurate as those pruned using cross-validation.

Publication: Univariate margin tree (Springer, 2010)
Yıldız, Olcay Taner
In many pattern recognition applications, decision trees are used first due to their simplicity and easily interpretable nature. In this paper, we propose a new decision tree learning algorithm called univariate margin tree, where for each continuous attribute, the best split is found using convex optimization. Our simulation results on 47 datasets show that the novel margin tree classifier performs at least as well as C4.5 and LDT with a similar time complexity. For two-class datasets it generates smaller trees than C4.5 and LDT without sacrificing accuracy, and generates significantly more accurate trees than C4.5 and LDT for multiclass datasets with one-vs-rest methodology.

Publication: Incremental construction of rule ensembles using classifiers produced by different class orderings (IEEE, 2016)
Yıldız, Olcay Taner; Ulaş, Aydın
In this paper, we discuss a novel approach to incrementally construct a rule ensemble. The approach constructs an ensemble from a dynamically generated set of rule classifiers. Each classifier in this set is trained by using a different class ordering. We investigate criteria including accuracy, ensemble size, and the role of starting point in the search. Fusion is done by averaging. Using 22 data sets, floating search finds small, accurate ensembles in polynomial time.

Publication: TRopBank: Turkish PropBank V2.0 (European Language Resources Association (ELRA), 2020-05-16)
Kara, Neslihan; Aslan, Deniz Baran; Marşan, Büşra; Bakay, Özge; Ak, Koray; Yıldız, Olcay Taner
In this paper, we present and explain TRopBank “Turkish PropBank v2.0”. PropBank is a hand-annotated corpus of propositions which is used to obtain the predicate-argument information of a language. Predicate-argument information of a language can help understand semantic roles of arguments. “Turkish PropBank v2.0”, unlike PropBank v1.0, has a much more extensive list of Turkish verbs, with 17,673 verbs in total.

Publication: Soft decision trees (IEEE, 2012)
İrsoy, Ozan; Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem
We discuss a novel decision tree architecture with soft decisions at the internal nodes where we choose both children with probabilities given by a sigmoid gating function. Our algorithm is incremental: new nodes are added when needed, and parameters are learned using gradient descent. We visualize the soft tree fit on a toy data set and then compare it with the canonical, hard decision tree over ten regression and classification data sets. Our proposed model has significantly higher accuracy using fewer nodes.

Publication: Construction of a Turkish proposition bank (Tubitak Scientific & Technical Research Council Turkey, 2018)
Ak, Koray; Toprak, Cansu; Esgel, Volkan; Yıldız, Olcay Taner
This paper describes our approach to developing the Turkish PropBank by adopting the semantic role-labeling guidelines of the original PropBank and using the translation of the English Penn-TreeBank as a resource. We discuss the semantic annotation process of the PropBank and language-specific cases for Turkish, the tools we have developed for annotation, and quality control for multiuser annotation. In the current phase of the project, more than 9500 sentences are semantically analyzed and predicate-argument information is extracted for 1330 verbs and 1914 verb senses. Our plan is to annotate 17,000 sentences by the end of 2017.

Publication: Integrating Turkish Wordnet KeNet to Princeton WordNet: The case of one-to-many correspondences (Institute of Electrical and Electronics Engineers Inc., 2019-10)
Bakay, Özge; Ergelen, Özlem; Yıldız, Olcay Taner
In this paper, we introduce a novel approach of forming interlingual relations between multilingual wordnets. We have mapped Turkish senses in KeNet with their corresponding senses in Princeton WordNet by drawing one-to-many correspondences. As a result of language-specific properties, one synset in one language is matched with multiple synsets in the other language in some cases. Our method of integrating KeNet into a multilingual network also included mapping the most frequent 5000 senses in English with their equivalent senses in Turkish. What we demonstrate is that one-to-many interlingual correspondences are necessary to include in mappings both from Turkish-to-English and English-to-Turkish. Furthermore, one-to-many mappings give us insights into the semantic relations to be constructed in Turkish, such as hypernymy.

Publication: VC-dimension of rule sets (IEEE Computer Soc, 2014-12-04)
Yıldız, Olcay Taner
In this paper, we give and prove lower bounds on the VC-dimension of the rule set hypothesis class where the input features are binary or continuous. The VC-dimension of the rule set depends on the VC-dimension values of its rules and the number of inputs.

Publication: Budding trees (IEEE Computer Soc, 2014-08-24)
İrsoy, Ozan; Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem
We propose a new decision tree model, named the budding tree, where a node can be both a leaf and an internal decision node. Each bud node starts as a leaf node, can then grow children, but then later on, if necessary, its children can be pruned. This contrasts with traditional tree construction algorithms that only grow the tree during the training phase, and prune it in a separate pruning phase. We use a soft tree architecture and show that the tree and its parameters can be trained using gradient descent. Our experimental results on regression, binary classification, and multi-class classification data sets indicate that our newly proposed model has better performance than traditional trees in terms of accuracy while inducing trees of comparable size.

Publication: English-Turkish parallel semantic annotation of Penn-Treebank (Oficyna Wydawnicza Politechniki Wroclawskiej, 2020)
Arıcan, Bilge Nas; Bakay, Özge; Avar, Begüm; Yıldız, Olcay Taner; Ergelen, Özlem
This paper reports our efforts in constructing a sense-labeled English-Turkish parallel corpus using the traditional method of manual tagging. We tagged a pre-built parallel treebank which was translated from the Penn Treebank corpus. This approach allowed us to generate a resource combining syntactic and semantic information. We provide statistics about the corpus itself as well as information regarding its development process.
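
Illustrative note on the "Soft decision trees" entry in this listing: the abstract describes internal nodes that route an input to both children, weighted by a sigmoid gate. The sketch below shows only that forward pass, under stated assumptions. The names `SoftNode` and `soft_predict`, the linear form of the gate, and the constant leaf values are hypothetical choices made for illustration; they are not taken from the paper, and the incremental growth and gradient-descent training the abstract mentions are not shown.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


def sigmoid(z: float) -> float:
    """Logistic function used as the gate at internal nodes."""
    return 1.0 / (1.0 + math.exp(-z))


@dataclass
class SoftNode:
    # Hypothetical node layout (an assumption, not the paper's data structure).
    # A leaf uses only `value`; an internal node uses weights, bias, left, right.
    value: float = 0.0
    weights: Optional[Sequence[float]] = None
    bias: float = 0.0
    left: Optional["SoftNode"] = None
    right: Optional["SoftNode"] = None

    def is_leaf(self) -> bool:
        return self.left is None or self.right is None


def soft_predict(node: SoftNode, x: Sequence[float]) -> float:
    """Blend both subtrees' outputs, weighted by the sigmoid gate."""
    if node.is_leaf():
        return node.value
    # Assumed gate: sigmoid of a linear function of the input.
    gate = sigmoid(sum(w * xi for w, xi in zip(node.weights, x)) + node.bias)
    return (gate * soft_predict(node.left, x)
            + (1.0 - gate) * soft_predict(node.right, x))


# Usage: a one-split tree whose gate depends on the single input feature.
tree = SoftNode(weights=[2.0], bias=0.0,
                left=SoftNode(value=1.0), right=SoftNode(value=3.0))
prediction = soft_predict(tree, [0.0])  # gate is 0.5, so both leaves count equally
```

With a gate of exactly 0.5 the output is the average of the two leaf values. As the gate saturates toward 0 or 1, the output approaches the single leaf a hard decision tree would select.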












