Search Results
Showing 1 - 10 of 35
Publication Calculating the VC-dimension of decision trees (IEEE, 2009) Aslan, Özlem; Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem
We propose an exhaustive search algorithm that calculates the VC-dimension of univariate decision trees with binary features. The VC-dimension of a univariate decision tree with binary features depends on (i) the VC-dimension values of the left and right subtrees, (ii) the number of inputs, and (iii) the number of nodes in the tree. From a training set of example trees whose VC-dimensions are calculated by exhaustive search, we fit a general regressor to estimate the VC-dimension of any binary tree. These VC-dimension estimates are then used to obtain VC-generalization bounds for complexity control using structural risk minimization (SRM) in decision trees, i.e., pruning. Our simulation results show that SRM-pruning using the estimated VC-dimensions finds trees that are as accurate as those pruned using cross-validation.

Publication Identification of metabolic correlates of mild cognitive impairment in Parkinson's disease using magnetic resonance spectroscopic imaging and machine learning (Springer Science and Business Media Deutschland GmbH, 2022-12) Cengiz, Sevim; Arslan, Dilek Betül; Kıçik, Ani; Erdoğdu, Emel; Yıldırım, Muhammed; Hatay, Gökçe Hale; Tüfekçioğlu, Zeynep; Uluğ, Aziz Müfit; Bilgiç, Başar; Hanagasi, Haşmet; Demiralp, Tamer; Gürvit, Hakan; Öztürk Işıkk, Esin
Objective: To investigate metabolic changes of mild cognitive impairment in Parkinson's disease (PD-MCI) using proton magnetic resonance spectroscopic imaging (1H-MRSI). Methods: Sixteen healthy controls (HC), 26 cognitively normal Parkinson's disease (PD-CN) patients, and 34 PD-MCI patients were scanned in this prospective study. Neuropsychological tests were performed, and three-dimensional 1H-MRSI was obtained at 3 T. Metabolic parameters and neuropsychological test scores were compared between PD-MCI, PD-CN, and HC.
The correlations between neuropsychological test scores and metabolic intensities were also assessed. Supervised machine learning algorithms were applied to classify HC, PD-CN, and PD-MCI groups based on metabolite levels. Results: Compared with PD-CN, PD-MCI patients had a lower corrected total N-acetylaspartate over total creatine ratio (tNAA/tCr) in the right precentral gyrus, corresponding to the sensorimotor network (p = 0.01), and a lower tNAA over myoinositol ratio (tNAA/mI) in a part of the default mode network corresponding to the retrosplenial cortex (p = 0.04). The HC and PD-MCI patients were classified with an accuracy of 86.4% (sensitivity = 72.7% and specificity = 81.8%) using bagged trees. Conclusion: 1H-MRSI revealed metabolic changes in the default mode, ventral attention/salience, and sensorimotor networks of PD-MCI patients, which could be summarized mainly as 'posterior cortical metabolic changes' related to cognitive dysfunction.

Publication İlişkisel veri tabanlarında mükerrer kayıtların makine öğrenmesiyle tespiti [Detection of duplicate records in relational databases using machine learning] (Institute of Electrical and Electronics Engineers Inc., 2018-07-05) Bayrak, Ahmet Tuğrul; Yılmaz, Aykut İnan; Yılmaz, Kemal Burak; Düzağaç, Remzi; Yıldız, Olcay Taner
As the amount of data grows, duplicate records in relational databases grow as well. These accumulating records can cause inconsistencies in the reports or analyses that use them. To minimize this problem, this study aims to find duplicate records with machine learning algorithms, using as features the similarities between records and weights determined by domain expertise. As a result, 28412 duplicate pairs were detected in 9301467 rows of data.
Removing the detected duplicate records from the data source makes the data more consistent.

Publication Machine learning (Institution of Engineering and Technology, 2020-01-01) Yıldız, Olcay Taner
[No abstract available]

Publication Effective semi-supervised learning strategies for automatic sentence segmentation (Elsevier Science BV, 2018-04-01) Dalva, Doğan; Güz, Ümit; Gürkan, Hakan
The primary objective of the sentence segmentation process is to determine the sentence boundaries in a stream of words output by automatic speech recognizers. Statistical methods developed for sentence segmentation require a significant amount of labeled data, which is time-consuming, labor-intensive, and expensive to produce. In this work, we propose new multi-view semi-supervised learning strategies for the sentence boundary classification problem using lexical, prosodic, and morphological information. The aim is to find effective semi-supervised machine learning strategies when only small sets of sentence boundary labeled data are available. We primarily investigate two semi-supervised learning approaches, called self-training and co-training. Different example selection strategies were also used for co-training, namely agreement, disagreement, and self-combined. Furthermore, we propose three-view and committee-based algorithms incorporating the agreement, disagreement, and self-combined strategies using three disjoint feature sets. We present comparative results of different learning strategies on the sentence segmentation task. The experimental results show that sentence segmentation performance can be substantially improved using the multi-view learning strategies that we propose, since the data sets can be represented by three redundantly sufficient and disjoint feature sets.
We show that the proposed strategies substantially improve the average baseline F-measure from 67.66% to 75.15% and from 64.84% to 66.32% when only a small set of manually labeled data is available for spoken Turkish and English, respectively.

Publication A robust Gradient boosting model based on SMOTE and NEAR MISS methods for intrusion detection in imbalanced data sets (Işık Üniversitesi, 2022-01-18) Arık, Ahmet Okan; Çavdaroğlu Akkoç, Gülsüm Çiğdem; Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Enformasyon Teknolojileri Yüksek Lisans Programı
Novel technologies cause many security vulnerabilities and zero-day attack risks. Intrusion Detection Systems (IDS) are developed to protect computer networks from threats and attacks. Many challenging problems remain to be solved in existing methods. The class imbalance problem is one of the most difficult problems of IDS, and it reduces the detection rate of classifiers. The highest IDS detection rate in the literature is 96.54%. This thesis proposes a new model called ROGONG-IDS (Robust Gradient Boosting) based on Gradient Boosting. The ROGONG-IDS model uses the Synthetic Minority Over-Sampling Technique (SMOTE) and Near Miss methods to handle class imbalance. Three different gradient boosting-based classification algorithms (GBM, LightGBM, XGBoost) were compared. The performance of the proposed model on multiclass classification has been verified on the UNSW-NB15 dataset. It reached the highest attack detection rate and F1 score in the literature, with a 97.30% detection rate and a 97.65% F1 score. ROGONG-IDS provides a robust, efficient solution for IDS built on datasets with an imbalanced class distribution.
It outperforms state-of-the-art and traditional intrusion detection methods.

Publication VC-dimension of univariate decision trees (IEEE-INST Electrical Electronics Engineers Inc, 2015-02-25) Yıldız, Olcay Taner
In this paper, we give and prove lower bounds on the Vapnik-Chervonenkis (VC)-dimension of the univariate decision tree hypothesis class. The VC-dimension of a univariate decision tree depends on the VC-dimension values of its subtrees and the number of inputs. Via a search algorithm that calculates the VC-dimension of univariate decision trees exhaustively, we show that our VC-dimension bounds are tight for simple trees. To verify that the VC-dimension bounds are useful, we also use them to obtain VC-generalization bounds for complexity control using structural risk minimization in decision trees, i.e., pruning. Our simulation results show that structural risk minimization pruning using the VC-dimension bounds finds trees that are at least as accurate as those pruned using cross validation.

Publication A cooperative neural network control structure and its application for systems having dead-zone nonlinearities (Springer International Publishing Ag, 2022-03) Dinçmen, Erkin
An adaptive control structure utilizing two feed-forward neural networks (NN) is proposed to deal with systems having unknown nonlinearities. One of the networks, the Model NN, is trained to mimic the nonlinear system dynamics. Its training is repeated periodically to keep it a valid, up-to-date model of the system at all times, since the parameters and/or nonlinearities of the system may change over time. The other network, the Controller NN, adapts itself continuously by collaborating with the Model NN. The stability-convergence analysis of both networks is performed via the Lyapunov method. An example system is chosen to show the applicability of the control algorithm.
This example system is created by combining a linear dynamics model with a dead-zone function to represent a nonlinear system to be controlled. It should be noted that the proposed control structure can be used in any nonlinear system without knowing the system dynamics. The only information required by the Model NN is a training set consisting of input-output data pairs of the system. The Model NN is trained offline with this training set, and afterward the Controller NN adapts its weights online continuously during the control task with the help of the Model NN. The performances of PD and PID controllers are also given for comparison purposes.

Publication Extension of conventional co-training learning strategies to three-view and committee-based learning strategies for effective automatic sentence segmentation (IEEE, 2018) Dalva, Doğan; Güz, Ümit; Gürkan, Hakan
The objective of this work is to develop effective multi-view semi-supervised machine learning strategies for the sentence boundary classification problem when only small sets of sentence boundary labeled data are available. We propose three-view and committee-based learning strategies that combine co-training algorithms with agreement, disagreement, and self-combined learning strategies using prosodic, lexical, and morphological information. We compare experimental results of the proposed three-view and committee-based learning strategies to other semi-supervised learning strategies in the literature, namely self-training and co-training with agreement, disagreement, and self-combined strategies. The experimental results show that sentence segmentation performance can be substantially improved using the multi-view learning strategies that we propose, since the data sets can be represented by three redundantly sufficient and disjoint feature sets.
We show that the proposed strategies substantially improve the average performance when only a small set of manually labeled data is available for spoken Turkish and English.

Publication Incremental construction of classifier and discriminant ensembles (Elsevier Science Inc, 2009-04-15) Ulaş, Aydın; Semerci, Murat; Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem
We discuss approaches to incrementally construct an ensemble. The first constructs an ensemble of classifiers by choosing a subset from a larger set, and the second constructs an ensemble of discriminants, where a classifier is used for some classes only. We investigate criteria including accuracy, significant improvement, diversity, correlation, and the role of search direction. For discriminant ensembles, we test subset selection and trees. Fusion is by voting or by a linear model. Using 14 classifiers on 38 data sets, incremental search finds small, accurate ensembles in polynomial time. The discriminant ensemble uses a subset of discriminants and is simpler, interpretable, and accurate. We see that an incremental ensemble has higher accuracy than bagging and the random subspace method, and it has accuracy comparable to AdaBoost, but with fewer classifiers.
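
The incremental construction described in the last abstract above can be illustrated with a minimal sketch of greedy forward search over a pool of classifiers. This is not the authors' implementation: the function names (`fuse_by_average`, `forward_select`), the use of averaged class-1 scores as the voting rule, the 0.5 threshold, and plain validation accuracy as the only selection criterion are all assumptions made for illustration. The abstract also mentions other criteria (significant improvement, diversity, correlation) and other search directions that are not shown here.

```python
def fuse_by_average(score_lists):
    """Average the class-1 scores of the chosen classifiers, then threshold at 0.5."""
    n = len(score_lists)
    return [1 if sum(scores) / n >= 0.5 else 0 for scores in zip(*score_lists)]


def accuracy(predicted, truth):
    """Fraction of examples where the fused prediction matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def forward_select(candidate_scores, truth):
    """Greedy forward search (hypothetical helper).

    candidate_scores maps a classifier name to its class-1 scores on a
    validation set. At each step, add the classifier that yields the largest
    strict gain in fused validation accuracy; stop when no candidate helps.
    """
    selected, best_acc = [], 0.0
    remaining = set(candidate_scores)
    while remaining:
        best_name, best_new = None, best_acc
        for name in sorted(remaining):  # sorted for a deterministic tie order
            trial = [candidate_scores[n] for n in selected + [name]]
            acc = accuracy(fuse_by_average(trial), truth)
            if acc > best_new:
                best_name, best_new = name, acc
        if best_name is None:
            break  # no remaining classifier improves the ensemble
        selected.append(best_name)
        remaining.remove(best_name)
        best_acc = best_new
    return selected, best_acc


# Toy validation data (made up for this sketch, not taken from the paper).
truth = [1, 0, 1, 0]
pool = {
    "a": [0.9, 0.2, 0.4, 0.1],
    "b": [0.6, 0.6, 0.8, 0.3],
    "c": [0.2, 0.9, 0.3, 0.8],
}
chosen, chosen_acc = forward_select(pool, truth)
```

Each step evaluates every remaining candidate once, so with a pool of K classifiers the search performs at most on the order of K squared ensemble evaluations, which is consistent with the polynomial-time behaviour the abstract reports for incremental search.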












