Search Results

Showing 1 - 8 of 8
  • Publication
    Effective semi-supervised learning strategies for automatic sentence segmentation
    (Elsevier Science BV, 2018-04-01) Dalva, Doğan; Güz, Ümit; Gürkan, Hakan
    The primary objective of the sentence segmentation process is to determine the sentence boundaries of a stream of words output by automatic speech recognizers. Statistical methods developed for sentence segmentation require a significant amount of labeled data, which is time-consuming, labor-intensive and expensive to produce. In this work, we propose new multi-view semi-supervised learning strategies for the sentence boundary classification problem using lexical, prosodic, and morphological information. The aim is to find effective semi-supervised machine learning strategies when only small sets of sentence boundary labeled data are available. We primarily investigate two semi-supervised learning approaches, called self-training and co-training. Different example selection strategies were also used for co-training, namely, agreement, disagreement and self-combined. Furthermore, we propose three-view and committee-based algorithms incorporating agreement, disagreement and self-combined strategies using three disjoint feature sets. We present comparative results of different learning strategies on the sentence segmentation task. The experimental results show that the sentence segmentation performance can be highly improved using the multi-view learning strategies that we proposed, since data sets can be represented by three redundantly sufficient and disjoint feature sets. We show that the proposed strategies substantially improve the average baseline F-measure from 67.66% to 75.15% and from 64.84% to 66.32% when only a small set of manually labeled data is available for Turkish and English spoken languages, respectively.
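As a rough illustration of the example selection step in co-training, the sketch below picks unlabeled examples under an "agreement" or a "disagreement" rule for two views. The abstract does not define these rules, so the definitions here are assumptions: agreement means both views confidently predict the same label, and disagreement means one view is confident while the other is uncertain. The function names, the confidence measure and the thresholds `hi` and `lo` are hypothetical.

```python
def confidence(p):
    """Confidence of a binary posterior p = P(boundary), scaled to [0, 1]."""
    return abs(p - 0.5) * 2


def select_examples(p_view_a, p_view_b, strategy, hi=0.8, lo=0.2):
    """Return (index, label) pairs chosen for automatic labeling.

    p_view_a, p_view_b: posteriors P(boundary) from two classifiers trained
    on disjoint feature sets (for instance lexical and prosodic views).
    The selection rules below are assumed, not taken from the paper.
    """
    selected = []
    for i, (pa, pb) in enumerate(zip(p_view_a, p_view_b)):
        label_a, label_b = pa >= 0.5, pb >= 0.5
        conf_a, conf_b = confidence(pa), confidence(pb)
        if strategy == "agreement":
            # Both views are confident and predict the same label.
            if label_a == label_b and conf_a >= hi and conf_b >= hi:
                selected.append((i, label_a))
        elif strategy == "disagreement":
            # One view is confident, the other is uncertain: the confident
            # view supplies the label that the other view can learn from.
            if conf_a >= hi and conf_b <= lo:
                selected.append((i, label_a))
            elif conf_b >= hi and conf_a <= lo:
                selected.append((i, label_b))
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return selected


p_lexical = [0.95, 0.05, 0.97, 0.55]
p_prosodic = [0.92, 0.45, 0.30, 0.50]
print(select_examples(p_lexical, p_prosodic, "agreement"))
print(select_examples(p_lexical, p_prosodic, "disagreement"))
```

In a full co-training loop the selected examples would be added to the labeled pool and both classifiers retrained; that loop is omitted here.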
  • Publication
    Tree Ensembles on the induced discrete space
    (Institute of Electrical and Electronics Engineers Inc., 2016-05) Yıldız, Olcay Taner
    Decision trees are widely used predictive models in machine learning. Recently, the K-tree was proposed, where the original discrete feature space is expanded by generating all orderings of values of k discrete attributes and these orderings are used as the new attributes in decision tree induction. Although K-trees perform significantly better than proper decision trees, their exponential time complexity can prohibit their use. In this brief, we propose K-forest, an extension of random forest, where a subset of features is selected randomly from the induced discrete space. Simulation results on 17 data sets show that the novel ensemble classifier has a significantly lower error rate compared with the random forest based on the original feature space.
  • Publication
    Eigenclassifiers for combining correlated classifiers
    (Elsevier Science Inc, 2012-03-15) Ulaş, Aydın; Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem
    In practice, classifiers in an ensemble are not independent. This paper is the continuation of our previous work on ensemble subset selection [A. Ulas, M. Semerci, O.T. Yildiz, E. Alpaydin, Incremental construction of classifier and discriminant ensembles, Information Sciences, 179 (9) (2009) 1298-1318] and has two parts: first, we investigate the effect of four factors on correlation: (i) algorithms used for training, (ii) hyperparameters of the algorithms, (iii) resampled training sets, (iv) input feature subsets. Simulations using 14 classifiers on 38 data sets indicate that hyperparameters and overlapping training sets have a higher effect on positive correlation than features and algorithms. Second, we propose postprocessing before fusing using principal component analysis (PCA) to form uncorrelated eigenclassifiers from a set of correlated experts. Combining the information from all classifiers may be better than subset selection where some base classifiers are pruned before combination, because using all allows redundancy.
  • Publication
    Mapping classifiers and datasets
    (Pergamon-Elsevier Science Ltd, 2011-04) Yıldız, Olcay Taner
    Given the posterior probability estimates of 14 classifiers on 38 datasets, we plot two-dimensional maps of classifiers and datasets using principal component analysis (PCA) and Isomap. The similarity between classifiers indicates correlation (or diversity) between them and can be used in deciding whether to include both in an ensemble. Similarly, datasets which are too similar need not both be used in a general comparison experiment. The results show that (i) most of the datasets (approximately two thirds) we used are similar to each other, (ii) multilayer perceptrons and k-nearest neighbor variants are more similar to each other than support vector machine and decision tree variants, and (iii) the number of classes and the sample size have an effect on similarity.
  • Publication
    Adaptive convolution kernel for artificial neural networks
    (Academic Press Inc., 2021-02) Tek, Faik Boray; Çam, İlker; Karlı, Deniz
    Many deep neural networks are built by using stacked convolutional layers of fixed and single size (often 3 × 3) kernels. This paper describes a method for learning the size of convolutional kernels to provide varying size kernels in a single layer. The method utilizes a differentiable, and therefore backpropagation-trainable, Gaussian envelope which can grow or shrink in a base grid. Our experiments compared the proposed adaptive layers to ordinary convolution layers in a simple two-layer network, a deeper residual network, and a U-Net architecture. The results on popular image classification datasets such as MNIST, MNIST-CLUTTERED, CIFAR-10, Fashion, and “Faces in the Wild” showed that the adaptive kernels can provide statistically significant improvements over ordinary convolution kernels. A segmentation experiment on the Oxford-Pets dataset demonstrated that replacing ordinary convolution layers in a U-shaped network with 7 × 7 adaptive layers can improve its learning performance and ability to generalize.
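To make the idea of a Gaussian envelope on a base grid concrete, the following minimal sketch multiplies a fixed-size weight grid by an isotropic Gaussian centered on the grid. It is an assumption-laden illustration, not the paper's formulation: the isotropic form, the single width parameter `sigma`, and the function names are all hypothetical, and only the forward computation is shown (in a trainable layer `sigma` would be updated by backpropagation).

```python
import math


def gaussian_envelope(size, sigma):
    """Isotropic Gaussian envelope on a size x size grid, peak 1 at the center.

    A small sigma suppresses the outer grid cells (the kernel "shrinks");
    a large sigma leaves them nearly untouched (the kernel "grows").
    """
    center = (size - 1) / 2.0
    return [
        [
            math.exp(-((r - center) ** 2 + (c - center) ** 2) / (2.0 * sigma ** 2))
            for c in range(size)
        ]
        for r in range(size)
    ]


def apply_envelope(weights, sigma):
    """Element-wise product of a square weight grid with the envelope."""
    size = len(weights)
    env = gaussian_envelope(size, sigma)
    return [[weights[r][c] * env[r][c] for c in range(size)] for r in range(size)]


base = [[1.0] * 7 for _ in range(7)]   # 7 x 7 base grid of unit weights
narrow = apply_envelope(base, sigma=0.8)
wide = apply_envelope(base, sigma=3.0)
print(round(narrow[0][0], 6), round(wide[0][0], 4))  # corner weight under each envelope
```

Because the envelope is a smooth function of `sigma`, its gradient with respect to `sigma` exists everywhere, which is what makes the effective kernel size learnable in this kind of scheme.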
  • Publication
    Quadratic programming for class ordering in rule induction
    (Elsevier Science BV, 2015-03-01) Yıldız, Olcay Taner
    Separate-and-conquer type rule induction algorithms such as Ripper solve a K>2 class problem by converting it into a sequence of K - 1 two-class problems. As a usual heuristic, the classes are fed into the algorithm in the order of increasing prior probabilities. Although the heuristic works well in practice, there is much room for improvement. In this paper, we propose a novel approach to improve this heuristic. The approach transforms the ordering search problem into a quadratic optimization problem and uses the solution of the optimization problem to extract the optimal ordering. We compared the new Ripper (guided by the ordering found with our approach) with the original Ripper (guided by the heuristic ordering) on 27 datasets. Simulation results show that our approach produces rulesets that are significantly better than those produced by the original Ripper.
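The baseline heuristic described above (classes ordered by increasing prior probability, then K - 1 two-class problems) can be sketched as follows. This covers only the heuristic ordering, not the quadratic programming approach the paper proposes; the function names and the tie-breaking by class name are illustrative assumptions.

```python
from collections import Counter


def heuristic_class_order(labels):
    """Order classes by increasing prior probability (estimated by frequency).

    Ties are broken by class name so the result is deterministic.
    """
    counts = Counter(labels)
    return sorted(counts, key=lambda cls: (counts[cls], str(cls)))


def two_class_sequence(order):
    """Turn an ordering of K classes into K - 1 (positive, negatives) problems.

    At step i the i-th class is separated from all classes that come later;
    the last class in the ordering is left as the default class.
    """
    return [(order[i], order[i + 1:]) for i in range(len(order) - 1)]


labels = ["c", "b", "c", "a", "b", "c"]
order = heuristic_class_order(labels)
print(order)
print(two_class_sequence(order))
```

An alternative ordering, however it is found, could be passed to `two_class_sequence` in the same way, which is the point at which an optimized ordering would replace the heuristic one.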
  • Publication
    An adaptive locally connected neuron model: Focusing neuron
    (Elsevier B.V., 2021-01-02) Tek, Faik Boray
    This paper presents a new artificial neuron model capable of learning its receptive field in the topological domain of inputs. The experiments include tests of focusing neuron networks of one or two hidden layers on synthetic and well-known image recognition data sets. The results demonstrated that the focusing neurons can move their receptive fields towards more informative inputs. In the simple two-hidden-layer networks, the focusing layers outperformed the dense layers in the classification of the 2D spatial data sets. Moreover, the focusing networks performed better than the dense networks even when 70% of the weights were pruned. The tests on convolutional networks revealed that using focusing layers instead of dense layers for the classification of convolutional features may work better on some data sets.
  • Publication
    Analysis of the benefits, challenges and risks for the integrated use of BIM, RFID and WSN: a mixed method research
    (Emerald Group Publishing Ltd, 2023-07-11) Seyis, Senem; Sönmez, Alperen Mert
    Purpose: The purpose of this study is to identify, classify and prioritize the benefits, challenges and risks for the integrated use of building information modeling (BIM), radio frequency identification (RFID) and wireless sensor network (WSN) in the architecture, engineering, construction and operation (AECO) industry. Design/methodology/approach: This study relies on the mixed method approach, which consists of a systematic literature review, semistructured interviews and the Delphi technique. A systematic literature review was performed and face-to-face semistructured interviews with seven subject matter experts (SMEs) were conducted for identification and classification purposes. The Delphi method was applied in two structured rounds with eleven SMEs for prioritization purposes. These three research techniques were chosen to reach the most accurate data by combining different perspectives on the subject matter. Data gathered by these three methods were triangulated to increase the validity and reliability of this research. Findings: Thirteen benefits, ten challenges and four risks for the integrated use of BIM, RFID and WSN were identified. The results could help practitioners and researchers comprehend the pros and cons of this integration by representing SMEs' valuable insights and perspectives about the current and future status, trends, limitations and requirements of the AECO industry. The identified risks and challenges show the requirements for future studies while the benefits demonstrate the capabilities and the potential contributions of this hybrid integration to the AECO industry. Originality/value: The integration of BIM, RFID and WSN is still not commonly implemented in the AECO industry. Some studies focused on this topic; however, none of them reveals the benefits, risks and challenges for integrating BIM, RFID and WSN in a holistic manner. This research makes a significant contribution to the AECO literature and industry by uncovering the benefits, challenges and risks for the integrated use of BIM, RFID and WSN that could increase industry applications.