17 results
Search Results
Showing 1 - 10 of 17
Publication: Sınıflandırma için diferansiyel mahremiyete dayalı öznitelik seçimi [Feature selection based on differential privacy for classification] (Gazi Univ, Fac Engineering Architecture, 2018) Var, Esra; İnan, Ali
One of the most important preliminary steps in data mining and machine learning solutions is determining a suitable subset of the attributes of the data to be used in the analysis. For classification methods, this is done by examining how strongly an attribute is related to the class attribute. Many privacy-preserving classification solutions exist. However, no feature selection solutions have been developed for these methods. This study presents novel feature selection methods based on differential privacy, the most comprehensive and secure solution known in statistical database security. The proposed methods were integrated with WEKA, a widely used data mining library, and the experimental results show that the proposed solutions have a positive effect on classification performance.

Publication: Driver recognition using gaussian mixture models and decision fusion techniques (Springer-Verlag Berlin, 2008) Benli, Kristin Surpuhi; Düzağaç, Remzi; Eskil, Mustafa Taner
In this paper we present our research in driver recognition. The goal of this study is to investigate the performance of different classifier fusion techniques in a driver recognition scenario. We use solely driving behavior signals, such as brake and accelerator pedal pressure, engine RPM, vehicle speed, and steering wheel angle, to identify drivers. We modeled each driver using Gaussian Mixture Models, obtained posterior probabilities of identities, and combined these scores using different fixed and trainable (adaptive) fusion methods. We observed error rates as low as 0.35% in recognition of 100 drivers using trainable combiners.
We conclude that the fusion of multi-modal classifier results is very successful in biometric recognition of a person in a car setting.

Publication: Mobile applications discovery: a subscriber-centric approach (Wiley Periodicals, 2011-03) Erman, Bilgehan; İnan, Ali; Nagarajan, Ramesh; Uzunalioğlu, Hüseyin
Rapid adoption of smartphones and the business success of the Apple App Store have resulted in the rampant growth of mobile applications. Seeking new revenue opportunities from application development has created a gold rush. However, free or very cheap applications constitute the bulk of application downloads, putting great pricing pressure on developers. Furthermore, usage statistics suggest that most applications are either one-trick applications or downright useless, meriting no attention from the user beyond the first day. This is not surprising, since low prices dissuade developers from investing large sums of money to continue developing more sophisticated, high-quality applications. Developers have been complaining about the lack of visibility of their applications in stores that are beginning to resemble a high-volume warehouse. It is clear that enhancing application discovery and building better marketing tools will be essential for the continued success of the mobile application marketplace and application stores. This paper proposes and investigates techniques for effective discovery of applications by matching user interests with application characteristics, with a special focus on adapting classical data mining techniques to user ratings of the applications.
The user ratings are leveraged to make recommendations on potential applications of interest.

Publication: Construction of a Turkish proposition bank (Tubitak Scientific & Technical Research Council Turkey, 2018) Ak, Koray; Toprak, Cansu; Esgel, Volkan; Yıldız, Olcay Taner
This paper describes our approach to developing the Turkish PropBank by adopting the semantic role-labeling guidelines of the original PropBank and using the translation of the English Penn-TreeBank as a resource. We discuss the semantic annotation process of the PropBank and language-specific cases for Turkish, the tools we have developed for annotation, and quality control for multiuser annotation. In the current phase of the project, more than 9500 sentences are semantically analyzed and predicate-argument information is extracted for 1330 verbs and 1914 verb senses. Our plan is to annotate 17,000 sentences by the end of 2017.

Publication: Constructing a WordNet for Turkish using manual and automatic annotation (Assoc Computing Machinery, 2018-05) Ehsani, Razieh; Solak, Ercan; Yıldız, Olcay Taner
In this article, we summarize the methodology and the results of our 2-year-long efforts to construct a comprehensive WordNet for Turkish. In our approach, we mine a dictionary for synonym candidate pairs and manually mark the senses in which the candidates are synonymous. Every pair was marked twice by different human annotators. We derive the synsets by finding the connected components of the graph whose edges are synonym senses. We also mined Turkish Wikipedia for hypernym relations among the senses. We analyzed the resulting WordNet to highlight the difficulties brought about by the dictionary construction methods of lexicographers. After splitting the unusually large synsets, we used random walk-based clustering that resulted in a Zipfian distribution of synset sizes. We compared our results to BalkaNet and automatic thesaurus construction methods using the variation of information metric.
Our Turkish WordNet is available online.

Publication: On the maximum cardinality cut problem in proper interval graphs and related graph classes (Elsevier B.V., 2022-01-04) Boyacı, Arman; Ekim, Tınaz; Shalom, Mordechai
Although it has been claimed in two different papers that the maximum cardinality cut problem is polynomial-time solvable for proper interval graphs, both of them turned out to be erroneous. In this work we consider the parameterized complexity of this problem. We show that the maximum cardinality cut problem in proper/unit interval graphs is FPT when parameterized by the maximum number of non-empty bubbles in a column of its bubble model. We then generalize this result to a more general graph class by defining new parameters related to the well-known clique-width parameter. Specifically, we define an (?,?,?)-clique-width decomposition of a graph as a clique-width decomposition in which at each step the following invariant is preserved: after discarding at most ? labels, a) every label consists of at most ? sets of twin vertices, and b) all the labels together induce a graph with independence number at most ?. We show that for every two constants ?,?>0 the problem is FPT when parameterized by ? plus the smallest width of an (?,?,?)-clique-width decomposition.

Publication: Colored simultaneous geometric embeddings (Springer-Verlag Berlin, 2007) Brandes, Ulrik; Erten, Cesim; Fowler, J. Joseph; Frati, Fabrizio; Geyer, Markus; Gutwenger, Carsten; Hong, Seok-Hee; Kaufmann, Michael; Kobourov, Stephen G.; Liotta, Giuseppe; Mutzel, Petra; Symvonis, Antonios
We introduce the concept of colored simultaneous geometric embeddings as a generalization of simultaneous graph embeddings with and without mapping. We show that there exists a universal pointset of size n for paths colored with two or three colors.
We use these results to show that colored simultaneous geometric embeddings exist for: (1) a 2-colored tree together with any number of 2-colored paths and (2) a 2-colored outerplanar graph together with any number of 2-colored paths. We also show that there does not exist a universal pointset of size n for paths colored with five colors. We finally show that the following simultaneous embeddings are not possible: (1) three 6-colored cycles, (2) four 6-colored paths, and (3) three 9-colored paths.

Publication: Chunking in Turkish with conditional random fields (Springer-Verlag, 2015-04-14) Yıldız, Olcay Taner; Solak, Ercan; Ehsani, Razieh; Görgün, Onur
In this paper, we report our work on chunking in Turkish. We used the data that we generated by manually translating a subset of the Penn Treebank. We exploited the already available tags in the trees to automatically identify and label chunks in their Turkish translations. We used conditional random fields (CRF) to train a model over the annotated data. We report our results on different levels of chunk resolution.

Publication: A tree-based approach for English-to-Turkish translation (Tubitak Scientific & Technical Research Council Turkey, 2019) Bakay, Özge; Avar, Begüm; Yıldız, Olcay Taner
In this paper, we present our English-to-Turkish translation methodology, which adopts a tree-based approach. Our approach relies on tree analysis and the application of structural modification rules to get the target side (Turkish) trees from source side (English) ones. We also use morphological analysis to get candidate root words and apply tree-based rules to obtain the agglutinated target words. Compared to earlier work on English-to-Turkish translation using phrase-based models, we have been able to obtain higher BLEU scores in our current study.
Our syntactic subtree permutation strategy, combined with a word replacement algorithm, provides a 67% relative improvement from a baseline of 12.8 to 21.4 BLEU, all averaged over 10-fold cross-validation. As future work, improvements in choosing the correct senses and structural rules are needed.

Publication: Bagging soft decision trees (Springer Verlag, 2016) Yıldız, Olcay Taner; İrsoy, Ozan; Alpaydın, Ahmet İbrahim Ethem
The decision tree is one of the earliest predictive models in machine learning. In the soft decision tree, based on the hierarchical mixture of experts model, internal binary nodes take soft decisions and choose both children with probabilities given by a sigmoid gating function. Hence, for an input, all the paths to all the leaves are traversed and all those leaves contribute to the final decision, but with different probabilities, as given by the gating values on the path. Tree induction is incremental: the tree grows when needed by replacing leaves with subtrees, and the parameters of the newly added nodes are learned using gradient descent. We have previously shown that such soft trees generalize better than hard trees; here, we propose to bag such soft decision trees for higher accuracy. On 27 two-class classification data sets (ten of which are from the medical domain) and 26 regression data sets, we show that the bagged soft trees generalize better than single soft trees and bagged hard trees. This contribution falls in the scope of research track 2 listed in the editorial, namely, machine learning algorithms.
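
As a rough illustration of the soft decision described in the last abstract, the sketch below computes a soft tree's response by weighting both children of each internal node with a sigmoid gate, then averages the responses of several trees. It is not the authors' implementation: the names `Leaf`, `SoftNode`, `predict`, and `bagged_predict` are hypothetical, the linear form of the gate and the constant leaf values are assumptions, and the training side (bootstrap resampling, incremental growth, gradient descent) is omitted entirely.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    value: float  # constant response at the leaf (assumed form)


@dataclass
class SoftNode:
    weights: Sequence[float]  # gating weights; learned in practice, fixed here
    bias: float
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, SoftNode]


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(tree: Tree, x: Sequence[float]) -> float:
    # Every path is followed; each leaf contributes in proportion to the
    # product of the gating values along its path.
    if isinstance(tree, Leaf):
        return tree.value
    z = sum(w * xi for w, xi in zip(tree.weights, x)) + tree.bias
    gate = sigmoid(z)
    return gate * predict(tree.left, x) + (1.0 - gate) * predict(tree.right, x)


def bagged_predict(trees: Sequence[Tree], x: Sequence[float]) -> float:
    # Ensemble output as the plain average of the member trees' responses.
    return sum(predict(t, x) for t in trees) / len(trees)


if __name__ == "__main__":
    t1 = SoftNode(weights=[1.0, -1.0], bias=0.0, left=Leaf(1.0), right=Leaf(0.0))
    t2 = SoftNode(weights=[0.5, 0.5], bias=-1.0, left=Leaf(1.0), right=Leaf(0.0))
    print(bagged_predict([t1, t2], [2.0, 1.0]))
```

With a hard tree, the gate would be 0 or 1 and only one leaf would contribute; the soft version above blends all leaves, which is the property the abstract describes.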












