Search Results

Showing 1 - 10 of 30
  • Publication
    From past to present: spam detection and identifying opinion leaders in social networks
    (Yildiz Teknik Univ., 2022-06-22) Altınel Girgin, Ayşe Berna; Gümüşçekiçci, Gizem
    On microblogging sites, which gain more users every day, a wide range of ideas quickly emerge, spread, and create interactive environments. In some cases, in Turkey as well as in the rest of the world, events have been reported on microblogging sites before appearing in visual, audio, and print news sources. Thanks to the rapid flow of information in social networks, information can reach millions of people in seconds. In this context, social media can be seen as one of the most important sources of information affecting public opinion. Once the information in social networks became accessible, researchers began to use it in their studies. As studies on spam detection and the identification of opinion leaders gained popularity, surveys on these topics began to be published. This study also shows the importance of spam detection and identification of opinion leaders in social networks. Data collected from social platforms, especially in recent years, has been the source for many state-of-the-art applications. There are independent surveys that focus on filtering spam content and detecting influencers on social networks. This survey analyzes both spam detection and opinion leader identification studies and categorizes them by methodology. As far as we know, there is no other survey that covers approaches for both spam detection and opinion leader identification in social networks. This survey gives an overview of past and recent advances in both areas. Furthermore, readers of this survey have the opportunity to understand general aspects of different studies on spam detection and opinion leader identification while observing the key points and comparisons of these studies.
  • Publication
    Evaluating the English-Turkish parallel treebank for machine translation
    (TÜBİTAK, 2022-01-19) Görgün, Onur; Yıldız, Olcay Taner
    This study extends our initial efforts in building an English-Turkish parallel treebank corpus for statistical machine translation tasks. We manually generated parallel trees for about 17K sentences selected from the Penn Treebank corpus. The English sentences vary in length from 15 to 50 tokens, including punctuation. We constrained the translation of trees by (i) reordering leaf nodes based on suffixation rules in Turkish, and (ii) gloss replacement. We aim to mimic a human annotator's behavior in a real translation task. To bridge the morphological and syntactic gap between the languages, we perform morphological annotation and disambiguation. We also apply our heuristics to create the Nokia English-Turkish Treebank (NTB), which addresses technical document translation tasks. NTB includes 8.3K sentences of varying lengths. We validate the corpus both extrinsically and intrinsically, and report evaluation results for perplexity analysis and translation tasks. The results show that our heuristics are promising in terms of perplexity and are suitable for translation tasks in terms of BLEU scores.
  • Publication
    A novel similarity based unsupervised technique for training convolutional filters
    (IEEE, 2023-05-17) Erkoç, Tuğba; Eskil, Mustafa Taner
    Achieving satisfactory results with Convolutional Neural Networks (CNNs) depends on how effectively the filters are trained. Conventionally, an appropriate number of filters is carefully selected, the filters are initialized with a proper initialization method and trained with backpropagation over several epochs. This training scheme requires a large labeled dataset, which is costly and time-consuming to obtain. In this study, we propose an unsupervised approach that extracts convolutional filters from a given dataset in a self-organized manner by processing the training set only once without using backpropagation training. The proposed method allows for the extraction of filters from a given dataset in the absence of labels. In contrast to previous studies, we no longer need to select the best number of filters and a suitable filter weight initialization scheme. Applying this method to the MNIST, EMNIST-Digits, Kuzushiji-MNIST, and Fashion-MNIST datasets yields high test performances of 99.19%, 99.39%, 95.03%, and 90.11%, respectively, without applying backpropagation training or using any preprocessed and augmented data.
  • Publication
    Unreasonable effectiveness of last hidden layer activations for adversarial robustness
    (Institute of Electrical and Electronics Engineers Inc., 2022) Tuna, Ömer Faruk; Çatak, Ferhat Özgür; Eskil, Mustafa Taner
    In standard Deep Neural Network (DNN) based classifiers, the general convention is to omit the activation function in the last (output) layer and directly apply the softmax function on the logits to get the probability scores of each class. In this type of architecture, the loss value of the classifier against any output class is directly proportional to the difference between the final probability score and the label value of the associated class. Standard white-box adversarial evasion attacks, whether targeted or untargeted, mainly try to exploit the gradient of the model loss function to craft adversarial samples and fool the model. In this study, we show both mathematically and experimentally that using some widely known activation functions in the output layer of the model with high temperature values has the effect of zeroing out the gradients for both targeted and untargeted attack cases, preventing attackers from exploiting the model's loss function to craft adversarial samples. We have experimentally verified the efficacy of our approach on the MNIST (Digit) and CIFAR10 datasets. Detailed experiments confirmed that our approach substantially improves robustness against gradient-based targeted and untargeted attack threats. We also showed that the increased non-linearity at the output layer has some additional benefits against other attack methods such as the Deepfool attack.
  • Publication
    Graph convolutional network based virus-human protein-protein interaction prediction for novel viruses
    (Elsevier Ltd, 2022-08-13) Koca, Mehmet Burak; Nourani, Esmaeil; Abbasoğlu, Ferda; Karadeniz, İlknur; Sevilgen, Fatih Erdoğan
    Computational identification of human-virus protein-protein interactions (PHIs) is a worthwhile step towards understanding infection mechanisms. Analysis of PHI networks is important for the determination of pathogenic diseases. Prediction of these interactions is a popular problem since experimental detection of PHIs is both time-consuming and expensive. The available methods use biological features such as amino acid sequences, molecular structure, or biological activities for prediction. Recent studies show that the topological properties of proteins in protein-protein interaction (PPI) networks increase the performance of the predictions. Basic network projections, random-walk-based models, or graph neural networks are used for generating topologically enriched (hybrid) protein embeddings. In this study, we propose a three-stage machine learning pipeline that generates and uses hybrid embeddings for PHI prediction. In the first stage, numerical features are extracted from the amino acid sequences using the Doc2Vec and Byte Pair Encoding methods. In the second stage, the amino acid embeddings are used as node features while training a modified GraphSAGE model, which is an improved version of the graph convolutional network. Lastly, the hybrid protein embeddings are used for training a binary interaction classifier that predicts whether the two given proteins interact. The proposed method is evaluated with comprehensive experiments to test its functionality and compare it with state-of-the-art methods. The experimental results on the benchmark dataset demonstrate the effectiveness of the proposed model, which achieves a 3–23% better area under curve (AUC) score than its competitors.
  • Publication
    TENET: a new hybrid network architecture for adversarial defense
    (Springer Science and Business Media Deutschland GmbH, 2023-08) Tuna, Ömer Faruk; Çatak, Ferhat Özgür; Eskil, Mustafa Taner
    Deep neural network (DNN) models are widely renowned for their resistance to random perturbations. However, researchers have found that these models are in fact extremely vulnerable to deliberately crafted and seemingly imperceptible perturbations of the input, referred to as adversarial examples. Adversarial attacks have the potential to substantially compromise the security of DNN-powered systems and pose high risks, especially in areas where security is a top priority. Numerous studies have been conducted in recent years to defend against these attacks and to develop more robust architectures resistant to adversarial threats. In this study, we propose a new architecture and enhance a recently proposed technique by which we can restore adversarial samples back to their original class manifold. We use several uncertainty metrics obtained from Monte Carlo dropout (MC Dropout) estimates of the model together with the model’s own loss function, and combine them with the defensive distillation technique to defend against these attacks. We have experimentally evaluated and verified the efficacy of our approach on the MNIST (Digit), MNIST (Fashion) and CIFAR10 datasets. In our experiments, we showed that our proposed method reduces the attack’s success rate to below 5% without compromising clean accuracy.
  • Publication
    Distribution games: a new class of games with application to user provided networks
    (Institute of Electrical and Electronics Engineers Inc., 2022-11-29) Taşçı, Sinan Emre; Shalom, Mordechai; Korçak, Ömer
    User Provided Network (UPN) is a promising solution for sharing limited network resources by utilizing user capabilities as a part of the communication infrastructure. In UPNs, an important problem is deciding how to share the resources among multiple clients in a decentralized manner. Motivated by this problem, we introduce a new class of games, termed distribution games, that can be used to distribute the bandwidth capacity among users efficiently and fairly. We show that every distribution game has at least one pure strategy Nash equilibrium (NE) and that any best response dynamics always converges to such an equilibrium. We consider social welfare functions that are weighted sums of bandwidths allocated to clients. We present tight upper bounds for the price of anarchy and price of stability of these games, provided that they satisfy some reasonable assumptions. We define two specific practical instances of distribution games that fit these assumptions. We conduct experiments on one of these instances and demonstrate that in most of the settings the social welfare obtained by the best response dynamics is very close to the optimum. Simulations show that this game also leads to a fair distribution of the bandwidth.
  • Publication
    Machine learning-based model categorization using textual and structural features
    (Springer Science and Business Media Deutschland GmbH, 2022-09-08) Khalilipour, Alireza; Bozyiğit, Fatma; Utku, Can; Challenger, Moharram
    Model Driven Engineering (MDE), where models are the core elements in the entire life cycle from the specification to maintenance phases, is one of the promising techniques to provide abstraction and automation. However, model management is a challenging issue due to the increasing number of models, their size, and their structural complexity. Modelers should therefore organize the available models so that they can be reused and new, more complex models can be developed with less cost and effort. In this direction, many studies have been conducted to categorize models automatically. However, most of these studies focus on either the textual data or the structural information in intelligent model management, leading to less precision in model management activities. Therefore, we performed model classification using baseline machine learning approaches on a dataset of 555 Ecore metamodels, with hybrid feature vectors that include both textual and structural information. In the proposed approach, the textual information of each model is first summarized in its elements through text processing as well as an ontology of synonyms within a specific domain. Then, the performance of the machine learning classifiers was observed on two different variants of the dataset. The first variant includes only textual features (represented in both TF-IDF and word2vec representations), whereas the second variant consists of the determined structural features and textual features. We finally concluded that each machine learning algorithm tested gave better prediction performance on the variant containing structural features. The presented model yields promising results for the model classification task with a classification accuracy of 89.16%.
  • Publication
    Categorization of the models based on structural information extraction and machine learning
    (Springer Science and Business Media Deutschland GmbH, 2022-07-21) Khalilipour, Alireza; Bozyiğit, Fatma; Utku, Can; Challenger, Moharram
    As various engineering fields increasingly use modelling techniques, the number of provided models, their size, and their structural complexity increase. This makes model management, including finding these models, computationally very expensive with state-of-the-art techniques, i.e., it leads to intractable graph comparison algorithms. To handle this problem, modelers can organize available models so that they can be reused and new, more complex models can be developed with less cost and effort. Therefore, we performed model classification using baseline machine learning approaches on a dataset of 555 Ecore metamodels. In our proposed system, the structural information of each model was summarized in its elements by generating simple labelled graphs. The proposed solution transforms the complex attributed graphs of the models into simple labelled graphs so that graph analysis algorithms can be applied to them. The labelled graphs (models) were structurally compared using graph comparison techniques such as graph kernels, and the results were used as a set of features for similarity search. After generating feature vectors, the performance of six machine learning classifiers (Naïve Bayes (NB), k Nearest Neighbors (kNN), Support Vector Machine (SVM), Random Forest (RF), and Artificial Neural Network (ANN)) was evaluated on the feature vectors. The presented model yields promising results for the model classification task with a classification accuracy over 87%.
  • Publication
    Exploiting epistemic uncertainty of the deep learning models to generate adversarial samples
    (Cornell Univ, 2021-02-13) Tuna, Ömer Faruk; Çatak, Ferhat Özgür; Eskil, Mustafa Taner
    Deep neural network architectures are considered to be robust to random perturbations. Nevertheless, it has been shown that they can be severely vulnerable to slight but carefully crafted perturbations of the input, termed adversarial samples. In recent years, numerous studies have been conducted in this new area called "Adversarial Machine Learning" to devise new adversarial attacks and to defend against these attacks with more robust DNN architectures. However, almost all the research work so far has concentrated on utilising the model loss function to craft adversarial examples or create robust models. This study explores the use of quantified epistemic uncertainty obtained from Monte-Carlo Dropout Sampling for adversarial attack purposes, by which we perturb the input toward areas that the model has not seen before. We propose new attack ideas based on the epistemic uncertainty of the model. Our results show that our proposed hybrid attack approach increases the attack success rates from 82.59% to 85.40%, 82.86% to 89.92% and 88.06% to 90.03% on the MNIST Digit, MNIST Fashion and CIFAR-10 datasets, respectively.
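
The epistemic-uncertainty quantity mentioned in the last abstract above can be illustrated with a minimal sketch. This is not the authors' attack or model. It assumes a toy one-layer linear "network" with dropout on its inputs, and it takes the variance of the output across repeated stochastic forward passes as the uncertainty estimate, which is the usual reading of Monte-Carlo Dropout sampling. The names `stochastic_forward` and `mc_dropout`, and all weights and inputs, are hypothetical.

```python
import random
import statistics


def stochastic_forward(x, weights, drop_prob, rng):
    """One forward pass of a toy linear model with dropout left ON.

    Assumption: inverted dropout on the inputs, so each kept input is
    rescaled by 1 / keep_prob to preserve the expected output.
    """
    keep_prob = 1.0 - drop_prob
    total = 0.0
    for x_i, w_i in zip(x, weights):
        if rng.random() < keep_prob:  # keep this input with prob. keep_prob
            total += x_i * w_i / keep_prob
    return total


def mc_dropout(x, weights, drop_prob, passes=200, seed=0):
    """Return (mean prediction, variance across passes).

    The variance across stochastic passes serves here as a simple
    stand-in for the model's epistemic uncertainty at input x.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    outputs = [stochastic_forward(x, weights, drop_prob, rng)
               for _ in range(passes)]
    return statistics.fmean(outputs), statistics.pvariance(outputs)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]          # hypothetical input
    w = [0.5, -0.2, 0.1]         # hypothetical weights
    mean, variance = mc_dropout(x, w, drop_prob=0.5)
    print("mean prediction:", mean)
    print("uncertainty (variance):", variance)
```

With `drop_prob=0.0` every pass is identical and the variance is zero. An uncertainty-based attack of the kind the abstract describes would, under this reading, move the input in a direction that increases this variance rather than (or in addition to) the loss.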