Search Results

Showing 1 - 10 of 20
  • Publication
    A hybrid approach to private record matching
    (IEEE Computer Soc, 2012-10) İnan, Ali; Kantarcıoğlu, Murat; Ghinita, Gabriel; Bertino, Elisa
    Real-world entities are not always represented by the same set of features in different data sets. Therefore, matching records of the same real-world entity distributed across these data sets is a challenging task. If the data sets contain private information, the problem becomes even more difficult. Existing solutions to this problem generally follow two approaches: sanitization techniques and cryptographic techniques. We propose a hybrid technique that combines these two approaches and enables users to trade off between privacy, accuracy, and cost. Our main contribution is the use of a blocking phase that operates over sanitized data to filter out in a privacy-preserving manner pairs of records that do not satisfy the matching condition. We also provide a formal definition of privacy and prove that the participants of our protocols learn nothing other than their share of the result and what can be inferred from their share of the result, their input and sanitized views of the input data sets (which are considered public information). Our method incurs considerably lower costs than cryptographic techniques and yields significantly more accurate matching results compared to sanitization techniques, even when privacy requirements are high.
  • Publication
    Unsupervised textile defect detection using convolutional neural networks
    (Elsevier Ltd, 2021-12) Koulali, Imane; Eskil, Mustafa Taner
    In this study, we propose a novel motif-based approach for unsupervised textile anomaly detection that combines the benefits of traditional convolutional neural networks with those of an unsupervised learning paradigm. It consists of five main steps: preprocessing, automatic pattern period extraction, patch extraction, feature selection and anomaly detection. The proposed approach uses a new dynamic and heuristic method for feature selection which avoids the drawbacks of initializing the number of filters (neurons) and their weights, and those of the backpropagation mechanism, such as vanishing gradients, which are common practice in state-of-the-art methods. The design and training of the network are performed in a dynamic and input domain-based manner and, thus, no ad-hoc configurations are required. Before building the model, only the number of layers and the stride are defined. We do not initialize the weights randomly, nor do we define the filter size or number of filters as conventionally done in CNN-based approaches. This reduces the effort and time spent on hyper-parameter initialization and fine-tuning. Only one defect-free sample is required for training and no further labeled data is needed. The trained network is then used to detect anomalies on defective fabric samples. We demonstrate the effectiveness of our approach on the Patterned Fabrics benchmark dataset. Our algorithm yields reliable and competitive results (on recall, precision, accuracy and f1-measure) compared to state-of-the-art unsupervised approaches, in less time, with efficient training in a single epoch and a lower computational cost.
  • Publication
    Breaking an orbit-based symmetric cryptosystem
    (Pergamon-Elsevier Science Ltd, 2011-09) Solak, Ercan; Rhouma, Rhouma; Belghith, Safya Mdimegh
    We report a break for a recently proposed class of cryptosystems. The cryptosystem uses constant points of a periodic secret orbit to encrypt the plaintext. In order to break the system, it suffices to sort the constant points and find the initial fixed point. We also report breaks for modified versions of the cryptosystem. In addition, we discuss some efficiency issues of the cryptosystem.
  • Publication
    Design and analysis of classifier learning experiments in bioinformatics: survey and case studies
    (IEEE Computer Soc, 2012-12) İrsoy, Ozan; Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem
    In many bioinformatics applications, it is important to assess and compare the performances of algorithms trained from data, so as to draw conclusions that are unaffected by chance and are therefore significant. Both the design of such experiments and the analysis of the resulting data using statistical tests should be done carefully for the results to carry significance. In this paper, we first review the performance measures used in classification, the basics of experiment design and statistical tests. We then give the results of our survey over 1,500 papers published in the last two years in three bioinformatics journals (including this one). Although the basics of experiment design are well understood, such as resampling instead of using a single training set and the use of different performance metrics instead of error, only 21 percent of the papers use any statistical test for comparison. In the third part, we analyze four different scenarios which we encounter frequently in the bioinformatics literature, discussing the proper statistical methodology as well as showing an example case study for each. With the supplementary software, we hope that the guidelines we discuss will play an important role in future studies.
  • Publication
    Comment on "Encryption and decryption of images with chaotic map lattices" [Chaos 16, 033118 (2006)]
    (American Institute of Physics Inc., 2008-09) Solak, Ercan; Çokal, Cahit
    In this paper, we comment on the chaotic encryption algorithm proposed by A. N. Pisarchik et al. [Chaos 16, 033118 (2006)]. We demonstrate that the algorithm is not invertible. We suggest simple modifications that can remedy some of the problems we identified.
  • Publication
    VC-dimension of univariate decision trees
    (IEEE-INST Electrical Electronics Engineers Inc, 2015-02-25) Yıldız, Olcay Taner
    In this paper, we give and prove the lower bounds of the Vapnik-Chervonenkis (VC)-dimension of the univariate decision tree hypothesis class. The VC-dimension of the univariate decision tree depends on the VC-dimension values of its subtrees and the number of inputs. Via a search algorithm that calculates the VC-dimension of univariate decision trees exhaustively, we show that our VC-dimension bounds are tight for simple trees. To verify that the VC-dimension bounds are useful, we also use them to get VC-generalization bounds for complexity control using structural risk minimization in decision trees, i.e., pruning. Our simulation results show that structural risk minimization pruning using the VC-dimension bounds finds trees that are more accurate than those pruned using cross validation.
  • Publication
    A factorized high dimensional model representation on the nodes of a finite hyperprismatic regular grid
    (Elsevier Science Inc, 2005-05-25) Tunga, Mehmet Alper; Demiralp, Metin
    When the values of a multivariate function f(x_1,...,x_N) of N independent variables x_1,...,x_N are given at the nodes of a cartesian product set in the space of the independent variables, and an interpolation problem is defined to find the analytical structure of this function, some difficulties arise in the standard methods due to the multidimensionality of the problem. Here, the main purpose is to partition this multivariate data into low-variate data and to obtain the analytical structure of the multivariate function by using this partitioned data. High dimensional model representation (HDMR) is used for these types of problems. However, if HDMR requires all of its components, that is, 2^N components, to reach a desired accuracy, then factorized high dimensional model representation (FHDMR) can be used. This method uses the components of HDMR. This representation is needed when the sought multivariate function has a multiplicative nature. In this work we show how to utilize FHDMR for these problems and present illustrative examples.
  • Publication
    Hybrid high dimensional model representation (HHDMR) on the partitioned data
    (Elsevier B.V., 2006-01-01) Tunga, Mehmet Alper; Demiralp, Metin
    A multivariate interpolation problem is generally constructed for appropriate determination of a multivariate function whose values are given at a finite number of nodes of a multivariate grid. One way to construct the solution of this problem is to partition the given multivariate data into low-variate data. High dimensional model representation (HDMR) and generalized high dimensional model representation (GHDMR) methods are used to make this partitioning. Using the components of the HDMR or the GHDMR expansions, the multivariate data can be partitioned. When a cartesian product set in the space of the independent variables is given, the HDMR expansion is used. On the other hand, if the nodes are the elements of a random discrete data set, the GHDMR expansion is used instead of HDMR. These two expansions work well for multivariate data that have an additive nature. If the data have a multiplicative nature, then factorized high dimensional model representation (FHDMR) is used. But in most cases the given multivariate data and the sought multivariate function have neither a purely additive nor a purely multiplicative nature. They have a hybrid nature. So, a new method is developed to obtain better results, and it is called hybrid high dimensional model representation (HHDMR). This new expansion includes both the HDMR (or GHDMR) and the FHDMR expansions through a hybridity parameter. In this work, the general structure of this hybrid expansion is given. An attempt is made to obtain the best value for the hybridity parameter. According to this value, the analytical structure of the sought multivariate function can be determined via HHDMR.
  • Publication
    On the feature extraction in discrete space
    (Elsevier Sci Ltd, 2014-05) Yıldız, Olcay Taner
    In many pattern recognition applications, feature space expansion is a key step for improving the performance of the classifier. In this paper, we (i) expand the discrete feature space by generating all orderings of values of k discrete attributes exhaustively, and (ii) modify the well-known decision tree and rule induction classifiers (ID3, Quinlan, 1986 [1] and Ripper, Cohen, 1995 [2]) using these orderings as the new attributes. Our simulation results on 15 datasets from the UCI repository [3] show that the novel classifiers perform better than the original ones in terms of error rate and complexity.
  • Publication
    BinBRO: Binary Battle Royale Optimizer algorithm
    (Elsevier Ltd, 2022-02-04) (Rahkar Farshi), Taymaz Akan; Agahian, Saeid; Dehkharghani, Rahim
    Stochastic methods attempt to solve problems that cannot be solved by deterministic methods with reasonable time complexity. Optimization algorithms benefit from stochastic methods; however, they do not guarantee finding the optimal solution. Many optimization algorithms have been proposed for solving problems of a continuous nature; nevertheless, they are unable to solve discrete or binary problems. Adaptation and use of continuous optimization algorithms for solving discrete problems have gained growing popularity in recent decades. In this paper, we propose a binary version of a recently proposed optimization algorithm, Battle Royale Optimization, which we name BinBRO. The proposed algorithm has been applied to two benchmark problems, the uncapacitated facility location problem (UFLP) and the maximum-cut (Max-Cut) graph problem, and has been compared with 6 other binary optimization algorithms, namely, Particle Swarm Optimization, different versions of Genetic Algorithm, and different versions of Artificial Bee Colony algorithm. The BinBRO-based algorithms ranked first among those algorithms when applied to all benchmark datasets of both problems, UFLP and Max-Cut.