Search Results
Showing 1 - 8 of 8
Publication: Cryptanalysis of Fridrich's chaotic image encryption (World Scientific Publishing, 2010-05)
Solak, Ercan; Çokal, Cahit; Yıldız, Olcay Taner; Bıyıkoğlu, Türker
We cryptanalyze Fridrich's chaotic image encryption algorithm. We show that the algebraic weaknesses of the algorithm make it vulnerable against chosen-ciphertext attacks. We propose an attack that reveals the secret permutation that is used to shuffle the pixels of a round input. We demonstrate the effectiveness of our attack with examples and simulation results. We also show that our proposed attack can be generalized to other well-known chaotic image encryption algorithms.

Publication: Parallel univariate decision trees (Elsevier B.V., 2007-05-01)
Yıldız, Olcay Taner; Dikmen, Onur
Univariate decision tree algorithms are widely used in data mining because (i) they are easy to learn and (ii) when trained they can be expressed in a rule-based manner. In several applications, mainly including data mining, the dataset to be learned is very large. In those cases it is highly desirable to construct univariate decision trees in reasonable time. This may be accomplished by parallelizing univariate decision tree algorithms. In this paper, we first present two different univariate decision tree algorithms: C4.5 and univariate linear discriminant tree. We show how to parallelize these algorithms in three ways: (i) feature based; (ii) node based; (iii) data based manners. Experimental results show that the performance of the parallelizations depends heavily on the dataset and that the node-based parallelization demonstrates good speedups.

Publication: Univariate decision tree induction using maximum margin classification (Oxford Univ Press, 2012-03)
Yıldız, Olcay Taner
In many pattern recognition applications, first decision trees are used due to their simplicity and easily interpretable nature.
In this paper, we propose a new decision tree learning algorithm called univariate margin tree where, for each continuous attribute, the best split is found using convex optimization. Our simulation results on 47 data sets show that the novel margin tree classifier performs at least as well as C4.5 and linear discriminant tree (LDT) with a similar time complexity. For two-class data sets, it generates significantly smaller trees than C4.5 and LDT without sacrificing accuracy, and generates significantly more accurate trees than C4.5 and LDT for multiclass data sets with one-vs-rest methodology.

Publication: Quadratic programming for class ordering in rule induction (Elsevier Science BV, 2015-03-01)
Yıldız, Olcay Taner
Separate-and-conquer type rule induction algorithms such as Ripper solve a K>2 class problem by converting it into a sequence of K - 1 two-class problems. As a usual heuristic, the classes are fed into the algorithm in the order of increasing prior probabilities. Although the heuristic works well in practice, there is much room for improvement. In this paper, we propose a novel approach to improve this heuristic. The approach transforms the ordering search problem into a quadratic optimization problem and uses the solution of the optimization problem to extract the optimal ordering. We compared new Ripper (guided by the ordering found with our approach) with original Ripper (guided by the heuristic ordering) on 27 datasets. Simulation results show that our approach produces rulesets that are significantly better than those produced by the original Ripper.

Publication: Software defect prediction using Bayesian networks (Springer, 2014-02)
Okutan, Ahmet; Yıldız, Olcay Taner
There are lots of different software metrics discovered and used for defect prediction in the literature. Instead of dealing with so many metrics, it would be practical and easy if we could determine the set of metrics that are most important and focus on them more to predict defectiveness.
We use Bayesian networks to determine the probabilistic influential relationships among software metrics and defect proneness. In addition to the metrics used in Promise data repository, we define two more metrics, i.e. NOD for the number of developers and LOCQ for the source code quality. We extract these metrics by inspecting the source code repositories of the selected Promise data repository data sets. At the end of our modeling, we learn the marginal defect proneness probability of the whole software system, the set of most effective metrics, and the influential relationships among metrics and defectiveness. Our experiments on nine open source Promise data repository data sets show that response for class (RFC), lines of code (LOC), and lack of coding quality (LOCQ) are the most effective metrics whereas coupling between objects (CBO), weighted method per class (WMC), and lack of cohesion of methods (LCOM) are less effective metrics on defect proneness. Furthermore, number of children (NOC) and depth of inheritance tree (DIT) have very limited effect and are untrustworthy. On the other hand, based on the experiments on Poi, Tomcat, and Xalan data sets, we observe that there is a positive correlation between the number of developers (NOD) and the level of defectiveness. However, further investigation involving a greater number of projects is needed to confirm our findings.

Publication: A novel kernel to predict software defectiveness (Elsevier Science Inc, 2016-09)
Okutan, Ahmet; Yıldız, Olcay Taner
Although the software defect prediction problem has been researched for a long time, the results achieved are not so bright. In this paper, we propose to use novel kernels for defect prediction that are based on the plagiarized source code, software clones and textual similarity. We generate precomputed kernel matrices and compare their performance on different data sets to model the relationship between source code similarity and defectiveness.
Each value in a kernel matrix shows how much parallelism exists between the corresponding files of a software system chosen. Our experiments on 10 real world datasets indicate that support vector machines (SVM) with a precomputed kernel matrix performs better than the SVM with the usual linear kernel in terms of F-measure. Similarly, when used with a precomputed kernel, the k-nearest neighbor classifier (KNN) achieves comparable performance with respect to KNN classifier. The results from this preliminary study indicate that source code similarity can be used to predict defect proneness.

Publication: WikiLeaks on the Middle East: Obscure diplomacy networks and binding spaces (Routledge Journals, 2014-10-02)
Bıçakcı, Ahmet Salih; Rende, Deniz; Rende, Sevinç; Yıldız, Olcay Taner
In this paper, we explore the flow of information regarding strategic Middle Eastern countries in the WikiLeaks 'diplomatic cables' by applying data-mining techniques to construct directed networks. The results show that between 2002 and 2009, US diplomatic communication related to these countries increased although with notable variation in flow patterns. We discuss the value of a visual display of diplomatic communication patterns in understanding the decentralized nature of information gathering on regional foreign policy issues.

Publication: Grammar or crammer? the role of morphology in distinguishing orthographically similar but semantically unrelated words (Institute of Electrical and Electronics Engineers Inc., 2025)
Ercan, Gökhan; Yıldız, Olcay Taner
We show that n-gram-based distributional models fail to distinguish unrelated words due to the noise in semantic spaces. This issue remains hidden in conventional benchmarks but becomes more pronounced when orthographic similarity is high. To highlight this problem, we introduce OSimUnr, a dataset of nearly one million English and Turkish word-pairs that are orthographically similar but semantically unrelated (e.g., grammar - crammer).
These pairs are generated through a graph-based WordNet approach and morphological resources. We define two evaluation tasks - unrelatedness identification and relatedness classification - to test semantic models. Our experiments reveal that FastText, with default n-gram segmentation, performs poorly (below 5% accuracy) in identifying unrelated words. However, morphological segmentation overcomes this issue, boosting accuracy to 68% (English) and 71% (Turkish) without compromising performance on standard benchmarks (RareWords, MTurk771, MEN, AnlamVer). Furthermore, our results suggest that even state-of-the-art LLMs, including Llama 3.3 and GPT-4o-mini, may exhibit noise in their semantic spaces, particularly in highly synthetic languages such as Turkish. To ensure dataset quality, we leverage WordNet, MorphoLex, and NLTK, covering fully derivational morphology supporting atomic roots (e.g., '-co_here+ance+y' for 'coherency'), with 405 affixes in Turkish and 467 in English.












