Search Results

Showing 1 - 7 of 7
  • Publication
    From past to present: spam detection and identifying opinion leaders in social networks
    (Yildiz Teknik Univ., 2022-06-22) Altınel Girgin, Ayşe Berna; Gümüşçekiçci, Gizem
    On microblogging sites, which gain more users every day, a wide range of ideas quickly emerge, spread, and create interactive environments. In some cases, in Turkey as well as in the rest of the world, events have been published on microblogging sites before appearing in visual, audio and printed news sources. Thanks to the rapid flow of information in social networks, information can reach millions of people in seconds. In this context, social media can be seen as one of the most important sources of information affecting public opinion. Since the information in social networks became accessible, researchers have started to use it in their studies. As studies on spam detection and the identification of opinion leaders gained popularity, surveys on these topics began to be published. This study also shows the importance of spam detection and identification of opinion leaders in social networks. Data collected from social platforms, especially in recent years, has been the source of many state-of-the-art applications. There are independent surveys that focus on filtering spam content and detecting influencers on social networks. This survey analyzes both spam detection and opinion leader identification studies and categorizes them by their methodologies. As far as we know, there is no other survey that covers approaches for both spam detection and opinion leader identification in social networks. This survey contains an overview of past and recent advances in both areas. Furthermore, readers of this survey can understand the general aspects of different studies on spam detection and opinion leader identification while observing the key points and comparisons of these studies.
  • Publication
    Sarcasm detection in text using deep neural networks
    (Işık Üniversitesi, 2024-02-25) Gümüşçekiçci, Gizem; Dehkharghani, Rahim; Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Yüksek Lisans Programı; Işık University, School of Graduate Studies, Computer Science Engineering Master Program
    Sarcasm is a form of irony that is generally used to express negative opinions. Sarcasm poses a linguistic challenge due to its figurative nature, where the intended meaning contradicts the literal interpretation. Sarcasm is widely used in our daily lives and on many social platforms. Detecting sarcasm in written text is a challenging process that has captured the interest of many researchers. Hence, sarcasm detection has become a crucial task in the Natural Language Processing (NLP) field. This thesis explores the concept of sarcasm and its importance in existing sarcasm research. The automatic process of sarcasm detection involves dataset selection, preprocessing steps, and the selection of proper approaches, including rule-based methods, Machine Learning (ML), Deep Learning (DL) and Transformer architectures. This study surveys previous research on sarcasm detection, specifically examining the datasets, methodologies and performance. This thesis attempts to automatically detect sarcasm by utilizing various ML, DL, transformer and hybrid neural network architectures on news headlines datasets. To overcome the dataset and performance limitations of existing approaches, we propose various methodologies to detect sarcastic text, mostly focusing on DL, hybrid neural networks and transformer architectures. We combine appropriate architectures with several hand-crafted features and different word embedding models. To further improve the performance of our proposed methods and enhance the existing news headlines dataset, we propose several modifications. We contribute to the existing dataset by applying augmentation to increase its size, which helps the proposed models overcome dataset limitations and perform better. Our methodologies correctly identify sarcasm with a 97.68% F1 score.
  • Publication
    Web service translating content into Turkish sign language
    (Institute of Electrical and Electronics Engineers Inc., 2020-10-12) Gümüşçekiçci, Gizem; Ezerceli, Özay; Tek, Faik Boray
    The essential communication tool for people with hearing loss is sign language, which is far more efficient for their communication. Existing systems for translating text into sign language are offline and not practical. In this study, we propose a web service-based solution for online translation of content into Turkish Sign Language. We implemented the system and tested it using 32 sentences of 189 words as inputs. The correct word translation rate was 81.74% for the media or audio inputs and 81.09% for the text inputs. The results show the feasibility of the solution and the potential for improvements.
  • Publication
    Sarcasm detection on news headlines using transformers
    (Springer, 2025-09-07) Gümüşçekiçci, Gizem; Dehkharghani, Rahim
    Sarcasm poses a linguistic challenge due to its figurative nature, where intended meaning contradicts literal interpretation. Sarcasm is prevalent in human communication, affecting interactions in literature, social media, news, e-commerce, etc. Identifying the true intent behind sarcasm is challenging but essential for applications in sentiment analysis. Detecting sarcasm in written text, as a challenging task, has attracted many researchers in recent years. This paper attempts to detect sarcasm in news headlines. Journalists prefer using sarcastic news headlines as they seem much more interesting to the readers. In the proposed methodology, we experimented with Transformers, namely the BERT model, and several Machine and Deep Learning models with different word and sentence embedding methods. The proposed approach inherently requires high-performance resources due to the use of large-scale pre-trained language models such as BERT. We also extended an existing news headlines dataset for sarcasm detection using augmentation techniques and annotating it with hand-crafted features. The proposed methodology could outperform almost all existing sarcasm detection approaches with a 98.86% F1-score when applied to the extended news headlines dataset, which we made publicly available on GitHub.
  • Publication
    Turkish sentiment analysis: a comprehensive review
    (Yildiz Technical University, 2024-08) Altınel Girgin, Ayşe Berna; Gümüşçekiçci, Gizem; Birdemir, Nuri Can
    Sentiment analysis (SA) is a very popular research topic in the text mining field. SA is the process of textual mining in which the meaning of a text is detected and extracted. One of the key aspects of SA is to analyze the body of a text to determine its polarity to understand the opinions it expresses. Substantial amounts of data are produced by online resources such as social media sites, blogs, news sites, etc. Due to this reason, it is impossible to process all of this data without automated systems, which has contributed to the rise in popularity of SA in recent years. SA is considered to be extremely essential, mostly due to its ability to analyze mass opinions. SA, and Natural Language Processing (NLP) in particular, has become an overwhelmingly popular topic as social media usage has increased. The data collected from social media has sourced numerous different SA studies due to being versatile and accessible to the masses. This survey presents a comprehensive study categorizing past and present studies by their employed methodologies and levels of sentiment. In this survey, Turkish SA studies were categorized under three sections. These are Dictionary-based, Machine Learning-based, and Hybrid-based. Researchers can discover, compare, and analyze properties of different Turkish SA studies reviewed in this survey, as well as obtain information on the public dataset and the dictionaries used in the studies. The main purpose of this study is to combine Turkish SA approaches and methods while briefly explaining its concepts. This survey uniquely categorizes a large number of related articles and visualizes their properties. To the best of our knowledge, there is no such comprehensive and up-to-date survey that strictly covers Turkish SA which mainly concerns analysis of sentiment levels. Furthermore, this survey contributes to the literature due to its unique property of being the first of its kind.
  • Publication
    TurkEmbed: Turkish embedding model on natural language inference & sentence text similarity tasks
    (Institute of Electrical and Electronics Engineers Inc., 2025) Ezerceli, Özay; Gümüşçekiçci, Gizem; Erkoç, Tuğba; Özenç, Berke
    This paper introduces TurkEmbed, a novel Turkish language embedding model designed to outperform existing models, particularly in Natural Language Inference (NLI) and Semantic Textual Similarity (STS) tasks. Current Turkish embedding models often rely on machine-translated datasets, potentially limiting their accuracy and semantic understanding. TurkEmbed utilizes a combination of diverse datasets and advanced training techniques, including matryoshka representation learning, to achieve more robust and accurate embeddings. This approach enables the model to adapt to various resource-constrained environments, offering faster encoding capabilities. Our evaluation on the Turkish STS-b-TR dataset, using Pearson and Spearman correlation metrics, demonstrates significant improvements in semantic similarity tasks. Furthermore, TurkEmbed surpasses the current state-of-the-art model, Emrecan, on All-NLI-TR and STS-b-TR benchmarks, achieving a 1-4% improvement. TurkEmbed promises to enhance the Turkish NLP ecosystem by providing a more nuanced understanding of language and facilitating advancements in downstream applications.
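    The evaluation described in the abstract above (scoring sentence pairs by embedding similarity and correlating those scores with gold labels, optionally on Matryoshka-truncated vectors) can be sketched roughly as follows. This is a minimal, standard-library illustration and not the authors' code: the function names are hypothetical, and any vectors passed in are assumed to come from some embedding model.

```python
import math

def truncate_and_normalize(vec, dim):
    """Matryoshka-style shortening: keep the first `dim` components, rescale to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(xs), ranks(ys))

def sts_scores(pairs, dim):
    """Cosine similarity for each (vec_a, vec_b) pair after truncating to `dim` dimensions."""
    return [
        cosine(truncate_and_normalize(a, dim), truncate_and_normalize(b, dim))
        for a, b in pairs
    ]
```

    Given a list of embedding pairs and the matching gold similarity labels, `pearson(sts_scores(pairs, dim), gold)` and `spearman(sts_scores(pairs, dim), gold)` would give the two correlation figures for a chosen embedding size `dim`.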
  • Publication
    TurkEmbed4Retrieval: A retrieval-specific embedding model for Turkish
    (Institute of Electrical and Electronics Engineers Inc., 2025-08-15) Ezerceli, Özay; Gümüşçekiçci, Gizem; Erkoç, Tuğba; Özenç, Berke
    In this study, we introduce TurkEmbed4Retrieval, a model that adapts TurkEmbed, originally developed for Natural Language Inference (NLI) and Semantic Textual Similarity (STS) tasks, to retrieval tasks by fine-tuning it on the MS-Marco-TR dataset. The model is optimized using advanced training techniques such as Matryoshka representation learning and a specially designed negative-pair ranking loss function. Extensive experiments show that TurkEmbed4Retrieval outperforms the TurkishcolBERT model by 19-26% on retrieval metrics on the Scifact-TR dataset. In this context, our model sets a new benchmark for Turkish information retrieval systems.
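    The abstract above does not name the retrieval metrics it reports. As an assumption for illustration only, the sketch below computes two commonly used ones, recall@k and reciprocal rank, for a single query; the function names and document ids are hypothetical.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / position of the first relevant result, or 0.0 if none is retrieved."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / position
    return 0.0

def mean_over_queries(per_query_scores):
    """Average a per-query metric over all queries (e.g. to obtain MRR)."""
    return sum(per_query_scores) / len(per_query_scores) if per_query_scores else 0.0
```

    Averaging `reciprocal_rank` over all queries with `mean_over_queries` yields the mean reciprocal rank (MRR).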