Search Results
Showing 1 - 6 of 6
Publication: Kural bazlı otomatik haber etiketleme [Rule-based automatic news tagging] (IEEE, 2017-06-27)
Özenç, Berke; Solak, Ercan
In this study, a rule-based application was developed that collects news from internet sources and automatically tags the collected news. A secondary aim of the study is to measure which features are better suited to assigning tags. The success rate of each rule was measured on 100 manually tagged news items.

Publication: MorAz: An open-source morphological analyzer for Azerbaijani Turkish (Association for Computational Linguistics (ACL), 2018)
Özenç, Berke; Ehsani, Razieh; Solak, Ercan
MorAz is an open-source morphological analyzer for Azerbaijani Turkish. The analyzer is available both as a website for interactive exploration and as a RESTful web service for integration into a natural language processing pipeline. MorAz implements the morphology of Azerbaijani Turkish following a two-level approach using the Helsinki finite-state transducer and wraps the analyzer with Python scripts in a Django instance.

Publication: Shallow parsing in Turkish (IEEE, 2017)
Topsakal, Ozan; Açıkgöz, Onur; Gürkan, Ali Tunca; Kanburoğlu, Ali Buğra; Ertopçu, Burak; Özenç, Berke; Çam, İlker; Avar, Begüm; Ercan, Gökhan; Yıldız, Olcay Taner
In this study, shallow parsing is applied to Turkish sentences. These sentences are used to train and test the performance of various learning algorithms with various features specified for shallow parsing in Turkish.

Publication: All-words word sense disambiguation for Turkish (IEEE, 2017)
Açıkgöz, Onur; Gürkan, Ali Tunca; Ertopçu, Burak; Topsakal, Ozan; Özenç, Berke; Kanburoğlu, Ali Buğra; Çam, İlker; Avar, Begüm; Ercan, Gökhan; Yıldız, Olcay Taner
Identifying the sense of a word within a context is a challenging problem and has many applications in natural language processing. This assignment problem is called word sense disambiguation (WSD). Many papers in the literature focus on English language and data. Our dataset consists of 1400 sentences translated to Turkish from the Penn Treebank Corpus. This paper discusses six different feature extraction methods and their classification performance using C4.5, Random Forests, Rocchio, Naive Bayes, KNN, linear perceptron, and multilayer perceptron. It examines how the described features perform on a morphologically rich language (Turkish) with several classifiers.

Publication: TurkEmbed: Turkish embedding model on natural language inference & sentence text similarity tasks (Institute of Electrical and Electronics Engineers Inc., 2025)
Ezerceli, Özay; Gümüşçekiçci, Gizem; Erkoç, Tuğba; Özenç, Berke
This paper introduces TurkEmbed, a novel Turkish language embedding model designed to outperform existing models, particularly in Natural Language Inference (NLI) and Semantic Textual Similarity (STS) tasks. Current Turkish embedding models often rely on machine-translated datasets, potentially limiting their accuracy and semantic understanding. TurkEmbed utilizes a combination of diverse datasets and advanced training techniques, including Matryoshka representation learning, to achieve more robust and accurate embeddings. This approach enables the model to adapt to various resource-constrained environments, offering faster encoding capabilities. Our evaluation on the Turkish STS-b-TR dataset, using Pearson and Spearman correlation metrics, demonstrates significant improvements in semantic similarity tasks. Furthermore, TurkEmbed surpasses the current state-of-the-art model, Emrecan, on All-NLI-TR and STS-b-TR benchmarks, achieving a 1-4% improvement. TurkEmbed promises to enhance the Turkish NLP ecosystem by providing a more nuanced understanding of language and facilitating advancements in downstream applications.

Publication: TurkEmbed4Retrieval: Türkçe için geri getirme görevine özel gömme modeli [TurkEmbed4Retrieval: A retrieval-specific embedding model for Turkish] (Institute of Electrical and Electronics Engineers Inc., 2025-08-15)
Ezerceli, Özay; Gümüşçekiçci, Gizem; Erkoç, Tuğba; Özenç, Berke
In this study, we introduce TurkEmbed4Retrieval, a model that adapts TurkEmbed, originally developed for Natural Language Inference (NLI) and Semantic Textual Similarity (STS) tasks, to retrieval tasks by fine-tuning it on the MS-Marco-TR dataset. The model is optimized using advanced training techniques such as Matryoshka representation learning and a custom-designed negative-pair ranking loss function. Extensive experiments show that TurkEmbed4Retrieval outperforms the TurkishcolBERT model by 19-26% on retrieval metrics on the Scifact-TR dataset. In this respect, our model sets a new bar for Turkish information retrieval systems.












