Search Results

Showing 1 - 5 of 5
  • Publication
    Visual modeling of Turkish morphology
    (European Language Resources Association (ELRA), 2020-05-16) Özenç, Berke; Solak, Ercan
    In this paper, we describe the steps in a visual modeling of Turkish morphology using diagramming tools. We aimed to make modeling easier and more maintainable while automating much of the code generation. We released the resulting analyzer, MorTur, and the diagram conversion tool, DiaMor, as free, open-source utilities. The MorTur analyzer is also publicly available on its web page as a web service. MorTur and DiaMor are part of our ongoing efforts to build a set of natural language processing tools for Turkic languages under a consistent framework.
  • Publication
    Kural bazlı otomatik haber etiketleme (Rule-based automatic news tagging)
    (IEEE, 2017-06-27) Özenç, Berke; Solak, Ercan
    In this study, a rule-based application was built that collects news from web sources and automatically tags the collected news. A secondary aim of the study is to measure which features are better suited to determining tags. The success rate of each rule was measured on 100 manually tagged news items.
  • Publication
    MorAz: An open-source morphological analyzer for Azerbaijani Turkish
    (Association for Computational Linguistics (ACL), 2018) Özenç, Berke; Ehsani, Razieh; Solak, Ercan
    MorAz is an open-source morphological analyzer for Azerbaijani Turkish. The analyzer is available both as a website for interactive exploration and as a RESTful web service for integration into a natural language processing pipeline. MorAz implements the morphology of Azerbaijani Turkish following a two-level approach using the Helsinki finite-state transducer and wraps the analyzer with Python scripts in a Django instance.
  • Publication
    A FST description of noun and verb morphology of Azarbaijani Turkish
    (Association for Computational Linguistics (ACL), 2021) Ehsani, Razieh; Özenç, Berke; Solak, Ercan; Drewes, F.
    We give an FST description of the nominal and finite verb morphology of Azarbaijani Turkish. We use a hybrid approach where nominal inflection is expressed as a slot-based paradigm and major parts of verb inflection are expressed as optional paths on the FST. We collapse the adjective and noun categories into a single nominal category, as they behave similarly as far as their paradigms are concerned. Thus, we defer a more precise identification of POS to further down the NLP pipeline.
  • Publication
    TurkEmbed: Turkish embedding model on natural language inference & sentence text similarity tasks
    (Institute of Electrical and Electronics Engineers Inc., 2025) Ezerceli, Özay; Gümüşçekiçci, Gizem; Erkoç, Tuğba; Özenç, Berke
    This paper introduces TurkEmbed, a novel Turkish language embedding model designed to outperform existing models, particularly in Natural Language Inference (NLI) and Semantic Textual Similarity (STS) tasks. Current Turkish embedding models often rely on machine-translated datasets, potentially limiting their accuracy and semantic understanding. TurkEmbed utilizes a combination of diverse datasets and advanced training techniques, including matryoshka representation learning, to achieve more robust and accurate embeddings. This approach enables the model to adapt to various resource-constrained environments, offering faster encoding capabilities. Our evaluation on the Turkish STS-b-TR dataset, using Pearson and Spearman correlation metrics, demonstrates significant improvements in semantic similarity tasks. Furthermore, TurkEmbed surpasses the current state-of-the-art model, Emrecan, on All-NLI-TR and STS-b-TR benchmarks, achieving a 1-4% improvement. TurkEmbed promises to enhance the Turkish NLP ecosystem by providing a more nuanced understanding of language and facilitating advancements in downstream applications.
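
    The TurkEmbed abstract above mentions matryoshka representation learning as the reason the model can adapt to resource-constrained environments. The sketch below is not the authors' code. It only illustrates the general idea under one assumption: that a matryoshka-style embedding can be cut down to its first few dimensions and re-normalized before comparison. The function names and the toy vectors are hypothetical.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components of an embedding and rescale to unit length.

    Assumption: the embedding was trained matryoshka-style, so its leading
    dimensions are meaningful on their own.
    """
    if dim <= 0 or dim > len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 4-dimensional "embeddings" (made-up numbers, not model output).
sentence_a = [3.0, 4.0, 12.0, 0.5]
sentence_b = [3.0, 4.0, -1.0, 2.0]

# Compare the two sentences using only the first 2 dimensions.
short_a = truncate_and_normalize(sentence_a, 2)
short_b = truncate_and_normalize(sentence_b, 2)
similarity_2d = cosine_similarity(short_a, short_b)
```

    Shorter vectors cost less to store and compare, which is the trade-off the abstract alludes to; how much accuracy is lost at each size depends on the model and is not shown here.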