Search Results

Showing 1 - 10 of 29
  • Publication
    Visual modeling of Turkish morphology
    (European Language Resources Association (ELRA), 2020-05-16) Özenç, Berke; Solak, Ercan
    In this paper, we describe the steps in a visual modeling of Turkish morphology using diagramming tools. We aimed to make modeling easier and more maintainable while automating much of the code generation. We released the resulting analyzer, MorTur, and the diagram conversion tool, DiaMor, as free, open-source utilities. The MorTur analyzer is also publicly available on its web page as a web service. MorTur and DiaMor are part of our ongoing efforts in building a set of natural language processing tools for Turkic languages under a consistent framework.
  • Publication
    TRopBank: Turkish PropBank V2.0
    (European Language Resources Association (ELRA), 2020-05-16) Kara, Neslihan; Aslan, Deniz Baran; Marşan, Büşra; Bakay, Özge; Ak, Koray; Yıldız, Olcay Taner
    In this paper, we present and explain TRopBank, “Turkish PropBank v2.0”. PropBank is a hand-annotated corpus of propositions which is used to obtain the predicate-argument information of a language. Predicate-argument information of a language can help understand semantic roles of arguments. “Turkish PropBank v2.0”, unlike PropBank v1.0, has a much more extensive list of Turkish verbs, with 17,673 verbs in total.
  • Publication
    Status of the Focal Plane Instrumentation (FPI) Project of the 4 m DAG Telescope
    (SPIE, 2016-08-09) Keskin, Onur; Yerli, Sinan Kaan; Yeşilyaprak, Cahit; Güver, Tolga; Aliş, Sinan; Yelkenci, Filiz Korhan; Güçsav, Bülent Burak; Arabacı, Mehtap Özbey; Erol, Ayşe
    DAG (the Turkish acronym for Eastern Anatolia Observatory) will be the newest and largest (4 m) observatory of Turkey in both optical (VIS) and near-infrared (NIR) with its robust observing site infrastructure. The telescope is designed to house 2 Nasmyth platforms which will be dedicated to NIR and VIS observations. A collaboration has recently been established among four Turkish universities including FMV Isik University (for adaptive optics systems), Middle East Technical University (for measurement, test and calibration purposes), Istanbul University (for new technology instruments, e.g. MKIDs) and as the coordinator Ataturk University (for obtaining NIR and VIS instruments). In this paper the status of the recently approved FPI project and its aims are presented and possible collaboration opportunities are emphasized.
  • Publication
    Vikipedi ve Vikisözlük'ten Hypernym çıkarma [Hypernym extraction from Wikipedia and Wiktionary]
    (IEEE, 2017-06-27) Şaşmaz, Emre; Ehsani, Razieh; Yıldız, Olcay Taner
    One of the important structures used in natural language processing is large-scale dictionaries such as WordNet. WordNet is a comprehensive dictionary that also includes semantic relations such as synonymy and antonymy. In this paper, we attempted to extract the hypernym-hyponym relation, which is an important part of WordNet. To this end, we used Wikipedia (Vikipedi), the Turkish Dictionary (Türkçe Sözlük) and Wiktionary (Vikisözlük) as sources. We extracted hypernym-hyponym relations using rules that we generated from finite state machines.
  • Publication
    English-Turkish parallel semantic annotation of Penn-Treebank
    (Oficyna Wydawnicza Politechniki Wroclawskiej, 2020) Arıcan, Bilge Nas; Bakay, Özge; Avar, Begüm; Yıldız, Olcay Taner; Ergelen, Özlem
    This paper reports our efforts in constructing a sense-labeled English-Turkish parallel corpus using the traditional method of manual tagging. We tagged a pre-built parallel treebank which was translated from the Penn Treebank corpus. This approach allowed us to generate a resource combining syntactic and semantic information. We provide statistics about the corpus itself as well as information regarding its development process.
  • Publication
    Problems caused by semantic drift in WordNet synset construction
    (Institute of Electrical and Electronics Engineers Inc., 2019-09) Bakay, Özge; Ergelen, Özlem; Yıldız, Olcay Taner
    In this study, we summarize the semantic drift problem that occurs in specific synsets of KeNet, a Turkish WordNet, and that is caused by mis-merging of semantically related lexical items, morphological markings and false part-of-speech (POS) matchings. We present our approach to these problems in order to eliminate the semantic drift. We have re-analyzed the dictionary definitions of the items, placed those that possess different verbal markings into separate synsets, and divided synsets based on the POS of the items in them.
  • Publication
    Türkçe kelime ağı KeNet için arayüz [An interface for the Turkish wordnet KeNet]
    (Institute of Electrical and Electronics Engineers Inc., 2019-04) Özçelik, Rıza; Uludoğan, Gökçe; Parlar, Selen; Bakay, Özge; Ergelen, Özlem; Yıldız, Olcay Taner
    Wordnets are graph data structures that represent the connections between the words of a language by forming synonym sets (synsets) and linking these sets to one another through various semantic relations. While WordNet, the best-known wordnet in natural language processing, was created for English in 1990, the most comprehensive one for Turkish is KeNet, created in 2018. To the best of our knowledge, no user interface has so far been developed for KeNet, which contains 80,000 synsets and 25 different semantic relations. In this work, we present an interface that enables online navigation among synsets in the KeNet graph using semantic relations. With this interface, a phrase can be searched in KeNet, and KeNet can be browsed using relations between synsets such as hypernymy/hyponymy and part-whole relations. In addition, the identifier of the English counterpart of any synset, if one exists, can be displayed, and that synset can be accessed from the WordNet web page.
  • Publication
    On building the largest and cross-linguistic Turkish dependency corpus
    (Institute of Electrical and Electronics Engineers Inc., 2020-10-15) Kuzgun, Aslı; Cesur, Neslihan; Arıcan, Bilge Nas; Özçelik, Merve; Marşan, Büşra; Kara, Neslihan; Aslan, Deniz Baran; Yıldız, Olcay Taner
    In this paper, we introduce the dependency annotation process of the largest and the only cross-linguistic Turkish dependency treebank, which was translated from the original Penn Treebank corpus. Within the scope of this project, 16,400 sentences have been morphologically and semantically annotated, and the dependency relations were annotated manually by a team of linguists. It is hoped that this project will serve as a base for a successful dependency parser and a system which can automatically perform the bi-directional conversion between constituency and dependency trees.
  • Publication
    Unsupervised morphological analysis using tries
    (Springer London, 2012) Ak, Koray; Yıldız, Olcay Taner
    This article presents an unsupervised morphological analysis algorithm to segment words into roots and affixes. The algorithm relies on word occurrences in a given dataset. Target languages are English, Finnish, and Turkish, but the algorithm can be used to segment any word from any language given the wordlists acquired from a corpus consisting of words and word occurrences. In each iteration, the algorithm divides words with respect to occurrences and constructs a new trie for the remaining affixes. Preliminary experimental results on three languages show that our novel algorithm performs better than most of the previous algorithms.
  • Publication
    AnlamVer: Semantic model evaluation dataset for Turkish - word similarity and relatedness
    (Association for Computational Linguistics (ACL), 2018-08-26) Ercan, Gökhan; Yıldız, Olcay Taner
    In this paper, we present AnlamVer, a semantic model evaluation dataset for Turkish designed to evaluate word similarity and word relatedness tasks while discriminating those two relations from each other. Our dataset consists of 500 word-pairs annotated by 12 human subjects, and each pair has two distinct scores for similarity and relatedness. Word-pairs are selected to enable the evaluation of distributional semantic models by multiple attributes of words and word-pair relations such as frequency, morphology, concreteness and relation types (e.g., synonymy, antonymy). Our aim is to provide insights to semantic model researchers by evaluating models on multiple attributes. We balance dataset word-pairs by their frequencies to evaluate the robustness of semantic models concerning out-of-vocabulary and rare-word problems, which are caused by the rich derivational and inflectional morphology of the Turkish language.