8 sonuçlar
Arama Sonuçları
Listeleniyor 1 - 8 / 8
Yayın TRopBank: Turkish PropBank V2.0(European Language Resources Association (ELRA), 2020-05-16) Kara, Neslihan; Aslan, Deniz Baran; Marşan, Büşra; Bakay, Özge; Ak, Koray; Yıldız, Olcay TanerIn this paper, we present and explain TRopBank “Turkish PropBank v2.0”. PropBank is a hand-annotated corpus of propositions which is used to obtain the predicate-argument information of a language. Predicate-argument information of a language can help understand semantic roles of arguments. “Turkish PropBank v2.0”, unlike PropBank v1.0, has a much more extensive list of Turkish verbs, with 17.673 verbs in total.Yayın Integrating Turkish Wordnet KeNet to Princeton WordNet: The case of one-to-many correspondences(Institute of Electrical and Electronics Engineers Inc., 2019-10) Bakay, Özge; Ergelen, Özlem; Yıldız, Olcay TanerIn this paper, we introduce a novel approach of forming interlingual relations between multilingual wordnets. We have mapped Turkish senses in KeNet with their corresponding senses in Princeton WordNet by drawing one-To-many correspondences. As a result of language-specific properties, one synset in one language is matched with multiple synsets in the other language in some cases. Our method of integrating KeNet into a multilingual network also included mapping the most frequent 5000 senses in English with their equivalent senses in Turkish. What we demonstrate is that one-To-many interlingual correspondances are necessary to include in mappings both from Turkish-To-English and English-To-Turkish. Furthermore, one-To-many mappings give us insights into the semantic relations to be constructed in Turkish, such as hypernymy.Yayın English-Turkish parallel semantic annotation of Penn-Treebank(Oficyna Wydawnicza Politechniki Wroclawskiej, 2020) Arıcan, Bilge Nas; Bakay, Özge; Avar, Begüm; Yıldız, Olcay Taner; Ergelen, ÖzlemThis paper reports our efforts in constructing a sense-labeled English-Turkish parallel corpus using the traditional method of manual tagging. We tagged a pre-built parallel treebank which was translated from the Penn Treebank corpus. This approach allowed us to generate a resource combining syntactic and semantic information. We provide statistics about the corpus itself as well as information regarding its development process.Yayın Comparison of Turkish proposition banks by frame matching(IEEE, 2018-12-06) Ak, Koray; Bakay, Özge; Yıldız, Olcay TanerBy indicating semantic relations between a predicate and its associated participants in a sentence and identifying the role-bearing constituents, SRL provides an extensive dataset to understand natural languages and to enhance several NLP applications such as information retrieval, machine translation, information extraction, and question answering. The availability of large resources and the development of statistical machine learning methods have increased the studies in the field of SRL. One of the widely-used semantic resources applied for multiple languages is PropBank. In this paper, PropBanks applied for Turkish are compared by checking semantic roles in the frame files of matched verb senses. As this integrated lexical resource for Turkish is aimed to be used in a multilingual resource along with English, creation of an inclusive lexical resource for Turkish is of great importance.Yayın Comparing sense categorization between English propbank and english wordnet(Oficyna Wydawnicza Politechniki Wroclawskiej, 2020) Bakay, Özge; Avar, Begüm; Yıldız, Olcay TanerGiven the fact that verbs play a crucial role in language comprehension, this paper presents a study which compares the verb senses in English PropBank with the ones in English WordNet through manual tagging. After analyzing 1554 senses in 1453 distinct verbs, we have found out that while the majority of the senses in PropBank have their one-to-one correspondents in WordNet, a substantial amount of them are differentiated. Furthermore, by analysing the differences between our manually-tagged and an automatically-tagged resource, we claim that manual tagging can help provide better results in sense annotation.Yayın Problems caused by semantic drift in WordNet synset construction(Institute of Electrical and Electronics Engineers Inc., 2019-09) Bakay, Özge; Ergelen, Özlem; Yıldız, Olcay TanerIn this study, we summarize the semantic drift problem that occur in specific synsets of KeNet, a Turkish WordNet, which is caused by mis-merging of semantically-related lexical items, morphological markings and false part of speech (POS) matchings. We present our approach to these problems in order to eliminate the semantic drift. We have re-analyzed the dictionary definitions of the items, placed those that possess different verbal markings into separate synsets, and divided synsets based on the POS of the items in them.Yayın Türkçe kelime ağı KeNet için arayüz(Institute of Electrical and Electronics Engineers Inc., 2019-04) Özçelik, Rıza; Uludoğan, Gökçe; Parlar, Selen; Bakay, Özge; Ergelen, Özlem; Yıldız, Olcay TanerKelime ağları, bir dildeki kelimeler arasındaki bağlantıları, eş anlam kümeleri oluşturarak ve bu kümeleri birbirine çeşitli anlamsal bağıntılar ile bağlayarak temsil eden bir çizge veri yapısıdır. Doğal dil işleme alanındaki en yaygın bilinen kelime ağı WordNet 1990 yılında İngilizce için oluşturulmuşken, Türkçe için en kapsamlı ağ, 2018 yılında oluşturulan KeNet’tir. Bildiğimiz kadarıyla, içinde 80000 eş anlam kümesi ve 25 farklı anlamsal bağlantı bulunan KeNet için şu ana kadar geliştirilen bir kullanıcı arayüzü yoktur. Bu çalışmada, KeNet çizgesinde, anlamsal bağlantıları kullanarak eş anlam kümeleri arasında çevrimiçi olarak gezinmeyi sağlayan bir arayüz sunuyoruz. Bu arayüz sayesinde, bir söz öbeği KeNet’te aranabilir ve eş anlam kümeleri arasındaki üst/alt anlam, parça-bütün ilişkileri gibi ilişkiler kullanılarak KeNet üzerinde gezilebilir. Ayrıca, herhangi bir eş anlam kümesinin, varsa, İngilizce karşılığının kimliği de görüntülenebilir ve bu kümeye WordNet’e ait internet sayfasından erişilebilir.Yayın A tree-based approach for English-to-Turkish translation(Tubitak Scientific & Technical Research Council Turkey, 2019) Bakay, Özge; Avar, Begüm; Yıldız, Olcay TanerIn this paper, we present our English-to-Turkish translation methodology, which adopts a tree-based approach. Our approach relies on tree analysis and the application of structural modification rules to get the target side (Turkish) trees from source side (English) ones. We also use morphological analysis to get candidate root words and apply tree-based rules to obtain the agglutinated target words. Compared to earlier work on English-to-Turkish translation using phrase-based models, we have been able to obtain higher BLEU scores in our current study. Our syntactic subtree permutation strategy, combined with a word replacement algorithm, provides a 67% relative improvement from a baseline 12.8 to 21.4 BLEU, all averaged over 10-fold cross-validation. As future work, improvements in choosing the correct senses and structural rules are needed.












