Search Results
Showing 1 - 6 of 6

Publication
TRopBank: Turkish PropBank V2.0 (European Language Resources Association (ELRA), 2020-05-16)
Kara, Neslihan; Aslan, Deniz Baran; Marşan, Büşra; Bakay, Özge; Ak, Koray; Yıldız, Olcay Taner
In this paper, we present and explain TRopBank, "Turkish PropBank v2.0". PropBank is a hand-annotated corpus of propositions that is used to obtain the predicate-argument information of a language. This information helps in understanding the semantic roles of arguments. Unlike PropBank v1.0, "Turkish PropBank v2.0" has a much more extensive list of Turkish verbs, with 17,673 verbs in total.

Publication
Construction of a Turkish proposition bank (Tubitak Scientific & Technical Research Council Turkey, 2018)
Ak, Koray; Toprak, Cansu; Esgel, Volkan; Yıldız, Olcay Taner
This paper describes our approach to developing the Turkish PropBank by adopting the semantic role-labeling guidelines of the original PropBank and using the translation of the English Penn Treebank as a resource. We discuss the semantic annotation process of the PropBank and language-specific cases for Turkish, the tools we have developed for annotation, and quality control for multiuser annotation. In the current phase of the project, more than 9500 sentences have been semantically analyzed and predicate-argument information has been extracted for 1330 verbs and 1914 verb senses. Our plan is to annotate 17,000 sentences by the end of 2017.

Publication
Comparison of Turkish proposition banks by frame matching (IEEE, 2018-12-06)
Ak, Koray; Bakay, Özge; Yıldız, Olcay Taner
By indicating semantic relations between a predicate and its associated participants in a sentence and identifying the role-bearing constituents, semantic role labeling (SRL) provides an extensive dataset to understand natural languages and to enhance several NLP applications such as information retrieval, machine translation, information extraction, and question answering.
The availability of large resources and the development of statistical machine learning methods have increased the number of studies in the field of SRL. One of the widely used semantic resources applied to multiple languages is PropBank. In this paper, the PropBanks built for Turkish are compared by checking the semantic roles in the frame files of matched verb senses. As this integrated lexical resource for Turkish is intended for use in a multilingual resource along with English, the creation of an inclusive lexical resource for Turkish is of great importance.

Publication
Unsupervised morphological analysis using tries (Springer London, 2012)
Ak, Koray; Yıldız, Olcay Taner
This article presents an unsupervised morphological analysis algorithm to segment words into roots and affixes. The algorithm relies on word occurrences in a given dataset. Target languages are English, Finnish, and Turkish, but the algorithm can be used to segment any word from any language given the wordlists acquired from a corpus consisting of words and word occurrences. In each iteration, the algorithm divides words with respect to occurrences and constructs a new trie for the remaining affixes. Preliminary experimental results on three languages show that our novel algorithm performs better than most of the previous algorithms.

Publication
Automatic propbank generation for Turkish (Incoma Ltd, 2019-09)
Ak, Koray; Yıldız, Olcay Taner
Semantic role labeling (SRL) is an important task for understanding natural languages, where the objective is to analyse propositions expressed by the verb and to identify each word that bears a semantic role. It provides an extensive dataset to enhance NLP applications such as information retrieval, machine translation, information extraction, and question answering. However, creating SRL models is difficult. In some languages, it is even infeasible to create SRL models that have predicate-argument structure due to the lack of linguistic resources.
In this paper, we present our method to create an automatic Turkish PropBank by exploiting parallel data from the translated sentences of English PropBank. Experiments show that our method gives promising results. © 2019 Association for Computational Linguistics (ACL).

Publication
A multilayer annotated corpus for Turkish (IEEE, 2018-06-06)
Yıldız, Olcay Taner; Ak, Koray; Ercan, Gökhan; Topsakal, Ozan; Asmazoğlu, Cengiz
In this paper, we present the first multilayer annotated corpus for Turkish, which is a low-resourced agglutinative language. Our dataset consists of 9,600 sentences translated from the Penn Treebank Corpus. Annotated layers contain syntactic and semantic information including morphological disambiguation of words, named entity annotation, shallow parse, sense annotation, and semantic role label annotation.












