Search Results

Showing 1 - 4 of 4
  • Publication
    Construction of a Turkish proposition bank
    (Tubitak Scientific & Technical Research Council Turkey, 2018) Ak, Koray; Toprak, Cansu; Esgel, Volkan; Yıldız, Olcay Taner
    This paper describes our approach to developing the Turkish PropBank by adopting the semantic role-labeling guidelines of the original PropBank and using the translation of the English Penn-TreeBank as a resource. We discuss the semantic annotation process of the PropBank and language-specific cases for Turkish, the tools we have developed for annotation, and quality control for multiuser annotation. In the current phase of the project, more than 9500 sentences are semantically analyzed and predicate-argument information is extracted for 1330 verbs and 1914 verb senses. Our plan is to annotate 17,000 sentences by the end of 2017.
  • Publication
    Parallel proposition bank construction for Turkish
    (Işık Üniversitesi, 2019-04-02) Ak, Koray; Yıldız, Olcay Taner; Işık Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Doktora Programı
    PropBank is a bank of propositions: a hand-annotated corpus of predicate-argument information and semantic roles. It aims to provide an extensive dataset for enhancing NLP applications such as information retrieval, machine translation, information extraction, and question answering by adding a semantic information layer to the syntactic annotation. The added semantic layer enables syntactic parser refinements, which increase efficiency and improve application performance. The aim of this thesis is to construct a proposition bank for the Turkish language. Only preliminary studies had been carried out on a Turkish PropBank, so this study is one of the pioneering efforts for the language. In this study, a hand-annotated Turkish PropBank is constructed from the translation of the parallel English PropBank corpus; other PropBank studies for Turkish are examined and compared with the constructed proposition bank; and automatic PropBank construction for Turkish from both parallel sentence trees and phrase sentences is analyzed, and automatic proposition banks are generated for Turkish.
  • Publication
    Unsupervised morphological analysis using tries
    (Işık Üniversitesi, 2011-04-29) Ak, Koray; Yıldız, Olcay Taner; Işık Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Yüksek Lisans Programı
    Morphological analysis, or decomposition, studies the structure, formation, and function of words, identifies the morphemes (smallest meaning-bearing elements) of the language, and attempts to formulate rules that model the language. It is widely used in areas such as speech recognition, machine translation, information retrieval, text understanding, and statistical language modeling. Because natural language processing applications deal with large amounts of data, it is not feasible for linguists to analyze text corpora by hand; the complexity and real-time processing requirements call for automated morphological analysis. As an alternative to hand-made systems, there exist algorithms that work in an unsupervised manner and autonomously perform morphological analysis for the words in an unannotated text corpus. In this thesis, an unsupervised learning algorithm is proposed to extract information about the text corpus and the model of the language. The proposed algorithm constructs a trie that consists of characters and the occurrences of the words as nodes. The algorithm then detects the roots of the given words by examining the occurrences in the path of the word. When the root is revealed, the algorithm creates a new trie from the affix parts left after the root for each word. The algorithm continues recursively until there is no affix left to process. Experimental results on three languages (Finnish, English and Turkish) show that our novel algorithm performs better than most of the previous algorithms in the field.
  • Publication
    Automatic propbank generation for Turkish
    (Incoma Ltd, 2019-09) Ak, Koray; Yıldız, Olcay Taner
    Semantic role labeling (SRL) is an important task for understanding natural languages, where the objective is to analyse propositions expressed by the verb and to identify each word that bears a semantic role. It provides an extensive dataset to enhance NLP applications such as information retrieval, machine translation, information extraction, and question answering. However, creating SRL models is difficult. In some languages, it is even infeasible to create SRL models that have predicate-argument structure, due to a lack of linguistic resources. In this paper, we present our method to create an automatic Turkish PropBank by exploiting parallel data from the translated sentences of English PropBank. Experiments show that our method gives promising results. © 2019 Association for Computational Linguistics (ACL).
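
The abstract of "Unsupervised morphological analysis using tries" above describes a trie of characters with occurrence counts, root detection along each word's path, and a recursive pass over the leftover affixes. The sketch below is one possible reading of that outline, not the thesis's actual method: the abstract does not say how occurrences decide the root, so the rule used here (cut at the longest proper prefix that is shared by at least `min_support` words and at which the trie branches or a word ends) is an assumption, and the names `TrieNode`, `build_trie`, `split_root`, `segment`, and `min_support` are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # next character -> TrieNode
        self.count = 0       # word occurrences passing through this node
        self.is_end = False  # True if some word ends exactly here


def build_trie(words):
    """Insert every word occurrence, counting visits per node."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
            node.count += 1
        node.is_end = True
    return root


def split_root(trie, word, min_support=2):
    """Split word into (root, affix).

    ASSUMED heuristic: the root is the longest proper prefix seen in at
    least min_support occurrences where the trie branches or a word ends.
    If no such prefix exists, the whole word is treated as the root.
    """
    node = trie
    cut = len(word)
    for i, ch in enumerate(word[:-1], start=1):
        node = node.children[ch]
        branching = len(node.children) > 1 or node.is_end
        if node.count >= min_support and branching:
            cut = i
    return word[:cut], word[cut:]


def segment(words, min_support=2):
    """Recursively segment words: find roots, then re-run on the affixes."""
    words = [w for w in words if w]
    if not words:
        return {}
    trie = build_trie(words)
    splits = {w: split_root(trie, w, min_support) for w in set(words)}
    # Build the next trie from the affix parts left after each root.
    affixes = [splits[w][1] for w in words if splits[w][1]]
    affix_segments = segment(affixes, min_support)
    return {
        w: [root] + affix_segments.get(affix, [])
        for w, (root, affix) in splits.items()
    }


words = ["walk", "walks", "walked", "walking", "talk", "talks", "talked"]
result = segment(words)
# result["walked"] is ["walk", "ed"]; result["walk"] is ["walk"]
```

Each recursive call works on strictly shorter strings (every root is at least one character long), so the recursion stops once no affix is left, which mirrors the termination condition stated in the abstract.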