Işık Üniversitesi Kurumsal Akademik Belleği :: DSpace Angular

Listeleniyor 1 - 2 / 2

Parallel proposition bank construction for Turkish
(Işık Üniversitesi, 2019-04-02) Ak, Koray; Yıldız, Olcay Taner; Işık Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Doktora Programı
PropBank is the bank of propositions which contains hand-annotated corpus for predicate-argument information and semantic roles or arguments. It aims to provide an extensive dataset for enhancing NLP applications such as information retrieval, machine translation, information extraction, and question answering by adding a semantic information layer to the syntactic annotation. Via the added semantic layer, syntactic parser re?nements can be achieved which increases the e?ciency and improves application performance. The aim of this thesis is to construct proposition bank for Turkish Language. Only preliminary studies were carried out in terms of Turkish PropBank. This study is one of the pioneers for the language. In this study, a hand annotated Turkish PropBank is constructed from the translation of the parallel English PropBank corpus, other PropBank studies for Turkish language examined and compared with the proposition bank constructed, automatic PropBank construction for Turkish from both parallel sentence trees and phrase sentences is analyzed and automatic proposition banks generated for Turkish.
Unsupervised morphological analysis using tries
(Işık Üniversitesi, 2011-04-29) Ak, Koray; Yıldız, Olcay Taner; Işık Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Yüksek Lisans Programı
Morphological analysis or decomposition studies the structure, formation, function of words, identifies the morphemes (smallest meaning-bearing elements) of the language and attempts to formulate rules that model the language. It is widely used in different areas such as speech recognition, machine translation, information retrieval, text understanding, and statistical language modeling. Considering that the natural language processing applications are dealing with large amounts of data, it is not feasible to use linguists to analyze text corpus by hand, the complexity and real time Processing requirements leads to automated morphological analysis. As an alternative to the hand-made systems, there exist algorithms that work unsupervised manner and autonomously do morphological analysis for the words in an unannotated text corpus. In this thesis, an unsupervised leaming algorithm is proposed to extract infor-mation about the text corpus and the model of the language. The proposed algorithm constructs a trie that consists of characters and the occurrences of the words as nodes. The algorithm then detects roots of the given words by examining the occurrences in the path of the word. When the root is revealed, the algorithm creates a new trie from the affix parts, left after the root for each word. The algorithm continues recursively until there is no affbc left to process. Experimental results on three languages (Finnish, English and Turkish) show that our novel algorithm performs better than most of the previous algorithms in the field.

Filtreler

Yazar

Konu

Tarih

Dil

Tür

Kategori

Bölüm

Erişim Hakkı

Tam Metin

Öğe Türü

Ayarlar

Sırala

Sayfa Başına Sonuç

Arama Sonuçları