Constructing a Turkish constituency parse treeBank

Date

2016

Publisher

Springer Verlag

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

In this paper, we describe our initial efforts to create a Turkish constituency parse treebank by utilizing the English Penn Treebank. We employ a semi-automated approach to annotation. In our previous work [18], the English parse trees were manually translated into Turkish. In this paper, the words are first morphologically annotated in a semi-automatic manner. As a second step, a rule-based approach is used to refine the parse trees based on the morphological analyses of the words. We generated Turkish phrase structure trees for 5143 sentences from the Penn Treebank that contain fewer than 15 tokens. The annotated corpus can be used in statistical natural language processing studies to develop tools such as constituency parsers and statistical machine translation systems for Turkish.

Keywords

Communications engineering, Networks, Computational linguistics, Syntactics, Dependency parser

Source

Lecture Notes in Electrical Engineering

WoS Q Value

N/A

Scopus Q Value

Q4

Volume

363

Citation

Yıldız, O. T., Solak, E., Çandır, Ş., Ehsani, R. & Görgün, O. (2016). Constructing a Turkish constituency parse treeBank. Paper presented at the Lecture Notes in Electrical Engineering, 363, 339-347. doi:10.1007/978-3-319-22635-4_31