Search Results
Showing 1 - 3 of 3
Publication: Chunking in Turkish with conditional random fields (Springer-Verlag, 2015-04-14)
Yıldız, Olcay Taner; Solak, Ercan; Ehsani, Razieh; Görgün, Onur
In this paper, we report our work on chunking in Turkish. We used data that we generated by manually translating a subset of the Penn Treebank. We exploited the tags already available in the trees to automatically identify and label chunks in their Turkish translations. We used conditional random fields (CRF) to train a model over the annotated data. We report our results at different levels of chunk resolution.

Publication: Constructing a Turkish-English parallel treebank (Association for Computational Linguistics (ACL), 2014)
Yıldız, Olcay Taner; Solak, Ercan; Görgün, Onur; Ehsani, Razieh
In this paper, we report our preliminary efforts in building an English-Turkish parallel treebank corpus for statistical machine translation. In the corpus, we manually generated parallel trees for about 5,000 sentences from the Penn Treebank. English sentences in our set have a maximum of 15 tokens, including punctuation. We constrained the translated trees to the reordering of the children and the replacement of the leaf nodes with appropriate glosses. We also report the tools that we built and used in our tree translation task.

Publication: An all-words sense annotated Turkish corpus (IEEE, 2018-06-06)
Akçakaya, Sinan; Yıldız, Olcay Taner
This paper reports our efforts in constructing a sense-labeled Turkish corpus with respect to the Turkish Language Institution's dictionary, using the traditional method of manual tagging. We tagged a pre-built parallel treebank that was translated from the Penn Treebank II corpus. This approach allowed us to generate a full-coverage resource in which syntactic and semantic information are merged. We also provide miscellaneous statistics about the corpus itself as well as its development process.












