Search Results
Showing 1 - 3 of 3
Publication
Extension of conventional co-training learning strategies to three-view and committee-based learning strategies for effective automatic sentence segmentation (IEEE, 2018)
Dalva, Doğan; Güz, Ümit; Gürkan, Hakan
The objective of this work is to develop effective multi-view semi-supervised machine learning strategies for the sentence boundary classification problem when only small sets of sentence-boundary-labeled data are available. We propose three-view and committee-based learning strategies that incorporate co-training algorithms with agreement, disagreement, and self-combined learning strategies, using prosodic, lexical, and morphological information. We compare the experimental results of the proposed three-view and committee-based learning strategies with those of other semi-supervised learning strategies in the literature, namely self-training and co-training with agreement, disagreement, and self-combined strategies. The experimental results show that sentence segmentation performance can be considerably improved using the multi-view learning strategies we propose, since the data sets can be represented by three redundantly sufficient and disjoint feature sets. We show that the proposed strategies substantially improve the average performance when only a small set of manually labeled data is available, for spoken Turkish and English.

Publication
Extraction and comparison of various prosodic feature sets on sentence segmentation task for Turkish broadcast news data (IEEE, 2014)
Dalva, Doğan; Revidi, İzel D.; Güz, Ümit; Gürkan, Hakan
In this work, prosodic features of the Turkish Broadcast News (BN) data are extracted using an open-source prosodic feature extraction tool based on Praat. The profiles and effectiveness of these features are also investigated for the sentence segmentation task on the Turkish BN data. We not only used some combinations of the feature sets but also collected some of them in a single prosodic feature model in order to achieve one of the best performances. The results of the experiments show that some combinations of the prosodic feature sets are very useful for the automatic sentence segmentation task on the Turkish BN data.

Publication
Türkçe haber yayını verileri için bürünsel bilginin çıkarılması ve cümle bölütlemede kullanılması [Extraction of prosodic information for Turkish broadcast news data and its use in sentence segmentation] (IEEE, 2014-04-23)
Dalva, Doğan; Revidi, İzel D.; Güz, Ümit; Gürkan, Hakan
In this work, prosodic features of Turkish broadcast news data are extracted using open-source software, and the performance of the prosodic feature groups in sentence segmentation is compared on text obtained from the output of an Automatic Speech Recognition system. In particular, a prosodic feature set with a considerably high success rate for the sentence segmentation task is obtained.












