Search Results

Showing 1 - 8 of 8
  • Publication
    Cascaded model adaptation for dialog act segmentation and tagging
    (Elsevier Ltd, 2010-04) Güz, Ümit; Tür, Gökhan; Hakkani Tür, Dilek; Cuendet, Sebastien
    There are many speech and language processing problems that require cascaded classification tasks. While model adaptation has been shown to be useful in isolated speech and language processing tasks, it is not clear what constitutes system adaptation for such complex systems. This paper studies the following questions: In cases where a sequence of classification tasks is employed, how important is it to adapt the earlier or later systems? Is the performance improvement obtained in the earlier stages via adaptation carried over to later stages in cases where the later stages perform adaptation using similar data and/or methods? In this study, as part of a larger scale multiparty meeting understanding system, we analyze various methods for adapting dialog act segmentation and tagging models trained on conversational telephone speech (CTS) to meeting style conversations. We investigate the effect of combining adapted and unadapted models for dialog act segmentation with those for tagging, showing the effect of model adaptation for cascaded classification tasks. Our results indicate that we can achieve significantly better dialog act segmentation and tagging by adapting the out-of-domain models, especially when the amount of in-domain data is limited. Experimental results show that it is more effective to adapt the models in the later classification tasks, in our case dialog act tagging, when dealing with a sequence of cascaded classification tasks.
  • Publication
    Effective semi-supervised learning strategies for automatic sentence segmentation
    (Elsevier Science BV, 2018-04-01) Dalva, Doğan; Güz, Ümit; Gürkan, Hakan
    The primary objective of the sentence segmentation process is to determine the sentence boundaries in a stream of words output by automatic speech recognizers. Statistical methods developed for sentence segmentation require a significant amount of labeled data, which is time-consuming, labor-intensive, and expensive to produce. In this work, we propose new multi-view semi-supervised learning strategies for the sentence boundary classification problem using lexical, prosodic, and morphological information. The aim is to find effective semi-supervised machine learning strategies when only small sets of sentence boundary labeled data are available. We primarily investigate two semi-supervised learning approaches, called self-training and co-training. Different example selection strategies were also used for co-training, namely agreement, disagreement, and self-combined. Furthermore, we propose three-view and committee-based algorithms incorporating the agreement, disagreement, and self-combined strategies using three disjoint feature sets. We present comparative results of different learning strategies on the sentence segmentation task. The experimental results show that sentence segmentation performance can be highly improved using the multi-view learning strategies that we propose, since the data sets can be represented by three redundantly sufficient and disjoint feature sets. We show that the proposed strategies substantially improve the average baseline F-measure from 67.66% to 75.15% and from 64.84% to 66.32% when only a small set of manually labeled data is available for Turkish and English spoken languages, respectively.
  • Publication
    A novel biometric identification system based on fingertip electrocardiogram and speech signals
    (Elsevier Inc., 2022-03) Güven, Gökhan; Güz, Ümit; Gürkan, Hakan
    In this research work, we propose a one-dimensional Convolutional Neural Network (CNN) based biometric identification system that combines speech and ECG modalities. The aim is to find an effective identification strategy while enhancing both the confidence and the performance of the system. In our first approach, we have developed a voting-based ECG and speech fusion system to improve the overall performance compared to the conventional methods. In the second approach, we have developed a robust rejection algorithm to prevent unauthorized access to the fusion system. We also present a newly developed algorithm for removing ECG spikes and inconsistent beats, which detects and eliminates the problems caused by portable fingertip ECG devices and patient movements. Furthermore, we have achieved a system that can work with only one authorized user by adding a Universal Background Model to our algorithm. In the first approach, the proposed fusion system achieved a 100% accuracy rate for 90 people, averaged over 3-fold cross-validation. In the second approach, by using 90 people as genuine classes and 26 people as imposter classes, the proposed system achieved 92% accuracy in identifying genuine classes and 96% accuracy in rejecting imposter classes.
  • Publication
    Extension of conventional co-training learning strategies to three-view and committee-based learning strategies for effective automatic sentence segmentation
    (IEEE, 2018) Dalva, Doğan; Güz, Ümit; Gürkan, Hakan
    The objective of this work is to develop effective multi-view semi-supervised machine learning strategies for the sentence boundary classification problem when only small sets of sentence boundary labeled data are available. We propose three-view and committee-based learning strategies that incorporate co-training algorithms with agreement, disagreement, and self-combined learning strategies using prosodic, lexical, and morphological information. We compare experimental results of the proposed three-view and committee-based learning strategies to other semi-supervised learning strategies in the literature, namely self-training and co-training with agreement, disagreement, and self-combined strategies. The experimental results show that sentence segmentation performance can be highly improved using the multi-view learning strategies that we propose, since the data sets can be represented by three redundantly sufficient and disjoint feature sets. We show that the proposed strategies substantially improve the average performance when only a small set of manually labeled data is available for Turkish and English spoken languages.
  • Publication
    A new speech modeling method: SYMPES
    (IEEE, 2006) Güz, Ümit; Gürkan, Hakan; Yarman, Bekir Sıddık Binboğa
    In this paper, a new speech modeling method called SYMPES is introduced and compared with commercially available methods. It is shown that, for the same compression ratio or better, SYMPES yields considerably better hearing quality than coders such as G.726 at 16 kbps and voice-excited LPC-10E at 2.4 kbps.
  • Publication
    A new coding method for speech and audio signals
    (IEEE, 2005) Güz, Ümit; Gürkan, Hakan; Yarman, Bekir Sıddık Binboğa
    In this paper a new representation or modeling method of speech signals is introduced. The proposed method is based on the generation of the so-called Predefined Signature S = {S_R} and Envelope vector E = {E_K} Sets (PSEVS). These vector sets are speaker and language independent. In this method, once the speech signals are divided into frames with selected lengths, then each frame signal piece X_i is reconstructed by means of the mathematical form of X_i = C_i E_K S_R. In this representation, C_i is called the frame coefficient, S_R and E_K are the vectors properly assigned from the PSEVS respectively. It is shown that the proposed method provides fast reconstruction and substantial compression ratio with acceptable hearing quality.
  • Publication
    EEG signal compression based on classified signature and envelope vector sets
    (IEEE Computer Society, 2007) Gürkan, Hakan; Güz, Ümit; Yarman, Bekir Sıddık Binboğa
    In this paper, a novel method for compressing electroencephalogram (EEG) signals is proposed. The proposed method is based on the generation of Classified Signature and Envelope Vector Sets (CSEVS) by using an effective k-means clustering algorithm. In this work, on a frame basis, any EEG signal is modeled by multiplying three parameters called the Classified Signature Vector, the Classified Envelope Vector, and the Frame-Scaling Coefficient. In this case, the EEG signal for each frame is described in terms of the two indices R and K of the CSEVS and the frame-scaling coefficient. The proposed method is assessed through the use of root-mean-square error (RMSE) and visual inspection measures. The proposed method achieves good compression ratios with a low level of reconstruction error while preserving diagnostic information in the reconstructed EEG signal.
  • Publication
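    Extraction of prosodic information for Turkish broadcast news data and its use in sentence segmentation (original title: Türkçe haber yayını verileri için bürünsel bilginin çıkarılması ve cümle bölütlemede kullanılması)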
    (IEEE, 2014-04-23) Dalva, Doğan; Revidi, İzel D.; Güz, Ümit; Gürkan, Hakan
    In this study, prosodic features of Turkish broadcast news data are extracted using open-source software, and the performance of prosodic feature groups in sentence segmentation is compared on text obtained from the output of an Automatic Speech Recognition system. In particular, a prosodic feature set with a considerably high success rate for the sentence segmentation task is obtained.
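
The frame model in the entry "A new coding method for speech and audio signals" (each frame is rebuilt from a frame coefficient, one envelope vector, and one signature vector) can be illustrated with a small sketch. It rests on assumptions that the abstract does not state: the product of the envelope and signature vectors is read as an element-wise product, the two vector sets are tiny made-up examples, and the best pair is found by exhaustive search with a least-squares coefficient. The function names reconstruct_frame and encode_frame are hypothetical, and this is not presented as the authors' procedure.

```python
from typing import List, Sequence, Tuple


def reconstruct_frame(c: float,
                      envelope: Sequence[float],
                      signature: Sequence[float]) -> List[float]:
    """Rebuild one frame as c * (envelope * signature), element-wise.

    Assumption: the envelope/signature product is taken sample by sample.
    """
    if len(envelope) != len(signature):
        raise ValueError("envelope and signature must have the same length")
    return [c * e * s for e, s in zip(envelope, signature)]


def encode_frame(frame: Sequence[float],
                 envelopes: Sequence[Sequence[float]],
                 signatures: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (c, k, r): frame coefficient, envelope index, signature index.

    Assumption: exhaustive search over all (k, r) pairs, choosing for each
    pair the coefficient that minimizes the squared error, then keeping the
    pair with the lowest error.
    """
    best = None  # (error, c, k, r)
    for k, env in enumerate(envelopes):
        for r, sig in enumerate(signatures):
            basis = [e * s for e, s in zip(env, sig)]
            energy = sum(b * b for b in basis)
            if energy == 0.0:
                continue  # an all-zero basis cannot represent anything
            # Least-squares coefficient for this envelope/signature pair.
            c = sum(x * b for x, b in zip(frame, basis)) / energy
            err = sum((x - c * b) ** 2 for x, b in zip(frame, basis))
            if best is None or err < best[0]:
                best = (err, c, k, r)
    if best is None:
        raise ValueError("no usable envelope/signature pair")
    _, c, k, r = best
    return c, k, r


# Toy (hypothetical) vector sets, four samples per frame.
envelopes = [[1.0, 1.0, 1.0, 1.0], [1.0, 2.0, 2.0, 1.0]]
signatures = [[1.0, 0.0, -1.0, 0.0], [1.0, 1.0, 1.0, 1.0]]

frame = reconstruct_frame(0.5, envelopes[1], signatures[0])
c, k, r = encode_frame(frame, envelopes, signatures)
decoded = reconstruct_frame(c, envelopes[k], signatures[r])
```

Under this reading, a frame is stored as just three values (one coefficient and two indices) rather than all of its samples. The EEG entry above describes its frames in the same terms, with indices R and K plus a frame-scaling coefficient.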