Search Results
Listing 1 - 3 of 3
Publication: Cascaded model adaptation for dialog act segmentation and tagging (Elsevier Ltd, 2010-04). Güz, Ümit; Tür, Gökhan; Hakkani Tür, Dilek; Cuendet, Sebastien.
There are many speech and language processing problems that require cascaded classification tasks. While model adaptation has been shown to be useful in isolated speech and language processing tasks, it is not clear what constitutes system adaptation for such complex systems. This paper studies the following questions: In cases where a sequence of classification tasks is employed, how important is it to adapt the earlier or latter systems? Is the performance improvement obtained in the earlier stages via adaptation carried over to later stages in cases where the later stages perform adaptation using similar data and/or methods? In this study, as part of a larger-scale multiparty meeting understanding system, we analyze various methods for adapting dialog act segmentation and tagging models trained on conversational telephone speech (CTS) to meeting-style conversations. We investigate the effect of using adapted and unadapted models for dialog act segmentation together with those for tagging, showing the effect of model adaptation for cascaded classification tasks. Our results indicate that we can achieve significantly better dialog act segmentation and tagging by adapting the out-of-domain models, especially when the amount of in-domain data is limited. Experimental results show that it is more effective to adapt the models in the latter classification tasks, in our case dialog act tagging, when dealing with a sequence of cascaded classification tasks.

Publication: A novel biometric identification system based on fingertip electrocardiogram and speech signals (Elsevier Inc., 2022-03). Güven, Gökhan; Güz, Ümit; Gürkan, Hakan.
In this research work, we propose a one-dimensional Convolutional Neural Network (CNN) based biometric identification system that combines speech and ECG modalities.
The aim is to find an effective identification strategy while enhancing both the confidence and the performance of the system. In our first approach, we developed a voting-based ECG and speech fusion system to improve the overall performance compared to conventional methods. In the second approach, we developed a robust rejection algorithm to prevent unauthorized access to the fusion system. We also present a newly developed algorithm for removing ECG spikes and inconsistent beats, which detects and eliminates problems caused by portable fingertip ECG devices and patient movements. Furthermore, we achieved a system that can work with only one authorized user by adding a Universal Background Model to our algorithm. In the first approach, the proposed fusion system achieved a 100% accuracy rate for 90 people, averaged over 3-fold cross-validation. In the second approach, using 90 people as genuine classes and 26 people as impostor classes, the proposed system achieved 92% accuracy in identifying genuine classes and 96% accuracy in rejecting impostor classes.

Publication: Left/right and front/back in sign, speech, and co-speech gestures: what do data from Turkish Sign Language, Croatian Sign Language, American Sign Language, Turkish, Croatian, and English reveal? (Versita, 2011-09). Arık, Engin.
Research has shown that spoken languages differ from each other in their representation of space. Using hands, body, and the physical space in front of signers to represent space, do sign languages differ from each other? To what extent are they similar to spoken languages in their expressions of spatial relations? The present study targeted these questions by exploring the descriptions of static situations in sign languages (Turkish Sign Language, Croatian Sign Language, American Sign Language) and spoken languages, including co-speech gestures (Turkish, Croatian, and English).
It is found that signed and spoken languages differ from each other in their linguistic constructions for the left/right and front/back spatial relations. They also differ from one another in their mapping strategies. Crucially, being a signer does not entail using more direct iconic mappings than a speaker would. It is also found that co-speech gestures can complement spoken language descriptions.
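The first abstract above describes a cascade in which a segmentation stage feeds a tagging stage. A hypothetical toy sketch (not the paper's implementation; the boundary and tagging rules here are invented for illustration) makes the structure and the error-propagation concern concrete: mistakes in the earlier stage become the input distribution of the later stage, which is one intuition for why adapting the later stage pays off.

```python
# Hypothetical sketch of a two-stage cascade in the spirit of dialog act
# segmentation + tagging. NOT the paper's models: both stages here are
# invented rule-based stand-ins for illustration only.

def segment(tokens):
    """Toy boundary detector: closes a segment at sentence punctuation."""
    segments, current = [], []
    for tok in tokens:
        current.append(tok)
        if tok in {".", "?", "!"}:
            segments.append(current)
            current = []
    if current:  # trailing material without a boundary token
        segments.append(current)
    return segments

def tag(segment_tokens):
    """Toy dialog-act tagger: question vs. backchannel vs. statement."""
    text = " ".join(segment_tokens)
    if text.rstrip().endswith("?"):
        return "question"
    if text.strip(".?! ").lower() in {"okay", "right", "yeah"}:
        return "backchannel"
    return "statement"

def cascade(tokens):
    """Stage 2 consumes stage 1's output, so stage 1 errors propagate."""
    return [(seg, tag(seg)) for seg in segment(tokens)]

utterance = "okay . did you see the report ? it looks good .".split()
for seg, act in cascade(utterance):
    print(act, ":", " ".join(seg))
```

Because the tagger only ever sees segments the segmenter produced, the paper's finding that adapting the latter stage matters most corresponds here to making `tag` robust to imperfect output of `segment`.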

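The second abstract above combines ECG and speech decisions by voting and adds a rejection rule against impostors. A minimal decision-level sketch, assuming each modality classifier emits an identity plus a confidence score in [0, 1] (the threshold, identity labels, and combination rule are illustrative assumptions, not the authors' CNN system):

```python
# Hypothetical decision-level fusion of two biometric modalities with a
# simple impostor-rejection rule. Illustrative only; the real system in
# the abstract uses 1-D CNNs and a Universal Background Model.

def fuse(ecg_pred, speech_pred, reject_threshold=0.6):
    """Each prediction is (identity, confidence in [0, 1])."""
    (ecg_id, ecg_conf), (sp_id, sp_conf) = ecg_pred, speech_pred
    # Agreement: accept if the averaged confidence clears the threshold.
    if ecg_id == sp_id:
        combined = (ecg_conf + sp_conf) / 2
        return ecg_id if combined >= reject_threshold else "REJECT"
    # Disagreement: back the more confident modality, but only if it is
    # confident enough on its own; otherwise reject as a likely impostor.
    best_id, best_conf = max([ecg_pred, speech_pred], key=lambda p: p[1])
    return best_id if best_conf >= reject_threshold else "REJECT"

print(fuse(("user07", 0.9), ("user07", 0.8)))  # both modalities agree
print(fuse(("user07", 0.9), ("user12", 0.4)))  # disagree, ECG dominates
print(fuse(("user07", 0.5), ("user12", 0.4)))  # weak evidence overall
```

The rejection branch is what distinguishes identification-with-rejection from plain closed-set identification: an input that no modality supports confidently is refused rather than forced onto the nearest enrolled identity.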











