Search Results

Showing 1 - 7 of 7
  • Publication
    Model adaptation for dialog act tagging
    (IEEE, 2006) Tür, Gökhan; Güz, Ümit; Hakkani Tür, Dilek
    In this paper, we analyze the effect of model adaptation for dialog act tagging. The goal of adaptation is to improve the performance of the tagger using out-of-domain data or models. Dialog act tagging aims to provide a basis for further discourse analysis and understanding in conversational speech. In this study we used the ICSI meeting corpus with high-level meeting recognition dialog act (MRDA) tags, that is, question, statement, backchannel, disruptions, and floor grabbers/holders. We performed controlled adaptation experiments using the Switchboard (SWBD) corpus with SWBD-DAMSL tags as the out-of-domain corpus. Our results indicate that we can achieve significantly better dialog act tagging by automatically selecting a subset of the Switchboard corpus and combining the confidences obtained by both in-domain and out-of-domain models via logistic regression, especially when the in-domain data is limited.
  • Publication
    Comparison of Turkish proposition banks by frame matching
    (IEEE, 2018-12-06) Ak, Koray; Bakay, Özge; Yıldız, Olcay Taner
    By indicating semantic relations between a predicate and its associated participants in a sentence and identifying the role-bearing constituents, semantic role labeling (SRL) provides an extensive dataset to understand natural languages and to enhance several NLP applications such as information retrieval, machine translation, information extraction, and question answering. The availability of large resources and the development of statistical machine learning methods have increased the number of studies in the field of SRL. One of the widely used semantic resources applied to multiple languages is PropBank. In this paper, PropBanks built for Turkish are compared by checking semantic roles in the frame files of matched verb senses. As this integrated lexical resource for Turkish is intended to be used in a multilingual resource along with English, the creation of an inclusive lexical resource for Turkish is of great importance.
  • Publication
    A new speech modeling method: SYMPES
    (IEEE, 2006) Güz, Ümit; Gürkan, Hakan; Yarman, Bekir Sıddık Binboğa
    In this paper, a new speech modeling method called SYMPES is introduced and compared with commercially available methods. It is shown that, at the same or a better compression ratio, SYMPES yields considerably better hearing quality than coders such as G.726 at 16 kbps and voice-excited LPC-10E at 2.4 kbps.
  • Publication
    A novel method to represent the speech signals by using language and speaker independent predefined functions sets
    (IEEE, 2004) Güz, Ümit; Gürkan, Hakan; Yarman, Bekir Sıddık Binboğa
    In this paper a new modeling method of speech signals is introduced. The proposed method is based on the generation of the so-called Predefined Signature $S = \{s_R(t)\}$ and Envelope Function $E = \{e_K(t)\}$ Sets (PSEFS). These function sets are independent of any speaker and any language. Once the speech signals are divided into frames with selected lengths, then each frame signal piece $x_i(t)$ is synthesized by means of the mathematical form of $x_i(t) = C_i \, e_K(t) \, s_R(t)$. In this representation, $C_i$ is called the frame coefficient, $s_R(t)$ and $e_K(t)$ are properly assigned from the PSEFS respectively. It is shown that the proposed method provides fast reconstruction and substantial compression with acceptable hearing quality.
  • Publication
    Assessing ChatGPT's accuracy in dyslexia inquiry
    (Institute of Electrical and Electronics Engineers Inc., 2024) Eroğlu, Günet; Harb, Mhd Raja Abou
    Dyslexia poses challenges in accessing reliable information, crucial for affected individuals and their families. Leveraging chatbot technology offers promise in this regard. This study evaluates the OpenAI Assistant's precision in addressing dyslexia-related inquiries. Three hundred questions commonly posed by parents were categorized and presented to the Assistant. Expert evaluation of responses, graded on accuracy and completeness, yielded consistently high scores (median=5). Descriptive questions scored higher (average=4.9568) than yes/no questions (average=4.8957), indicating potential response challenges. Statistical analysis highlighted the significance of question specificity in response quality. Despite occasional difficulties, the Assistant demonstrated adaptability and reliability in providing accurate dyslexia-related information.
  • Publication
    TR-MMLU benchmark for large language models: performance evaluation, challenges, and opportunities for improvement
    (Institute of Electrical and Electronics Engineers Inc., 2025-08-15) Bayram, M. Ali; Fincan, Ali Arda; Gümüş, Ahmet Semih; Diri, Banu; Yıldırım, Savaş; Aytaş, Öner
    Language models have made significant progress in understanding and generating human language and have achieved remarkable success in many applications. However, evaluation for resource-limited languages such as Turkish remains a significant challenge. To address this problem, we introduce the Turkish MMLU (TR-MMLU) benchmark, a comprehensive evaluation framework for assessing the linguistic and conceptual capabilities of large language models (LLMs) in Turkish. TR-MMLU is built on a carefully curated dataset of 6,200 multiple-choice questions across 62 sections from the Turkish education system. The benchmark offers a standard framework for Turkish natural language processing (NLP) research and enables detailed analysis of how well large language models process Turkish text. In our study, we evaluated state-of-the-art large language models on TR-MMLU and highlighted the areas of model design that need improvement. TR-MMLU sets a new standard for advancing Turkish NLP research and inspiring future innovations.
  • Publication
    TurkEmbed4Retrieval: a retrieval-specific embedding model for Turkish
    (Institute of Electrical and Electronics Engineers Inc., 2025-08-15) Ezerceli, Özay; Gümüşçekiçci, Gizem; Erkoç, Tuğba; Özenç, Berke
    In this work, we introduce TurkEmbed4Retrieval, a model that adapts TurkEmbed, originally developed for Natural Language Inference (NLI) and Semantic Textual Similarity (STS) tasks, to retrieval tasks by fine-tuning it on the MS-Marco-TR dataset. The model is optimized with advanced training techniques such as Matryoshka representation learning and a tailored multiple negatives ranking loss. Extensive experiments show that TurkEmbed4Retrieval outperforms the TurkishcolBERT model on the Scifact-TR dataset by 19–26% on retrieval metrics. In this respect, our model sets a new bar for Turkish information retrieval systems.
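
The last abstract mentions Matryoshka representation learning, which trains embeddings so that their leading coordinates remain usable on their own. The sketch below illustrates only how such prefixes might be used at retrieval time: truncate each vector to its first `dim` coordinates, re-normalize, and rank by cosine similarity. It is not taken from the paper; the function names and the toy 4-dimensional vectors are hypothetical and chosen purely for illustration.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` coordinates and rescale them to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def rank_documents(query, docs, dim):
    """Return document indices sorted by cosine similarity on a `dim`-sized prefix."""
    q = truncate_and_normalize(query, dim)
    scores = []
    for idx, doc in enumerate(docs):
        d = truncate_and_normalize(doc, dim)
        scores.append((sum(a * b for a, b in zip(q, d)), idx))
    # Highest similarity first; ties are broken by the lower index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [idx for _, idx in scores]


# Toy embeddings, invented for illustration (not model outputs).
query = [1.0, 0.0, 0.5, 0.5]
docs = [
    [0.9, 0.1, 0.4, 0.6],
    [0.0, 1.0, 0.0, 0.0],
    [0.7, 0.3, -0.5, -0.5],
]

full_ranking = rank_documents(query, docs, dim=4)   # all coordinates
short_ranking = rank_documents(query, docs, dim=2)  # cheaper 2-d prefix
```

With these toy vectors the 2-dimensional prefix happens to give the same ordering as the full vectors. That agreement is a property of the example, not a guarantee; how much quality a prefix retains depends on how the model was trained.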