Setting standards in Turkish NLP: TR-MMLU for large language model evaluation

Bayram, M. Ali; Fincan, Ali Arda; Gümüş, Ahmet Semih; Diri, Banu; Yıldırım, Savaş; Aytaş, Öner

dc.authorid	0000-0003-1298-4521
dc.authorid	0009-0002-7907-1209
dc.authorid	0000-0002-6652-4339
dc.authorid	0000-0002-7764-2891
dc.authorid	0000-0002-4305-8785
dc.contributor.author	Bayram, M. Ali	en_US
dc.contributor.author	Fincan, Ali Arda	en_US
dc.contributor.author	Gümüş, Ahmet Semih	en_US
dc.contributor.author	Diri, Banu	en_US
dc.contributor.author	Yıldırım, Savaş	en_US
dc.contributor.author	Aytaş, Öner	en_US
dc.date.accessioned	2025-10-07T12:33:03Z
dc.date.available	2025-10-07T12:33:03Z
dc.date.issued	2025-01-04
dc.department	Işık Üniversitesi, Meslek Yüksekokulu, Bilgisayar Programcılığı Programı	en_US
dc.department	Işık University, Vocational School, Computer Programming Program	en_US
dc.description.abstract	Language models have made remarkable advancements in understanding and generating human language, achieving notable success across a wide array of applications. However, evaluating these models remains a significant challenge, particularly for resource-limited languages such as Turkish. To address this gap, we introduce the Turkish MMLU (TR-MMLU) benchmark, a comprehensive evaluation framework designed to assess the linguistic and conceptual capabilities of large language models (LLMs) in Turkish. TR-MMLU is constructed from a carefully curated dataset comprising 6,200 multiple-choice questions across 62 sections, selected from a pool of 280,000 questions spanning 67 disciplines and over 800 topics within the Turkish education system. This benchmark provides a transparent, reproducible, and culturally relevant tool for evaluating model performance. It serves as a standard framework for Turkish NLP research, enabling detailed analyses of LLMs’ capabilities in processing Turkish text and fostering the development of more robust and accurate language models. In this study, we evaluate state-of-the-art LLMs on TR-MMLU, providing insights into their strengths and limitations for Turkish-specific tasks. Our findings reveal critical challenges, such as the impact of tokenization and fine-tuning strategies, and highlight areas for improvement in model design. By setting a new standard for evaluating Turkish language models, TR-MMLU aims to inspire future innovations and support the advancement of Turkish NLP research.	en_US
dc.description.version	Preprint version	en_US
dc.identifier.citation	Bayram, M. A., Fincan, A. A., Gümüş, A. S., Diri, B., Yıldırım, S. & Aytaş, Ö. (2025). Setting standards in Turkish NLP: TR-MMLU for large language model evaluation. arXiv, 1-6. doi: https://doi.org/10.48550/arXiv.2501.00593	en_US
dc.identifier.endpage	6
dc.identifier.startpage	1
dc.identifier.uri	https://hdl.handle.net/11729/6751
dc.identifier.uri	https://doi.org/10.48550/arXiv.2501.00593
dc.identifier.wos	PPRN:120258859
dc.identifier.wosquality	N/A
dc.indekslendigikaynak	Web of Science	en_US
dc.indekslendigikaynak	Preprint Citation Index	en_US
dc.institutionauthor	Aytaş, Öner	en_US
dc.language.iso	en	en_US
dc.publisher	Cornell Univ	en_US
dc.relation.ispartof	arXiv	en_US
dc.relation.publicationcategory	Preprint – International – Institutional Faculty Member	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Large Language Models (LLM)	en_US
dc.subject	Natural Language Processing (NLP)	en_US
dc.subject	Artificial Intelligence	en_US
dc.subject	Turkish NLP	en_US
dc.title	Setting standards in Turkish NLP: TR-MMLU for large language model evaluation	en_US
dc.type	Preprint	en_US
dspace.entity.type	Publication	en_US

Files

Original bundle

Name: Setting_Standards_in_Turkish_NLP_TR_MMLU_for_Large_Language_Model_Evaluation.pdf
Size: 84.14 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.17 KB
Format: Item-specific license agreed upon to submission

Collections

Vocational School Collection
WoS-Indexed Publications Collection