TUR2SQL: A cross-domain Turkish dataset for Text-to-SQL

dc.authorid: 0009-0003-9031-1485
dc.authorid: 0000-0002-8649-6013
dc.contributor.author: Kanburoğlu, Ali Buğra [en_US]
dc.contributor.author: Tek, Faik Boray [en_US]
dc.date.accessioned: 2023-11-29T15:08:38Z
dc.date.available: 2023-11-29T15:08:38Z
dc.date.issued: 2023-09-15
dc.department: Işık Üniversitesi, Mühendislik ve Doğa Bilimleri Fakültesi, Bilgisayar Mühendisliği Bölümü [en_US]
dc.department: Işık University, Faculty of Engineering and Natural Sciences, Department of Computer Engineering [en_US]
dc.description: We would like to thank all the participants who took part in the survey mentioned in this study, including the students enrolled in the Database Systems course taught by Asst. Prof. Emine Ekin from FMV Isik University during the Spring semester of 2022, as well as the employees of Huawei Turkey R&D Center. Their valuable contributions were crucial in creating natural language templates for our research. [en_US]
dc.description.abstract: The field of converting natural language into corresponding SQL queries using deep learning techniques has attracted significant attention in recent years. While existing Text-to-SQL datasets primarily focus on English and other languages such as Chinese, there is a lack of resources for the Turkish language. In this study, we introduce the first publicly available cross-domain Turkish Text-to-SQL dataset, named TUR2SQL. This dataset consists of 10,809 pairs of natural language statements and their corresponding SQL queries. We conducted experiments using SQLNet and ChatGPT on the TUR2SQL dataset. The experimental results show that SQLNet has limited performance and ChatGPT has superior performance on the dataset. We believe that TUR2SQL provides a foundation for further exploration and advancements in Turkish language-based Text-to-SQL research. [en_US]
dc.description.sponsorship: FMV Isik University [en_US]
dc.description.version: Publisher's Version [en_US]
dc.identifier.citation: Kanburoğlu, A. B. & Tek, F. B. (2023). TUR2SQL: A cross-domain Turkish dataset for Text-to-SQL. Paper presented at the 8th International Conference on Computer Science and Engineering, UBMK 2023, 206-211. doi:10.1109/UBMK59864.2023.10286686 [en_US]
dc.identifier.doi: 10.1109/UBMK59864.2023.10286686
dc.identifier.endpage: 211
dc.identifier.isbn: 9798350340815
dc.identifier.isbn: 9798350340822
dc.identifier.issn: 2521-1641
dc.identifier.issn: 2768-0592
dc.identifier.scopus: 2-s2.0-85177574067
dc.identifier.scopusquality: N/A
dc.identifier.startpage: 206
dc.identifier.uri: https://hdl.handle.net/11729/5803
dc.identifier.uri: http://dx.doi.org/10.1109/UBMK59864.2023.10286686
dc.indekslendigikaynak: Scopus [en_US]
dc.institutionauthor: Kanburoğlu, Ali Buğra [en_US]
dc.institutionauthorid: 0009-0003-9031-1485
dc.language.iso: en [en_US]
dc.peerreviewed: Yes [en_US]
dc.publicationstatus: Published [en_US]
dc.publisher: IEEE [en_US]
dc.relation.ispartof: 8th International Conference on Computer Science and Engineering, UBMK 2023 [en_US]
dc.relation.publicationcategory: Conference Item - International - Administrative Staff and Student [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: ChatGPT [en_US]
dc.subject: Dataset [en_US]
dc.subject: SQLNet [en_US]
dc.subject: Text-to-SQL [en_US]
dc.subject: Natural language processing systems [en_US]
dc.subject: Cross-domain [en_US]
dc.subject: Natural languages [en_US]
dc.subject: Performance [en_US]
dc.subject: SQL query [en_US]
dc.subject: Turkish language [en_US]
dc.subject: Turkishs [en_US]
dc.subject: Deep learning [en_US]
dc.title: TUR2SQL: A cross-domain Turkish dataset for Text-to-SQL [en_US]
dc.type: Conference Object [en_US]

Files

Original bundle
Name: TUR2SQL_A_cross_domain_Turkish_dataset_for_Text_to_SQL.pdf
Size: 278.77 KB
Format: Adobe Portable Document Format
Description: Publisher's Version