TUR2SQL: A cross-domain Turkish dataset for Text-to-SQL
dc.authorid | 0009-0003-9031-1485 | |
dc.authorid | 0000-0002-8649-6013 | |
dc.contributor.author | Kanburoğlu, Ali Buğra | en_US |
dc.contributor.author | Tek, Faik Boray | en_US |
dc.date.accessioned | 2023-11-29T15:08:38Z | |
dc.date.available | 2023-11-29T15:08:38Z | |
dc.date.issued | 2023-09-15 | |
dc.department | Işık Üniversitesi, Mühendislik ve Doğa Bilimleri Fakültesi, Bilgisayar Mühendisliği Bölümü | en_US |
dc.department | Işık University, Faculty of Engineering and Natural Sciences, Department of Computer Engineering | en_US |
dc.description | We would like to thank all the participants who took part in the survey mentioned in this study, including the students enrolled in the Database Systems course taught by Asst. Prof. Emine Ekin from FMV Isik University during the Spring semester of 2022, as well as the employees of Huawei Turkey R&D Center. Their valuable contributions were crucial in creating natural language templates for our research. | en_US |
dc.description.abstract | The field of converting natural language into corresponding SQL queries using deep learning techniques has attracted significant attention in recent years. While existing Text-to-SQL datasets primarily focus on English and other languages such as Chinese, there is a lack of resources for the Turkish language. In this study, we introduce the first publicly available cross-domain Turkish Text-to-SQL dataset, named TUR2SQL. This dataset consists of 10,809 pairs of natural language statements and their corresponding SQL queries. We conducted experiments using SQLNet and ChatGPT on the TUR2SQL dataset. The experimental results show that SQLNet has limited performance and ChatGPT has superior performance on the dataset. We believe that TUR2SQL provides a foundation for further exploration and advancements in Turkish language-based Text-to-SQL research. | en_US |
dc.description.sponsorship | FMV Isik University | en_US |
dc.description.version | Publisher's Version | en_US |
dc.identifier.citation | Kanburoğlu, A. B. & Tek, F. B. (2023). TUR2SQL: A cross-domain Turkish dataset for Text-to-SQL. Paper presented at the 8th International Conference on Computer Science and Engineering, UBMK 2023, 206-211. doi:10.1109/UBMK59864.2023.10286686 | en_US |
dc.identifier.doi | 10.1109/UBMK59864.2023.10286686 | |
dc.identifier.endpage | 211 | |
dc.identifier.isbn | 9798350340815 | |
dc.identifier.isbn | 9798350340822 | |
dc.identifier.issn | 2521-1641 | |
dc.identifier.issn | 2768-0592 | |
dc.identifier.scopus | 2-s2.0-85177574067 | |
dc.identifier.scopusquality | N/A | |
dc.identifier.startpage | 206 | |
dc.identifier.uri | https://hdl.handle.net/11729/5803 | |
dc.identifier.uri | http://dx.doi.org/10.1109/UBMK59864.2023.10286686 | |
dc.indekslendigikaynak | Scopus | en_US |
dc.institutionauthor | Kanburoğlu, Ali Buğra | en_US |
dc.institutionauthorid | 0009-0003-9031-1485 | |
dc.language.iso | en | en_US |
dc.peerreviewed | Yes | en_US |
dc.publicationstatus | Published | en_US |
dc.publisher | IEEE | en_US |
dc.relation.ispartof | 8th International Conference on Computer Science and Engineering, UBMK 2023 | en_US |
dc.relation.publicationcategory | Conference Item - International - Administrative Staff and Student | en_US |
dc.rights | info:eu-repo/semantics/closedAccess | en_US |
dc.subject | ChatGPT | en_US |
dc.subject | Dataset | en_US |
dc.subject | SQLNet | en_US |
dc.subject | Text-to-SQL | en_US |
dc.subject | Natural language processing systems | en_US |
dc.subject | Cross-domain | en_US |
dc.subject | Natural languages | en_US |
dc.subject | Performance | en_US |
dc.subject | SQL query | en_US |
dc.subject | Turkish language | en_US |
dc.subject | Turkish | en_US |
dc.subject | Deep learning | en_US |
dc.title | TUR2SQL: A cross-domain Turkish dataset for Text-to-SQL | en_US |
dc.type | Conference Object | en_US |
Files
Original bundle
- Name: TUR2SQL_A_cross_domain_Turkish_dataset_for_Text_to_SQL.pdf
- Size: 278.77 KB
- Format: Adobe Portable Document Format
- Description: Publisher's Version