Işık Üniversitesi Kurumsal Akademik Belleği :: DSpace Angular

Arama Sonuçları

Listeleniyor 1 - 7 / 7

TURSpider: a Turkish Text-to-SQL dataset and LLM-based study
(Institute of Electrical and Electronics Engineers Inc., 2024-11-25) Kanburoğlu, Ali Buğra; Tek, Faik Boray
This paper introduces TURSpider, a novel Turkish Text-to-SQL dataset developed through human translation of the widely used Spider dataset, aimed at addressing the current lack of complex, cross-domain SQL datasets for the Turkish language. TURSpider incorporates a wide range of query difficulties, including nested queries, to create a comprehensive benchmark for Turkish Text-to-SQL tasks. The dataset enables cross-language comparison and significantly enhances the training and evaluation of large language models (LLMs) in generating SQL queries from Turkish natural language inputs. We fine-tuned several Turkish-supported LLMs on TURSpider and evaluated their performance in comparison to state-of-the-art models like GPT-3.5 Turbo and GPT-4. Our results show that fine-tuned Turkish LLMs demonstrate competitive performance, with one model even surpassing GPT-based models on execution accuracy. We also apply the Chain-of-Feedback (CoF) methodology to further improve model performance, demonstrating its effectiveness across multiple LLMs. This work provides a valuable resource for Turkish NLP and addresses specific challenges in developing accurate Text-to-SQL models for low-resource languages.
Assessing ChatGPT's accuracy in dyslexia inquiry
(Institute of Electrical and Electronics Engineers Inc., 2024) Eroğlu, Günet; Harb, Mhd Raja Abou
Dyslexia poses challenges in accessing reliable information, crucial for affected individuals and their families. Leveraging chatbot technology offers promise in this regard. This study evaluates the OpenAI Assistant's precision in addressing dyslexia-related inquiries. Three hundred questions commonly posed by parents were categorized and presented to the Assistant. Expert evaluation of responses, graded on accuracy and completeness, yielded consistently high scores (median=5). Descriptive questions scored higher (average=4.9568) than yes/no questions (average=4.8957), indicating potential response challenges. Statistical analysis highlighted the significance of question specificity in response quality. Despite occasional difficulties, the Assistant demonstrated adaptability and reliability in providing accurate dyslexia-related information.
Text-to-SQL: a methodical review of challenges and models
(TÜBİTAK, 2024-05-20) Kanburoğlu, Ali Buğra; Tek, Faik Boray
This survey focuses on Text-to-SQL, automated translation of natural language queries into SQL queries. Initially, we describe the problem and its main challenges. Then, by following the PRISMA systematic review methodology, we survey the existing Text-to-SQL review papers in the literature. We apply the same method to extract proposed Text-to-SQL models and classify them with respect to used evaluation metrics and benchmarks. We highlight the accuracies achieved by various models on Text-to-SQL datasets and discuss execution-guided evaluation strategies. We present insights into model training times and implementations of different models. We also explore the availability of Text-to-SQL datasets in non-English languages. Additionally, we focus on large language model (LLM) based approaches for the Text-to-SQL task, where we examine LLM-based studies in the literature and subsequently evaluate the LLMs on the cross-domain Spider dataset. Finally, we conclude with a discussion of future directions for Text-to-SQL research, identifying potential areas of improvement and advancements in this field.
LLM’leri kullanarak otel incelemelerini görüntü manipülasyonu ile görselleştirme
(Institute of Electrical and Electronics Engineers Inc., 2025-08-15) Özdemir, Ata Onur; Giritli, Efe Batur; Can, Yekta Said
Dijital çağda müşteri yorumları, özellikle otelcilik sektöründe, karar verme sürecinde önemli bir rol oynamaktadır. Metin tabanlı yorumlar değerli bilgiler sunsa da, potansiyel müşteriler genellikle öznel ifadeleri doğru şekilde yorumlamakta zorlanmaktadır. Araştırmalar, görsel temsillerin anlaşılırlığı artırdığını ve kullanıcı etkileşimini güçlendirdiğini göstermektedir. Bu çalışma, metin tabanlı görüntü manipülasyonu ile yazılı otel yorumlarını orijinal otel görselleri üzerinde değişiklikler yaparak görsel incelemelere dönüştürmeyi amaçlamaktadır. Stable Diffusion modeli kullanılarak yazılı otel yorumları girdileriyle otel odası görüntüleri manipüle edilmiştir. Manipüle edilen görsellerin değerlendirilmesinde SSIM (Structural Similarity Index Measure) ve PSNR (Peak Signal-to-Noise Ratio) metrikleri uygulanmıştır. Ayrıca, manipüle edilmiş ve orijinal görüntü örnekleri karşılaştırmalı olarak sunulmuştur. Sonuçlar, modelin küçük ölçekli değişikliklerde başarılı olduğunu, ancak büyük değişikliklerde kalite kaybı yaşadığını göstermektedir.
Büyük dil modelleri için TR-MMLU benchmark’ı: performans değerlendirmesi, zorluklar ve iyileştirme fırsatları
(Institute of Electrical and Electronics Engineers Inc., 2025-08-15) Bayram, M. Ali; Fincan, Ali Arda; Gümüş, Ahmet Semih; Diri, Banu; Yıldırım, Savaş; Aytaş, Öner
Dil modelleri, insan dilini anlama ve üretme konularında önemli ilerlemeler kaydetmiş, birçok uygulamada dikkat çekici başarılar elde etmiştir. Ancak, özellikle Türkçe gibi kaynak açısından sınırlı dillere yönelik değerlendirme çalışmaları önemli ˘bir zorluk oluşturmaktadır. Bu sorunu ele almak amacıyla, büyük dil modellerinin (LLM) Türkçe dilindeki dilsel ve kavramsal yeteneklerini değerlendirmek için kapsamlı bir değerlendirme çerçevesi olan Türkçe MMLU (TR-MMLU) benchmark’ını tanıttık. TR-MMLU, Türk eğitim sisteminden 62 bölümdeki 6.200 çoktan seçmeli soruyu içeren, özenle hazırlanmış bir veri setine dayanmaktadır. Bu benchmark, Türkçe doğal dil işleme (NLP) araştırmalarına standart bir çerçeve sunmakta ve büyük dil modellerinin Türkçe metinleri işleme yeteneklerini detaylı bir şekilde analiz etmeyi sağlamaktadır. Çalışmamızda, TR-MMLU üzerinde en güncel büyük dil modellerini değerlendirdik ve model tasarımında iyileştirme gerektiren alanları vurguladık. TRMMLU, Türkçe NLP araştırmalarını ilerletmek ve gelecekteki yeniliklere ilham vermek için yeni bir standart oluşturmaktadır.
Automating cyber risk assessment with public LLMs: an expert-validated framework and comparative analysis
(Institute of Electrical and Electronics Engineers Inc., 2026-03-26) Ünal, Nezih Mahmut; Çeliktaş, Barış
Traditional cyber risk assessment methodologies face a critical dilemma: they are either quantitative yet static and context-agnostic (e.g., CVSS), or context-aware yet highly labor-intensive and subjective (e.g., NIST SP 800-30). Consequently, organizations struggle to scale risk assessment to match the pace of evolving threats. This paper presents an automated, context-aware risk assessment framework that leverages the reasoning capabilities of publicly available Large Language Models (LLMs) to operationalize expert knowledge. Rather than positioning the LLM as the final decision-maker, the framework decouples semantic interpretation from risk scoring authority through a transparent, deterministic Dynamic Metric Engine. Unlike complex closed box machine learning models, our approach anchors the AI's reasoning to this expert-validated metric schema, with weights derived using the Rank Order Centroid (ROC) method from a survey of 101 cybersecurity professionals. We evaluated the framework through a comparative study involving 15 diverse real-world vulnerability scenarios (C1-C15) and three supplementary sensitivity stress tests (C16-C18). The validation scenarios were independently assessed by a cohort of ten senior human experts and two state-of-the-art LLM agents (GPT-4o and Gemini 2.0 Flash). The results show that the LLM-driven agents achieve scoring consistency closely aligned with the human median (Pearson r ranging from 0.9390 to 0.9717, Spearman ρ from 0.8472 to 0.9276) against a highly reliable expert baseline (Cronbach's α =0.996), while reducing the assessment cycle time by more than 100× (averaging under 4 seconds per case vs. a human average of 6 minutes). Furthermore, a dedicated context sensitivity analysis (C13-C15) indicates that the framework adapts risk scores based on organizational context (e.g., SME vs. Critical Infrastructure) for identical technical vulnerabilities. Importantly, the system is designed not merely to replicate expert intuition, but to enforce bounded, policy-consistent risk evaluation under predefined governance constraints. Overall, these findings suggest that commercially available LLMs, when constrained by expert-validated metric schemas, can support reproducible, transparent, and real-time risk assessments.
Adaptive incident escalation in SOCs via AI-driven skill-aware assignment and tier optimization
(Institute of Electrical and Electronics Engineers Inc., 2026-04-15) Abuaziz, Ahmed; Çeliktaş, Barış
Modern Security Operations Centers (SOCs) face significant operational bottlenecks driven by escalating alert volumes, increasingly sophisticated cyberattack vectors, and chronic imbalances in analyst workloads. Conventional rule-based escalation models frequently fail to account for the multi-dimensional nature of incident characteristics, the nuances of analyst expertise, and fluctuating operational demands. This study proposes a comprehensive AI-driven framework for intelligent incident assignment and workload optimization. The framework introduces five primary contributions: 1) a multi-factor scoring model that integrates severity and complexity metrics with dynamic workload balancing; 2) two novel optimization algorithms, Quantile-Targeted Normality-Regularized Optimization (QT-NRO) and Joint Optimization of Weights and Thresholds (JOWT), to calibrate scoring coefficients against target analyst utilization; 3) a Large Language Model (LLM) engine leveraging Retrieval-Augmented Generation (RAG) for semantic alignment between incident requirements and analyst expertise; 4) an Adaptive Capacity Zoning mechanism for dynamic workload management; and 5) a novel RAG Relevance Score metric—a pre-resolution, semantic alignment indicator that quantifies analyst-incident assignment quality independently of resolution time, addressing a fundamental limitation of traditional temporal metrics such as Mean Time to Resolution (MTTR) and providing a reusable benchmark applicable to any skill-aware assignment system. In addition, the framework incorporates a feedback-based continuous learning mechanism that utilizes historical resolution data to inform future assignments. An experimental evaluation using 10,021 real-world incidents from Microsoft Defender demonstrates that the JOWT algorithm achieves a tier distribution alignment within 0.8% of targets. LLM-enhanced semantic matching yields improvements between 26.7% and 126.8% in skill alignment across both normal-load and high-load evaluations, while simulations indicate a 31.8% reduction in MTTR. These results substantiate the efficacy of AI-driven methodologies in enhancing SOC operational efficiency and response precision.

Filtreler

Yazar

Konu

Tarih

İndeks

WoS Q

Scopus Q

Dil

Tür

Kategori

Bölüm

Erişim Hakkı

Tam Metin

Öğe Türü

Ayarlar

Sırala

Sayfa Başına Sonuç

Arama Sonuçları