Search Results

Showing 1 - 9 of 9
  • Publication
    Multi-task learning on mental disorder detection, sentiment detection and emotion detection
    (Işık Üniversitesi, 2024-02-12) Armah, Courage; Dehkharghani, Rahim; Işık University, School of Graduate Studies, Computer Science Engineering Master Program
    Suicidal behavior is a global cause of life-threatening injury and, much of the time, death. Mental disorders such as depression, anxiety, and bipolar disorder have become prevalent among young people in recent decades. Social media platforms are popular venues for individuals to post their thoughts and feelings. Extracting people's sentiments and feelings from such online platforms would help detect users' mental disorders so that they can be treated before it is too late. This thesis investigates the use of multi-task learning systems and single-task learning techniques to estimate behaviors and mental states for early diagnosis. We used data mined from Reddit, a popular social media platform that provides anonymity, which increases the chances that individuals share what they truly feel in real life. The results obtained by the proposed approaches open new doors to understanding how multi-task systems can improve performance on text classification problems such as depression detection, emotion detection, and sentiment analysis when these tasks are trained together in a multi-task learning network rather than in isolation in single-task learning networks. We used the SWMH dataset, already labeled with five mental-health labels (depression, anxiety, suicide, bipolar, and off my chest), added emotion and polarity labels to it, and made it publicly available to researchers. The results obtained in this study are also comparable to other approaches in the field.
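The multi-task setup the abstract describes can be illustrated with a minimal sketch: one shared encoder feeding separate output heads for the disorder, emotion, and polarity tasks, so the tasks share a representation instead of being trained in isolation. The dimensions, label counts, and weights below are invented toy values, not the thesis's actual network.

```python
import math
import random

random.seed(0)

D_IN, D_SHARED = 4, 6
TASKS = {"disorder": 5, "emotion": 6, "polarity": 3}  # toy label counts per task

# Shared encoder weights, reused by every task (the core of multi-task learning).
W_shared = [[random.gauss(0, 1) for _ in range(D_SHARED)] for _ in range(D_IN)]
# One small task-specific head per task.
heads = {t: [[random.gauss(0, 1) for _ in range(n)] for _ in range(D_SHARED)]
         for t, n in TASKS.items()}

def matvec(x, W):
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) for j in range(len(W[0]))]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(x):
    h = [math.tanh(v) for v in matvec(x, W_shared)]  # one shared representation
    return {t: softmax(matvec(h, W)) for t, W in heads.items()}  # per-task predictions

preds = forward([0.5, -1.0, 0.3, 0.8])
print({t: len(p) for t, p in preds.items()})
```

During training, the losses of all heads would be summed so gradients from every task update the shared encoder; only the heads stay task-specific.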
  • Publication
    All-words word sense disambiguation in Turkish
    (Işık Üniversitesi, 2019-09-06) Akçakaya, Sinan; Yıldız, Olcay Taner; Işık University, Institute of Science, Computer Engineering Master's Program
    Word sense disambiguation (WSD) is the identification of the meaning of words in context in a computational manner. The main subject of this study is to implement and compare the WSD results of various supervised classifiers (Naive Bayes, K-Nearest Neighbor, Rocchio, and C4.5) in the all-words setting. To this end, we have constructed an all-words sense-annotated Turkish corpus using the traditional method of manual tagging. During the annotation, a pre-built parallel treebank (aligned from the Penn Treebank) was tagged with the senses of the Turkish Language Institution's dictionary. The approach of annotating a treebank allowed us to generate a full-coverage resource in which syntactic and semantic information are merged. In the WSD evaluations, three distinct experiments were organized to determine the effect of different feature sets on disambiguation performance. The first experiment was conducted with a simple feature set that includes the fundamental local features. In the second experiment, the initial feature set was augmented with several effective morphological features, and in the third, the feature set was further extended with syntactic features. Our test results show that all classifiers achieved better results as the feature set grew. Additionally, the integration of syntactic features proved to be useful for WSD.
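As a toy illustration of one of the compared classifier families, a nearest-neighbor disambiguator can pick the sense whose annotated training context overlaps most with the target context. The example word, contexts, and overlap-based similarity below are invented; the thesis uses much richer local, morphological, and syntactic features.

```python
# Toy sense-annotated examples for the ambiguous Turkish word "yüz"
# ("face" vs. "hundred"), each represented by a set of context words.
train = [
    ({"washed", "his"}, "face"),
    ({"smiling", "her"}, "face"),
    ({"lira", "about"}, "hundred"),
    ({"people", "than", "more"}, "hundred"),
]

def nn_sense(context):
    # 1-nearest-neighbor: return the sense of the training example whose
    # context words overlap most with the target context.
    best = max(train, key=lambda ex: len(ex[0] & context))
    return best[1]

print(nn_sense({"more", "than", "people"}))  # hundred
print(nn_sense({"washed", "his"}))           # face
```

A real all-words system would run this decision for every content word in the corpus, with feature vectors instead of raw word sets.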
  • Publication
    Semantic role labeling for Turkish propbank
    (Işık Üniversitesi, 2019-09-06) Esgel, Volkan; Yıldız, Olcay Taner; Işık University, Institute of Science, Computer Engineering Master's Program
    People's communication with each other takes place through sentences that combine words with different purposes. Words can gain different meanings in the presence of other words in the sentences in which they appear. With the rapid development of technology, studies on the understanding of human language by computational means have gained speed. These studies are generally referred to as Natural Language Processing, and their main purpose is to understand the sentences in human communication. The words in a sentence fulfill different purposes: some words describe an event, while others indicate details of that event. Defining the semantic roles of words is possible with different algorithms. This study first contributed to the process of manually determining the semantic roles of the word groups in sentences. In addition, the semantic roles in English sentences were parsed and shared on a web site alongside the marked roles in the Turkish sentences for comparison purposes. Finally, we measured how well algorithms that aim to find the semantic roles of the words in a sentence perform automatically for Turkish.
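The predicate-argument structure the abstract refers to can be shown with a deliberately simplified sketch: given a sentence and the index of its predicate, assign PropBank-style role labels. The positional rules and the example sentence below are illustrative stand-ins for the trained SRL algorithms the thesis evaluates.

```python
def label_roles(tokens, pred_i):
    # Toy positional heuristic for a verb-final (SOV) Turkish sentence:
    # first token is the agent (ARG0), the token before the predicate is
    # the patient/theme (ARG1). Real SRL learns these from annotated data.
    roles = {tokens[pred_i]: "PREDICATE"}
    if pred_i >= 1:
        roles[tokens[0]] = "ARG0"          # agent: who performs the event
    if pred_i >= 2:
        roles[tokens[pred_i - 1]] = "ARG1" # patient: what the event affects
    return roles

# "Ali kitabı okudu" = "Ali read the book"; the predicate is the last token.
roles = label_roles(["Ali", "kitabı", "okudu"], 2)
print(roles)
```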
  • Publication
    Word sense disambiguation, named entity recognition, and shallow parsing tasks for Turkish
    (Işık Üniversitesi, 2019-04-02) Topsakal, Ozan; Yıldız, Olcay Taner; Işık University, Institute of Science, Computer Engineering Master's Program
    Human interactions are based on sentences. The process of understanding a sentence involves parsing its words and making sense of them. The ultimate goal of Natural Language Processing is to understand the meaning of sentences. Three main areas are the topics of this thesis, namely Named Entity Recognition, Shallow Parsing, and Word Sense Disambiguation. Natural Language Processing algorithms that learn entities, such as persons, locations, and times, are called Named Entity Recognition algorithms. Parsing sentences is one of the biggest challenges in Natural Language Processing. Since time efficiency and accuracy are inversely proportional to each other, one of the best ideas is to use shallow parsing algorithms to deal with this challenge. Many words have more than one meaning, and recognizing the correct meaning used in a sentence is a difficult problem; the Word Sense Disambiguation literature contains many algorithms that can help solve it. This thesis tries to find solutions to these three challenges by applying trained machine learning algorithms. Experiments were conducted on a dataset containing 9,557 sentences.
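The input/output shape of the Named Entity Recognition task can be illustrated with the simplest possible baseline, a gazetteer lookup that tags each token with an entity type or "O" (outside). The word list below is invented; the thesis trains machine learning models rather than using a fixed gazetteer.

```python
# Hypothetical mini-gazetteer mapping known words to entity types.
GAZETTEER = {"ankara": "LOCATION", "atatürk": "PERSON", "pazartesi": "TIME"}

def tag(tokens):
    # Tag each token via dictionary lookup; unknown tokens get "O" (outside),
    # the conventional label for non-entity tokens in NER.
    return [(t, GAZETTEER.get(t.lower(), "O")) for t in tokens]

tagged = tag(["Atatürk", "pazartesi", "Ankara"])
print(tagged)
```

A learned tagger improves on this by using context, so it can recognize entities that never appeared in any list.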
  • Publication
    Entity-relationship diagram generation with natural language processing and machine learning approach
    (Işık Üniversitesi, 2023-08-24) Köprülü, Mertali; Ekin, Emine; Işık University, School of Graduate Studies, Computer Engineering Master's Program
    As software systems continue to grow in complexity, the need for efficient and accurate design methodologies becomes increasingly critical. Entity-Relationship Diagrams (ERDs) provide a powerful visual representation of system structures and dependencies, serving as a foundation for software engineering and database design. However, manually creating ERDs from textual requirements is a time-consuming, labor-intensive process. To address this challenge, this research explores the application of natural language processing (NLP) techniques to automatically extract relevant information from unstructured text and generate ERDs. The proposed approach leverages the strengths of rule-based techniques, semantic analysis, and machine learning algorithms to automatically identify entities, attributes, relationships, and cardinalities from natural language input. Our study offers practical insights into the use of linguistic and semantic analysis and machine learning for efficient information extraction. The proposed system aims to streamline the ERD creation process and improve the accuracy and quality of the resulting diagrams. While the proposed approach shows promising results, limitations in heuristic rule coverage and data dependencies are acknowledged. The evaluation results demonstrate strong performance in detecting entities, attributes, and relations, with F1-scores of 0.96, 0.93, and 0.92, respectively, while resolving the component specifications achieved accuracies of 0.87, 0.84, and 0.91. The findings contribute to advancing ERD extraction from text and suggest future research directions for improving the robustness and usability of the solution. The fusion of NLP techniques with ERD creation highlights the potential for enhancing the software development lifecycle and opens new avenues for research in information extraction from natural language text.
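A single heuristic rule of the kind the abstract mentions can be sketched as follows: treat "Each X has Y, Z and W" requirement sentences as declaring an entity X with attributes Y, Z, W. The pattern, example sentences, and article stripping are invented for illustration and are far simpler than the thesis's combination of rules, semantic analysis, and machine learning.

```python
import re

# Hypothetical heuristic: "Each/Every/A/An <entity> has <attribute list>."
PATTERN = re.compile(r"(?:Each|Every|A|An)\s+(\w+)\s+has\s+(.+?)\.", re.IGNORECASE)

def extract_entities(text):
    entities = {}
    for entity, attrs in PATTERN.findall(text):
        # Split the attribute list on commas and "and", then drop articles.
        parts = re.split(r",\s*|\s+and\s+", attrs)
        entities[entity.lower()] = [
            re.sub(r"^(?:a|an|the)\s+", "", p.strip()) for p in parts
        ]
    return entities

req = "Each customer has a name, an email and a phone number. Every order has a date."
ents = extract_entities(req)
print(ents)
```

The extracted entity-attribute map is the intermediate form from which an ERD drawing step would generate boxes and edges.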
  • Publication
    Morphological analyser for Turkish
    (Işık Üniversitesi, 2018-01-25) Özenç, Berke; Solak, Ercan; Işık University, Institute of Science, Computer Engineering Master's Program
    Natural Language Processing is one of the fields of work in computer science and specializes in text summarization, machine translation, and many other topics. Morphology is the branch of Natural Language Processing that analyzes words together with their suffixes. A word's meaning can change according to the suffix it takes. Turkish is an agglutinative language with a rich morphological structure and set of suffixes, and these features result in a complex morphology. In this study, we present an analyser for Modern Anatolian Turkish with high coverage of the suffixes and morphological rules of Turkish. Our approach is based on the two-level transformation method, which is well suited to modeling the morphology of a language. We used HFST, a finite state transducer implementation, as our implementation technique. The analyser covers all morphological and phonetic rules that exist in Turkish and contains a lexicon consisting of today's Turkish words. The analyser is publicly available at http://ddil.isikun.edu.tr/mortur.
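The role of morphophonological rules in such an analyser can be hinted at with a toy example: stripping the Turkish plural suffix -ler/-lar while enforcing vowel harmony, so that a surface form is only accepted when the suffix agrees with the stem. This is only a sketch of the idea with an invented two-word lexicon; the thesis implements a full two-level grammar in HFST.

```python
FRONT, BACK = set("eiöü"), set("aıou")

# Hypothetical mini-lexicon; a real analyser carries the full Turkish lexicon.
LEXICON = {"ev": "house", "okul": "school"}

def last_vowel(word):
    for ch in reversed(word):
        if ch in FRONT | BACK:
            return ch
    return None

def analyze(word):
    # Try the plural suffix pair -ler/-lar; the two-level idea is that the
    # surface suffix must agree with the stem's last vowel (vowel harmony).
    for suffix, harmony_class in (("ler", FRONT), ("lar", BACK)):
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            if stem in LEXICON and last_vowel(stem) in harmony_class:
                return f"{stem}+PLURAL"
    return word if word in LEXICON else None

print(analyze("evler"))    # ev+PLURAL
print(analyze("okullar"))  # okul+PLURAL
print(analyze("evlar"))    # None (violates vowel harmony)
```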
  • Publication
    LuminaURO: a comprehensive Artificial Intelligence Driven Assistant for enhancing urological diagnostics and patient care
    (Hayat Sağlık ve Sosyal Hizmetler Vakfı, 2025-05-29) Soylu, Tuncay; Topçu, İbrahim; Karaman, Muhammet İhsan; Tuzcu, Esra Melis; Kınık, Abdullah Harun; Güneren, Mustafa Sacit; Salman, Zeynep; Demir, Perihan; Beyzanur, Kaç
    Aim: This study aims to develop and validate LuminaURO, a Retrieval-Augmented Generation (RAG)-based AI Assistant specifically designed for urological healthcare, addressing the limitations of conventional Large Language Models (LLMs) in healthcare applications. Methods: We developed LuminaURO using a specialized repository of urological documents and implemented a novel pooling methodology to search multilingual documents and aggregate information for response generation. The system was evaluated using multiple similarity algorithms (OESM, spaCy, T5, and BERTScore) and expert assessment by urologists (n=3). Results: LuminaURO generates responses within 8-15 seconds from multilingual documents and enhances user interaction by providing two contextually relevant follow-up questions per query. The architecture demonstrates significant improvements in search latency, memory requirements, and similarity metrics compared to state-of-the-art approaches. Validation shows similarity scores of 0.6756, 0.7206, 0.9296, 0.9223, and 0.9183 for English responses, and 0.6686, 0.7166, 0.8119, 0.9220, 0.9315, and 0.9086 for Turkish responses. Expert evaluation by urologists revealed similarity scores of 0.9444 and 0.9408 for English and Turkish responses, respectively. Conclusion: LuminaURO successfully addresses the limitations of conventional LLM implementations in healthcare by utilizing specialized urological documents and our innovative pooling methodology for multilingual document processing. The high similarity scores across multiple evaluation metrics and strong expert validation confirm the system's effectiveness in providing accurate and relevant urological information. Future research will focus on expanding this approach to other medical specialties, with the ultimate goal of developing LuminaHealth, a comprehensive healthcare assistant covering all medical domains.
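The retrieval half of a RAG pipeline can be sketched with a bag-of-words ranker that pools the top-scoring documents into one context string for the generator. The documents and the cosine similarity measure below are toy stand-ins for LuminaURO's actual multilingual pooling methodology and embedding models.

```python
import math
from collections import Counter

# Toy document store standing in for the specialized urological repository.
docs = [
    "kidney stones cause flank pain and hematuria",
    "benign prostatic hyperplasia affects urinary flow",
    "flank pain may indicate ureteral obstruction",
]

def cosine(a, b):
    # Bag-of-words cosine similarity between two whitespace-tokenized texts.
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb)

def retrieve(query, k=2):
    # Rank all documents against the query and keep the top k.
    return sorted(docs, key=lambda d: cosine(query, d), reverse=True)[:k]

hits = retrieve("what causes flank pain")
context = " ".join(hits)  # pooled context that would be passed to the generator
print(context)
```

In the full system this retrieved context, rather than the model's parametric knowledge alone, grounds the generated answer.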
  • Publication
    Turkish sentiment analysis: a comprehensive review
    (Yildiz Technical University, 2024-08) Altınel Girgin, Ayşe Berna; Gümüşçekiçci, Gizem; Birdemir, Nuri Can
    Sentiment analysis (SA) is a very popular research topic in the text mining field. SA is the process of text mining in which the meaning of a text is detected and extracted. One of the key aspects of SA is analyzing the body of a text to determine its polarity and understand the opinions it expresses. Substantial amounts of data are produced by online resources such as social media sites, blogs, and news sites. It is impossible to process all of this data without automated systems, which has contributed to the rise in popularity of SA in recent years. SA is considered extremely essential, mostly due to its ability to analyze mass opinions. SA in particular, and Natural Language Processing (NLP) in general, have become overwhelmingly popular topics as social media usage has increased. The data collected from social media have fueled numerous SA studies because they are versatile and accessible to the masses. This survey presents a comprehensive study categorizing past and present work by employed methodology and level of sentiment. Turkish SA studies are categorized under three sections: dictionary-based, machine-learning-based, and hybrid-based. Researchers can discover, compare, and analyze the properties of the Turkish SA studies reviewed in this survey, as well as obtain information on the public datasets and dictionaries used in those studies. The main purpose of this study is to bring together Turkish SA approaches and methods while briefly explaining their concepts. This survey uniquely categorizes a large number of related articles and visualizes their properties. To the best of our knowledge, there is no comparably comprehensive and up-to-date survey that strictly covers Turkish SA with a focus on levels of sentiment analysis; this survey thus contributes to the literature as the first of its kind.
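The dictionary-based category the survey describes reduces, in its simplest form, to summing lexicon polarities over the tokens of a text. The mini-lexicon below is purely illustrative and is not taken from any published Turkish sentiment dictionary.

```python
# Hypothetical mini-lexicon: great, nice (positive); bad, awful (negative).
LEXICON = {"harika": 1, "güzel": 1, "kötü": -1, "berbat": -1}

def polarity(text):
    # Sum the polarity of every known token; unknown tokens contribute 0.
    score = sum(LEXICON.get(w, 0) for w in text.lower().split())
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(polarity("film harika ve güzel"))  # positive
print(polarity("yemek berbat"))          # negative
```

Machine-learning-based and hybrid approaches replace or augment this fixed lexicon with models trained on labeled data.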
  • Publication
    Geopolitical parallax: beyond Walter Lippmann just after large language models
    (Cornell Univ, 2025-08-27) Yavuz, Mehmet Can; Kabir, Humza Gohar; Özkan, Aylin
    Objectivity in journalism has long been contested, oscillating between ideals of neutral, fact-based reporting and the inevitability of subjective framing. With the advent of large language models (LLMs), these tensions are now mediated by algorithmic systems whose training data and design choices may themselves embed cultural or ideological biases. This study investigates geopolitical parallax—systematic divergence in news quality and subjectivity assessments—by comparing articlelevel embeddings from Chinese-origin (Qwen, BGE, Jina) and Western-origin (Snowflake, Granite) model families. We evaluate both on a human-annotated news quality benchmark spanning fifteen stylistic, informational, and affective dimensions, and on parallel corpora covering politically sensitive topics, including Palestine and reciprocal China–United States coverage. Using logistic regression probes and matched-topic evaluation, we quantify per-metric differences in predicted positive-class probabilities between model families. Our findings reveal consistent, nonrandom divergences aligned with model origin. In Palestinerelated coverage, Western models assign higher subjectivity and positive emotion scores, while Chinese models emphasize novelty and descriptiveness. Cross-topic analysis shows asymmetries in structural quality metrics—Chinese-on-US scoring notably lower in fluency, conciseness, technicality, and overall quality—contrasted by higher negative emotion scores. These patterns align with media bias theory and our distinction between semantic, emotional, and relational subjectivity, and extend LLM bias literature by showing that geopolitical framing effects persist in downstream quality assessment tasks. We conclude that LLMbased media evaluation pipelines require cultural calibration to avoid conflating content differences with model-induced bias.