Big data storage and automated text summarization in Turkish text
dc.contributor.advisor | Yıldız, Olcay Taner | en_US |
dc.contributor.author | Aysu, Erdinç | en_US |
dc.contributor.other | Işık Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Yüksek Lisans Programı | en_US |
dc.date.accessioned | 2018-11-22T23:10:40Z | |
dc.date.available | 2018-11-22T23:10:40Z | |
dc.date.issued | 2018-06-19 | |
dc.department | Işık Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Yüksek Lisans Programı | en_US |
dc.description | Text in English ; Abstract: English and Turkish | en_US |
dc.description | Includes bibliographical references (leaves 51-52) | en_US |
dc.description | x, 52 leaves | en_US |
dc.description.abstract | This study addresses two problems: storing large datasets in accordance with the Big Data ecosystem, and extracting the summary sentences of a Turkish text by applying automatic text summarization, a subtopic of natural language processing (NLP). For this purpose, Turkish news articles were collected and the study was carried out on these texts. To test performance, 50 different news texts were given to 20 different people, who were asked to select the 3 sentences they considered important from each text, and their selections were compared with one another. The selections made by these people were then compared with the results produced by this study. As a result of this test, the summarization performance of the work was measured as approximately 36 percent. | en_US |
dc.description.abstract | [Translated from the Turkish abstract] The subject of this study is to store large-scale data in a manner suited to the Big Data ecosystem and to apply automatic text summarization, a subtopic of natural language processing (NLP), in order to extract the summary sentences of a Turkish document. For this purpose, Turkish news texts were collected and the study was carried out on these texts. For the performance test of the work, 50 different news texts were given to 20 different people, who were asked to select the 3 sentences they considered important from each text, and the results were compared with one another. The result obtained from these people was then compared with the result produced by this study. As a result of the test, the summarization performance of the work was measured as approximately thirty-six percent. | en_US |
dc.description.tableofcontents | A Brief Look to the Big Data | en_US |
dc.description.tableofcontents | Characteristics Of Big Data | en_US |
dc.description.tableofcontents | Volume | en_US |
dc.description.tableofcontents | Velocity | en_US |
dc.description.tableofcontents | Variety | en_US |
dc.description.tableofcontents | Big Data Storage and Distributed Computing System | en_US |
dc.description.tableofcontents | Distributed Calculation | en_US |
dc.description.tableofcontents | Relation of Machine Learning with Big Data | en_US |
dc.description.tableofcontents | Computing Power | en_US |
dc.description.tableofcontents | Ideal Distributed Systems | en_US |
dc.description.tableofcontents | Scalability | en_US |
dc.description.tableofcontents | Hadoop | en_US |
dc.description.tableofcontents | File Compression | en_US |
dc.description.tableofcontents | Codecs | en_US |
dc.description.tableofcontents | Hadoop vs Relational Database Management Systems | en_US |
dc.description.tableofcontents | Hadoop Components | en_US |
dc.description.tableofcontents | HDFS | en_US |
dc.description.tableofcontents | Architecture | en_US |
dc.description.tableofcontents | Map Reduce | en_US |
dc.description.tableofcontents | Map Reduce Algorithm | en_US |
dc.description.tableofcontents | Yarn | en_US |
dc.description.tableofcontents | Relevant Hadoop Technologies | en_US |
dc.description.tableofcontents | Pig | en_US |
dc.description.tableofcontents | Hive | en_US |
dc.description.tableofcontents | Mahout | en_US |
dc.description.tableofcontents | Spark | en_US |
dc.description.tableofcontents | Summarization | en_US |
dc.description.tableofcontents | Experimental Work | en_US |
dc.description.tableofcontents | Setup | en_US |
dc.description.tableofcontents | Virtual Machine | en_US |
dc.description.tableofcontents | Distributed Multi-Node Hadoop Cluster | en_US |
dc.description.tableofcontents | Hadoop Setup and Adjustment | en_US |
dc.description.tableofcontents | SSH | en_US |
dc.description.tableofcontents | HDFS | en_US |
dc.description.tableofcontents | Dataset | en_US |
dc.description.tableofcontents | News Articles Collection | en_US |
dc.description.tableofcontents | Hurriyet Search API | en_US |
dc.description.tableofcontents | Preprocessing of Data | en_US |
dc.description.tableofcontents | Batch Data Input | en_US |
dc.description.tableofcontents | Summarization Methodology | en_US |
dc.description.tableofcontents | Morphologic Analysis | en_US |
dc.description.tableofcontents | Scoring Process | en_US |
dc.description.tableofcontents | Summarization Phase | en_US |
dc.identifier.citation | Aysu, E. (2018). Big data storage and automated text summarization in Turkish text. İstanbul: Işık Üniversitesi Fen Bilimleri Enstitüsü. | en_US |
dc.identifier.uri | https://hdl.handle.net/11729/1385 | |
dc.institutionauthor | Aysu, Erdinç | en_US |
dc.language.iso | en | en_US |
dc.publisher | Işık Üniversitesi | en_US |
dc.relation.publicationcategory | Tez | en_US |
dc.rights | info:eu-repo/semantics/openAccess | en_US |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 United States | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/us/ | * |
dc.subject | Big data | en_US |
dc.subject | Hadoop | en_US |
dc.subject | NLP | en_US |
dc.subject | Summarization | en_US |
dc.subject | Dev veri | en_US |
dc.subject | DDİ | en_US |
dc.subject | Özetleme | en_US |
dc.subject.lcc | QA76.9.B45 A97 2018 | |
dc.subject.lcsh | Big data. | en_US |
dc.subject.lcsh | Data mining. | en_US |
dc.subject.lcsh | Information. | en_US |
dc.title | Big data storage and automated text summarization in Turkish text | en_US |
dc.title.alternative | Dev veri depolama ve Türkçe metin için otomatik özetleme | en_US |
dc.type | Master Thesis | en_US |
dspace.entity.type | Publication |