AnlamVer: Semantic model evaluation dataset for Turkish - word similarity and relatedness

dc.authorid: 0000-0002-2782-8217
dc.authorid: 0000-0001-5838-4615
dc.contributor.author: Ercan, Gökhan [en_US]
dc.contributor.author: Yıldız, Olcay Taner [en_US]
dc.date.accessioned: 2024-01-16T22:22:52Z
dc.date.available: 2024-01-16T22:22:52Z
dc.date.issued: 2018-08-26
dc.department: Işık University, Faculty of Engineering, Department of Computer Engineering [en_US]
dc.description.abstract: In this paper, we present AnlamVer, which is a semantic model evaluation dataset for Turkish designed to evaluate word similarity and word relatedness tasks while discriminating those two relations from each other. Our dataset consists of 500 word-pairs annotated by 12 human subjects, and each pair has two distinct scores for similarity and relatedness. Word-pairs are selected to enable the evaluation of distributional semantic models by multiple attributes of words and word-pair relations such as frequency, morphology, concreteness and relation types (e.g., synonymy, antonymy). Our aim is to provide insights to semantic model researchers by evaluating models in multiple attributes. We balance dataset word-pairs by their frequencies to evaluate the robustness of semantic models concerning out-of-vocabulary and rare words problems, which are caused by the rich derivational and inflectional morphology of the Turkish language. [en_US]
dc.description.version: Publisher's Version [en_US]
dc.identifier.citation: Ercan, G. & Yıldız, O. T. (2018). AnlamVer: Semantic model evaluation dataset for Turkish - word similarity and relatedness. Paper presented at the COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings, 3819 - 3836. [en_US]
dc.identifier.endpage: 3836
dc.identifier.isbn: 9781948087506
dc.identifier.scopus: 2-s2.0-85071979333
dc.identifier.scopusquality: N/A
dc.identifier.startpage: 3819
dc.identifier.uri: https://hdl.handle.net/11729/5886
dc.indekslendigikaynak: Scopus [en_US]
dc.institutionauthor: Ercan, Gökhan [en_US]
dc.institutionauthor: Yıldız, Olcay Taner [en_US]
dc.institutionauthorid: 0000-0002-2782-8217
dc.institutionauthorid: 0000-0001-5838-4615
dc.language.iso: en [en_US]
dc.peerreviewed: Yes [en_US]
dc.publicationstatus: Published [en_US]
dc.publisher: Association for Computational Linguistics (ACL) [en_US]
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: Computational linguistics [en_US]
dc.subject: Semantics [en_US]
dc.subject: Distributional semantics [en_US]
dc.subject: Evaluating models [en_US]
dc.subject: Human subjects [en_US]
dc.subject: Model evaluation [en_US]
dc.subject: Multiple attributes [en_US]
dc.subject: Semantic modelling [en_US]
dc.subject: Turkishs [en_US]
dc.subject: Word problem [en_US]
dc.subject: Word similarity [en_US]
dc.subject: Word-pairs [en_US]
dc.subject: Morphology [en_US]
dc.title: AnlamVer: Semantic model evaluation dataset for Turkish - word similarity and relatedness [en_US]
dc.type: Conference Object [en_US]
