Unreasonable effectiveness of last hidden layer activations for adversarial robustness

dc.authorid: 0000-0002-6214-6262
dc.authorid: 0000-0003-0298-0690
dc.contributor.author: Tuna, Ömer Faruk
dc.contributor.author: Çatak, Ferhat Özgür
dc.contributor.author: Eskil, Mustafa Taner
dc.date.accessioned: 2022-10-26T18:25:01Z
dc.date.available: 2022-10-26T18:25:01Z
dc.date.issued: 2022
dc.department: Işık Üniversitesi, Mühendislik ve Doğa Bilimleri Fakültesi, Bilgisayar Mühendisliği Bölümü
dc.department: Işık University, Faculty of Engineering and Natural Sciences, Department of Computer Engineering
dc.description.abstract: In standard Deep Neural Network (DNN) based classifiers, the general convention is to omit the activation function in the last (output) layer and apply the softmax function directly on the logits to obtain the probability scores of each class. In this type of architecture, the loss value of the classifier for any output class is directly proportional to the difference between the final probability score and the label value of the associated class. Standard white-box adversarial evasion attacks, whether targeted or untargeted, mainly try to exploit the gradient of the model's loss function to craft adversarial samples and fool the model. In this study, we show both mathematically and experimentally that using certain widely known activation functions in the output layer of the model with high temperature values has the effect of zeroing out the gradients for both targeted and untargeted attacks, preventing attackers from exploiting the model's loss function to craft adversarial samples. We experimentally verified the efficacy of our approach on the MNIST (Digit) and CIFAR-10 datasets. Detailed experiments confirmed that our approach substantially improves robustness against gradient-based targeted and untargeted attack threats. We also showed that the increased non-linearity at the output layer has additional benefits against some other attack methods such as the DeepFool attack.
dc.description.version: Publisher's Version
dc.identifier.citation: Tuna, Ö. F., Çatak, F. Ö. & Eskil, M. T. (2022). Unreasonable effectiveness of last hidden layer activations for adversarial robustness. Paper presented at the 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), 1098-1103. doi:10.1109/COMPSAC54236.2022.00172
dc.identifier.doi: 10.1109/COMPSAC54236.2022.00172
dc.identifier.endpage: 1103
dc.identifier.isbn: 9781665488105
dc.identifier.isbn: 9781665488112
dc.identifier.issn: 0730-3157
dc.identifier.scopus: 2-s2.0-85136991056
dc.identifier.scopusquality: N/A
dc.identifier.startpage: 1098
dc.identifier.uri: https://hdl.handle.net/11729/5093
dc.identifier.uri: http://dx.doi.org/10.1109/COMPSAC54236.2022.00172
dc.identifier.wos: WOS:000855983300164
dc.identifier.wosquality: N/A
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.indekslendigikaynak: Conference Proceedings Citation Index – Science (CPCI-S)
dc.institutionauthor: Tuna, Ömer Faruk
dc.institutionauthor: Eskil, Mustafa Taner
dc.institutionauthorid: 0000-0002-6214-6262
dc.institutionauthorid: 0000-0003-0298-0690
dc.language.iso: en
dc.peerreviewed: Yes
dc.publicationstatus: Published
dc.publisher: Institute of Electrical and Electronics Engineers Inc.
dc.relation.ispartof: 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC)
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.subject: Adversarial machine learning
dc.subject: Deep neural networks
dc.subject: Robustness
dc.subject: Trustworthy AI
dc.subject: Chemical activation
dc.subject: Multilayer neural networks
dc.subject: Activation functions
dc.subject: Hidden layers
dc.subject: Loss functions
dc.subject: Machine-learning
dc.subject: Network-based
dc.subject: Output layer
dc.subject: White box
dc.subject: Object detection
dc.subject: Deep learning
dc.subject: IOU
dc.title: Unreasonable effectiveness of last hidden layer activations for adversarial robustness
dc.type: Conference Object
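The gradient-masking effect described in the abstract can be sketched numerically. The snippet below is a minimal illustration, not the paper's implementation: it assumes an output layer of the form softmax(tanh(T·z)) with temperature T, one of the activation choices the abstract alludes to, and uses the standard chain rule for softmax cross-entropy. The logits `z` and label `y` are hypothetical values chosen for the demonstration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def grad_wrt_logits(z, y, T=1.0, use_tanh=False):
    """Analytic gradient of the cross-entropy loss w.r.t. the logits z.

    Without an output activation, the gradient is the usual (p - y).
    With softmax(tanh(T*z)), the chain rule multiplies (p - y) by
    T * (1 - tanh(T*z)**2), which vanishes once T*z saturates tanh.
    """
    if use_tanh:
        a = np.tanh(T * z)
        p = softmax(a)
        return (p - y) * T * (1.0 - a ** 2)
    p = softmax(z)
    return p - y

z = np.array([3.0, -1.0, 0.5])   # hypothetical logits
y = np.array([1.0, 0.0, 0.0])    # one-hot label

g_plain = grad_wrt_logits(z, y)                     # ordinary, usable gradient
g_temp = grad_wrt_logits(z, y, T=50.0, use_tanh=True)  # saturated: near-zero gradient
```

With a high temperature, `tanh(T*z)` is driven to ±1 at machine precision, so the multiplier `1 - tanh(T*z)**2` collapses to zero and a gradient-based attacker receives essentially no signal from the loss, which is the mechanism the abstract claims for both targeted and untargeted attacks.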

Files

Original bundle
Listing 1 - 1 of 1
Name:
Unreasonable_effectiveness_of_last_hidden_layer_activations_for_adversarial_robustness.pdf
Size:
570.35 KB
Format:
Adobe Portable Document Format
Description:
Publisher's Version

License bundle
Listing 1 - 1 of 1
Name:
license.txt
Size:
1.44 KB
Format:
Item-specific license agreed upon to submission
Description: