Bolstering Adversarial Robustness with Latent Disparity Regularization

David Schwartz, Gregory Ditzler

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Recent research has revealed that neural networks and other machine learning models are vulnerable to adversarial attacks, which aim to subvert the integrity or privacy of their predictions by adding a small, calculated perturbation to inputs. Such an adversary can significantly degrade the performance of the model. The number and severity of attacks continue to grow, yet few techniques robustly defend machine learning models in a computationally inexpensive way. Against this background, we propose an adversarially robust training procedure and objective function for arbitrary neural network architectures. Robustness against adversarial attacks on integrity is achieved by augmenting the training objective with a novel regularization term. This regularizer penalizes the discrepancy between the representations induced in hidden layers by benign and adversarial data. We evaluate our regularization approach on the Fashion-MNIST and CIFAR-10 datasets and benchmark it against three state-of-the-art defense methods: (i) regularization of the largest eigenvalue of the Fisher information matrix of the activity of the terminal layer, (ii) a higher-level representation guided denoising autoencoder (trained with adversarial examples), and (iii) training an otherwise undefended model on data distorted by additive white Gaussian noise. Our experiments show that the proposed regularizer provides significant improvements in adversarial robustness over both an undefended baseline model and the same model defended with the other techniques. This result is observed over several adversarial budgets, with only a small (but seemingly unavoidable) decline in benign test accuracy.
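The abstract describes the regularizer only in words, so the following is a minimal illustrative sketch and not the authors' implementation. It assumes a toy network with one tanh hidden layer and a sigmoid output, uses the Fast Gradient Sign Method (listed in the keywords) to craft the adversarial input, and measures the latent discrepancy as a squared Euclidean distance. The function names, the distance measure, the weight `lam`, and the choice to apply the task loss to the benign input are all assumptions made for illustration.

```python
import math

def hidden(x, W, b):
    """Hidden-layer representation h = tanh(W x + b)."""
    return [math.tanh(sum(w * xj for w, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def predict(h, v, c):
    """Sigmoid output probability for a binary label."""
    z = sum(vi * hi for vi, hi in zip(v, h)) + c
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(p, y):
    """Binary cross-entropy for prediction p and label y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def input_gradient(x, y, W, b, v, c):
    """Gradient of the cross-entropy loss with respect to the input x."""
    h = hidden(x, W, b)
    p = predict(h, v, c)
    # Backpropagate through the sigmoid output and the tanh hidden layer.
    delta = [(p - y) * vi * (1.0 - hi ** 2) for vi, hi in zip(v, h)]
    return [sum(W[i][j] * delta[i] for i in range(len(W)))
            for j in range(len(x))]

def fgsm(x, y, W, b, v, c, eps):
    """FGSM: move each input coordinate by eps along the gradient sign."""
    g = input_gradient(x, y, W, b, v, c)
    return [xj + eps * ((gj > 0) - (gj < 0)) for xj, gj in zip(x, g)]

def regularized_loss(x, y, W, b, v, c, eps, lam):
    """Task loss plus lam times the benign/adversarial latent discrepancy.

    Assumption: the discrepancy is the squared Euclidean distance between
    the two hidden representations; the paper may define it differently.
    """
    x_adv = fgsm(x, y, W, b, v, c, eps)
    h_ben = hidden(x, W, b)
    h_adv = hidden(x_adv, W, b)
    disparity = sum((hb - ha) ** 2 for hb, ha in zip(h_ben, h_adv))
    task = cross_entropy(predict(h_ben, v, c), y)
    return task + lam * disparity, disparity

# Hypothetical toy parameters, chosen only to exercise the functions.
W = [[0.5, -0.3], [0.8, 0.2]]
b = [0.1, -0.1]
v = [1.0, -1.5]
c = 0.05
x, y = [0.2, 0.7], 1

loss, disparity = regularized_loss(x, y, W, b, v, c, eps=0.1, lam=1.0)
print(loss, disparity)
```

In this sketch, setting `eps` to zero leaves the input unchanged, so the discrepancy term vanishes and the objective reduces to the ordinary task loss. A larger `lam` puts more weight on keeping benign and adversarial hidden representations close.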

Original language: English (US)
Title of host publication: IJCNN 2021 - International Joint Conference on Neural Networks, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9780738133669
State: Published - Jul 18 2021
Event: 2021 International Joint Conference on Neural Networks, IJCNN 2021 - Virtual, Shenzhen, China
Duration: Jul 18 2021 to Jul 22 2021

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
Volume: 2021-July

Conference

Conference: 2021 International Joint Conference on Neural Networks, IJCNN 2021
Country/Territory: China
City: Virtual, Shenzhen
Period: 7/18/21 to 7/22/21

Keywords

  • Adversarial Defenses
  • Fast Gradient Sign Method
  • Regularization

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
