Lagrangian objective function leads to improved unforeseen attack generalization
Publication: Q6134360
DOI: 10.1007/s10994-023-06348-3
zbMATH Open: 1518.68337
arXiv: 2103.15385
MaRDI QID: Q6134360
Authors: Mohammad Azizmalayeri, Mohammad Hossein Rohban
Publication date: 22 August 2023
Published in: Machine Learning
Abstract: Recent improvements in deep learning models and their practical applications have raised concerns about the robustness of these models against adversarial examples. Adversarial training (AT) has been shown to be effective at producing models robust against the attack used during training. However, it usually fails against other attacks, i.e., the model overfits to the training attack scheme. In this paper, we propose a simple modification to AT that mitigates this issue. More specifically, we minimize the perturbation norm while maximizing the classification loss in Lagrangian form. We argue that crafting adversarial examples under this scheme yields a learned model with better attack generalization. We compare our final model's robust accuracy against attacks not used during training with that of closely related state-of-the-art AT methods. This comparison demonstrates that our average robust accuracy against unseen attacks is 5.9% higher on the CIFAR-10 dataset and 3.2% higher on the ImageNet-100 dataset than that of the corresponding state-of-the-art methods. We also demonstrate that our attack is faster than other attack schemes designed for unseen attack generalization, and conclude that it is feasible for large-scale datasets.
Full work available at URL: https://arxiv.org/abs/2103.15385
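The abstract describes crafting adversarial examples by maximizing the classification loss while penalizing the perturbation norm in Lagrangian form, roughly max_δ [L(f(x+δ), y) − λ‖δ‖]. The sketch below illustrates this idea on a toy linear classifier with NumPy; the function name, hyperparameters (`lam`, `steps`, `step_size`), and the signed-gradient update are illustrative assumptions, not the authors' actual algorithm or settings.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def lagrangian_attack(W, x, y, lam=0.1, steps=20, step_size=0.1):
    """Gradient ascent on  CE(softmax((x+delta) @ W), y) - lam * ||delta||_2,
    i.e. maximize classification loss while penalizing perturbation size.
    W: (d, k) weights of a toy linear model; x: (n, d) inputs; y: (n,) labels."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        p = softmax((x + delta) @ W)                 # model predictions
        onehot = np.eye(W.shape[1])[y]
        grad_logits = (p - onehot) / len(x)          # d(mean CE)/d(logits)
        grad_delta = grad_logits @ W.T               # chain rule through x + delta
        norm = np.linalg.norm(delta, axis=1, keepdims=True) + 1e-12
        grad_delta -= lam * delta / norm             # gradient of -lam * ||delta||_2
        delta += step_size * np.sign(grad_delta)     # signed-gradient ascent step
    return x + delta
```

Because the penalty term pulls δ back toward zero, the step size implicitly trades off attack strength against perturbation size, which is the mechanism the abstract credits for better generalization to unseen attacks.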
Recommendations
- Robustifying models against adversarial attacks by Langevin dynamics
- Enhancing adversarial attack transferability with multi-scale feature attack
- Generating universal adversarial perturbation with ResNet
- Adversarial defense via the data-dependent activation, total variation minimization, and adversarial training
- Unifying adversarial training algorithms with data gradient regularization
Cited In (6)
- Spanning attack: reinforce black-box attacks with unlabeled data
- Towards improving fast adversarial training in multi-exit network
- LADDER: latent boundary-guided adversarial training
- A3T: accuracy aware adversarial training
- Generating universal adversarial perturbation with ResNet
- Robustifying models against adversarial attacks by Langevin dynamics