Adversarial examples in random neural networks with general activations (Q6062703)

scientific article; zbMATH DE number 7761503
    Statements

    Adversarial examples in random neural networks with general activations (English)
    6 November 2023
    Summary: A substantial body of empirical work documents the lack of robustness in deep learning models to adversarial examples. Recent theoretical work proved that adversarial examples are ubiquitous in two-layer networks with sub-exponential width and ReLU or smooth activations, and in multi-layer ReLU networks with sub-exponential width. We present a result of the same type, with no restriction on width and for general locally Lipschitz continuous activations. More precisely, given a neural network \(f(\,\cdot\,;\boldsymbol{\theta})\) with random weights \(\boldsymbol{\theta}\), and feature vector \(\boldsymbol{x}\), we show that an adversarial example \(\boldsymbol{x}'\) can be found with high probability along the direction of the gradient \(\nabla_{\boldsymbol{x}}f(\boldsymbol{x};\boldsymbol{\theta})\). Our proof is based on a Gaussian conditioning technique. Instead of proving that \(f\) is approximately linear in a neighborhood of \(\boldsymbol{x}\), we characterize the joint distribution of \(f(\boldsymbol{x};\boldsymbol{\theta})\) and \(f(\boldsymbol{x}';\boldsymbol{\theta})\) for \(\boldsymbol{x}' = \boldsymbol{x}-s(\boldsymbol{x}) \nabla_{\boldsymbol{x}}f (\boldsymbol{x};\boldsymbol{\theta})\), where \(s(\boldsymbol{x}) = \mathrm{sign}(f(\boldsymbol{x}; \boldsymbol{\theta})) \cdot s_d\) for some positive step size \(s_d\).
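    The construction described in the summary can be sketched numerically. The following is a minimal illustration only, not the paper's proof technique: the two-layer architecture, tanh activation (one locally Lipschitz choice), dimensions, and the particular step size \(s_d\) are all assumptions made here for the sake of a runnable example.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    d, m = 100, 200                      # input dimension and hidden width (illustrative choices)
    W = rng.standard_normal((m, d))      # random first-layer weights
    a = rng.standard_normal(m)           # random output weights

    def f(x):
        # two-layer network with tanh activation, scaled by 1/sqrt(m)
        return a @ np.tanh(W @ x) / np.sqrt(m)

    def grad_f(x):
        # analytic input gradient: sum_j a_j * (1 - tanh^2(w_j . x)) * w_j / sqrt(m)
        return W.T @ (a * (1.0 - np.tanh(W @ x) ** 2)) / np.sqrt(m)

    x = rng.standard_normal(d) / np.sqrt(d)   # feature vector (assumed unit-scale)
    g = grad_f(x)

    # Step size aiming the linearization of f at the decision boundary f = 0
    # (a heuristic choice for this sketch, not the s_d from the paper).
    s_d = abs(f(x)) / (g @ g)

    # Perturb along the gradient direction, signed as in the abstract:
    # x' = x - sign(f(x)) * s_d * grad f(x)
    x_adv = x - np.sign(f(x)) * s_d * g

    print(f(x), f(x_adv))                # the perturbation drives f toward the boundary
    ```

    The single gradient step shrinks \(|f|\) toward the decision boundary; in practice one would overshoot slightly (or iterate) to actually flip the sign of the output.
    
    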
    adversarial example
    neural network