Adversarial examples in random neural networks with general activations (Q6062703)

scientific article; zbMATH DE number 7761503
    Statements

    Adversarial examples in random neural networks with general activations (English)
    6 November 2023
    Summary: A substantial body of empirical work documents the lack of robustness in deep learning models to adversarial examples. Recent theoretical work proved that adversarial examples are ubiquitous in two-layer networks with sub-exponential width and ReLU or smooth activations, and in multi-layer ReLU networks with sub-exponential width. We present a result of the same type, with no restriction on width and for general locally Lipschitz continuous activations. More precisely, given a neural network \(f(\,\cdot\,;\boldsymbol{\theta})\) with random weights \(\boldsymbol{\theta}\), and feature vector \(\boldsymbol{x}\), we show that an adversarial example \(\boldsymbol{x}'\) can be found with high probability along the direction of the gradient \(\nabla_{\boldsymbol{x}}f(\boldsymbol{x};\boldsymbol{\theta})\). Our proof is based on a Gaussian conditioning technique. Instead of proving that \(f\) is approximately linear in a neighborhood of \(\boldsymbol{x}\), we characterize the joint distribution of \(f(\boldsymbol{x};\boldsymbol{\theta})\) and \(f(\boldsymbol{x}';\boldsymbol{\theta})\) for \(\boldsymbol{x}' = \boldsymbol{x}-s(\boldsymbol{x}) \nabla_{\boldsymbol{x}}f (\boldsymbol{x};\boldsymbol{\theta})\), where \(s(\boldsymbol{x}) = \mathrm{sign}(f(\boldsymbol{x}; \boldsymbol{\theta})) \cdot s_d\) for some positive step size \(s_d\).
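    The construction described in the summary can be sketched numerically. The following is a minimal illustration only, not the paper's proof technique: the two-layer architecture, tanh activation (one locally Lipschitz choice), dimensions, and the particular step size \(s_d\) are all assumptions made here for the sake of a runnable example.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    d, m = 100, 200                      # input dimension and hidden width (illustrative choices)
    W = rng.standard_normal((m, d))      # random first-layer weights
    a = rng.standard_normal(m)           # random output weights

    def f(x):
        # two-layer network with tanh activation, scaled by 1/sqrt(m)
        return a @ np.tanh(W @ x) / np.sqrt(m)

    def grad_f(x):
        # analytic input gradient: sum_j a_j * (1 - tanh^2(w_j . x)) * w_j / sqrt(m)
        return W.T @ (a * (1.0 - np.tanh(W @ x) ** 2)) / np.sqrt(m)

    x = rng.standard_normal(d) / np.sqrt(d)   # feature vector (assumed unit-scale)
    g = grad_f(x)

    # Step size aiming the linearization of f at the decision boundary f = 0
    # (a heuristic choice for this sketch, not the s_d from the paper).
    s_d = abs(f(x)) / (g @ g)

    # Perturb along the gradient direction, signed as in the abstract:
    # x' = x - sign(f(x)) * s_d * grad f(x)
    x_adv = x - np.sign(f(x)) * s_d * g

    print(f(x), f(x_adv))                # the perturbation drives f toward the boundary
    ```

    The single gradient step shrinks \(|f|\) toward the decision boundary; in practice one would overshoot slightly (or iterate) to actually flip the sign of the output.
    
    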
    adversarial example
    neural network