Adversarial Robustness with Partial Isometry.

Shi-Garrier, Loïc; Bouaynaya, Nidhal Carla; Delahaye, Daniel

Affiliation

Shi-Garrier L; ENAC, Université de Toulouse, 31400 Toulouse, France.
Bouaynaya NC; Department of Electrical and Computer Engineering, Rowan University, Glassboro, NJ 08028, USA.
Delahaye D; ENAC, Université de Toulouse, 31400 Toulouse, France.

Entropy (Basel); 26(2), 2024 Jan 24.

Article in English | MEDLINE | ID: mdl-38392358

ABSTRACT

Despite their remarkable performance, deep learning models still lack robustness guarantees, particularly in the presence of adversarial examples. This significant vulnerability raises concerns about their trustworthiness and hinders their deployment in critical domains that require certified levels of robustness. In this paper, we introduce an information geometric framework to establish precise robustness criteria for l2 white-box attacks in a multi-class classification setting. We endow the output space with the Fisher information metric and derive criteria on the input-output Jacobian to ensure robustness. We show that model robustness can be achieved by constraining the model to be partially isometric around the training points. We evaluate our approach using MNIST and CIFAR-10 datasets against adversarial attacks, revealing its substantial improvements over defensive distillation and Jacobian regularization for medium-sized perturbations and its superior robustness performance to adversarial training for large perturbations, all while maintaining the desired accuracy.
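The abstract does not spell out its formulas, so the following is only a minimal sketch of the underlying idea, not the authors' implementation. It assumes a toy linear softmax classifier p(x) = softmax(W x) and uses the standard Fisher information metric of a categorical distribution, pulled back to input space through the input-output Jacobian J = dp/dx as G(x) = J^T diag(1/p) J. The function names, the weights W, and the test perturbations are all hypothetical and chosen for illustration.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def fisher_pullback(W, x):
    """Pullback of the Fisher metric to input space for the toy model
    p(x) = softmax(W x): G(x) = J^T diag(1/p) J, with J = dp/dx."""
    c, d = len(W), len(x)
    p = softmax([sum(w * v for w, v in zip(row, x)) for row in W])
    # Analytic Jacobian of the toy model:
    # dp_k/dx_i = p_k * (W[k][i] - sum_j p_j * W[j][i])
    mean_w = [sum(p[j] * W[j][i] for j in range(c)) for i in range(d)]
    J = [[p[k] * (W[k][i] - mean_w[i]) for i in range(d)] for k in range(c)]
    return [[sum(J[k][a] * J[k][b] / p[k] for k in range(c)) for b in range(d)]
            for a in range(d)]

def fisher_sq_length(G, delta):
    """Squared first-order Fisher length of an input perturbation delta."""
    n = len(delta)
    return sum(delta[a] * G[a][b] * delta[b] for a in range(n) for b in range(n))

# Hypothetical 2-class, 3-feature weights, chosen only for illustration.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
G = fisher_pullback(W, [0.0, 0.0, 0.0])
print(fisher_sq_length(G, [1.0, -1.0, 0.0]))  # direction that moves the output
print(fisher_sq_length(G, [0.0, 0.0, 1.0]))   # direction the model ignores
```

Under this reading, a perturbation in the kernel of J leaves the output distribution unchanged to first order, while other directions change it by an amount set by G. A partially isometric model would, roughly, make G act as a constant multiple of the identity on the directions orthogonal to that kernel; the exact criterion and training procedure used in the paper are not stated in the abstract.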

Keywords

adversarial robustness; Fisher information metric; information geometry; multi-class classification

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: Entropy (Basel) Year: 2024 Document type: Article Affiliation country: France Country of publication: Switzerland
