An informative dual ForkNet for video anomaly detection.

Li, Hongjun; Wang, Yunlong; Wang, Yating; Chen, Junjie

Affiliation

Li H; School of Information Science and Technology, Nantong University, Nantong 226019, Jiangsu, China. Electronic address: lihongjun@ntu.edu.cn.
Wang Y; School of Information Science and Technology, Nantong University, Nantong 226019, Jiangsu, China.
Wang Y; School of Information Science and Technology, Nantong University, Nantong 226019, Jiangsu, China.
Chen J; School of Information Science and Technology, Nantong University, Nantong 226019, Jiangsu, China.

Neural Netw ; 179: 106509, 2024 Nov.

Article in English | MEDLINE | ID: mdl-39029297

ABSTRACT

An autoencoder for video anomaly detection learns an "informative" representation of normal data by learning to reconstruct a set of input observations; that representation can then be used to identify abnormal data. Building on the encoding-decoding structure, we explore a novel dual ForkNet architecture that can dissociate and process the spatio-temporal representation. It is well known in the information theory community that the coding process of most autoencoders is inevitably accompanied by some loss of information. In this dual ForkNet, we focus on mitigating the information loss problem and propose a novel architectural recalibration approach, which we term "Informetrics Recalibration" (IR). IR adaptively recalibrates the latent feature representation by explicitly modeling the similarity between the corresponding feature maps of the encoder and decoder, retaining more useful semantic information and producing greater differentiation between normal and abnormal events. Additionally, because the structure of the autoencoder itself makes deep semantic information difficult to obtain, we introduce a Secondary Encoder (SE) in each ForkNet to recalibrate the target feature responses of the latent feature representation. Our model is easy to train and robust in application, because it consists essentially of ResNet blocks and uses no complicated modules. Extensive experiments on five publicly available benchmarks show that our model outperforms existing state-of-the-art architectures, demonstrating the effectiveness of our framework.
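
The abstract does not say how the encoder-decoder similarity is computed or how it is applied to the latent features. Purely as an illustration of the general idea of similarity-driven recalibration, the sketch below assumes cosine similarity between each pair of corresponding feature maps and a multiplicative per-channel gate. The function names `cosine_similarity` and `recalibrate`, the gating formula, and the list-of-lists data layout are all hypothetical choices made for this example and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two flattened feature maps (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero map as carrying no similarity signal
    return dot / (norm_a * norm_b)


def recalibrate(latent, encoder_maps, decoder_maps):
    """Scale each latent channel by a gate derived from encoder/decoder similarity.

    ASSUMPTION: the gate 0.5 * (1 + cos) maps cosine similarity from [-1, 1]
    to [0, 1]. This is an illustrative choice, not the authors' IR module.
    """
    if not (len(latent) == len(encoder_maps) == len(decoder_maps)):
        raise ValueError("expected one encoder map and one decoder map per latent channel")
    recalibrated = []
    for channel, enc, dec in zip(latent, encoder_maps, decoder_maps):
        gate = 0.5 * (1.0 + cosine_similarity(enc, dec))
        recalibrated.append([gate * value for value in channel])
    return recalibrated


if __name__ == "__main__":
    latent = [[2.0, 4.0], [2.0, 4.0]]
    encoder_maps = [[1.0, 0.0], [1.0, 0.0]]
    decoder_maps = [[1.0, 0.0], [0.0, 1.0]]  # second pair is orthogonal
    print(recalibrate(latent, encoder_maps, decoder_maps))
```

Under these assumptions, a channel whose encoder and decoder maps agree passes through unchanged, while a channel whose maps disagree is attenuated.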

Subject(s)

Algorithms; Neural Networks, Computer; Video Recording; Humans; Semantics

Keywords

ForkNet structure; Informetrics recalibration; Video anomaly detection

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Video Recording / Algorithms / Neural Networks, Computer Limit: Humans Language: English Journal: Neural Netw Journal subject: NEUROLOGY Year: 2024 Document type: Article Country of publication: United States
