Active listening.

Friston, Karl J; Sajid, Noor; Quiroga-Martinez, David Ricardo; Parr, Thomas; Price, Cathy J; Holmes, Emma

Affiliation

Friston KJ; The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London, WC1N 3AR, UK. Electronic address: k.friston@ucl.ac.uk.
Sajid N; The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London, WC1N 3AR, UK. Electronic address: noor.sajid.18@ucl.ac.uk.
Quiroga-Martinez DR; The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London, WC1N 3AR, UK. Electronic address: dquiroga@clin.au.dk.
Parr T; The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London, WC1N 3AR, UK. Electronic address: thomas.parr.12@ucl.ac.uk.
Price CJ; The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London, WC1N 3AR, UK. Electronic address: c.j.price@ucl.ac.uk.
Holmes E; The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London, WC1N 3AR, UK. Electronic address: emma.holmes@ucl.ac.uk.

Hear Res; 399: 107998, 2021 Jan.

Article in English | MEDLINE | ID: mdl-32732017

ABSTRACT

This paper introduces active listening, as a unified framework for synthesising and recognising speech. The notion of active listening inherits from active inference, which considers perception and action under one universal imperative: to maximise the evidence for our (generative) models of the world. First, we describe a generative model of spoken words that simulates (i) how discrete lexical, prosodic, and speaker attributes give rise to continuous acoustic signals; and conversely (ii) how continuous acoustic signals are recognised as words. The 'active' aspect involves (covertly) segmenting spoken sentences and borrows ideas from active vision. It casts speech segmentation as the selection of internal actions, corresponding to the placement of word boundaries. Practically, word boundaries are selected that maximise the evidence for an internal model of how individual words are generated. We establish face validity by simulating speech recognition and showing how the inferred content of a sentence depends on prior beliefs and background noise. Finally, we consider predictive validity by associating neuronal or physiological responses, such as the mismatch negativity and P300, with belief updating under active listening, which is greatest in the absence of accurate prior beliefs about what will be heard next.
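The boundary-selection idea in the abstract can be illustrated with a toy sketch. This is not the authors' model, which rests on variational Bayes and a much richer generative model of speech. Everything here is an assumption made for illustration: the two-word `LEXICON`, the Gaussian noise model with `NOISE_SD`, and the function names. The sketch scores each candidate word boundary by the log evidence (log marginal likelihood) of the enclosed segment and keeps the boundary with the highest score. Comparing segments of different lengths by raw log density is itself a simplification.

```python
import math

# Hypothetical toy lexicon: each "word" is a short template of acoustic feature values.
LEXICON = {
    "a": [1.0, 1.0],
    "b": [0.0, 2.0, 0.0],
}
UNIFORM_PRIOR = {"a": 0.5, "b": 0.5}  # assumed prior beliefs over words
NOISE_SD = 0.5                        # assumed Gaussian noise level


def log_likelihood(segment, template, sd=NOISE_SD):
    """Log density of a segment under Gaussian noise around a word template."""
    if len(segment) != len(template):
        return -math.inf  # this toy model has no time warping
    return sum(
        -0.5 * ((x - m) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))
        for x, m in zip(segment, template)
    )


def log_evidence(segment, prior):
    """Log marginal likelihood: log sum_w p(w) p(segment | w)."""
    terms = [
        math.log(prior[w]) + log_likelihood(segment, template)
        for w, template in LEXICON.items()
        if prior.get(w, 0.0) > 0.0
    ]
    finite = [t for t in terms if t > -math.inf]
    if not finite:
        return -math.inf
    top = max(finite)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in finite))


def select_boundary(signal, start, candidates, prior=UNIFORM_PRIOR):
    """Return the candidate end index whose segment has the highest log evidence."""
    return max(candidates, key=lambda end: log_evidence(signal[start:end], prior))


signal = [1.1, 0.9, 0.1, 2.0, -0.1]
first = select_boundary(signal, 0, [2, 3])
second = select_boundary(signal, first, [4, 5])
# A prior that rules out word "b" changes where the second boundary is placed.
biased = select_boundary(signal, first, [4, 5], prior={"a": 1.0, "b": 0.0})
print(first, second, biased)
```

In this toy setting, changing the prior changes the selected boundary, which loosely mirrors the abstract's point that the inferred content of a sentence depends on prior beliefs.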

Subject(s)

Hearing; Language; Noise/adverse effects; Speech Perception

Keywords

Audition; Segmentation; Variational Bayes; Voice; active inference; active listening; speech recognition

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Hearing Type of study: Prognostic_studies Language: En Journal: Hear Res Year: 2021 Document type: Article Country of publication: Netherlands
