Search | VHL Regional Portal (Portal Regional de la BVS)

Weakly Supervised Learning with Multi-Stream CNN-LSTM-HMMs to Discover Sequential Parallelism in Sign Language Videos.

Koller, Oscar; Camgoz, Necati Cihan; Ney, Hermann; Bowden, Richard.

IEEE Trans Pattern Anal Mach Intell ; 42(9): 2306-2320, 2020 09.

Article in English | MEDLINE | ID: mdl-30990421

ABSTRACT

In this work we present a new approach to the field of weakly supervised learning in the video domain. Our method is relevant to sequence learning problems which can be split up into sub-problems that occur in parallel. Here, we experiment with sign language data. The approach exploits sequence constraints within each independent stream and combines them by explicitly imposing synchronisation points to make use of parallelism that all sub-problems share. We do this with multi-stream HMMs while adding intermediate synchronisation constraints among the streams. We embed powerful CNN-LSTM models in each HMM stream following the hybrid approach. This allows the discovery of attributes which on their own lack sufficient discriminative power to be identified. We apply the approach to the domain of sign language recognition exploiting the sequential parallelism to learn sign language, mouth shape and hand shape classifiers. We evaluate the classifiers on three publicly available benchmark data sets featuring challenging real-life sign language with over 1,000 classes, full sentence based lip-reading and articulated hand shape recognition on a fine-grained hand shape taxonomy featuring over 60 different hand shapes. We clearly outperform the state-of-the-art on all data sets and observe significantly faster convergence using the parallel alignment approach.

Upper and Lower Tight Error Bounds for Feature Omission with an Extension to Context Reduction.

Schluter, Ralf; Beck, Eugen; Ney, Hermann.

IEEE Trans Pattern Anal Mach Intell ; 41(2): 502-514, 2019 02.

Article in English | MEDLINE | ID: mdl-29990282

ABSTRACT

In this work, fundamental analytic results in the form of error bounds are presented that quantify the effect of feature omission and selection for pattern classification in general, as well as the effect of context reduction in string classification, like automatic speech recognition, printed/handwritten character recognition, or statistical machine translation. A general simulation framework is introduced that supports discovery and proof of error bounds, which lead to the error bounds presented here. Initially derived tight lower and upper bounds for feature omission are generalized to feature selection, followed by another extension to context reduction of string class priors (aka language models) in string classification. For string classification, the quantitative effect of string class prior context reduction on symbol-level Bayes error is presented. The tightness of the original feature omission bounds seems lost in this case, as further simulations indicate. However, combining both feature omission and context reduction, the tightness of the bounds is retained. A central result of this work is the proof of the existence, and the amount, of a statistical threshold w.r.t. the introduction of additional features in general pattern classification, or the increase of context in string classification, beyond which a decrease in Bayes error is guaranteed.

Does the cost function matter in Bayes decision rule?

Schlüter, Ralf; Nussbaum-Thom, Markus; Ney, Hermann.

IEEE Trans Pattern Anal Mach Intell ; 34(2): 292-301, 2012 Feb.

Article in English | MEDLINE | ID: mdl-21844628

ABSTRACT

In many tasks in pattern recognition, such as automatic speech recognition (ASR), optical character recognition (OCR), part-of-speech (POS) tagging, and other string recognition tasks, we are faced with a well-known inconsistency: The Bayes decision rule is usually used to minimize string (symbol sequence) error, whereas, in practice, we want to minimize symbol (word, character, tag, etc.) error. When comparing different recognition systems, we do indeed use symbol error rate as an evaluation measure. The topic of this work is to analyze the relation between string (i.e., 0-1) and symbol error (i.e., metric, integer valued) cost functions in the Bayes decision rule, for which fundamental analytic results are derived. Simple conditions are derived for which the Bayes decision rule with integer-valued metric cost function and with 0-1 cost gives the same decisions or leads to classes with limited cost. The corresponding conditions can be tested with complexity linear in the number of classes. The results obtained do not make any assumption w.r.t. the structure of the underlying distributions or the classification problem. Nevertheless, the general analytic results are analyzed via simulations of string recognition problems with Levenshtein (edit) distance cost function. The results support earlier findings that considerable improvements are to be expected when initial error rates are high.

Subject(s)

Bayes Theorem; Pattern Recognition, Automated/methods; Algorithms; Computer Simulation; Speech Recognition Software
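
The contrast drawn in the abstract above, between a 0-1 (string error) cost and a Levenshtein (symbol error) cost in the Bayes decision rule, can be illustrated with a minimal sketch. This is not the authors' code. The toy posterior, the function names, and the restriction of candidate decisions to the strings that carry posterior mass are all assumptions made for illustration.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (unit insert/delete/substitute costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def zero_one(a, b):
    """0-1 cost: any mismatch of the whole string counts as one error."""
    return 0 if a == b else 1


def bayes_decision(posterior, cost):
    """Pick the candidate with minimum expected cost under the posterior.

    Assumption: candidates are limited to the keys of `posterior`.
    """
    def expected_cost(candidate):
        return sum(p * cost(candidate, truth) for truth, p in posterior.items())
    return min(posterior, key=expected_cost)


# Hypothetical posterior over three strings (values chosen for illustration).
posterior = {"AB": 0.35, "AC": 0.25, "DE": 0.40}

map_choice = bayes_decision(posterior, zero_one)      # most probable string
edit_choice = bayes_decision(posterior, levenshtein)  # lowest expected edit distance
print(map_choice, edit_choice)
```

In this toy case the two cost functions disagree: the 0-1 cost selects the single most probable string, while the edit-distance cost prefers a string that is close to most of the probability mass.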

Latent log-linear models for handwritten digit classification.

Deselaers, Thomas; Gass, Tobias; Heigold, Georg; Ney, Hermann.

IEEE Trans Pattern Anal Mach Intell ; 34(6): 1105-17, 2012 Jun.

Article in English | MEDLINE | ID: mdl-22064798

ABSTRACT

We present latent log-linear models, an extension of log-linear models incorporating latent variables, and we propose two applications thereof: log-linear mixture models and image deformation-aware log-linear models. The resulting models are fully discriminative, can be trained efficiently, and the model complexity can be controlled. Log-linear mixture models offer additional flexibility within the log-linear modeling framework. Unlike previous approaches, the image deformation-aware model directly considers image deformations and allows for a discriminative training of the deformation parameters. Both are trained using alternating optimization. For certain variants, convergence to a stationary point is guaranteed and, in practice, even variants without this guarantee converge and find models that perform well. We tune the methods on the USPS data set and evaluate on the MNIST data set, demonstrating the generalization capabilities of our proposed models. Our models, although using significantly fewer parameters, are able to obtain competitive results with models proposed in the literature.

Subject(s)

Linear Models; Pattern Recognition, Automated/methods; Algorithms; Image Interpretation, Computer-Assisted/methods; Natural Language Processing

Extended query refinement for medical image retrieval.

Deserno, Thomas M; Güld, Mark O; Plodowski, Bartosz; Spitzer, Klaus; Wein, Berthold B; Schubert, Henning; Ney, Hermann; Seidl, Thomas.

J Digit Imaging ; 21(3): 280-9, 2008 Sep.

Article in English | MEDLINE | ID: mdl-17497197

ABSTRACT

The impact of image pattern recognition on accessing large databases of medical images has recently been explored, and content-based image retrieval (CBIR) in medical applications (IRMA) is researched. At present, however, the impact of image retrieval on diagnosis is limited, and practical applications are scarce. One reason is the lack of suitable mechanisms for query refinement, in particular, the ability to (1) restore previous session states, (2) combine individual queries by Boolean operators, and (3) provide continuous-valued query refinement. This paper presents a powerful user interface for CBIR that provides all three mechanisms for extended query refinement. The various mechanisms of man-machine interaction during a retrieval session are grouped into four classes: (1) output modules, (2) parameter modules, (3) transaction modules, and (4) process modules, all of which are controlled by detailed query logging. The query logging is linked to a relational database. Nested loops for interaction provide a maximum of flexibility within a minimum of complexity, as the entire data flow is still controlled within a single Web page. Our approach is implemented to support various modalities, orientations, and body regions using global features that model gray scale, texture, structure, and global shape characteristics. The resulting extended query refinement has a significant impact on medical CBIR applications.

Subject(s)

Information Storage and Retrieval/methods; Internet/statistics & numerical data; Radiographic Image Interpretation, Computer-Assisted; Radiology Information Systems/instrumentation; User-Computer Interface; Computer Graphics; Databases, Factual; Diagnostic Imaging/methods; Humans; Medical Informatics Applications; Pattern Recognition, Automated; Sensitivity and Specificity; Software Design

Deformation models for image recognition.

Keysers, Daniel; Deselaers, Thomas; Gollan, Christian; Ney, Hermann.

IEEE Trans Pattern Anal Mach Intell ; 29(8): 1422-35, 2007 Aug.

Article in English | MEDLINE | ID: mdl-17568145

ABSTRACT

We present the application of different nonlinear image deformation models to the task of image recognition. The deformation models are especially suited for local changes as they often occur in the presence of image object variability. We show that, among the discussed models, there is one approach that combines simplicity of implementation, low computational complexity, and highly competitive performance across various real-world image recognition tasks. We show experimentally that the model performs very well for four different handwritten digit recognition tasks and for the classification of medical images, thus showing high generalization capacity. In particular, an error rate of 0.54 percent on the MNIST benchmark is achieved, as well as the lowest reported error rate, specifically 12.6 percent, in the 2005 international ImageCLEF evaluation of medical image categorization.

Subject(s)

Image Processing, Computer-Assisted; Pattern Recognition, Automated; Algorithms; Artificial Intelligence; Computer Simulation; Humans; Image Interpretation, Computer-Assisted; Nonlinear Dynamics

Automatic categorization of medical images for content-based retrieval and data mining.

Lehmann, Thomas M; Güld, Mark O; Deselaers, Thomas; Keysers, Daniel; Schubert, Henning; Spitzer, Klaus; Ney, Hermann; Wein, Berthold B.

Comput Med Imaging Graph ; 29(2-3): 143-55, 2005.

Article in English | MEDLINE | ID: mdl-15755534

ABSTRACT

Categorization of medical images means selecting the appropriate class for a given image out of a set of pre-defined categories. This is an important step for data mining and content-based image retrieval (CBIR). So far, published approaches are capable of distinguishing up to 10 categories. In this paper, we evaluate automatic categorization into more than 80 categories describing the imaging modality and direction as well as the body part and biological system examined. Based on 6231 reference images from hospital routine, 85.5% correctness is obtained combining global texture features with scaled images. With a frequency of 97.7%, the correct class is within the best ten matches, which is sufficient for medical CBIR applications.

Subject(s)

Diagnostic Imaging; Information Storage and Retrieval; Automation; Germany
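
The abstract above reports two kinds of figures: how often the best match is correct, and how often the correct class appears among the ten best matches. A minimal sketch of how such top-k correctness could be computed follows. The function name, the class labels, and the toy rankings are hypothetical and are not taken from the paper.

```python
def top_k_correctness(ranked_predictions, true_labels, k):
    """Fraction of items whose true label is among the k best-ranked classes.

    ranked_predictions: one list per item, classes ordered best match first.
    true_labels: the reference class for each item.
    """
    hits = sum(1 for ranked, truth in zip(ranked_predictions, true_labels)
               if truth in ranked[:k])
    return hits / len(true_labels)


# Hypothetical rankings for four images over three made-up categories.
ranked = [
    ["chest", "skull", "hand"],
    ["hand", "chest", "skull"],
    ["skull", "hand", "chest"],
    ["chest", "hand", "skull"],
]
truth = ["chest", "chest", "chest", "skull"]

for k in (1, 2, 3):
    print(k, top_k_correctness(ranked, truth, k))
```

With k = 1 this is the usual classification correctness. Larger k corresponds to the retrieval setting, where a user inspects a short list of candidate matches.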

Adaptation in statistical pattern recognition using tangent vectors.

Keysers, Daniel; Macherey, Wolfgang; Ney, Hermann; Dahmen, Jörg.

IEEE Trans Pattern Anal Mach Intell ; 26(2): 269-74, 2004 Feb.

Article in English | MEDLINE | ID: mdl-15376902

ABSTRACT

We integrate the tangent method into a statistical framework for classification analytically and practically. The resulting consistent framework for adaptation allows us to efficiently estimate the tangent vectors representing the variability. The framework improves classification results on two real-world pattern recognition tasks from the domains of handwritten character recognition and automatic speech recognition.

Subject(s)

Algorithms; Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Information Storage and Retrieval/methods; Pattern Recognition, Automated; Subtraction Technique; Cluster Analysis; Automatic Data Processing; Feedback; Image Enhancement/methods; Models, Statistical; Natural Language Processing; Numerical Analysis, Computer-Assisted; Reading; Reproducibility of Results; Sensitivity and Specificity; Signal Processing, Computer-Assisted; Speech Perception
