Results 1 - 20 of 16,029
1.
Spectrochim Acta A Mol Biomol Spectrosc ; 324: 124953, 2025 Jan 05.
Article in English | MEDLINE | ID: mdl-39128385

ABSTRACT

Improving the ease of operation and portability of hydrogen peroxide (H2O2) detection holds significant application value in daily production and life. However, achieving rapid colorimetric detection of H2O2 with quantifiable color change remains a challenge. In this study, we achieved rapid, visual detection of H2O2 using MoOx (2 ≤ x ≤ 3) nanoparticles rich in oxygen vacancies combined with machine vision. As the concentration of H2O2 increased, the detection system exhibited a visible multi-color change from blue to green and then to yellow, and the absorption peak near 680 nm measured by UV-visible spectrophotometry gradually decreased. The system shows excellent sensitivity over a wide linear range of 0.1-600 µmol/L, detecting concentrations as low as 0.1 µmol/L with good selectivity toward H2O2. The sensing mechanism, based on the change of oxygen vacancies in MoOx, was revealed through characterization methods such as XPS, EPR, and DFT. In addition, a Hue, Saturation, Value (HSV) visual analysis system based on MoOx was constructed to enable rapid, portable, and sensitive monitoring of H2O2 in practical application scenarios. This work offers an easy-to-operate, low-cost, and convenient route to rapid colorimetric determination of H2O2 and has broad application prospects in daily life and industrial production.
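The HSV readout described in this abstract reduces to converting a camera's RGB reading to HSV and tracking the hue angle, which falls from blue toward yellow as concentration rises. A minimal sketch with Python's standard library; the RGB values below are hypothetical illustrations, not calibration data from the study.

```python
import colorsys

def hue_degrees(r, g, b):
    """Hue angle (degrees) of an 8-bit RGB reading, via HSV conversion."""
    h, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0

# Hypothetical camera readings along the blue -> green -> yellow transition:
# the hue angle falls monotonically as H2O2 concentration rises, so a
# per-device calibration curve can map hue to concentration.
blue = hue_degrees(30, 60, 220)
green = hue_degrees(40, 200, 60)
yellow = hue_degrees(220, 210, 40)
```

In practice the hue-to-concentration mapping would come from a calibration series of known H2O2 standards.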

2.
Article in English | MEDLINE | ID: mdl-39035636

ABSTRACT

Objectives: Although color information is important in gastrointestinal endoscopy, few studies have examined how endoscopic images are perceived by people with color vision deficiency. We aimed to investigate differences in the visibility of blood vessels during endoscopic submucosal dissection (ESD) among people with different color vision characteristics and to examine the effect of red dichromatic imaging (RDI) on blood vessel visibility. Methods: Seventy-seven pairs of endoscopic images of white light imaging (WLI) and RDI of the same site were obtained during colorectal ESD. The original images were set as Type C (WLI-C and RDI-C), a common color vision. These images were computationally converted to simulate images perceived by people with protanopia (Type P) or deuteranopia (Type D), denoted WLI-P and RDI-P or WLI-D and RDI-D. Blood vessels and background submucosa that needed to be distinguished during ESD were selected in each image, and the color difference (ΔE00) between these two objects was measured to assess the visibility of blood vessels. Results: ΔE00 between a blood vessel and the submucosa was greater under RDI (RDI-C/P/D: 24.05 ± 0.64/22.85 ± 0.66/22.61 ± 0.64) than under WLI (WLI-C/P/D: 22.26 ± 0.60/5.19 ± 0.30/8.62 ± 0.42), regardless of color vision characteristics. This improvement was more pronounced in Type P and Type D, whose values approached Type C under RDI. Conclusions: Color vision characteristics affect the visibility of blood vessels during ESD, and RDI improves blood vessel visibility regardless of color vision characteristics.
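The metric used in this study, ΔE00 (CIEDE2000), is a weighted distance in CIELAB color space. As a simplified sketch of the underlying idea, the older CIE76 metric is just the Euclidean distance between two L*a*b* coordinates; the vessel/submucosa values below are hypothetical, not measurements from the study.

```python
import math

def delta_e_76(lab1, lab2):
    """CIE76 color difference: Euclidean distance between two L*a*b* colors.
    (The study reports CIEDE2000, ΔE00, which adds lightness/chroma/hue
    weightings on top of this basic distance.)"""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(lab1, lab2)))

# Hypothetical L*a*b* coordinates for a blood vessel and background submucosa:
vessel = (45.0, 38.0, 22.0)
submucosa = (62.0, 18.0, 30.0)
visibility = delta_e_76(vessel, submucosa)  # larger -> easier to distinguish
```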

3.
Food Chem ; 462: 140911, 2025 Jan 01.
Article in English | MEDLINE | ID: mdl-39213969

ABSTRACT

This study presents a low-cost smartphone-based imaging technique called smartphone video imaging (SVI), which captures short videos of samples illuminated by a colour-changing screen. Assisted by artificial intelligence, the study develops new capabilities that make SVI a versatile imaging technique comparable to hyperspectral imaging (HSI). SVI enables classification of samples with heterogeneous contents, spatial representation of analyte contents, and reconstruction of hyperspectral images from videos. When integrated with a residual neural network, SVI outperforms traditional computer vision methods for ginseng classification. Moreover, the technique effectively maps the spatial distribution of saffron purity in powder mixtures with predictive performance comparable to that of HSI. In addition, SVI combined with the U-Net deep learning module can produce high-quality images that closely resemble target images acquired by HSI. These results suggest that SVI can serve as a consumer-oriented solution for food authentication.


Subject(s)
Smartphone, Hyperspectral Imaging/methods, Computer-Assisted Image Processing/methods, Food Contamination/analysis, Video Recording, Food Analysis
4.
J. optom. (Internet) ; 17(3): [100490], Jul-Sep 2024. illus, graphs, tables
Article in English | IBECS | ID: ibc-231868

ABSTRACT

Purpose: To evaluate the efficacy of anti-suppression exercises in achieving binocular vision in children with small-angle esotropia. Methods: A retrospective review was conducted of patients aged 3-8 years who underwent anti-suppression exercises for either monocular or alternate suppression between January 2016 and December 2021. Patients with esotropia of less than 15 prism diopters (PD) and visual acuity ≥ 6/12 were included. Patients with previous intra-ocular surgery or less than three months of follow-up were excluded. Success was defined as the development of binocular single vision (BSV) for distance, near, or both (measured clinically with either the 4 prism base-out test or the Worth four-dot test) and maintained at two consecutive visits. Qualified success was defined as the presence of a diplopia response for both distance and near. Additionally, improvement in near stereoacuity was measured using the Stereo Fly test. Results: Eighteen patients with a mean age of 5.4 ± 1.38 years (range 3-8 years) at initiation of exercises were included in the study. The male-to-female ratio was 10:8. The mean best corrected visual acuity was 0.18 logMAR and the mean spherical equivalent was +3.8 ± 0.14 diopters (D). The etiology of the esotropia was fully accommodative refractive esotropia (8), microtropia (1), post-operative infantile esotropia (4), partially accommodative esotropia (1), and post-operative partially accommodative esotropia (4). Patients received office-based, home-based, or both modes of treatment for an average duration of 4.8 months (range 3-8). After therapy, BSV was achieved for either distance or near in 66.6% of patients (95% CI = 40.03-93.31%). Binocular single vision for both distance and near was seen in 50% of children. Qualified success was observed in 38.46% of patients. Persistence of suppression was observed in one patient (5.5%)... (AU)


Subject(s)
Humans, Child, Suppression, Binocular Vision, Esotropia, Visual Acuity, Therapeutics
5.
J. optom. (Internet) ; 17(3): [100506], Jul-Sep 2024. illus, tables, graphs
Article in English | IBECS | ID: ibc-231870

ABSTRACT

Purpose: To investigate the visual function correlates of self-reported vision-related night driving difficulties among drivers. Methods: One hundred and seven drivers (age: 46.06 ± 8.24 years, visual acuity [VA] of 0.2 logMAR or better) were included in the study. A standard vision and night driving questionnaire (VND-Q) was administered. VA and contrast sensitivity were measured under photopic and mesopic conditions. Mesopic VA was remeasured after introducing a peripheral glare source into the participants' field of view to enable computation of a disability glare index. Regression analyses were used to assess the associations between VND-Q scores and visual function measures. Results: The mean VND-Q score was -3.96 ± 1.95 logit (interval scale score: 2.46 ± 1.28). Simple linear regression models for photopic contrast sensitivity, mesopic VA, mesopic contrast sensitivity, and disability glare index each significantly predicted VND-Q score (P < 0.05), with mesopic VA and disability glare index accounting for the greatest variation (21%) in VND-Q scores, followed by photopic contrast sensitivity (19%) and mesopic contrast sensitivity (15%). A multiple regression model relating these predictors to VND-Q score yielded significant results, F(4, 102) = 8.58, P < 0.001, adj. R2 = 0.2224. Seeing dark-colored cars was the most challenging vision task. Conclusion: Changes in mesopic visual acuity, photopic and mesopic contrast sensitivity, and disability glare index are associated with and explain night driving-related visual difficulties. It is recommended to incorporate measurement of these visual functions into assessments of driving performance.(AU)
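Each simple linear regression in this study amounts to fitting one predictor and reporting the variance explained (R²). A minimal self-contained sketch with made-up data, not the study's measurements:

```python
def ols_fit(x, y):
    """Ordinary least squares with a single predictor:
    returns (slope, intercept, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)  # proportion of variance explained
    return slope, intercept, r_squared

# Hypothetical readings: worse (higher) mesopic VA in logMAR should predict
# higher (less negative) VND-Q difficulty scores.
mesopic_va = [0.0, 0.1, 0.1, 0.2, 0.3, 0.3, 0.4, 0.5]
vndq_logit = [-6.0, -5.2, -5.5, -4.1, -3.6, -3.9, -2.8, -2.1]
slope, intercept, r2 = ols_fit(mesopic_va, vndq_logit)
```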


Subject(s)
Humans, Male, Female, Automobile Driving, Night Vision, Traffic Accidents, Color Vision, Mesopic Vision, Glare/adverse effects
6.
J. optom. (Internet) ; 17(3): [100510], Jul-Sep 2024. tables
Article in English | IBECS | ID: ibc-231872

ABSTRACT

Purpose: To evaluate the association between visual symptoms and the use of digital devices, considering the presence of visual dysfunctions. Methods: An optometric examination was conducted in a clinical sample of 346 patients to diagnose any type of visual anomaly. Visual symptoms were collected using the validated SQVD questionnaire. A threshold of 6 hours per day was used to quantify the effect of digital device usage, and patients were divided into two groups: under and over 35 years old. A multivariate logistic regression was employed to investigate the association between digital device use and symptoms, with visual dysfunctions considered as a confounding variable. Crude and adjusted odds ratios (OR) were calculated for each variable. Results: 57.02% of the subjects reported visual symptoms and 65.02% exhibited some form of visual dysfunction. For patients under 35 years old, an association was found between visual symptoms and digital device use (OR = 2.10, p = 0.01). However, after adjusting for visual dysfunctions, this association disappeared (OR = 1.44, p = 0.27); instead, symptoms were associated with refractive dysfunction (OR = 6.52, p < 0.001), accommodative dysfunction (OR = 10.47, p < 0.001), binocular dysfunction (OR = 6.68, p < 0.001), and combined accommodative plus binocular dysfunction (OR = 46.84, p < 0.001). Among patients over 35 years old, no association was found between symptoms and the use of digital devices (OR = 1.27, p = 0.49), but there was an association between symptoms and refractive dysfunction (OR = 3.54, p = 0.001). Conclusions: Visual symptoms do not depend on the duration of digital device use but rather on the presence of a visual dysfunction, whether refractive, accommodative, and/or binocular, which should be diagnosed.(AU)
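A crude odds ratio like those reported in this study comes straight from a 2×2 table (the adjusted ORs additionally control for confounders in the logistic model). A minimal sketch with made-up counts, not the study's data:

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Crude odds ratio: odds of symptoms among heavy device users
    divided by odds of symptoms among the rest."""
    return (exposed_cases / exposed_noncases) / (unexposed_cases / unexposed_noncases)

# Hypothetical 2x2 table: rows = device use >6 h/day vs not,
# columns = symptomatic vs asymptomatic.
crude_or = odds_ratio(60, 40, 42, 58)
```

An OR of 1 would mean symptoms are equally likely in both exposure groups; adjustment for visual dysfunctions, as in the study, can pull an apparently elevated crude OR back toward 1.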


Subject(s)
Humans, Male, Female, Ocular Vision, Vision Tests, Visual Fields, Visually Impaired Persons, Binocular Vision, Surveys and Questionnaires, Optometry
7.
J. optom. (Internet) ; 17(3): [100491], Jul-Sep 2024. illus, tables, graphs
Article in English | IBECS | ID: ibc-231873

ABSTRACT

Background and objectives: The invention described herein is a prototype based on computer vision technology that measures depth perception and is intended for the early examination of stereopsis. Materials and methods: The prototype (software and hardware) is a depth perception measurement system consisting of: (a) a screen showing stereoscopic models with a guide point that the subject must point to; (b) a camera capturing the distance between the screen and the subject's finger; and (c) a unit for recording, processing, and storing the captured measurements. For test validation, the reproducibility and reliability of the platform were calculated by comparing results with standard stereoscopic tests. A demographic study of depth perception by subgroup analysis is shown. Subjective comparison of the different tests was carried out by means of a satisfaction survey. Results: We included 94 subjects, 25 children and 69 adults, with a mean age of 34.2 ± 18.9 years; 36.2% were men and 63.8% were women. The DALE3D platform obtained good repeatability, with an intraclass correlation coefficient (ICC) between 0.87 and 0.94 and a coefficient of variation (CV) between 0.1 and 0.26. Thresholds separating optimal from suboptimal results were calculated for the Randot and DALE3D tests. Spearman's correlation coefficient between the thresholds was not statistically significant (p value > 0.05). The test was considered more visually appealing and easier to use by the participants (90% maximum score). Conclusions: The DALE3D platform is a potentially useful tool for measuring depth perception with optimal reproducibility rates. Its innovative design makes it a more intuitive tool for children than current stereoscopic tests. Nevertheless, further studies will be needed to assess whether the depth perception measured by the DALE3D platform is a sufficiently reliable parameter for assessing stereopsis.(AU)


Subject(s)
Humans, Male, Female, Child, Adolescent, Young Adult, Binocular Vision, Depth Perception, Ocular Vision, Vision Tests
8.
J. optom. (Internet) ; 17(3): [100514], Jul-Sep 2024. tables
Article in English | IBECS | ID: ibc-231876

ABSTRACT

Purpose: To analyze the binocular vision of individuals aged 18 to 35 years diagnosed with keratoconus, using spectacles and rigid gas-permeable (RGP) contact lenses. The research was led by the Universidad Autónoma de Aguascalientes, México and Fundación Universitaria del Área Andina Pereira, Colombia. Methods: A single-center, prospective, non-randomized, comparative, interventional, open-label study comparing binocular vision performance with spectacles and with RGP contact lenses, carried out from December 2018 to December 2019. Sampling proceeded by consecutive cases with keratoconus that met the inclusion criteria until the proposed sample size was reached. Results: RGP contact lenses notably enhanced distance and near visual acuity in keratoconus patients compared to spectacles. Visual alignment analysis showed exophoria at both distances, slightly higher with RGP contact lenses. The difference was statistically significant (p < 0.05), with 82.5% presenting compensated phoria with spectacles and only 42.50% with RGP contact lenses. Stereoscopic vision improved while wearing RGP contact lenses (42.59%), although accommodation and accommodative flexibility remained within normal ranges. Conclusions: Patients with keratoconus fitted with RGP contact lenses have improved binocular vision skills such as visual acuity, stereopsis, and accommodative flexibility. However, even when the vergence and motor system is decompensated with respect to normal ranges, the range between break and recovery points for both fusional reserves and the near point of convergence (NPC) improves with the use of RGP contact lenses, indicating an adaptive condition of the motor system over the medium to long term.(AU)


Subject(s)
Humans, Male, Female, Adolescent, Young Adult, Keratoconus, Eyeglasses, Contact Lenses, Binocular Vision, Vision Tests, Colombia, Mexico, Ophthalmology, Prospective Studies
9.
Article in English | MEDLINE | ID: mdl-39250172

ABSTRACT

INTRODUCTION: Fusional reserves differ with the method of measurement. The goal of this study was to compare subjective and objective responses during the measurement of positive and negative fusional reserves using both step and ramp methods. METHODS: A haploscopic system was used to measure fusional reserves. Eye movements were recorded using an EyeLink 1000 Plus eye tracker (SR Research). The stimulus disparity was changed to mimic either a prism bar (steps) or a Risley prism (ramp). Subjective responses were obtained by pressing a key on the keyboard, whereas objective break and recovery points were determined offline using a custom algorithm coded in Matlab. RESULTS: Thirty-three adults participated in this study. For the ramp method, subjective and objective responses were similar for the negative fusional reserves (break: t(32) = -0.82, p = 0.42; recovery: t(32) = 0.42, p = 0.67) and the positive fusional reserves (break: U = -1.34, p = 0.18; recovery: t(19) = -0.25, p = 0.81). For the step method, no significant differences between subjective and objective measurements of positive fusional reserves were observed for the break point (t(32) = 1.27, p = 0.21) or the recovery point (U = -2.02, Bonferroni-adjusted p = 0.04). For the negative fusional reserves, differences were not significant for either the break or recovery points (U = -0.10, p = 0.92 and t(19) = 1.17, p = 0.26, respectively). CONCLUSION: Subjective and objective responses exhibited good agreement when measured with the ramp and step methods.

10.
Sci Rep ; 14(1): 21033, 2024 Sep 09.
Article in English | MEDLINE | ID: mdl-39251692

ABSTRACT

A seminal component of systems thinking is the application of an advanced technology from one domain to solve a challenging problem in a different domain. This article introduces a method of using advanced computer vision to solve the challenging signal processing problem of specific emitter identification. A one-dimensional signal is sampled; those samples are transformed into two-dimensional images by computing a bispectrum; those images are evaluated using advanced computer vision; and the results are statistically combined until any user-selected level of classification accuracy is obtained. In testing on a published DARPA challenge dataset, for every eight additional signal samples taken from a candidate signal (out of many thousands), classification error decreases by an entire order of magnitude.
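The 1-D-signal-to-2-D-image step described above can be sketched as follows. This is a naive O(n²) DFT-based bispectrum for illustration only, not the authors' implementation; real pipelines would use an FFT and average over many sample windows.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def bispectrum(x):
    """B(f1, f2) = X(f1) X(f2) conj(X(f1 + f2)): turns a 1-D sample window
    into a 2-D complex array that can be rendered as an image for a
    computer vision classifier."""
    X = dft(x)
    n = len(X)
    return [[X[f1] * X[f2] * X[(f1 + f2) % n].conjugate() for f2 in range(n)]
            for f1 in range(n)]

# A unit impulse has a flat spectrum, hence a flat (all-ones) bispectrum.
B = bispectrum([1.0, 0.0, 0.0, 0.0])
```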

11.
Sci Rep ; 14(1): 21032, 2024 Sep 09.
Article in English | MEDLINE | ID: mdl-39251734

ABSTRACT

Remote sensing of forests is a powerful tool for monitoring the biodiversity of ecosystems, maintaining general planning, and accounting for resources. Various sensors bring together heterogeneous data, and advanced machine learning methods enable their automatic handling over wide territories. Key forest properties usually considered in environmental studies include dominant species, tree age, height, basal area, and timber stock. As proxies of stand productivity, they can be utilized for forest carbon stock estimation, supporting analysis of forest status and appropriate climate change mitigation measures on a global scale. In this study, we aim to develop an effective machine learning-based pipeline for automatic carbon stock estimation using solely freely available and regularly updated satellite observations. We employed multispectral Sentinel-2 remote sensing data to predict forest structure characteristics and produce detailed spatial maps of them. Using the Extreme Gradient Boosting (XGBoost) algorithm in classification and regression settings, with management-level inventory data as reference measurements, we achieved a species prediction quality of 0.75 according to the F1-score; for stand age, height, and basal area we achieved 0.75, 0.58, and 0.56, respectively, according to the R2. We focused on growing stock volume as the main proxy for estimating forest carbon stocks, using the stem pool as an example. We explored two approaches: a direct approach, which leverages the remote sensing data to create the target maps directly, and a hierarchical approach, which calculates the target forest properties from predicted inventory characteristics and conversion equations. We estimated stem carbon stock both ways: from Earth observation imagery directly, and using biomass and conversion factors developed for the northern regions.
Thus, our study proposes an end-to-end solution for carbon stock estimation based on the combination of inventory data at the forest stand level, Earth observation imagery, machine learning predictions, and conversion equations for the region. The presented approach enables more robust and accurate large-scale assessments using limited annotated datasets.
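The final step of the hierarchical approach reduces to multiplying a predicted growing stock volume by conversion factors (volume to stem biomass, biomass to carbon). A minimal sketch; both factors below are illustrative placeholders, not the region-specific values used in the study.

```python
def stem_carbon_stock(growing_stock_m3_per_ha,
                      biomass_conversion_t_per_m3=0.52,
                      carbon_fraction=0.5):
    """Convert growing stock volume (m3/ha) to stem biomass (t/ha) via a
    conversion factor, then to carbon stock (tC/ha) via the carbon fraction.
    Factor values are hypothetical placeholders."""
    stem_biomass_t_per_ha = growing_stock_m3_per_ha * biomass_conversion_t_per_m3
    return stem_biomass_t_per_ha * carbon_fraction

carbon_t_per_ha = stem_carbon_stock(100.0)
```

Applied pixel-wise to a predicted growing stock map, this yields the carbon stock map described in the abstract.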

12.
Int J Geriatr Psychiatry ; 39(9): e6149, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39289786

ABSTRACT

OBJECTIVES: Hearing and vision impairments are associated with cognitive decline and dementia risk. Explanations for this include age-related processes impacting both sensory and cognitive function (common cause), or sensory impairments having a direct or indirect impact on cognition via social engagement, depression, and physical activity (cascade). We tested whether associations between hearing, vision, and episodic memory were mediated by allostatic load, social engagement, depression, and physical activity. METHODS: We used structural equation modelling with cross-sectional data from the USA (n = 4746, aged 50-101), England (n = 4907, aged 50-89), and Ireland (n = 4292, aged 50-80) to model factors related to the common-cause hypothesis (indexed by allostatic load) and the cascade hypothesis with respect to cognitive ability (episodic memory). RESULTS: Poorer hearing/vision was associated with lower social engagement, depression, and a sedentary lifestyle. Poor vision was not related to allostatic load, and poor hearing was associated with allostatic load in only one dataset, providing limited support for a common-cause hypothesis. Lower social engagement, depression, and a sedentary lifestyle were associated with poorer episodic memory, supporting the cascade hypothesis. Using effect estimates to calculate the proportion of the total effects mediated by the combined mediator variables, up to two fifths of the relationship of hearing and vision with episodic memory can be explained by the mediators. CONCLUSIONS: The association between hearing, vision, and episodic memory is mediated by allostatic load, social engagement, depression, and physical activity. The finding that social engagement, depression, and physical activity mediate the association between sensory abilities and cognitive function supports the cascade hypothesis.
Interventions to improve healthy lifestyle, reduce depression, and foster social engagement of older people with sensory impairments are likely to be beneficial in preventing cognitive decline and dementia.
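The "proportion mediated" figure above (up to two fifths) is conventionally computed from path estimates as indirect effect over total effect. A minimal sketch with made-up standardized effects, not the study's estimates:

```python
def proportion_mediated(direct_effect, indirect_effects):
    """Share of the total effect carried by the mediators:
    sum(indirect) / (direct + sum(indirect))."""
    indirect = sum(indirect_effects)
    return indirect / (direct_effect + indirect)

# Hypothetical effects of hearing on episodic memory via social engagement,
# depression, and physical activity (each indirect path = a*b product).
share = proportion_mediated(0.6, [0.2, 0.1, 0.1])
```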


Subject(s)
Cognitive Dysfunction, Depression, Vision Disorders, Humans, Aged, Male, Female, Aged 80 and over, United States/epidemiology, Ireland/epidemiology, England/epidemiology, Cross-Sectional Studies, Middle Aged, Vision Disorders/epidemiology, Vision Disorders/physiopathology, Vision Disorders/psychology, Cognitive Dysfunction/epidemiology, Cognitive Dysfunction/physiopathology, Depression/epidemiology, Hearing Loss/epidemiology, Hearing Loss/psychology, Episodic Memory, Exercise/physiology, Allostasis/physiology, Cognition/physiology, Latent Class Analysis, Social Participation
13.
Acta Trop ; 260: 107392, 2024 Sep 08.
Article in English | MEDLINE | ID: mdl-39255861

ABSTRACT

Mosquito-borne diseases continue to pose a great threat to global public health systems due to increased insecticide resistance and climate change. Accurate vector identification is crucial for effective control, yet it presents significant challenges. IDX, an automated computer vision-based device that captures mosquito images and outputs a species ID, has been deployed globally, yielding algorithms currently capable of identifying 53 mosquito species. In this study, we evaluate the deployed performance of the IDX mosquito species identification algorithms using data from partners in the Southeastern United States (SE US) and Papua New Guinea (PNG) in 2023 and 2024. This preliminary assessment indicates continued improvement of the algorithms over the study period for individual species as well as average regional accuracy, with macro-average recall improving from 55.3% [confidence interval (CI) 48.9, 61.7] to 80.2% [CI 77.3, 84.9] for SE US, and from 84.1% [CI 75.1, 93.1] to 93.6% [CI 91.6, 95.6] for PNG, using 90% CIs. This study underscores the importance of algorithm refinement and dataset expansion covering more species and regions to enhance identification systems, thereby reducing the workload for human experts, addressing taxonomic expertise gaps, and improving vector control efforts.
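Macro-average recall, the headline metric above, averages per-species recall so every species counts equally regardless of how many specimens were imaged. A minimal sketch with hypothetical species labels:

```python
def macro_recall(y_true, y_pred):
    """Mean of per-class recall: each species is weighted equally,
    regardless of its abundance in the evaluation set."""
    recalls = []
    for species in set(y_true):
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == p == species)
        total = sum(1 for t in y_true if t == species)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)
```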

14.
Sci Rep ; 14(1): 21366, 2024 09 12.
Article in English | MEDLINE | ID: mdl-39266610

ABSTRACT

Accurate detection and tracking of animals across diverse environments are crucial for studying brain and behavior. Recently, computer vision techniques have become essential for high-throughput behavioral studies; however, localizing animals in complex conditions remains challenging due to intra-class visual variability and environmental diversity. These challenges hinder studies in naturalistic settings, such as when animals are partially concealed within nests. Moreover, current tools are laborious and time-consuming, requiring extensive, setup-specific annotation and training procedures. To address these challenges, we introduce the 'Detect-Any-Mouse-Model' (DAMM), an object detector for localizing mice in complex environments with minimal training. Our approach involved collecting and annotating a diverse dataset of single- and multi-housed mice in complex setups. We trained a Mask R-CNN, a popular object detector in animal studies, to perform instance segmentation and validated DAMM's performance on a collection of downstream datasets using zero-shot and few-shot inference. DAMM excels in zero-shot inference, detecting mice and even rats, in entirely unseen scenarios and further improves with minimal training. Using the SORT algorithm, we demonstrate robust tracking, competitive with keypoint-estimation-based methods. Notably, to advance and simplify behavioral studies, we release our code, model weights, and data, along with a user-friendly Python API and a Google Colab implementation.
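SORT-style tracking, as used with DAMM above, associates detections across frames primarily by intersection-over-union (IoU) of their bounding boxes. A minimal sketch of that matching cost; box coordinates are illustrative, not from the dataset:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2):
    the overlap-based similarity SORT uses to link a detection in the
    current frame to an existing track."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

Full SORT additionally predicts each track's next box with a Kalman filter and solves the detection-to-track assignment with the Hungarian algorithm.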


Subject(s)
Algorithms, Animal Behavior, Animals, Mice, Animal Behavior/physiology, Rats, Environment
15.
Sensors (Basel) ; 24(17)2024 Aug 23.
Article in English | MEDLINE | ID: mdl-39275368

ABSTRACT

In online video understanding, which has a wide range of real-world applications, inference speed is crucial. Many approaches involve frame-level visual feature extraction, which often represents the biggest bottleneck. We propose RetinaViT, an efficient method for extracting frame-level visual features in an online video stream, aiming to fundamentally enhance the efficiency of online video understanding tasks. RetinaViT is composed of efficiently approximated Transformer blocks that only take changed tokens (event tokens) as queries and reuse the already processed tokens from the previous timestep for the others. Furthermore, we restrict keys and values to the spatial neighborhoods of event tokens to further improve efficiency. RetinaViT involves tuning multiple parameters, which we determine through a multi-step process. During model training, we randomly vary these parameters and then perform black-box optimization to maximize accuracy and efficiency on the pre-trained model. We conducted extensive experiments on various online video recognition tasks, including action recognition, pose estimation, and object segmentation, validating the effectiveness of each component in RetinaViT and demonstrating improvements in the speed/accuracy trade-off compared to baselines. In particular, for action recognition, RetinaViT built on ViT-B16 reduces inference time by approximately 61.9% on the CPU and 50.8% on the GPU, while achieving slight accuracy improvements rather than degradation.
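The event-token selection at the heart of RetinaViT can be pictured as a per-token change test against the previous timestep: only tokens that changed enough become queries, while the rest reuse cached outputs. This is an illustrative reconstruction of the idea, not the authors' code; the threshold and token values are hypothetical.

```python
import math

def event_token_indices(prev_tokens, curr_tokens, threshold=0.1):
    """Return indices of tokens whose L2 change since the previous timestep
    exceeds the threshold; only these would be recomputed as queries, with
    the remaining tokens reused from the cached previous output."""
    changed = []
    for i, (p, c) in enumerate(zip(prev_tokens, curr_tokens)):
        if math.dist(p, c) > threshold:
            changed.append(i)
    return changed

# Hypothetical 2-D token embeddings at t-1 and t: only token 1 moves far.
prev = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
curr = [[0.0, 0.0], [2.0, 1.0], [0.5, 0.55]]
```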

16.
Sensors (Basel) ; 24(17)2024 Aug 23.
Article in English | MEDLINE | ID: mdl-39275386

ABSTRACT

For automated quayside container cranes, accurate measurement of the three-dimensional position and attitude of the container spreader is crucial for the safe and efficient transfer of containers. This paper proposes a high-precision method for measuring the spreader's three-dimensional position and rotational angles based on a single vertically mounted fixed-focus visual camera. First, an image preprocessing method is proposed for complex port environments. An improved YOLOv5 network, enhanced with an attention mechanism, increases the detection accuracy of the spreader's keypoints and the container lock holes. Combined with image morphological processing, the three-dimensional position and rotational angle changes of the spreader are measured. Compared to traditional detection methods, the single-camera method employed in this paper achieves higher detection accuracy for spreader keypoints and lock holes in experiments and improves the speed of single operations in actual tests, making it a feasible measurement approach.

17.
Sensors (Basel) ; 24(17)2024 Aug 27.
Article in English | MEDLINE | ID: mdl-39275441

ABSTRACT

Pose estimation is crucial for ensuring passenger safety and better user experiences in semi- and fully autonomous vehicles. Traditional methods relying on pose estimation from regular color images face significant challenges due to a lack of three-dimensional (3D) information and sensitivity to occlusion and lighting conditions. Depth images, which are invariant to lighting issues and provide 3D information about the scene, offer a promising alternative. However, there is little prior work on 3D pose estimation from such images due to the time-consuming process of annotating depth images with 3D postures. In this paper, we present a novel approach to 3D human posture estimation using depth and infrared (IR) images. Our method leverages a three-stage fine-tuning process involving simulation data, approximated data, and a limited set of manually annotated samples. This approach allows us to effectively train a model capable of accurate 3D pose estimation with a median error of under 10 cm across all joints, using fewer than 100 manually annotated samples. To the best of our knowledge, this is the first work focusing on vehicle occupant posture detection utilizing only depth and IR data. Our results demonstrate the feasibility and efficacy of this approach, paving the way for enhanced passenger safety in autonomous vehicle systems.

18.
Sensors (Basel) ; 24(17)2024 Aug 28.
Article in English | MEDLINE | ID: mdl-39275478

ABSTRACT

Water leakage defects often occur in underground structures, leading to accelerated structural aging and threatening structural safety. Leakage identification can detect early diseases of underground structures and provide important guidance for reinforcement and maintenance. Deep learning-based computer vision methods have developed rapidly and are widely used in many fields. However, establishing a deep learning model for underground structure leakage identification usually requires a lot of training data on leakage defects, which is very expensive. To overcome the data shortage, a deep neural network method for leakage identification is developed based on transfer learning in this paper. For comparison, four well-known classification models, including VGG16, AlexNet, SqueezeNet, and ResNet18, are constructed. To train the classification models, a transfer learning strategy is developed and a dataset of underground structure leakage is created. Finally, the classification performance of the different deep learning models on the leakage dataset is comparatively studied under different sizes of training data. The results showed that the VGG16, AlexNet, and SqueezeNet models with transfer learning overall provide higher and more stable classification performance on the leakage dataset than those without transfer learning. The ResNet18 model with transfer learning overall provides classification performance similar to that without transfer learning, but its performance is more stable. In addition, the SqueezeNet model obtains overall higher and more stable performance than the comparative models on the leakage dataset across all classification metrics.

19.
Sensors (Basel) ; 24(17)2024 Aug 28.
Article in English | MEDLINE | ID: mdl-39275491

ABSTRACT

In maritime transportation, a ship's draft survey serves as a primary method for weighing bulk cargo. The accuracy of the ship's draft reading determines the fairness of bulk cargo transactions. Human visual draft reading faces issues such as safety concerns, high labor costs, and subjective interpretation, so image processing methods have been used to achieve automatic draft reading. However, due to limitations in the spectral characteristics of RGB images, existing image processing methods are susceptible to interference from the water surface environment, such as reflections. To solve this issue, we obtained and annotated 524 multispectral images of a ship's draft as the research dataset, marking the first application of integrating NIR information and RGB images for automatic draft reading tasks. Additionally, a dual-branch backbone named BIF is proposed to extract and combine spectral information from RGB and NIR images. The backbone network can be combined with existing segmentation and detection heads to perform waterline segmentation and draft detection. By replacing the original ResNet-50 backbone of YOLOv8, we reached an mAP of 99.2% in the draft detection task. Similarly, combining UPerNet with our dual-branch backbone improved the mIoU of the waterline segmentation task from 98.9% to 99.3%. The draft reading error is less than ±0.01 m, confirming the efficacy of our method for automatic draft reading tasks.
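Once the waterline is segmented and the draft marks are detected, the reading itself amounts to interpolating the draft value at the waterline's pixel row between neighboring marks. A minimal sketch with hypothetical pixel coordinates, not the paper's pipeline:

```python
def read_draft(marks, waterline_y):
    """Interpolate the draft (m) at the waterline from detected draft marks.
    marks: list of (draft_value_m, y_pixel) pairs, with y increasing
    downward in the image (so deeper marks have larger y)."""
    marks = sorted(marks, key=lambda m: m[1])
    for (v1, y1), (v2, y2) in zip(marks, marks[1:]):
        if y1 <= waterline_y <= y2:
            return v1 + (v2 - v1) * (waterline_y - y1) / (y2 - y1)
    raise ValueError("waterline lies outside the detected draft marks")

# Hypothetical detections: the 8.2 m mark at row 100, the 8.0 m mark at
# row 200, and the segmented waterline at row 150 -> draft of 8.1 m.
draft = read_draft([(8.2, 100.0), (8.0, 200.0)], 150.0)
```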

20.
Sensors (Basel) ; 24(17)2024 Aug 28.
Article in English | MEDLINE | ID: mdl-39275498

ABSTRACT

Road crack detection is of paramount importance for vehicular traffic safety, yet traditional crack detection methods inevitably impede the optimal functioning of traffic. In light of this, we propose USSC-YOLO, a machine vision-based target detection algorithm for unmanned aerial vehicle (UAV) road crack detection, which aims to achieve high-precision detection of road cracks at all scale levels. Compared with the original YOLOv5s, the main improvements in USSC-YOLO are the ShuffleNet V2 block, the coordinate attention (CA) mechanism, and the Swin Transformer. First, to address the large computational cost of the network, we replace the backbone network of YOLOv5s with ShuffleNet V2 blocks, reducing computational overhead significantly. Next, to reduce problems caused by complex background interference, we introduce the CA attention mechanism into the backbone network, which reduces the missed and false detection rates. Finally, we integrate the Swin Transformer block at the end of the neck to enhance detection accuracy for small target cracks. Experimental results on our self-constructed UAV near-far scene road crack image (UNFSRCI) dataset demonstrate that our model reduces giga floating-point operations per second (GFLOPs) compared to YOLOv5s while achieving a 6.3% increase in mAP@50 and a 12% improvement in mAP@[50:95]. This indicates that the model remains lightweight while providing excellent detection performance. In future work, we will assess road safety conditions based on these detection results to prioritize maintenance sequences for crack targets and facilitate intelligent management.
