Results 1 - 7 of 7
1.
Comput Biol Med ; 173: 108257, 2024 May.
Article in English | MEDLINE | ID: mdl-38520922

ABSTRACT

We developed an attention model to predict future adverse glycemic events 30 min in advance based on the observation of past glycemic values over a 35 min period. The proposed model effectively encodes insulin administration and meal intake time using Time2Vec (T2V) for glucose prediction. The proposed impartial feature selection algorithm is designed to distribute rewards proportionally according to agent contributions. Agent contributions are calculated by a step-by-step negation of updated agents. Thus, the proposed feature selection algorithm optimizes features from electronic medical records to improve performance. For evaluation, we collected continuous glucose monitoring data from 102 patients with type 2 diabetes admitted to Cheonan Hospital, Soonchunhyang University. Using our proposed model, we achieved F1-scores of 89.0%, 60.6%, and 89.8% for normoglycemia, hypoglycemia, and hyperglycemia, respectively.
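The Time2Vec encoding mentioned above can be sketched as a small layer that maps an elapsed time (for example, minutes since insulin administration or meal intake) to one linear feature plus several periodic features. A minimal PyTorch sketch follows; the layer size, input convention, and names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class Time2Vec(nn.Module):
    """Time2Vec: one linear (trend) component plus k periodic (sine) components.

    Illustrative use: encode elapsed time since an insulin dose or a meal as a
    feature vector for a glucose-prediction attention model.
    """
    def __init__(self, k: int = 8):
        super().__init__()
        self.w0 = nn.Parameter(torch.randn(1))
        self.b0 = nn.Parameter(torch.randn(1))
        self.w = nn.Parameter(torch.randn(k))
        self.b = nn.Parameter(torch.randn(k))

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: (batch, 1) elapsed time, e.g. minutes since the last insulin dose
        linear = self.w0 * t + self.b0               # trend component
        periodic = torch.sin(t * self.w + self.b)    # periodic components
        return torch.cat([linear, periodic], dim=-1)  # (batch, 1 + k)

# Example: embed "30 minutes since last dose" as a 9-dimensional feature.
t2v = Time2Vec(k=8)
embedding = t2v(torch.tensor([[30.0]]))
```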


Subject(s)
Diabetes Mellitus, Type 2 , Hypoglycemia , Humans , Hypoglycemic Agents , Blood Glucose , Diabetes Mellitus, Type 2/drug therapy , Blood Glucose Self-Monitoring , Hypoglycemia/chemically induced , Insulin
2.
J Xray Sci Technol ; 32(2): 173-205, 2024.
Article in English | MEDLINE | ID: mdl-38217633

ABSTRACT

BACKGROUND: In recent years, deep reinforcement learning (RL) has been applied to various medical tasks and produced encouraging results. OBJECTIVE: In this paper, we demonstrate the feasibility of deep RL for denoising simulated deep-silicon photon-counting CT (PCCT) data in both full and interior scan modes. PCCT offers higher spatial and spectral resolution than conventional CT, requiring advanced denoising methods to suppress the increased noise. METHODS: In this work, we apply a dueling double deep Q network (DDDQN) to denoise PCCT data for maximum contrast-to-noise ratio (CNR) and a multi-agent approach to handle data non-stationarity. RESULTS: Using our method, we obtained significant image quality improvement for single-channel scans and consistent improvement for all three channels of multichannel scans. For the single-channel interior scans, the PSNR (dB) and SSIM increased from 33.4078 and 0.9165 to 37.4167 and 0.9790, respectively. For the multichannel interior scans, the channel-wise PSNR (dB) increased from 31.2348, 30.7114, and 30.4667 to 31.6182, 30.9783, and 30.8427, respectively. Similarly, the SSIM improved from 0.9415, 0.9445, and 0.9336 to 0.9504, 0.9493, and 0.0326, respectively. CONCLUSIONS: Our results show that the RL approach improves image quality effectively, efficiently, and consistently across multiple spectral channels and has great potential in clinical applications.
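As a rough illustration of the dueling architecture underlying a DDDQN, the sketch below shows the standard dueling head, Q(s, a) = V(s) + A(s, a) - mean_a A(s, a), applied to an image patch; the network layers and the action set (candidate denoising operations) are assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).

    Illustrative stand-in for a DDDQN used for image denoising: the state is a
    noisy patch and each action selects one candidate denoising operation.
    """
    def __init__(self, n_actions: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.value = nn.Linear(32, 1)              # state value V(s)
        self.advantage = nn.Linear(32, n_actions)  # action advantages A(s, a)

    def forward(self, patch: torch.Tensor) -> torch.Tensor:
        h = self.features(patch)                    # (batch, 32)
        v = self.value(h)                           # (batch, 1)
        a = self.advantage(h)                       # (batch, n_actions)
        return v + a - a.mean(dim=1, keepdim=True)  # (batch, n_actions)
```

In the double-DQN part of such a scheme, the training target is typically r + gamma * Q_target(s', argmax_a Q_online(s', a)), with the online network selecting the next action and a separate target network evaluating it.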


Subject(s)
Algorithms , Silicon , X-Rays , Signal-To-Noise Ratio , Tomography, X-Ray Computed/methods , Image Processing, Computer-Assisted/methods
3.
Front Artif Intell ; 6: 804682, 2023.
Article in English | MEDLINE | ID: mdl-37547229

ABSTRACT

Intuitively, experience playing against one mixture of opponents in a given domain should be relevant for a different mixture in the same domain. If the mixture changes, ideally we would not have to train from scratch, but rather could transfer what we have learned to construct a policy to play against the new mixture. We propose a transfer learning method, Q-Mixing, that starts by learning Q-values against each pure-strategy opponent. Then a Q-value for any distribution of opponent strategies is approximated by appropriately averaging the separately learned Q-values. From these components, we construct policies against all opponent mixtures without any further training. We empirically validate Q-Mixing in two environments: a simple grid-world soccer environment, and a social dilemma game. Our experiments find that Q-Mixing can successfully transfer knowledge across any mixture of opponents. Next, we consider the use of observations during play to update the believed distribution of opponents. We introduce an opponent policy classifier, trained by reusing Q-learning data, and use the classifier results to refine the mixing of Q-values. Q-Mixing augmented with the opponent policy classifier performs better, with higher variance, than training directly against a mixed-strategy opponent.
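The core mixing step can be written in a few lines: average the per-opponent Q-values, weighted by the believed probability of facing each opponent, and act greedily on the result. The sketch below is a minimal tabular illustration under that reading; the array shapes and function names are assumptions, not the authors' code.

```python
import numpy as np

def q_mixing(q_tables: list, mixture: np.ndarray) -> np.ndarray:
    """Approximate Q-values against a mixed-strategy opponent by averaging the
    Q-values learned against each pure-strategy opponent, weighted by the
    believed mixture probabilities.

    q_tables: one (n_states, n_actions) array per pure-strategy opponent.
    mixture:  probability of facing each opponent, shape (n_opponents,).
    """
    stacked = np.stack(q_tables)                     # (n_opponents, S, A)
    return np.einsum("o,osa->sa", mixture, stacked)  # probability-weighted sum

def greedy_policy(q_mixed: np.ndarray) -> np.ndarray:
    # Act greedily on the mixed Q-values: no further training required.
    return q_mixed.argmax(axis=1)                    # best action per state
```

An opponent-policy classifier, as described above, would update the `mixture` vector from in-play observations before the Q-values are re-mixed.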

4.
Sensors (Basel) ; 23(7)2023 Mar 30.
Article in English | MEDLINE | ID: mdl-37050685

ABSTRACT

Deep reinforcement learning has produced many success stories in recent years. Some example fields in which these successes have taken place include mathematics, games, health care, and robotics. In this paper, we are especially interested in multi-agent deep reinforcement learning, where multiple agents present in the environment not only learn from their own experiences but also from each other, and in its applications in multi-robot systems. In many real-world scenarios, one robot might not be enough to complete the given task on its own, and, therefore, we might need to deploy multiple robots that work together towards a common global objective of finishing the task. Although multi-agent deep reinforcement learning and its applications in multi-robot systems are of tremendous significance from theoretical and applied standpoints, the latest survey in this domain dates to 2004, albeit covering traditional learning applications, as deep reinforcement learning had not yet been invented. We classify the reviewed papers in our survey primarily based on their multi-robot applications. Our survey also discusses a few challenges that the current research in this domain faces and provides a potential list of future applications involving multi-robot systems that can benefit from advances in multi-agent deep reinforcement learning.

5.
Neural Comput Appl ; 34(3): 1653-1671, 2022.
Article in English | MEDLINE | ID: mdl-35221541

ABSTRACT

A dynamical systems perspective on multi-agent learning, based on the link between evolutionary game theory and reinforcement learning, provides an improved, qualitative understanding of the emerging collective learning dynamics. However, confusion exists with respect to how this dynamical systems account of multi-agent learning should be interpreted. In this article, I propose to embed the dynamical systems description of multi-agent learning into different abstraction levels of cognitive analysis. The purpose of this work is to make the connections between these levels explicit in order to gain improved insight into multi-agent learning. I demonstrate the usefulness of this framework with the general and widespread class of temporal-difference reinforcement learning. I find that its deterministic dynamical systems description follows a minimum free-energy principle and unifies a boundedly rational account of game theory with decision-making under uncertainty. I then propose an online sample-batch temporal-difference algorithm characterized by the combination of a memory batch and separate state-action value estimation. I find that this algorithm serves as a micro-foundation of the deterministic learning equations by showing that its learning trajectories approach those of the deterministic learning equations under large batch sizes. Ultimately, this framework of embedding a dynamical systems description into different abstraction levels gives guidance on how to unleash the full potential of the dynamical systems approach to multi-agent learning.
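As a generic illustration of the sample-batch temporal-difference idea discussed above, the sketch below applies tabular TD(0)/Q-learning updates over a batch drawn from a replay memory; the paper's specific separation of state-action value estimation and its batch handling are not reproduced here.

```python
import random

def batch_td_update(Q, memory, actions, alpha=0.1, gamma=0.95, batch_size=32):
    """One round of tabular TD(0) updates over a batch sampled from a memory.

    Q:       dict mapping (state, action) -> estimated value
    memory:  list of (state, action, reward, next_state) transitions
    actions: all actions available in every state
    """
    batch = random.sample(memory, min(batch_size, len(memory)))
    for s, a, r, s_next in batch:
        # Bootstrap from the greedy value of the next state.
        next_q = max(Q.get((s_next, b), 0.0) for b in actions)
        td_error = r + gamma * next_q - Q.get((s, a), 0.0)
        Q[(s, a)] = Q.get((s, a), 0.0) + alpha * td_error
    return Q
```

With larger batches, the averaged TD error approaches its expectation, which is the sense in which such an algorithm can track the deterministic learning equations described above.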

6.
J Theor Biol ; 527: 110712, 2021 10 21.
Article in English | MEDLINE | ID: mdl-33933477

ABSTRACT

Learning is thought to be achieved by the selective, activity-dependent adjustment of synaptic connections. Individual learning can also be very hard and/or slow. Social, supervised learning from others might amplify individual, possibly mainly unsupervised, learning, and might underlie the development and evolution of culture. We studied a minimal neural network model of the interaction of individual, unsupervised learning and social, supervised learning by communicating "agents". Individual agents attempted to learn to track a hidden fluctuating "source", which, linearly mixed with other masking fluctuations, generated observable input vectors. In this model data are generated linearly, facilitating mathematical analysis. Learning was driven either solely by direct observation of input data (unsupervised, Hebbian) or, in addition, by observation of another agent's output (supervised, Delta rule). To make learning more difficult, and to enhance biological realism, the learning rules were made slightly connection-nonspecific, so that incorrect individual learning sometimes occurs. We found that social interaction can foster both correct and incorrect learning. Useful social learning therefore presumably involves additional factors, some of which we outline.
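The two learning rules named above can be sketched for a single linear unit: a Hebbian update driven only by the agent's own input and output, and a Delta-rule update driven by another agent's output. The sketch below is illustrative only; the connection-nonspecific noise and the linear mixing model of the paper are omitted, and the normalization step is an added stabilization, not part of the model.

```python
import numpy as np

def hebbian_step(w, x, eta=0.01):
    """Unsupervised (Hebbian) update: strengthen weights in proportion to the
    correlation between the input x and the agent's own output y = w @ x.
    The normalization keeps the weight vector bounded (added here only to
    keep the sketch well-behaved)."""
    y = w @ x
    w = w + eta * y * x
    return w / np.linalg.norm(w)

def delta_step(w, x, y_teacher, eta=0.01):
    """Supervised (Delta-rule) update: move the agent's own output toward the
    observed output of another agent (the 'teacher')."""
    y = w @ x
    return w + eta * (y_teacher - y) * x
```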


Subject(s)
Models, Neurological , Neural Networks, Computer , Humans
7.
Sensors (Basel) ; 19(1)2019 Jan 03.
Article in English | MEDLINE | ID: mdl-30609866

ABSTRACT

Transmission latency minimization and energy efficiency improvement are two main challenges in multi-hop Cognitive Radio Networks (CRN), where knowledge of the topology and spectrum statistics is hard to obtain. For this reason, a cross-layer routing protocol based on quasi-cooperative multi-agent learning is proposed in this study. Firstly, to jointly consider the end-to-end delay and power efficiency, a comprehensive utility function is designed to form a reasonable tradeoff between the two measures. Then the joint design problem is modeled as a Stochastic Game (SG), and a quasi-cooperative multi-agent learning scheme is presented to solve the SG, which only needs information exchange with previous nodes. To further enhance performance, experience replay is applied to the update of conjecture belief to break the correlations and reduce the variance of updates. Simulation results demonstrate that the proposed scheme is superior to traditional algorithms, leading to shorter delay, a lower packet loss ratio, and higher energy efficiency, approaching the performance of an optimum scheme.
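A hedged sketch of the general pattern described above, a per-node Q-learning agent with a delay/power utility and a small experience-replay buffer, is given below; the utility weights, state and action definitions, and the conjecture-belief update of the actual protocol are assumptions and simplifications, not the paper's scheme.

```python
import random
from collections import deque

def utility(delay, power, w_delay=0.6, w_power=0.4):
    """Composite reward trading off end-to-end delay against transmit power.
    The weights and the linear form are illustrative assumptions."""
    return -(w_delay * delay + w_power * power)

class NodeAgent:
    """One cognitive-radio node learning which (next hop, channel, power)
    action to take, with a small experience-replay buffer."""
    def __init__(self, actions, alpha=0.1, gamma=0.9, replay_size=500):
        self.Q = {}
        self.actions = actions
        self.alpha, self.gamma = alpha, gamma
        self.replay = deque(maxlen=replay_size)

    def update(self, s, a, r, s_next):
        self.replay.append((s, a, r, s_next))
        # Replay a few stored transitions to decorrelate consecutive updates.
        sample = random.sample(list(self.replay), min(8, len(self.replay)))
        for s_, a_, r_, sn in sample:
            best = max(self.Q.get((sn, b), 0.0) for b in self.actions)
            td = r_ + self.gamma * best - self.Q.get((s_, a_), 0.0)
            self.Q[(s_, a_)] = self.Q.get((s_, a_), 0.0) + self.alpha * td
```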
