Results 1 - 20 of 97
1.
Neural Netw ; 180: 106700, 2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39293175

ABSTRACT

Neural Architecture Search (NAS) outperforms handcrafted Neural Network (NN) design. However, current NAS methods generally use hard-coded search spaces, and predefined hierarchical architectures. As a consequence, adapting them to a new problem can be cumbersome, and it is hard to know which of the NAS algorithm or the predefined hierarchical structure impacts performance the most. To improve flexibility, and be less reliant on expert knowledge, this paper proposes a NAS methodology in which the search space is easily customizable, and allows for full network search. NAS is performed with Gaussian Process (GP)-based Bayesian Optimization (BO) in a continuous architecture embedding space. This embedding is built upon a Wasserstein Autoencoder, regularized by both a Maximum Mean Discrepancy (MMD) penalization and a Fully Input Convex Neural Network (FICNN) latent predictor, trained to infer the parameter count of architectures. This paper first assesses the embedding's suitability for optimization by solving 2 computationally inexpensive problems: minimizing the number of parameters, and maximizing a zero-shot accuracy proxy. Then, two variants of complexity-aware NAS are performed on CIFAR-10 and STL-10, based on two different search spaces, providing competitive NN architectures with limited model sizes.
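
The Maximum Mean Discrepancy penalty mentioned in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian (RBF) kernel, a fixed bandwidth, and the biased estimator; the abstract specifies none of these, and the names `rbf`, `mean_kernel`, `mmd_squared`, `latent_codes` and `prior_samples` are hypothetical. It is not the authors' implementation.

```python
import math
import random

def rbf(u, v, bandwidth=1.0):
    """Gaussian (RBF) kernel between two vectors (kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y)."""
    total = sum(rbf(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two samples."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))

rng = random.Random(0)
# Stand-ins for samples from the latent prior and for encoder outputs (hypothetical data).
prior_samples = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(64)]
latent_codes = [[rng.gauss(0.5, 1.0) for _ in range(4)] for _ in range(64)]

# The penalty grows as the encoded distribution drifts away from the prior.
penalty = mmd_squared(latent_codes, prior_samples)
```

In an autoencoder of this kind, a term like `penalty` would typically be added to the reconstruction loss with a weighting coefficient, so that the latent codes stay close to the prior distribution.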

2.
Comput Methods Programs Biomed ; 257: 108419, 2024 Sep 11.
Article in English | MEDLINE | ID: mdl-39293231

ABSTRACT

BACKGROUND AND OBJECTIVE: The accurate diagnosis of schizophrenia spectrum disorder plays an important role in improving patient outcomes, enabling timely interventions, and optimizing treatment plans. Functional connectivity analysis, utilizing functional magnetic resonance imaging data, has been demonstrated to offer invaluable biomarkers conducive to clinical diagnosis. However, previous studies mainly focus on traditional machine learning methods or hand-crafted neural networks, which may not fully capture the spatial topological relationship between brain regions. METHODS: This paper proposes an evolutionary algorithm (EA) based graph neural architecture search (GNAS) method. EA-GNAS has the ability to search for high-performance graph neural networks for schizophrenia spectrum disorder prediction. Moreover, we adopt GNNExplainer to investigate the explainability of the acquired architectures, ensuring that the model's predictions are both accurate and comprehensible. RESULTS: The results suggest that the graph neural network model, derived using genetic algorithm search, outperforms under five-fold cross-validation, achieving a fitness of 0.1850. Relative to conventional machine learning and other deep learning approaches, the proposed method yields superior accuracy, F1 score, and AUC values of 0.8246, 0.8438, and 0.8258, respectively. CONCLUSION: Based on a multi-site dataset from schizophrenia spectrum disorder patients, the findings reveal an enhancement over prior methods, advancing our comprehension of brain function and potentially offering a biomarker for diagnosing schizophrenia spectrum disorder.
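
As a rough illustration of how an evolutionary search over graph neural network architectures can be organised, the sketch below runs a small genetic algorithm. Everything in it is an assumption made for illustration: the search space `SEARCH_SPACE`, the operators, and above all `toy_fitness`, which is a placeholder to be minimised and stands in for training and cross-validating a real model. It does not reproduce EA-GNAS.

```python
import random

# Hypothetical search space; the abstract does not list the actual choices.
SEARCH_SPACE = {
    "layer_type": ["gcn", "gat", "sage"],
    "hidden_dim": [16, 32, 64],
    "num_layers": [1, 2, 3],
    "readout": ["mean", "max", "sum"],
}

def random_architecture(rng):
    """Sample one value per gene."""
    return {gene: rng.choice(options) for gene, options in SEARCH_SPACE.items()}

def mutate(arch, rng):
    """Resample a single randomly chosen gene."""
    child = dict(arch)
    gene = rng.choice(list(SEARCH_SPACE))
    child[gene] = rng.choice(SEARCH_SPACE[gene])
    return child

def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent."""
    return {gene: (a[gene] if rng.random() < 0.5 else b[gene]) for gene in SEARCH_SPACE}

def toy_fitness(arch):
    """Placeholder score (lower is better); a real run would train and validate a GNN."""
    target = {"layer_type": "gat", "hidden_dim": 32, "num_layers": 2, "readout": "mean"}
    return sum(arch[gene] != target[gene] for gene in SEARCH_SPACE)

def evolve(generations=20, population_size=12, seed=0):
    rng = random.Random(seed)
    population = [random_architecture(rng) for _ in range(population_size)]
    history = []
    for _ in range(generations):
        population.sort(key=toy_fitness)
        history.append(toy_fitness(population[0]))
        parents = population[: population_size // 2]  # elitist truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    population.sort(key=toy_fitness)
    return population[0], history

best_arch, history = evolve()
```

Because the better half of each generation is carried over unchanged, the best score recorded in `history` can never get worse from one generation to the next.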

3.
Sensors (Basel) ; 24(17)2024 Sep 01.
Article in English | MEDLINE | ID: mdl-39275608

ABSTRACT

Autonomous driving systems are a rapidly evolving technology. Trajectory prediction is a critical component of autonomous driving systems that enables safe navigation by anticipating the movement of surrounding objects. Lidar point-cloud data provide a 3D view of solid objects surrounding the ego-vehicle. Hence, trajectory prediction using Lidar point-cloud data performs better than 2D RGB cameras due to providing the distance between the target object and the ego-vehicle. However, processing point-cloud data is a costly and complicated process, and state-of-the-art 3D trajectory predictions using point-cloud data suffer from slow and erroneous predictions. State-of-the-art trajectory prediction approaches suffer from handcrafted and inefficient architectures, which can lead to low accuracy and suboptimal inference times. Neural architecture search (NAS) is a method proposed to optimize neural network models by using search algorithms to redesign architectures based on their performance and runtime. This paper introduces TrajectoryNAS, a novel neural architecture search (NAS) method designed to develop an efficient and more accurate LiDAR-based trajectory prediction model for predicting the trajectories of objects surrounding the ego vehicle. TrajectoryNAS systematically optimizes the architecture of an end-to-end trajectory prediction algorithm, incorporating all stacked components that are prerequisites for trajectory prediction, including object detection and object tracking, using metaheuristic algorithms. This approach addresses the neural architecture designs in each component of trajectory prediction, considering accuracy loss and the associated overhead latency. Our method introduces a novel multi-objective energy function that integrates accuracy and efficiency metrics, enabling the creation of a model that significantly outperforms existing approaches. 
Through empirical studies, TrajectoryNAS demonstrates its effectiveness in enhancing the performance of autonomous driving systems, marking a significant advancement in the field. Experimental results reveal that TrajectoryNAS yields a minimum of 4.8 higher accuracy and 1.1× lower latency than competing methods on the NuScenes dataset.

4.
Natl Sci Rev ; 11(8): nwae282, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39262926

ABSTRACT

Automated machine learning (AutoML) has achieved remarkable success in automating the non-trivial process of designing machine learning models. Among the focal areas of AutoML, neural architecture search (NAS) stands out, aiming to systematically explore the complex architecture space to discover the optimal neural architecture configurations without intensive manual interventions. NAS has demonstrated its capability of dramatic performance improvement across a large number of real-world tasks. The core components in NAS methodologies normally include (i) defining the appropriate search space, (ii) designing the right search strategy and (iii) developing the effective evaluation mechanism. Although early NAS endeavors are characterized via groundbreaking architecture designs, the imposed exorbitant computational demands prompt a shift towards more efficient paradigms such as weight sharing and evaluation estimation, etc. Concurrently, the introduction of specialized benchmarks has paved the way for standardized comparisons of NAS techniques. Notably, the adaptability of NAS is evidenced by its capability of extending to diverse datasets, including graphs, tabular data and videos, etc., each of which requires a tailored configuration. This paper delves into the multifaceted aspects of NAS, elaborating on its recent advances, applications, tools, benchmarks and prospective research directions.

5.
Biomed Phys Eng Express ; 10(5)2024 Aug 23.
Article in English | MEDLINE | ID: mdl-39137798

ABSTRACT

Investigating U-Net model robustness in medical image synthesis against adversarial perturbations, this study introduces RobMedNAS, a neural architecture search strategy for identifying resilient U-Net configurations. Through retrospective analysis of synthesized CT from MRI data, employing Dice coefficient and mean absolute error metrics across critical anatomical areas, the study evaluates traditional U-Net models and RobMedNAS-optimized models under adversarial attacks. Findings demonstrate RobMedNAS's efficacy in enhancing U-Net resilience without compromising on accuracy, proposing a novel pathway for robust medical image processing.


Subject(s)
Algorithms; Image Processing, Computer-Assisted; Magnetic Resonance Imaging; Neural Networks, Computer; Tomography, X-Ray Computed; Humans; Magnetic Resonance Imaging/methods; Image Processing, Computer-Assisted/methods; Tomography, X-Ray Computed/methods; Retrospective Studies; Brain/diagnostic imaging
6.
Neural Netw ; 179: 106427, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39003983

ABSTRACT

Multi-modal attention mechanisms have been successfully used in multi-modal graph learning for various tasks. However, existing attention-based multi-modal graph learning (AMGL) architectures heavily rely on manual design, requiring huge effort and expert experience. Meanwhile, graph neural architecture search (GNAS) has made great progress toward automatically designing graph-based learning architectures. However, it is challenging to directly adopt existing GNAS methods to search for better AMGL architectures because of the search spaces that only focus on designing graph neural network architectures and the search objective that ignores multi-modal interactive information between modalities and long-term content dependencies within different modalities. To address these issues, we propose an automated attention-based multi-modal graph learning architecture search (AutoAMS) framework, which can automatically design the optimal AMGL architectures for different multi-modal tasks. Specifically, we design an effective attention-based multi-modal (AM) search space consisting of four sub-spaces, which can jointly support the automatic search of multi-modal attention representation and other components of multi-modal graph learning architecture. In addition, a novel search objective based on an unsupervised multi-modal reconstruction loss and task-specific loss is introduced to search and train AMGL architectures. The search objective can extract the global features and capture multi-modal interactions from multiple modalities. The experimental results on multi-modal tasks show strong evidence that AutoAMS is capable of designing high-performance AMGL architectures.


Subject(s)
Attention; Neural Networks, Computer; Attention/physiology; Humans; Algorithms; Machine Learning
7.
Bioengineering (Basel) ; 11(7)2024 Jul 02.
Article in English | MEDLINE | ID: mdl-39061756

ABSTRACT

Dental age estimation is extensively employed in forensic medicine practice. However, the accuracy of conventional methods fails to satisfy the need for precision, particularly when estimating the age of adults. Herein, we propose an approach for age estimation utilizing orthopantomograms (OPGs). We propose a new dental dataset comprising OPGs of 27,957 individuals (16,383 females and 11,574 males), covering an age range from newborn to 93 years. The age annotations were meticulously verified using ID card details. Considering the distinct nature of dental data, we analyzed various neural network components to accurately estimate age, such as optimal network depth, convolution kernel size, multi-branch architecture, and early layer feature reuse. Building upon the exploration of these distinctive characteristics, we further employed a widely recognized DNN search method to identify models for dental age prediction. Consequently, we discovered two sets of models: one exhibiting superior performance, and the other being lightweight. The proposed models, AGENet and AGE-SPOS, showed exceptional effectiveness in our experiments. AGENet significantly outperformed other CNN models. Compared to Inception-v4, with the mean absolute error (MAE) of 1.70 and 20.46 B FLOPs, our AGENet reduced the FLOPs by 2.7×. The lightweight model, AGE-SPOS, achieved an MAE of 1.80 years with only 0.95 B FLOPs, surpassing MobileNetV2 by 0.18 years while utilizing fewer computational operations. In summary, we employed an effective DNN searching method for forensic age estimation, and our methodology and findings hold significant implications for age estimation with oral imaging.

8.
Front Neurosci ; 18: 1412559, 2024.
Article in English | MEDLINE | ID: mdl-38966757

ABSTRACT

In neural circuits, recurrent connectivity plays a crucial role in network function and stability. However, existing recurrent spiking neural networks (RSNNs) are often constructed by random connections without optimization. While RSNNs can produce rich dynamics that are critical for memory formation and learning, systemic architectural optimization of RSNNs is still an open challenge. We aim to enable systematic design of large RSNNs via a new scalable RSNN architecture and automated architectural optimization. We compose RSNNs based on a layer architecture called Sparsely-Connected Recurrent Motif Layer (SC-ML) that consists of multiple small recurrent motifs wired together by sparse lateral connections. The small size of the motifs and sparse inter-motif connectivity leads to an RSNN architecture scalable to large network sizes. We further propose a method called Hybrid Risk-Mitigating Architectural Search (HRMAS) to systematically optimize the topology of the proposed recurrent motifs and SC-ML layer architecture. HRMAS is an alternating two-step optimization process by which we mitigate the risk of network instability and performance degradation caused by architectural change by introducing a novel biologically-inspired "self-repairing" mechanism through intrinsic plasticity. The intrinsic plasticity is introduced to the second step of each HRMAS iteration and acts as unsupervised fast self-adaptation to structural and synaptic weight modifications introduced by the first step during the RSNN architectural "evolution." We demonstrate that the proposed automatic architecture optimization leads to significant performance gains over existing manually designed RSNNs: we achieve 96.44% on TI46-Alpha, 94.66% on N-TIDIGITS, 90.28% on DVS-Gesture, and 98.72% on N-MNIST. To the best of the authors' knowledge, this is the first work to perform systematic architecture optimization on RSNNs.

9.
Network ; : 1-24, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-38994690

ABSTRACT

Plant diseases pose a significant threat to agricultural productivity worldwide. Convolutional neural networks (CNNs) have achieved state-of-the-art performances on several plant disease detection tasks. However, the manual development of CNN models using an exhaustive approach is a resource-intensive task. Neural Architecture Search (NAS) has emerged as an innovative paradigm that seeks to automate model generation procedures without human intervention. However, the application of NAS in plant disease detection has received limited attention. In this work, we propose a two-stage meta-learning-based neural architecture search system (MLNAS) to automate the generation of CNN models for unseen plant disease detection tasks. The first stage recommends the most suitable benchmark models for unseen plant disease detection tasks based on the prior evaluations of benchmark models on existing plant disease datasets. In the second stage, the proposed NAS operators are employed to optimize the recommended model for the target task. The experimental results showed that the MLNAS system's model outperformed state-of-the-art models on the fruit disease dataset, achieving an accuracy of 99.61%. Furthermore, the MLNAS-generated model outperformed the Progressive NAS model on the 8-class plant disease dataset, achieving an accuracy of 99.8%. Hence, the proposed MLNAS system facilitates faster model development with reduced computational costs.

10.
Front Artif Intell ; 7: 1414707, 2024.
Article in English | MEDLINE | ID: mdl-38962503

ABSTRACT

Integration between constrained optimization and deep networks has garnered significant interest from both research and industrial laboratories. Optimization techniques can be employed to optimize the choice of network structure based not only on loss and accuracy but also on physical constraints. Additionally, constraints can be imposed during training to enhance the performance of networks in specific contexts. This study surveys the literature on the integration of constrained optimization with deep networks. Specifically, we examine the integration of hyper-parameter tuning with physical constraints, such as the number of FLOPS (FLoating point Operations Per Second), a measure of computational capacity, latency, and other factors. This study also considers the use of context-specific knowledge constraints to improve network performance. We discuss the integration of constraints in neural architecture search (NAS), considering the problem as both a multi-objective optimization (MOO) challenge and through the imposition of penalties in the loss function. Furthermore, we explore various approaches that integrate logic with deep neural networks (DNNs). In particular, we examine logic-neural integration through constrained optimization applied during the training of NNs and the use of semantic loss, which employs the probabilistic output of the networks to enforce constraints on the output.
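
The two ways of handling constraints in NAS that this abstract names, a penalty added to the loss and a multi-objective formulation, can be contrasted in a short sketch. The hinge-style penalty, the function names, and every number below are assumptions chosen for illustration; the survey itself does not prescribe this form.

```python
def penalized_loss(task_loss, cost, budget, penalty_weight=1.0):
    """Task loss plus a hinge penalty that is zero while the cost stays within budget."""
    overshoot = max(0.0, cost / budget - 1.0)  # relative amount over budget
    return task_loss + penalty_weight * overshoot

def dominates(a, b):
    """True if candidate a Pareto-dominates b; each is (error, cost), lower is better."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates if other != c)
    ]

# Hypothetical (error, cost) pairs for four candidate architectures.
candidates = [(0.10, 500), (0.12, 300), (0.15, 400), (0.20, 200)]
front = pareto_front(candidates)

# Penalty view: the same task loss is scored worse once the cost exceeds the budget.
within_budget = penalized_loss(0.5, cost=200, budget=300)
over_budget = penalized_loss(0.5, cost=450, budget=300, penalty_weight=2.0)
```

The penalty view collapses everything into one scalar and therefore needs a weight to be chosen in advance, whereas the Pareto view returns a set of trade-offs and leaves the final choice to the user.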

11.
Natl Sci Rev ; 11(8): nwad292, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39007004

ABSTRACT

Formulating the methodology of machine learning by bilevel optimization techniques provides a new perspective to understand and solve automated machine learning problems.

12.
Article in English | MEDLINE | ID: mdl-38933471

ABSTRACT

Machine learning at the extreme edge has enabled a plethora of intelligent, time-critical, and remote applications. However, deploying interpretable artificial intelligence systems that can perform high-level symbolic reasoning and satisfy the underlying system rules and physics within the tight platform resource constraints is challenging. In this paper, we introduce TinyNS, the first platform-aware neurosymbolic architecture search framework for joint optimization of symbolic and neural operators. TinyNS provides recipes and parsers to automatically write microcontroller code for five types of neurosymbolic models, combining the context awareness and integrity of symbolic techniques with the robustness and performance of machine learning models. TinyNS uses a fast, gradient-free, black-box Bayesian optimizer over discontinuous, conditional, numeric, and categorical search spaces to find the best synergy of symbolic code and neural networks within the hardware resource budget. To guarantee deployability, TinyNS talks to the target hardware during the optimization process. We showcase the utility of TinyNS by deploying microcontroller-class neurosymbolic models through several case studies. In all use cases, TinyNS outperforms purely neural or purely symbolic approaches while guaranteeing execution on real hardware.

13.
Sensors (Basel) ; 24(12)2024 Jun 09.
Article in English | MEDLINE | ID: mdl-38931532

ABSTRACT

The combination of deep-learning and IoT plays a significant role in modern smart solutions, providing the capability of handling task-specific real-time offline operations with improved accuracy and minimised resource consumption. This study provides a novel hardware-aware neural architecture search approach called ESC-NAS, to design and develop deep convolutional neural network architectures specifically tailored for handling raw audio inputs in environmental sound classification applications under limited computational resources. The ESC-NAS process consists of a novel cell-based neural architecture search space built with 2D convolution, batch normalization, and max pooling layers, and capable of extracting features from raw audio. A black-box Bayesian optimization search strategy explores the search space and the resulting model architectures are evaluated through hardware simulation. The models obtained from the ESC-NAS process achieved the optimal trade-off between model performance and resource consumption compared to the existing literature. The ESC-NAS models achieved accuracies of 85.78%, 81.25%, 96.25%, and 81.0% for the FSC22, UrbanSound8K, ESC-10, and ESC-50 datasets, respectively, with optimal model sizes and parameter counts for edge deployment.

14.
Biomimetics (Basel) ; 9(6)2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38921249

ABSTRACT

The evolution of super-resolution (SR) technology has seen significant advancements through the adoption of deep learning methods. However, the deployment of such models by resource-constrained devices necessitates models that not only perform efficiently, but also conserve computational resources. Binary neural networks (BNNs) offer a promising solution by minimizing the data precision to binary levels, thus reducing the computational complexity and memory requirements. However, for BNNs, an effective architecture is essential due to their inherent limitations in representing information. Designing such architectures traditionally requires extensive computational resources and time. With the advancement in neural architecture search (NAS), differentiable NAS has emerged as an attractive solution for efficiently crafting network structures. In this paper, we introduce a novel and efficient binary network search method tailored for image super-resolution tasks. We adapt the search space specifically for super resolution to ensure it is optimally suited for the requirements of such tasks. Furthermore, we incorporate Libra Parameter Binarization (Libra-PB) to maximize information retention during forward propagation. Our experimental results demonstrate that the network structures generated by our method require only a third of the parameters, compared to conventional methods, and yet deliver comparable performance.
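
To make the idea of information retention under binarization concrete, the sketch below centres and standardises a weight vector before taking the sign. This follows our understanding of the balancing step behind Libra-PB and is an assumption; the scaling factor used by the original method is omitted, and the function names and sample weights are hypothetical.

```python
import math
import random

def plain_binarize(weights):
    """Map each weight to +1 or -1 by its sign, with no balancing."""
    return [1 if w >= 0 else -1 for w in weights]

def balanced_binarize(weights):
    """Centre and standardise the weights, then map each to +1 or -1 by its sign."""
    n = len(weights)
    mean = sum(weights) / n
    std = math.sqrt(sum((w - mean) ** 2 for w in weights) / n) or 1.0  # guard constant input
    standardized = [(w - mean) / std for w in weights]
    return [1 if w >= 0 else -1 for w in standardized]

def bit_entropy(bits):
    """Shannon entropy (in bits) of the +1/-1 distribution; 1.0 means perfectly balanced."""
    p = sum(1 for b in bits if b == 1) / len(bits)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

rng = random.Random(0)
# Hypothetical weights with a positive offset, so plain sign() is heavily skewed to +1.
weights = [rng.gauss(0.8, 1.0) for _ in range(1000)]

plain = plain_binarize(weights)
balanced = balanced_binarize(weights)
```

Comparing `bit_entropy(plain)` with `bit_entropy(balanced)` shows why balancing matters: the closer the split between +1 and -1 is to even, the more information each binary weight can carry.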

15.
ACS Appl Mater Interfaces ; 16(23): 30166-30175, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38780088

ABSTRACT

Perovskite oxides are gaining significant attention for use in next-generation magnetic and ferroelectric devices due to their exceptional charge transport properties and the opportunity to tune the charge, spin, lattice, and orbital degrees of freedom. Interfaces between perovskite oxides, exemplified by La1-xSrxCoO3-δ/La1-xSrxMnO3-δ (LSCO/LSMO) bilayers, exhibit unconventional magnetic exchange switching behavior, offering a pathway for innovative designs in perovskite oxide-based devices. However, the precise atomic-level stoichiometric compositions and chemophysical properties of these interfaces remain elusive, hindering the establishment of surrogate design principles. We leverage first-principles simulations, evolutionary algorithms, and neural network searches with on-the-fly uncertainty quantification to design deep learning model ensembles to investigate over 50,000 LSCO/LSMO bilayer structures as a function of oxygen deficiency (δ) and strontium concentration (x). Structural analysis of the low-energy interface structures reveals that preferential segregation of oxygen vacancies toward the interfacial La0.7Sr0.3CoO3-δ layers causes distortion of the CoOx polyhedra and the emergence of magnetically active Co2+ ions. At the same time, an increase in the Sr concentration and a decrease in oxygen vacancies in the La0.7Sr0.3MnO3-δ layers tend to retain MnO6 octahedra and promote the formation of Mn4+ ions. Electronic structure analysis reveals that the nonuniform distributions of Sr ions and oxygen vacancies on both sides of the interface can alter the local magnetization at the interface, showing a transition from ferromagnetic (FM) to local antiferromagnetic (AFM) or ferrimagnetic regions. 
Therefore, the exotic properties of La1-xSrxCoO3-δ/La1-xSrxMnO3-δ are strongly coupled to the presence of hard/soft magnetic layers, as well as the FM to AFM transition at the interface, and can be tuned by changing the Sr concentration and oxygen partial pressure during growth. These insights provide valuable guidance for the precise design of perovskite oxide multilayers, enabling tailoring of their functional properties to meet specific requirements for various device applications.

16.
Neural Netw ; 175: 106312, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38642415

ABSTRACT

In recent years, there has been a significant advancement in memristor-based neural networks, positioning them as a pivotal processing-in-memory deployment architecture for a wide array of deep learning applications. Within this realm of progress, the emerging parallel analog memristive platforms are prominent for their ability to generate multiple feature maps in a single processing cycle. However, a notable limitation is that they are specifically tailored for neural networks with fixed structures. As an orthogonal direction, recent research reveals that neural architecture should be specialized for tasks and deployment platforms. Building upon this, neural architecture search (NAS) methods effectively explore promising architectures in a large design space. However, these NAS-based architectures are generally heterogeneous and diversified, making them challenging to deploy on current single-prototype, customized, parallel analog memristive hardware circuits. Therefore, investigating memristive analog deployment that covers the full search space is a promising and challenging problem. Inspired by this, and beginning with the DARTS search space, we study the memristive hardware design of primitive operations and propose the memristive all-inclusive hypernetwork that covers 2×10^25 network architectures. Our computational simulation results on 3 representative architectures (DARTS-V1, DARTS-V2, PDARTS) show that our memristive all-inclusive hypernetwork achieves promising results on the CIFAR10 dataset (89.2% of PDARTS with 8-bit quantization precision), and is compatible with all architectures in the DARTS full-space. The hardware performance simulation indicates that the memristive all-inclusive hypernetwork consumes slightly more resources (nearly the same in power, a 22%-25% increase in latency, 1.5× in area) relative to the individual deployment, which is reasonable and may reach a tolerable trade-off deployment scheme for industrial scenarios.


Subject(s)
Neural Networks, Computer; Computer Simulation; Deep Learning; Algorithms
17.
Health Inf Sci Syst ; 12(1): 22, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38469455

ABSTRACT

The utilization of lung sounds to diagnose lung diseases using respiratory sound features has significantly increased in the past few years. Digital stethoscope data have been examined extensively by medical researchers and technical scientists to diagnose the symptoms of respiratory diseases. Artificial intelligence-based approaches are applied in the real world to distinguish respiratory disease signs from human pulmonary auscultation sounds. A deep CNN (DCNN) model is implemented with combined multi-feature channels (Modified MFCC, Log Mel, and Soft Mel) to obtain the sound parameters from lung-based digital stethoscope data. The model is analyzed with and without max-pooling operations using multi-feature channels on respiratory digital stethoscope data. In addition, the work includes recently acquired COVID-19 sound data and enriched data to enhance model performance, together with L2 regularization to reduce the risk of overfitting caused by the limited amount of respiratory sound data. The suggested DCNN with max-pooling on the improved dataset demonstrates cutting-edge performance employing a multi-feature channel spectrogram. The model has been developed with different convolutional filter sizes (1×12, 1×24, 1×36, 1×48, and 1×60) that helped to test the proposed neural network. According to the experimental findings, the suggested DCNN architecture with a max-pooling function identifies respiratory disease symptoms better than the DCNN without max-pooling. In order to demonstrate the model's effectiveness in categorization, it is trained and tested with the DCNN model that extracts several modalities of respiratory sound data.

18.
ACS Appl Mater Interfaces ; 16(10): 13326-13334, 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38480983

ABSTRACT

Flexible sensors for application in various industries, including biomedicine and wearable electronics, are frequently made using silver nanoparticle (AgNP) inks and inkjet printing (IJP) technology. Inkjet-printed flexible electronic devices are made up of many printed lines that run parallel to each other, and the surface morphology of the printed lines and the interline state directly impact the electrical conductivity of the electronic devices. This paper describes the experimental setup for IJP, the definition of print line characteristics, and common unavoidable defects. Conductivity and physical defects are considered in defining the print line quality assessment. In addition, two prediction models of flexible sensors before batch printing and a model for detecting defects after printing are provided. The predictive models can guide actions, leading to a print success rate of over 80%. We build the defect detection model using a neural architecture search because manually fine-tuning neural networks for reference is challenging. Finally, a target detection model with a mAP@0.5 of 81.2% is built in just 0.77 graphics processing unit (GPU) days. The model takes only 4.6 ms to detect an image, satisfying the real-time monitoring needs. At the same time, an accuracy of 95.5% can be achieved in the test data set. This work provides a new idea for the high-volume preparation of flexible sensors.
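
The mAP@0.5 figure quoted in this abstract depends on an intersection-over-union (IoU) test between predicted and ground-truth boxes. The sketch below shows that test under the usual convention that a detection counts as correct when its IoU with a ground-truth box is at least 0.5. The box format `(x_min, y_min, x_max, y_max)` and the function names are assumptions, and the full mAP computation (ranking by confidence, precision-recall integration) is not shown.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    inter = max(0.0, inter_w) * max(0.0, inter_h)  # zero when the boxes do not overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(predicted, ground_truth, threshold=0.5):
    """A detection matches when its IoU with the ground truth reaches the threshold."""
    return iou(predicted, ground_truth) >= threshold

# Hypothetical boxes: a defect annotation and two candidate detections.
ground_truth = (0, 0, 10, 10)
good_detection = (2, 0, 12, 10)   # overlaps most of the annotation
poor_detection = (8, 0, 18, 10)   # overlaps only a thin strip
```

With these boxes, `good_detection` passes the 0.5 threshold and `poor_detection` does not, which is the per-detection decision that the mAP@0.5 metric aggregates.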

19.
Neural Netw ; 174: 106263, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38547802

ABSTRACT

Channel pruning is one of the most widespread techniques used to compress deep neural networks while maintaining their performance. Currently, a typical pruning algorithm leverages neural architecture search to directly find networks with a configurable width, the key step of which is to identify representative subnets for various pruning ratios by training a supernet. However, current methods mainly follow a serial training strategy to optimize the supernet, which is very time-consuming. In this work, we introduce PSE-Net, a novel parallel-subnets estimator for efficient channel pruning. Specifically, we propose a parallel-subnets training algorithm that simulates the forward-backward pass of multiple subnets by dropping extraneous features on the batch dimension, so that various subnets can be trained in one round. Our proposed algorithm improves the efficiency of supernet training and equips the network with the ability to interpolate the accuracy of unsampled subnets, enabling PSE-Net to effectively evaluate and rank the subnets. Over the trained supernet, we develop a prior-distributed-based sampling algorithm to boost the performance of classical evolutionary search. This algorithm utilizes prior information from the supernet training phase to assist in the search for optimal subnets while tackling the challenge of discovering samples that satisfy resource constraints due to the long-tail distribution of network configurations. Extensive experiments demonstrate that PSE-Net outperforms previous state-of-the-art channel pruning methods on the ImageNet dataset while retaining superior supernet training efficiency. For example, under a 300M FLOPs constraint, our pruned MobileNetV2 achieves 75.2% Top-1 accuracy on the ImageNet dataset, exceeding the original MobileNetV2 by 2.6 units while costing only 30%/16% of the time of BCNet/AutoSlim.


Subject(s)
Algorithms; Neural Networks, Computer; Biological Evolution
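
One way to read the phrase about dropping features on the batch dimension in the PSE-Net abstract is that each sample in a batch is masked to a different channel width, so a single forward-backward pass exercises several subnets at once. The sketch below illustrates only that reading; the masking rule, the width ratios and the function names are assumptions and not the published algorithm.

```python
def channel_mask(num_channels, width_ratio):
    """Keep the first round(width_ratio * num_channels) channels and zero the rest."""
    kept = max(1, round(num_channels * width_ratio))
    return [1.0 if c < kept else 0.0 for c in range(num_channels)]

def mask_batch(features, width_ratios):
    """Apply a different channel mask to each sample, so one batch carries several subnets."""
    return [
        [value * keep for value, keep in zip(sample, channel_mask(len(sample), ratio))]
        for sample, ratio in zip(features, width_ratios)
    ]

# Hypothetical batch: 4 samples with 8 channels each, all activations set to 1.0.
features = [[1.0] * 8 for _ in range(4)]
width_ratios = [0.25, 0.5, 0.75, 1.0]  # one pruning ratio per sample

masked = mask_batch(features, width_ratios)
active_channels = [sum(row) for row in masked]
```

Here `active_channels` counts how many channels survive for each sample, so the four samples behave like four subnets of increasing width sharing the same underlying weights.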
20.
Neural Netw ; 173: 106172, 2024 May.
Article in English | MEDLINE | ID: mdl-38402808

ABSTRACT

Spiking neural networks (SNNs) are brain-inspired models that utilize discrete and sparse spikes to transmit information, thus having the property of energy efficiency. Recent advances in learning algorithms have greatly improved SNN performance due to the automation of feature engineering. While the choice of neural architecture plays a significant role in deep learning, the current SNN architectures are mainly designed manually, which is a time-consuming and error-prone process. In this paper, we propose a spiking neural architecture search (NAS) method that can automatically find efficient SNNs. To tackle the challenge of long search time faced by SNNs when utilizing NAS, the proposed NAS encodes candidate architectures in a branchless spiking supernet which significantly reduces the computation requirements in the search process. Considering that real-world tasks prefer efficient networks with optimal accuracy under a limited computational budget, we propose a Synaptic Operation (SynOps)-aware optimization to automatically find the computationally efficient subspace of the supernet. Experimental results show that, in less search time, our proposed NAS can find SNNs with higher accuracy and lower computational cost than state-of-the-art SNNs. We also conduct experiments to validate the search process and the trade-off between accuracy and computational cost.


Subject(s)
Algorithms; Neural Networks, Computer; Automation; Engineering