Deeply Supervised Block-Wise Neural Architecture Search.

Yang, An; Liu, Ying; Li, Chunguang; Ren, Qinyuan

ABSTRACT

Neural architecture search (NAS) has shown great promise in automatically designing neural network models. Recently, block-wise NAS has been proposed to alleviate the deep coupling between architectures and weights that exists in the well-known weight-sharing NAS, by training the huge weight-sharing supernet block by block. However, existing block-wise NAS methods, which resort to either a supervised distillation scheme or a self-supervised contrastive learning scheme to enable block-wise optimization, incur massive computational cost. Specifically, the former introduces an external high-capacity teacher model, while the latter involves a supernet-scale momentum model and requires a long training schedule. In this work, we therefore propose a resource-friendly deeply supervised block-wise NAS (DBNAS) method. In DBNAS, we construct a lightweight deeply supervised module after each block to enable a simple supervised learning scheme, and leverage ground-truth labels to indirectly supervise the optimization of each block progressively. In addition, the deeply supervised module is specifically designed as a structural and functional condensation of the supernet, which establishes global awareness for progressive block-wise optimization and helps search for promising architectures. Experimental results show that DBNAS takes less than 1 GPU day to search out promising architectures on the ImageNet dataset, with a smaller GPU memory footprint than other block-wise NAS works. The best-performing model in the searched DBNAS family achieves 75.6% Top-1 accuracy on ImageNet, which is competitive with state-of-the-art NAS models. Moreover, our DBNAS family models also achieve good transfer performance on CIFAR-10/100, as well as on two downstream tasks, object detection and semantic segmentation.
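The general idea of progressive block-wise optimization with a per-block supervised head can be illustrated with a toy sketch. This is not the DBNAS implementation: the scalar "blocks", the two candidate activations, the linear auxiliary head (a stand-in for the deeply supervised module, which the abstract describes as a condensation of the supernet), the uniform candidate sampling, and every name below are assumptions made purely for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical candidate operations per block: (activation, derivative
# expressed in terms of the activation's output). Not the paper's search space.
CANDIDATES = {
    "tanh": (math.tanh, lambda h: 1.0 - h * h),
    "sigmoid": (sigmoid, lambda h: h * (1.0 - h)),
}


def bce(p, y):
    # Binary cross-entropy against the ground-truth label y.
    p = min(max(p, 1e-7), 1.0 - 1e-7)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def train_block(inputs, labels, rng, steps=2000, lr=0.1):
    # One (w, b) pair per candidate op, plus one auxiliary head (a, c)
    # attached after the block and shared by all of its candidates.
    params = {name: [rng.uniform(-0.5, 0.5), 0.0] for name in CANDIDATES}
    head = [rng.uniform(-0.5, 0.5), 0.0]
    names = sorted(CANDIDATES)
    for _ in range(steps):
        i = rng.randrange(len(inputs))
        name = rng.choice(names)  # assumed uniform single-path sampling
        act, dact = CANDIDATES[name]
        w, b = params[name]
        h = act(w * inputs[i] + b)
        p = sigmoid(head[0] * h + head[1])
        g_s = p - labels[i]               # dLoss/d(head logit)
        g_z = g_s * head[0] * dact(h)     # dLoss/d(block pre-activation)
        head[0] -= lr * g_s * h
        head[1] -= lr * g_s
        params[name][0] -= lr * g_z * inputs[i]
        params[name][1] -= lr * g_z
    return params, head


def evaluate(params, head, name, inputs, labels):
    # Mean label loss of one candidate, measured through the block's head.
    act, _ = CANDIDATES[name]
    w, b = params[name]
    losses = [bce(sigmoid(head[0] * act(w * x + b) + head[1]), y)
              for x, y in zip(inputs, labels)]
    return sum(losses) / len(losses)


def search(n_blocks=3, seed=0):
    rng = random.Random(seed)
    feats = [rng.uniform(-2.0, 2.0) for _ in range(200)]
    labels = [1.0 if x > 0 else 0.0 for x in feats]
    chosen = []
    for _ in range(n_blocks):
        # Train only the current block and its head; earlier blocks are frozen.
        params, head = train_block(feats, labels, rng)
        losses = {n: evaluate(params, head, n, feats, labels) for n in CANDIDATES}
        best = min(losses, key=losses.get)
        chosen.append((best, losses[best]))
        act, _ = CANDIDATES[best]
        w, b = params[best]
        # The frozen output of the selected candidate feeds the next block.
        feats = [act(w * f + b) for f in feats]
    return chosen
```

Calling `search()` returns one `(candidate_name, head_loss)` pair per block. The sketch only shows that each block can be optimized and ranked against ground-truth labels without a teacher or momentum model; how DBNAS actually builds its module, samples paths, and ranks architectures is not specified in the abstract.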