Nat Commun ; 14(1): 908, 2023 Feb 17.
Article in English | MEDLINE | ID: mdl-36804926


Deep neural networks (DNNs) are powerful tools for compressing and distilling information. Their scale and complexity, often involving billions of inter-dependent parameters, render direct microscopic analysis difficult. Under such circumstances, a common strategy is to identify slow variables that average the erratic behavior of the fast microscopic variables. Here, we identify a similar separation of scales occurring in fully trained finitely over-parameterized deep convolutional neural networks (CNNs) and fully connected networks (FCNs). Specifically, we show that DNN layers couple only through the second cumulant (kernels) of their activations and pre-activations. Moreover, the latter fluctuates in a nearly Gaussian manner. For infinite width DNNs, these kernels are inert, while for finite ones they adapt to the data and yield a tractable data-aware Gaussian Process. The resulting thermodynamic theory of deep learning yields accurate predictions in various settings. In addition, it provides new ways of analyzing and understanding DNNs in general.

Phys Rev Lett ; 127(24): 240603, 2021 Dec 10.
Article in English | MEDLINE | ID: mdl-34951810


Identifying the relevant degrees of freedom in a complex physical system is a key stage in developing effective theories in and out of equilibrium. The celebrated renormalization group provides a framework for this, but its practical execution in unfamiliar systems is fraught with ad hoc choices, whereas machine learning approaches, though promising, lack formal interpretability. Here we present an algorithm employing state-of-the-art results in machine-learning-based estimation of information-theoretic quantities, overcoming these challenges, and use this advance to develop a new paradigm in identifying the most relevant operators describing properties of the system. We demonstrate this on an interacting model, where the emergent degrees of freedom are qualitatively different from the microscopic constituents. Our results push the boundary of formally interpretable applications of machine learning, conceptually paving the way toward automated theory building.

Phys Rev Lett ; 126(24): 240601, 2021 Jun 18.
Article in English | MEDLINE | ID: mdl-34213918


The analysis of complex physical systems hinges on the ability to extract the relevant degrees of freedom from among the many others. Though much hope is placed in machine learning, it also brings challenges, chief of which is interpretability. It is often unclear what relation, if any, the architecture- and training-dependent learned "relevant" features bear to standard objects of physical theory. Here we report on theoretical results which may help to systematically address this issue: we establish equivalence between the field-theoretic relevance of the renormalization group and an information-theoretic notion of relevance we define using the information bottleneck (IB) formalism of compression theory. We show analytically that for statistical physical systems described by a field theory, the relevant degrees of freedom found using IB compression indeed correspond to operators with the lowest scaling dimensions. We confirm our field-theoretic predictions numerically and study the dependence of the IB solutions on the physical symmetries of the data. Our findings provide a dictionary connecting two distinct theoretical toolboxes, and an example of constructively incorporating physical interpretability in applications of deep learning in physics.

Phys Rev E ; 104(6-1): 064106, 2021 Dec.
Article in English | MEDLINE | ID: mdl-35030903


Real-space mutual information (RSMI) was shown to be an important quantity, formally and from a numerical standpoint, in finding coarse-grained descriptions of physical systems. It very generally quantifies spatial correlations and can give rise to constructive algorithms extracting relevant degrees of freedom. Efficient and reliable estimation or maximization of RSMI is, however, numerically challenging. A recent breakthrough in theoretical machine learning has been the introduction of variational lower bounds for mutual information, parametrized by neural networks. Here we describe in detail how these results can be combined with differentiable coarse-graining operations to develop a single unsupervised neural-network-based algorithm, the RSMI-NE, efficiently extracting the relevant degrees of freedom in the form of the operators of effective field theories, directly from real-space configurations. We study the information contained in the statistical ensemble of constructed coarse-graining transformations and its recovery from partial input data using a secondary machine learning analysis applied to this ensemble. In particular, we show how symmetries, also emergent, can be identified. We demonstrate the extraction of the phase diagram and the order parameters for equilibrium systems and consider also an example of a nonequilibrium problem.

Phys Rev E ; 104(6-1): 064301, 2021 Dec.
Article in English | MEDLINE | ID: mdl-35030925


A recent line of work studied wide deep neural networks (DNNs) by approximating them as Gaussian processes (GPs). A DNN trained with gradient flow was shown to map to a GP governed by the neural tangent kernel (NTK), whereas earlier works showed that a DNN with an i.i.d. prior over its weights maps to the so-called neural network Gaussian process (NNGP). Here we consider a DNN training protocol, involving noise, weight decay, and finite width, whose outcome corresponds to a certain non-Gaussian stochastic process. An analytical framework is then introduced to analyze this non-Gaussian process, whose deviation from a GP is controlled by the finite width. Our contribution is threefold: (i) In the infinite width limit, we establish a correspondence between DNNs trained with noisy gradients and the NNGP, not the NTK. (ii) We provide a general analytical form for the finite width correction (FWC) for DNNs with arbitrary activation functions and depth and use it to predict the outputs of empirical finite networks with high accuracy. Analyzing the FWC behavior as a function of n, the training set size, we find that it is negligible both in the very small n regime and, surprisingly, in the large n regime [where the GP error scales as O(1/n)]. (iii) We flesh out algebraically how these FWCs can improve the performance of finite convolutional neural networks (CNNs) relative to their GP counterparts on image classification tasks.
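As a rough illustration of the NNGP correspondence mentioned in this abstract, the sketch below computes the kernel of a single hidden ReLU layer with i.i.d. standard Gaussian weights and uses it for GP regression. The architecture (one hidden layer, ReLU, unit weight variance, no biases), the toy data, the function names, and the value of `noise_var` are all assumptions made for illustration; this is not the paper's training protocol and it does not include the finite width correction.

```python
import numpy as np

def relu_nngp_kernel(X1, X2):
    """Closed form of E_w[relu(w.x) * relu(w.y)] for w ~ N(0, I).

    This is the degree-1 arc-cosine kernel divided by 2 (illustrative choice).
    """
    n1 = np.linalg.norm(X1, axis=1)
    n2 = np.linalg.norm(X2, axis=1)
    norms = np.outer(n1, n2)
    # Cosine of the angle between inputs, clipped for numerical safety.
    cos = np.clip(X1 @ X2.T / np.maximum(norms, 1e-12), -1.0, 1.0)
    theta = np.arccos(cos)
    return norms * (np.sin(theta) + (np.pi - theta) * cos) / (2 * np.pi)

def gp_posterior_mean(K_train, K_test_train, y, noise_var):
    """Standard GP regression mean: K_* (K + noise_var * I)^{-1} y."""
    alpha = np.linalg.solve(K_train + noise_var * np.eye(len(y)), y)
    return K_test_train @ alpha

# Toy data (hypothetical): 20 points in 5 dimensions, smooth scalar target.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = np.sin(X[:, 0])

K = relu_nngp_kernel(X, X)
pred = gp_posterior_mean(K, K, y, noise_var=1e-2)
```

Here `pred` is the GP mean evaluated back on the training inputs; replacing the second argument of `gp_posterior_mean` with `relu_nngp_kernel(X_test, X)` would give predictions at new points.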

Sci Rep ; 9(1): 17802, 2019 Nov 28.
Article in English | MEDLINE | ID: mdl-31780783


The growing field of nano nuclear magnetic resonance (nano-NMR) seeks to estimate spectra or discriminate between spectra of minuscule amounts of complex molecules. While this field holds great promise, nano-NMR experiments suffer from detrimental inherent noise. This strong noise masks the weak signal and results in a very low signal-to-noise ratio. Moreover, the noise model is usually complex and unknown, which renders the data processing of the measurement results very complicated. Hence, spectra discrimination is hard to achieve and, in particular, it is difficult to reach the optimal discrimination. In this work we present strong indications that this difficulty can be overcome by deep learning (DL) algorithms. The DL algorithms can mitigate the adversarial effects of the noise efficiently by effectively learning the noise model. We show that in the case of frequency discrimination, DL algorithms reach the optimal discrimination without having any pre-knowledge of the physical model. Moreover, the DL discrimination scheme outperforms Bayesian methods when verified on noisy experimental data obtained by a single Nitrogen-Vacancy (NV) center. In the case of frequency resolution we show that this approach outperforms Bayesian methods even when the latter have full pre-knowledge of the noise model and the former has none. These DL algorithms also emerge as much more efficient in terms of computational resources and run times. Since in many real-world scenarios the noise is complex and difficult to model, we argue that DL is likely to become a dominant tool in the field.

Sci Adv ; 3(9): e1701758, 2017 09.
Article in English | MEDLINE | ID: mdl-28959729


It is believed that not all quantum systems can be simulated efficiently using classical computational resources. This notion is supported by the fact that it is not known how to express the partition function in a sign-free manner in quantum Monte Carlo (QMC) simulations for a large number of important problems. Whether there is a fundamental obstruction to such a sign-free representation in generic quantum systems remains unclear. Focusing on systems with bosonic degrees of freedom, we show that quantized gravitational responses appear as obstructions to local sign-free QMC. In condensed matter physics settings, these responses, such as thermal Hall conductance, are associated with fractional quantum Hall effects. We show that similar arguments also hold in the case of spontaneously broken time-reversal (TR) symmetry such as in the chiral phase of a perturbed quantum Kagome antiferromagnet. The connection between quantized gravitational responses and the sign problem is also manifested in certain vertex models, where TR symmetry is preserved.

Phys Rev Lett ; 111(22): 226401, 2013 Nov 27.
Article in English | MEDLINE | ID: mdl-24329460


One-dimensional (1D) quasicrystals exhibit physical phenomena associated with the 2D integer quantum Hall effect. Here, we transcend dimensions and show that a previously inaccessible phase of matter, the 4D integer quantum Hall effect, can be incorporated in a 2D quasicrystal. Correspondingly, our 2D model has a quantized charge pump accommodated by elaborate edge phenomena with protected level crossings. We propose experiments to observe these 4D phenomena, and generalize our results to a plethora of topologically equivalent quasicrystals. Thus, 2D quasicrystals may pave the way to the experimental study of 4D physics.

Phys Rev Lett ; 109(10): 106402, 2012 Sep 07.
Article in English | MEDLINE | ID: mdl-23005308


The unrelated discoveries of quasicrystals and topological insulators have in turn challenged prevailing paradigms in condensed-matter physics. We find a surprising connection between quasicrystals and topological phases of matter: (i) quasicrystals exhibit nontrivial topological properties and (ii) these properties are attributed to dimensions higher than that of the quasicrystal. Specifically, we show, both theoretically and experimentally, that one-dimensional quasicrystals are assigned two-dimensional Chern numbers and, respectively, exhibit topologically protected boundary states equivalent to the edge states of a two-dimensional quantum Hall system. We harness the topological nature of these states to adiabatically pump light across the quasicrystal. We generalize our results to higher-dimensional systems and other topological indices. Hence, quasicrystals offer a new platform for the study of topological phases while their topology may better explain their surface properties.
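For readers who want something concrete to experiment with, the sketch below builds a 1D quasiperiodic tight-binding Hamiltonian of the Aubry-André (Harper) type and diagonalizes it on an open chain. The abstract does not specify a model, so the choice of this model, the function name, and every parameter value (`t`, `lam`, `b`, `phi`, the chain length) are assumptions made purely for illustration. In such models, sweeping the phase `phi` and tracking in-gap eigenvalues is one way to look for states localized at the chain ends.

```python
import numpy as np

def aubry_andre_hamiltonian(n_sites, t=1.0, lam=0.5,
                            b=(1 + np.sqrt(5)) / 2, phi=0.0):
    """Open-chain tight-binding Hamiltonian with a quasiperiodic on-site term.

    H = t * sum_n (|n><n+1| + h.c.) + lam * sum_n cos(2*pi*b*n + phi) |n><n|
    An irrational b (here the golden ratio) makes the modulation quasiperiodic.
    """
    sites = np.arange(n_sites)
    onsite = lam * np.cos(2 * np.pi * b * sites + phi)
    hop = t * np.ones(n_sites - 1)
    return np.diag(onsite) + np.diag(hop, 1) + np.diag(hop, -1)

# Illustrative parameters only: 89 sites, default modulation strength and phase.
H = aubry_andre_hamiltonian(89)
energies, states = np.linalg.eigh(H)  # columns of `states` are eigenvectors
```

Setting `lam=0` recovers the uniform open chain, whose spectrum is known in closed form, which makes a convenient sanity check for the construction.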