Retrieval of aerosol properties from in situ, multi-angle light scattering measurements using invertible neural networks

By Romana Boiger, Rob L. Modini, Alireza Moallemi, David Degen, Martin Gysel-Beer, Andreas Adelmann
arxiv.org
 5 days ago

Atmospheric aerosols have a major influence on the Earth's climate and public health. Hence, studying their properties and recovering them from light scattering measurements is of great importance. State-of-the-art retrieval methods such as pre-computed look-up tables and iterative, physics-based algorithms can suffer from...

Learning a compass spin model with neural network quantum states

Neural network quantum states provide a novel representation of the many-body states of interacting quantum systems and open up a promising route to solve frustrated quantum spin models that evade other numerical approaches. Yet their capacity to describe complex magnetic orders with large unit cells has not been demonstrated, and their performance in a rugged energy landscape has been questioned. Here we apply restricted Boltzmann machines and stochastic gradient descent to seek the ground states of a compass spin model on the honeycomb lattice, which unifies the Kitaev model, Ising model and the quantum 120$^\circ$ model with a single tuning parameter. We report calculation results on the variational energy, order parameters and correlation functions. The phase diagram obtained is in good agreement with the predictions of tensor network ansatz, demonstrating the capacity of restricted Boltzmann machines in learning the ground states of frustrated quantum spin Hamiltonians. The limitations of the calculation are discussed. A few strategies are outlined to address some of the challenges in machine learning frustrated quantum magnets.
An Application of Quantum Machine Learning on Quantum Correlated Systems: Quantum Convolutional Neural Network as a Classifier for Many-Body Wavefunctions from the Quantum Variational Eigensolver

Machine learning has been applied on a wide variety of models, from classical statistical mechanics to quantum strongly correlated systems for the identification of phase transitions. The recently proposed quantum convolutional neural network (QCNN) provides a new framework for using quantum circuits instead of classical neural networks as the backbone of classification methods. We present here the results from training the QCNN by the wavefunctions of the variational quantum eigensolver for the one-dimensional transverse field Ising model (TFIM). We demonstrate that the QCNN identifies wavefunctions which correspond to the paramagnetic phase and the ferromagnetic phase of the TFIM with good accuracy. The QCNN can be trained to predict the corresponding phase of wavefunctions around the putative quantum critical point, even though it is trained by wavefunctions far away from it. This provides a basis for exploiting the QCNN to identify the quantum critical point.
Predicting Lattice Phonon Vibrational Frequencies Using Deep Graph Neural Networks

Lattice vibration frequencies are related to many important materials properties such as thermal and electrical conductivity as well as superconductivity. However, calculating vibration frequencies with density functional theory (DFT) methods is too computationally demanding for the large number of samples involved in materials screening. Here we propose a deep graph neural network-based algorithm for predicting crystal vibration frequencies from crystal structures with high accuracy. Our algorithm addresses the variable dimension of the vibration frequency spectrum using a zero padding scheme. Benchmark studies on two data sets with 15,000 and 35,552 samples show that the aggregated $R^2$ scores of the predictions reach 0.554 and 0.724, respectively. Our work demonstrates the capability of deep graph neural networks to learn to predict phonon spectrum properties of crystal structures in addition to phonon density of states (DOS) and electronic DOS, in which the output dimension is constant.
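To illustrate the zero padding idea mentioned in the abstract, here is a minimal sketch in plain Python. It pads variable-length frequency lists with trailing zeros so that every sample has the same output dimension. The function name `zero_pad_spectra`, the example frequencies, and the accompanying 0/1 mask are assumptions made for illustration; the abstract does not say how the authors implement or mask the padding.

```python
from typing import List, Tuple


def zero_pad_spectra(
    spectra: List[List[float]],
) -> Tuple[List[List[float]], List[List[int]]]:
    """Pad each frequency list with trailing zeros up to the longest length.

    Returns the padded targets and a 0/1 mask marking the real entries.
    The mask is an illustrative addition, not something the abstract describes.
    """
    width = max((len(s) for s in spectra), default=0)
    padded = [list(s) + [0.0] * (width - len(s)) for s in spectra]
    mask = [[1] * len(s) + [0] * (width - len(s)) for s in spectra]
    return padded, mask


# Hypothetical frequencies (arbitrary units) for crystals with 3, 6 and 4 modes.
spectra = [
    [1.2, 3.4, 5.6],
    [0.5, 0.9, 2.1, 2.2, 4.0, 4.4],
    [0.7, 1.1, 3.3, 3.9],
]
padded, mask = zero_pad_spectra(spectra)
```

After padding, every target row has the same length, so a fixed-size output layer can be trained on all samples; a mask like the one above could be used to exclude the padded positions from a loss or score.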
Tuneable spin-glass optical simulator based on multiple light scattering

The race to heuristically solve non-deterministic polynomial-time (NP) problems through efficient methods is ongoing. Recently, optics was demonstrated as a promising tool to find the ground state of a spin-glass Ising Hamiltonian, which represents an archetypal NP problem. However, achieving completely programmable spin couplings in these large-scale optical Ising simulators remains an open challenge. Here, by exploiting the knowledge of the transmission matrix of a random medium, we experimentally demonstrate the possibility of controlling the couplings of a fully connected Ising spin system. By further tailoring the input wavefront we showcase the possibility of modifying the Ising Hamiltonian both by accounting for an external magnetic field and by controlling the number of degenerate ground states and their properties and probabilities. Our results represent a relevant step toward the realisation of fully-programmable Ising machines on thin optical platforms, capable of solving complex spin-glass Hamiltonians on a large scale.
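As a rough illustration of the quantities this abstract refers to (spin couplings, an external field, and degenerate ground states), the sketch below evaluates a fully connected Ising energy and finds the ground states of a tiny system by brute force. The sign convention E = -sum_{i<j} J_ij s_i s_j - sum_i h_i s_i, the function names, and the three-spin example are assumptions for illustration only; this is a classical enumeration, not the optical method of the paper.

```python
from itertools import product


def ising_energy(spins, J, h):
    """Energy E = -sum_{i<j} J[i][j] s_i s_j - sum_i h[i] s_i (assumed convention)."""
    n = len(spins)
    pair = sum(
        J[i][j] * spins[i] * spins[j] for i in range(n) for j in range(i + 1, n)
    )
    field = sum(h[i] * spins[i] for i in range(n))
    return -pair - field


def ground_states(J, h, tol=1e-12):
    """Enumerate all 2^n spin configurations and return (min energy, minimisers)."""
    best, states = None, []
    for spins in product((-1, 1), repeat=len(h)):
        e = ising_energy(spins, J, h)
        if best is None or e < best - tol:
            best, states = e, [spins]
        elif abs(e - best) <= tol:
            states.append(spins)
    return best, states


# Hypothetical example: three spins with antiferromagnetic couplings (frustrated).
J = [[0, -1, -1], [-1, 0, -1], [-1, -1, 0]]
e_free, gs_free = ground_states(J, [0.0, 0.0, 0.0])
e_field, gs_field = ground_states(J, [0.5, 0.0, 0.0])
```

In this toy case the zero-field system has six degenerate ground states, and a small field on the first spin halves that number, which is the kind of control over degeneracy the abstract describes. Brute-force enumeration scales as 2^n and is only practical for very small systems.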
Spatiotemporal observation of light propagation in a three-dimensional scattering medium

Spatiotemporal information about light pulse propagation obtained with femtosecond temporal resolution plays an important role in understanding transient phenomena and light–matter interactions. Although ultrafast optical imaging techniques have been developed, it is still difficult to capture light pulse propagation spatiotemporally. Furthermore, imaging through a three-dimensional (3-D) scattering medium is a longstanding challenge due to the optical scattering caused by the interaction between light pulse and a 3-D scattering medium. Here, we propose a technique for ultrafast optical imaging of light pulses propagating inside a 3-D scattering medium. We record an image of the light pulse propagation using the ultrashort light pulse even when the interaction between light pulse and a 3-D scattering medium causes the optical scattering. We demonstrated our proposed technique by recording converging, refracted, and diffracted propagating light for 59 ps with femtosecond temporal resolution.
GAlaxy Light profile convolutional neural NETworks (GaLNets). I. fast and accurate structural parameters for billion galaxy samples

Next generation large sky surveys, from ground and space, will observe up to billions of galaxies for which basic structural parameters are needed to study their evolution. This is a challenging task that, for ground-based observations, is complicated by the seeing limited point-spread-function (PSF), strongly affecting the intrinsic light profile of galaxies. To perform fast and accurate analysis of galaxy surface brightness, we have developed a family of "supervised" Convolutional Neural Network (CNN) tools to derive Sérsic profile parameters of galaxies. In this work, we present the first two Galaxy Light profile convolutional neural Networks (GaLNets) of this family. A first one, trained using galaxy images only (GaLNet-1), and a second one, trained with both galaxy images and the "local" PSF (GaLNet-2). The two CNNs have been tested on a subset of public data from the Kilo-Degree Survey (KiDS), as a pathfinder dataset for high-quality ground-based observations. We have compared the results from the two CNNs with structural parameters (namely the total magnitude $mag$, the effective radius $R_{\rm eff}$, and Sérsic index $n$) derived for the same galaxies by 2DPHOT, as a representative of "standard" PSF-convolved Sérsic fitting tools. The comparison shows that, provided a suitable prior distribution is adopted, GaLNet-2 can reach an accuracy as high as 2DPHOT, while GaLNet-1 performs slightly worse because it misses the information on the "local" PSF. In terms of computational speed, both GaLNets are more than three orders of magnitude faster than standard methods. This first application of CNN to ground-based galaxy surface photometry shows that CNNs are promising tools to perform parametric analyses of very large samples of galaxy light profiles, as expected from surveys like Vera Rubin/LSST, Euclid mission and the Chinese Space Station Telescope.
Piezoelectric modulus prediction using machine learning and graph neural networks

Piezoelectric materials are widely used in many industrial applications, such as electric cigarette lighters, diesel engines and x-ray shutters. However, discovering high-performance and environmentally friendly (e.g. lead-free) piezoelectric materials is a difficult problem due to the sophisticated relationships between materials' composition/structures and the piezoelectric effect. Compared to other material properties such as formation energy, band gap, and bulk modulus, it is much more challenging to predict piezoelectric coefficients. Here, we present a comprehensive study on designing and evaluating advanced machine learning models for predicting the piezoelectric modulus from materials' composition and/or structures. We train the prediction models based on extensive feature engineering combined with machine learning models (Random Forest and Support Vector Machines) and automated feature learning based on deep graph neural networks. Our SVM model with crystal structure features outperforms other methods. We also use this model to predict the piezoelectric coefficients for 12,680 materials from the Materials Project database and report the top 20 potential high performance piezoelectric materials.
Pairwise interactions for Potential energy surfaces and Atomic forces with Deep Neural network

Molecular dynamics (MD) simulation, which is considered an important tool for studying physical and chemical processes at the atomic scale, requires accurate calculations of energies and forces. Although reliable energies and forces can be obtained by electronic structure calculations such as those based on density functional theory (DFT), this approach is computationally expensive. In this work, we propose a full-stack model using a deep neural network (NN) to enhance the calculation of force and energy. The NN is designed to extract the embedding features of the pairwise interactions between an atom and its neighbors, which are aggregated to obtain the atom's feature vector for predicting atomic force and potential energy. By designing the features of the pairwise interactions, we can control the performance of models and take into account the many-body effects and other physics of the atomic interactions. Moreover, we demonstrate that using the Coulomb matrix of the local structures to complement the pairwise information improves the prediction of force and energy for silicon systems, and we confirm that our models transfer to larger systems with high accuracy.
In-situ measurements of dendrite tip shape selection in a metallic alloy

The size and shape of the primary dendrite tips determine the principal length scale of the microstructure evolving during solidification of alloys. In-situ X-ray measurements of the tip shape in metals have been unsuccessful so far due to insufficient spatial resolution or high image noise. To overcome these limitations, high-resolution synchrotron radiography and advanced image processing techniques are applied to a thin sample of a solidifying Ga-35wt.%In alloy. Quantitative in-situ measurements are performed of the growth of dendrite tips during the fast initial transient and the subsequent steady growth period, with tip velocities ranging over almost two orders of magnitude. The value of the dendrite tip shape selection parameter is found to be $\sigma^* = 0.0768$, which suggests an interface energy anisotropy of $\varepsilon_4 = 0.015$ for the present Ga-In alloy. The non-axisymmetric dendrite tip shape amplitude coefficient is measured to be $A_4 \approx 0.004$, which is in excellent agreement with the universal value previously established for dendrites.
Efficient Neural Network Training via Forward and Backward Propagation Sparsification

Sparse training is a natural idea to accelerate the training speed of deep neural networks and save the memory usage, especially since large modern neural networks are significantly over-parameterized. However, most of the existing methods cannot achieve this goal in practice because the chain rule based gradient (w.r.t. structure parameters) estimators adopted by previous methods require dense computation at least in the backward propagation step. This paper solves this problem by proposing an efficient sparse training method with completely sparse forward and backward passes. We first formulate the training process as a continuous minimization problem under global sparsity constraint. We then separate the optimization process into two steps, corresponding to weight update and structure parameter update. For the former step, we use the conventional chain rule, which can be sparse via exploiting the sparse structure. For the latter step, instead of using the chain rule based gradient estimators as in existing methods, we propose a variance reduced policy gradient estimator, which only requires two forward passes without backward propagation, thus achieving completely sparse training. We prove that the variance of our gradient estimator is bounded. Extensive experimental results on real-world datasets demonstrate that compared to previous methods, our algorithm is much more effective in accelerating the training process, up to an order of magnitude faster.
Physics-informed neural networks for understanding shear migration of particles in viscous flow

We harness the physics-informed neural network (PINN) approach to extend the utility of phenomenological models for particle migration in shear flow. Specifically, we propose to constrain the neural network training via a model for the physics of shear-induced particle migration in suspensions. Then, we train the PINN against experimental data from the literature, showing that this approach provides both better fidelity to the experiments, and novel understanding of the relative roles of the hypothesized migration fluxes. We first verify the PINN approach for solving the inverse problem of radial particle migration in a non-Brownian suspension in an annular Couette flow. In this classical case, the PINN yields the same value (as reported in the literature) for the ratio of the two parameters of the empirical model. Next, we apply the PINN approach to analyze experiments on particle migration in both non-Brownian and Brownian suspensions in Poiseuille slot flow, for which a definitive calibration of the phenomenological migration model has been lacking. Using the PINN approach, we identify the unknown/empirical parameters in the physical model through the inverse solver capability of PINNs. Specifically, the values are significantly different from those for the Couette cell, highlighting an inconsistency in the literature that uses the latter value for Poiseuille flow. Importantly, the PINN results also show that the inferred values of the empirical model's parameters vary with the shear Péclet number and the particle bulk volume fraction of the suspension, instead of being constant as assumed in previous literature.
Multi-modal land cover mapping of remote sensing images using pyramid attention and gated fusion networks

Multi-modality data is becoming readily available in remote sensing (RS) and can provide complementary information about the Earth's surface. Effective fusion of multi-modal information is thus important for various applications in RS, but also very challenging due to large domain differences, noise, and redundancies. There is a lack of effective and scalable fusion techniques for bridging multiple modality encoders and fully exploiting complementary information. To this end, we propose a new multi-modality network (MultiModNet) for land cover mapping of multi-modal remote sensing data based on a novel pyramid attention fusion (PAF) module and a gated fusion unit (GFU). The PAF module is designed to efficiently obtain rich fine-grained contextual representations from each modality with a built-in cross-level and cross-view attention fusion mechanism, and the GFU module utilizes a novel gating mechanism for early merging of features, thereby diminishing hidden redundancies and noise. This enables supplementary modalities to effectively extract the most valuable and complementary information for late feature fusion. Extensive experiments on two representative RS benchmark datasets demonstrate the effectiveness, robustness, and superiority of the MultiModNet for multi-modal land cover classification.
Learned Dynamics of Electrothermally-Actuated Soft Robot Limbs Using LSTM Neural Networks

Modeling the dynamics of soft robot limbs with electrothermal actuators is generally challenging due to thermal and mechanical hysteresis and the complex physical interactions that can arise during robot operation. This article proposes a neural network based on long short-term memory (LSTM) to address these challenges in actuator modeling. A planar soft limb, actuated by a pair of shape memory alloy (SMA) coils and containing embedded sensors for temperature and angular deflection, is used as a test platform. Data from this robot are used to train LSTM neural networks, using different combinations of sensor data, to model both unidirectional (one SMA) and bidirectional (both SMAs) motion. Open-loop rollout results show that the learned model is able to predict motions over extraordinarily long open-loop timescales (10 minutes) with little drift. Prediction errors are on the order of the soft deflection sensor's accuracy, even when using only the actuator's pulse width modulation inputs for learning. These LSTM models can be used in-situ, without extensive sensing, helping to bring soft electrothermally-actuated robots into practical application.
Mesoscale properties of mutualistic networks in ecosystems

Uncovering structural properties of ecological networks is a crucial starting point of studying the system's stability in response to various types of perturbations. We analyze pollination and seed disposal networks, which are representative examples of mutualistic networks in ecosystems, in various scales. In particular, we examine mesoscale properties such as the nested structure, the core-periphery structure, and the community structure by statistically investigating their interrelationships with real network data. As a result of community detection in different scales, we find the absence of meaningful hierarchy between networks, and the negative correlation between the modularity and the two other structures (nestedness and core-periphery-ness), which themselves are highly positively correlated. In addition, no characteristic scale of communities is perceivable from the community-inconsistency analysis. Therefore, the community structures, which are most widely studied mesoscale structures of networks, are not in fact adequate to characterize the mutualistic networks of this scale in ecosystems.
Self-Compression in Bayesian Neural Networks

Machine learning models have achieved human-level performance on various tasks. This success comes at a high cost of computation and storage overhead, which makes machine learning algorithms difficult to deploy on edge devices. Typically, one has to partially sacrifice accuracy in favor of an increased performance quantified in terms of reduced memory usage and energy consumption. Current methods compress the networks by reducing the precision of the parameters or by eliminating redundant ones. In this paper, we propose a new insight into network compression through the Bayesian framework. We show that Bayesian neural networks automatically discover redundancy in model parameters, thus enabling self-compression, which is linked to the propagation of uncertainty through the layers of the network. Our experimental results show that the network architecture can be successfully compressed by deleting parameters identified by the network itself while retaining the same level of accuracy.
MAC-ReconNet: A Multiple Acquisition Context based Convolutional Neural Network for MR Image Reconstruction using Dynamic Weight Prediction

Convolutional neural network (CNN)-based MR reconstruction methods have been shown to provide fast and high-quality reconstructions. A primary drawback of a CNN-based model is that it lacks flexibility and can operate effectively only for a specific acquisition context, limiting practical applicability. By acquisition context, we mean a specific combination of three input settings, namely the anatomy under study, the undersampling mask pattern and the acceleration factor for undersampling. The model could be trained jointly on images combining multiple contexts. However, such a model neither matches the performance of context-specific models nor extends to contexts unseen at train time. This necessitates a modification to the existing architecture that generates context-specific weights so as to incorporate flexibility to multiple contexts. We propose a multiple acquisition context based network, called MAC-ReconNet, for MRI reconstruction, flexible to multiple acquisition contexts and generalizable to unseen contexts for applicability in real scenarios. The proposed network has an MRI reconstruction module and a dynamic weight prediction (DWP) module. The DWP module takes the corresponding acquisition context information as input and learns the context-specific weights of the reconstruction module, which change dynamically with context at run time. We show that the proposed approach can handle multiple contexts based on cardiac and brain datasets, Gaussian and Cartesian undersampling patterns and five acceleration factors. The proposed network outperforms the naive jointly trained model and gives competitive results with the context-specific models both quantitatively and qualitatively. We also demonstrate the generalizability of our model by testing on contexts unseen at train time.
An Underexplored Dilemma between Confidence and Calibration in Quantized Neural Networks

Modern convolutional neural networks (CNNs) are known to be overconfident in terms of their calibration on unseen input data. That is to say, they are more confident than they are accurate. This is undesirable if the probabilities predicted are to be used for downstream decision making. When considering accuracy, CNNs are also surprisingly robust to compression techniques, such as quantization, which aim to reduce computational and memory costs. We show that this robustness can be partially explained by the calibration behavior of modern CNNs, and may be improved with overconfidence. This is due to an intuitive result: low confidence predictions are more likely to change post-quantization, whilst being less accurate. High confidence predictions will be more accurate, but more difficult to change. Thus, a minimal drop in post-quantization accuracy is incurred. This presents a potential conflict in neural network design: worse calibration from overconfidence may lead to better robustness to quantization. We perform experiments applying post-training quantization to a variety of CNNs, on the CIFAR-100 and ImageNet datasets.
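The statement that a network is "more confident than it is accurate" can be made concrete with a calibration metric. The sketch below computes the expected calibration error (ECE), a commonly used binned measure of the gap between confidence and accuracy. The abstract does not say which metric the authors use, so the choice of ECE, the function name, the bin count, and the toy inputs are all assumptions for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |average confidence - accuracy| per bin.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct: booleans, True where the prediction matched the label.
    """
    n = len(confidences)
    if n == 0:
        return 0.0
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


# Hypothetical overconfident model: 90% confidence but only half correct.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, False, False])
```

For this toy input the average confidence is 0.9 and the accuracy is 0.5, so the ECE is about 0.4. Comparing such a value before and after quantization is one way the trade-off described in the abstract could be examined.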
DistIR: An Intermediate Representation and Simulator for Efficient Neural Network Distribution

The rapidly growing size of deep neural network (DNN) models and datasets has given rise to a variety of distribution strategies such as data, tensor-model, pipeline parallelism, and hybrid combinations thereof. Each of these strategies offers its own trade-offs and exhibits optimal performance across different models and hardware topologies. Selecting the best set of strategies for a given setup is challenging because the search space grows combinatorially, and debugging and testing on clusters is expensive. In this work we propose DistIR, an expressive intermediate representation for distributed DNN computation that is tailored for efficient analyses, such as simulation. This enables automatically identifying the top-performing strategies without having to execute on physical hardware. Unlike prior work, DistIR can naturally express many distribution strategies including pipeline parallelism with arbitrary schedules. Our evaluation on MLP training and GPT-2 inference models demonstrates how DistIR and its simulator enable fast grid searches over complex distribution spaces spanning up to 1000+ configurations, reducing optimization time by an order of magnitude for certain regimes.
Convolutional Neural Network Dynamics: A Graph Perspective

The success of neural networks (NNs) in a wide range of applications has led to increased interest in understanding the underlying learning dynamics of these models. In this paper, we go beyond mere descriptions of the learning dynamics by taking a graph perspective and investigating the relationship between the graph structure of NNs and their performance. Specifically, we propose (1) representing the neural network learning process as a time-evolving graph (i.e., a series of static graph snapshots over epochs), (2) capturing the structural changes of the NN during the training phase in a simple temporal summary, and (3) leveraging the structural summary to predict the accuracy of the underlying NN in a classification or regression task. For the dynamic graph representation of NNs, we explore structural representations for fully-connected and convolutional layers, which are key components of powerful NN models. Our analysis shows that a simple summary of graph statistics, such as weighted degree and eigenvector centrality, over just a few epochs can be used to accurately predict the performance of NNs. For example, a weighted degree-based summary of the time-evolving graph that is constructed based on 5 training epochs of the LeNet architecture achieves classification accuracy of over 93%. Our findings are consistent for different NN architectures, including LeNet, VGG, AlexNet and ResNet.
Convolutional Neural Networks with Radio-Frequency Spintronic Nano-Devices

Convolutional neural networks are state-of-the-art and ubiquitous in modern signal processing and machine vision. Nowadays, hardware solutions based on emerging nanodevices are designed to reduce the power consumption of these networks. Spintronics devices are promising for information processing because of the various neural and synaptic functionalities they offer. However, due to their low OFF/ON ratio, performing all the multiplications required for convolutions in a single step with a crossbar array of spintronic memories would cause sneak-path currents. Here we present an architecture where synaptic communications have a frequency selectivity that prevents crosstalk caused by sneak-path currents. We first demonstrate how a chain of spintronic resonators can function as synapses and make convolutions by sequentially rectifying radio-frequency signals encoding consecutive sets of inputs. We show that a parallel implementation is possible with multiple chains of spintronic resonators to avoid storing intermediate computational steps in memory. We propose two different spatial arrangements for these chains. For each of them, we explain how to tune many artificial synapses simultaneously, exploiting the synaptic weight sharing specific to convolutions. We show how information can be transmitted between convolutional layers by using spintronic oscillators as artificial microwave neurons. Finally, we simulate a network of these radio-frequency resonators and spintronic oscillators to solve the MNIST handwritten digits dataset, and obtain results comparable to software convolutional neural networks. Since it can run convolutional neural networks fully in parallel in a single step with nano devices, the architecture proposed in this paper is promising for embedded applications requiring machine vision, such as autonomous driving.
