Mathematics

Tensor Moments of Gaussian Mixture Models: Theory and Applications

By João M. Pereira, Joe Kileel, Tamara G. Kolda
 2 days ago

Gaussian mixture models (GMM) are fundamental tools in statistical and data sciences. We study the moments of multivariate Gaussians and GMMs. The $d$-th moment of an $n$-dimensional random variable is a symmetric $d$-way tensor of size $n^d$, so working with moments naively is assumed to be prohibitively expensive for...
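
As a brief illustration of the size claim above, the moment tensor can be written out explicitly. The notation ($\mathcal{M}^{(d)}$, $x$, $\mu$, $\Sigma$) is assumed here for illustration and is not taken from the paper. For an $n$-dimensional random variable $x$, and in the Gaussian case $x \sim \mathcal{N}(\mu, \Sigma)$:

$$
\mathcal{M}^{(d)} = \mathbb{E}\left[x^{\otimes d}\right], \qquad
\left(\mathcal{M}^{(d)}\right)_{i_1 \cdots i_d} = \mathbb{E}\left[x_{i_1} \cdots x_{i_d}\right], \qquad
\mathcal{M}^{(1)} = \mu, \qquad
\mathcal{M}^{(2)} = \mu \mu^{\top} + \Sigma .
$$

The tensor has $n^d$ entries in total, of which at most $\binom{n+d-1}{d}$ are distinct because of symmetry.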

arxiv.org

A Robust Phased Elimination Algorithm for Corruption-Tolerant Gaussian Process Bandits

We consider the sequential optimization of an unknown, continuous, and expensive to evaluate reward function, from noisy and adversarially corrupted observed rewards. When the corruption attacks are subject to a suitable budget $C$ and the function lives in a Reproducing Kernel Hilbert Space (RKHS), the problem can be posed as corrupted Gaussian process (GP) bandit optimization. We propose a novel robust elimination-type algorithm that runs in epochs, combines exploration with infrequent switching to select a small subset of actions, and plays each action for multiple time instants. Our algorithm, Robust GP Phased Elimination (RGP-PE), successfully balances robustness to corruptions with exploration and exploitation such that its performance degrades minimally in the presence (or absence) of adversarial corruptions. When $T$ is the number of samples and $\gamma_T$ is the maximal information gain, the corruption-dependent term in our regret bound is $O(C \gamma_T^{3/2})$, which is significantly tighter than the existing $O(C \sqrt{T \gamma_T})$ for several commonly-considered kernels. We perform the first empirical study of robustness in the corrupted GP bandit setting, and show that our algorithm is robust against a variety of adversarial attacks.
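
To see when the new corruption-dependent term improves on the earlier one, the two leading terms can be compared directly. This is a rough sketch that ignores constants and logarithmic factors hidden in the $O(\cdot)$ notation, and assumes $C > 0$ and $\gamma_T > 0$:

$$
C \gamma_T^{3/2} \le C \sqrt{T \gamma_T}
\;\Longleftrightarrow\;
\gamma_T^{3} \le T \gamma_T
\;\Longleftrightarrow\;
\gamma_T \le \sqrt{T} .
$$

So the bound is tighter whenever the maximal information gain grows more slowly than $\sqrt{T}$, as it does for kernels whose $\gamma_T$ is polylogarithmic in $T$.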
CODING & PROGRAMMING
arxiv.org

Locally Random P-adic Alloy Codes with Channel Coding Theorems for Distributed Coded Tensors

Tensors, i.e., multi-linear functions, are a fundamental building block of machine learning algorithms. In order to train on large data-sets, it is common practice to distribute the computation amongst workers. However, stragglers and other faults can severely impact the performance and overall training time. A novel strategy to mitigate these failures is the use of coded computation. We introduce a new metric for analysis called the typical recovery threshold, which focuses on the most likely event, and provide a novel construction of distributed coded tensor operations that are optimal with respect to this measure. We show that our general framework encompasses many other computational schemes and metrics as a special case. In particular, we prove that the recovery threshold and the tensor rank can be recovered as a special case of the typical recovery threshold when the probability of noise, i.e., a fault, is equal to zero, thereby providing a noisy generalization of noiseless computation as a serendipitous result. Far from being a purely theoretical construction, these definitions lead us to practical random code constructions, i.e., locally random p-adic alloy codes, which are optimal with respect to the measures. We analyze experiments conducted on Amazon EC2 and establish that they are faster and more numerically stable than many other benchmark computation schemes in practice, as is predicted by theory.
CODING & PROGRAMMING
arxiv.org

Functional Mixtures-of-Experts

We consider the statistical analysis of heterogeneous data for clustering and prediction purposes, in situations where the observations include functions, typically time series. We extend the modeling with Mixtures-of-Experts (ME), as a framework of choice in modeling heterogeneity in data for prediction and clustering with vectorial observations, to this functional data analysis context. We first present a new family of functional ME (FME) models, in which the predictors are potentially noisy observations, from entire functions, and the data generating process of the pair predictor and the real response, is governed by a hidden discrete variable representing an unknown partition, leading to complex situations to which the standard ME framework is not adapted. Second, we provide sparse and interpretable functional representations of the FME models, thanks to Lasso-like regularizations, notably on the derivatives of the underlying functional parameters of the model, projected onto a set of continuous basis functions. We develop dedicated expectation--maximization algorithms for Lasso-like regularized maximum-likelihood parameter estimation strategies, to encourage sparse and interpretable solutions. The proposed FME models and the developed EM-Lasso algorithms are studied in simulated scenarios and in applications to two real data sets, and the obtained results demonstrate their performance in accurately capturing complex nonlinear relationships between the response and the functional predictor, and in clustering.
COMPUTERS
arxiv.org

Global sensitivity analysis based on Gaussian-process metamodelling for complex biomechanical problems

Biomechanical models often need to describe very complex systems, organs or diseases, and hence also include a large number of parameters. One of the attractive features of physics-based models is that in those models (most) parameters have a clear physical meaning. Nevertheless, the determination of these parameters is often very elaborate and costly and shows a large scatter within the population. Hence, it is essential to identify the most important parameter for a particular problem at hand. In order to distinguish parameters which have a significant influence on a specific model output from non-influential parameters, we use sensitivity analysis, in particular the Sobol method as a global variance-based method. However, the Sobol method requires a large number of model evaluations, which is prohibitive for computationally expensive models. We therefore employ Gaussian processes as a metamodel for the underlying full model. Metamodelling introduces further uncertainty, which we also quantify. We demonstrate the approach by applying it to two different problems: nanoparticle-mediated drug delivery in a multiphase tumour-growth model, and arterial growth and remodelling. Even relatively small numbers of evaluations of the full model suffice to identify the influential parameters in both cases and to separate them from non-influential parameters. The approach also allows the quantification of higher-order interaction effects. We thus show that a variance-based global sensitivity analysis is feasible for computationally expensive biomechanical models. Different aspects of sensitivity analysis are covered including a transparent declaration of the uncertainties involved in the estimation process. Such a global sensitivity analysis not only helps to massively reduce costs for experimental determination of parameters but is also highly beneficial for inverse analysis of such complex models.
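
For reference, the variance-based indices behind the Sobol method can be stated compactly. The symbols are assumed for illustration: $Y$ is the model output, $X_i$ the $i$-th input parameter, and $X_{\sim i}$ all inputs except $X_i$. The first-order and total-order indices are

$$
S_i = \frac{\operatorname{Var}_{X_i}\!\left(\mathbb{E}\left[Y \mid X_i\right]\right)}{\operatorname{Var}(Y)},
\qquad
S_{T_i} = 1 - \frac{\operatorname{Var}_{X_{\sim i}}\!\left(\mathbb{E}\left[Y \mid X_{\sim i}\right]\right)}{\operatorname{Var}(Y)} .
$$

A gap between $S_{T_i}$ and $S_i$ indicates that $X_i$ acts through higher-order interactions. Estimating the conditional expectations is what demands many model evaluations, which is where a cheap metamodel helps.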
SCIENCE
arxiv.org

Theory of Two-Photon Absorption with Broadband Bright Squeezed Vacuum: Part 1 Quantum Model

We present an analytical quantum theoretic model for nonresonant two-photon absorption (TPA) of broadband squeezed vacuum with pulse duration much greater than the coherence time, including low gain (isolated entangled photon pairs or EPP) and high gain (bright squeezed vacuum or BSV) regimes. In the case of high gain we find that if the atomic or molecular TPA linewidth is much narrower than the bandwidth of the exciting light, bright squeezed vacuum is equally effective in driving TPA as is a quasi-monochromatic coherent-state (classical) pulse of the same temporal shape and mean photon number. Therefore, the sought-for advantage of observing TPA at extremely low optical flux is not provided by broadband bright squeezed vacuum. In the case that the atomic or molecular TPA linewidth is much broader than the bandwidth of the exciting light, we show that the TPA rate is proportional to the second-order intensity autocorrelation function at zero time delay $g^{(2)}(0)$, as expected. And we find that for $g^{(2)}(0)$ to reach the idealized form $g^{(2)}(0) = 3 + 1/n$, with $n$ being the mean number of photons per mode, dispersion compensation is required. Part 2 of this two-paper series considers the same questions in the context of a classical model of squeezed light.
PHYSICS
arxiv.org

Collective neutrino oscillations with tensor networks using a time-dependent variational principle

By Michael J. Cervia, Pooja Siwach, Amol V. Patwardhan, A. B. Balantekin, S. N. Coppersmith, Calvin W. Johnson

A system of oscillating neutrinos at high densities, wherein neutrino-neutrino coherent forward scatterings are non-negligible, represents a many-body problem engendered by the weak interaction. Whether an interplay between the one-body and two-body interaction terms in the neutrino Hamiltonian could result in significant nontrivial quantum entanglement developing between the constituent neutrinos remains an open question. Numerical computations of the time-evolution of many-body quantum systems are challenging because the size of the Hilbert space scales exponentially with the number of particles in the system. Such calculations therefore tend to become extremely memory intensive even at a relatively small number of particles. As a result, approximate numerical treatments become necessary in order to facilitate comparisons with mean-field calculations at larger values of N. Here we investigate the efficacy of tensor network methods for extending the calculations of time-evolving systems of interacting neutrinos to larger values of N than are possible with conventional methods. In particular, we introduce the use of time-dependent variational principle methods to address the long-range (in momentum space) interactions of the neutrino Hamiltonian when including many distinct vacuum oscillation frequencies. Furthermore, we define new error measures based upon the instantaneously conserved charge operators known for this Hamiltonian to determine validity of large-N tensor network calculations.
PHYSICS
arxiv.org

New Inner and Outer Bounds for 2-User Gaussian Broadcast Channels with Heterogeneous Blocklength Constraints

We investigate both a novel inner and outer bound on the rate region of a 2-user Gaussian broadcast channel with finite, heterogeneous blocklength constraints (HB-GBC). In particular, we introduce a new, modified Sato-type outer bound that can be applied in the finite blocklength regime and does not require the same marginal property. We then develop and analyze concatenated shell codes, which are suitable for the HB-GBC. Specifically, to achieve a smaller decoding latency for the user with the shorter blocklength constraint when successive interference cancellation is used, we derive the number of symbols needed to successfully early decode the other user's message. We numerically compare our derived outer bound to the best known achievable rate regions. Numerical results show that the new early decoding performance is significantly improved compared to the state of the art, and performs very close to the asymptotic limit.
SCIENCE
arxiv.org

14-moment maximum-entropy modelling of collisionless ions for Hall thruster discharges

Ions in Hall thruster devices are often characterized by a low collisionality. In the presence of acceleration fields and azimuthal electric field waves this results in strong deviations from thermodynamic equilibrium, which requires one to employ kinetic descriptions. This work investigates the application of the 14-moment maximum-entropy model to this problem. This method consists in a set of 14 PDEs that describe the density, momentum, pressure tensor components, heat flux and fourth-order moment of the gas. The method is applied to the ion dynamics and its accuracy is assessed against the kinetic solution. Three test cases are considered: a purely axial acceleration problem, the problem of ion-wave trapping and finally the evolution of ions in the axial-azimuthal plane. Only ions are considered in this work, since the goal is providing a direct comparison of different methods. The coupling with electrons is thus removed by prescribing reasonable values of the electric field. The maximum-entropy system appears to be a robust and accurate option for the considered test cases, bringing significant improvements over the simpler pressureless gas models or the Euler equations for gas dynamics.
PHYSICS
arxiv.org

Magnetism of QCD matter and pion mass from tensor-type spin polarization and anomalous magnetic moment of quarks

We investigate the magnetism of QCD matter and pion mass under magnetic field considering the contribution from the tensor-type spin polarization and the anomalous magnetic moment (AMM) of quarks. It is found that the tensor-type spin polarization (TSP) induces the magnetic catalysis of chiral condensate and diamagnetism (negative magnetic susceptibility) of quark matter at low temperature, and both neutral and charged pion masses increase quickly with magnetic field in the case of TSP. The anomalous magnetic moment (AMM) of quarks induces magnetic inhibition and a magnetic dependent AMM causes inverse magnetic catalysis at finite temperature, and the neutral pion mass decreases with magnetic field while the charged pion mass shows nonmonotonic behavior with the magnetic field, which is qualitatively in agreement with lattice results. However, the magnetic susceptibility is positive at low temperature with AMM. In the current framework, our results show the irreconcilable contradiction between the diamagnetism and inverse magnetic catalysis.
PHYSICS
arxiv.org

Brownian non-Gaussian polymer diffusion and queuing theory in the mean-field limit

We link the Brownian non-Gaussian diffusion of a polymer center of mass to a microscopic cause: the polymerization/depolymerization phenomenon occurring when the polymer is in contact with a monomer chemostat. The anomalous behavior is triggered by the polymer critical point, separating the dilute and the dense phase in the grand canonical ensemble. In the mean-field limit we establish contact with queuing theory and show that the kurtosis of the polymer center of mass diverges like a response function when the system becomes critical, a result which holds for general polymer dynamics (Zimm, Rouse, reptation). Both the equilibrium and nonequilibrium behaviors are solved exactly as a reference study for novel stochastic modeling and experimental setup.
PHYSICS
arxiv.org

Entanglement estimation in tensor network states via sampling

We introduce a method for extracting meaningful entanglement measures of tensor network states in general dimensions. Current methods require the explicit reconstruction of the density matrix, which is highly demanding, or the contraction of replicas, which requires an effort exponential in the number of replicas and which is costly in terms of memory. In contrast, our method requires the stochastic sampling of matrix elements of the classically represented reduced states with respect to random states drawn from simple product probability measures constituting frames. Even though not corresponding to physical operations, such matrix elements are straightforward to calculate for tensor network states, and their moments provide the Rényi entropies and negativities as well as their symmetry-resolved components. We test our method on the one-dimensional critical XX chain and the two-dimensional toric code. Although the cost is exponential in the subsystem size, it is sufficiently moderate so that - in contrast with other approaches - accurate results can be obtained on a personal computer for relatively large subsystem sizes.
SCIENCE
arxiv.org

Shared purity and concurrence of a mixture of ground and low-lying excited states as indicators of quantum phase transitions

We investigate the efficacy of shared purity, a measure of quantum correlation that is independent of the separability-entanglement paradigm, as a quantum phase transition indicator in comparison with concurrence, a bipartite entanglement measure. The order parameters are investigated for a mixture of the ground state and low-lying excited states of the systems considered. In the case of the one-dimensional J1-J2 Heisenberg quantum spin model and the one-dimensional transverse-field quantum Ising model, shared purity turns out to be as effective as concurrence in indicating quantum phase transitions. In the two-dimensional J1-J2 Heisenberg quantum spin model, shared purity indicates the two quantum phase transitions present in the model, while concurrence detects only one of them. Moreover, we find diverging finite-size scaling exponents for the order parameters near the transitions in odd- and even-sized systems governed by the one-dimensional J1-J2 model, as had previously been reported for quantum spins on odd- and even-legged ladders. It is plausible that the divergence is related to a Möbius strip-like boundary condition required for odd-sized systems, while for even-sized systems, the usual periodic boundary condition is sufficient.
PHYSICS
arxiv.org

MIONet: Learning multiple-input operators via tensor product

As an emerging paradigm in scientific machine learning, neural operators aim to learn operators, via neural networks, that map between infinite-dimensional function spaces. Several neural operators have been recently developed. However, all the existing neural operators are only designed to learn operators defined on a single Banach space, i.e., the input of the operator is a single function. Here, for the first time, we study operator regression via neural networks for multiple-input operators defined on the product of Banach spaces. We first prove a universal approximation theorem of continuous multiple-input operators. We also provide detailed theoretical analysis including the approximation error, which provides guidance for the design of the network architecture. Based on our theory and a low-rank approximation, we propose a novel neural operator, MIONet, to learn multiple-input operators. MIONet consists of several branch nets for encoding the input functions and a trunk net for encoding the domain of the output function. We demonstrate that MIONet can learn solution operators involving systems governed by ordinary and partial differential equations. In our computational examples, we also show that we can endow MIONet with prior knowledge of the underlying system, such as linearity and periodicity, to further improve the accuracy.
CODING & PROGRAMMING
arxiv.org

Simulation of Primordial Black Holes with large negative non-Gaussianity

In this work, we have performed numerical simulations of primordial black hole (PBH) formation in the Friedmann-Lemaître-Robertson-Walker universe filled by radiation fluid, introducing the local-type non-Gaussianity to the primordial curvature fluctuation. We have compared the numerical results from simulations with previous analytical estimations on the threshold value for PBH formation done in the previous paper arXiv:2109.00791, particularly for negative values of the non-linearity parameter $f_{\rm NL}$. Our numerical results show the existence of PBH formation of (the so-called) type I also in the case $f_{\rm NL} \lesssim -0.336$, which was not found in the previous analytical expectations using the critical averaged compaction function. In particular, although the universal value for the averaged critical compaction function $\bar{\mathcal{C}}_{c}=2/5$ found previously in the literature is not satisfied for all the profiles considered in this work, an alternative direct analytical estimate has been found to be roughly accurate to estimate the thresholds, which gives the value of the critical averaged density with a few $\%$ deviation from the numerical one for $f_{\rm NL}\gtrsim -1$.
ASTRONOMY
arxiv.org

Mixture-of-Rookies: Saving DNN Computations by Predicting ReLU Outputs

Deep Neural Networks (DNNs) are widely used in many application domains. However, they require a vast amount of computations and memory accesses to deliver outstanding accuracy. In this paper, we propose a scheme to predict whether the output of each ReLU-activated neuron will be a zero or a positive number in order to skip the computation of those neurons that will likely output a zero. Our predictor, named Mixture-of-Rookies, combines two inexpensive components. The first one exploits the high linear correlation between binarized (1-bit) and full-precision (8-bit) dot products, whereas the second component clusters together neurons that tend to output zero at the same time. We propose a novel clustering scheme based on the analysis of angles, as the sign of the dot product of two vectors depends on the cosine of the angle between them. We implement our hybrid zero output predictor on top of a state-of-the-art DNN accelerator. Experimental results show that our scheme introduces a small area overhead of 5.3% while achieving a speedup of 1.2x and reducing energy consumption by 16.5% on average for a set of diverse DNNs.
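
The geometric fact the abstract relies on can be checked with a few lines of Python. This is a minimal sketch for illustration only, not the paper's predictor: the helper names (`dot`, `cosine`, `relu`, `binarized_dot`) and the sign-based binarization are assumptions made for this example.

```python
import math


def dot(a, b):
    """Full-precision dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine of the angle between a and b (both assumed non-zero)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def relu(z):
    """ReLU activation: zero for non-positive inputs."""
    return max(0.0, z)


def binarized_dot(a, b):
    """Dot product of the +1/-1 sign vectors of a and b.

    Hypothetical 1-bit proxy for the full dot product; the paper's
    actual binarization scheme may differ.
    """
    def sign(v):
        return 1 if v >= 0 else -1

    return sum(sign(x) * sign(y) for x, y in zip(a, b))


# Vector norms are positive, so the dot product and the cosine always
# share a sign: a negative cosine means the ReLU output is zero.
weights = [1.0, 2.0]
inputs = [-1.0, -1.0]
print(cosine(weights, inputs) < 0, relu(dot(weights, inputs)))
```

Because the norms in the denominator are positive, a neuron whose weight vector makes an obtuse angle with its input is guaranteed to produce a zero after ReLU, so its dot product could in principle be skipped.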
COMPUTERS
arxiv.org

Collisional open quantum dynamics with a generally correlated environment: Exact solvability in tensor networks

Quantum collision models are receiving increasing attention as they describe many nontrivial phenomena in dynamics of open quantum systems. In a general scenario of both fundamental and practical interest, a quantum system repeatedly interacts with individual particles or modes forming a correlated and structured reservoir; however, classical and quantum environment correlations greatly complicate the calculation and interpretation of the system dynamics. Here we propose an exact solution to this problem based on the tensor network formalism. We find a natural Markovian embedding for the system dynamics, where the role of an auxiliary system is played by virtual indices of the network. The constructed embedding is amenable to analytical treatment for a number of timely problems like the system interaction with two-photon wavepackets, structured photonic states, and one-dimensional spin chains. We also derive a time-convolution master equation and relate its memory kernel with the environment correlation function, thus revealing a clear physical picture of memory effects in the dynamics. The results advance tensor-network methods in the fields of quantum optics and quantum transport.
SCIENCE
arxiv.org

A closer look at predicting turbulence statistics of arbitrary moments when based on a non-modelled symmetry approach

A recent Letter by Oberlack et al. [Phys. Rev. Lett. 128, 024502 (2022)] claims to have derived new symmetry-induced solutions of the non-modelled statistical Navier-Stokes equations of turbulent channel flow. A high accuracy match to DNS data for all streamwise moments up to order 6 is presented, both in the region of the channel-center and in the inertial sublayer close to the wall. Here we will show that the findings and conclusions in that study are highly misleading, as they give the impression that a significant breakthrough in turbulence research has been achieved. But, unfortunately, this is not the case. Besides trivial and misleading aspects, we will demonstrate that even basic turbulence-relevant correlations such as the Reynolds stress cannot be fitted to data using the proposed symmetry-induced scaling laws. The Lie-group symmetry method as used by Oberlack et al. cannot bypass the closure problem of turbulence. It is just another assumption-based method that requires modelling and is not, as claimed, a first-principle method that leads directly to solutions. Next to PRL, two more papers by Oberlack et al. are called out for correction or a retraction.
SCIENCE
arxiv.org

Theory of Disordered Superconductors with Applications to Nonlinear Current Response

I present a review of the theory and basic equations for charge transport in superconducting alloys starting from the Keldysh formulation of the quasiclassical transport equations developed by Eilenberger, Larkin and Ovchinnikov and Eliashberg. This formulation is the natural extension of Landau's theory of normal Fermi liquids to the superconducting state of strongly correlated metals. For dirty metals the transport equations reduce to equations for charge diffusion, with the current response given by the Drude conductivity at low temperatures. The extension of the diffusion equation for the charge and current response of a strongly disordered normal metal to the superconducting state yields Usadel's equations for the non-equilibrium quasiclassical Keldysh propagator. The conditions for the applicability of the Usadel equations are discussed, and results for the pair-breaking effect of disorder on the current response, including the nonlinear current response to an EM field in the dirty limit, $\tau \ll \hbar/\Delta$, are reported. The same nonlinearity is shown to lead to source currents for photon generation and nonlinear Kerr rotation driven by the nonlinear response to excitation of the superconductor by a multi-mode EM field. The potential relevance of the nonlinear source currents to SRF cavities as detectors of axion-like dark matter candidates is briefly discussed.
PHYSICS
arxiv.org

The Noise Covariances of Linear Gaussian Systems with Unknown Inputs Are Not Uniquely Identifiable Using Autocovariance Least-squares

Existing works in optimal filtering for linear Gaussian systems with arbitrary unknown inputs assume perfect knowledge of the noise covariances in the filter design. This is impractical and raises the question of whether and under what conditions one can identify the noise covariances of linear Gaussian systems with arbitrary unknown inputs. This paper considers the above identifiability question using the correlation-based autocovariance least-squares (ALS) approach. In particular, for the ALS framework, we prove that (i) the process noise covariance Q and the measurement noise covariance R cannot be uniquely jointly identified; (ii) neither Q nor R is uniquely identifiable, when the other is known. This not only helps us to have a better understanding of the applicability of existing filtering frameworks under unknown inputs (since almost all of them require perfect knowledge of the noise covariances) but also calls for further investigation of alternative and more viable noise covariance methods under unknown inputs. Notably, it remains to be explored whether the noise covariances are uniquely identifiable using other correlation-based methods. We are also interested in using regularization for noise covariance estimation under unknown inputs, and in investigating the relevant property guarantees for the covariance estimates. The above topics are the main subject of our current and future work.
COMPUTERS
arxiv.org

Stein Particle Filter for Nonlinear, Non-Gaussian State Estimation

Estimation of a dynamical system's latent state subject to sensor noise and model inaccuracies remains a critical yet difficult problem in robotics. While Kalman filters provide the optimal solution in the least squared sense for linear and Gaussian noise problems, the general nonlinear and non-Gaussian noise case is significantly more complicated, typically relying on sampling strategies that are limited to low-dimensional state spaces. In this paper we devise a general inference procedure for filtering of nonlinear, non-Gaussian dynamical systems that exploits the differentiability of both the update and prediction models to scale to higher dimensional spaces. Our method, Stein particle filter, can be seen as a deterministic flow of particles, embedded in a reproducing kernel Hilbert space, from an initial state to the desirable posterior. The particles evolve jointly to conform to a posterior approximation while interacting with each other through a repulsive force. We evaluate the method in simulation and in complex localization tasks while comparing it to sequential Monte Carlo solutions.
COMPUTERS

