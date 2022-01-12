
Mathematics

Explicit Analytical Solution for Random Close Packing in d=2 and d=3

By Alessio Zaccone
arxiv.org
 3 days ago

We present an analytical derivation of the volume fractions for random close packing (RCP) in both $d=3$ and $d=2$, based on the same methodology. Using suitably modified nearest neighbour statistics for hard spheres, we obtain...

Related
arxiv.org

Automated Reinforcement Learning: An Overview

Reinforcement Learning and recently Deep Reinforcement Learning are popular methods for solving sequential decision making problems modeled as Markov Decision Processes. RL modeling of a problem and selecting algorithms and hyper-parameters require careful considerations as different configurations may entail completely different performances. These considerations are mainly the task of RL experts; however, RL is progressively becoming popular in other fields where the researchers and system designers are not RL experts. Besides, many modeling decisions, such as defining state and action space, size of batches and frequency of batch updating, and number of timesteps are typically made manually. For these reasons, automating different components of RL framework is of great importance and it has attracted much attention in recent years. Automated RL provides a framework in which different components of RL including MDP modeling, algorithm selection and hyper-parameter optimization are modeled and defined automatically. In this article, we explore the literature and present recent work that can be used in automated RL. Moreover, we discuss the challenges, open questions and research directions in AutoRL.
COMPUTERS
arxiv.org

Neural Koopman Lyapunov Control

Learning and synthesizing stabilizing controllers for unknown nonlinear systems is a challenging problem for real-world and industrial applications. Koopman operator theory allows one to analyze nonlinear systems through the lens of linear systems and nonlinear control systems through the lens of bilinear control systems. The key idea of these methods lies in the transformation of the coordinates of the nonlinear system into the Koopman observables, which are coordinates that allow the representation of the original system (control system) as a higher dimensional linear (bilinear control) system. However, for nonlinear control systems, the bilinear control model obtained by applying Koopman operator based learning methods is not necessarily stabilizable, and therefore the existence of a stabilizing feedback control, which is crucial for many real-world applications, is not guaranteed. Simultaneous identification of these stabilizable Koopman based bilinear control systems as well as the associated Koopman observables is still an open problem. In this paper, we propose a framework to identify and construct these stabilizable bilinear models and their associated observables from data by simultaneously learning a bilinear Koopman embedding for the underlying unknown nonlinear control system as well as a Control Lyapunov Function (CLF) for the Koopman based bilinear model using a learner and falsifier. Our proposed approach thereby provides provable guarantees of global asymptotic stability for the nonlinear control systems with unknown dynamics. Numerical simulations are provided to validate the efficacy of our proposed class of stabilizing feedback controllers for unknown nonlinear systems.
ENGINEERING
arxiv.org

Multi-Scale Adaptive Graph Neural Network for Multivariate Time Series Forecasting

Multivariate time series (MTS) forecasting plays an important role in the automation and optimization of intelligent applications. It is a challenging task, as we need to consider both complex intra-variable dependencies and inter-variable dependencies. Existing works only learn temporal patterns with the help of single inter-variable dependencies. However, there are multi-scale temporal patterns in many real-world MTS. Single inter-variable dependencies make the model prefer to learn one type of prominent and shared temporal patterns. In this paper, we propose a multi-scale adaptive graph neural network (MAGNN) to address the above issue. MAGNN exploits a multi-scale pyramid network to preserve the underlying temporal dependencies at different time scales. Since the inter-variable dependencies may be different under distinct time scales, an adaptive graph learning module is designed to infer the scale-specific inter-variable dependencies without pre-defined priors. Given the multi-scale feature representations and scale-specific inter-variable dependencies, a multi-scale temporal graph neural network is introduced to jointly model intra-variable dependencies and inter-variable dependencies. After that, we develop a scale-wise fusion module to effectively promote the collaboration across different time scales, and automatically capture the importance of contributed temporal patterns. Experiments on four real-world datasets demonstrate that MAGNN outperforms the state-of-the-art methods across various settings.
COMPUTERS
arxiv.org

Fully Adaptive Bayesian Algorithm for Data Analysis, FABADA

The aim of this paper is to describe a novel non-parametric noise reduction technique from the point of view of Bayesian inference that may automatically improve the signal-to-noise ratio of one- and two-dimensional data, such as astronomical images and spectra. The algorithm iteratively evaluates possible smoothed versions of the data, the smooth models, obtaining an estimation of the underlying signal that is statistically compatible with the noisy measurements. Iterations stop based on the evidence and the $\chi^2$ statistic of the last smooth model, and we compute the expected value of the signal as a weighted average of the whole set of smooth models. In this paper, we explain the mathematical formalism and numerical implementation of the algorithm, and we evaluate its performance in terms of the peak signal-to-noise ratio, the structural similarity index, and the time payload, using a battery of real astronomical observations. Our Fully Adaptive Bayesian Algorithm for Data Analysis (FABADA) yields results that, without any parameter tuning, are comparable to standard image processing algorithms whose parameters have been optimized based on the true signal to be recovered, something that is impossible in a real application. State-of-the-art non-parametric methods, such as BM3D, offer slightly better performance at high signal-to-noise ratio, while our algorithm is significantly more accurate for extremely noisy data (higher than $20-40\%$ relative errors, a situation of particular interest in the field of astronomy). In this range, the standard deviation of the residuals obtained by our reconstruction may become more than an order of magnitude lower than that of the original measurements. The source code needed to reproduce all the results presented in this report, including the implementation of the method, is publicly available at this https URL.
COMPUTERS
arxiv.org

On Sampling Collaborative Filtering Datasets

We study the practical consequences of dataset sampling strategies on the ranking performance of recommendation algorithms. Recommender systems are generally trained and evaluated on samples of larger datasets. Samples are often taken in a naive or ad-hoc fashion: e.g. by sampling a dataset randomly or by selecting users or items with many interactions. As we demonstrate, commonly-used data sampling schemes can have significant consequences on algorithm performance. Following this observation, this paper makes three main contributions: (1) characterizing the effect of sampling on algorithm performance, in terms of algorithm and dataset characteristics (e.g. sparsity characteristics, sequential dynamics, etc.); (2) designing SVP-CF, which is a data-specific sampling strategy, that aims to preserve the relative performance of models after sampling, and is especially suited to long-tailed interaction data; and (3) developing an oracle, Data-Genie, which can suggest the sampling scheme that is most likely to preserve model performance for a given dataset. The main benefit of Data-Genie is that it will allow recommender system practitioners to quickly prototype and compare various approaches, while remaining confident that algorithm performance will be preserved, once the algorithm is retrained and deployed on the complete data. Detailed experiments show that using Data-Genie, we can discard up to 5x more data than any sampling strategy with the same level of performance.
COMPUTERS
arxiv.org

Forecast-based Multi-aspect Framework for Multivariate Time-series Anomaly Detection

Today's cyber-world is vastly multivariate. Metrics collected in extreme variety demand multivariate algorithms to properly detect anomalies. However, forecast-based algorithms, as widely proven approaches, often perform sub-optimally or inconsistently across datasets. A key common issue is that they strive to be one-size-fits-all, but anomalies are distinctive in nature. We propose a method tailored to such distinctions: FMUAD, a Forecast-based, Multi-aspect, Unsupervised Anomaly Detection framework. FMUAD explicitly and separately captures the signature traits of anomaly types (spatial change, temporal change and correlation change) with independent modules. The modules then jointly learn an optimal feature representation, which is highly flexible and intuitive, unlike most other models in the category. Extensive experiments show our FMUAD framework consistently outperforms other state-of-the-art forecast-based anomaly detectors.
SOFTWARE
arxiv.org

Certifiable Robustness for Nearest Neighbor Classifiers

ML models are typically trained using large datasets of high quality. However, training datasets often contain inconsistent or incomplete data. To tackle this issue, one solution is to develop algorithms that can check whether a prediction of a model is certifiably robust. Given a learning algorithm that produces a classifier and given an example at test time, a classification outcome is certifiably robust if it is predicted by every model trained across all possible worlds (repairs) of the uncertain (inconsistent) dataset. This notion of robustness falls naturally under the framework of certain answers. In this paper, we study the complexity of certifying robustness for a simple but widely deployed classification algorithm, $k$-Nearest Neighbors ($k$-NN). Our main focus is on inconsistent datasets when the integrity constraints are functional dependencies (FDs). For this setting, we establish a dichotomy in the complexity of certifying robustness w.r.t. the set of FDs: the problem either admits a polynomial time algorithm, or it is coNP-hard. Additionally, we exhibit a similar dichotomy for the counting version of the problem, where the goal is to count the number of possible worlds that predict a certain label. As a byproduct of our study, we also establish the complexity of a problem related to finding an optimal subset repair that may be of independent interest.
CODING & PROGRAMMING
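The definition of certifiable robustness in the $k$-NN abstract above can be illustrated by brute force. The following is a minimal sketch under stated assumptions, not the paper's algorithm: features are one-dimensional, the only functional dependency is a hypothetical "key determines the whole record" constraint (so each repair keeps exactly one record per conflicting group), and the names `knn_predict` and `certifiably_robust` are made up for illustration. It enumerates every repair, which takes exponential time in general; the abstract's dichotomy result concerns when this can be avoided.

```python
from collections import Counter
from itertools import product


def knn_predict(train, x, k):
    """Majority label among the k training records nearest to x (1-D features)."""
    nearest = sorted(train, key=lambda rec: abs(rec[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    # Break vote ties deterministically (most votes first, then label order).
    return min(votes, key=lambda lab: (-votes[lab], lab))


def certifiably_robust(groups, x, k):
    """Check whether every repair of the dataset predicts the same label for x.

    groups: list of lists; each inner list holds the conflicting
    (feature, label) records that share one key. A repair keeps exactly
    one record from each group (assumed FD: key -> record).
    """
    predictions = {
        knn_predict(list(repair), x, k) for repair in product(*groups)
    }
    return len(predictions) == 1, predictions


# Hypothetical toy data: the third key has two conflicting records.
groups = [
    [(1.0, "a")],
    [(2.0, "a")],
    [(3.0, "b"), (9.0, "b")],
    [(8.0, "b")],
]

print(certifiably_robust(groups, 1.5, 1))  # same label in every repair
print(certifiably_robust(groups, 2.6, 1))  # label depends on the repair
```

For the test point 2.6, the repair that keeps the record at 3.0 predicts "b" while the repair that keeps the record at 9.0 predicts "a", so the outcome is not certifiably robust.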
arxiv.org

A Geometric Approach to $k$-means

$k$-means clustering is a fundamental problem in various disciplines. This problem is nonconvex, and standard algorithms are only guaranteed to find a local optimum. Leveraging the structure of local solutions characterized in [1], we propose a general algorithmic framework for escaping undesirable local solutions and recovering the global solution (or the ground truth). This framework consists of alternating between the following two steps iteratively: (i) detect mis-specified clusters in a local solution and (ii) improve the current local solution by non-local operations. We discuss implementation of these steps, and elucidate how the proposed framework unifies variants of the $k$-means algorithm in the literature from a geometric perspective. In addition, we introduce two natural extensions of the proposed framework, where the initial number of clusters is misspecified. We provide theoretical justification for our approach, which is corroborated with extensive experiments.
MATHEMATICS
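The claim in the $k$-means abstract above, that standard algorithms are only guaranteed to find a local optimum, can be seen on a tiny example. The sketch below is plain Lloyd's algorithm in one dimension, not the framework proposed in the paper; the function name `lloyd` and the toy data are assumptions chosen for illustration.

```python
def lloyd(points, centers, max_iter=100):
    """Standard Lloyd iterations on 1-D data; returns (centers, cost)."""
    centers = list(centers)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: (p - centers[i]) ** 2)
            clusters[j].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    cost = sum(min((p - c) ** 2 for c in centers) for p in points)
    return centers, cost


# Three well-separated pairs of points, clustered with k = 3.
points = [0, 1, 10, 11, 20, 21]

bad_centers, bad_cost = lloyd(points, [0, 1, 15])     # two centers start in one pair
good_centers, good_cost = lloyd(points, [0, 10, 20])  # one center per pair

print(bad_centers, bad_cost)    # [0.0, 1.0, 15.5] 101.0
print(good_centers, good_cost)  # [0.5, 10.5, 20.5] 1.5
```

Both runs converge, but the first stops at a solution in which one true cluster is split and two are merged. This is the kind of mis-specified local solution that steps (i) and (ii) of the abstract aim to detect and repair.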
arxiv.org

Discovering Governing Equations from Partial Measurements with Deep Delay Autoencoders

A central challenge in data-driven model discovery is the presence of hidden, or latent, variables that are not directly measured but are dynamically important. Takens' theorem provides conditions for when it is possible to augment these partial measurements with time delayed information, resulting in an attractor that is diffeomorphic to that of the original full-state system. However, the coordinate transformation back to the original attractor is typically unknown, and learning the dynamics in the embedding space has remained an open challenge for decades. Here, we design a custom deep autoencoder network to learn a coordinate transformation from the delay embedded space into a new space where it is possible to represent the dynamics in a sparse, closed form. We demonstrate this approach on the Lorenz, Rössler, and Lotka-Volterra systems, learning dynamics from a single measurement variable. As a challenging example, we learn a Lorenz analogue from a single scalar variable extracted from a video of a chaotic waterwheel experiment. The resulting modeling framework combines deep learning to uncover effective coordinates and the sparse identification of nonlinear dynamics (SINDy) for interpretable modeling. Thus, we show that it is possible to simultaneously learn a closed-form model and the associated coordinate system for partially observed dynamics.
MATHEMATICS
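The time-delay augmentation mentioned in the delay-autoencoder abstract above can be sketched as follows. This shows only the construction of delay vectors from a single measured variable; the autoencoder and the SINDy regression are not shown, and the function name `delay_embed` and the parameters `dim` and `lag` are hypothetical names chosen for illustration.

```python
def delay_embed(series, dim, lag=1):
    """Return delay vectors (x[t], x[t + lag], ..., x[t + (dim - 1) * lag])."""
    n_rows = len(series) - (dim - 1) * lag
    if n_rows <= 0:
        raise ValueError("series too short for this dim and lag")
    return [[series[t + j * lag] for j in range(dim)] for t in range(n_rows)]


# A scalar measurement sequence embedded in 3 delay coordinates with lag 2.
embedded = delay_embed([0, 1, 2, 3, 4, 5], dim=3, lag=2)
print(embedded)  # [[0, 2, 4], [1, 3, 5]]
```

Each row is one point in the delay-embedded space. In the setting the abstract describes, a learned coordinate transformation would then map such points into a space where the dynamics have a sparse closed form.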
arxiv.org

Quartic Regularity

In this paper, we propose new linearly convergent second-order methods for minimizing convex quartic polynomials. This framework is applied for designing optimization schemes, which can solve general convex problems satisfying a new condition of quartic regularity. It assumes positive definiteness and boundedness of the fourth derivative of the objective function. For such problems, an appropriate quartic regularization of Damped Newton Method has global linear rate of convergence. We discuss several important consequences of this result. In particular, it can be used for constructing new second-order methods in the framework of high-order proximal-point schemes. These methods have convergence rate $\tilde O(k^{-p})$, where $k$ is the iteration counter, $p$ is equal to 3, 4, or 5, and tilde indicates the presence of logarithmic factors in the complexity bounds for the auxiliary problems, which are solved at each iteration of the schemes.
MATHEMATICS
arxiv.org

The Polyhedral Geometry of Pivot Rules and Monotone Paths

Motivated by the analysis of the performance of the simplex method we study the behavior of families of pivot rules of linear programs. We introduce normalized-weight pivot rules which are fundamental for the following reasons: First, they are memory-less, in the sense that the pivots are governed by local information encoded by an arborescence. Second, many of the most used pivot rules belong to that class, and we show this subclass is critical for understanding the complexity of all pivot rules. Finally, normalized-weight pivot rules can be parametrized in a natural continuous manner.
MATHEMATICS
arxiv.org

Interface potential and line tension for Bose-Einstein condensate mixtures near a hard wall

Within Gross-Pitaevskii (GP) theory we derive the interface potential V(l) which describes the interaction between the interface separating two demixed Bose-condensed gases and an optical hard wall at a distance l. Previous work revealed that this interaction gives rise to extraordinary wetting and prewetting phenomena. Calculations that explore non-equilibrium properties by using l as a constraint provide a thorough explanation for this behavior. We find that at bulk two-phase coexistence, V(l) for both complete wetting and partial wetting is monotonic with exponential decay. Remarkably, at the first-order wetting phase transition, V(l) is independent of l. This anomaly explains the infinite continuous degeneracy of the grand potential reported earlier. As a physical application, using V(l) we study the three-phase contact line where the interface meets the wall under a contact angle theta. Employing an interface displacement model we calculate the structure of this inhomogeneity and its line tension tau. Contrary to what happens at a usual first-order wetting transition in systems with short-range forces, tau does not approach a nonzero positive constant for theta going to zero, but instead approaches zero (from below) as would be expected for a critical wetting transition. This hybrid character of tau is a consequence of the absence of a barrier in V(l) at wetting. For a typical V(l) we provide a conjecture for the exact line tension within GP theory.
MATHEMATICS
arxiv.org

Comments on the mass sheet degeneracy in cosmography analyses

We make a number of comments regarding modeling degeneracies in strong lensing measurements of the Hubble parameter $H_0$. The first point concerns the impact of weak lensing associated with different segments of the line of sight. We show that external convergence terms associated with the lens-source and observer-lens segments need to be included in cosmographic modeling, in addition to the usual observer-source term, to avoid systematic bias in the inferred value of $H_0$. Specifically, we show how an incomplete account of some line of sight terms biases stellar kinematics as well as ray tracing simulation methods to alleviate the mass sheet degeneracy. The second point concerns the use of imaging data for multiple strongly-lensed sources in a given system. We show that the mass sheet degeneracy is not fully resolved by the availability of multiple sources: some degeneracy remains because of differential external convergence between the different sources. Similarly, differential external convergence also complicates the use of multiple sources in addressing the approximate mass sheet degeneracy associated with a local ("internal") core component in lens galaxies. This internal-external degeneracy is amplified by the non-monotonicity of the angular diameter distance as a function of redshift. For a rough assessment of the weak lensing effects, we provide estimates of external convergence using the nonlinear matter power spectrum, paying attention to non-equal time correlators.
SCIENCE
arxiv.org

Reply to 'Comment on "Proper and improper chiral magnetic interactions" '

Manuel dos Santos Dias, Sascha Brinker, András Lászlóffy, Bendegúz Nyári, Stefan Blügel, László Szunyogh, Samir Lounis. In our previous Letter [Phys. Rev. B 103, L140408 (2021)], we presented a discussion of the fundamental physical properties of the interactions parameterizing atomistic spin models in connection to first-principles approaches that enable their calculation for a given material. This explained how some of those approaches can apparently lead to magnetic interactions that do not comply with the expected physical properties, such as Dzyaloshinskii-Moriya interactions which are non-chiral and independent of the spin-orbit interaction, and which we consequently termed `improper'. In the preceding Comment [Phys. Rev. B 105, 026401], the authors present arguments based on the distinction between global and local approaches to the mapping of the magnetic energy using first-principles calculations to support their proposed non-chiral Dzyaloshinskii-Moriya interactions and their dismissal of our distinction between `proper' and `improper' magnetic interactions. In this Reply, we identify the missing step in the local approach to the mapping and explain how ignoring this step leads to the identification of magnetic interactions which do not comply with established physical principles and that we have previously termed `improper'.
PHYSICS
arxiv.org

The curse of overparametrization in adversarial training: Precise analysis of robust generalization for random features regression

Successful deep learning models often involve training neural network architectures that contain more parameters than the number of training samples. Such overparametrized models have been extensively studied in recent years, and the virtues of overparametrization have been established from both the statistical perspective, via the double-descent phenomenon, and the computational perspective via the structural properties of the optimization landscape.
COMPUTERS
arxiv.org

When geometry meets optimization theory: partially orthogonal tensors

Due to the multi-linearity of tensors, most algorithms for tensor optimization problems are designed based on the block coordinate descent method. Such algorithms are widely employed by practitioners for their implementability and effectiveness. However, these algorithms usually suffer from the lack of theoretical guarantee of global convergence and analysis of convergence rate. In this paper, we propose a block coordinate descent type algorithm for the low rank partially orthogonal tensor approximation problem and analyse its convergence behaviour. To achieve this, we carefully investigate the variety of low rank partially orthogonal tensors and its geometric properties related to the parameter space, which enable us to locate KKT points of the concerned optimization problem. With the aid of these geometric properties, we prove without any assumption that: (1) our algorithm converges globally to a KKT point; (2) for any given tensor, the algorithm exhibits an overall sublinear convergence with an explicit rate which is sharper than the usual $O(1/k)$ for first order methods in nonconvex optimization; (3) for a generic tensor, our algorithm converges $R$-linearly.
MATHEMATICS
arxiv.org

Entanglement entropy in $(2+1)$D interacting theory: A dimension reduction approach

A formidable perspective in understanding collective quantum phenomena of a given many-body system is through its entanglement contents. Yet apart from well-established knowledge for free theories, so far much less is known about entanglement structure of interacting particles, especially for the cases beyond $(1+1)$ dimension. Here, we develop an efficient scheme to study the entanglement entropy for $(2+1)$-dimensional quantum field theories, which is able to go beyond the non-interacting or conformal settings. Within this framework, we exactly derive the area-law entanglement entropy for $(2+1)$-dimensional free scalar field and Dirac field, which are consistent with the expectations from existing studies. As a concrete example of interacting theory, we investigate the entanglement entropy of $(2+1)$-dimensional Dirac fermion under a random magnetic field, which cannot be straightforwardly solved via previous approaches. We analytically prove the area-law entanglement entropy remains, with a minor modification of the area-law coefficient by disorder. Additionally, our analytical solution is further validated by the corresponding lattice simulation. This advance not only offers a tool for exploring the correlations and quantum criticality, but also achieves a deepened understanding of the entanglement structure of quantum many-body systems.
PHYSICS
arxiv.org

Equivalence between fermion-to-qubit mappings in two spatial dimensions

We argue that all locality-preserving mappings between fermionic observables and Pauli matrices on a two-dimensional lattice can be generated from the exact bosonization in Ref. [1], whose gauge constraints project onto the subspace of the toric code with emergent fermions. Starting from the exact bosonization and applying Clifford finite-depth generalized local unitary (gLU) transformation, we can achieve all possible fermion-to-qubit mappings (up to the re-pairing of Majorana fermions). In particular, we discover a new super-compact encoding using 1.25 qubits per fermion on the square lattice, which is lower than any method in the literature. We prove the existence of fermion-to-qubit mappings with qubit-fermion ratios $r=1+ \frac{1}{2k}$ for positive integers $k$, where the proof utilizes the trivialness of quantum cellular automata (QCA) in two spatial dimensions. When the ratio approaches 1, the fermion-to-qubit mapping reduces to the 1d Jordan-Wigner transformation along a certain path in the two-dimensional lattice. Finally, we explicitly demonstrate that the Bravyi-Kitaev superfast simulation, the Verstraete-Cirac auxiliary method, Kitaev's exactly solved model, the Majorana loop stabilizer codes, and the compact fermion-to-qubit mapping can all be obtained from the exact bosonization.
MATHEMATICS
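The qubit-fermion ratio formula $r = 1 + \frac{1}{2k}$ quoted in the fermion-to-qubit abstract above is easy to tabulate. The short sketch below evaluates it exactly for a few values of $k$. Note that $k=2$ gives 1.25, which matches the 1.25 qubits per fermion quoted for the super-compact encoding; identifying that encoding with $k=2$ is a reading of the abstract, not something it states explicitly. The function name `qubits_per_fermion` is hypothetical.

```python
from fractions import Fraction


def qubits_per_fermion(k):
    """Exact value of r = 1 + 1/(2k) for a positive integer k."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return 1 + Fraction(1, 2 * k)


for k in (1, 2, 3, 10, 100):
    r = qubits_per_fermion(k)
    print(k, r, float(r))
# k = 1 gives 3/2, k = 2 gives 5/4 (1.25), and r approaches 1 as k grows.
```

The ratio decreases monotonically toward 1 as $k$ increases, the limit in which the abstract says the mapping reduces to the 1d Jordan-Wigner transformation along a path in the lattice.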
arxiv.org

New Class of Landau Levels and Hall Phases in a 2D Electron Gas Subject to an Inhomogeneous Magnetic Field: An Analytic Solution

An analytic closed form solution is derived for the bound states of electrons subject to a static, inhomogeneous ($1/r$-decaying) magnetic field, including the Zeeman interaction. The solution provides access to many-body properties of a two-dimensional, non-interacting, electron gas in the thermodynamic limit. Radially distorted Landau levels can be identified as well as magnetic field induced density and current oscillations close to the magnetic impurity. These radially localised oscillations depend strongly on the coupling of the spin to the magnetic field, which give rise to non-trivial spin currents. Moreover, the Zeeman interaction introduces a lowest flat band for $E_F=0^+$ assuming a spin $g_s$-factor of two. Surprisingly, in this case the charge and current densities can be computed analytically in the thermodynamic limit. Numerical calculations show that the total magnetic response of the electron gas remains diamagnetic (similar to Landau levels) independent of the Fermi energy. However, the contribution of certain, infinitely degenerate energy levels may become paramagnetic. Furthermore, numerical computations of the Hall conductivity reveal asymptotic properties of the electron gas, which are driven by the anisotropy of the vector potential instead of the magnetic field, i.e. become independent of spin. Eventually, the distorted Landau levels give rise to different Hall conductivity phases, which are characterized by sharp sign flips at specific Fermi energies. Overall, our work merges "impurity" with Landau-level physics, which provides novel physical insights, not only locally, but also in the asymptotic limit. This paves the way for a large number of future theoretical as well as experimental investigations.
PHYSICS
arxiv.org

Spectral fingerprints of non-equilibrium dynamics: The case of a Brownian gyrator

The same system can exhibit completely different dynamical behavior when it evolves in equilibrium conditions or when it is driven out-of-equilibrium by, e.g., connecting some of its components to heat baths kept at different temperatures. Here we concentrate on an analytically solvable and experimentally-relevant model of such a system -- the so-called Brownian gyrator -- a two-dimensional nanomachine that performs a systematic, on average, rotation around the origin under non-equilibrium conditions, while no net rotation takes place in equilibrium. Using this example, we discuss the question of whether it is possible to distinguish between the two types of behavior judging not upon the statistical properties of the trajectories of components, but rather upon their respective spectral densities. The latter are widely used to characterize diverse dynamical systems and are routinely calculated from the data using standard built-in packages. From such a perspective, we inquire whether the power spectral densities possess some "fingerprint" properties specific to the behavior in non-equilibrium. We show that indeed one can conclusively distinguish between equilibrium and non-equilibrium dynamics by analyzing the cross-correlations between the spectral densities of both components in the short frequency limit, or from the spectral densities of both components evaluated at zero frequency. Our analytical predictions, corroborated by experimental and numerical results, open a new direction for the analysis of non-equilibrium dynamics.
SCIENCE

