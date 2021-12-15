
Central Limit Type Theorem and Large Deviations for Multi-Scale McKean-Vlasov SDEs

By Wei Hong, Shihu Li, Wei Liu, Xiaobin Sun
arxiv.org
 4 days ago

In this paper, we study the asymptotic behavior of multi-scale McKean-Vlasov stochastic dynamical systems. First, we obtain a central limit type theorem, i.e., the deviation between the slow component $X^{\varepsilon}$...

arxiv.org

The resource cost of large scale quantum computing

This thesis addresses the scalability of fault-tolerant quantum computing. The question is studied from the angle of estimating the resources needed to build such computers. What we call a resource is in principle very general: it could be the power, the energy, or the total bandwidth allocated to the different qubits. However, we mainly focus on the energetic cost of quantum computing. In particular, we develop a multi-disciplinary approach that makes it possible to minimize the resources required to implement algorithms on quantum computers. By seeking the minimum amount of resources required to perform a computation under the constraint that the algorithm provides a correct answer with a targeted accuracy, it is possible to optimize the full computer in order to minimize the resources spent, while being certain to have a correct answer with high probability. We apply this approach to a complete model of a fault-tolerant quantum computer based on superconducting qubits. Our results indicate that for algorithms implemented on thousands of logical qubits, our method makes it possible to reduce the energetic cost by orders of magnitude in regimes where, without optimization, the power consumption could be close to a gigawatt. This work illustrates that the energetic cost of quantum computing should be a criterion in itself for evaluating the scaling potential of a given quantum computer technology. It also illustrates that optimizing the architecture of a quantum computer via transverse methods such as the ones we propose, spanning algorithms, error correction, qubit physics, and engineering aspects, can prove to be a powerful tool that clearly improves the scaling potential of quantum computers. Finally, we provide general hints about how to make fault-tolerant quantum computers energy efficient.
COMPUTERS
arxiv.org

Incentive Compatible Pareto Alignment for Multi-Source Large Graphs

In this paper, we focus on learning effective entity matching models over multi-source large-scale data. For real applications, we relax the typical assumptions that data distributions/spaces or entity identities are shared between sources, and propose a Relaxed Multi-source Large-scale Entity-matching (RMLE) problem. Challenges of the problem include 1) how to align large-scale entities between sources to share information and 2) how to mitigate negative transfer from jointly learning on multi-source data. Worse, in practice the two challenges are entangled. Specifically, incorrect alignments may increase negative transfer, while mitigating negative transfer for one source may result in poorly learned representations for other sources and thus decrease alignment accuracy. To handle the entangled challenges, we point out that the key is to optimize information sharing first based on Pareto front optimization, by showing that information sharing significantly influences the Pareto front, which depicts lower bounds of negative transfer. Consequently, we propose an Incentive Compatible Pareto Alignment (ICPA) method that first optimizes cross-source alignments based on Pareto front optimization, then mitigates negative transfer constrained on the optimized alignments. This mechanism lets each source learn based on its true preference without worrying about deteriorating the representations of other sources. Specifically, the Pareto front optimization encourages minimizing lower bounds of negative transfer, which optimizes whether to align and what to align. Comprehensive empirical evaluation results on four large-scale datasets are provided to demonstrate the effectiveness and superiority of ICPA. Online A/B test results at a search advertising platform also demonstrate the effectiveness of ICPA in production environments.
COMPUTERS
arxiv.org

Multi-scale Feature Learning Dynamics: Insights for Double Descent

A key challenge in building theoretical foundations for deep learning is the complex optimization dynamics of neural networks, resulting from the high-dimensional interactions between the large number of network parameters. Such non-trivial dynamics lead to intriguing behaviors such as the phenomenon of "double descent" of the generalization error. The more commonly studied aspect of this phenomenon corresponds to model-wise double descent, where the test error exhibits a second descent with increasing model complexity, beyond the classical U-shaped error curve. In this work, we investigate the origins of the less studied epoch-wise double descent, in which the test error undergoes two non-monotonic transitions, or descents, as the training time increases. By leveraging tools from statistical physics, we study a linear teacher-student setup exhibiting epoch-wise double descent similar to that in deep neural networks. In this setting, we derive closed-form analytical expressions for the evolution of generalization error over training. We find that double descent can be attributed to distinct features being learned at different scales: as fast-learning features overfit, slower-learning features start to fit, resulting in a second descent in test error. We validate our findings through numerical experiments where our theory accurately predicts empirical findings and remains consistent with observations in deep neural networks.
CODING & PROGRAMMING
arxiv.org

Subspace Decomposition based DNN algorithm for elliptic type multi-scale PDEs

While deep learning algorithms demonstrate great potential in scientific computing, their application to multi-scale problems remains a major challenge. This is manifested by the "frequency principle": neural networks tend to learn low frequency components first. Novel architectures such as the multi-scale deep neural network (MscaleDNN) were proposed to alleviate this problem to some extent. In this paper, we construct a subspace decomposition based DNN (dubbed SD$^2$NN) architecture for a class of multi-scale problems by combining traditional numerical analysis ideas and MscaleDNN algorithms. The proposed architecture includes one low frequency normal DNN submodule and one (or a few) high frequency MscaleDNN submodule(s), which are designed to capture the smooth part and the oscillatory part of the multi-scale solutions, respectively. In addition, a novel trigonometric activation function is incorporated in the SD$^2$NN model. We demonstrate the performance of the SD$^2$NN architecture through several benchmark multi-scale problems in regular or irregular geometric domains. Numerical results show that the SD$^2$NN model is superior to existing models such as MscaleDNN.
CODING & PROGRAMMING
arxiv.org

A geometric perspective on the scaling limits of critical Ising and $φ^4_d$ models

The lecture delivered at the Current Developments in Mathematics conference (Harvard-MIT 2020) focused on the recent proof of the Gaussian structure of the scaling limits of the critical Ising and $\varphi^4$ fields in the marginal case of four dimensions. These notes expand on the background of the question addressed by this result, approaching it from two partly overlapping perspectives: one concerning critical phenomena in statistical mechanics and the other functional integrals over Euclidean spaces which could serve as a springboard to quantum field theory. We start by recalling some basic results concerning the models' critical behavior in different dimensions. The analysis is framed in the models' stochastic geometric random current representation. It yields intuitive explanations as well as tools for proving a range of dimension dependent results, including: the emergence in $2D$ of Fermionic degrees of freedom, the non-Gaussianity of the scaling limits in two dimensions, and conversely the emergence of Gaussian behavior in four and higher dimensions. To cover the marginal case of $4D$, the tree diagram bound which has sufficed for higher dimensions needed to be supplemented by a singular correction. Its presence was established through multi-scale analysis in a recent joint work with Hugo Duminil-Copin.
MATHEMATICS
arxiv.org

Large-scale dark matter simulations

We review the field of collisionless numerical simulations for the large-scale structure of the Universe. We start by providing the main set of equations solved by these simulations and their connection with General Relativity. We then recap the relevant numerical approaches: discretization of the phase-space distribution (focusing on N-body but including alternatives, e.g., Lagrangian submanifold and Schrödinger-Poisson) and the respective techniques for their time evolution and force calculation (direct summation, mesh techniques, and hierarchical tree methods). We pay attention to the creation of initial conditions and the connection with Lagrangian Perturbation Theory. We then discuss the possible alternatives in terms of the micro-physical properties of dark matter (e.g., neutralinos, warm dark matter, QCD axions, Bose-Einstein condensates, and primordial black holes), and extensions to account for multiple fluids (baryons and neutrinos), primordial non-Gaussianity and modified gravity. We continue by discussing challenges involved in achieving highly accurate predictions. A key aspect of cosmological simulations is the connection to cosmological observables, and we discuss various techniques in this regard: structure finding, galaxy formation and baryonic modelling, the creation of emulators and light-cones, and the role of machine learning. We finish with an account of state-of-the-art large-scale simulations and conclude with an outlook for the next decade.
ASTRONOMY
arxiv.org

DVHN: A Deep Hashing Framework for Large-scale Vehicle Re-identification

In this paper, we make the very first attempt to investigate the integration of deep hash learning with vehicle re-identification. We propose a deep hash-based vehicle re-identification framework, dubbed DVHN, which substantially reduces memory usage and promotes retrieval efficiency while preserving nearest neighbor search accuracy. Concretely, DVHN directly learns discrete compact binary hash codes for each image by jointly optimizing the feature learning network and the hash code generating module. Specifically, we directly constrain the output from the convolutional neural network to be discrete binary codes and ensure the learned binary codes are optimal for classification. To optimize the deep discrete hashing framework, we further propose an alternating minimization method for learning binary similarity-preserved hashing codes. Extensive experiments on two widely studied vehicle re-identification datasets, VehicleID and VeRi, demonstrate the superiority of our method over state-of-the-art deep hash methods. DVHN with 2048 bits achieves 13.94% and 10.21% accuracy improvements in terms of mAP and Rank@1 on the VehicleID (800) dataset. For VeRi, we achieve 35.45% and 32.72% performance gains for Rank@1 and mAP, respectively.
CARS
arxiv.org

Thermodynamic and Scaling Limits of the non-Gaussian Membrane Model

We characterize the behavior of a random discrete interface $\phi$ on $[-L,L]^d \cap \mathbb{Z}^d$ with energy $\sum V(\Delta \phi(x))$ as $L \to \infty$, where $\Delta$ is the discrete Laplacian and $V$ is a uniformly convex, symmetric, and smooth potential. The interface $\phi$ is called the non-Gaussian membrane model. By analyzing the Helffer-Sjöstrand representation associated to $\Delta \phi$, we provide a unified approach to continuous scaling limits of the rescaled and interpolated interface in dimensions $d=2,3$, Gaussian approximation in negative regularity spaces for all $d \geq 2$, and the infinite volume limit in $d \geq 5$. Our results generalize some of those of arXiv:1801.05663.
MATHEMATICS
arxiv.org

A Large-Scale Benchmark for the Incompressible Navier-Stokes Equations

We introduce a collection of benchmark problems in 2D and 3D (geometry description and boundary conditions), including simple cases with known analytic solution, classical experimental setups, and complex geometries with fabricated solutions for evaluation of numerical schemes for incompressible Navier-Stokes equations in laminar flow regime. We compare the performance of a representative selection of most broadly used algorithms for Navier-Stokes equations on this set of problems. Where applicable, we compare the most common spatial discretization choices (unstructured triangle/tetrahedral meshes and structured or semi-structured quadrilateral/hexahedral meshes).
MATHEMATICS
arxiv.org

Multi-scale magnetic field structures in an expanding elongated plasma cloud with hot electrons subject to an external magnetic field

We carry out 3D and 2D PIC-simulations of the expansion of a magnetized plasma that initially uniformly fills a half-space and contains a semi-cylindrical region of heated electrons elongated along the surface of the plasma boundary. This geometry is related, for instance, to the ablation of a plane target by a femtosecond laser beam under quasi-cylindrical focusing. We find that the decay of the inhomogeneous plasma--vacuum discontinuity is strongly affected by an external magnetic field parallel to its boundary.
SCIENCE
arxiv.org

Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring

The success of state-of-the-art video deblurring methods stems mainly from implicit or explicit estimation of alignment among adjacent frames for latent video restoration. However, due to the influence of the blur effect, estimating the alignment information from blurry adjacent frames is not a trivial task. Inaccurate estimations will interfere with the subsequent frame restoration. Instead of estimating alignment information, we propose a simple and effective deep Recurrent Neural Network with Multi-scale Bi-directional Propagation (RNN-MBP) to effectively propagate and gather the information from unaligned neighboring frames for better video deblurring. Specifically, we build a Multi-scale Bi-directional Propagation (MBP) module with two U-Net RNN cells which can directly exploit the inter-frame information from unaligned neighboring hidden states by integrating them at different scales. Moreover, to better evaluate the proposed algorithm and existing state-of-the-art methods on real-world blurry scenes, we also create a Real-World Blurry Video Dataset (RBVD) using a well-designed Digital Video Acquisition System (DVAS) and use it as the training and evaluation dataset. Extensive experimental results demonstrate that the proposed RBVD dataset effectively improves the performance of existing algorithms on real-world blurry videos, and the proposed algorithm performs favorably against the state-of-the-art methods on three typical benchmarks. The code is available at this https URL.
COMPUTERS
arxiv.org

Enhancing Multi-Scale Implicit Learning in Image Super-Resolution with Integrated Positional Encoding

Is the center position fully capable of representing a pixel? There is nothing wrong with representing pixels by their centers in a discrete image representation, but it makes more sense to consider each pixel as the aggregation of signals from a local area in an image super-resolution (SR) context. Despite the great capability of coordinate-based implicit representation in the field of arbitrary-scale image SR, this area-like nature of pixels has not been fully considered. To this end, we propose integrated positional encoding (IPE), extending traditional positional encoding by aggregating frequency information over the pixel area. We apply IPE to the state-of-the-art arbitrary-scale image super-resolution method, local implicit image function (LIIF), presenting IPE-LIIF. We show the effectiveness of IPE-LIIF by quantitative and qualitative evaluations, and further demonstrate the generalization ability of IPE to larger image scales and multiple implicit-based methods. Code will be released.
COMPUTERS
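
One simple way to read "aggregating frequency information over the pixel area" is to average each sinusoidal feature over the pixel footprint instead of sampling it at the center. The sketch below is only an illustration under stated assumptions, not the authors' formulation: it assumes a 1D box-shaped footprint of width `width`, frequencies $2^k \pi$, and the hypothetical function names `positional_encoding` and `integrated_positional_encoding`. For a box footprint the average has a closed form: each feature is the center-sampled feature multiplied by $\sin(\omega w/2)/(\omega w/2)$.

```python
import numpy as np


def positional_encoding(center, num_freqs):
    """Standard encoding: sin/cos sampled at the pixel center only."""
    freqs = np.pi * 2.0 ** np.arange(num_freqs)  # assumed frequency ladder
    return np.concatenate([np.sin(freqs * center), np.cos(freqs * center)])


def integrated_positional_encoding(center, width, num_freqs):
    """Box-averaged encoding over [center - width/2, center + width/2].

    Assumption: the pixel footprint is a 1D box filter. The mean of
    sin(w*x) over the box equals sin(w*center) * sin(w*width/2) / (w*width/2),
    and likewise for cos, so high frequencies are damped for wide pixels.
    """
    freqs = np.pi * 2.0 ** np.arange(num_freqs)
    # np.sinc(t) = sin(pi*t) / (pi*t), so this is sin(w*width/2) / (w*width/2).
    attenuation = np.sinc(freqs * width / (2.0 * np.pi))
    return np.concatenate([
        np.sin(freqs * center) * attenuation,
        np.cos(freqs * center) * attenuation,
    ])


# Usage: a wide pixel suppresses the high-frequency features,
# while width = 0 falls back to the center-sampled encoding.
wide = integrated_positional_encoding(center=0.3, width=0.5, num_freqs=4)
point = integrated_positional_encoding(center=0.3, width=0.0, num_freqs=4)
```

The design point this illustrates is that the encoding becomes aware of pixel size: the same center coordinate yields different features at different scales.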
arxiv.org

Neural Network Acceleration of Large-scale Structure Theory Calculations

We use neural networks to accelerate the calculation of power spectra required for the analysis of galaxy clustering and weak gravitational lensing data. For modern perturbation theory codes, evaluation for a single cosmology and redshift can take on the order of two seconds. In combination with the comparable time required to compute linear predictions using a Boltzmann solver, these calculations are the bottleneck for many contemporary large-scale structure analyses. In this work, we construct neural network-based surrogate models for Lagrangian perturbation theory (LPT) predictions of matter power spectra, real and redshift space galaxy power spectra, and galaxy-matter cross power spectra that attain $\sim 0.1\%$ (at one sigma) accuracy over a broad range of scales in a $w$CDM parameter space. The neural network surrogates can be evaluated in approximately one millisecond, a factor of 1000 faster than the full Boltzmann code and LPT computations. In a simulated full-shape redshift space galaxy power spectrum analysis, we demonstrate that the posteriors obtained using our surrogates are accurate compared to those obtained using the full LPT model. We make our surrogate models public at this https URL, so that others may take advantage of the speed gains they provide to enable rapid iteration on analysis settings, something that is essential in complex contemporary large-scale structure analyses.
SCIENCE
arxiv.org

Category-theoretic recipe for dualities in one-dimensional quantum lattice models

We present a systematic approach for generating duality transformations in quantum lattice models. Within our formalism, dualities are completely characterized by equivalent but distinct realizations of a given (possibly non-abelian and non-invertible) symmetry. These different realizations are encoded into fusion categories, and dualities are methodically generated by considering all Morita equivalent categories. The full set of symmetric operators can then be constructed from the categorical data. We construct explicit intertwiners, in the form of matrix product operators, that convert local symmetric operators of one realization into local symmetric operators of its dual. Concurrently, they map local operators that transform non-trivially into non-local ones. This guarantees that the structure constants of the algebra of all symmetric operators are equal in both dual realizations. Families of dual Hamiltonians, possibly with long range interactions, are then designed by taking linear combinations of the corresponding symmetric operators. We illustrate this approach by establishing matrix product operator intertwiners for well-known dualities such as Kramers-Wannier and Jordan-Wigner, consider theories with two copies of the Ising category symmetry, and present an example with quantum group symmetries. Finally, we comment on generalizations to higher dimensions of this categorical approach to dualities.
MATHEMATICS
arxiv.org

Adaptation and Attention for Neural Video Coding

Nannan Zou, Honglei Zhang, Francesco Cricri, Ramin G. Youvalari, Hamed R. Tavakoli, Jani Lainema, Emre Aksu, Miska Hannuksela, Esa Rahtu. Neural image coding now represents the state-of-the-art image compression approach. However, a lot of work is still to be done in the video domain. In this work, we propose an end-to-end learned video codec that introduces several architectural novelties as well as training novelties, revolving around the concepts of adaptation and attention. Our codec is organized as an intra-frame codec paired with an inter-frame codec. As one architectural novelty, we propose to train the inter-frame codec model to adapt the motion estimation process based on the resolution of the input video. A second architectural novelty is a new neural block that combines concepts from split-attention based neural networks and from DenseNets. Finally, we propose to overfit a set of decoder-side multiplicative parameters at inference time. Through ablation studies and comparisons to prior art, we show the benefits of our proposed techniques in terms of coding gains. We compare our codec to VVC/H.266 and RLVC, which represent the state-of-the-art traditional and end-to-end learned codecs, respectively, and to the top performing end-to-end learned approach in the 2021 CLIC competition, E2E_T_OL. Our codec clearly outperforms E2E_T_OL, and compares favorably to VVC and RLVC in some settings.
CODING & PROGRAMMING
arxiv.org

Linear instability and resonance effects in large-scale opposition flow control

Opposition flow control is a robust strategy that has proved effective in turbulent wall-bounded flows. Its conventional setup consists of measuring wall-normal velocity in the buffer layer and opposing it at the wall. This work explores the possibility of implementing this strategy with a detection plane in the logarithmic layer, where control could be feasible experimentally. We apply control on a channel flow at $Re_\tau = 932$, only on the eddies with relatively large wavelengths ($\lambda / h > 0.1$). Similarly to buffer layer opposition control, our control strategy results in a virtual-wall effect for the wall-normal velocity, creating a minimum in its intensity. However, it also induces a large response in the streamwise velocity and Reynolds stresses near the wall, with a substantial drag increase. When the phase of the control lags with respect to the detection plane, spanwise-homogeneous rollers are observed near the channel wall. We show that they are the result of a linear instability. In contrast, when the control leads with respect to the detection plane, this instability is inactive and oblique waves are observed. Their wall-normal profiles can be predicted linearly as a response of the turbulent channel flow to a forcing with the advection velocity of the detection plane. The linearity governing the flow opens up the possibility of affecting large scales of the flow in a controlled manner when enhanced turbulence intensity or mixing is desired.
SCIENCE
High Point Enterprise

Escape the maze: Manage complex, large-scale systems with flexibility and simplicity

Managing a complex, large-scale system can feel like being deep in a maze, but it doesn’t need to. Data is the common thread that can lead you out. Multiple types of applications developed by different teams with different tools may need to address different data types in multiple locations. Managing all these differences threatens to be chaos.
COMPUTERS
arxiv.org

Constrained multi-objective optimization of process design parameters in settings with scarce data: an application to adhesive bonding

Alejandro Morales-Hernández, Sebastian Rojas Gonzalez, Inneke Van Nieuwenhuyse, Jeroen Jordens, Maarten Witters, Bart Van Doninck. Adhesive joints are increasingly used in industry for a wide variety of applications because of their favorable characteristics such as high strength-to-weight ratio, design flexibility, limited stress concentrations, planar force transfer, good damage tolerance and fatigue resistance. Finding the optimal process parameters for an adhesive bonding process is challenging: the optimization is inherently multi-objective (aiming to maximize break strength while minimizing cost) and constrained (the process should not result in any visual damage to the materials, and stress tests should not result in failures that are adhesion-related). Real-life physical experiments in the lab are expensive to perform; traditional evolutionary approaches (such as genetic algorithms) are therefore ill-suited to solve the problem, due to the prohibitive number of experiments required for evaluation. In this research, we successfully applied specific machine learning techniques (Gaussian Process Regression and Logistic Regression) to emulate the objective and constraint functions based on a limited amount of experimental data. The techniques are embedded in a Bayesian optimization algorithm, which succeeds in detecting Pareto-optimal process settings in a highly efficient way (i.e., requiring a limited number of extra experiments).
ENGINEERING
arxiv.org

Bayesian Distributionally Robust Optimization

We introduce a new framework, Bayesian Distributionally Robust Optimization (Bayesian-DRO), for data-driven stochastic optimization where the underlying distribution is unknown. Bayesian-DRO contrasts with most of the existing DRO approaches in the use of Bayesian estimation of the unknown distribution. To make computation of Bayesian updating tractable, Bayesian-DRO first assumes the underlying distribution takes a parametric form with unknown parameter and then computes the posterior distribution of the parameter. To address the model uncertainty brought by the assumed parametric distribution, Bayesian-DRO constructs an ambiguity set of distributions with the assumed parametric distribution as the reference distribution and then optimizes with respect to the worst case in the ambiguity set. We show the strong exponential consistency of the Bayesian posterior distribution and subsequently the convergence of objective functions and optimal solutions of Bayesian-DRO. We also consider several approaches to selecting the ambiguity set size in Bayesian-DRO and compare them numerically. Our numerical results demonstrate the out-of-sample performance of Bayesian-DRO on the news vendor problem of different dimensions and data types.
CODING & PROGRAMMING
arxiv.org

Iterative subspace algorithms for finite-temperature solution of Dyson equation

One-particle Green's functions obtained from the self-consistent solution of the Dyson equation can be employed in the evaluation of spectroscopic and thermodynamic properties for both molecules and solids. However, typical acceleration techniques used in traditional quantum chemistry self-consistent algorithms cannot be easily deployed for Green's function methods, because of the non-convex grand potential functional and the non-idempotent density matrix. Moreover, the inclusion of correlation effects in the form of the self-energy matrix and a changing chemical potential or fluctuations in the number of particles can make the optimization problem more difficult. In this paper, we study acceleration techniques to target the self-consistent solution of the Dyson equation directly. We use the direct inversion in the iterative subspace (DIIS), the least-squares commutator in the iterative subspace (LCIIS), and the Krylov space accelerated inexact Newton method (KAIN). We observe that the definition of the residual has a significant impact on the convergence of the iterative procedure. Based on the Dyson equation, we generalize the concept of the commutator residual used in DIIS (CDIIS) and LCIIS, and compare it with the difference residual used in DIIS and KAIN. The commutator residuals outperform the difference residuals for all considered molecular and solid systems within both GW and GF2. The generalized CDIIS and LCIIS methods successfully converged restricted GF2 calculations for a number of strongly correlated systems, which could not be converged before. We also provide practical recommendations to guide convergence in such pathological cases.
MATHEMATICS
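
To make the idea of DIIS with a difference residual concrete, here is a minimal sketch on a toy fixed-point problem $x = g(x)$. It is not the paper's Dyson-equation solver: the map `g`, the matrix `A`, the vector `b`, and the function name `diis_fixed_point` are all hypothetical, chosen only for illustration. The residual is the difference $r_i = g(x_i) - x_i$, and the next iterate is the combination $\sum_i c_i\, g(x_i)$ with $\sum_i c_i = 1$ that minimizes $\lVert \sum_i c_i r_i \rVert$. For numerical robustness the constrained minimization is solved here in an equivalent unconstrained least-squares form (the form used in Anderson acceleration) rather than through the usual Lagrange-multiplier matrix.

```python
import numpy as np


def diis_fixed_point(g, x0, max_iter=50, history=6, tol=1e-9):
    """Accelerate the iteration x <- g(x) with DIIS (difference residual)."""
    x = np.asarray(x0, dtype=float)
    trial, resid = [], []  # stored g(x_i) and r_i = g(x_i) - x_i
    for it in range(1, max_iter + 1):
        gx = g(x)
        r = gx - x
        if np.linalg.norm(r) < tol:
            return x, it
        # Keep only the most recent `history` vectors.
        trial = (trial + [gx])[-history:]
        resid = (resid + [r])[-history:]
        if len(resid) == 1:
            x = gx  # plain fixed-point step until there is a subspace
            continue
        # Write c = (gamma_1, ..., gamma_{m-1}, 1 - sum(gamma)), so that
        # sum_i c_i r_i = r_last + sum_j gamma_j (r_j - r_last),
        # and minimize its norm by ordinary least squares.
        d_resid = np.stack([rj - resid[-1] for rj in resid[:-1]], axis=1)
        gamma = np.linalg.lstsq(d_resid, -resid[-1], rcond=None)[0]
        coeffs = np.append(gamma, 1.0 - gamma.sum())
        # Extrapolate from the stored trial vectors.
        x = sum(ci * gi for ci, gi in zip(coeffs, trial))
    return x, max_iter


# Usage on a hypothetical linear contraction g(x) = A x + b.
A = np.array([[0.5, 0.2, 0.0],
              [0.1, 0.4, 0.1],
              [0.0, 0.3, 0.6]])
b = np.array([1.0, 2.0, 3.0])
x_star, n_iter = diis_fixed_point(lambda v: A @ v + b, np.zeros(3))
```

Swapping the difference residual for another definition only changes how `r` is computed, which is the knob the abstract identifies as having a significant impact on convergence.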

