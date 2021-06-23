
A stochastic quantum program synthesis framework based on Bayesian optimization

By Yao Xiao, Shahin Nazarian, Paul Bogdan
Nature.com
 13 days ago

Quantum computers and algorithms can offer exponential performance improvement over some NP-complete programs which cannot be run efficiently through a Von Neumann computing approach. In this paper, we present BayeSyn, which utilizes an enhanced stochastic program synthesis and Bayesian optimization to automatically generate quantum programs from high-level languages subject to certain constraints. We find that stochastic synthesis can comparatively and efficiently generate a program with a lower cost from the high dimensional program space. We also realize that hyperparameters used in stochastic synthesis play a significant role in determining the optimal program. Therefore, BayeSyn utilizes Bayesian optimization to fine-tune such parameters to generate a suitable quantum program.

Astronomy
scitechdaily.com

Astronomers Use Artificial Intelligence to Reveal the Actual Shape of the Universe

Japanese astronomers have developed a new artificial intelligence (AI) technique to remove noise in astronomical data due to random variations in galaxy shapes. After extensive training and testing on large mock data created by supercomputer simulations, they then applied this new tool to actual data from Japan’s Subaru Telescope and found that the mass distribution derived from using this method is consistent with the currently accepted models of the Universe. This is a powerful new tool for analyzing big data from current and planned astronomy surveys.
Coding & Programming
arxiv.org

Repulsive Deep Ensembles are Bayesian

Deep ensembles have recently gained popularity in the deep learning community for their conceptual simplicity and efficiency. However, maintaining functional diversity between ensemble members that are independently trained with gradient descent is challenging. This can lead to pathologies when adding more ensemble members, such as a saturation of the ensemble performance, which converges to the performance of a single model. Moreover, this does not only affect the quality of its predictions, but even more so the uncertainty estimates of the ensemble, and thus its performance on out-of-distribution data. We hypothesize that this limitation can be overcome by discouraging different ensemble members from collapsing to the same function. To this end, we introduce a kernelized repulsive term in the update rule of the deep ensembles. We show that this simple modification not only enforces and maintains diversity among the members but, even more importantly, transforms the maximum a posteriori inference into proper Bayesian inference. Namely, we show that the training dynamics of our proposed repulsive ensembles follow a Wasserstein gradient flow of the KL divergence with the true posterior. We study repulsive terms in weight and function space and empirically compare their performance to standard ensembles and Bayesian baselines on synthetic and real-world prediction tasks.
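The idea of adding a kernelized repulsive term to an ensemble update can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes scalar "weights", a standard Gaussian posterior, an RBF kernel, and a kernel-density style repulsion (the gradient of the kernel sum divided by the kernel sum). The names `repulsive_step`, `train`, and `bandwidth` are hypothetical.

```python
import math

def repulsive_step(ws, grad_log_post, lr=0.05, bandwidth=0.5, repulsive=True):
    """Apply one simultaneous update to every ensemble member (scalar weights)."""
    new_ws = []
    for wi in ws:
        drive = grad_log_post(wi)  # plain MAP-style gradient ascent term
        if repulsive:
            # RBF kernel values between this member and all members (self included).
            ks = [math.exp(-(wi - wj) ** 2 / (2 * bandwidth ** 2)) for wj in ws]
            # Assumed repulsion: -sum_j d/dwi k(wi, wj) / sum_j k(wi, wj).
            push = sum((wi - wj) * k for wj, k in zip(ws, ks))
            drive += push / (bandwidth ** 2 * sum(ks))
        new_ws.append(wi + lr * drive)
    return new_ws

def train(ws, steps=2000, **kwargs):
    """Iterate the update on a toy target whose log-density gradient is -w."""
    for _ in range(steps):
        ws = repulsive_step(ws, lambda w: -w, **kwargs)
    return ws

if __name__ == "__main__":
    start = [-0.2, 0.1]
    print("without repulsion:", train(start, repulsive=False))
    print("with repulsion:   ", train(start, repulsive=True))
```

In this toy setting the members without repulsion all drift to the same mode, while the repelled members settle at distinct points, which is the diversity effect the abstract describes.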
Coding & Programming
arxiv.org

Asynchronous Stochastic Optimization Robust to Arbitrary Delays

We consider stochastic optimization with delayed gradients where, at each time step $t$, the algorithm makes an update using a stale stochastic gradient from step $t - d_t$ for some arbitrary delay $d_t$. This setting abstracts asynchronous distributed optimization where a central server receives gradient updates computed by worker machines. These machines can experience computation and communication loads that might vary significantly over time. In the general non-convex smooth optimization setting, we give a simple and efficient algorithm that requires $O( \sigma^2/\epsilon^4 + \tau/\epsilon^2 )$ steps for finding an $\epsilon$-stationary point $x$, where $\tau$ is the average delay $\frac{1}{T}\sum_{t=1}^T d_t$ and $\sigma^2$ is the variance of the stochastic gradients. This improves over previous work, which showed that stochastic gradient descent achieves the same rate but with respect to the maximal delay $\max_{t} d_t$, which can be significantly larger than the average delay, especially in heterogeneous distributed systems. Our experiments demonstrate the efficacy and robustness of our algorithm in cases where the delay distribution is skewed or heavy-tailed.
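The delayed-gradient setting itself is easy to simulate. The sketch below shows plain SGD applied with stale gradients, i.e. the baseline setting described above, not the delay-robust algorithm the paper proposes. The function names, the quadratic test objective, and the step size are illustrative assumptions.

```python
import random

def delayed_sgd(grad, x0, delays, lr=0.05, noise=0.0, seed=0):
    """Run SGD where step t uses a gradient evaluated at the iterate from step t - d_t."""
    rng = random.Random(seed)
    history = [x0]  # history[t] is the iterate at the start of step t
    x = x0
    for t, d in enumerate(delays):
        stale_t = max(0, t - d)  # clamp so we never look before the initial point
        g = grad(history[stale_t]) + rng.gauss(0.0, noise)  # stale stochastic gradient
        x = x - lr * g
        history.append(x)
    return x, history

def average_delay(delays):
    """Average delay tau = (1/T) * sum_t d_t."""
    return sum(delays) / len(delays)

if __name__ == "__main__":
    # Skewed delays: mostly fresh gradients with an occasional very stale one.
    delays = [50 if t % 100 == 99 else 1 for t in range(500)]
    x_final, _ = delayed_sgd(lambda v: v, 1.0, delays, lr=0.05, noise=0.01)
    print("average delay:", average_delay(delays), "max delay:", max(delays))
    print("final iterate:", x_final)
```

The usage example uses a delay sequence whose maximum is far larger than its average, which is the regime where a bound in terms of $\tau$ rather than $\max_t d_t$ matters.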
Computers
arxiv.org

Bayesian Differential Privacy for Linear Dynamical Systems

Differential privacy is a privacy measure based on the difficulty of discriminating between similar input data. In differential privacy analysis, similar data usually implies that their distance does not exceed a predetermined threshold. It, consequently, does not take into account the difficulty of distinguishing data sets that are far apart, which often contain highly private information. This problem has been pointed out in the research on differential privacy for static data, and Bayesian differential privacy has been proposed, which provides a privacy protection level even for outlier data by utilizing the prior distribution of the data. In this study, we introduce this Bayesian differential privacy to dynamical systems, and provide privacy guarantees for distant input data pairs and reveal its fundamental property. For example, we design a mechanism that satisfies the desired level of privacy protection, which characterizes the trade-off between privacy and information utility.
Coding & Programming
arxiv.org

Solving specified-time distributed optimization problem via sampled-data-based algorithm

Despite significant advances in distributed continuous-time optimization of multi-agent networks, there is still a lack of an efficient algorithm to achieve the goal of distributed optimization at a pre-specified time. Herein, we design a specified-time distributed optimization algorithm for connected agents with directed topologies to collectively minimize the sum of individual objective functions subject to an equality constraint. With the designed algorithm, the settling time of distributed optimization can be exactly predefined. The specified selection of such a settling time is independent of not only the initial conditions of agents, but also the algorithm parameters and the communication topologies. Furthermore, the proposed algorithm can realize specified-time optimization by exchanging information among neighbours only at discrete sampling instants and thus reduces the communication burden. In addition, the equality constraint is always satisfied during the whole process, which makes the proposed algorithm applicable to solving distributed optimization problems online, such as economic dispatch. For the special case of undirected communication topologies, a reduced-order algorithm is also designed. Finally, the effectiveness of the theoretical analysis is justified by numerical simulations.
Mathematics
arxiv.org

Quantum-classical distance as a tool to design optimal chiral quantum walk

Continuous-time quantum walks (CTQWs) provide a valuable model for quantum transport, universal quantum computation and quantum spatial search, among others. Recently, the empowering role of new degrees of freedom in the Hamiltonian generator of CTQWs, which are the complex phases along the loops of the underlying graph, was acknowledged for its interest in optimizing or suppressing transport on specific topologies. We argue that the quantum-classical distance, a figure of merit which was introduced to capture the difference in dynamics between a CTQW and its classical, stochastic counterpart, guides the optimization of parameters of the Hamiltonian to achieve better quantum transport on cycle graphs and spatial search to the quantum speed limit without an oracle on complete graphs, the latter also implying fast uniform mixing. We compare the variations of this quantity with the 1-norm of coherence and the Inverse Participation Ratio, showing that the quantum-classical distance is linked to both, but in a topology-dependent relation, which is key to spot the most interesting quantum evolution in each case.
Coding & Programming
arxiv.org

Local policy search with Bayesian optimization

Reinforcement learning (RL) aims to find an optimal policy by interaction with an environment. Consequently, learning complex behavior requires a vast number of samples, which can be prohibitive in practice. Nevertheless, instead of systematically reasoning and actively choosing informative samples, policy gradients for local search are often obtained from random perturbations. These random samples yield high variance estimates and hence are sub-optimal in terms of sample complexity. Actively selecting informative samples is at the core of Bayesian optimization, which constructs a probabilistic surrogate of the objective from past samples to reason about informative subsequent ones. In this paper, we propose to join both worlds. We develop an algorithm utilizing a probabilistic model of the objective function and its gradient. Based on the model, the algorithm decides where to query a noisy zeroth-order oracle to improve the gradient estimates. The resulting algorithm is a novel type of policy search method, which we compare to existing black-box algorithms. The comparison reveals improved sample complexity and reduced variance in extensive empirical evaluations on synthetic objectives. Further, we highlight the benefits of active sampling on popular RL benchmarks.
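The random-perturbation baseline mentioned above can be sketched in a few lines. This is not the paper's Bayesian-optimization-based method: it is a generic zeroth-order gradient estimator using antithetic Gaussian perturbations, and the names `perturbation_gradient`, `sigma`, and `num_samples` are hypothetical.

```python
import random

def perturbation_gradient(f, x, sigma=0.1, num_samples=200, seed=0):
    """Estimate the gradient of f at x from function values only."""
    rng = random.Random(seed)
    dim = len(x)
    grad = [0.0] * dim
    for _ in range(num_samples):
        # Draw a random direction and query f at two mirrored points around x.
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x_plus = [xi + sigma * e for xi, e in zip(x, eps)]
        x_minus = [xi - sigma * e for xi, e in zip(x, eps)]
        # Finite-difference slope along eps, averaged over all samples.
        scale = (f(x_plus) - f(x_minus)) / (2.0 * sigma)
        for i in range(dim):
            grad[i] += scale * eps[i] / num_samples
    return grad

if __name__ == "__main__":
    objective = lambda v: sum(vi * vi for vi in v)  # true gradient is 2 * v
    for n in (10, 100, 1000):
        print(n, perturbation_gradient(objective, [1.0, -2.0], num_samples=n))
```

Running the usage example with few samples gives visibly noisy estimates, which illustrates the high-variance problem that motivates choosing query points actively instead of at random.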
Coding & Programming
arxiv.org

Tensor-based framework for training flexible neural networks

Activation functions (AFs) are an important part of the design of neural networks (NNs), and their choice plays a predominant role in the performance of a NN. In this work, we are particularly interested in the estimation of flexible activation functions using tensor-based solutions, where the AFs are expressed as a weighted sum of predefined basis functions. To do so, we propose a new learning algorithm which solves a constrained coupled matrix-tensor factorization (CMTF) problem. This technique fuses the first and zeroth order information of the NN, where the first-order information is contained in a Jacobian tensor, following a constrained canonical polyadic decomposition (CPD). The proposed algorithm can handle different decomposition bases. The goal of this method is to compress large pretrained NN models, by replacing subnetworks, i.e., one or multiple layers of the original network, by a new flexible layer. The approach is applied to a pretrained convolutional neural network (CNN) used for character classification.
Computers
arxiv.org

A stochastic linearized proximal method of multipliers for convex stochastic optimization with expectation constraints

This paper considers the problem of minimizing a convex expectation function with a set of inequality convex expectation constraints. We present a computable stochastic approximation type algorithm, namely the stochastic linearized proximal method of multipliers, to solve this convex stochastic optimization problem. This algorithm can be roughly viewed as a hybrid of stochastic approximation and the traditional proximal method of multipliers. Under mild conditions, we show that this algorithm exhibits $O(K^{-1/2})$ expected convergence rates for both objective reduction and constraint violation if parameters in the algorithm are properly chosen, where $K$ denotes the number of iterations. Moreover, we show that, with high probability, the algorithm has $O(\log(K)K^{-1/2})$ constraint violation bound and $O(\log^{3/2}(K)K^{-1/2})$ objective bound. Some preliminary numerical results demonstrate the performance of the proposed algorithm.
Coding & Programming
arxiv.org

Nonlinear Quantum Optimization Algorithms via Efficient Ising Model Encodings

Despite extensive research efforts, few quantum algorithms for classical optimization demonstrate realizable advantage. The utility of many quantum algorithms is limited by high requisite circuit depth and nonconvex optimization landscapes. We tackle these challenges to quantum advantage with two new variational quantum algorithms, which utilize multi-basis graph encodings and nonlinear activation functions to outperform existing methods with remarkably shallow quantum circuits. Both algorithms provide a polynomial reduction in measurement complexity and either a factor of two speedup or a factor of two reduction in quantum resources. Typically, the classical simulation of such algorithms with many qubits is impossible due to the exponential scaling of traditional quantum formalism and the limitations of tensor networks. Nonetheless, the shallow circuits and moderate entanglement of our algorithms, combined with efficient tensor method-based simulation, enable us to successfully optimize the MaxCut of high-connectivity global graphs with up to $512$ nodes (qubits) on a single GPU.
Computers
arxiv.org

A mechanistic-based data-driven approach to accelerate structural topology optimization through finite element convolutional neural network (FE-CNN)

In this paper, a mechanistic data-driven approach is proposed to accelerate structural topology optimization, employing an in-house developed finite element convolutional neural network (FE-CNN). Our approach can be divided into two stages: offline training, and online optimization. During offline training, a mapping function is built between high and low resolution representations of a given design domain. The mapping is expressed by a FE-CNN, which targets a common objective function value (e.g., structural compliance) across design domains of differing resolutions. During online optimization, an arbitrary design domain of high resolution is reduced to low resolution through the trained mapping function. The original high-resolution domain is thus designed by computations performed on only the low-resolution version, followed by an inverse mapping back to the high-resolution domain. Numerical examples demonstrate that this approach can accelerate optimization by up to an order of magnitude in computational time. Our proposed approach therefore shows great potential to overcome the curse-of-dimensionality incurred by density-based structural topology optimization. The limitation of our present approach is also discussed.
Technology
arxiv.org

Bayesian Eye Tracking

Model-based eye tracking has been a dominant approach for eye gaze tracking because of its ability to generalize to different subjects, without the need of any training data and eye gaze annotations. Model-based eye tracking, however, is susceptible to eye feature detection errors, in particular for eye tracking in the wild. To address this issue, we propose a Bayesian framework for model-based eye tracking. The proposed system consists of a cascade-Bayesian Convolutional Neural Network (c-BCNN) to capture the probabilistic relationships between eye appearance and its landmarks, and a geometric eye model to estimate eye gaze from the eye landmarks. Given a testing eye image, the Bayesian framework can generate, through Bayesian inference, the eye gaze distribution without explicit landmark detection and model training, based on which it not only estimates the most likely eye gaze but also its uncertainty. Furthermore, with Bayesian inference instead of point-based inference, our model can not only generalize better to different subjects, head poses, and environments but also is robust to image noise and landmark detection errors. Finally, with the estimated gaze uncertainty, we can construct a cascade architecture that allows us to progressively improve gaze estimation accuracy. Compared to state-of-the-art model-based and learning-based methods, the proposed Bayesian framework demonstrates significant improvement in generalization capability across several benchmark datasets and in accuracy and robustness under challenging real-world conditions.
Computers
arxiv.org

A Clustering-based Framework for Classifying Data Streams

The non-stationary nature of data streams strongly challenges traditional machine learning techniques. Although some solutions have been proposed to extend traditional machine learning techniques for handling data streams, these approaches either require an initial label set or rely on specialized design parameters. The overlap among classes and the labeling of data streams constitute other major challenges for classifying data streams. In this paper, we propose a clustering-based data stream classification framework to handle non-stationary data streams without utilizing an initial label set. A density-based stream clustering procedure is used to capture novel concepts with a dynamic threshold and an effective active label querying strategy is introduced to continuously learn the new concepts from the data streams. The sub-cluster structure of each cluster is explored to handle the overlap among classes. Experimental results and quantitative comparison studies reveal that the proposed method provides performance that is statistically better than or comparable to that of the existing methods.
Mathematics
arxiv.org

Bayesian Mechanics for Stationary Processes

This paper develops a Bayesian mechanics for adaptive systems. Firstly, we model the interface between a system and its environment with a Markov blanket. This affords conditions under which states internal to the blanket encode information about external states. Second, we introduce dynamics and represent adaptive systems as Markov blankets...
Coding & Programming
arxiv.org

DeepStochLog: Neural Stochastic Logic Programming

Recent advances in neural symbolic learning, such as DeepProbLog, extend probabilistic logic programs with neural predicates. Like graphical models, these probabilistic logic programs define a probability distribution over possible worlds, for which inference is computationally hard. We propose DeepStochLog, an alternative neural symbolic framework based on stochastic definite clause grammars, a type of stochastic logic program, which defines a probability distribution over possible derivations. More specifically, we introduce neural grammar rules into stochastic definite clause grammars to create a framework that can be trained end-to-end. We show that inference and learning in neural stochastic logic programming scale much better than for neural probabilistic logic programs. Furthermore, the experimental evaluation shows that DeepStochLog achieves state-of-the-art results on challenging neural symbolic learning tasks.
Electronics
arxiv.org

MG-DVD: A Real-time Framework for Malware Variant Detection Based on Dynamic Heterogeneous Graph Learning

Detecting the newly emerging malware variants in real time is crucial for mitigating cyber risks and proactively blocking intrusions. In this paper, we propose MG-DVD, a novel detection framework based on dynamic heterogeneous graph learning, to detect malware variants in real time. Particularly, MG-DVD first models the fine-grained execution event streams of malware variants into dynamic heterogeneous graphs and investigates real-world meta-graphs between malware objects, which can effectively characterize more discriminative malicious evolutionary patterns between malware and their variants. Then, MG-DVD presents two dynamic walk-based heterogeneous graph learning methods to learn more comprehensive representations of malware variants, which significantly reduces the cost of the entire graph retraining. As a result, MG-DVD is equipped with the ability to detect malware variants in real time, and it presents better interpretability by introducing meaningful meta-graphs. Comprehensive experiments on large-scale samples prove that our proposed MG-DVD outperforms state-of-the-art methods in detecting malware variants in terms of effectiveness and efficiency.
Computers
arxiv.org

Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems

Stochastic nested optimization, including stochastic compositional, min-max and bilevel optimization, is gaining popularity in many machine learning applications. While the three problems share the nested structure, existing works often treat them separately, and thus develop problem-specific algorithms and their analyses. Among various exciting developments, simple SGD-type updates (potentially on multiple variables) are still prevalent in solving this class of nested problems, but they are believed to have slower convergence rate compared to that of the non-nested problems. This paper unifies several SGD-type updates for stochastic nested problems into a single SGD approach that we term ALternating Stochastic gradient dEscenT (ALSET) method. By leveraging the hidden smoothness of the problem, this paper presents a tighter analysis of ALSET for stochastic nested problems. Under the new analysis, to achieve an $\epsilon$-stationary point of the nested problem, it requires ${\cal O}(\epsilon^{-2})$ samples. Under certain regularity conditions, applying our results to stochastic compositional, min-max and reinforcement learning problems either improves or matches the best-known sample complexity in the respective cases. Our results explain why simple SGD-type algorithms in stochastic nested problems all work very well in practice without the need for further modifications.
Mathematics
arxiv.org

Parameter Estimation for the McKean-Vlasov Stochastic Differential Equation

In this paper, we consider the problem of parameter estimation for a stochastic McKean-Vlasov equation, and the associated system of weakly interacting particles. We first establish consistency and asymptotic normality of the offline maximum likelihood estimator for the interacting particle system in the limit as the number of particles $N\rightarrow\infty$. We then propose an online estimator for the parameters of the McKean-Vlasov SDE, which evolves according to a continuous-time stochastic gradient descent algorithm on the asymptotic log-likelihood of the interacting particle system. We prove that this estimator converges in $\mathbb{L}^1$ to the stationary points of the asymptotic log-likelihood of the McKean-Vlasov SDE in the joint limit as $N\rightarrow\infty$ and $t\rightarrow\infty$, under suitable assumptions which guarantee ergodicity and uniform-in-time propagation of chaos. We then demonstrate, under the additional assumption of global strong concavity, that our estimator converges in $\mathbb{L}^2$ to the unique maximiser of this asymptotic log-likelihood function, and establish an $\mathbb{L}^2$ convergence rate. We also obtain analogous results under the assumption that, rather than observing multiple trajectories of the interacting particle system, we instead observe multiple independent replicates of the McKean-Vlasov SDE itself or, less realistically, a single sample path of the McKean-Vlasov SDE and its law. Our theoretical results are demonstrated via two numerical examples, a linear mean field model and a stochastic opinion dynamics model.
Mathematics
arxiv.org

Designing non-equilibrium states of quantum matter through stochastic resetting

We consider closed quantum many-body systems subject to stochastic resetting. This means that their unitary time evolution is interrupted by resets at randomly selected times. When a reset takes place the system is reinitialized to a state chosen from a set of reset states conditionally on the outcome of a measurement taken immediately before resetting. We construct analytically the resulting non-equilibrium stationary state, thereby establishing a novel connection between quantum quenches in closed systems and the emergent open system dynamics induced by stochastic resetting. We discuss as an application the paradigmatic transverse-field quantum Ising chain. We show that signatures of its ground-state quantum phase transition are visible in the steady state of the reset dynamics as a sharp crossover. Our findings show that a controlled stochastic resetting dynamics allows one to design non-equilibrium stationary states of quantum many-body systems, where uncontrolled dissipation and heating can be prevented. These states can thus be created on demand and exploited, e.g., as a resource for quantum enhanced sensing on quantum simulator platforms.
Mathematics
arxiv.org

A Computationally Efficient Hamilton-Jacobi-based Formula for State-Constrained Optimal Control Problems

This paper investigates a Hamilton-Jacobi (HJ) analysis to solve finite-horizon optimal control problems for high-dimensional systems. Although grid-based methods, such as the level-set method [1], numerically solve a general class of HJ partial differential equations, the computational complexity is exponential in the dimension of the continuous state. To manage this computational complexity, methods based on Lax-Hopf theory have been developed for the state-unconstrained optimal control problem under certain assumptions, such as affine dynamics and state-independent stage cost. Based on the Lax formula [2], this paper proposes an HJ formula for the state-constrained optimal control problem for nonlinear systems. We call this formula the generalized Lax formula for the optimal control problem. The HJ formula provides both the optimal cost and an optimal control signal. We also provide an efficient computational method for a class of problems for which the dynamics is affine in the state, and for which the stage and terminal cost, as well as the state constraints, are convex in the state. This class of problems does not require affine dynamics and convex stage cost in the control. This paper also provides three practical examples.

