Date: 2021-12-06
Learning-based synthesis of robust linear time-invariant controllers

By Marc-Antoine Beaudoin, Benoit Boulet
arxiv.org
 7 days ago

Recent advances in learning for control make it possible to synthesize controllers from learned system dynamics and maintain robust stability guarantees. However, no approach is well-suited for training linear time-invariant (LTI) controllers using arbitrary learned models of the dynamics. This article introduces a method to do so....

ScienceAlert

Physicists Confirm The Existence of Time Crystals in Epic Quantum Computer Simulation

Are you in the market for a loophole in the laws that forbid perpetual motion? Knowing you've got yourself an authentic time crystal takes more than a keen eye for high-quality gems. In a new study, an international team of researchers used Google's Sycamore quantum computing hardware to double-check their theoretical vision of a time crystal, confirming it ticks all of the right boxes for an emerging form of technology we're still getting our head around. Similar to conventional crystals made of endlessly repeating units of atoms, a time crystal is an infinitely repeating change in a system, one that remarkably doesn't require energy...
New Scientist

DeepMind AI helps study strange electrons in chemical reactions

Machine-learning tools have taken us closer to understanding electrons and how they behave in chemical interactions, following news that UK-based AI company DeepMind, owned by Google’s parent company Alphabet, has created a tool that solves a fundamental problem with how we model chemistry. The tool, called DeepMind 21, is...
arxiv.org

The Impact of Data Distribution on Fairness and Robustness in Federated Learning

Federated Learning (FL) is a distributed machine learning protocol that allows a set of agents to collaboratively train a model without sharing their datasets. This makes FL particularly suitable for settings where data privacy is desired. However, it has been observed that the performance of FL is closely related to the similarity of the local data distributions of agents. In particular, as the data distributions of agents differ, the accuracy of the trained models drops. In this work, we look at how variations in local data distributions affect the fairness and robustness properties of the trained models in addition to the accuracy. Our experimental results indicate that the trained models exhibit higher bias and become more susceptible to attacks as local data distributions differ. Importantly, the degradation in fairness and robustness can be much more severe than the degradation in accuracy. Therefore, we reveal that small variations that have little impact on the accuracy could still be important if the trained model is to be deployed in a fairness/security critical context.
arxiv.org

Learning from Mistakes based on Class Weighting with Application to Neural Architecture Search

Learning from mistakes is an effective learning approach widely used in human learning, where a learner pays greater attention to mistakes to avoid them in the future. It aids in improving the overall learning outcomes. In this work, we aim to investigate how effectively this exceptional learning ability can be used to improve machine learning models as well. We propose a simple and effective multi-level optimization framework called learning from mistakes (LFM), inspired by mistake-driven learning to train better machine learning models. Our LFM framework consists of a formulation involving three learning stages. The primary objective is to train a model to perform effectively on target tasks by using a re-weighting technique to prevent similar mistakes in the future. In this formulation, we learn the class weights by minimizing the validation loss of the model and re-train the model with the synthetic data from the image generator weighted by class-wise performance and real data. We apply our LFM framework to differentiable architecture search methods on image classification datasets such as CIFAR and ImageNet, where the results demonstrate the effectiveness of our proposed strategy.
arxiv.org

Permutationally invariant polynomial regression for energies and gradients, using reverse differentiation, achieves orders of magnitude speed-up with high precision compared to other machine learning methods

Permutationally invariant polynomial (PIP) regression has been used to obtain machine-learned (ML) potential energy surfaces, including analytical gradients, for many molecules and chemical reactions. Recently, the approach has been extended to moderate size molecules and applied to systems up to 15 atoms. The algorithm, including "purification of the basis", is computationally efficient for energies; however, we found that the recent extension to obtain analytical gradients, despite being a remarkable advance over previous methods, could be further improved. Here we report developments to further compact a purified basis and, more significantly, to use the reverse gradient approach to greatly speed up gradient evaluation. We demonstrate this for our recent 4-body water interaction potential. Comparisons of training and testing precision on the MD17 database of energies and gradients (forces) for ethanol against GP-SOAP, ANI, sGDML, PhysNet, pKREG, KRR, and other methods, which were recently assessed by Dral and co-workers, are given. The PIP fits are as precise as those using these methods, but the PIP computation time for energy and force evaluation is shown to be 10 to 1000 times faster. Finally, a new PIP PES is reported for ethanol based on a more extensive dataset of energies and gradients than in the MD17 database. Diffusion Monte Carlo calculations, which fail on MD17-based PESs, are successful using the new PES.
Photonics.com

Linear Positioning Stage

The LMS-XY linear positioning stage from Bold Laser Automation Inc. is an industrialized aluminum device suited for high-volume, high-speed precision ablation and cutting applications. The stage’s brushless and ironless linear driven motors provide ultra-smooth motion and are tuned to multiple loads with Tecnotion and Trilogy technology. Built for high-precision throughput,...
arxiv.org

Photonic Generation of Radar Signals with 30 GHz Bandwidth and Ultra-High Time-Frequency Linearity

Photonic generation of radio-frequency signals has shown significant advantages over its electronic counterparts, allowing the high-precision generation of radio-frequency carriers up to the terahertz-wave region with flexible bandwidth for radar applications. Great progress has been made in photonics-based radio-frequency waveform generation. However, the approaches that rely on sophisticated benchtop digital microwave components, such as synthesizers and digital-to-analog converters, have limited achievable bandwidth and thus limited resolution for radar detection. Methods based on voltage-controlled analog oscillators exhibit high time-frequency non-linearity, causing degraded sensing precision. Here, we demonstrate, for the first time, a photonic stepped-frequency (SF) waveform generation scheme enabled by MHz electronics with a tunable bandwidth exceeding 30 GHz and intrinsic time-frequency linearity. The ultra-wideband radio-frequency signal generation is enabled by using a polarization-stabilized optical cavity to suppress intra-cavity polarization-dependent instability; meanwhile, the signal's high linearity is achieved via consecutive MHz acousto-optic frequency-shifting modulation, without the need for electro-optic modulators that have bias-drifting issues. We systematically evaluate the system's signal quality and imaging performance in comparison with conventional photonic radar schemes that use high-speed digital electronics, confirming its feasibility and excellent performance for high-resolution radar applications.
arxiv.org

Machine-Learning-Based Exchange-Correlation Functional with Physical Asymptotic Constraints

Density functional theory is the standard theory for computing the electronic structure of materials, which is based on a functional that maps the electron density to the energy. However, a rigorous form of the functional is not known and has been heuristically constructed by interpolating asymptotic constraints known for extreme situations, such as isolated atoms and uniform electron gas. Recent studies have demonstrated that the functional can be effectively approximated using machine learning (ML) approaches. However, most ML models do not satisfy asymptotic constraints. In this study, by applying a novel ML model architecture, we demonstrate a neural network-based exchange-correlation functional satisfying physical asymptotic constraints. Calculations reveal that the trained functional is applicable to various materials with an accuracy higher than that of existing functionals, even for materials whose electronic properties are different from the properties of materials in the training dataset. Our proposed approach thus improves the accuracy and generalization performance of the ML-based functional by combining the advantages of ML and analytical modeling.
arxiv.org

DeepCQ+: Robust and Scalable Routing with Multi-Agent Deep Reinforcement Learning for Highly Dynamic Networks

Highly dynamic mobile ad-hoc networks (MANETs) remain one of the most challenging environments in which to develop and deploy robust, efficient, and scalable routing protocols. In this paper, we present the DeepCQ+ routing protocol, which integrates emerging multi-agent deep reinforcement learning (MADRL) techniques into existing Q-learning-based routing protocols and their variants in a novel manner, and achieves persistently higher performance across a wide range of topology and mobility configurations. While keeping the overall protocol structure of the Q-learning-based routing protocols, DeepCQ+ replaces statically configured parameterized thresholds and hand-written rules with carefully designed MADRL agents such that no configuration of such parameters is required a priori. Extensive simulation shows that DeepCQ+ yields significantly increased end-to-end throughput with lower overhead and no apparent degradation of end-to-end delays (hop counts) compared to its Q-learning-based counterparts. Qualitatively, and perhaps more significantly, DeepCQ+ maintains remarkably similar performance gains under many scenarios that it was not trained for in terms of network sizes, mobility conditions, and traffic dynamics. To the best of our knowledge, this is the first successful application of the MADRL framework to the MANET routing problem that demonstrates a high degree of scalability and robustness even in environments that are outside the trained range of scenarios. This implies that our MARL-based DeepCQ+ design significantly improves on the performance of the Q-learning-based CQ+ baseline and increases its practicality and explainability, because the real-world MANET environment will likely vary outside the trained range of MANET scenarios. Additional techniques to further increase the gains in performance and scalability are discussed.
arxiv.org

Improving the Robustness of Reinforcement Learning Policies with $\mathcal{L}_{1}$ Adaptive Control

A reinforcement learning (RL) control policy trained in a nominal environment could fail in a new/perturbed environment due to the existence of dynamic variations. For controlling systems with continuous state and action spaces, we propose an add-on approach to robustifying a pre-trained RL policy by augmenting it with an $\mathcal{L}_{1}$ adaptive controller ($\mathcal{L}_{1}$AC). Leveraging the capability of an $\mathcal{L}_{1}$AC for fast estimation and active compensation of dynamic variations, the proposed approach can improve the robustness of an RL policy which is trained either in a simulator or in the real world without consideration of a broad class of dynamic variations. Numerical and real-world experiments empirically demonstrate the efficacy of the proposed approach in robustifying RL policies trained using both model-free and model-based methods. A video for the experiments on a real Pendubot setup is available at https://youtu.be/xgOB9vpyUgE.
arxiv.org

The Linear Template Fit

A matrix formalism for the determination of the best estimator in certain simulation-based parameter estimation problems will be presented and discussed. The equations, termed the Linear Template Fit, combine a linear regression with a least-squares method and its optimization. The Linear Template Fit employs only predictions that are calculated beforehand and which are provided for a few values of the parameter of interest. Therefore, the Linear Template Fit is particularly suited for parameter estimation with computationally intensive simulations that are otherwise often limited in their usability for statistical inference, or for performance-critical applications. Equations for error propagation are discussed, and the analytic form provides comprehensive insights into the parameter estimation problem. Furthermore, the quickly converging algorithm of the Quadratic Template Fit will be presented, which is suitable for a non-linear dependence on the parameters. As an example application, a determination of the strong coupling constant, $\alpha_s(m_Z)$, from inclusive jet cross section data at the CERN Large Hadron Collider is studied and compared with previously published results.
arxiv.org

TridentAdapt: Learning Domain-invariance via Source-Target Confrontation and Self-induced Cross-domain Augmentation

Due to the difficulty of obtaining ground-truth labels, learning from virtual-world datasets is of great interest for real-world applications like semantic segmentation. From a domain adaptation perspective, the key challenge is to learn a domain-agnostic representation of the inputs in order to benefit from virtual data. In this paper, we propose a novel trident-like architecture that forces a shared feature encoder to satisfy confrontational source and target constraints simultaneously, thus learning a domain-invariant feature space. Moreover, we introduce a novel training pipeline enabling self-induced cross-domain data augmentation during the forward pass. This contributes to a further reduction of the domain gap. Combined with a self-training process, we obtain state-of-the-art results on benchmark datasets (e.g. GTA5 or Synthia to Cityscapes adaptation). Code and pre-trained models are available at this https URL.
aithority.com

SkyPoint Cloud Launches SkyPoint Resolve, Machine Learning-Based Identity Resolution

New product enables businesses to have a precise, up-to-date understanding of all customers. SkyPoint Cloud, the privacy-first customer data platform that enables consumer and healthcare brands to build deeper relationships with their customers, announced the launch of SkyPoint Resolve, an affordable and scalable identity resolution SaaS product powered by machine learning, which provides consistent customer profiles, fraud detection and compliance.
arxiv.org

Online Robust Control of Linear Dynamical Systems with Prediction

We address the online robust control problem of a linear dynamical system with adversarial cost functions and adversarial disturbances. The goal is to find an online control policy that minimizes the disturbance gain, defined as the ratio of the cumulative cost and the cumulative energy in the disturbances. This problem is similar to the well-studied $\mathcal{H}_{\infty}$ problem in the robust control literature. However, unlike the standard $\mathcal{H}_{\infty}$ problem, where the cost functions are quadratic and fully known, we consider a more challenging online control setting where the cost functions are general and unknown a priori. We first propose an online robust control algorithm for the setting where the algorithm has access to an $N$-length preview of the future cost functions and future disturbances. We show that, under standard system assumptions, with $N$ greater than a threshold, the proposed algorithm can achieve a disturbance gain $(2+\rho(N)) \overline{\gamma}^2$, where $\overline{\gamma}^2$ is the best (minimum) possible disturbance gain for an oracle policy with full knowledge of the cost functions and disturbances, with $\rho(N) = O(1/N)$. We then propose an online robust control algorithm for a more challenging setting where only the preview of the cost functions is available. We show that under similar assumptions, with $N$ greater than the same threshold, the proposed algorithm achieves a disturbance gain of $6\overline{\gamma}^2$ with respect to the maximum cumulative energy in the disturbances.
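The disturbance gain defined in the abstract above, the ratio of cumulative cost to cumulative disturbance energy, can be illustrated with a minimal sketch. The function name, the sample numbers, and the choice of squared Euclidean norm as the "energy" of a disturbance are assumptions made for illustration; the abstract does not specify them.

```python
def disturbance_gain(costs, disturbances):
    """Ratio of cumulative cost to cumulative disturbance energy.

    costs: per-step cost values c_t (floats).
    disturbances: per-step disturbance vectors w_t (lists of floats).
    Energy is taken as the squared Euclidean norm of each w_t (an assumption).
    """
    total_cost = sum(costs)
    total_energy = sum(sum(w_i * w_i for w_i in w) for w in disturbances)
    if total_energy == 0:
        raise ValueError("disturbance energy is zero; the gain is undefined")
    return total_cost / total_energy


# Hypothetical three-step trajectory.
costs = [1.0, 2.0, 3.0]
disturbances = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gain = disturbance_gain(costs, disturbances)  # 6.0 / 4.0
```

A policy with a smaller gain incurs less cost per unit of disturbance energy, which is the quantity the guarantees in the abstract bound relative to the oracle value $\overline{\gamma}^2$.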
arxiv.org

Active Learning for Domain Adaptation: An Energy-based Approach

Unsupervised domain adaptation has recently emerged as an effective paradigm for generalizing deep neural networks to new target domains. However, there is still enormous potential to be tapped to reach fully supervised performance. In this paper, we present a novel active learning strategy to assist knowledge transfer in the target domain, dubbed active domain adaptation. We start from the observation that energy-based models exhibit free energy biases when training (source) and test (target) data come from different distributions. Inspired by this inherent mechanism, we empirically show that a simple yet efficient energy-based sampling strategy selects more valuable target samples than existing approaches that require particular architectures or the computation of distances. Our algorithm, Energy-based Active Domain Adaptation (EADA), queries groups of target data that incorporate both domain characteristics and instance uncertainty in every selection round. Meanwhile, by compactly aligning the free energy of target data around the source domain via a regularization term, the domain gap can be implicitly diminished. Through extensive experiments, we show that EADA surpasses state-of-the-art methods on well-known challenging benchmarks with substantial improvements, making it a useful option in the open world. Code is available at this https URL.
arxiv.org

Learning Robust Recommender from Noisy Implicit Feedback

The ubiquity of implicit feedback makes it indispensable for building recommender systems. However, it does not necessarily reflect the actual satisfaction of users. For example, in e-commerce, a large portion of clicks do not translate to purchases, and many purchases end up with negative reviews. As such, it is important to account for the inevitable noise in implicit feedback. However, little work on recommendation has taken the noisy nature of implicit feedback into consideration. In this work, we explore the central theme of denoising implicit feedback for recommender learning, including training and inference. By observing the process of normal recommender training, we find that noisy feedback typically has large loss values in the early stages. Inspired by this observation, we propose a new training strategy named Adaptive Denoising Training (ADT), which adaptively prunes the noisy interactions by two paradigms (i.e., Truncated Loss and Reweighted Loss). Furthermore, we consider extra feedback (e.g., rating) as an auxiliary signal and propose three strategies to incorporate extra feedback into ADT: finetuning, warm-up training, and colliding inference. We instantiate the two paradigms on the widely used binary cross-entropy loss and test them on three representative recommender models. Extensive experiments on three benchmarks demonstrate that ADT significantly improves the quality of recommendation over normal training without using extra feedback. Besides, the proposed three strategies for using extra feedback largely enhance the denoising ability of ADT.
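The idea of pruning large-loss interactions can be sketched as a truncated binary cross-entropy. This is only an illustration, not the authors' ADT implementation: the fixed `drop_rate`, the function names, and the sample predictions are all assumptions, and the abstract describes the pruning as adaptive rather than fixed.

```python
import math


def bce(pred, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12
    pred = min(max(pred, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(pred) + (1 - label) * math.log(1.0 - pred))


def truncated_loss(preds, labels, drop_rate):
    """Mean BCE after discarding the largest-loss fraction of samples.

    drop_rate is a hypothetical fixed fraction in [0, 1).
    Returns (mean loss over kept samples, number of samples dropped).
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    losses = [bce(p, y) for p, y in zip(preds, labels)]
    n_drop = int(drop_rate * len(losses))
    kept = sorted(losses)[: len(losses) - n_drop]  # keep the smallest losses
    return sum(kept) / len(kept), n_drop


# Four observed clicks (label 1); the third has a large loss and is treated as noisy.
preds = [0.9, 0.8, 0.1, 0.7]
labels = [1, 1, 1, 1]
full_loss, _ = truncated_loss(preds, labels, 0.0)
pruned_loss, n_dropped = truncated_loss(preds, labels, 0.25)
```

With a quarter of the batch dropped, the single large-loss interaction no longer contributes to the training signal, so `pruned_loss` is lower than `full_loss`.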
arxiv.org

Robust and Adaptive Temporal-Difference Learning Using An Ensemble of Gaussian Processes

Value function approximation is a crucial module for policy evaluation in reinforcement learning when the state space is large or continuous. The present paper takes a generative perspective on policy evaluation via temporal-difference (TD) learning, where a Gaussian process (GP) prior is presumed on the sought value function, and instantaneous rewards are probabilistically generated based on value function evaluations at two consecutive states. Capitalizing on a random feature-based approximant of the GP prior, an online scalable (OS) approach, termed OS-GPTD, is developed to estimate the value function for a given policy by observing a sequence of state-reward pairs. To benchmark the performance of OS-GPTD even in an adversarial setting, where the modeling assumptions are violated, complementary worst-case analyses are performed by upper-bounding the cumulative Bellman error as well as the long-term reward prediction error, relative to their counterparts from a fixed value function estimator with the entire state-reward trajectory in hindsight. Moreover, to alleviate the limited expressiveness associated with a single fixed kernel, a weighted ensemble (E) of GP priors is employed to yield an alternative scheme, termed OS-EGPTD, that can jointly infer the value function and interactively select the EGP kernel on the fly. Finally, the performance of the novel OS-(E)GPTD schemes is evaluated on two benchmark problems.
arxiv.org

Approximations of interface topological invariants

This paper concerns continuous models for two-dimensional topological insulators and superconductors. Such systems are characterized by asymmetric transport along a $1$-dimensional curve representing the interface between two insulating materials. The asymmetric transport is quantified by an interface conductivity. Our first objective is to prove that the conductivity is quantized and stable with respect to a large class of perturbations, and to relate it to an integral involving the symbol of the system's Hamiltonian; this is a bulk-interface correspondence.
arxiv.org

Differentially Private Exploration in Reinforcement Learning with Linear Representation

This paper studies privacy-preserving exploration in Markov Decision Processes (MDPs) with linear representation. We first consider the setting of linear-mixture MDPs (Ayoub et al., 2020) (a.k.a. model-based setting) and provide a unified framework for analyzing joint and local differentially private (DP) exploration. Through this framework, we prove a $\widetilde{O}(K^{3/4}/\sqrt{\epsilon})$ regret bound for $(\epsilon,\delta)$-local DP exploration and a $\widetilde{O}(\sqrt{K/\epsilon})$ regret bound for $(\epsilon,\delta)$-joint DP.
