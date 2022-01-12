
Dyna-T: Dyna-Q and Upper Confidence Bounds Applied to Trees

By Tarek Faycal, Claudio Zito
arxiv.org
 3 days ago

In this work we present a preliminary investigation of a novel algorithm called Dyna-T. In reinforcement learning (RL) a planning agent has its own representation of the environment as a model. To discover an optimal policy to interact with the environment, the...
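For readers unfamiliar with the Dyna-Q half of the name, the idea of an agent that plans with its own learned model can be sketched as follows. This is a minimal tabular Dyna-Q loop in the textbook style, not the Dyna-T algorithm from the paper; the function names, the default hyper-parameters, and the five-state corridor environment `chain_step` are all assumptions made up for illustration.

```python
import random
from collections import defaultdict


def dyna_q(step, start, actions, episodes=50, planning_steps=10,
           alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Dyna-Q: learn from real steps, then replay the learned model."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q-values keyed by (state, action)
    model = {}              # learned model: (state, action) -> (reward, next_state, done)

    def greedy(s):
        # Pick a highest-valued action, breaking ties at random.
        best = max(q[(s, a)] for a in actions)
        return rng.choice([a for a in actions if q[(s, a)] == best])

    def update(s, a, r, s2, done):
        # One-step Q-learning backup.
        target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s, done = start, False
        while not done:
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            r, s2, done = step(s, a)
            update(s, a, r, s2, done)        # direct RL from real experience
            model[(s, a)] = (r, s2, done)    # model learning (deterministic world assumed)
            for _ in range(planning_steps):  # planning from simulated experience
                ps, pa = rng.choice(list(model))
                update(ps, pa, *model[(ps, pa)])
            s = s2
    return q


def chain_step(state, action):
    """Hypothetical corridor with states 0..4: action 1 moves right, 0 moves left.

    Reaching state 4 yields reward 1 and ends the episode.
    """
    nxt = min(4, state + 1) if action == 1 else max(0, state - 1)
    return (1.0 if nxt == 4 else 0.0), nxt, nxt == 4
```

Calling `dyna_q(chain_step, start=0, actions=[0, 1])` returns the learned Q-table; in this toy corridor, the value of moving right from state 3 ends up above the value of moving left.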

arxiv.org

The Associated Press

Third-Party Analysis Illustrates Dramatic Capture and Cleaning of Exhaled Air with New Air-Clenz™ Computer Monitor To Help Curtail Spread of Airborne Respiratory Particles, including COVID

Air-Clenz Systems™ (Air-Clenz™) today announced the results of an independent third-party analysis of the effectiveness of the Air-Clenz Computer Monitor. These simulations showed that the Air-Clenz monitor quickly captured and cleaned 95+% of the user’s exhaled air. And when compared side-by-side to that of a conventional computer monitor, identically positioned, the difference was visibly dramatic.
PUBLIC HEALTH
arxiv.org

Modular Welch Bounds with Applications

We prove the following two results. (1) Let $\mathcal{A}$ be a unital commutative C*-algebra and $\mathcal{A}^d$ be the standard Hilbert C*-module over $\mathcal{A}$. Let $n\geq d$. If $\{\tau_j\}_{j=1}^n$ is any collection of vectors in $\mathcal{A}^d$ such that $\langle \tau_j, \tau_j \rangle =1$, $\forall 1\leq j \leq n$, then...
MATHEMATICS
arxiv.org

Upper bounds for the probability of unusually small components in critical random graphs

We describe a methodology, mostly based on an estimate for the probability that a (mean zero) $\mathbb{Z}$-valued random walk remains below a constant barrier over a finite time interval and Kolmogorov's inequality, to derive upper bounds for the probability of observing unusually small maximal components in two classical random graph models when considered near criticality. Specifically, we consider the random graph $\mathbb{G}(n,d,p)$ obtained by performing $p$-bond percolation on a $d$-regular graph selected uniformly at random from the set of all simple $d$-regular graphs on $n$ vertices, as well as the Erdős-Rényi random graph $\mathbb{G}(n,p)$, and show that, near criticality, in both models the probability of observing a largest component containing fewer than $n^{2/3}/A$ vertices decays as $A^{-\epsilon}$ for some $\epsilon>0$. Even though this result is not new, our approach is quite robust and illustrates a general strategy that works for both models. Moreover, it allows us to provide a shorter analysis for the $\mathbb{G}(n,d,p)$ model than the one available in the literature.
SCIENCE
arxiv.org

Neutrino bound states and bound systems

Yukawa interactions of neutrinos with a new light scalar boson $\phi$ can lead to the formation of stable bound states and bound systems of many neutrinos ($\nu$-clusters). For allowed values of the coupling $y$ and the scalar mass $m_\phi$, the bound state of two neutrinos would have a size larger than $10^{12}$ cm. Bound states with sub-cm sizes are possible for keV-scale sterile neutrinos with coupling $y > 10^{-4}$. For the $\nu$-clusters we study in detail the properties of the final stable configurations. If there is an efficient cooling mechanism, these configurations are in the state of a degenerate Fermi gas. We formulate and solve equations to describe the density distributions in $\nu$-clusters for different values of the total number of neutrinos, $N$. In the non-relativistic case, they reduce to the Lane-Emden equation. We find that (i) stable configurations exist for any number of neutrinos; (ii) there is a maximal central density $\sim 10^9$ cm$^{-3}$ determined by the neutrino mass; (iii) for a given $m_\phi$ there is a minimal value of $Ny^3$ for which stable configurations can be formed; (iv) for a given strength of interaction ($\propto y/m_\phi$), a minimal radius of $\nu$-clusters exists. We discuss the formation of the $\nu$-clusters from the relic neutrino background in the process of expansion and cooling of the Universe. One possibility is the development of instabilities in the $\nu$-background at $T < m_\nu$, which leads to its fragmentation. Another is the growth of initial density perturbations in the $\nu$-background and virialization, in analogy with the formation of Dark Matter halos. For allowed values of $y$, cooling of $\nu$-clusters due to $\phi$-bremsstrahlung and neutrino annihilation is negligible. $\nu$-clusters can be formed with sizes ranging from $\sim$ km to $\sim 10$ Mpc.
SCIENCE
arxiv.org

Solving Dynamic Graph Problems with Multi-Attention Deep Reinforcement Learning

Graph problems such as the traveling salesman problem or finding minimal Steiner trees are widely studied and used in data engineering and computer science. Typically, in real-world applications, the features of the graph tend to change over time; thus, finding a solution to the problem becomes challenging. The dynamic versions of many graph problems are key to a plethora of real-world problems in transportation, telecommunication, and social networks. In recent years, using deep learning techniques to find heuristic solutions for NP-hard graph combinatorial problems has gained much interest, as these learned heuristics can find near-optimal solutions efficiently. However, most of the existing methods for learning heuristics focus on static graph problems. The dynamic nature makes NP-hard graph problems much more challenging to learn, and the existing methods fail to find reasonable solutions.
COMPUTER SCIENCE
arxiv.org

Privacy Amplification by Subsampling in Time Domain

Aggregate time-series data like traffic flow and site occupancy repeatedly sample statistics from a population across time. Such data can be profoundly useful for understanding trends within a given population, but also pose a significant privacy risk, potentially revealing e.g., who spends time where. Producing a private version of a time-series satisfying the standard definition of Differential Privacy (DP) is challenging due to the large influence a single participant can have on the sequence: if an individual can contribute to each time step, the amount of additive noise needed to satisfy privacy increases linearly with the number of time steps sampled. As such, if a signal spans a long duration or is oversampled, an excessive amount of noise must be added, drowning out underlying trends. However, in many applications an individual realistically cannot participate at every time step. When this is the case, we observe that the influence of a single participant (sensitivity) can be reduced by subsampling and/or filtering in time, while still meeting privacy requirements. Using a novel analysis, we show this significant reduction in sensitivity and propose a corresponding class of privacy mechanisms. We demonstrate the utility benefits of these techniques empirically with real-world and synthetic time-series data.
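The linear growth of noise described in the abstract can be made concrete with the standard Laplace mechanism. The sketch below covers only that baseline, not the subsampling or filtering mechanisms the paper proposes; the function names and the `per_step_sensitivity` parameter are assumptions for illustration. It relies on the fact that if one person can change each of $T$ released values by at most `per_step_sensitivity`, the L1 sensitivity of the whole series is $T$ times that amount.

```python
import random


def laplace_scale(num_steps, per_step_sensitivity=1.0, epsilon=1.0):
    """Laplace noise scale for releasing a whole series under epsilon-DP.

    Assumes one individual may contribute to every time step, so the
    L1 sensitivity of the series is num_steps * per_step_sensitivity.
    """
    return num_steps * per_step_sensitivity / epsilon


def privatize(series, epsilon=1.0, per_step_sensitivity=1.0, seed=0):
    """Add independent Laplace noise to every time step of the series."""
    rng = random.Random(seed)
    scale = laplace_scale(len(series), per_step_sensitivity, epsilon)
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    return [x + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
            for x in series]
```

Under these assumptions, a series that is ten times longer needs noise with ten times the scale at every step, which is why long or oversampled signals are drowned out.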
TECHNOLOGY
towardsdatascience.com

Build a Q&A App with PyTorch

How to easily deploy a QA HuggingFace model using Docker and FastAPI. In the last few years a breadth of pre-trained models have been made available from computer vision to natural language processing, with some of the most well known aggregators being Model Zoo, TensorFlow Hub and HuggingFace. The availability...
PREMIER LEAGUE
arxiv.org

Automated Reinforcement Learning: An Overview

Reinforcement Learning and, more recently, Deep Reinforcement Learning are popular methods for solving sequential decision making problems modeled as Markov Decision Processes. RL modeling of a problem and selecting algorithms and hyper-parameters require careful consideration, as different configurations may entail completely different performances. These considerations are mainly the task of RL experts; however, RL is progressively becoming popular in other fields where the researchers and system designers are not RL experts. Besides, many modeling decisions, such as defining the state and action spaces, the size of batches and frequency of batch updating, and the number of timesteps, are typically made manually. For these reasons, automating different components of the RL framework is of great importance and has attracted much attention in recent years. Automated RL provides a framework in which different components of RL, including MDP modeling, algorithm selection and hyper-parameter optimization, are modeled and defined automatically. In this article, we explore the literature and present recent work that can be used in automated RL. Moreover, we discuss the challenges, open questions and research directions in AutoRL.
COMPUTERS
arxiv.org

Criticality-Based Varying Step-Number Algorithm for Reinforcement Learning

In the context of reinforcement learning we introduce the concept of criticality of a state, which indicates the extent to which the choice of action in that particular state influences the expected return. That is, a state in which the choice of action is more likely to influence the final outcome is considered as more critical than a state in which it is less likely to influence the final outcome.
CODING & PROGRAMMING
arxiv.org

Equivalence between fermion-to-qubit mappings in two spatial dimensions

We argue that all locality-preserving mappings between fermionic observables and Pauli matrices on a two-dimensional lattice can be generated from the exact bosonization in Ref. [1], whose gauge constraints project onto the subspace of the toric code with emergent fermions. Starting from the exact bosonization and applying Clifford finite-depth generalized local unitary (gLU) transformation, we can achieve all possible fermion-to-qubit mappings (up to the re-pairing of Majorana fermions). In particular, we discover a new super-compact encoding using 1.25 qubits per fermion on the square lattice, which is lower than any method in the literature. We prove the existence of fermion-to-qubit mappings with qubit-fermion ratios $r=1+ \frac{1}{2k}$ for positive integers $k$, where the proof utilizes the trivialness of quantum cellular automata (QCA) in two spatial dimensions. When the ratio approaches 1, the fermion-to-qubit mapping reduces to the 1d Jordan-Wigner transformation along a certain path in the two-dimensional lattice. Finally, we explicitly demonstrate that the Bravyi-Kitaev superfast simulation, the Verstraete-Cirac auxiliary method, Kitaev's exactly solved model, the Majorana loop stabilizer codes, and the compact fermion-to-qubit mapping can all be obtained from the exact bosonization.
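As a quick arithmetic check of the ratio formula quoted above, the snippet below evaluates $r = 1 + \frac{1}{2k}$ for the first few $k$. The function name is hypothetical, and the code only evaluates the formula; it makes no claim about how the paper constructs each encoding.

```python
from fractions import Fraction


def qubit_fermion_ratio(k):
    """Evaluate r = 1 + 1/(2k) exactly for a positive integer k."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return 1 + Fraction(1, 2 * k)


for k in (1, 2, 3, 10, 100):
    r = qubit_fermion_ratio(k)
    print(k, r, float(r))
```

The value at $k=2$ is $5/4 = 1.25$, which matches the qubits-per-fermion figure quoted for the super-compact encoding, and the values decrease toward 1 as $k$ grows.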
MATHEMATICS
arxiv.org

Neural Koopman Lyapunov Control

Learning and synthesizing stabilizing controllers for unknown nonlinear systems is a challenging problem for real-world and industrial applications. Koopman operator theory allows one to analyze nonlinear systems through the lens of linear systems and nonlinear control systems through the lens of bilinear control systems. The key idea of these methods lies in the transformation of the coordinates of the nonlinear system into the Koopman observables, which are coordinates that allow the representation of the original system (control system) as a higher dimensional linear (bilinear control) system. However, for nonlinear control systems, the bilinear control model obtained by applying Koopman operator based learning methods is not necessarily stabilizable, and therefore the existence of a stabilizing feedback control, which is crucial for many real-world applications, is not guaranteed. Simultaneous identification of these stabilizable Koopman based bilinear control systems as well as the associated Koopman observables is still an open problem. In this paper, we propose a framework to identify and construct these stabilizable bilinear models and their associated observables from data by simultaneously learning a bilinear Koopman embedding for the underlying unknown nonlinear control system as well as a Control Lyapunov Function (CLF) for the Koopman based bilinear model using a learner and falsifier. Our proposed approach thereby provides provable guarantees of global asymptotic stability for the nonlinear control systems with unknown dynamics. Numerical simulations are provided to validate the efficacy of our proposed class of stabilizing feedback controllers for unknown nonlinear systems.
ENGINEERING
arxiv.org

Benchmarking Problems for Robust Discrete Optimization

Robust discrete optimization is a highly active field of research where a plenitude of combinations between decision criteria, uncertainty sets and underlying nominal problems are considered. Usually, a robust problem becomes harder to solve than its nominal counterpart, even if it remains in the same complexity class. For this reason, specialized solution algorithms have been developed. To further drive the development of stronger solution algorithms and to facilitate the comparison between methods, a set of benchmark instances is necessary but so far missing. In this paper we take a further step towards this goal by proposing several instance generation procedures for combinations of min-max, min-max regret, two-stage and recoverable robustness with interval, discrete or budgeted uncertainty sets. Besides sampling methods that go beyond the simple uniform sampling method that is the de-facto standard to produce instances, optimization models to construct hard instances are also considered. Using a selection problem for the nominal ground problem, we are able to generate instances that are several orders of magnitude harder to solve than uniformly sampled instances when solving them with a general mixed-integer programming solver. All instances and generator codes are made available online.
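To illustrate the uniform-sampling baseline the abstract refers to, here is a minimal sketch for the simplest combination it mentions: a selection problem (pick $p$ of $n$ items at minimum total cost) under min-max robustness with interval uncertainty. The function names, the cost range of 1 to 100, and the way the interval bounds are drawn are assumptions for illustration, not the paper's generators. The solver uses the fact that, for nonnegative costs, the worst case of any selection is attained when every cost sits at its upper bound.

```python
import random


def sample_interval_instance(n, seed=0, low=1, high=100):
    """Draw n cost intervals (lower, upper) with uniformly sampled endpoints."""
    rng = random.Random(seed)
    intervals = []
    for _ in range(n):
        a, b = rng.randint(low, high), rng.randint(low, high)
        intervals.append((min(a, b), max(a, b)))
    return intervals


def min_max_selection(intervals, p):
    """Pick p items minimizing the worst-case total cost over the intervals.

    With interval uncertainty the worst case sets every cost to its upper
    bound, so it suffices to take the p smallest upper bounds.
    """
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][1])
    chosen = sorted(order[:p])
    return chosen, sum(intervals[i][1] for i in chosen)
```

For example, `min_max_selection([(1, 9), (2, 3), (4, 5)], 2)` selects items 1 and 2 with worst-case cost 8. Instances like this are trivial to solve, which is consistent with the abstract's motivation for harder generators in the other robustness settings.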
CODING & PROGRAMMING

