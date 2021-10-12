
Information Theoretic Structured Generative Modeling

By Bo Hu, Shujian Yu, Jose C. Principe
arxiv.org
 10 days ago

Rényi's information provides a theoretical foundation for tractable and data-efficient non-parametric density estimation, based on pair-wise evaluations in a reproducing kernel Hilbert space (RKHS). This paper extends this framework to parametric probabilistic modeling, motivated by the fact that Rényi's information can be estimated in closed-form for Gaussian mixtures. Based on this

arxiv.org

Exploring causal physical mechanisms via non-gaussian linear models and deep kernel learning: applications for ferroelectric domain structures

The rapid emergence of multimodal imaging in scanning probe, electron, and optical microscopies has brought forth the challenge of understanding the information contained in these complex data sets, targeting both the intrinsic correlations between different channels and the underpinning causal physical mechanisms. Here, we develop such an analysis framework for Piezoresponse Force Microscopy (PFM). We argue that under certain conditions, we can bootstrap experimental observations with prior knowledge of materials structure to get information on certain non-observed properties, and demonstrate linear causal analysis for PFM observables. We further demonstrate that this approach can be extended to complex descriptors using the deep kernel learning (DKL) model. In this DKL analysis, we use the prior information on domain structure within the image to predict the physical properties. This analysis demonstrates the correlative relationships between morphology, piezoresponse, elastic property, etc. at the nanoscale. The prediction of morphology and other physical parameters illustrates a mutual interaction between surface condition and physical properties in ferroelectric materials. This analysis is universal and can be extended to explore the correlative relationships of other multi-channel datasets.
PHYSICS
arxiv.org

Discriminative Multimodal Learning via Conditional Priors in Generative Models

Deep generative models with latent variables have been used lately to learn joint representations and generative processes from multi-modal data. These two learning mechanisms can, however, conflict with each other, and representations can fail to embed information on the data modalities. This research studies the realistic scenario in which all modalities and class labels are available for model training, but where some modalities and labels required for downstream tasks are missing. We show that, in this scenario, the variational lower bound limits mutual information between joint representations and missing modalities. To counteract these problems, we introduce a novel conditional multi-modal discriminative model that uses an informative prior distribution and optimizes a likelihood-free objective function that maximizes mutual information between joint representations and missing modalities. Extensive experimentation shows the benefits of the proposed model: it achieves state-of-the-art results in representative problems such as downstream classification, acoustic inversion and annotation generation.
SCIENCE
bnl.gov

EIC User Profile: Theoretical Nuclear and Particle Physicist Jennifer Rittenhouse West

Theoretical nuclear and particle physicist and postdoctoral fellow at Lawrence Berkeley National Laboratory and the EIC Center @JLab uses the solidity of mathematics to explore fundamental questions where Nature has the final say. Jennifer Rittenhouse West, a theoretical nuclear and particle physicist and postdoctoral fellow at Lawrence Berkeley National Laboratory...
COMPUTERS
arxiv.org

Understanding Model Robustness to User-generated Noisy Texts

Sensitivity of deep-neural models to input noise is known to be a challenging problem. In NLP, model performance often deteriorates with naturally occurring noise, such as spelling errors. To mitigate this issue, models may leverage artificially noised data. However, the amount and type of generated noise have so far been determined arbitrarily. We therefore propose to model the errors statistically from grammatical-error-correction corpora. We present a thorough evaluation of several state-of-the-art NLP systems' robustness in multiple languages, with tasks including morpho-syntactic analysis, named entity recognition, neural machine translation, a subset of the GLUE benchmark and reading comprehension. We also compare two approaches to address the performance drop: a) training the NLP models with noised data generated by our framework; and b) reducing the input noise with an external system for natural language correction. The code is released at this https URL.
COMPUTERS
technologynetworks.com

3D Modeling Analyzes How Neural Networks Process Information

Creating human-like AI is about more than mimicking human behaviour – technology must also be able to process information, or ‘think’, like humans too if it is to be fully relied upon. New research, published in the journal Patterns and led by the University of Glasgow’s School of Psychology and...
CODING & PROGRAMMING
arxiv.org

Mutual cooperation and tolerance to defection in the context of socialization: the theoretical model and experimental evidence

The study of the nature of human cooperation still contains gaps needing investigation. Previous findings reveal that socialization effectively promotes cooperation in the well-known Prisoner's dilemma (PD) game. However, theoretical concepts fail to describe the high levels of cooperation (probability higher than 50%) that were observed empirically. In this paper, we derive a symmetrical quantal response equilibrium (QRE) in PD in Markov strategies and test it against experimental data. Our results indicate that for low levels of rationality, QRE manages to describe high cooperation. In contrast, for high rationality QRE converges to the Nash equilibrium and describes low-cooperation behavior of participants. In the area of middle rationality, QRE matches the curve that represents the set of Nash equilibria in Markov strategies. Further, we find that QRE serves as a dividing line between behavior before and after socialization, according to the experimental data. Finally, we successfully highlight the theoretically predicted intersection of the set of Nash equilibria in Markov strategies and the QRE curve.
SCIENCE
ScienceAlert

A Physicist Quantified The Amount of Information in The Entire Observable Universe

In attempts to understand the very nature of our reality, physicists sure have some mind-bending theories. Like what if information is a tangible and fundamental aspect of physical reality itself – alongside matter and energy? Or, alternatively, what if information is the fifth state of matter? Information is, after all, something all matter and energy measurably possess. The rules that govern their existence, like their mass, speed, or charge, are all bits of information they contain. So to allow experimental probing of such ideas, physicist Melvin Vopson from the University of Portsmouth in the UK estimated how much information a single elementary...
ASTRONOMY
arxiv.org

Theoretically Principled Deep RL Acceleration via Nearest Neighbor Function Approximation

Recently, deep reinforcement learning (RL) has achieved remarkable empirical success by integrating deep neural networks into RL frameworks. However, these algorithms often require a large number of training samples and admit little theoretical understanding. To mitigate these issues, we propose a theoretically principled nearest neighbor (NN) function approximator that can improve the value networks in deep RL methods. Inspired by human similarity judgments, the NN approximator estimates the action values using rollouts on past observations and can provably obtain a small regret bound that depends only on the intrinsic complexity of the environment. We present (1) Nearest Neighbor Actor-Critic (NNAC), an online policy gradient algorithm that demonstrates the practicality of combining function approximation with deep RL, and (2) a plug-and-play NN update module that aids the training of existing deep RL methods. Experiments on classical control and MuJoCo locomotion tasks show that the NN-accelerated agents achieve higher sample efficiency and stability than the baseline agents. Based on its theoretical benefits, we believe that the NN approximator can be further applied to other complex domains to speed-up learning.
COMPUTERS
arxiv.org

An Interacting Neuronal Network with Inhibition: theoretical analysis and perfect simulation

We study a purely inhibitory neural network model where neurons are represented by their state of inhibition. The study we present here is partially based on the work of Cottrell \cite{Cot} and Fricker et al. \cite{FRST}. The spiking rate of a neuron depends only on its state of inhibition. When a neuron spikes, its state is replaced by a random new state, independently of anything else, and the inhibition states of the other neurons increase by a positive value. Using the Perron-Frobenius theorem, we show the existence of a Lyapunov function for the process. Furthermore, we prove a local Doeblin condition which implies the existence of an invariant measure for the process. Finally, we extend our model to the case where the neurons are indexed by $\mathbb{Z}$. We construct a perfect simulation algorithm to show the recurrence of the process under certain conditions. To do this, we rely on the classical contour technique used in the study of contact processes, and assuming that the spiking rate lies in the interval $[\beta_*, \beta^*]$, we show that there is a critical threshold for the ratio $\delta = \frac{\beta_*}{\beta^* - \beta_*}$ over which the process is ergodic. Keywords: spiking rate, interacting neurons, perfect simulation algorithm, classical contour technique.
SCIENCE
Business Insider

Facebook is working on AI tech that will monitor your every move

Facebook envisions a future where smartglasses "become as useful in everyday life as smartphones," the company said in a new blog post. In order to achieve that future, such devices will require powerful AI software that can read and respond to the world around the headset's user. And the only way to train AI to see and hear the world like humans do is for it to experience the world like we do: from a first-person perspective.
INTERNET
arxiv.org

Model order reduction for bifurcating phenomena in Fluid-Structure Interaction problems

This work explores the development and the analysis of an efficient reduced order model for the study of a bifurcating phenomenon, known as the Coandă effect, in a multi-physics setting involving fluid and solid media. Taking into consideration a Fluid-Structure Interaction problem, we aim at generalizing previous works towards a more reliable description of the physics involved. In particular, we provide several insights on how the introduction of an elastic structure influences the bifurcating behaviour. We have addressed the computational burden by developing a reduced order branch-wise algorithm based on a monolithic Proper Orthogonal Decomposition. We compared different constitutive relations for the solid, and we observed that a nonlinear hyper-elastic law delays the bifurcation w.r.t. the standard model, while the same effect is even magnified when considering a linear elastic solid.
MATHEMATICS
arxiv.org

Attention-guided Generative Models for Extractive Question Answering

We propose a novel method for applying Transformer models to extractive question answering (QA) tasks. Recently, pretrained generative sequence-to-sequence (seq2seq) models have achieved great success in question answering. Contributing to the success of these models are internal attention mechanisms such as cross-attention. We propose a simple strategy to obtain an extractive answer span from the generative model by leveraging the decoder cross-attention patterns. Viewing cross-attention as an architectural prior, we apply joint training to further improve QA performance. Empirical results show that on open-domain question answering datasets like NaturalQuestions and TriviaQA, our method approaches state-of-the-art performance on both generative and extractive inference, all while using far fewer parameters. Furthermore, this strategy allows us to perform hallucination-free inference while conferring significant improvements to the model's ability to rerank relevant passages.
COMPUTERS
arxiv.org

Automatic Generation of Grover Quantum Oracles for Arbitrary Data Structures

Raphael Seidel, Colin Kai-Uwe Becker, Sebastian Bock, Nikolay Tcholtchev, Ilie-Daniel Gheorge-Pop, Manfred Hauswirth. The steadily growing research interest in quantum computing - together with the accompanying technological advances in the realization of quantum hardware - fuels the development of meaningful real-world applications, as well as implementations of well-known quantum algorithms. One of the most prominent examples to date is Grover's algorithm, which can be used for efficient search in unstructured databases. Quantum oracles, which are frequently masked as black boxes, play an important role in Grover's algorithm. Hence, the automatic generation of oracles is of paramount importance. Moreover, the automatic generation of the corresponding circuits for a Grover quantum oracle is deeply linked to the synthesis of reversible quantum logic, which - despite numerous advances in the field - remains a challenge in terms of synthesizing efficient and scalable circuits for complex boolean functions.
COMPUTERS
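To make the role of the oracle in the Grover item above concrete, here is a minimal classical simulation of Grover search on a small state vector. It is only an illustrative sketch and not the oracle-generation method of the listed paper: the function name `grover_search`, the database size of 8 and the marked index 5 are all assumptions chosen for the example, and the oracle is hard-coded as a sign flip on one index rather than synthesized as a circuit.

```python
import math

def grover_search(n_items, marked, iterations=None):
    """Simulate Grover's algorithm classically (hypothetical helper).

    Returns the measurement probability of every index after the
    given number of Grover iterations.
    """
    if iterations is None:
        # Standard choice: about (pi / 4) * sqrt(N) iterations.
        iterations = math.floor(math.pi / 4 * math.sqrt(n_items))
    # Start in the uniform superposition over all indices.
    amplitudes = [1.0 / math.sqrt(n_items)] * n_items
    for _ in range(iterations):
        # Oracle: flip the sign of the marked item's amplitude.
        amplitudes[marked] = -amplitudes[marked]
        # Diffusion: reflect every amplitude about the mean amplitude.
        mean = sum(amplitudes) / n_items
        amplitudes = [2.0 * mean - a for a in amplitudes]
    return [a * a for a in amplitudes]

probs = grover_search(8, marked=5)
print(f"P(marked) = {probs[5]:.4f}")
```

In this toy setting the oracle is a single line; the difficulty the abstract refers to is producing an efficient reversible circuit that performs the same sign flip for an arbitrary boolean condition.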
arxiv.org

F-Divergences and Cost Function Locality in Generative Modelling with Quantum Circuits

Generative modelling is an important unsupervised task in machine learning. In this work, we study a hybrid quantum-classical approach to this task, based on the use of a quantum circuit Born machine. In particular, we consider training a quantum circuit Born machine using $f$-divergences. We first discuss the adversarial framework for generative modelling, which enables the estimation of any $f$-divergence in the near term. Based on this capability, we introduce two heuristics which demonstrably improve the training of the Born machine. The first is based on $f$-divergence switching during training. The second introduces locality to the divergence, a strategy which has proved important in similar applications in terms of mitigating barren plateaus. Finally, we discuss the long-term implications of quantum devices for computing $f$-divergences, including algorithms which provide quadratic speedups to their estimation. In particular, we generalise existing algorithms for estimating the Kullback-Leibler divergence and the total variation distance to obtain a fault-tolerant quantum algorithm for estimating another $f$-divergence, namely, the Pearson divergence.
COMPUTERS
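The three divergences named in the abstract above (Kullback-Leibler, total variation, Pearson) are all instances of one formula, $D_f(P \| Q) = \sum_x q(x)\, f(p(x)/q(x))$, with different generator functions $f$. The following sketch evaluates them for two small discrete distributions. It is a purely classical illustration, not the quantum or adversarial estimator discussed in the paper; the distributions `p` and `q` are made-up values, and the Pearson generator uses the convention $f(t) = (t-1)^2$ (other normalizations exist).

```python
import math

def f_divergence(p, q, f):
    """D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)); assumes every q(x) > 0."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def kl_generator(t):
    # f(t) = t * log(t), with the convention 0 * log(0) = 0.
    return t * math.log(t) if t > 0 else 0.0

def tv_generator(t):
    # f(t) = |t - 1| / 2 gives the total variation distance.
    return 0.5 * abs(t - 1.0)

def pearson_generator(t):
    # f(t) = (t - 1)^2 gives the Pearson chi-squared divergence.
    return (t - 1.0) ** 2

# Illustrative distributions (assumed values, not taken from the paper).
p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]

kl = f_divergence(p, q, kl_generator)
tv = f_divergence(p, q, tv_generator)
chi2 = f_divergence(p, q, pearson_generator)
print(kl, tv, chi2)
```

Swapping the generator while keeping the rest of the computation fixed is the classical analogue of the "$f$-divergence switching" heuristic mentioned in the abstract.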
TheConversationCanada

Mining the moon's water will require a massive infrastructure investment, but should we?

We live in a world in which momentous decisions are made by people often without forethought. But some things are predictable, including that if you continually consume a finite resource without recycling, it will eventually run out. Yet, as we set our sights on embarking back to the moon, we will be bringing with us all our bad habits, including our urge for unrestrained consumption. Since the 1994 discovery of water ice on the moon by the Clementine spacecraft, excitement has reigned at the prospect of a return to the moon. This followed two decades of the doldrums after the end...
ASTRONOMY
Nature.com

Reusability report: Feature disentanglement in generating a three-dimensional structure from a two-dimensional slice with sliceGAN

Arising from Steve Kench & Samuel J. Cooper. Nature Machine Intelligence https://doi.org/10.1038/s42256-021-00322-1 (2021).
SCIENCE
Nature.com

Model retraining and information sharing in a supply chain with long-term fluctuating demands

Demand forecasting based on empirical data is a viable approach for optimizing a supply chain. However, in this approach, a model constructed from past data occasionally becomes outdated due to long-term changes in the environment, in which case the model should be updated (i.e., retrained) using the latest data. In this study, we examine the effects of updating models in a supply chain using a minimal setting. We demonstrate that when each party in the supply chain has its own forecasting model, uncoordinated model retraining causes the bullwhip effect even if a very simple replenishment policy is applied. Our results also indicate that sharing the forecasting model among the parties involved significantly reduces the bullwhip effect.
ECONOMY
arxiv.org

Simple or Complex? Complexity-Controllable Question Generation with Soft Templates and Deep Mixture of Experts Model

The ability to generate natural-language questions with controlled complexity levels is highly desirable as it further expands the applicability of question generation. In this paper, we propose an end-to-end neural complexity-controllable question generation model, which incorporates a mixture of experts (MoE) as the selector of soft templates to improve the accuracy of complexity control and the quality of generated questions. The soft templates capture question similarity while avoiding the expensive construction of actual templates. Our method introduces a novel, cross-domain complexity estimator to assess the complexity of a question, taking into account the passage, the question, the answer and their interactions. The experimental results on two benchmark QA datasets demonstrate that our QG model is superior to state-of-the-art methods in both automatic and manual evaluation. Moreover, our complexity estimator is significantly more accurate than the baselines in both in-domain and out-domain settings.
COMPUTERS
towardsdatascience.com

Using Gaussian Process Regression as a Generative Model, with Python

Nowadays, we can safely say that generative models are the hot shots of Artificial Intelligence. People that work with data may know the technical details, while for non-technical people the idea of being able to generate new stuff out of an existing dataset basically sounds like science-fiction. For this reason, when the non-technical crowd meets stuff like Deep-Fake it explodes pretty fast.
CODING & PROGRAMMING
arxiv.org

Information-Theoretic Measures of Dataset Difficulty

Estimating the difficulty of a dataset typically involves comparing state-of-the-art models to humans; the bigger the performance gap, the harder the dataset is said to be. Not only is this framework informal, but it also provides little understanding of how difficult each instance is, or what attributes make it difficult for a given model. To address these problems, we propose an information-theoretic perspective, framing dataset difficulty as the absence of usable information. Measuring usable information is as easy as measuring performance, but has certain theoretical advantages. While the latter only allows us to compare different models w.r.t. the same dataset, the former also allows us to compare different datasets w.r.t. the same model. We then introduce pointwise $\mathcal{V}$-information (PVI) for measuring the difficulty of individual instances, where instances with higher PVI are easier for model $\mathcal{V}$. By manipulating the input before measuring usable information, we can understand why a dataset is easy or difficult for a given model, which we use to discover annotation artefacts in widely-used benchmarks.
COMPUTERS
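As a rough illustration of the pointwise measure described in the abstract above, the sketch below computes PVI from two probabilities assigned to the gold label: one from a model that sees the input and one from a model that sees only an empty (null) input. The formula used, PVI = log2 of the first minus log2 of the second, is the commonly cited definition and is stated here as an assumption, since the abstract itself does not spell it out. The probability values are invented for the example, and `pvi` and `mean_pvi` are hypothetical helper names.

```python
import math

def pvi(prob_with_input, prob_null_input):
    """Pointwise V-information of one instance, in bits.

    prob_with_input: probability of the gold label from a model given the input.
    prob_null_input: probability of the gold label from a model given no input.
    """
    return math.log2(prob_with_input) - math.log2(prob_null_input)

def mean_pvi(pairs):
    """Average PVI over instances, a simple dataset-level summary."""
    return sum(pvi(a, b) for a, b in pairs) / len(pairs)

# Made-up (with-input, null-input) probabilities for three instances.
examples = [(0.9, 0.5), (0.5, 0.5), (0.2, 0.5)]
scores = [pvi(a, b) for a, b in examples]
for (a, b), s in zip(examples, scores):
    print(f"p_with={a:.2f} p_null={b:.2f} PVI={s:+.3f} bits")
print(f"mean PVI = {mean_pvi(examples):+.3f} bits")
```

A positive score means the input made the gold label more likely than the label prior alone (an easier instance for that model), zero means the input added nothing, and a negative score means the input pushed the model away from the gold label.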

