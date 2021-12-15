
DRaGon: Mining Latent Radio Channel Information from Geographical Data Leveraging Deep Learning

By Benjamin Sliwa, Melina Geis, Caner Bektas, Melisa López, Preben Mogensen, Christian Wietfeld
 4 days ago

Radio channel modeling is one of the most fundamental aspects in the process of designing, optimizing, and simulating wireless communication networks. In this field, long-established approaches such as analytical channel models and ray tracing techniques represent the de-facto standard methodologies. However, as...

A Deep-Learning Intelligent System Incorporating Data Augmentation for Short-Term Voltage Stability Assessment of Power Systems

Facing the difficulty of expensive and trivial data collection and annotation, how to make a deep learning-based short-term voltage stability assessment (STVSA) model work well on a small training dataset is a challenging and urgent problem. Although a big enough dataset can be directly generated by contingency simulation, this data generation process is usually cumbersome and inefficient; while data augmentation provides a low-cost and efficient way to artificially inflate the representative and diversified training datasets with label preserving transformations. In this respect, this paper proposes a novel deep-learning intelligent system incorporating data augmentation for STVSA of power systems. First, due to the unavailability of reliable quantitative criteria to judge the stability status for a specific power system, semi-supervised cluster learning is leveraged to obtain labeled samples in an original small dataset. Second, to make deep learning applicable to the small dataset, conditional least squares generative adversarial networks (LSGAN)-based data augmentation is introduced to expand the original dataset via artificially creating additional valid samples. Third, to extract temporal dependencies from the post-disturbance dynamic trajectories of a system, a bi-directional gated recurrent unit with attention mechanism based assessment model is established, which bi-directionally learns the significant time dependencies and automatically allocates attention weights. The test results demonstrate the presented approach manages to achieve better accuracy and a faster response time with original small datasets. Besides classification accuracy, this work employs statistical measures to comprehensively examine the performance of the proposal.
COMPUTERS
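The least-squares GAN objective mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common target coding of 1 for real and 0 for generated samples; the function names are hypothetical, the conditioning on class labels is omitted, and the paper's actual networks and training setup are not reproduced here.

```python
from statistics import fmean


def lsgan_discriminator_loss(d_real, d_fake):
    """Least-squares discriminator loss.

    d_real: discriminator scores on real samples (target 1).
    d_fake: discriminator scores on generated samples (target 0).
    """
    real_term = fmean([(d - 1.0) ** 2 for d in d_real])
    fake_term = fmean([d ** 2 for d in d_fake])
    return 0.5 * real_term + 0.5 * fake_term


def lsgan_generator_loss(d_fake):
    """Least-squares generator loss: push scores on generated samples toward 1."""
    return 0.5 * fmean([(d - 1.0) ** 2 for d in d_fake])


if __name__ == "__main__":
    # Illustrative scores only, not results from the paper.
    print(lsgan_discriminator_loss([0.9, 0.8], [0.2, 0.1]))
    print(lsgan_generator_loss([0.2, 0.1]))
```

Replacing the usual cross-entropy with a squared error penalizes samples by their distance from the target score, which is the main design difference from a standard GAN loss.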
Notes On Deep Learning For Coders 1

A superb course and book. Lesson 1 out of 8. One of the most interesting, exciting, and useful techniques you can learn today is, for me, Deep Learning. I started trying it out only in the last year, and I have to admit having an amazing time analyzing my friends’ conversations in our WhatsApp group or building a model to teach me French by telling me what I should answer. The latter proved movie scripts shouldn’t be used in real-life conversations, but that’s for another article :)
CODING & PROGRAMMING
Quantum readout error mitigation via deep learning

Quantum computing devices are inevitably subject to errors. To leverage quantum technologies for computational benefits in practical applications, quantum algorithms and protocols must be implemented reliably under noise and imperfections. Since noise and imperfections limit the size of quantum circuits that can be realized on a quantum device, developing quantum error mitigation techniques that do not require extra qubits and gates is of critical importance. In this work, we present a deep learning-based protocol for reducing readout errors on quantum hardware. Our technique is based on training an artificial neural network with the measurement results obtained from experiments with simple quantum circuits consisting of single-qubit gates only. With the neural network and deep learning, non-linear noise can be corrected, which is not possible with the existing linear inversion methods. The advantage of our method over the existing methods is demonstrated through quantum readout error mitigation experiments performed on IBM five-qubit quantum devices.
COMPUTERS
Benchmark for Out-of-Distribution Detection in Deep Reinforcement Learning

Reinforcement Learning (RL) based solutions are being adopted in a variety of domains including robotics, health care and industrial automation. Most focus is given to when these solutions work well, but they fail when presented with out of distribution inputs. RL policies share the same faults as most machine learning models. Out of distribution detection for RL is generally not well covered in the literature, and there is a lack of benchmarks for this task. In this work we propose a benchmark to evaluate OOD detection methods in a Reinforcement Learning setting, by modifying the physical parameters of non-visual standard environments or corrupting the state observation for visual environments. We discuss ways to generate custom RL environments that can produce OOD data, and evaluate three uncertainty methods for the OOD detection task. Our results show that ensemble methods have the best OOD detection performance with a lower standard deviation across multiple environments.
COMPUTERS
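One common way to turn an ensemble into an OOD detector is to score each input by how much the ensemble members disagree. The sketch below is an assumption about how such a score could look, not the benchmark's method: the abstract does not say how the ensemble uncertainty is computed, and the function names and threshold are hypothetical.

```python
from statistics import fmean, pstdev


def ensemble_disagreement(member_outputs):
    """Average per-dimension standard deviation across ensemble members.

    member_outputs: one output vector per ensemble member, all for the
    same input observation, e.g. [[0.1, 0.9], [0.2, 0.8]].
    """
    per_dimension = zip(*member_outputs)  # group values by output dimension
    return fmean([pstdev(values) for values in per_dimension])


def is_out_of_distribution(member_outputs, threshold):
    """Flag an input as OOD when the members disagree more than the threshold."""
    return ensemble_disagreement(member_outputs) > threshold


if __name__ == "__main__":
    # Illustrative values only; the threshold would be tuned on held-out data.
    familiar = [[0.50, 0.50], [0.52, 0.48], [0.49, 0.51]]
    unfamiliar = [[0.10, 0.90], [0.80, 0.20], [0.45, 0.55]]
    print(is_out_of_distribution(familiar, threshold=0.1))
    print(is_out_of_distribution(unfamiliar, threshold=0.1))
```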
Robust Active Learning: Sample-Efficient Training of Robust Deep Learning Models

Active learning is an established technique to reduce the labeling cost to build high-quality machine learning models. A core component of active learning is the acquisition function that determines which data should be selected to annotate. State-of-the-art acquisition functions -- and more largely, active learning techniques -- have been designed to maximize the clean performance (e.g. accuracy) and have disregarded robustness, an important quality property that has received increasing attention. Active learning, therefore, produces models that are accurate but not robust.
COMPUTERS
SymmetryGAN: Symmetry Discovery with Deep Learning

What are the symmetries of a dataset? Whereas the symmetries of an individual data element can be characterized by its invariance under various transformations, the symmetries of an ensemble of data elements are ambiguous due to Jacobian factors introduced while changing coordinates. In this paper, we provide a rigorous statistical definition of the symmetries of a dataset, which involves inertial reference densities, in analogy to inertial frames in classical mechanics. We then propose SymmetryGAN as a novel and powerful approach to automatically discover symmetries using a deep learning method based on generative adversarial networks (GANs). When applied to Gaussian examples, SymmetryGAN shows excellent empirical performance, in agreement with expectations from the analytic loss landscape. SymmetryGAN is then applied to simulated dijet events from the Large Hadron Collider (LHC) to demonstrate the potential utility of this method in high energy collider physics applications. Going beyond symmetry discovery, we consider procedures to infer the underlying symmetry group from empirical data.
SCIENCE
Using Keras for Deep Learning with R

We are excited to announce new developments in Keras for R. Together with our current integration with torch, data scientists can use the most popular and powerful deep learning frameworks all within R. Expand data science capabilities with deep learning. Data scientists use machine learning to create models that improve...
SOFTWARE
Attention-based Transformation from Latent Features to Point Clouds

In point cloud generation and completion, previous methods for transforming latent features to point clouds are generally based on fully connected layers (FC-based) or folding operations (Folding-based). However, point clouds generated by FC-based methods are usually troubled by outliers and rough surfaces. Folding-based methods have a large data flow and slow convergence, and they struggle to generate non-smooth surfaces. In this work, we propose AXform, an attention-based method to transform latent features to point clouds. AXform first generates points in an interim space, using a fully connected layer. These interim points are then aggregated to generate the target point cloud. AXform takes both parameter sharing and data flow into account, which gives it fewer outliers, fewer network parameters, and a faster convergence speed. The points generated by AXform do not have the strong 2-manifold constraint, which improves the generation of non-smooth surfaces. When AXform is expanded to multiple branches for local generations, the centripetal constraint gives it the properties of self-clustering and space consistency, which further enables unsupervised semantic segmentation. We also adopt this scheme and design AXformNet for point cloud completion. Extensive experiments on different datasets show that our methods achieve state-of-the-art results.
COMPUTERS
A Survey on Societal Event Forecasting with Deep Learning

Population-level societal events, such as civil unrest and crime, often have a significant impact on our daily life. Forecasting such events is of great importance for decision-making and resource allocation. Event prediction has traditionally been challenging due to the lack of knowledge regarding the true causes and underlying mechanisms of event occurrence. In recent years, research on event forecasting has made significant progress due to two main reasons: (1) the development of machine learning and deep learning algorithms and (2) the accessibility of public data such as social media, news sources, blogs, economic indicators, and other meta-data sources. The explosive growth of data and the remarkable advancement in software/hardware technologies have led to applications of deep learning techniques in societal event studies. This paper is dedicated to providing a systematic and comprehensive overview of deep learning technologies for societal event predictions. We focus on two domains of societal events: civil unrest and crime. We first introduce how event forecasting problems are formulated as a machine learning prediction task. Then, we summarize data resources, traditional methods, and recent development of deep learning models for these problems. Finally, we discuss the challenges in societal event forecasting and put forward some promising directions for future research.
The Peril of Popular Deep Learning Uncertainty Estimation Methods

Uncertainty estimation (UE) techniques -- such as the Gaussian process (GP), Bayesian neural networks (BNN), Monte Carlo dropout (MCDropout) -- aim to improve the interpretability of machine learning models by assigning an estimated uncertainty value to each of their prediction outputs. However, since too high uncertainty estimates can have fatal consequences in practice, this paper analyzes the above techniques.
COMPUTERS
More layers! End-to-end regression and uncertainty on tabular data with deep learning

This paper attempts to analyze the effectiveness of deep learning for tabular data processing. It is commonly believed that decision trees and their ensembles are the leading method in this domain, and that deep neural networks must be content with computer vision and similar fields. But the deep neural network is a framework for building gradient-based hierarchical representations, and this key feature should be able to provide the best processing of generic structured (tabular) data, not just image matrices and audio spectrograms. This problem is considered through the prism of the Weather Prediction track in the Yandex Shifts challenge (in other words, the Yandex Shifts Weather task). This task is a variant of the classical tabular data regression problem. It is also connected with another important problem: generalization and uncertainty in machine learning. This paper proposes an end-to-end algorithm for solving the problem of regression with uncertainty on tabular data, which is based on the combination of four ideas: 1) deep ensemble of self-normalizing neural networks, 2) regression as parameter estimation of the Gaussian target error distribution, 3) hierarchical multitask learning, and 4) simple data preprocessing. Three modifications of the proposed algorithm take the top three places on the leaderboard of the Yandex Shifts Weather challenge. The paper argues that this success stems from fundamental properties of the deep learning algorithm, and tries to prove this.
COMPUTERS
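Two of the ideas listed in the abstract above, regression as Gaussian parameter estimation and deep ensembling, can be sketched in a few lines. This is a minimal illustration under stated assumptions: each network is assumed to output a mean and a variance per example, the ensemble is combined as a uniform mixture of Gaussians (a standard choice for deep ensembles, not confirmed as the paper's exact rule), and all function names are hypothetical.

```python
import math


def gaussian_nll(target, mean, variance):
    """Negative log-likelihood of a target under a Gaussian N(mean, variance).

    Training a network to minimize this loss makes it predict both a value
    and an uncertainty for that value.
    """
    return 0.5 * (math.log(2.0 * math.pi * variance)
                  + (target - mean) ** 2 / variance)


def combine_ensemble(means, variances):
    """Collapse per-member Gaussians into one mean and variance.

    Treats the ensemble as a uniform mixture: the combined variance adds
    the spread between member means to the average member variance.
    """
    n = len(means)
    mixture_mean = sum(means) / n
    second_moment = sum(v + m ** 2 for m, v in zip(means, variances)) / n
    return mixture_mean, second_moment - mixture_mean ** 2


if __name__ == "__main__":
    # Illustrative member predictions only, not results from the paper.
    mean, variance = combine_ensemble([1.0, 3.0], [1.0, 1.0])
    print(mean, variance)
    print(gaussian_nll(target=2.5, mean=mean, variance=variance))
```

The combined variance grows when the members disagree, so the same quantity serves as the uncertainty estimate for shifted data.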
Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies

Dushyant Rao, Fereshteh Sadeghi, Leonard Hasenclever, Markus Wulfmeier, Martina Zambelli, Giulia Vezzani, Dhruva Tirumala, Yusuf Aytar, Josh Merel, Nicolas Heess, Raia Hadsell. For robots operating in the real world, it is desirable to learn reusable behaviours that can effectively be transferred and adapted to numerous tasks and scenarios. We propose an approach to learn abstract motor skills from data using a hierarchical mixture latent variable model. In contrast to existing work, our method exploits a three-level hierarchy of both discrete and continuous latent variables, to capture a set of high-level behaviours while allowing for variance in how they are executed. We demonstrate in manipulation domains that the method can effectively cluster offline data into distinct, executable behaviours, while retaining the flexibility of a continuous latent variable model. The resulting skills can be transferred and fine-tuned on new tasks, unseen objects, and from state to vision-based policies, yielding better sample efficiency and asymptotic performance compared to existing skill- and imitation-based methods. We further analyse how and when the skills are most beneficial: they encourage directed exploration to cover large regions of the state space relevant to the task, making them most effective in challenging sparse-reward settings.
ENGINEERING
Finding and fixing bugs with deep learning

Finding and fixing bugs in code is a time-consuming, and often frustrating, part of everyday work for software developers. Can deep learning address this problem and help developers deliver better software, faster? In a new paper, Self-Supervised Bug Detection and Repair, presented at the 2021 Conference on Neural Information Processing Systems (NeurIPS 2021), we show a promising deep learning model, which we call BugLab. BugLab can be taught to detect and fix bugs, without using labelled data, through a “hide and seek” game.
CODING & PROGRAMMING
Data-driven forward-inverse problems and modulational instability for Yajima-Oikawa system using deep learning with parameter regularization

We investigate data-driven forward-inverse problems for the Yajima-Oikawa (YO) system by employing two techniques that improve the performance of the deep physics-informed neural network (PINN), namely neuron-wise locally adaptive activation functions and L2 norm parameter regularization. In particular, we not only recover three different forms of vector rogue waves (RWs) in the forward problem of the YO system, including bright-bright RWs, intermediate-bright RWs and dark-bright RWs, but also study the data-driven inverse problem of the YO system with noise of different intensities. Compared with the PINN method using only locally adaptive activation functions, the PINN method with both strategies shows remarkable robustness when studying the inverse problem of the YO system with noisy training data; that is, the improved PINN model we propose has excellent noise immunity. The asymptotic analysis of the wavenumber k and the modulational instability (MI) analysis for the YO system with unknown parameters are derived systematically by applying linearized instability analysis to the plane wave.
COMPUTERS
Adaptive Projected Residual Networks for Learning Parametric Maps from Sparse Data

Thomas O'Leary-Roseberry, Xiaosong Du, Anirban Chaudhuri, Joaquim R. R. A. Martins, Karen Willcox, Omar Ghattas. We present a parsimonious surrogate framework for learning high dimensional parametric maps from limited training data. The need for parametric surrogates arises in many applications that require repeated queries of complex computational models. These applications include such "outer-loop" problems as Bayesian inverse problems, optimal experimental design, and optimal design and control under uncertainty, as well as real time inference and control problems. Many high dimensional parametric mappings admit low dimensional structure, which can be exploited by mapping-informed reduced bases of the inputs and outputs. Exploiting this property, we develop a framework for learning low dimensional approximations of such maps by adaptively constructing ResNet approximations between reduced bases of their inputs and outputs. Motivated by recent approximation theory for ResNets as discretizations of control flows, we prove a universal approximation property of our proposed adaptive projected ResNet framework, which motivates a related iterative algorithm for the ResNet construction. This strategy represents a confluence of the approximation theory and the algorithm since both make use of sequentially minimizing flows. In numerical examples we show that these parsimonious, mapping-informed architectures are able to achieve remarkably high accuracy given little training data, making them a desirable surrogate strategy to be implemented for minimal computational investment in training data generation.
COMPUTERS
Interference Suppression Using Deep Learning: Current Approaches and Open Challenges

In light of the finite nature of the wireless spectrum and the increasing demand for spectrum use arising from recent technological breakthroughs in wireless communication, the problem of interference continues to persist. Despite recent advancements in resolving interference issues, interference still presents a difficult challenge to effective usage of the spectrum. This is partly due to the rise in the use of license-free and managed shared bands for Wi-Fi, long term evolution (LTE) unlicensed (LTE-U), LTE licensed assisted access (LAA), 5G NR, and other opportunistic spectrum access solutions. As a result of this, the need for efficient spectrum usage schemes that are robust against interference has never been more important. In the past, most solutions to interference have addressed the problem by using avoidance techniques as well as non-AI mitigation approaches (for example, adaptive filters). The key downside to non-AI techniques is the need for domain expertise in the extraction or exploitation of signal features such as cyclostationarity, bandwidth and modulation of the interfering signals. More recently, researchers have successfully explored AI/ML enabled physical (PHY) layer techniques, especially deep learning which reduces or compensates for the interfering signal instead of simply avoiding it. The underlying idea of ML based approaches is to learn the interference or the interference characteristics from the data, thereby sidelining the need for domain expertise in suppressing the interference. In this paper, we review a wide range of techniques that have used deep learning to suppress interference. We provide comparison and guidelines for many different types of deep learning techniques in interference suppression. In addition, we highlight challenges and potential future research directions for the successful adoption of deep learning in interference suppression.
Scalable Geometric Deep Learning on Molecular Graphs

Deep learning in molecular and materials sciences is limited by the lack of integration between applied science, artificial intelligence, and high-performance computing. Bottlenecks with respect to the amount of training data, the size and complexity of model architectures, and the scale of the compute infrastructure are all key factors limiting the scaling of deep learning for molecules and materials. Here, we present LitMatter, a lightweight framework for scaling molecular deep learning methods. We train four graph neural network architectures on over 400 GPUs and investigate the scaling behavior of these methods. Depending on the model architecture, training time speedups up to 60× are seen. Empirical neural scaling relations quantify the model-dependent scaling and enable optimal compute resource allocation and the identification of scalable molecular geometric deep learning model implementations.
CHEMISTRY
Neural Attention Models in Deep Learning: Survey and Taxonomy

Attention is a state of arousal capable of dealing with limited processing bottlenecks in human beings by focusing selectively on one piece of information while ignoring other perceptible information. For decades, concepts and functions of attention have been studied in philosophy, psychology, neuroscience, and computing. Currently, this property has been widely explored in deep neural networks. Many different neural attention models are now available and have been a very active research area over the past six years. From the theoretical standpoint of attention, this survey provides a critical analysis of major neural attention models. Here we propose a taxonomy that corroborates with theoretical aspects that predate Deep Learning. Our taxonomy provides an organizational structure that asks new questions and structures the understanding of existing attentional mechanisms. In particular, 17 criteria derived from psychology and neuroscience classic studies are formulated for qualitative comparison and critical analysis on the 51 main models found on a set of more than 650 papers analyzed. Also, we highlight several theoretical issues that have not yet been explored, including discussions about biological plausibility, highlight current research trends, and provide insights for the future.
COMPUTERS
Instabase adds deep learning to make sense of unstructured data

Whether they realize it or not, most enterprises are sitting on a mountain of priceless, yet untapped, data. Buried deep within PDFs, customer emails, and scanned documents is a trove of business intelligence and insights that often have the potential to inform critical business decisions – if only it can be extracted and harnessed, that is.
SOFTWARE
A deep learning model for data-driven discovery of functional connectivity

Functional connectivity (FC) studies have demonstrated the overarching value of studying the brain and its disorders through the undirected weighted graph of the fMRI correlation matrix. Most of the work with the FC, however, depends on the way the connectivity is computed, and further depends on the manual post-hoc analysis of the FC matrices. In this work we propose a deep learning architecture, BrainGNN, that learns the connectivity structure as part of learning to classify subjects. It simultaneously applies a graphical neural network to this learned graph and learns to select a sparse subset of brain regions important to the prediction task. We demonstrate the model's state-of-the-art classification performance on a schizophrenia fMRI dataset and demonstrate how introspection leads to disorder-relevant findings. The graphs learned by the model exhibit strong class discrimination, and the sparse subset of relevant regions is consistent with the schizophrenia literature.
SCIENCE

