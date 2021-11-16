
Computers

Switching Recurrent Kalman Networks

By Giao Nguyen-Quynh, Philipp Becker, Chen Qiu, Maja Rudolph, Gerhard Neumann
arxiv.org
 8 days ago

Forecasting driving behavior or other sensor measurements is an essential component of autonomous driving systems. Often real-world multivariate time series data is hard to model because the underlying dynamics are nonlinear and the observations are noisy. In...

Related
arxiv.org

Improvements to short-term weather prediction with recurrent-convolutional networks

The Weather4cast 2021 competition gave the participants a task of predicting the time evolution of two-dimensional fields of satellite-based meteorological data. This paper describes the author's efforts, after initial success in the first stage of the competition, to improve the model further in the second stage. The improvements consisted of a shallower model variant that is competitive against the deeper version, adoption of the AdaBelief optimizer, improved handling of one of the predicted variables where the training set was found not to represent the validation set well, and ensembling multiple models to improve the results further. The largest quantitative improvements to the competition metrics can be attributed to the increased amount of training data available in the second stage of the competition, followed by the effects of model ensembling. Qualitative results show that the model can predict the time evolution of the fields, including the motion of the fields over time, starting with sharp predictions for the immediate future and blurring of the outputs in later frames to account for the increased uncertainty.
ENVIRONMENT
arxiv.org

Kalman Filtering with Adversarial Corruptions

Here we revisit the classic problem of linear quadratic estimation, i.e. estimating the trajectory of a linear dynamical system from noisy measurements. The celebrated Kalman filter gives an optimal estimator when the measurement noise is Gaussian, but is widely known to break down when one deviates from this assumption, e.g. when the noise is heavy-tailed. Many ad hoc heuristics have been employed in practice for dealing with outliers. In a pioneering work, Schick and Mitter gave provable guarantees when the measurement noise is a known infinitesimal perturbation of a Gaussian and raised the important question of whether one can get similar guarantees for large and unknown perturbations.
MATHEMATICS
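
As a reading aid for the abstract above, here is a minimal sketch of the classic scalar Kalman filter and of how a single gross outlier drags its estimate. It is not the robust estimator proposed in the paper. The function names and all parameter values (`a`, `q`, `h`, `r`, the initial mean and variance) are assumptions chosen for illustration.

```python
def kalman_step(mean, var, z, a=1.0, q=0.01, h=1.0, r=0.25):
    """One predict/update cycle of a scalar Kalman filter.

    Assumed model (illustrative values only):
        x_t = a * x_{t-1} + w,  w ~ N(0, q)
        z_t = h * x_t + v,      v ~ N(0, r)
    """
    # Predict: push the previous estimate through the linear dynamics.
    mean_pred = a * mean
    var_pred = a * a * var + q
    # Update: blend the prediction with the measurement via the Kalman gain.
    gain = var_pred * h / (h * h * var_pred + r)
    mean_new = mean_pred + gain * (z - h * mean_pred)
    var_new = (1.0 - gain * h) * var_pred
    return mean_new, var_new, gain


def run_filter(measurements, mean0=0.0, var0=1.0, **params):
    """Filter a sequence of measurements; return the estimates and final variance."""
    mean, var = mean0, var0
    estimates = []
    for z in measurements:
        mean, var, _ = kalman_step(mean, var, z, **params)
        estimates.append(mean)
    return estimates, var


if __name__ == "__main__":
    clean, _ = run_filter([1.0, 1.0, 1.0, 1.0])
    corrupted, _ = run_filter([1.0, 1.0, 1.0, 50.0])  # last reading is an outlier
    print("clean:    ", clean[-1])
    print("corrupted:", corrupted[-1])
```

Because the gain depends only on the assumed variances and not on the measurement itself, the filter has no way to discount the corrupted reading, which is the failure mode the abstract refers to.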
arxiv.org

Observation Error Covariance Specification in Dynamical Systems for Data assimilation using Recurrent Neural Networks

Data assimilation techniques are widely used to predict complex dynamical systems with uncertainties, based on time-series observation data. Error covariance matrices modelling is an important element in data assimilation algorithms which can considerably impact the forecasting accuracy. The estimation of these covariances, which usually relies on empirical assumptions and physical constraints, is often imprecise and computationally expensive especially for systems of large dimension. In this work, we propose a data-driven approach based on long short term memory (LSTM) recurrent neural networks (RNN) to improve both the accuracy and the efficiency of observation covariance specification in data assimilation for dynamical systems. Learning the covariance matrix from observed/simulated time-series data, the proposed approach does not require any knowledge or assumption about prior error distribution, unlike classical posterior tuning methods. We have compared the novel approach with two state-of-the-art covariance tuning algorithms, namely DI01 and D05, first in a Lorenz dynamical system and then in a 2D shallow water twin experiments framework with different covariance parameterization using ensemble assimilation. This novel method shows significant advantages in observation covariance specification, assimilation accuracy and computational efficiency.
COMPUTERS
arxiv.org

One model Packs Thousands of Items with Recurrent Conditional Query Learning

Recent studies have revealed that neural combinatorial optimization (NCO) has advantages over conventional algorithms in many combinatorial optimization problems such as routing, but it is less efficient for more complicated optimization tasks such as packing which involves mutually conditioned action spaces. In this paper, we propose a Recurrent Conditional Query Learning (RCQL) method to solve both 2D and 3D packing problems. We first embed states by a recurrent encoder, and then adopt attention with conditional queries from previous actions. The conditional query mechanism fills the information gap between learning steps, which shapes the problem as a Markov decision process. Benefiting from the recurrence, a single RCQL model is capable of handling different sizes of packing problems. Experiment results show that RCQL can effectively learn strong heuristics for offline and online strip packing problems (SPPs), outperforming a wide range of baselines in space utilization ratio. RCQL reduces the average bin gap ratio by 1.83% in offline 2D 40-box cases and 7.84% in 3D cases compared with state-of-the-art methods. Meanwhile, our method also achieves 5.64% higher space utilization ratio for SPPs with 1000 items than the state of the art.
COMPUTERS
arxiv.org

Quantum switch of quantum switches

Recent results have shown that quantum theory is compatible with novel causal structures where events happen without a definite causal order. In particular, the quantum switch describes a process in which two quantum channels act in a coherent superposition of their two possible orders. Furthermore, the quantum switch can perform communication tasks that are impossible within the framework of the standard quantum Shannon theory. The present paper considers the scenario of one-shot heralded qubit communication using a higher-order quantum switch constructed from two quantum switches. Specifically, we show that two quantum switches put in a superposition of their alternative causal orders can transmit a qubit, without any error, with a probability strictly higher than that achievable with each quantum switch. We discuss three examples that demonstrate this communication advantage. Notably, a higher-order quantum switch not only can outperform useful quantum switches but also becomes useful as a resource even if the quantum switches making it up are useless.
SCIENCE
arxiv.org

TYolov5: A Temporal Yolov5 Detector Based on Quasi-Recurrent Neural Networks for Real-Time Handgun Detection in Video

Mario Alberto Duran-Vega, Miguel Gonzalez-Mendoza, Leonardo Chang-Fernandez, Cuauhtemoc Daniel Suarez-Ramirez. Timely handgun detection is a crucial problem to improve public safety; nevertheless, the effectiveness of many surveillance systems still depends on finite human attention. Much of the previous research on handgun detection is based on static image detectors, leaving aside valuable temporal information that could be used to improve object detection in videos. To improve the performance of surveillance systems, a real-time temporal handgun detection system should be built. Using Temporal Yolov5, an architecture based on Quasi-Recurrent Neural Networks, temporal information is extracted from video to improve the results of the handgun detection. Moreover, two publicly available datasets are proposed, labeled with hands, guns, and phones: one containing 2199 static images to train static detectors, and another with 5960 frames of video to train temporal modules. Additionally, we explore two temporal data augmentation techniques based on Mosaic and Mixup. The resulting systems are three temporal architectures: one focused on reducing inference time with a mAP$_{50:95}$ of 56.1, another with a good balance between inference time and accuracy with a mAP$_{50:95}$ of 59.4, and a last one specialized in accuracy with a mAP$_{50:95}$ of 60.2. Temporal Yolov5 achieves real-time detection in the small and medium architectures. Moreover, it takes advantage of temporal features contained in videos to perform better than Yolov5 on our temporal dataset, making TYolov5 suitable for real-world applications. The source code is publicly available at this https URL.
TECHNOLOGY
arxiv.org

Charting and navigating the space of solutions for recurrent neural networks

Recurrent Neural Networks (RNNs) were recently successfully used to model the way neural activity drives task-related behavior in animals, operating under the implicit assumption that the obtained solutions are universal. Observations in both neuroscience and machine learning challenge this assumption. Animals can approach a given task with a variety of strategies, and training machine learning algorithms introduces the phenomenon of underspecification. These observations imply that every task is associated with a space of solutions. To date, the structure of this space is not understood, limiting the approach of comparing RNNs with neural data. Here, we characterize the space of solutions associated with various tasks. We first study a simple two-neuron network on a task that leads to multiple solutions. We trace the nature of the final solution back to the network's initial connectivity and identify discrete dynamical regimes that underlie this diversity. We then examine three neuroscience-inspired tasks: Delayed and interval discrimination, and Time reproduction. For each task, we find a rich set of solutions. Variability can be found directly in the neural activity of the networks, and additionally by testing the trained networks' ability to extrapolate, as a perturbation to a system often reveals hidden structure. Furthermore, we relate extrapolation patterns to specific dynamical objects and effective algorithms found by the networks. We introduce a tool to derive the reduced dynamics of networks by generating a compact directed graph describing the essence of the dynamics with regards to behavioral inputs and outputs. Using this representation, we can partition the solutions to each task into a handful of types and partially predict them from neural features. Our results shed light on the concept of the space of solutions and its uses in Machine learning and in Neuroscience.
SCIENCE
The Mint Hill Times

Smart Switches

CHARLOTTE – Remember when you got your first smartphone, all that you never knew you wanted to do but can’t live without now. Smart switches are like that. Home automation is life-changing. Caseta by Lutron offers an elegant product for connectivity. The wall switches themselves are ultra-functional and easy to...
ELECTRONICS
arxiv.org

Neural Network Kalman filtering for 3D object tracking from linear array ultrasound data

Arttu Arjas, Erwin J. Alles, Efthymios Maneas, Simon Arridge, Adrien Desjardins, Mikko J. Sillanpää, Andreas Hauptmann. Many interventional surgical procedures rely on medical imaging to visualise and track instruments. Such imaging methods not only need to be real-time capable, but also provide accurate and robust positional information. In ultrasound applications, typically only two-dimensional data from a linear array are available, and as such obtaining accurate positional estimation in three dimensions is non-trivial. In this work, we first train a neural network, using realistic synthetic training data, to estimate the out-of-plane offset of an object with the associated axial aberration in the reconstructed ultrasound image. The obtained estimate is then combined with a Kalman filtering approach that utilises positioning estimates obtained in previous time-frames to improve localisation robustness and reduce the impact of measurement noise. The accuracy of the proposed method is evaluated using simulations, and its practical applicability is demonstrated on experimental data obtained using a novel optical ultrasound imaging setup. Accurate and robust positional information is provided in real-time. Axial and lateral coordinates for out-of-plane objects are estimated with a mean error of 0.1mm for simulated data and a mean error of 0.2mm for experimental data. Three-dimensional localisation is most accurate for elevational distances larger than 1mm, with a maximum distance of 5mm considered for a 25mm aperture.
SCIENCE
arxiv.org

Recurrent Variational Network: A Deep Learning Inverse Problem Solver applied to the task of Accelerated MRI Reconstruction

Magnetic Resonance Imaging can produce detailed images of the anatomy and physiology of the human body that can assist doctors in diagnosing and treating pathologies such as tumours. However, MRI suffers from very long acquisition times that make it susceptible to patient motion artifacts and limit its potential to deliver dynamic treatments. Conventional approaches such as Parallel Imaging and Compressed Sensing increase MRI acquisition speed by acquiring less MRI data with multiple receiver coils and reconstructing MR images from it. Recent advancements in Deep Learning combined with Parallel Imaging and Compressed Sensing techniques have the potential to produce high-fidelity reconstructions from highly accelerated MRI data. In this work we present a novel Deep Learning-based Inverse Problem solver applied to the task of accelerated MRI reconstruction, called Recurrent Variational Network (RecurrentVarNet), which exploits the properties of Convolutional Recurrent Networks and unrolled algorithms for solving Inverse Problems. The RecurrentVarNet consists of multiple blocks, each responsible for one unrolled iteration of the gradient descent optimization algorithm for solving inverse problems. Contrary to traditional approaches, the optimization steps are performed in the observation domain ($k$-space) instead of the image domain. Each recurrent block of RecurrentVarNet refines the observed $k$-space and comprises a data consistency term and a recurrent unit which takes as input a learned hidden state and the prediction of the previous block. Our proposed method achieves new state-of-the-art qualitative and quantitative reconstruction results on 5-fold and 10-fold accelerated data from a public multi-channel brain dataset, outperforming previous conventional and deep learning-based approaches. We will release all models, code, and baselines on our public repository.
SCIENCE
arxiv.org

Predicting High-Flow Nasal Cannula Failure in an ICU Using a Recurrent Neural Network with Transfer Learning and Input Data Perseveration: A Retrospective Analysis

High Flow Nasal Cannula (HFNC) provides non-invasive respiratory support for critically ill children who may tolerate it more readily than other Non-Invasive (NIV) techniques. Timely prediction of HFNC failure can provide an indication for increasing respiratory support. This work developed and compared machine learning models to predict HFNC failure. A retrospective study was conducted using EMR of patients admitted to a tertiary pediatric ICU from January 2010 to February 2020. A Long Short-Term Memory (LSTM) model was trained to generate a continuous prediction of HFNC failure. Performance was assessed using the area under the receiver operating characteristic curve (AUROC) at various times following HFNC initiation. The sensitivity, specificity, positive and negative predictive values (PPV, NPV) of predictions at two hours after HFNC initiation were also evaluated. These metrics were also computed in a cohort with primarily respiratory diagnoses. 834 HFNC trials [455 training, 173 validation, 206 test] met the inclusion criteria, of which 175 [103, 30, 42] (21.0%) escalated to NIV or intubation. The LSTM models trained with transfer learning generally performed better than the LR models, with the best LSTM model achieving an AUROC of 0.78, vs 0.66 for the LR, two hours after initiation. Machine learning models trained using EMR data were able to identify children at risk for failing HFNC within 24 hours of initiation. LSTM models that incorporated transfer learning, input data perseveration and ensembling performed better than the LR and standard LSTM models.
HEALTH
electrek.co

Recurrent lets you monitor and compare the battery health of a current or prospective EV for free

While worldwide EV adoption grows month-over-month, many of the previous practices surrounding new and used vehicles will need to adapt to stay relevant. EVs are exceedingly different from ICE cars and require a keen focus on the vehicle’s battery as a crucial indicator of its overall health and longevity. Recurrent looks to bridge that gap for both current and prospective EV owners by using individual EV battery data and comparing that data to that of similar vehicles on the road. This technology has the potential to become the standard for understanding and benchmarking an EV’s battery.
MARKETS
arxiv.org

Modeling Irregular Time Series with Continuous Recurrent Units

Recurrent neural networks (RNNs) like long short-term memory networks (LSTMs) and gated recurrent units (GRUs) are a popular choice for modeling sequential data. Their gating mechanism permits weighting previous history encoded in a hidden state with new information from incoming observations. In many applications, such as medical records, observation times are irregular and carry important information. However, LSTMs and GRUs assume constant time intervals between observations. To address this challenge, we propose continuous recurrent units (CRUs), a neural architecture that can naturally handle irregular time intervals between observations. The gating mechanism of the CRU employs the continuous formulation of a Kalman filter and alternates between (1) continuous latent state propagation according to a linear stochastic differential equation (SDE) and (2) latent state updates whenever a new observation comes in. In an empirical study, we show that the CRU can better interpolate irregular time series than neural ordinary differential equation (neural ODE)-based models. We also show that our model can infer dynamics from images and that the Kalman gain efficiently singles out candidates for valuable state updates from noisy observations.
SCIENCE
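
The alternation described in the abstract above (propagate the latent state through a linear SDE, then update when an observation arrives) can be illustrated with a scalar toy filter. This is only a sketch under stated assumptions, not the CRU architecture: it assumes an Ornstein-Uhlenbeck process dx = -theta * x dt + sigma dW, direct noisy observations of x, and hypothetical parameter values (`theta`, `sigma`, `obs_var`).

```python
import math


def propagate(mean, var, dt, theta=0.5, sigma=1.0):
    """Closed-form mean/variance of the assumed OU process after a gap of length dt."""
    decay = math.exp(-theta * dt)
    stationary_var = sigma ** 2 / (2.0 * theta)
    new_mean = mean * decay
    new_var = var * decay ** 2 + stationary_var * (1.0 - decay ** 2)
    return new_mean, new_var


def update(mean, var, obs, obs_var=0.1):
    """Standard scalar Kalman update for a direct, noisy observation of the state."""
    gain = var / (var + obs_var)
    return mean + gain * (obs - mean), (1.0 - gain) * var, gain


def filter_irregular(times, observations, mean0=0.0, var0=1.0, t0=0.0):
    """Alternate propagation over irregular gaps with updates at observation times."""
    mean, var, t_prev = mean0, var0, t0
    history = []
    for t, obs in zip(times, observations):
        mean, var = propagate(mean, var, t - t_prev)  # (1) continuous propagation
        mean, var, gain = update(mean, var, obs)      # (2) discrete update
        history.append((mean, var, gain))
        t_prev = t
    return history


if __name__ == "__main__":
    for mean, var, gain in filter_irregular([0.1, 0.2, 3.0], [1.0, 1.0, 1.0]):
        print(f"mean={mean:.3f} var={var:.3f} gain={gain:.3f}")
```

In this toy setting, a longer gap between observations inflates the propagated variance, so the gain for the next observation is larger; the time interval itself therefore influences how strongly each observation is weighted.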
arxiv.org

Uncertainty estimation under model misspecification in neural network regression

Maria R. Cervera, Rafael Dätwyler, Francesco D'Angelo, Hamza Keurti, Benjamin F. Grewe, Christian Henning. Although neural networks are powerful function approximators, the underlying modelling assumptions ultimately define the likelihood and thus the hypothesis class they are parameterizing. In classification, these assumptions are minimal as the commonly employed softmax is capable of representing any categorical distribution. In regression, however, restrictive assumptions on the type of continuous distribution to be realized are typically placed, like the dominant choice of training via mean-squared error and its underlying Gaussianity assumption. Recently, modelling advances allow one to be agnostic to the type of continuous distribution to be modelled, granting regression the flexibility of classification models. While past studies stress the benefit of such flexible regression models in terms of performance, here we study the effect of the model choice on uncertainty estimation. We highlight that under model misspecification, aleatoric uncertainty is not properly captured, and that a Bayesian treatment of a misspecified model leads to unreliable epistemic uncertainty estimates. Overall, our study provides an overview of how modelling choices in regression may influence uncertainty estimation and thus any downstream decision-making process.
SCIENCE
Massage Mag.com

Switch to Better Booking

Perfecting your trade is what you’ve been working for since day one. It’s the source of the passion that fuels your lifestyle, your career, and ultimately you as a person. Part of running a successful business, other than perfecting your trade, comes a laundry list of not-so-fun check boxes: Appointment management, customer communication, HR, marketing, record keeping, payment processing—phew! We’re out of breath. You know this list all too well. The very task of searching for a platform is daunting, let alone making sure you’re finding the best one for you and your business.
INTERNET
arxiv.org

Multi-task manifold learning for small sample size datasets

In this study, we develop a method for multi-task manifold learning. The method aims to improve the performance of manifold learning for multiple tasks, particularly when each task has a small number of samples. Furthermore, the method also aims to generate new samples for new tasks, in addition to new samples for existing tasks. In the proposed method, we use two different types of information transfer: instance transfer and model transfer. For instance transfer, datasets are merged among similar tasks, whereas for model transfer, the manifold models are averaged among similar tasks. For this purpose, the proposed method consists of a set of generative manifold models corresponding to the tasks, which are integrated into a general model of a fiber bundle. We applied the proposed method to artificial datasets and face image sets, and the results showed that the method was able to estimate the manifolds, even for a tiny number of samples.
COMPUTERS
arxiv.org

Semantic-Aware Collaborative Deep Reinforcement Learning Over Wireless Cellular Networks

Collaborative deep reinforcement learning (CDRL), in which multiple agents coordinate over a wireless network, is a promising approach to enable future intelligent and autonomous systems that rely on real-time decision-making in complex dynamic environments. Nonetheless, in practical scenarios, CDRL faces many challenges due to the heterogeneity of agents and their learning tasks, different environments, time constraints of the learning, and resource limitations of wireless networks. To address these challenges, in this paper, a novel semantic-aware CDRL method is proposed to enable a group of heterogeneous untrained agents with semantically-linked DRL tasks to collaborate efficiently across a resource-constrained wireless cellular network. To this end, a new heterogeneous federated DRL (HFDRL) algorithm is proposed to select the best subset of semantically relevant DRL agents for collaboration. The proposed approach then jointly optimizes the training loss and wireless bandwidth allocation for the cooperating selected agents in order to train each agent within the time limit of its real-time task. Simulation results show the superior performance of the proposed algorithm compared to state-of-the-art baselines.
COMPUTERS
arxiv.org

A Global Two-stage Algorithm for Non-convex Penalized High-dimensional Linear Regression Problems

Owing to their asymptotic oracle property, non-convex penalties such as the minimax concave penalty (MCP) and smoothly clipped absolute deviation (SCAD) have attracted much attention in high-dimensional data analysis, and have been widely used in signal processing, image restoration, matrix estimation, etc. However, in view of their non-convex and non-smooth characteristics, they are computationally challenging. Almost all existing algorithms converge locally, and the proper selection of initial values is crucial. Therefore, in actual operation, they often incorporate a warm-starting technique to meet the rigid requirement that the initial value must be sufficiently close to the optimal solution of the corresponding problem. In this paper, based on the DC (difference of convex functions) property of MCP and SCAD penalties, we aim to design a global two-stage algorithm for the high-dimensional least squares linear regression problems. A key idea for making the proposed algorithm efficient is to use the primal dual active set with continuation (PDASC) method, which is equivalent to the semi-smooth Newton (SSN) method, to solve the corresponding sub-problems. Theoretically, we not only prove the global convergence of the proposed algorithm, but also verify that the generated iterative sequence converges to a d-stationary point. In terms of computational performance, extensive experiments on simulated and real data show that the algorithm in this paper is superior to the latest SSN method and the classic coordinate descent (CD) algorithm for solving non-convex penalized high-dimensional linear regression problems.
CODING & PROGRAMMING
arxiv.org

QuantumCircuitOpt: An Open-source Framework for Provably Optimal Quantum Circuit Design

In recent years, the quantum computing community has seen an explosion of novel methods to implement non-trivial quantum computations on near-term hardware. An important direction of research has been to decompose an arbitrary entangled state, represented as a unitary, into a quantum circuit, that is, a sequence of gates supported by a quantum processor. It has been well known that circuits with longer decompositions and more entangling multi-qubit gates are error-prone for the current noisy, intermediate-scale quantum devices. To this end, there has been a significant interest to develop heuristic-based methods to discover compact circuits. We contribute to this effort by proposing QuantumCircuitOpt (QCOpt), a novel open-source framework which implements mathematical optimization formulations and algorithms for decomposing arbitrary unitary gates into a sequence of hardware-native gates. A core innovation of QCOpt is that it provides optimality guarantees on the quantum circuits that it produces. In particular, we show that QCOpt can find up to 57% reduction in the number of necessary gates on circuits with up to four qubits, and in run times less than a few minutes on commodity computing hardware. We also validate the efficacy of QCOpt as a tool for quantum circuit design in comparison with a naive brute-force enumeration algorithm. We also show how the QCOpt package can be adapted to various built-in types of native gate sets, based on different hardware platforms like those produced by IBM, Rigetti and Google. We hope this package will facilitate further algorithmic exploration for quantum processor designers, as well as quantum physicists.
CODING & PROGRAMMING
arxiv.org

Nonlinear conjugate gradient for smooth convex functions

The method of nonlinear conjugate gradients (NCG) is widely used in practice for unconstrained optimization, but it satisfies weak complexity bounds at best when applied to smooth convex functions. In contrast, Nesterov's accelerated gradient (AG) method is optimal up to constant factors for this class. However, when specialized to quadratic functions, conjugate gradient is optimal in a strong sense among function-gradient methods. Therefore, there is seemingly a gap in the menu of available algorithms: NCG, the optimal algorithm for quadratic functions that also exhibits good practical performance for general functions, has poor complexity bounds compared to AG. We propose an NCG method called C+AG ("conjugate plus accelerated gradient") to close this gap, that is, it is optimal for quadratic functions and still satisfies the best possible complexity bound for more general smooth convex functions. It takes conjugate gradient steps until insufficient progress is made, at which time it switches to accelerated gradient steps, and later retries conjugate gradient. The proposed method has the following theoretical properties: (i) It is identical to linear conjugate gradient (and hence terminates finitely) if the objective function is quadratic; (ii) Its running-time bound is $O(\epsilon^{-1/2})$ gradient evaluations for an $L$-smooth convex function, where $\epsilon$ is the desired residual reduction; (iii) Its running-time bound is $O(\sqrt{L/\ell}\ln(1/\epsilon))$ if the function is both $L$-smooth and $\ell$-strongly convex. In computational tests, the function-gradient evaluation count for the C+AG method typically behaves as whichever is better of AG or classical NCG. In most test cases it outperforms both.
MATHEMATICS
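
Property (i) in the abstract above relies on the fact that linear conjugate gradient terminates finitely on a quadratic objective. The sketch below shows plain linear conjugate gradient only, not the C+AG method; the helper names and the 2x2 test system are assumptions made for illustration. For a symmetric positive definite system of size n, it reaches the solution in at most n steps, up to rounding.

```python
def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-12):
    """Solve A x = b for symmetric positive definite A, i.e. minimize
    0.5 * x^T A x - b^T x, starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    r = list(b)            # residual b - A x at x = 0
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    iterations = 0
    for _ in range(n):     # at most n steps are needed in exact arithmetic
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                       # exact line search
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]      # next conjugate direction
        rs_old = rs_new
        iterations += 1
    return x, iterations


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]   # hypothetical SPD test matrix
    b = [1.0, 2.0]
    x, iterations = conjugate_gradient(A, b)
    print(x, iterations)
```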