Data-Driven Models for Control Engineering Applications Using the Koopman Operator
By Annika Junker, Julia Timmermann, Ansgar Trächtler
4 days ago
Within this work, we investigate how data-driven numerical approximation methods of the Koopman operator can be used in practical control engineering applications. We refer to the method Extended Dynamic Mode Decomposition (EDMD), which approximates a nonlinear dynamical system...
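As a rough illustration of the idea behind EDMD (a minimal sketch under stated assumptions, not the authors' implementation): snapshot pairs are lifted through a dictionary of observables, and a finite-dimensional Koopman matrix is fitted by least squares. The toy system in `step`, the dictionary in `lift`, and all parameter values are hypothetical and chosen so that the lifted dynamics are exactly linear.

```python
import numpy as np

def lift(X):
    """Hypothetical dictionary of observables: psi(x) = [x1, x2, x1^2]."""
    X = np.atleast_2d(X)
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1**2])

def step(X, a=0.9, b=0.5, c=0.3):
    """Toy nonlinear discrete-time system (an assumption for illustration)."""
    X = np.atleast_2d(X)
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([a * x1, b * x2 + c * x1**2])

def edmd(X, Y):
    """Least-squares fit of K such that lift(X) @ K is close to lift(Y)."""
    K, *_ = np.linalg.lstsq(lift(X), lift(Y), rcond=None)
    return K

# Snapshot pairs (x_k, x_{k+1}) sampled from the toy system
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
Y = step(X)
K = edmd(X, Y)

# One-step prediction in lifted coordinates; the first two entries are the state
x0 = np.array([[0.4, -0.2]])
x1_pred = (lift(x0) @ K)[:, :2]
```

For this particular dictionary the lifted space is invariant under the dynamics, so the fitted matrix reproduces the system up to floating-point error. With a generic dictionary, EDMD only yields an approximation whose quality depends on the chosen observables and data.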
We derive criteria for the selection of datapoints used for data-driven reduced-order modeling and other areas of supervised learning based on Gaussian process regression (GPR). While this is a well-studied area in the fields of active learning and optimal experimental design, most criteria in the literature are empirical. Here we introduce an optimality condition for the selection of a new input defined as the minimizer of the distance between the approximated output probability density function (pdf) of the reduced-order model and the exact one. Given that the exact pdf is unknown, we define the selection criterion as the supremum over the unit sphere of the native Hilbert space for the GPR. The resulting selection criterion, however, has a form that is difficult to compute. We combine results from GPR theory and asymptotic analysis to derive a computable form of the defined optimality criterion that is valid in the limit of small predictive variance. The derived asymptotic form of the selection criterion leads to convergence of the GPR model that guarantees a balanced distribution of data resources between probable and large-deviation outputs, resulting in an effective way for sampling towards data-driven reduced-order modeling.
Parcel sorting operations in logistics enterprises aim to achieve a high throughput of parcels through sorting centers. These sorting centers are composed of large circular conveyor belts on which incoming parcels are placed, with multiple arms known as chutes for sorting the parcels by destination, followed by packing into roller cages and loading onto outbound trucks. Modern sorting systems need to complement their hardware innovations with sophisticated algorithms and software to map destinations and workforce to specific chutes. While state of the art systems operate with fixed mappings, we propose an optimization approach that runs before every shift, and uses real-time forecast of destination demand and labor availability in order to maximize throughput. We use simulation to improve the performance and robustness of the optimization solution to stochasticity in the environment, through closed-loop tuning of the optimization parameters.
We investigate data-driven forward-inverse problems for the Yajima-Oikawa (YO) system by employing two techniques that improve the performance of deep physics-informed neural networks (PINNs), namely neuron-wise locally adaptive activation functions and L2 norm parameter regularization. In particular, we not only recover three different forms of vector rogue waves (RWs) in the forward problem of the YO system, including bright-bright RWs, intermediate-bright RWs and dark-bright RWs, but also study the inverse problem of the YO system using data with noise of different intensities. Compared with the PINN method using only locally adaptive activation functions, the PINN method with both strategies shows remarkable robustness when studying the inverse problem of the YO system with noisy training data; that is, the improved PINN model has excellent noise immunity. The asymptotic analysis of wavenumber k and the MI analysis for the YO system with unknown parameters are derived systematically by applying linearized instability analysis to the plane wave.
Functional connectivity (FC) studies have demonstrated the overarching value of studying the brain and its disorders through the undirected weighted graph of fMRI correlation matrix. Most of the work with the FC, however, depends on the way the connectivity is computed, and further depends on the manual post-hoc analysis of the FC matrices. In this work we propose a deep learning architecture BrainGNN that learns the connectivity structure as part of learning to classify subjects. It simultaneously applies a graphical neural network to this learned graph and learns to select a sparse subset of brain regions important to the prediction task. We demonstrate the model's state-of-the-art classification performance on a schizophrenia fMRI dataset and demonstrate how introspection leads to disorder relevant findings. The graphs learned by the model exhibit strong class discrimination and the sparse subset of relevant regions are consistent with the schizophrenia literature.
A Caputo-type fractional-order mathematical model for "metapopulation cholera transmission" was recently proposed in [Chaos Solitons Fractals 117 (2018), 37--49]. A sensitivity analysis of that model is done here to show the relevance of accurate parameter estimation. Then, a fractional optimal control (FOC) problem is formulated and numerically solved. A cost-effectiveness analysis is performed to assess the relevance of the studied control measures. Moreover, such analysis allows us to assess the cost and effectiveness of the control measures during intervention. We conclude that the FOC system is more effective only in part of the time interval. For this reason, we propose a system where the derivative order varies along the time interval, being fractional or classical when more advantageous. Such a variable-order fractional model, which we call a 'FractInt' system, proves to be the most effective in the control of the disease.
Policy Search and Model Predictive Control (MPC) are two different paradigms for robot control: policy search has the strength of automatically learning complex policies using experienced data, while MPC can offer optimal control performance using models and trajectory optimization. An open research question is how to leverage and combine the advantages of both approaches. In this work, we provide an answer by using policy search for automatically choosing high-level decision variables for MPC, which leads to a novel policy-search-for-model-predictive-control framework. Specifically, we formulate the MPC as a parameterized controller, where the hard-to-optimize decision variables are represented as high-level policies. Such a formulation allows optimizing policies in a self-supervised fashion. We validate this framework by focusing on a challenging problem in agile drone flight: flying a quadrotor through fast-moving gates. Experiments show that our controller achieves robust and real-time control performance in both simulation and the real world. The proposed framework offers a new perspective for merging learning and control.
We propose a novel framework for constructing linear time-invariant (LTI) models for data-driven representations of the Koopman operator for a class of stable nonlinear dynamics. The Koopman operator (generator) lifts a finite-dimensional nonlinear system to a possibly infinite-dimensional linear feature space. To utilize it for modeling, one needs to discover finite-dimensional representations of the Koopman operator. Learning suitable features is challenging, as one needs to learn LTI features that are both Koopman-invariant (evolve linearly under the dynamics) as well as relevant (spanning the original state) - a generally unsupervised learning task. For a theoretically well-founded solution to this problem, we propose learning Koopman-invariant coordinates by composing a diffeomorphic learner with a lifted aggregate system of a latent linear model. Using an unconstrained parameterization of stable matrices along with the aforementioned feature construction, we learn the Koopman operator features without assuming a predefined library of functions or knowing the spectrum, while ensuring stability regardless of the operator approximation accuracy. We demonstrate the superior efficacy of the proposed method in comparison to a state-of-the-art method on the well-known LASA handwriting dataset.
Cloud applications are increasingly shifting to interactive and loosely-coupled microservices. Despite their advantages, microservices complicate resource management, due to inter-tier dependencies. We present Sinan, a cluster manager for interactive microservices that leverages easily-obtainable tracing data instead of empirical decisions, to infer the impact of a resource allocation on end-to-end...
In control design, most control strategies are model-based and require accurate models to be applied successfully. Due to simplifications and the model-reality gap, physics-derived models frequently deviate from real-world systems. Likewise, purely data-driven methods often do not generalise well enough and may violate physical laws. Recently, Physics-Guided Neural Networks (PGNN) and physics-inspired loss functions have separately shown promising results in overcoming these drawbacks. In this contribution we extend existing methods towards the identification of non-autonomous systems and propose a combined approach, PGNN-L, which uses a PGNN and a physics-inspired loss term (-L) to successfully identify the system's dynamics while maintaining consistency with physical laws. The proposed method is demonstrated on two real-world nonlinear systems and outperforms existing techniques regarding complexity and reliability.
Automatic Speech Recognition (ASR) systems have found their use in numerous industrial applications in very diverse domains. Since domain-specific systems perform better than their generic counterparts on in-domain evaluation, the need for memory and compute-efficient domain adaptation is obvious. Particularly, adapting parameter-heavy transformer-based language models used for rescoring ASR hypothesis is challenging. In this work, we introduce domain-prompts, a methodology that trains a small number of domain token embedding parameters to prime a transformer-based LM to a particular domain. With just a handful of extra parameters per domain, we achieve 7-14% WER improvement over the baseline of using an unadapted LM. Despite being parameter-efficient, these improvements are comparable to those of fully-fine-tuned models with hundreds of millions of parameters. With ablations on prompt-sizes, dataset sizes, initializations and domains, we provide evidence for the benefits of using domain-prompts in ASR systems.
By Alejandro Morales-Hernández, Sebastian Rojas Gonzalez, Inneke Van Nieuwenhuyse, Jeroen Jordens, Maarten Witters, Bart Van Doninck
Adhesive joints are increasingly used in industry for a wide variety of applications because of their favorable characteristics such as high strength-to-weight ratio, design flexibility, limited stress concentrations, planar force transfer, good damage tolerance and fatigue resistance. Finding the optimal process parameters for an adhesive bonding process is challenging: the optimization is inherently multi-objective (aiming to maximize break strength while minimizing cost) and constrained (the process should not result in any visual damage to the materials, and stress tests should not result in failures that are adhesion-related). Real-life physical experiments in the lab are expensive to perform; traditional evolutionary approaches (such as genetic algorithms) are then ill-suited to solve the problem, due to the prohibitive number of experiments required for evaluation. In this research, we successfully applied specific machine learning techniques (Gaussian Process Regression and Logistic Regression) to emulate the objective and constraint functions based on a limited amount of experimental data. The techniques are embedded in a Bayesian optimization algorithm, which succeeds in detecting Pareto-optimal process settings in a highly efficient way (i.e., requiring a limited number of extra experiments).
In order to achieve reliable communication with a high data rate in massive multiple-input multiple-output (MIMO) systems in frequency division duplex (FDD) mode, the estimated channel state information (CSI) at the receiver needs to be fed back to the transmitter. However, the feedback overhead becomes exorbitant with the increasing number of antennas. In this paper, a two-stage low-rank (TSLR) CSI feedback scheme for millimeter wave (mmWave) massive MIMO systems is proposed to reduce the feedback overhead based on model-driven deep learning. In addition, we design a deep iterative neural network, named FISTA-Net, by unfolding the fast iterative shrinkage thresholding algorithm (FISTA) to achieve more efficient CSI feedback. Moreover, a shrinkage thresholding network (ST-Net) is designed in FISTA-Net based on the attention mechanism, which can choose the threshold adaptively. Simulation results show that the proposed TSLR CSI feedback scheme and FISTA-Net outperform the existing algorithms in various scenarios.
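For readers unfamiliar with the algorithm that FISTA-Net unfolds, the following is a minimal sketch of classical FISTA for the l1-regularized least-squares problem min_x 0.5*||Ax - y||^2 + lam*||x||_1. It is not the paper's network: the threshold here is the fixed value lam/L rather than one learned by ST-Net, and the function names and default iteration count are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    """Elementwise shrinkage: proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista(A, y, lam, n_iter=200):
    """Classical FISTA for 0.5 * ||A x - y||^2 + lam * ||x||_1."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    z = x.copy()                         # extrapolated (momentum) point
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ z - y)         # gradient of the smooth term at z
        x_new = soft_threshold(z - grad / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x
```

In an unfolded network, each loop iteration typically becomes one layer with its own learnable parameters. The abstract states that the threshold is chosen adaptively by ST-Net, which would take the place of the fixed `lam / L` above.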
End-to-end driving with a deep learning neural network (DNN) has become a rapidly growing paradigm of autonomous driving in industry and academia. Yet safety measures and interpretability still pose challenges to this paradigm. We propose an end-to-end driving algorithm that integrates multi-task DNN, path prediction, and control models in a pipeline of data flow from sensory devices through these models to driving decisions. It provides quantitative measures to evaluate the holistic, dynamic, and real-time performance of end-to-end driving systems, and thus allows their safety and interpretability to be quantified. The DNN is a modified UNet, a well-known encoder-decoder neural network for semantic segmentation. It consists of one segmentation, one regression, and two classification tasks for lane segmentation, path prediction, and vehicle controls. We present three variants of the modified UNet architecture having different complexities, compare them on different tasks in four static measures for both single and multi-task (MT) architectures, and then identify the best one by two additional dynamic measures in real-time simulation. We also propose a learning- and model-based longitudinal controller using the model predictive control method. With the Stanley lateral controller, our results show that MTUNet outperforms an earlier modified UNet in terms of curvature and lateral offset estimation on curvy roads at normal speed, which has been tested in a real car driving on real roads.
Sales forecasting is the prerequisite for a lot of managerial decisions such as production planning, material resource planning and budgeting in the supply chain. Promotions are one of the most important business strategies that are often used to boost sales. While promotions are attractive for generating demand, it is often difficult to forecast demand in their presence. In the past few decades, several quantitative models have been developed to forecast sales including statistical and machine learning models. However, these methods may not be adequate to account for all the internal and external factors that may impact sales. As a result, qualitative models have been adopted along with quantitative methods as consulting experts has been proven to improve forecast accuracy by providing contextual information. Such models are being used extensively to account for factors that can lead to a rapid change in sales, such as during promotions. In this paper, we aim to use Bayesian Networks to forecast promotional sales where a combination of factors such as price, type of promotions, and product location impacts sales. We choose to develop a BN model because BN models essentially have the capability to combine various qualitative and quantitative factors with causal forms, making it an attractive tool for sales forecasting during promotions. This can be used to adjust a company's promotional strategy in the context of this case study. We gather sales data for a particular product from a retailer that sells products in Australia. We develop a Bayesian Network for this product and validate our results by empirical analysis. This paper confirms that BNs can be effectively used to forecast sales, especially during promotions. In the end, we provide some research avenues for using BNs in forecasting sales.
Chimney fires constitute one of the most commonly occurring fire types. Precise prediction and prompt prevention are crucial in reducing the harm they cause. In this paper, we develop a combined machine learning and statistical modeling process to predict chimney fires. Firstly, we use random forests and permutation importance techniques to identify the most informative explanatory variables. Secondly, we design a Poisson point process model and apply associated logistic regression estimation to estimate the parameters. Moreover, we validate the Poisson model assumption using second-order summary statistics and residuals. We implement the modeling process on data collected by the Twente Fire Brigade and obtain plausible predictions. Compared to similar studies, our approach has two advantages: i) with random forests, we can select explanatory variables non-parametrically considering variable dependence; ii) using logistic regression estimation, we can fit the statistical model efficiently by tuning it to focus on important regions and times of the fire data.
When thermodynamical quantities are associated with quantum systems, the question arises of how to treat scenarios where the notion of temperature could exhibit some quantum features. It is known that the temperature of a gas in thermal equilibrium is not constant in a gravitational field, but it is not known how a delocalised quantum system would thermalise with such a bath. In this theoretical work we demonstrate two scenarios in which the notion of a 'superposition of temperatures' arises. First: a probe interacting with different baths dependent on the state of another quantum system (control). Second: the probe interacting with a bath in a superposition of purified states, each associated with a different temperature. We show that these two scenarios are fundamentally different and can be operationally distinguished. Moreover, we show that the probe does not in general thermalise even when the involved temperatures of the baths or purifications are equal. Furthermore, we show the final probe state depends on the specific realisation of the thermalising channels, being sensitive to the particular Kraus representations of the channels. This point appears to explain recent results obtained in the context of quantum interference of relativistic particle detectors thermalising with Unruh or Hawking radiation. Finally, we show that these results are reproduced in partial and pre-thermalisation processes, and thus our approach and conclusions also generally apply beyond the idealised scenarios, where thermalisation is not exact.
