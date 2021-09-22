
Constrained multi-agent ergodic area surveying control based on finite element approximation of the potential field

By Stefan Ivić, Ante Sikirica, Bojan Crnković
arxiv.org
 6 days ago

Heat Equation Driven Area Coverage (HEDAC) is a state-of-the-art multi-agent ergodic motion control guided by a gradient of a potential field. A finite element method is implemented here to obtain a solution of the Helmholtz partial differential equation, which models the potential field for surveying motion control. This allows us to survey arbitrarily shaped domains and to include obstacles in an elegant and robust manner intrinsic to HEDAC's fundamental idea. For a simple kinematic motion, the obstacle and boundary avoidance constraints are successfully handled by directing the agent motion with the gradient of the potential. However, including additional constraints, such as the minimal clearance distance from stationary and moving obstacles and the minimal path curvature radius, requires further alterations of the control algorithm. We introduce a relatively simple yet robust approach for handling these constraints by formulating a straightforward optimization problem based on collision-free escape route maneuvers. This approach provides a guaranteed collision avoidance mechanism, while being computationally inexpensive as a result of the optimization problem partitioning. The proposed motion control is evaluated in simulations of three realistic surveying scenarios, showing the effectiveness of the surveying and the robustness of the control algorithm. Furthermore, potential maneuvering difficulties due to improperly defined surveying scenarios are highlighted and we provide guidelines on how to overcome them. The results are promising and indicate real-world applicability of the proposed constrained multi-agent motion control for autonomous surveying and potentially other HEDAC utilizations.
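
The core idea in the abstract above, steering an agent along the gradient of a potential obtained from a Helmholtz-type equation, can be illustrated with a minimal sketch. It is not the authors' method: it swaps the finite element solver for a finite-difference Jacobi iteration on a rectangular grid, and the equation form alpha*lap(u) - beta*u = -s, the coefficients, the zero-flux boundary, the grid size and the function names solve_potential and step_agent are all assumptions made for illustration.

```python
import numpy as np

def solve_potential(source, alpha=1.0, beta=0.1, h=1.0, iters=2000):
    """Jacobi iteration for alpha*lap(u) - beta*u = -source (assumed form)."""
    u = np.zeros_like(source, dtype=float)
    diag = 4.0 * alpha / h**2 + beta
    for _ in range(iters):
        # Edge padding copies boundary values outward, i.e. zero normal gradient.
        p = np.pad(u, 1, mode="edge")
        neighbours = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
        u = (alpha * neighbours / h**2 + source) / diag
    return u

def step_agent(pos, u, speed=1.0):
    """Move one step of length `speed` along the potential gradient at the nearest cell."""
    grad_row, grad_col = np.gradient(u)
    upper = np.array(u.shape) - 1
    i, j = np.clip(np.round(pos).astype(int), 0, upper)
    g = np.array([grad_row[i, j], grad_col[i, j]])
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return pos  # flat potential: no preferred direction
    return np.clip(pos + speed * g / norm, 0.0, upper)

# Hypothetical scenario: one unsurveyed spot acts as the source of the potential.
source = np.zeros((20, 20))
source[15, 15] = 1.0
u = solve_potential(source)
start = np.array([2.0, 2.0])
moved = step_agent(start, u)
```

In this sketch the potential peaks at the source cell, so a gradient step from anywhere in the grid heads toward the region still to be covered. The clearance and curvature constraints described in the abstract are not modelled here.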

Related
arxiv.org

DSDF: An approach to handle stochastic agents in collaborative multi-agent reinforcement learning

Multi-agent reinforcement learning has received a lot of attention in recent years and has applications in many different areas. Existing methods involving centralized training and decentralized execution attempt to train the agents towards learning a pattern of coordinated actions to arrive at an optimal joint policy. However, if some agents are stochastic to varying degrees, the above methods often fail to converge and provide poor coordination among agents. In this paper we show how this stochasticity of agents, which could be a result of malfunction or aging of robots, can add to the uncertainty in coordination and thereby contribute to unsatisfactory global coordination. In this case, the deterministic agents have to understand the behavior and limitations of the stochastic agents while arriving at an optimal joint policy. Our solution, DSDF, tunes the discounted factor for the agents according to uncertainty and uses the values to update the utility networks of individual agents. DSDF also helps in imparting an extent of reliability in coordination, thereby granting stochastic agents tasks which are immediate and of shorter trajectory, with deterministic ones taking the tasks which involve longer planning. Such a method enables joint coordination of agents, some of which may be partially performing, and thereby can reduce or delay the investment in agent/robot replacement in many circumstances. Results on a benchmark environment for different scenarios show the efficacy of the proposed approach when compared with existing approaches.
COMPUTERS
arxiv.org

Image-Based Multi-UAV Tracking System in a Cluttered Environment

A tracking controller for unmanned aerial vehicles (UAVs) is developed to track moving targets undergoing unknown translational and rotational motions. The main challenges are to control both the relative positions and angles between the target and the UAVs to within desired values, and to guarantee that the generated control inputs to the UAVs are feasible (i.e., within their motion capabilities). Moreover, the UAVs are controlled to ensure that the target always remains within the fields of view of their onboard cameras. To the best of our knowledge, this is the first work to apply multiple UAVs to cooperatively track a dynamic target while ensuring that the UAVs remain connected and that both occlusion and collisions are avoided. To achieve these control objectives, a designed controller solved based on the aforementioned tracking controller using quadratic programming can generate minimally invasive control actions to achieve occlusion avoidance and collision avoidance. Furthermore, control barrier functions (CBFs) with a distributed design are developed in order to reduce the amount of inter-UAV communication. Simulations were performed to assess the efficacy and performance of the developed CBF-based controller for the multi-UAV system in tracking a target.
TECHNOLOGY
arxiv.org

Data-Driven Moment-Based Distributionally Robust Chance-Constrained Optimization

Many stochastic optimization problems include chance constraints that enforce constraint satisfaction with a specific probability; however, solving an optimization problem with chance constraints assumes that the solver has access to the exact underlying probability distribution, which is often unreasonable. In data-driven applications, it is common instead to use historical data samples as a surrogate to the distribution; however, this comes at a significant computational cost from the added time spent either processing the data or, worse, adding additional variables and constraints to the optimization problem. On the other hand, the sample mean and covariance matrix are lightweight to calculate, and it is possible to reframe the chance constraint as a distributionally robust chance constraint. The challenge here is that the sample mean and covariance matrix themselves are random variables, so their uncertainty should be factored into the chance constraint. This work bridges this gap by modifying the standard method of distributionally robust chance constraints to guarantee its satisfaction. The proposed data-driven method is tested on a particularly problematic example. The results show that the computationally fast proposed method is not significantly more conservative than other methods.
COMPUTERS
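
The abstract above builds on the standard moment-based distributionally robust chance constraint, which can be sketched briefly. For a linear constraint a^T xi <= b that must hold with probability at least 1 - eps for every distribution sharing a given mean mu and covariance Sigma, the well-known reformulation is a^T mu + sqrt((1 - eps)/eps) * sqrt(a^T Sigma a) <= b. The sketch below plugs in the sample mean and sample covariance directly, so it is the unmodified baseline, not the paper's correction for moment uncertainty; the function name and the data are made up for illustration.

```python
import numpy as np

def dr_chance_constraint(a, b, samples, eps):
    """Check a^T mu + sqrt((1-eps)/eps) * sqrt(a^T Sigma a) <= b using sample moments."""
    a = np.asarray(a, dtype=float)
    samples = np.asarray(samples, dtype=float)
    mu = samples.mean(axis=0)                            # sample mean
    sigma = np.atleast_2d(np.cov(samples, rowvar=False)) # sample covariance (ddof=1)
    kappa = np.sqrt((1.0 - eps) / eps)                   # worst-case safety factor
    lhs = a @ mu + kappa * np.sqrt(a @ sigma @ a)
    return bool(lhs <= b), float(lhs)

# Hypothetical data: four samples of a two-dimensional uncertain vector.
data = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
ok_loose, lhs = dr_chance_constraint([1.0, 1.0], 7.0, data, eps=0.1)
ok_tight, _ = dr_chance_constraint([1.0, 1.0], 6.0, data, eps=0.1)
```

Because the sample moments are themselves random, a check like this can pass on one data set and fail on another, which is the gap the abstract says the proposed method addresses.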
arxiv.org

Finite Model Property and Bisimulation for LFD

Recently, Baltag and van Benthem introduced a decidable logic of functional dependence (LFD) that extends the logic of Cylindrical Relativized Set Algebras (CRS) with atomic local dependence statements. Its semantics can be given in terms of generalised assignment models or their modal counterparts, hence the logic is both a first-order and a modal logic. We show that LFD has the finite model property (FMP) using Herwig's theorem on extending partial isomorphisms, and prove a bisimulation invariance theorem characterizing LFD as a fragment of first-order logic.
MATHEMATICS
DMAPF: A Decentralized and Distributed Solver for Multi-Agent Path Finding Problem with Obstacles

Multi-Agent Path Finding (MAPF) is a problem of finding a sequence of movements for agents to reach their assigned location without collision. Centralized algorithms usually give optimal solutions, but have difficulties to scale without employing various techniques - usually with a sacrifice of optimality; but solving MAPF problems with the number of agents greater than a thousand remains a challenge nevertheless. To tackle the scalability issue, we present DMAPF - a decentralized and distributed MAPF solver, which is a continuation of our recently published work, ros-dmapf. We address the issues of ros-dmapf where it (i) only works in maps without obstacles; and (ii) has a low success rate with dense maps. Given a MAPF problem, both ros-dmapf and DMAPF divide the map spatially into subproblems, but the latter further divides each subproblem into disconnected regions called areas. Each subproblem is assigned to a distributed solver, which then individually creates an abstract plan - a sequence of areas that an agent needs to visit - for each agent in it, and interleaves agent migration with movement planning. Answer Set Programming, which is known for its performance in small but complex problems, is used in many parts including problem division, abstract planning, border assignment for the migration, and movement planning. Robot Operating System is used to facilitate communication between the solvers and to enable the opportunity to integrate with robotic systems. DMAPF introduces a new interaction protocol between the solvers, and mechanisms that together result in a higher success rate and better solution quality without sacrificing much of the performance. We implement and experimentally validate DMAPF by comparing it with other state-of-the-art MAPF solvers and the results show that our system achieves better scalability.
COMPUTERS
Security Analysis of Distributed Ledgers and Blockchains through Agent-based Simulation

In this paper we describe LUNES-Blockchain, an agent-based simulator of blockchains that relies on Parallel and Distributed Simulation (PADS) techniques to obtain high scalability. The software is organized as a multi-level simulator that makes it possible to simulate a virtual environment, made of many nodes running the protocol of a specific Distributed Ledger Technology (DLT), such as the Bitcoin or the Ethereum blockchains. This virtual environment is executed on top of a lower-level Peer-to-Peer (P2P) network overlay, which can be structured based on different topologies and with a given number of nodes and edges. Functionalities at different levels of abstraction are managed separately, by different software modules and with different time granularity. This allows for accurate simulations, where (and when) it is needed, and enhances the simulation performance. Using LUNES-Blockchain, it is possible to simulate different types of attacks on the DLT. In this paper, we specifically focus on the P2P layer, considering the selfish mining, the 51% attack and the Sybil attack. As concerns selfish mining and the 51% attack, our aim is to understand how much the hash-rate (i.e. a general measure of the processing power in the blockchain network) of the attacker can influence the outcome of the misbehaviour. On the other hand, in the filtering denial of service (i.e. Sybil Attack), we investigate which dissemination protocol in the underlying P2P network makes the system more resilient to a varying number of nodes that drop the messages. The results confirm the viability of the simulation-based techniques for the investigation of security aspects of DLTs.
SOFTWARE
Density-based Curriculum for Multi-goal Reinforcement Learning with Sparse Rewards

Multi-goal reinforcement learning (RL) aims to qualify the agent to accomplish multi-goal tasks, which is of great importance in learning scalable robotic manipulation skills. However, reward engineering always requires strenuous efforts in multi-goal RL. Moreover, it will introduce inevitable bias causing the suboptimality of the final policy. The sparse reward provides a simple yet efficient way to overcome such limits. Nevertheless, it harms the exploration efficiency and even hinders the policy from convergence. In this paper, we propose a density-based curriculum learning method for efficient exploration with sparse rewards and better generalization to desired goal distribution. Intuitively, our method encourages the robot to gradually broaden the frontier of its ability along the directions to cover the entire desired goal space as much and quickly as possible. To further improve data efficiency and generality, we augment the goals and transitions within the allowed region during training. Finally, we evaluate our method on diversified variants of benchmark manipulation tasks that are challenging for existing methods. Empirical results show that our method outperforms the state-of-the-art baselines in terms of both data efficiency and success rate.
COMPUTERS
Transformer-based Lexically Constrained Headline Generation

This paper explores a variant of automatic headline generation methods, where a generated headline is required to include a given phrase such as a company or a product name. Previous methods using Transformer-based models generate a headline including a given phrase by providing the encoder with additional information corresponding to the given phrase. However, these methods cannot always include the phrase in the generated headline. Inspired by previous RNN-based methods generating token sequences in backward and forward directions from the given phrase, we propose a simple Transformer-based method that guarantees to include the given phrase in the high-quality generated headline. We also consider a new headline generation strategy that takes advantage of the controllable generation order of Transformer. Our experiments with the Japanese News Corpus demonstrate that our methods, which are guaranteed to include the phrase in the generated headline, achieve ROUGE scores comparable to previous Transformer-based methods. We also show that our generation strategy performs better than previous strategies.
TECHNOLOGY
Artificial neural network-based reduced-order modeling for turbulent wake of a finite wall-mounted square cylinder

This study presents an artificial neural network and proper orthogonal decomposition (POD)-based reduced-order model (ROM) of turbulent flow around a finite wall-mounted square cylinder. The proposed model is suitable for turbulent wake control applications because it can predict the dynamics of the main features of the flow field without computing Navier-Stokes equations. Long short-term memory neural network (LSTM NN) and bidirectional long short-term memory neural network (BLSTM NN) are used to predict the temporal evolution of the POD time coefficients at different planes along the height of the obstacle. The improved delayed detached-eddy simulation (IDDES) is performed to generate the training datasets. A transfer learning (TL) approach is utilized in the training process by using the weights of the LSTM/BLSTM NN that are used to predict the POD time coefficients of the planes at lower elevations to initialize the weights of the networks at higher elevations along the height of the obstacle. The use of TL results in a remarkable improvement in the capability of the LSTM/BLSTM NN prediction compared with the one when the network is initialized with random weights. BLSTM NN shows better results compared with LSTM NN in terms of training and prediction error, indicating that the BLSTM-POD model is more suitable to be used as a ROM for predicting the turbulent wake. Furthermore, the temporal behavior of the time coefficients is carefully examined using the phase space plots and Poincaré sections. The results of using different lengths of the prediction time window showed that the prediction error of the POD time coefficients increases as the prediction time window increases and the error increasing rate decreases with the ranking of the POD time coefficients.
COMPUTERS
Comprehensive Multi-Agent Epistemic Planning

Over the last few years, the concept of Artificial Intelligence has become central in different tasks concerning both our daily life and several working scenarios. Among these tasks automated planning has always been central in the AI research community. In particular, this manuscript is focused on a specialized kind of planning known as Multi-agent Epistemic Planning (MEP). Epistemic Planning (EP) refers to an automated planning setting where the agent reasons in the space of knowledge/beliefs states and tries to find a plan to reach a desirable state from a starting one. Its general form, the MEP problem, involves multiple agents who need to reason about both the state of the world and the information flows between agents. To tackle the MEP problem several tools have been developed and, while the diversity of approaches has led to a deeper understanding of the problem space, each proposed tool lacks some abilities and does not allow for a comprehensive investigation of the information flows. That is why, the objective of our work is to formalize an environment where a complete characterization of the agents' knowledge/beliefs interaction and update is possible. In particular, we aim to achieve such goal by defining a new action-based language for multi-agent epistemic planning and to implement an epistemic planner based on it. This solver should provide a tool flexible enough to reason on different domains, e.g., economy, security, justice and politics, where considering others' knowledge/beliefs could lead to winning strategies.
SOFTWARE
Constrained Optimization Visualization Toolbox

A toolbox for visualizing the graphical approach in solving 2D constrained optimization problems. This toolbox is part of the optimization visualizing series that I developed during the optimization class at Georgia Tech. More details about these toolboxes and packages are presented in this blog post. Particularly, this toolbox can be...
CODING & PROGRAMMING
Convergence analysis of an operator-compressed multiscale finite element method for Schrödinger equations with multiscale potentials

In this paper, we analyze the convergence of the operator-compressed multiscale finite element method (OC MsFEM) for Schrödinger equations with general multiscale potentials in the semiclassical regime. In the OC MsFEM the multiscale basis functions are constructed by solving a constrained energy minimization. Under a mild assumption on the mesh size $H$, we prove the exponential decay of the multiscale basis functions so that localized multiscale basis functions can be constructed, which achieve the same accuracy as the global ones if the oversampling size $m = O(\log(1/H))$. We prove the first-order convergence in the energy norm and second-order convergence in the $L^2$ norm for the OC MsFEM and super convergence rates can be obtained if the solution possesses sufficiently high regularity. By analysing the regularity of the solution, we also derive the dependence of the error estimates on the small parameters of the Schrödinger equation. We find that the OC MsFEM outperforms the finite element method (FEM) due to the super convergence behavior for high-regularity solutions and weaker dependence on the small parameters for low-regularity solutions in the presence of the multiscale potential. Finally, we present numerical results to demonstrate the accuracy and robustness of the OC MsFEM.
MATHEMATICS
Background Independent Field Quantization with Sequences of Gravity-Coupled Approximants II: Metric Fluctuations

We apply the new quantization scheme outlined in Phys. Rev. D102 (2020) 125001 to explore the influence which quantum vacuum fluctuations of the spacetime metric exert on the universes of Quantum Einstein Gravity, which is regarded an effective theory here. The scheme promotes the principle of Background Independence to the level of the regularized precursors of a quantum field theory ("approximants") and severely constrains admissible regularization schemes. Without any tuning of parameters, we find that the zero point oscillations of linear gravitons on maximally symmetric spacetimes do not create the commonly expected cosmological constant problem of a cutoff-size curvature. On the contrary, metric fluctuations are found to reduce positive curvatures to arbitrarily tiny and ultimately vanishing values when the cutoff is lifted. This suggests that flat space could be the distinguished groundstate of pure quantum gravity. Our results contradict traditional beliefs founded upon background-dependent calculations whose validity must be called into question therefore.
PHYSICS
An Approximation Algorithm for a General Class of Multi-Parametric Optimization Problems

In a widely studied class of multi-parametric optimization problems, the objective value of each solution is an affine function of real-valued parameters. For many important multi-parametric optimization problems, an optimal solutions set with minimum cardinality can contain super-polynomially many solutions. Consequently, any exact algorithm for such problems must output a super-polynomial number of solutions.
CODING & PROGRAMMING
Reinforcement Learning for Finite-Horizon Restless Multi-Armed Multi-Action Bandits

We study a finite-horizon restless multi-armed bandit problem with multiple actions, dubbed R(MA)^2B. The state of each arm evolves according to a controlled Markov decision process (MDP), and the reward of pulling an arm depends on both the current state of the corresponding MDP and the action taken. The goal is to sequentially choose actions for arms so as to maximize the expected value of the cumulative rewards collected. Since finding the optimal policy is typically intractable, we propose a computationally appealing index policy which we call Occupancy-Measured-Reward Index Policy. Our policy is well-defined even if the underlying MDPs are not indexable. We prove that it is asymptotically optimal when the activation budget and number of arms are scaled up, while keeping their ratio as a constant. For the case when the system parameters are unknown, we develop a learning algorithm. Our learning algorithm uses the principle of optimism in the face of uncertainty and further uses a generative model in order to fully exploit the structure of Occupancy-Measured-Reward Index Policy. We call it the R(MA)^2B-UCB algorithm. As compared with the existing algorithms, R(MA)^2B-UCB performs close to an offline optimum policy, and also achieves a sub-linear regret with a low computational complexity. Experimental results show that R(MA)^2B-UCB outperforms the existing algorithms in both regret and run time.
SCIENCE
Multi-angle Quantum Approximate Optimization Algorithm

The quantum approximate optimization algorithm (QAOA) generates an approximate solution to combinatorial optimization problems using a variational ansatz circuit defined by parameterized layers of quantum evolution. In theory, the approximation improves with increasing ansatz depth but gate noise and circuit complexity undermine performance in practice. Here, we introduce a multi-angle ansatz for QAOA that reduces circuit depth and improves the approximation ratio by increasing the number of classical parameters. Even though the number of parameters increases, our results indicate that good parameters can be found in polynomial time. This new ansatz gives a 33% increase in the approximation ratio for an infinite family of MaxCut instances over QAOA. The optimal performance is lower bounded by the conventional ansatz, and we present empirical results for graphs on eight vertices showing that one layer of the multi-angle ansatz is comparable to three layers of the traditional ansatz on MaxCut problems. Similarly, multi-angle QAOA yields a higher approximation ratio than QAOA at the same depth on a collection of MaxCut instances on fifty and one-hundred vertex graphs. Many of the optimized parameters are found to be zero, so their associated gates can be removed from the circuit, further decreasing the circuit depth. These results indicate that multi-angle QAOA requires shallower circuits to solve problems than QAOA, making it more viable for near-term intermediate-scale quantum devices.
COMPUTERS
Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning

Trust region methods rigorously enabled reinforcement learning (RL) agents to learn monotonically improving policies, leading to superior performance on a variety of tasks. Unfortunately, when it comes to multi-agent reinforcement learning (MARL), the property of monotonic improvement may not simply apply; this is because agents, even in cooperative games, could have conflicting directions of policy updates. As a result, achieving a guaranteed improvement on the joint policy where each agent acts individually remains an open challenge. In this paper, we extend the theory of trust region learning to MARL. Central to our findings are the multi-agent advantage decomposition lemma and the sequential policy update scheme. Based on these, we develop Heterogeneous-Agent Trust Region Policy Optimisation (HATRPO) and Heterogeneous-Agent Proximal Policy Optimisation (HAPPO) algorithms. Unlike many existing MARL algorithms, HATRPO/HAPPO do not need agents to share parameters, nor do they need any restrictive assumptions on decomposability of the joint value function. Most importantly, we justify in theory the monotonic improvement property of HATRPO/HAPPO. We evaluate the proposed methods on a series of Multi-Agent MuJoCo and StarCraftII tasks. Results show that HATRPO and HAPPO significantly outperform strong baselines such as IPPO, MAPPO and MADDPG on all tested tasks, therefore establishing a new state of the art.
COMPUTERS
High order direct parametrisation of invariant manifolds for model order reduction of finite element structures: application to large amplitude vibrations and uncovering of a folding point

This paper investigates model-order reduction methods for geometrically nonlinear structures. The parametrisation method of invariant manifolds is used and adapted to the case of mechanical systems expressed in the physical basis, so that the technique is directly applicable to problems discretised by the finite element method. Two nonlinear mappings, respectively related to displacement and velocity, are introduced, and the link between the two is made explicit at arbitrary order of expansion. The same development is performed on the reduced-order dynamics which is computed at generic order following the different styles of parametrisation. More specifically, three different styles are introduced and commented: the graph style, the complex normal form style and the real normal form style. These developments allow making better connections with earlier works using these parametrisation methods. The technique is then applied to three different examples. A clamped-clamped arch with increasing curvature is first used to show an example of a system with a softening behaviour turning to hardening at larger amplitudes, which can be replicated with a single mode reduction. Secondly, the case of a cantilever beam is investigated. It is shown that the invariant manifold of the first mode shows a folding point at large amplitudes which is not connected to an internal resonance. This exemplifies the failure of the graph style due to the folding point, whereas the normal form style is able to pass over the folding. Finally, a MEMS micromirror undergoing large rotations is used to show the importance of using high-order expansions on an industrial example.
MATHEMATICS
Quark Number Fluctuations at Finite Temperature and Finite Chemical Potential via the Dyson-Schwinger Equation Approach

We investigate the quark number fluctuations up to the fourth order in the matter composed of two light flavor quarks with isospin symmetry and at finite temperature and finite chemical potential using the Dyson-Schwinger equation approach of QCD. In order to solve the quark gap equation, we approximate the dressed quark-gluon vertex with the bare one and adopt both the Marris-Tandy (MT) model and the infrared constant (Qin-Chang) model for the dressed gluon propagator. Our results indicate that the second, third, and fourth order fluctuations of net quark number all diverge at the critical end point (CEP). Around the CEP, the second order fluctuation possesses obvious pump while the third and fourth order ones exhibit distinct wiggles between positive and negative. For the MT model and the Qin-Chang model, we give the pseudo-critical temperature at zero quark chemical potential as $T_{c}=146$ MeV and $150$ MeV, and locate the CEP at $({\mu_{E}^{q}}, {T_{E}^{}}) = (120, 124)$ MeV and $(124,129)$ MeV, respectively. In addition, our results manifest that the fluctuations are insensitive to the details of the model, but the location of the CEP shifts to low chemical potential and high temperature as the confinement length scale increases.
PHYSICS
Generalization of Safe Optimal Control Actions on Networked Multi-Agent Systems

We propose a unified framework to quickly generate a safe optimal control action for a new task from existing controllers on Multi-Agent Systems (MASs). The control action composition is achieved by taking a weighted mixture of the existing controllers according to the contribution of each component task. Instead of sophisticatedly tuning the cost parameters and other hyper-parameters for safe and reliable behavior in the optimal control framework, the safety of each single task solution is guaranteed using the control barrier functions (CBFs) for high-degree stochastic systems, which constrain the system state within a known safe operation region where it originates from. Linearity of CBF constraints in control enables the control action composition. The discussed framework can immediately provide reliable solutions to new tasks by taking a weighted mixture of solved component-task actions and filtering on some CBF constraints, instead of performing an extensive sampling to achieve a new controller. Our results are verified and demonstrated on both a single UAV and two cooperative UAV teams in an environment with obstacles.
SOFTWARE
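
The remark in the abstract above that linearity of CBF constraints in the control enables composition can be illustrated with a small numerical check. The sketch assumes a deterministic, first-order CBF condition of the common form Lf_h + Lg_h @ u + alpha_h >= 0 evaluated at a single fixed state, and it assumes nonnegative weights summing to one. The paper itself treats high-degree stochastic systems, so this is a simplified stand-in, and every name and number below is hypothetical.

```python
import numpy as np

def cbf_margin(u, Lf_h, Lg_h, alpha_h):
    """Left-hand side of the assumed CBF condition; safe when the value is >= 0."""
    return Lf_h + Lg_h @ np.asarray(u, dtype=float) + alpha_h

def mix_controls(controls, weights):
    """Convex combination of control actions (weights >= 0, summing to 1)."""
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    return w @ np.asarray(controls, dtype=float)

# Hypothetical Lie-derivative terms at one state, and two component-task controls.
Lf_h, Lg_h, alpha_h = -1.0, np.array([1.0, 0.5]), 0.5
u1, u2 = np.array([1.0, 0.0]), np.array([0.0, 3.0])
weights = [0.25, 0.75]

u_mix = mix_controls([u1, u2], weights)
m1 = cbf_margin(u1, Lf_h, Lg_h, alpha_h)
m2 = cbf_margin(u2, Lf_h, Lg_h, alpha_h)
m_mix = cbf_margin(u_mix, Lf_h, Lg_h, alpha_h)
```

Because the condition is affine in u, the margin of the mixture equals the same weighted average of the individual margins, so a mixture of safe actions stays safe under these assumptions.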

