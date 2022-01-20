
Minimax Demographic Group Fairness in Federated Learning

By Afroditi Papadaki, Natalia Martinez, Martin Bertran, Guillermo Sapiro, Miguel Rodrigues
Federated learning is an increasingly popular paradigm that enables a large number of entities to collaboratively learn better models. In this work, we study minimax group fairness in federated learning scenarios where...

Communication-Efficient Federated Learning with Acceleration of Global Momentum

Federated learning often suffers from unstable and slow convergence due to heterogeneous characteristics of participating clients. This tendency is aggravated when the client participation ratio is low, since the information collected from the clients at each round is prone to be more inconsistent. To tackle the challenge, we propose a novel federated learning framework that improves the stability of the server-side aggregation step by sending the clients an accelerated model, estimated with the global gradient, to guide the local gradient updates. Our algorithm naturally aggregates and conveys the global update information to participants with no additional communication cost and does not require storing past models in the clients. We also regularize the local update to further reduce the bias and improve the stability of local updates. We perform comprehensive empirical studies on real data under various settings and demonstrate the remarkable performance of the proposed method in terms of accuracy and communication-efficiency compared to the state-of-the-art methods, especially with low client participation rates. Our code is available at this https URL ninigapa0/FedAGM.
CODING & PROGRAMMING
RFLBAT: A Robust Federated Learning Algorithm against Backdoor Attack

Federated learning (FL) is a distributed machine learning paradigm where a large number of scattered clients (e.g. mobile devices or IoT devices) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. Unfortunately, FL is susceptible to a variety of attacks, including backdoor attacks, which are made substantially worse in the presence of malicious attackers. Most algorithms assume that the malicious attackers are no more numerous than the benign clients or that the data is independent and identically distributed (IID). However, no one knows the number of malicious attackers, and the data is usually not identically distributed (Non-IID). In this paper, we propose RFLBAT, which utilizes the principal component analysis (PCA) technique and the K-means clustering algorithm to defend against backdoor attacks. Our algorithm RFLBAT does not bound the number of backdoored attackers or constrain the data distribution, and requires no auxiliary information outside of the learning process. We conduct extensive experiments including a variety of backdoor attack types. Experimental results demonstrate that RFLBAT outperforms the existing state-of-the-art algorithms and is able to resist various backdoor attack scenarios including different number of attackers (DNA), different Non-IID scenarios (DNS), different number of clients (DNC) and distributed backdoor attack (DBA).
CELL PHONES
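The RFLBAT abstract above names two building blocks, PCA and K-means, without giving the pipeline. The sketch below only illustrates how those two pieces can group client updates: it projects flattened updates onto their top principal components and then splits them into two clusters. The function names (`pca_project`, `two_means`, `cluster_updates`), the choice of two components and two clusters, and the deterministic initialization are all assumptions made for illustration, not details taken from the paper. Deciding which cluster to trust is deliberately left to the caller.

```python
import numpy as np

def pca_project(updates, n_components=2):
    """Project row vectors onto their top principal components (via SVD)."""
    centered = updates - updates.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def two_means(points, n_iter=20):
    """Plain 2-cluster k-means with a deterministic start (hypothetical helper).

    Cluster 0 is seeded with the first point, cluster 1 with the point
    farthest from it.
    """
    first = points[0]
    far = points[np.argmax(np.linalg.norm(points - first, axis=1))]
    centers = np.stack([first, far]).astype(float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = points[labels == k].mean(axis=0)
    return labels

def cluster_updates(updates):
    """Return a cluster label (0 or 1) for each flattened client update."""
    return two_means(pca_project(updates))
```

For example, with eight updates drawn near zero and two shifted far away, `cluster_updates` places the two groups in different clusters. A real defense would still need a rule for choosing which cluster to aggregate, and the abstract does not say how RFLBAT does this.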
Parametrization of sunspot groups based on machine learning approach

Sunspot groups observed in white-light appear as complex structures. Analysis of these structures is usually based on simple morphological descriptors which capture only generic properties and miss information about fine details. We present a machine learning approach to introduce a complete yet compact description of sunspot groups. The idea is to map sunspot group images into an appropriate lower-dimensional (latent) space. We apply a combination of Variational Autoencoder and Principal Component Analysis to obtain a set of 285 latent descriptors. We demonstrate that the standard descriptors are embedded into the latent ones. Thus, latent features can be considered as an extended description of sunspot groups and, in our opinion, can expand the possibilities for the research on sunspot groups. In particular, we demonstrate an application for estimation of the sunspot group complexity. The proposed parametrization model is generic and can be applied to investigation of other traces of solar activity observed in various spectrum lines. Key components of this work, which are the parametrization model, dataset of sunspot groups and latent vectors, are available in the public GitHub repository this https URL groups and can be used to reproduce the results and for further research.
ASTRONOMY
On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation

We study the off-policy evaluation (OPE) problem in an infinite-horizon Markov decision process with continuous states and actions. We recast the $Q$-function estimation into a special form of the nonparametric instrumental variables (NPIV) estimation problem. We first show that under one mild condition the NPIV formulation of $Q$-function estimation is well-posed in the sense of $L^2$-measure of ill-posedness with respect to the data generating distribution, bypassing a strong assumption on the discount factor $\gamma$ imposed in the recent literature for obtaining the $L^2$ convergence rates of various $Q$-function estimators. Thanks to this new well-posed property, we derive the first minimax lower bounds for the convergence rates of nonparametric estimation of $Q$-function and its derivatives in both sup-norm and $L^2$-norm, which are shown to be the same as those for the classical nonparametric regression (Stone, 1982). We then propose a sieve two-stage least squares estimator and establish its rate-optimality in both norms under some mild conditions. Our general results on the well-posedness and the minimax lower bounds are of independent interest to study not only other nonparametric estimators for $Q$-function but also efficient estimation on the value of any target policy in off-policy settings.
MATHEMATICS
Demystifying Swarm Learning: A New Paradigm of Blockchain-based Decentralized Federated Learning

Federated learning (FL) is an emerging and promising privacy-preserving machine learning paradigm that has attracted increasing attention from researchers and developers. FL keeps users' private data on devices and exchanges the gradients of local models to cooperatively train a shared Deep Learning (DL) model on central custodians. However, the security and fault tolerance of FL have been increasingly discussed, because its central custodian mechanism or star-shaped architecture can be vulnerable to malicious attacks or software failures. To address these problems, Swarm Learning (SL) introduces a permissioned blockchain to securely onboard members and dynamically elect the leader, which allows performing DL in an extremely decentralized manner. Despite the tremendous attention given to SL, there are few empirical studies on SL or blockchain-based decentralized FL that provide comprehensive knowledge of best practices and precautions for deploying SL in real-world scenarios. Therefore, we conduct what is, to the best of our knowledge, the first comprehensive study of SL to date, to fill the knowledge gap between SL deployment and developers. In this paper, we conduct various experiments on 3 public datasets to address 5 research questions, present interesting findings, quantitatively analyze the reasons behind these findings, and provide developers and researchers with practical suggestions. The findings indicate that SL is suitable for most application scenarios, no matter whether the dataset is balanced, polluted, or biased over irrelevant features.
CODING & PROGRAMMING
Learning Fair Node Representations with Graph Counterfactual Fairness

Fair machine learning aims to mitigate the biases of model predictions against certain subpopulations regarding sensitive attributes such as race and gender. Among the many existing fairness notions, counterfactual fairness measures the model fairness from a causal perspective by comparing the predictions of each individual from the original data and the counterfactuals. In the counterfactuals, the sensitive attribute values of this individual have been modified. Recently, a few works have extended counterfactual fairness to graph data, but most of them neglect the following facts that can lead to biases: 1) the sensitive attributes of each node's neighbors may causally affect the prediction w.r.t. this node; 2) the sensitive attributes may causally affect other features and the graph structure. To tackle these issues, in this paper, we propose a novel fairness notion, graph counterfactual fairness, which considers the biases induced by the above facts. To learn node representations towards graph counterfactual fairness, we propose a novel framework based on counterfactual data augmentation. In this framework, we generate counterfactuals corresponding to perturbations on each node's and its neighbors' sensitive attributes. Then we enforce fairness by minimizing the discrepancy between the representations learned from the original graph and the counterfactuals for each node. Experiments on both synthetic and real-world graphs show that our framework outperforms the state-of-the-art baselines in graph counterfactual fairness, and also achieves comparable prediction performance.
TECHNOLOGY
Dynamic Infection Spread Model Based Group Testing

We study a dynamic infection spread model, inspired by the discrete time SIR model, where infections are spread via non-isolated infected individuals. While infection keeps spreading over time, limited-capacity testing is performed at each time instance as well. In contrast to the classical, static, group testing problem, the objective in our setup is not to find the minimum number of required tests to identify the infection status of every individual in the population, but to control the infection spread by detecting and isolating the infections over time by using the given, limited number of tests. In order to analyze the performance of the proposed algorithms, we focus on the mean-sense analysis of the number of individuals that remain non-infected throughout the process of controlling the infection. We propose two dynamic algorithms that both use the given limited number of tests to identify and isolate the infections over time, while the infection spreads. While the first algorithm is a dynamic randomized individual testing algorithm, in the second algorithm we employ the group testing approach similar to the original work of Dorfman. By considering weak versions of our algorithms, we obtain lower bounds for the performance of our algorithms. Finally, we implement our algorithms and run simulations to gather numerical results and compare our algorithms and theoretical approximation results under different sets of system parameters.
SCIENCE
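The group testing abstract above refers to Dorfman's original approach. As a point of reference, here is a minimal sketch of the classical, static two-stage Dorfman scheme: pool each group into one test, and retest members individually only when the pool is positive. It assumes noiseless tests and a fixed group size, and it does not model infection spread over time or a per-round test budget, so it is not the dynamic algorithm the paper proposes. The function name `dorfman_test` is hypothetical.

```python
def dorfman_test(statuses, group_size):
    """Classical two-stage Dorfman testing on a list of booleans.

    statuses[i] is True when individual i is infected (ground truth, used
    here only to simulate noiseless test outcomes).
    Returns (indices identified as infected, total number of tests used).
    """
    infected, tests = [], 0
    for start in range(0, len(statuses), group_size):
        group = range(start, min(start + group_size, len(statuses)))
        tests += 1  # one pooled test for the whole group
        if not any(statuses[i] for i in group):
            continue  # negative pool: everyone in the group is cleared
        if len(group) == 1:
            infected.append(group[0])  # a positive pool of one needs no retest
            continue
        for i in group:  # positive pool: test each member individually
            tests += 1
            if statuses[i]:
                infected.append(i)
    return infected, tests
```

With nine individuals, one infection at index 4, and groups of three, this uses three pooled tests plus three individual tests, six in total, compared with nine for testing everyone individually.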
Rawlsian Fairness in Online Bipartite Matching: Two-sided, Group, and Individual

Online bipartite-matching platforms are ubiquitous and find applications in important areas such as crowdsourcing and ridesharing. In the most general form, the platform consists of three entities: two sides to be matched and a platform operator that decides the matching. The design of algorithms for such platforms has traditionally focused on the operator's (expected) profit. Recent reports have shown that certain demographic groups may receive less favorable treatment under pure profit maximization. As a result, a collection of online matching algorithms have been developed that give a fair treatment guarantee for one side of the market at the expense of a drop in the operator's profit. In this paper, we generalize the existing work to offer fair treatment guarantees to both sides of the market simultaneously, at a calculated worst case drop to operator profit. We consider group and individual Rawlsian fairness criteria. Moreover, our algorithms have theoretical guarantees and have adjustable parameters that can be tuned as desired to balance the trade-off between the utilities of the three sides. We also derive hardness results that give clear upper bounds over the performance of any algorithm.
INTERNET
A Minimax Framework for Two-Agent Scheduling with Inertial Constraints

Autonomous agents are promising in applications such as intelligent transportation and smart manufacturing, and scheduling of agents has to take their inertial constraints into consideration. Most current research requires the obedience of all agents, which is hard to achieve in non-dedicated systems such as traffic intersections. In this article, we establish a minimax framework for the scheduling of two inertially constrained agents with no cooperation assumptions. Specifically, we first provide a unified and sufficient representation for various types of situation information, and define a state value function characterizing the agent's preference of states under a given situation. Then, the minimax control policy, along with the calculation methods, is proposed, which optimizes the worst-case state value function at each step, and the safety guarantee of the policy is also presented. Furthermore, several generalizations are introduced on the applicable scenarios of the proposed framework. Numerical simulations show that the minimax control policy can reduce the largest scheduling cost by $13.4\%$ compared with queueing and following policies. Finally, the effects of decision period, observation period and inertial constraints are also numerically discussed.
TECHNOLOGY
Jamming Attacks on Federated Learning in Wireless Networks

Federated learning (FL) offers a decentralized learning environment so that a group of clients can collaborate to train a global model at the server, while keeping their training data confidential. This paper studies how to launch over-the-air jamming attacks to disrupt the FL process when it is executed over a wireless network. As a wireless example, FL is applied to learn how to classify wireless signals collected by clients (spectrum sensors) at different locations (such as in cooperative sensing). An adversary can jam the transmissions for the local model updates from clients to the server (uplink attack), or the transmissions for the global model updates from the server to clients (downlink attack), or both. Given a budget imposed on the number of clients that can be attacked per FL round, clients for the (uplink/downlink) attack are selected according to their local model accuracies that would be expected without an attack or ranked via spectrum observations. This novel attack is extended to general settings by accounting for different processing speeds and attack success probabilities for clients. Compared to benchmark attack schemes, this attack approach degrades the FL performance significantly, thereby revealing new vulnerabilities of FL to jamming attacks in wireless networks.
SOFTWARE
Minimax Optimality (Probably) Doesn't Imply Distribution Learning for GANs

Arguably the most fundamental question in the theory of generative adversarial networks (GANs) is to understand to what extent GANs can actually learn the underlying distribution. Theoretical and empirical evidence suggests local optimality of the empirical training objective is insufficient. Yet, it does not rule out the possibility that achieving a true population minimax optimal solution might imply distribution learning.
COMPUTERS
Caring Without Sharing: A Federated Learning Crowdsensing Framework for Diversifying Representation of Cities

Mobile Crowdsensing (MCS) has become a mainstream paradigm for researchers to collect behavioral data from citizens at large scale. This valuable data can be leveraged to create centralized repositories that can be used to train advanced Artificial Intelligence (AI) models for various services that benefit society in all aspects. Although decades of research have explored the viability of Mobile Crowdsensing in terms of incentives and many attempts have been made to reduce the participation barriers, the overshadowing privacy concerns regarding sharing personal data still remain. Recently a new pathway has emerged to shift the MCS paradigm towards more privacy-preserving collaborative learning, namely Federated Learning. In this paper, we posit a first-of-its-kind framework for this emerging paradigm. We demonstrate the functionalities of our framework through a case study of diversifying two vision algorithms to learn the representation of ordinary sidewalk obstacles as part of enhancing visually impaired navigation.
CELL PHONES
Communication-Efficient Device Scheduling for Federated Learning Using Stochastic Optimization

Federated learning (FL) is a useful tool in distributed machine learning that utilizes users' local datasets in a privacy-preserving manner. When deploying FL in a constrained wireless environment, however, training models in a time-efficient manner can be a challenging task due to intermittent connectivity of devices, heterogeneous connection quality, and non-i.i.d. data. In this paper, we provide a novel convergence analysis of non-convex loss functions using FL on both i.i.d. and non-i.i.d. datasets with arbitrary device selection probabilities for each round. Then, using the derived convergence bound, we use stochastic optimization to develop a new client selection and power allocation algorithm that minimizes a function of the convergence bound and the average communication time under a transmit power constraint. We find an analytical solution to the minimization problem. One key feature of the algorithm is that knowledge of the channel statistics is not required and only the instantaneous channel state information needs to be known. Using the FEMNIST and CIFAR-10 datasets, we show through simulations that the communication time can be significantly decreased using our algorithm, compared to uniformly random participation.
COMPUTERS
Smoothed Model-Assisted Small Area Estimation

In countries where population census and sample survey data are limited, generating accurate subnational estimates of health and demographic indicators is challenging. Existing model-based geostatistical methods leverage covariate information and spatial smoothing to reduce the variability of estimates but often assume the survey design is ignorable, which may be inappropriate given the complex design of household surveys typically used in this context. On the other hand, small area estimation approaches common in the survey statistics literature do not incorporate both unit-level covariate information and spatial smoothing in a design-consistent way. We propose a new smoothed model-assisted estimator that accounts for survey design and leverages both unit-level covariates and spatial smoothing, bridging the survey statistics and model-based geostatistics perspectives. Under certain assumptions, the new estimator can be viewed as both design-consistent and model-consistent, offering potential benefits from both perspectives. We demonstrate our estimator's performance using both real and simulated data, comparing it with existing design-based and model-based estimators.
SCIENCE
Deep reinforcement learning under signal temporal logic constraints using Lagrangian relaxation

Deep reinforcement learning (DRL) has attracted much attention as an approach to solve sequential decision making problems without mathematical models of systems or environments. In general, a constraint may be imposed on the decision making. In this study, we consider the optimal decision making problems with constraints to complete temporal high-level tasks in the continuous state-action domain. We describe the constraints using signal temporal logic (STL), which is useful for time sensitive control tasks since it can specify continuous signals within a bounded time interval. To deal with the STL constraints, we introduce an extended constrained Markov decision process (CMDP), which is called a $\tau$-CMDP. We formulate the STL constrained optimal decision making problem as the $\tau$-CMDP and propose a two-phase constrained DRL algorithm using the Lagrangian relaxation method. Through simulations, we also demonstrate the learning performance of the proposed algorithm.
CODING & PROGRAMMING
FedComm: Federated Learning as a Medium for Covert Communication

Proposed as a solution to mitigate the privacy implications related to the adoption of deep learning solutions, Federated Learning (FL) enables large numbers of participants to successfully train deep neural networks without having to reveal the actual private training data. To date, a substantial amount of research has investigated the security and privacy properties of FL, resulting in a plethora of innovative attack and defense strategies. This paper thoroughly investigates the communication capabilities of an FL scheme. In particular, we show that a party involved in the FL learning process can use FL as a covert communication medium to send an arbitrary message. We introduce FedComm, a novel covert-communication technique that enables robust sharing and transfer of targeted payloads within the FL framework. Our extensive theoretical and empirical evaluations show that FedComm provides a stealthy communication channel, with minimal disruptions to the training process. Our experiments show that FedComm allowed us to successfully deliver 100% of a payload on the order of kilobits before the FL procedure converges. Our evaluation also shows that FedComm is independent of the application domain and the neural network architecture used by the underlying FL scheme.
SOFTWARE
Distance-Ratio-Based Formulation for Metric Learning

In metric learning, the goal is to learn an embedding so that data points with the same class are close to each other and data points with different classes are far apart. We propose a distance-ratio-based (DR) formulation for metric learning. Like softmax-based formulation for metric learning, it models $p(y=c|x')$, which is a probability that a query point $x'$ belongs to a class $c$. The DR formulation has two useful properties. First, the corresponding loss is not affected by scale changes of an embedding. Second, it outputs the optimal (maximum or minimum) classification confidence scores on representing points for classes. To demonstrate the effectiveness of our formulation, we conduct few-shot classification experiments using softmax-based and DR formulations on CUB and mini-ImageNet datasets. The results show that DR formulation generally enables faster and more stable metric learning than the softmax-based formulation. As a result, using DR formulation achieves improved or comparable generalization performances.
COMPUTERS
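The distance-ratio abstract above states that the DR loss is unaffected by scale changes of the embedding, but it does not spell out the formula. The sketch below assumes one natural form, $p(y=c|x') = d(x', r_c)^{-\rho} / \sum_j d(x', r_j)^{-\rho}$, where $r_c$ is a representing point for class $c$ and $d$ is the Euclidean distance. The exponent `rho`, the small `eps` guard against division by zero, the squared-distance softmax baseline, and both function names are assumptions for illustration and may differ from the paper's definitions.

```python
import numpy as np

def dr_probabilities(query, prototypes, rho=2.0, eps=1e-12):
    """Assumed distance-ratio form: inverse distance powers, normalized."""
    dists = np.linalg.norm(prototypes - query, axis=1)
    inv = 1.0 / (dists ** rho + eps)  # eps avoids dividing by zero at a prototype
    return inv / inv.sum()

def softmax_probabilities(query, prototypes):
    """Softmax over negative squared distances (assumed baseline form)."""
    logits = -np.linalg.norm(prototypes - query, axis=1) ** 2
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()
```

Under this assumed form, multiplying the query and every representing point by the same constant multiplies all distances by that constant, which cancels in the ratio, so the DR probabilities stay the same while the softmax probabilities change. A query placed exactly on a representing point also receives a probability of essentially 1 for that class, which matches the abstract's second stated property.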
Vertical Federated Edge Learning with Distributed Integrated Sensing and Communication

This letter studies a vertical federated edge learning (FEEL) system for collaborative objects/human motion recognition by exploiting the distributed integrated sensing and communication (ISAC). In this system, distributed edge devices first send wireless signals to sense targeted objects/human, and then exchange intermediate computed vectors (instead of raw sensing data) for collaborative recognition while preserving data privacy. To boost the spectrum and hardware utilization efficiency for FEEL, we exploit ISAC for both target sensing and data exchange, by employing dedicated frequency-modulated continuous-wave (FMCW) signals at each edge device. Under this setup, we propose a vertical FEEL framework for realizing the recognition based on the collected multi-view wireless sensing data. In this framework, each edge device owns an individual local L-model to transform its sensing data into an intermediate vector with relatively low dimensions, which is then transmitted to a coordinating edge device for final output via a common downstream S-model. By considering a human motion recognition task, experimental results show that our vertical FEEL based approach achieves recognition accuracy up to 98\% with an improvement up to 8\% compared to the benchmarks, including on-device training and horizontal FEEL.
CODING & PROGRAMMING
Nilpotent dynamics on signed interaction graphs and weak converses of Thomas' rules

A finite dynamical system with $n$ components is a function $f:X\to X$ where $X=X_1\times\dots\times X_n$ is a product of $n$ finite intervals of integers. The structure of such a system $f$ is represented by a signed digraph $G$, called interaction graph: there are $n$ vertices, one per component, and the signed arcs describe the positive and negative influences between them. Finite dynamical systems are usual models for gene networks. In this context, it is often assumed that $f$ is {\em degree-bounded}, that is, the size of each $X_i$ is at most the out-degree of $i$ in $G$ plus one. Assuming that $G$ is connected and that $f$ is degree-bounded, we prove the following: if $G$ is not a cycle, then $f^{n+1}$ may be a constant. In that case, $f$ describes a very simple dynamics: a global convergence toward a unique fixed point in $n+1$ iterations. This shows that, in the degree-bounded case, the fact that $f$ describes a complex dynamics {\em cannot} be deduced from its interaction graph. We then widely generalize the above result, obtaining, as immediate consequences, other limits on what can be deduced from the interaction graph only, as the following weak converses of Thomas' rules: if $G$ is connected and has a positive (negative) cycle, then $f$ may have two (no) fixed points.
MATHEMATICS
Learning with latent group sparsity via heat flow dynamics on networks

Group or cluster structure on explanatory variables in machine learning problems is a very general phenomenon, which has attracted broad interest from practitioners and theoreticians alike. In this work we contribute an approach to learning under such group structure, that does not require prior information on the group identities. Our paradigm is motivated by the Laplacian geometry of an underlying network with a related community structure, and proceeds by directly incorporating this into a penalty that is effectively computed via a heat flow-based local network dynamics. In fact, we demonstrate a procedure to construct such a network based on the available data. Notably, we dispense with computationally intensive pre-processing involving clustering of variables, spectral or otherwise. Our technique is underpinned by rigorous theorems that guarantee its effective performance and provide bounds on its sample complexity. In particular, in a wide range of settings, it provably suffices to run the heat flow dynamics for time that is only logarithmic in the problem dimensions. We explore in detail the interfaces of our approach with key statistical physics models in network science, such as the Gaussian Free Field and the Stochastic Block Model. We validate our approach by successful applications to real-world data from a wide array of application domains, including computer science, genetics, climatology and economics. Our work raises the possibility of applying similar diffusion-based techniques to classical learning tasks, exploiting the interplay between geometric, dynamical and stochastic structures underlying the data.
COMPUTERS

