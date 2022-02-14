
Science

Counterfactual inference for sequential experimental design

By Raaz Dwivedi, Susan Murphy, Devavrat Shah
arxiv.org
 2 days ago

We consider the problem of counterfactual inference in sequentially designed experiments wherein a collection of $\mathbf{N}$ units each undergo a sequence of interventions for $\mathbf{T}$ time periods, based on policies that sequentially adapt over time. Our goal is counterfactual inference, i.e., estimate what would have happened if alternate policies were used,...


Related
arxiv.org

An Experimental Proof of Concept for Integrated Sensing and Communications Waveform Design

The integration of sensing and communication (ISAC) functionalities has recently gained significant research interest as a hardware-, power-, spectrum- and cost-efficient solution. This experimental work focuses on a dual-functional radar sensing and communication framework where a single radiation waveform, either omnidirectional or directional, can realize both radar sensing and communication functions. We study a trade-off approach that can balance the performance of communications and radar sensing. We design an orthogonal frequency division multiplexing (OFDM) based multi-user multiple input multiple output (MIMO) software-defined radio (SDR) testbed to validate the dual-functional model. We carry out over-the-air experiments to investigate the optimal trade-off factor to balance the performance for both functions. On the radar performance, we measure the output beampatterns of our transmission to examine their similarity to simulation based beampatterns. On the communication side, we obtain bit error rate (BER) results from the testbed to show the communication performance using the dual-functional waveform. Our experiment reveals that the dual-functional approach can achieve comparable BER performance with pure communication-based solutions while maintaining fine radar beampatterns simultaneously.
COMPUTERS
arxiv.org

Variance-Optimal Augmentation Logging for Counterfactual Evaluation in Contextual Bandits

Methods for offline A/B testing and counterfactual learning are seeing rapid adoption in search and recommender systems, since they allow efficient reuse of existing log data. However, there are fundamental limits to using existing log data alone, since the counterfactual estimators that are commonly used in these methods can have large bias and large variance when the logging policy is very different from the target policy being evaluated. To overcome this limitation, we explore the question of how to design data-gathering policies that most effectively augment an existing dataset of bandit feedback with additional observations for both learning and evaluation. To this effect, this paper introduces Minimum Variance Augmentation Logging (MVAL), a method for constructing logging policies that minimize the variance of the downstream evaluation or learning problem. We explore multiple approaches to computing MVAL policies efficiently, and find that they can be substantially more effective in decreasing the variance of an estimator than naïve approaches.
COMPUTERS
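The variance problem described in the abstract above can be illustrated with a minimal sketch of inverse propensity scoring (IPS), one commonly used counterfactual estimator. This is not the MVAL method from the paper; the function name `ips_estimate` and the toy inputs are assumptions made purely for illustration.

```python
import numpy as np

def ips_estimate(rewards, logging_probs, target_probs):
    """Inverse propensity scoring estimate of a target policy's value.

    rewards[i]       -- reward observed for the i-th logged action
    logging_probs[i] -- probability the logging policy assigned to that action
    target_probs[i]  -- probability the target policy assigns to that action
    """
    rewards = np.asarray(rewards, dtype=float)
    # Importance weights grow large when the target policy favors actions
    # that the logging policy rarely took; this is the source of high variance.
    weights = np.asarray(target_probs, dtype=float) / np.asarray(logging_probs, dtype=float)
    return float(np.mean(weights * rewards))

# Hypothetical toy log: the logging policy rarely took the first action
# (probability 0.1), but the target policy prefers it (probability 0.9).
estimate = ips_estimate(rewards=[1, 0], logging_probs=[0.1, 0.9], target_probs=[0.9, 0.1])
```

When the two policies coincide, every weight is 1 and the estimate reduces to the plain average of the logged rewards. The further apart the policies are, the more the estimate depends on a few heavily weighted samples.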
arxiv.org

Experimental evidence of nonlinear focusing in standing water waves

Nonlinear wave focusing originating from the universal modulation instability (MI) is responsible for the formation of strong wave localizations on the water surface and in nonlinear wave guides, such as optical Kerr media and plasma. Such extreme wave dynamics can be described by breather solutions of the nonlinear Schrödinger equation (NLSE), such as the famed doubly-localized Peregrine breathers (PB), which typify particular cases of MI. On the other hand, it has been suggested that the MI relevance weakens when the wave field becomes broadband or directional. Here, we provide experimental evidence of nonlinear and distinct PB-type focusing in standing water waves describing the scenario of two counter-propagating wave trains. The collected collinear wave measurements are in excellent agreement with the hydrodynamic coupled NLSE (CNLSE) and suggest that MI can undisturbedly prevail during the interplay of several wave systems and emphasize the potential role of exact NLSE solutions in extreme wave formation beyond the formal narrowband and uni-directional limits. Our work may inspire further experimental investigations in various nonlinear wave guides governed by CNLSE frameworks as well as theoretical progress to predict strong wave coherence in directional fields.
SCIENCE
arxiv.org

An Experimental Design Approach for Regret Minimization in Logistic Bandits

In this work we consider the problem of regret minimization for logistic bandits. The main challenge of logistic bandits is reducing the dependence on a potentially large problem-dependent constant $\kappa$ that can at worst scale exponentially with the norm of the unknown parameter $\theta_{\ast}$. Abeille et al. (2021) have applied self-concordance of the logistic function to remove this worst-case dependence, providing regret guarantees like $O(d\log^2(\kappa)\sqrt{\dot\mu T}\log(|\mathcal{X}|))$ where $d$ is the dimensionality, $T$ is the time horizon, and $\dot\mu$ is the variance of the best arm. This work improves upon this bound in the fixed arm setting by employing an experimental design procedure that achieves a minimax regret of $O(\sqrt{d \dot\mu T\log(|\mathcal{X}|)})$. In fact, our analysis yields a tighter instance-dependent (i.e., gap-dependent) regret bound, the first of its kind for logistic bandits. We also propose a new warmup sampling algorithm that can dramatically reduce the lower-order term in the regret in general, and prove that for some instances it reduces the lower-order term's dependency on $\kappa$ to $\log^2(\kappa)$. Finally, we discuss the impact of the bias of the MLE on the logistic bandit problem, providing an example where the $d^2$ lower-order regret (compared with $d$ for linear bandits) may not be improved as long as the MLE is used, and showing how bias-corrected estimators may be used to bring it closer to $d$.
MATHEMATICS
IN THIS ARTICLE
#Design #Sequential #Machine Learning
arxiv.org

Causal Inference Using Tractable Circuits

The aim of this paper is to discuss a recent result which shows that probabilistic inference in the presence of (unknown) causal mechanisms can be tractable for models that have traditionally been viewed as intractable. This result was reported recently to facilitate model-based supervised learning but it can be interpreted in a causality context as follows. One can compile a non-parametric causal graph into an arithmetic circuit that supports inference in time linear in the circuit size. The circuit is also non-parametric so it can be used to estimate parameters from data and to further reason (in linear time) about the causal graph parametrized by these estimates. Moreover, the circuit size can sometimes be bounded even when the treewidth of the causal graph is not, leading to tractable inference on models that have been deemed intractable previously. This has been enabled by a new technique that can exploit causal mechanisms computationally but without needing to know their identities (the classical setup in causal inference). Our goal is to provide a causality-oriented exposure to these new results and to speculate on how they may potentially contribute to more scalable and versatile causal inference.
COMPUTERS
arxiv.org

Intent Contrastive Learning for Sequential Recommendation

Users' interactions with items are driven by various intents (e.g., preparing for holiday gifts, shopping for fishing equipment, etc.). However, users' underlying intents are often unobserved/latent, making it challenging to leverage such latent intents for sequential recommendation (SR). To investigate the benefits of latent intents and leverage them effectively for recommendation, we propose Intent Contrastive Learning (ICL), a general learning paradigm that leverages a latent intent variable into SR. The core idea is to learn users' intent distribution functions from unlabeled user behavior sequences and optimize SR models with contrastive self-supervised learning (SSL) by considering the learned intents to improve recommendation. Specifically, we introduce a latent variable to represent users' intents and learn the distribution function of the latent variable via clustering. We propose to leverage the learned intents into SR models via contrastive SSL, which maximizes the agreement between a view of sequence and its corresponding intent. The training is alternated between intent representation learning and the SR model optimization steps within the generalized expectation-maximization (EM) framework. Fusing user intent information into SR also improves model robustness. Experiments conducted on four real-world datasets demonstrate the superiority of the proposed learning paradigm, which improves performance, and robustness against data sparsity and noisy interaction issues.
COMPUTERS
arxiv.org

Discovering Concepts in Learned Representations using Statistical Inference and Interactive Visualization

Concept discovery is one of the open problems in the interpretability literature that is important for bridging the gap between non-deep learning experts and model end-users. Among current formulations, one defines a concept as a direction in a learned representation space. This definition makes it possible to evaluate whether a particular concept significantly influences classification decisions for classes of interest. However, finding relevant concepts is tedious, as representation spaces are high-dimensional and hard to navigate. Current approaches include hand-crafting concept datasets and then converting them to latent space directions; alternatively, the process can be automated by clustering the latent space. In this study, we offer another two approaches to guide user discovery of meaningful concepts, one based on multiple hypothesis testing, and another on interactive visualization. We explore the potential value and limitations of these approaches through simulation experiments and a demo visual interface to real data. Overall, we find that these techniques offer a promising strategy for discovering relevant concepts in settings where users do not have predefined descriptions of them, but without completely automating the process.
COMPUTERS
arxiv.org

UCLCHEMCMC: A MCMC Inference tool for Physical Parameters of Molecular Clouds

We present the publicly available, open source code UCLCHEMCMC, designed to estimate physical parameters of an observed cloud of gas by combining Markov chain Monte Carlo (MCMC) sampling with chemical and radiative transfer modeling. When given the observed values of different emission lines, UCLCHEMCMC runs a Bayesian parameter inference, using an MCMC algorithm to sample the likelihood and produce an estimate of the posterior probability distribution of the parameters. UCLCHEMCMC takes a full forward modeling approach, generating model observables from the physical parameters via chemical and radiative transfer modeling. While running UCLCHEMCMC, the created chemical models and radiative transfer code results are stored in an SQL database, preventing redundant model calculations in future inferences. This means that the more UCLCHEMCMC is used, the more efficient it becomes. Using UCLCHEM and RADEX, the increase of efficiency is nearly two orders of magnitude, going from $5185.33 \pm 1041.96$ s for ten walkers to take one thousand steps when the database is empty, to $68.89 \pm 45.39$ s when nearly all models requested are in the database. In order to demonstrate its usefulness we provide an example inference of UCLCHEMCMC to estimate the physical parameters of mock data, and perform two inferences on the well studied prestellar core, L1544, one of which shows that it is important to consider the substructures of an object when determining which emission lines to use.
SCIENCE
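The database caching idea in the abstract above can be sketched in a few lines. This is only an illustration of memoizing an expensive forward model in an SQL table, not UCLCHEMCMC's actual code: the table name `models`, its columns, and the helper `make_cached_model` are all hypothetical, and SQLite is assumed as the database.

```python
import sqlite3

def make_cached_model(forward_model, db_path=":memory:"):
    """Wrap an expensive forward model so its results are stored in SQLite.

    forward_model -- callable (density, temperature) -> float; a stand-in
                     for a chemical / radiative transfer calculation.
    Returns the cached callable and a counter of real model evaluations.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS models ("
        "density REAL, temperature REAL, intensity REAL, "
        "PRIMARY KEY (density, temperature))"
    )
    stats = {"computed": 0}

    def cached(density, temperature):
        row = conn.execute(
            "SELECT intensity FROM models WHERE density = ? AND temperature = ?",
            (density, temperature),
        ).fetchone()
        if row is not None:
            return row[0]  # cache hit: no model run needed
        value = forward_model(density, temperature)
        stats["computed"] += 1
        conn.execute(
            "INSERT INTO models VALUES (?, ?, ?)", (density, temperature, value)
        )
        conn.commit()
        return value

    return cached, stats
```

With a file path instead of `":memory:"`, the table persists between runs, so later inferences that revisit the same parameter values skip the model evaluation entirely.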
arxiv.org

Sensitivity driven experimental design to facilitate control of dynamical systems

Control of nonlinear dynamical systems is a complex and multifaceted process. Essential elements of many engineering systems include high-fidelity physics-based modeling, offline trajectory planning, feedback control design, and data acquisition strategies to reduce uncertainties. This article proposes an optimization-centric perspective which couples these elements in a cohesive framework. We introduce a novel use of hyper-differential sensitivity analysis to understand the sensitivity of feedback controllers to parametric uncertainty in physics-based models used for trajectory planning. These sensitivities provide a foundation to define an optimal experimental design which seeks to acquire data most relevant in reducing demand on the feedback controller. Our proposed framework is illustrated on the Zermelo navigation problem and a hypersonic trajectory control problem using data from NASA's X-43 hypersonic flight tests.
SCIENCE
arxiv.org

Inference for Projection-Based Wasserstein Distances on Finite Spaces

The Wasserstein distance is a distance between two probability distributions and has recently gained increasing popularity in statistics and machine learning, owing to its attractive properties. One important approach to extending this distance is using low-dimensional projections of distributions to avoid a high computational cost and the curse of dimensionality in empirical estimation, such as the sliced Wasserstein or max-sliced Wasserstein distances. Despite their practical success in machine learning tasks, the availability of statistical inferences for projection-based Wasserstein distances is limited owing to the lack of distributional limit results. In this paper, we consider distances defined by integrating or maximizing Wasserstein distances between low-dimensional projections of two probability distributions. Then we derive limit distributions regarding these distances when the two distributions are supported on finite points. We also propose a bootstrap procedure to estimate quantiles of limit distributions from data. This facilitates asymptotically exact interval estimation and hypothesis testing for these distances. Our theoretical results are based on the arguments of Sommerfeld and Munk (2018) for deriving distributional limits regarding the original Wasserstein distance on finite spaces and the theory of sensitivity analysis in nonlinear programming. Finally, we conduct numerical experiments to illustrate the theoretical results and demonstrate the applicability of our inferential methods to real data analysis.
SCIENCE
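For readers unfamiliar with projection-based distances, the following is a minimal sketch of how sliced and max-sliced Wasserstein distances between two distributions on the same finite support might be approximated. It assumes order $p = 1$ and replaces the integral and the maximum over directions with a finite set of random unit vectors; the function names are hypothetical, and none of the paper's inferential machinery (limit distributions, bootstrap) is implemented here.

```python
import numpy as np

def wasserstein1_1d(points, r, s):
    """Wasserstein-1 distance between weights r and s on the same 1-D points."""
    points = np.asarray(points, dtype=float)
    r = np.asarray(r, dtype=float)
    s = np.asarray(s, dtype=float)
    order = np.argsort(points)
    x = points[order]
    # In one dimension, W1 is the area between the two CDFs.
    cdf_gap = np.abs(np.cumsum(r[order]) - np.cumsum(s[order]))
    return float(np.sum(cdf_gap[:-1] * np.diff(x)))

def projected_wasserstein(support, r, s, n_directions=200, seed=0):
    """Monte Carlo approximation of sliced (mean) and max-sliced (max) W1."""
    support = np.asarray(support, dtype=float)
    rng = np.random.default_rng(seed)
    # Random directions, normalized onto the unit sphere.
    directions = rng.normal(size=(n_directions, support.shape[1]))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    dists = np.array([wasserstein1_1d(support @ theta, r, s) for theta in directions])
    return float(dists.mean()), float(dists.max())

# Hypothetical example: three support points in the plane, two weight vectors.
sliced, max_sliced = projected_wasserstein(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], [0.5, 0.5, 0.0], [0.0, 0.5, 0.5]
)
```

Each projection reduces the problem to one dimension, where the distance has a closed form. This avoids the computational cost of solving a full optimal transport problem.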
arxiv.org

Bayesian inference of three-dimensional gas maps: II. Galactic HI

The 21-cm emission from atomic hydrogen (HI) is one of the most important tracers of the structure and dynamics of the interstellar medium. Thanks to Galactic rotation, the line is Doppler shifted and, assuming a model for the velocity field, data from gas line surveys can be deprojected along the line of sight. However, given our vantage point in the Galaxy, such a reconstruction suffers from a number of ambiguities. Here, we argue that those can be cured by exploiting the spatial coherence of the gas density that is implied by the physical processes shaping it. We have adopted a Bayesian inference framework that allows reconstructing the three-dimensional map of HI and quantifying its uncertainty. We employ data from the HI4PI compilation to produce three-dimensional maps of Galactic HI. The reconstructed density shows structure on a variety of scales. In particular, some spurs and spiral arms can be identified with ease. We discuss the morphology of the surface mass density and the radial and vertical profiles. The reconstructed three-dimensional HI densities are available at this https URL.
ASTRONOMY
arxiv.org

Inference and FDR Control for Simulated Ising Models in High-dimension

This paper studies the consistency and statistical inference of simulated Ising models in the high-dimensional setting. Our estimators are based on the Markov chain Monte Carlo maximum likelihood estimation (MCMC-MLE) method penalized by the Elastic-net. Under mild conditions that ensure a specific convergence rate of the MCMC method, the $\ell_{1}$ consistency of Elastic-net-penalized MCMC-MLE is proved. We further propose a decorrelated score test based on the decorrelated score function and prove the asymptotic normality of the score function without the influence of many nuisance parameters under the assumption that accelerates the convergence of the MCMC method. The one-step estimator for a single parameter of interest is proposed by linearizing the decorrelated score function to solve for its root, so that its normality and a confidence interval for the true value can be established. Finally, we use different algorithms to control the false discovery rate (FDR) via traditional p-values and novel e-values.
COMPUTERS
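The abstract above does not name the FDR-controlling algorithms it uses. As an assumed illustration of what p-value-based FDR control looks like, here is a minimal sketch of the classical Benjamini-Hochberg step-up procedure; the function name and example p-values are hypothetical, and the e-value variant is not covered.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean array marking the hypotheses rejected at FDR level alpha."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # The k-th smallest p-value is compared against alpha * k / m.
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Step-up rule: find the largest rank that meets its threshold,
        # then reject every hypothesis up to and including that rank.
        k = int(np.max(np.nonzero(below)[0]))
        reject[order[: k + 1]] = True
    return reject

# Hypothetical p-values for four tests.
rejected = benjamini_hochberg([0.02, 0.03, 0.035, 0.9], alpha=0.05)
```

Note the step-up behavior in this example: the two smallest p-values miss their own thresholds, yet they are still rejected because the third-smallest meets its threshold.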
arxiv.org

Deep End-to-end Causal Inference

Tomas Geffner, Javier Antoran, Adam Foster, Wenbo Gong, Chao Ma, Emre Kiciman, Amit Sharma, Angus Lamb, Martin Kukla, Nick Pawlowski, Miltiadis Allamanis, Cheng Zhang. Causal inference is essential for data-driven decision making across domains such as business engagement, medical treatment or policy making. However, research on causal discovery and inference has evolved separately, and the combination of the two domains is not trivial. In this work, we develop Deep End-to-end Causal Inference (DECI), a single flow-based method that takes in observational data and can perform both causal discovery and inference, including conditional average treatment effect (CATE) estimation. We provide a theoretical guarantee that DECI can recover the ground truth causal graph under mild assumptions. In addition, our method can handle heterogeneous, real-world, mixed-type data with missing values, allowing for both continuous and discrete treatment decisions. Moreover, the design principle of our method can generalize beyond DECI, providing a general End-to-end Causal Inference (ECI) recipe, which enables different ECI frameworks to be built using existing methods. Our results show the superior performance of DECI when compared to relevant baselines for both causal discovery and (C)ATE estimation in over a thousand experiments on both synthetic datasets and other causal machine learning benchmark datasets.
SCIENCE
arxiv.org

Adaptive Experimentation with Delayed Binary Feedback

Conducting experiments with objectives that take significant delays to materialize (e.g. conversions, add-to-cart events, etc.) is challenging. Although the classical "split sample testing" is still valid for the delayed feedback, the experiment will take longer to complete, which also means spending more resources on worse-performing strategies due to their fixed allocation schedules. Alternatively, adaptive approaches such as "multi-armed bandits" are able to effectively reduce the cost of experimentation. But these methods generally cannot handle delayed objectives directly out of the box. This paper presents an adaptive experimentation solution tailored for delayed binary feedback objectives by estimating the real underlying objectives before they materialize and dynamically allocating variants based on the estimates. Experiments show that the proposed method is more efficient for delayed feedback compared to various other approaches and is robust in different settings. In addition, we describe an experimentation product powered by this algorithm. This product is currently deployed in the online experimentation platform of this http URL, a large e-commerce company and a publisher of digital ads.
SCIENCE
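To make the delayed-feedback difficulty in the abstract above concrete, here is a minimal sketch of a Beta-Bernoulli Thompson sampling bandit that can update only on outcomes that have already arrived. It is a plain baseline and not the paper's method, which estimates the underlying objectives before they materialize; the class name, the fixed `delay`, and the conversion rates used in `simulate` are assumptions for illustration.

```python
import numpy as np

class DelayedThompsonSampler:
    """Thompson sampling for binary rewards that arrive after a delay."""

    def __init__(self, n_arms, seed=0):
        self.successes = np.ones(n_arms)  # Beta(1, 1) prior on each arm
        self.failures = np.ones(n_arms)
        self.pending = []                 # (arrival_time, arm, outcome)
        self.rng = np.random.default_rng(seed)

    def choose(self):
        # Draw one sample per arm from its posterior and play the best draw.
        samples = self.rng.beta(self.successes, self.failures)
        return int(np.argmax(samples))

    def record(self, arm, outcome, arrival_time):
        # The outcome exists but is not visible until arrival_time.
        self.pending.append((arrival_time, arm, bool(outcome)))

    def absorb(self, now):
        # Fold in every outcome that has arrived by time `now`.
        still_pending = []
        for arrival_time, arm, outcome in self.pending:
            if arrival_time <= now:
                if outcome:
                    self.successes[arm] += 1
                else:
                    self.failures[arm] += 1
            else:
                still_pending.append((arrival_time, arm, outcome))
        self.pending = still_pending

def simulate(true_rates, delay, horizon, seed=0):
    """Run the sampler against hypothetical conversion rates; return pulls per arm."""
    rng = np.random.default_rng(seed)
    sampler = DelayedThompsonSampler(len(true_rates), seed=seed + 1)
    pulls = np.zeros(len(true_rates), dtype=int)
    for t in range(horizon):
        sampler.absorb(now=t)
        arm = sampler.choose()
        outcome = rng.random() < true_rates[arm]
        sampler.record(arm, outcome, arrival_time=t + delay)
        pulls[arm] += 1
    return pulls
```

Comparing `simulate(rates, delay=0, horizon=...)` with a larger `delay` shows how late feedback keeps the posterior near its prior for longer, so the sampler keeps allocating traffic to weaker variants.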
arxiv.org

Accelerating the Inference of the Exa.TrkX Pipeline

Alina Lazar, Xiangyang Ju, Daniel Murnane, Paolo Calafiura, Steven Farrell, Yaoyuan Xu, Maria Spiropulu, Jean-Roch Vlimant, Giuseppe Cerati, Lindsey Gray, Thomas Klijnsma, Jim Kowalkowski, Markus Atkinson, Mark Neubauer, Gage DeZoort, Savannah Thais, Shih-Chieh Hsu, Adam Aurisano, Jeremy Hewes, Alexandra Ballow, Nirajan Acharya, Chun-yi Wang, Emma Liu, Alberto Lucas. Recently, graph...
COMPUTERS
arxiv.org

Controlling the CERN Experimental Area Beams

B. Rae, M. Hrabia, V. Baggiolini, D. Banerjee, J. Bernhard, M. Brugger, N. Charitonidis, L. Gatignon, A. Gerbershagen, R. Gorbonosov, M. Peryt, M. Gabriel, G. Romagnoli, C. Roderick. The CERN fixed target experimental areas are composed of more than 8 km of beam lines with around 800 devices used to...
SCIENCE
arxiv.org

The LPM Effect in sequential bremsstrahlung: $1/N_c^2$ corrections

An important question concerning in-medium high-energy parton showers in a quark-gluon plasma or other QCD medium is whether consecutive splittings of the partons in a given shower can be treated as quantum mechanically independent, or whether the formation times for two consecutive splittings instead have significant overlap. Various previous calculations of the effect of overlapping formation times have either (i) restricted attention to a soft bremsstrahlung limit, or else (ii) used the large-$N_c$ limit (where $N_c{=}3$ is the number of quark colors). In this paper, we make a first study of the accuracy of the large-$N_c$ limit used by those calculations of overlap effects that avoid a soft bremsstrahlung approximation. Specifically, we calculate the $1/N_c^2$ correction to previous $N_c{=}\infty$ results for overlap $g \to gg \to ggg$ of two consecutive gluon splittings $g \to gg$. At order $1/N_c^2$, there is interesting and non-trivial color dynamics that must be accounted for during the overlap of the formation times.
SCIENCE
arxiv.org

Experimental observation of thermalisation with noncommuting charges

Florian Kranzl, Aleksander Lasek, Manoj K. Joshi, Amir Kalev, Rainer Blatt, Christian F. Roos, Nicole Yunger Halpern. Quantum simulators have recently enabled experimental observations of quantum many-body systems' internal thermalisation. Often, the global energy and particle number are conserved, and the system is prepared with a well-defined particle number - in a microcanonical subspace. However, quantum evolution can also conserve quantities, or charges, that fail to commute with each other. Noncommuting charges have recently emerged as a subfield at the intersection of quantum thermodynamics and quantum information. Until now, this subfield has remained theoretical. We initiate the experimental testing of its predictions, with a trapped-ion simulator. We prepare 6-15 spins in an approximate microcanonical subspace, a generalisation of the microcanonical subspace for accommodating noncommuting charges, which cannot necessarily have well-defined nontrivial values simultaneously. We simulate a Heisenberg evolution using laser-induced entangling interactions and collective spin rotations. The noncommuting charges are the three spin components. We find that small subsystems equilibrate to near a recently predicted non-Abelian thermal state. This work bridges quantum many-body simulators to the quantum thermodynamics of noncommuting charges, whose predictions can now be tested.
PHYSICS
arxiv.org

Adjoint-aided inference of Gaussian process driven differential equations

Paterne Gahungu, Christopher W Lanyon, Mauricio A Alvarez, Engineer Bainomugisha, Michael Smith, Richard D. Wilkinson. Linear systems occur throughout engineering and the sciences, most notably as differential equations. In many cases the forcing function for the system is unknown, and interest lies in using noisy observations of the system to infer the forcing, as well as other unknown parameters. In differential equations, the forcing function is an unknown function of the independent variables (typically time and space), and can be modelled as a Gaussian process (GP). In this paper we show how the adjoint of a linear system can be used to efficiently infer forcing functions modelled as GPs, after using a truncated basis expansion of the GP kernel. We show how exact conjugate Bayesian inference for the truncated GP can be achieved, in many cases with substantially lower computation than would be required using MCMC methods. We demonstrate the approach on systems of both ordinary and partial differential equations, and by testing on synthetic data, show that the basis expansion approach approximates well the true forcing with a modest number of basis vectors. Finally, we show how to infer point estimates for the non-linear model parameters, such as the kernel length-scales, using Bayesian optimisation.
SCIENCE
arxiv.org

Sensitivity Analysis in the Generalization of Experimental Results

Randomized controlled trials (RCTs) allow researchers to estimate causal effects in an experimental sample with minimal identifying assumptions. However, to generalize or transport a causal effect from an RCT to a target population, researchers must adjust for a set of treatment effect moderators. In practice, it is impossible to know whether the set of moderators has been properly accounted for. In the following paper, I propose a three-parameter sensitivity analysis for generalizing or transporting experimental results using weighted estimators, with several advantages over existing methods. First, the framework does not require assumptions on the underlying data generating process for either the experimental sample selection mechanism or treatment effect heterogeneity. Second, I show that the sensitivity parameters are guaranteed to be bounded and propose several tools researchers can use to perform sensitivity analysis: (1) graphical and numerical summaries for researchers to assess how robust a point estimate is to killer confounders; (2) an extreme scenario analysis; and (3) a formal benchmarking approach for researchers to estimate potential sensitivity parameter values using existing data. Finally, I demonstrate that the proposed framework can be easily extended to the class of doubly robust, augmented weighted estimators. The sensitivity analysis framework is applied to a set of Jobs Training Program experiments.
SCIENCE
