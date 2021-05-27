
Score test for missing at random or not

By Hairu Wang, Zhiping Lu, Yukun Liu
arxiv.org
 22 days ago

Missing data are frequently encountered in various disciplines and can be divided into three categories: missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR). Valid statistical approaches to missing data depend crucially on correct identification of the underlying missingness mechanism. Although the problem of testing whether this mechanism is MCAR or MAR has been extensively studied, there has been very little research on testing MAR versus MNAR. A critical challenge in this problem is model identification under MNAR. In this paper, under a logistic model for the missing probability, we develop two score tests for the problem of whether the missingness mechanism is MAR or MNAR under a parametric model and a semiparametric location model on the regression function. The score tests require only parameter estimation under the null MAR assumption, which completely circumvents the identification issue. Our simulations and analysis of human immunodeficiency virus data show that the score tests have well-controlled type I errors and desirable powers.
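The distinction between MAR and MNAR under a logistic missingness model can be illustrated with a small simulation. The sketch below is not the paper's score test; it only generates data under the two mechanisms. The names `alpha`, `beta`, `gamma` and the toy regression `y = x + noise` are assumptions made for illustration: with `gamma = 0` the observation probability depends only on the always-observed covariate (MAR), while `gamma != 0` lets it depend on the possibly missing response itself (MNAR).

```python
import math
import random


def observe_prob(x, y, alpha, beta, gamma):
    """Logistic model for the probability that y is observed.

    gamma == 0: depends only on the fully observed covariate x (MAR).
    gamma != 0: depends on the response y itself (MNAR).
    """
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x + gamma * y)))


def simulate(n, gamma, seed=0):
    """Draw (x, y, observed) triples from a toy model y = x + noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = x + rng.gauss(0.0, 1.0)
        # alpha = 0 and beta = 1 are arbitrary illustrative values.
        observed = rng.random() < observe_prob(x, y, 0.0, 1.0, gamma)
        rows.append((x, y, observed))
    return rows


def mean_observed_residual(rows):
    """Average of y - x over the observed cases (population mean is 0)."""
    residuals = [y - x for x, y, observed in rows if observed]
    return sum(residuals) / len(residuals)


if __name__ == "__main__":
    print("MAR :", mean_observed_residual(simulate(20000, gamma=0.0)))
    print("MNAR:", mean_observed_residual(simulate(20000, gamma=2.0)))
```

In this toy setting the complete-case residual mean stays near zero under MAR but is shifted under MNAR, which is why the mechanism matters for any downstream analysis.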

Science
Computers | arxiv.org

DPER: Efficient Parameter Estimation for Randomly Missing Data

The missing data problem has been broadly studied in the last few decades and has various applications in different areas such as statistics or bioinformatics. Even though many methods have been developed to tackle this challenge, most are imputation techniques that require multiple iterations through the data before converging. In addition, such approaches may introduce extra bias and noise into the estimated parameters. In this work, we propose novel algorithms to find the maximum likelihood estimates (MLEs) for a one-class/multiple-class randomly missing data set under some mild assumptions. As the computation is direct without any imputation, our algorithms do not require multiple iterations through the data, thus promising to be less time-consuming than other methods while maintaining superior estimation performance. We validate these claims by empirical results on various data sets of different sizes and release all code in a GitHub repository to contribute to the research community related to this problem.
Science | arxiv.org

Proper Scoring Rules for Missing Value Imputation

Given the prevalence of missing data in modern statistical research, a broad range of methods is available for any given imputation task. How does one choose the `best' method in a given application? The standard approach is to select some observations, set their status to missing, and compare prediction accuracy of the methods under consideration for these observations. Besides having to somewhat artificially mask additional observations, a shortcoming of this approach is that the optimal imputation in this scheme chooses the conditional mean if predictive accuracy is measured with RMSE. In contrast, we would like to rank highest methods that can sample from the true conditional distribution. In this paper, we develop a principled and easy-to-use evaluation method for missing value imputation under the missing completely at random (MCAR) assumption. The approach is applicable for discrete and continuous data and works on incomplete data sets, without having to leave out additional observations for evaluation. Moreover, it favors imputation methods that reproduce the original data distribution. We show empirically on a range of data sets and imputation methods that our score consistently ranks true data high(est) and is able to avoid pitfalls usually associated with performance measures such as RMSE. Finally, we provide an R-package with an implementation of our method.
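The RMSE shortcoming described above can be checked numerically. The following is a minimal sketch under an assumed toy model (masked value `y = x + noise` with standard normal noise), not the evaluation score proposed in the paper: imputing with the conditional mean attains a lower RMSE than imputing with a draw from the true conditional distribution, even though only the latter reproduces the data distribution.

```python
import math
import random


def rmse(truth, imputed):
    """Root mean squared error between true and imputed values."""
    n = len(truth)
    return math.sqrt(sum((t - i) ** 2 for t, i in zip(truth, imputed)) / n)


def compare_imputers(n=20000, seed=0):
    """Return (RMSE of conditional-mean imputation, RMSE of conditional draws).

    Assumed toy model: x ~ N(0, 1) is observed, the masked value is
    y = x + N(0, 1), so E[y | x] = x and y | x ~ N(x, 1).
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    truth = [x + rng.gauss(0.0, 1.0) for x in xs]
    mean_imputed = list(xs)                               # E[y | x]
    draw_imputed = [x + rng.gauss(0.0, 1.0) for x in xs]  # sample from y | x
    return rmse(truth, mean_imputed), rmse(truth, draw_imputed)


if __name__ == "__main__":
    mean_rmse, draw_rmse = compare_imputers()
    print("conditional mean:", mean_rmse)
    print("conditional draw:", draw_rmse)
```

In this model the conditional-mean RMSE approaches the noise standard deviation, while the draw-based RMSE approaches that value times the square root of two, so RMSE ranks the distribution-preserving imputer lower.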
Software | arxiv.org

How to Test the Randomness from the Wireless Channel for Security?

We revisit the traditional framework of wireless secret key generation, where two parties leverage the wireless channel randomness to establish a secret key. The essence of the framework is to quantize channel randomness into bit sequences for key generation. Conducting randomness tests on such bit sequences has been a common practice to provide confidence that they are random. Interestingly, despite different settings in the tests, existing studies interpret the results the same way: passing tests means that the bit sequences are indeed random.
Science | arxiv.org

Towards a model-independent reconstruction approach for late-time Hubble data

Gaussian processes offer a convenient way to perform nonparametric reconstructions of observational data assuming only a kernel which describes the covariance between neighbouring points in a data set. We approach the ambiguity in the choice of kernel in Gaussian processes with two methods -- (a) approximate Bayesian computation with sequential Monte Carlo sampling and (b) genetic algorithm -- in order to address the often ad hoc choice of the kernel and use the overall resulting method to reconstruct the cosmic chronometers and supernovae type Ia data sets. The results have shown that the Matérn $\left( \nu = 5/2 \right)$ kernel emerges on top of the two-hyperparameter family of kernels for both cosmological data sets. On the other hand, we use the genetic algorithm in order to select the most naturally fit kernel among a competitive pool made up of a ten-hyperparameter class of kernels. Imposing a Bayesian information criterion-inspired measure of the fitness, the results have shown that a hybrid of the radial basis function and the Matérn $\left( \nu = 5/2 \right)$ kernel best represented both data sets.
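For reference, the Matérn $\left( \nu = 5/2 \right)$ kernel has the standard closed form $k(r) = \sigma^2 \left(1 + \sqrt{5}\,r/\ell + 5r^2/(3\ell^2)\right) \exp\left(-\sqrt{5}\,r/\ell\right)$ with two hyperparameters, an amplitude $\sigma$ and a length scale $\ell$. A minimal one-dimensional sketch follows; the function and argument names are illustrative choices, not taken from the paper.

```python
import math


def matern52(x1, x2, sigma=1.0, length=1.0):
    """Matern kernel with smoothness nu = 5/2 for scalar inputs.

    sigma is the amplitude and length the length scale; both names are
    illustrative.
    """
    s = math.sqrt(5.0) * abs(x1 - x2) / length
    return sigma ** 2 * (1.0 + s + s * s / 3.0) * math.exp(-s)


def covariance_matrix(points, sigma=1.0, length=1.0):
    """Dense covariance matrix of the kernel over a list of scalar inputs."""
    return [[matern52(a, b, sigma, length) for b in points] for a in points]


if __name__ == "__main__":
    for row in covariance_matrix([0.0, 0.5, 1.0]):
        print(row)
```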
Physics | arxiv.org

An odd feature of the `most classical' states of $SU(2)$ invariant quantum mechanical systems

Complex and spinorial techniques of general relativity are used to determine all the states of the $SU(2)$ invariant quantum mechanical systems in which the equality holds in the uncertainty relations for the components of the angular momentum vector operator in two given directions. The expectation values depend on a discrete `quantum number' and two parameters, one of them is the angle between the two angular momentum components and the other is the quotient of the two standard deviations. It is shown that although the standard deviations change continuously, one of the expectation values changes \emph{discontinuously} on this parameter space. Since physically neither of the angular momentum components is distinguished over the other, this discontinuity suggests that the genuine parameter space must be a \emph{Riemann surface} known in connection with the complex function $\sqrt{z}$. Moreover, the angle between the angular momentum components plays the role of the parameter of an interpolation between the continuous range of the expectation values found in the special case of the orthogonal angular momentum components by Aragone \emph{et al} (J. Phys. A. {\bf 7} L149 (1974)) and the discrete point spectrum of one angular momentum component. The consequences in the \emph{simultaneous} measurements of these angular momentum components are also discussed briefly.
Mathematics | arxiv.org

Restrained double Roman domination of a graph

For a graph $G=(V,E)$, a restrained double Roman dominating function is a function $f:V\rightarrow\{0,1,2,3\}$ having the property that if $f(v)=0$, then the vertex $v$ must have at least two neighbors assigned 2 under $f$ or one neighbor $w$ with $f(w)=3$, and if $f(v)=1$, then the vertex $v$ must have at least one neighbor $w$ with $f(w)\geq2$, and at the same time, the subgraph $G[V_0]$ induced by the vertices with label zero has no isolated vertex. The weight of a restrained double Roman dominating function $f$ is the sum $f(V)=\sum_{v\in V}f(v)$, and the minimum weight of a restrained double Roman dominating function on $G$ is the restrained double Roman domination number of $G$. We initiate the study of restrained double Roman domination by proving that the problem of computing this parameter is NP-hard. Then we present an upper bound on the restrained double Roman domination number of a connected graph $G$ in terms of the order of $G$ and characterize the graphs attaining this bound. We study the restrained double Roman domination versus the restrained Roman domination. Finally, we characterize all trees $T$ attaining the exhibited bound.
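The definition above translates directly into a checker. The sketch below assumes the graph is given as an adjacency dictionary (a representation chosen here for illustration) and adds a brute-force search for the minimum weight, which is only practical for very small graphs, in line with the NP-hardness result.

```python
from itertools import product


def is_restrained_drdf(adj, f):
    """Check whether f is a restrained double Roman dominating function.

    adj maps each vertex to an iterable of its neighbors; f maps each
    vertex to a label in {0, 1, 2, 3}.
    """
    for v, neighbors in adj.items():
        labels = [f[w] for w in neighbors]
        if f[v] == 0:
            # Needs two neighbors labeled 2 or one neighbor labeled 3.
            if labels.count(2) < 2 and 3 not in labels:
                return False
            # G[V_0] has no isolated vertex: some neighbor is labeled 0.
            if 0 not in labels:
                return False
        elif f[v] == 1:
            # Needs at least one neighbor labeled 2 or 3.
            if not any(label >= 2 for label in labels):
                return False
    return True


def restrained_double_roman_number(adj):
    """Minimum weight over all valid labelings, by exhaustive search."""
    vertices = list(adj)
    best = None
    for labels in product(range(4), repeat=len(vertices)):
        f = dict(zip(vertices, labels))
        if is_restrained_drdf(adj, f):
            weight = sum(labels)
            if best is None or weight < best:
                best = weight
    return best


if __name__ == "__main__":
    cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(restrained_double_roman_number(cycle4))
```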
Science | arxiv.org

Robust a posteriori error analysis for rotation-based formulations of the elasticity/poroelasticity coupling

We develop the \textit{a posteriori} error analysis of three mixed finite element formulations for rotation-based equations in elasticity, poroelasticity, and interfacial elasticity-poroelasticity. The discretisations use $H^1$-conforming finite elements of degree $k+1$ for displacement and fluid pressure, and discontinuous piecewise polynomials of degree $k$ for rotation vector, total pressure, and elastic pressure. Residual-based estimators are constructed, and upper and lower bounds (up to data oscillations) for all global estimators are rigorously derived. The methods are all robust with respect to the model parameters (in particular, the Lamé constants), they are valid in 2D and 3D, and also for arbitrary polynomial degree $k\geq 0$. The error behaviour predicted by the theoretical analysis is then demonstrated numerically on a set of computational examples including different geometries on which we perform adaptive mesh refinement guided by the \textit{a posteriori} error estimators.
Computers | arxiv.org

Identifiability-Guaranteed Simplex-Structured Post-Nonlinear Mixture Learning via Autoencoder

This work focuses on the problem of unraveling nonlinearly mixed latent components in an unsupervised manner. The latent components are assumed to reside in the probability simplex, and are transformed by an unknown post-nonlinear mixing system. This problem finds various applications in signal and data analytics, e.g., nonlinear hyperspectral unmixing, image embedding, and nonlinear clustering. Linear mixture learning problems are already ill-posed, as identifiability of the target latent components is hard to establish in general. With unknown nonlinearity involved, the problem is even more challenging. Prior work offered a function equation-based formulation for provable latent component identification. However, the identifiability conditions are somewhat stringent and unrealistic. In addition, the identifiability analysis is based on the infinite sample (i.e., population) case, while the understanding for practical finite sample cases has been elusive. Moreover, the algorithm in the prior work trades model expressiveness with computational convenience, which often hinders the learning performance. Our contribution is threefold. First, new identifiability conditions are derived under largely relaxed assumptions. Second, comprehensive sample complexity results are presented -- which are the first of the kind. Third, a constrained autoencoder-based algorithmic framework is proposed for implementation, which effectively circumvents the challenges in the existing algorithm. Synthetic and real experiments corroborate our theoretical analyses.
Computers | arxiv.org

On the Fragile Rates of Linear Feedback Coding Schemes of Gaussian Channels with Memory

In \cite{butman1976} the linear coding scheme is applied, $X_t =g_t\Big(\Theta - {\bf E}\Big\{\Theta\Big|Y^{t-1}, V_0=v_0\Big\}\Big)$, $t=2,\ldots,n$, $X_1=g_1\Theta$, with $\Theta: \Omega \to {\mathbb R}$, a Gaussian random variable, to derive a lower bound on the feedback rate, for additive Gaussian noise (AGN) channels, $Y_t=X_t+V_t, t=1, \ldots, n$, where $V_t$ is a Gaussian autoregressive (AR) noise, and $\kappa \in [0,\infty)$ is the total transmitter power. For the unit memory AR noise, with parameters $(c, K_W)$, where $c\in [-1,1]$ is the pole and $K_W$ is the variance of the Gaussian noise, the lower bound is $C^{L,B} =\frac{1}{2} \log \chi^2$, where $\chi =\lim_{n\longrightarrow \infty} \chi_n$ is the positive root of $\chi^2=1+\Big(1+ \frac{|c|}{\chi}\Big)^2 \frac{\kappa}{K_W}$, and the sequence $\chi_n \triangleq \Big|\frac{g_n}{g_{n-1}}\Big|, n=2, 3, \ldots,$ satisfies a certain recursion, and conjectured that $C^{L,B}$ is the feedback capacity.
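The lower bound quoted above is defined through the positive root of $\chi^2=1+\left(1+\frac{|c|}{\chi}\right)^2\frac{\kappa}{K_W}$, which has no convenient closed form for $c \neq 0$ but is easy to find numerically. A minimal sketch using bisection follows; the function names are illustrative, and the natural logarithm (rate in nats) is an assumption, since the base of the logarithm is not stated here. The root is at least 1 because the right-hand side is at least 1, and the difference between the two sides is increasing in $\chi$ for $\chi>0$, so the bracket below contains the unique positive root.

```python
import math


def chi_root(c, k_w, kappa, tol=1e-12):
    """Positive root of chi^2 = 1 + (1 + |c|/chi)^2 * kappa / k_w."""
    ratio = kappa / k_w

    def g(chi):
        return chi * chi - 1.0 - (1.0 + abs(c) / chi) ** 2 * ratio

    lo, hi = 1.0, 2.0          # g(1) <= 0 always holds
    while g(hi) <= 0.0:        # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:  # bisection on the increasing function g
        mid = 0.5 * (lo + hi)
        if g(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lower_bound_rate(c, k_w, kappa):
    """(1/2) * log(chi^2), using the natural logarithm (assumed)."""
    chi = chi_root(c, k_w, kappa)
    return 0.5 * math.log(chi * chi)


if __name__ == "__main__":
    print(lower_bound_rate(c=0.5, k_w=1.0, kappa=1.0))
```

As a sanity check, setting $c=0$ reduces the equation to $\chi^2 = 1 + \kappa/K_W$, so the expression becomes $\frac{1}{2}\log(1+\kappa/K_W)$, the familiar formula for a memoryless Gaussian channel.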
Science | arxiv.org

Predicting cognitive scores with graph neural networks through sample selection learning

Analyzing the relation between intelligence and neural activity is of the utmost importance in understanding the working principles of the human brain in health and disease. In existing literature, functional brain connectomes have been used successfully to predict cognitive measures such as intelligence quotient (IQ) scores in both healthy and disordered cohorts using machine learning models. However, existing methods resort to flattening the brain connectome (i.e., graph) through vectorization, which overlooks its topological properties. To address this limitation and inspired by the emerging graph neural networks (GNNs), we design a novel regression GNN model (namely RegGNN) for predicting IQ scores from brain connectivity. On top of that, we introduce a novel, fully modular sample selection method to select the best samples to learn from for our target prediction task. However, since such deep learning architectures are computationally expensive to train, we further propose a \emph{learning-based sample selection} method that learns how to choose the training samples with the highest expected predictive power on unseen samples. For this, we capitalize on the fact that connectomes (i.e., their adjacency matrices) lie in the symmetric positive definite (SPD) matrix cone. Our results on full-scale and verbal IQ prediction outperform comparison methods in autism spectrum disorder cohorts and achieve a competitive performance for neurotypical subjects using 3-fold cross-validation. Furthermore, we show that our sample selection approach generalizes to other learning-based methods, which shows its usefulness beyond our GNN architecture.
Mathematics | arxiv.org

Polynomial $\chi$-binding functions for $t$-broom-free graphs

By Xiaonan Liu, Joshua Schroeder, Zhiyu Wang, Xingxing Yu

For any positive integer $t$, a \emph{$t$-broom} is a graph obtained from $K_{1,t+1}$ by subdividing an edge once. In this paper, we show that, for graphs $G$ without induced $t$-brooms, we have $\chi(G) = o(\omega(G)^{t+1})$, where $\chi(G)$ and $\omega(G)$ are the chromatic number and clique number of $G$, respectively. When $t=2$, this answers a question of Schiermeyer and Randerath. Moreover, for $t=2$, we strengthen the bound on $\chi(G)$ to $7.5\omega(G)^2$, confirming a conjecture of Sivaraman. For $t\geq 3$ and \{$t$-broom, $K_{t,t}$\}-free graphs, we improve the bound to $o(\omega^{t-1+\frac{2}{t+1}})$.
Mathematics | arxiv.org

Asymptotic normality for $m$-dependent and constrained $U$-statistics, with applications to pattern matching in random strings and permutations

We study (asymmetric) $U$-statistics based on a stationary sequence of $m$-dependent variables; moreover, we consider constrained $U$-statistics, where the defining multiple sum only includes terms satisfying some restrictions on the gaps between indices. Results include a law of large numbers and a central limit theorem. Special attention is paid to degenerate cases where, after the standard normalization, the asymptotic variance vanishes; in these cases non-normal limits occur after a different normalization.
Physics | arxiv.org

A fourth-order compact time-splitting method for the Dirac equation with time-dependent potentials

In this paper, we present an approach to deal with the dynamics of the Dirac equation with time-dependent electromagnetic potentials using the fourth-order compact time-splitting method ($S_\text{4c}$). To this end, the time-ordering technique for time-dependent Hamiltonians is introduced, so that the influence of the time-dependence is limited to certain steps which are easy to treat. In the case of the Dirac equation, it turns out that only those steps involving potentials need to be amended, and the scheme remains efficient, accurate, and easy to implement. Numerical examples in 1D and 2D are given to validate the scheme.
Mathematics | arxiv.org

Mixing of the Averaging process and its discrete dual on finite-dimensional geometries

We analyze the $L^1$-mixing of a generalization of the Averaging process introduced by Aldous. The process takes place on a growing sequence of graphs which we assume to be finite-dimensional, in the sense that the random walk on those geometries satisfies a family of Nash inequalities. As a byproduct of our analysis, we provide a complete picture of the total variation mixing of a discrete dual of the Averaging process, which we call Binomial Splitting process. A single particle of this process is essentially the random walk on the underlying graph. When several particles evolve together, they interact by synchronizing their jumps when placed on neighboring sites. We show that, given $k$ the number of particles and $n$ the (growing) size of the underlying graph, the system exhibits cutoff in total variation if $k\to\infty$ and $k=O(n^2)$. Finally, we exploit the duality between the two processes to show that the Binomial Splitting satisfies a version of Aldous' spectral gap identity, namely, the relaxation time of the process is independent of the number of particles.
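For orientation, the basic Averaging process can be sketched in a few lines: whenever an edge is selected, the values at its two endpoints are both replaced by their average. The code below is a simplified discrete-time version (one uniformly random edge per step) on an assumed cycle graph; it does not implement the generalization studied in the paper or the Binomial Splitting dual. It tracks the $L^1$ distance to the uniform profile, which cannot increase under an averaging step.

```python
import random


def averaging_step(values, edges, rng):
    """Pick a uniformly random edge and average its two endpoint values."""
    u, v = rng.choice(edges)
    average = 0.5 * (values[u] + values[v])
    values[u] = average
    values[v] = average


def l1_to_uniform(values):
    """L^1 distance between the current profile and the flat profile."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values)


def run(n=8, steps=200, seed=0):
    """Start with unit mass at vertex 0 of an n-cycle; return the L^1 trace."""
    rng = random.Random(seed)
    edges = [(i, (i + 1) % n) for i in range(n)]
    values = [1.0] + [0.0] * (n - 1)
    trace = [l1_to_uniform(values)]
    for _ in range(steps):
        averaging_step(values, edges, rng)
        trace.append(l1_to_uniform(values))
    return values, trace


if __name__ == "__main__":
    final_values, trace = run()
    print(trace[0], trace[-1])
```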
Computers | arxiv.org

Large Scale Private Learning via Low-rank Reparametrization

We propose a reparametrization scheme to address the challenges of applying differentially private SGD on large neural networks, which are 1) the huge memory cost of storing individual gradients, 2) the added noise suffering notorious dimensional dependence. Specifically, we reparametrize each weight matrix with two \emph{gradient-carrier} matrices of small dimension and a \emph{residual weight} matrix. We argue that such reparametrization keeps the forward/backward process unchanged while enabling us to compute the projected gradient without computing the gradient itself. To learn with differential privacy, we design \emph{reparametrized gradient perturbation (RGP)} that perturbs the gradients on gradient-carrier matrices and reconstructs an update for the original weight from the noisy gradients. Importantly, we use historical updates to find the gradient-carrier matrices, whose optimality is rigorously justified under linear regression and empirically verified with deep learning tasks. RGP significantly reduces the memory cost and improves the utility. For example, we are the first able to apply differential privacy on the BERT model and achieve an average accuracy of $83.9\%$ on four downstream tasks with $\epsilon=8$, which is within $5\%$ loss compared to the non-private baseline but enjoys much lower privacy leakage risk.
Mathematics | arxiv.org

$1$-independent percolation on $\mathbb{Z}^2 \times K_n$

A random graph model on a host graph $H$ is said to be $1$-independent if for every pair of vertex-disjoint subsets $A,B$ of $E(H)$, the state of edges (absent or present) in $A$ is independent of the state of edges in $B$. For an infinite connected graph $H$, the $1$-independent critical percolation probability $p_{1,c}(H)$ is the infimum of the $p\in [0,1]$ such that every $1$-independent random graph model on $H$ in which each edge is present with probability at least $p$ almost surely contains an infinite connected component.
Mathematics | arxiv.org

Accurate and efficient hydrodynamic analysis of structures with sharp edges by the Extended Finite Element Method (XFEM): 2D studies

Achieving accurate numerical results of hydrodynamic loads based on the potential-flow theory is very challenging for structures with sharp edges, due to the singular behavior of the local-flow velocities. In this paper, we introduce the Extended Finite Element Method (XFEM) to solve fluid-structure interaction problems involving sharp edges on structures. Four different FEM solvers, including conventional linear and quadratic FEMs as well as their corresponding XFEM versions with local enrichment by singular basis functions at sharp edges, are implemented and compared. To demonstrate the accuracy and efficiency of the XFEMs, a thin flat plate in an infinite fluid domain and a forced heaving rectangle at the free surface, both in two dimensions, will be studied. For the flat plate, the mesh convergence studies are carried out for both the velocity potential in the fluid domain and the added mass, and the XFEMs show apparent advantages thanks to their local enhancement at the sharp edges. Three different enrichment strategies are also compared, and suggestions will be made for the practical implementation of the XFEM. For the forced heaving rectangle, the linear and 2nd order mean wave loads are studied. Our results confirm the previous conclusion in the literature that it is not difficult for a conventional numerical model to obtain convergent results for added mass and damping coefficients. However, when the 2nd order mean wave loads requiring the computation of velocity components are calculated via direct pressure integration, it takes a tremendously large number of elements for the conventional FEMs to get convergent results. On the contrary, the numerical results of XFEMs converge rapidly even with very coarse meshes, especially for the quadratic XFEM.
Physics | arxiv.org

Hamiltonian analysis of fermions coupled to the Holst action

We report three manifestly Lorentz-invariant Hamiltonian formulations of minimally and nonminimally coupled fermion fields to the Holst action. These formulations are achieved by making a suitable parametrization of both the tetrad and the Lorentz connection, which allows us to integrate out some auxiliary fields without spoiling the local Lorentz symmetry. They have the peculiarity that their noncanonical symplectic structures as well as the phase-space variables for the gravitational sector are real. Moreover, two of these Hamiltonian formulations involve half-densitized fermion fields. We also impose the time gauge on these formulations, which leads to real connections for the gravitational configuration variables. Finally, we perform a symplectomorphism in one of the manifestly Lorentz-invariant Hamiltonian formulations and analyze the resulting formulation, which becomes the Hamiltonian formulation of fermion fields minimally coupled to the Palatini action for particular values of the coupling parameters.
Science | arxiv.org

Over-and-Under Complete Convolutional RNN for MRI Reconstruction

Reconstructing magnetic resonance (MR) images from undersampled data is a challenging problem due to various artifacts introduced by the undersampling operation. Recent deep learning-based methods for MR image reconstruction usually leverage a generic auto-encoder architecture which captures low-level features at the initial layers and high-level features at the deeper layers. Such networks focus heavily on global features, which may not be optimal for reconstructing the fully-sampled image. In this paper, we propose an Over-and-Under Complete Convolutional Recurrent Neural Network (OUCR), which consists of an overcomplete and an undercomplete Convolutional Recurrent Neural Network (CRNN). The overcomplete branch gives special attention to learning local structures by restraining the receptive field of the network. Combining it with the undercomplete branch leads to a network which focuses more on low-level features without losing out on the global structures. Extensive experiments on two datasets demonstrate that the proposed method achieves significant improvements over compressed sensing and popular deep learning-based methods with fewer trainable parameters. Our code is available at this https URL.
Astronomy | arxiv.org

Jointly setting upper limits on multiple components of an anisotropic stochastic gravitational-wave background

With the increasing sensitivities of the gravitational wave detectors and more detectors joining the international network, the chances of detecting a stochastic GW background (SGWB) are progressively increasing. Different astrophysical and cosmological processes are likely to give rise to backgrounds with distinct spectral signatures and distributions on the sky. The observed background will therefore be a superposition of these components. Hence, one of the first questions that will come up after the first detection of a SGWB will likely be about identifying the dominant components and their distributions on the sky. Both these questions were addressed separately in the literature, namely, how to separate components of isotropic backgrounds and how to probe the anisotropy of a single component. Here, we address the question of how to separate distinct anisotropic backgrounds with (sufficiently) different spectral shapes. We first obtain the combined Fisher information matrix from folded data using an efficient analysis pipeline PyStoch, which incorporates covariances between pixels and spectral indices. This is necessary for estimating the detection statistic and setting upper limits. However, based on a recent study, we ignore the pixel-to-pixel noise covariance that does not have a significant effect on the results at the present sensitivity levels of the detectors. We establish the validity of our formalism using injection studies. We show that the joint analysis accurately separates and estimates backgrounds with different spectral shapes and different sky distributions with no major bias. This does come at the cost of increased variance, thus making the joint upper limits safer, though less strict, than those from the individual analyses. We finally set joint upper limits on the multi-component anisotropic background using aLIGO data taken up to the first half of the third observing run.