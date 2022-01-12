
SLISEMAP: Explainable Dimensionality Reduction

By Anton Björklund, Jarmo Mäkelä, Kai Puolamäki
 3 days ago

Existing explanation methods for black-box supervised learning models generally work by building local models that explain the model's behaviour for a particular data item. It is possible to make global explanations, but the explanations may have low fidelity for complex...

Related
Interesting Engineering

The Internet Is Running Out of Water. Here's What That Means

The past few decades heralded a new era in digital information processing, with microchips doubling in speed every two years and miniaturizing equipment that once took up entire rooms. Today, the smartphone you're reading this article on probably outperforms the best technologies of the past, as massive data centers filled with computers holding all kinds of information keep our world turning. These computers, known as servers, provide support for the software, apps, and websites that we use every day.
INTERNET
Gadget Flow

Most innovative gadgets of 2021

2021 brought us cutting-edge gadgets until the very end. From devices that make life easier at home to those that help us stay safe and healthy, these are the most innovative gadgets of 2021. Gadgets in 2021 were nothing short of awe-inspiring. Yes, this was the year LG introduced a...
ELECTRONICS
Fortune

5 useful tools for remote workers unveiled at CES 2022

While many workers—and even more managers—hoped to be back at the office by now, COVID-19 and the Omicron variant have wrecked those plans. Google, Uber, and Ford, for instance, have pushed back their return-to-office dates—and in the case of Google and Uber, the postponement is indefinite.
CELL PHONES
Interesting Engineering

Will Smart Glasses Soon Replace Smartphones?

Smart glasses nowadays can do everything that smartphones can but are also hands-free. They effectively blend our field of view with the virtual world through a combination of displays, sensors, software, and internet connectivity. They also boast a camera, speaker, and microphone. Unlike smartphones, they can even be controlled by...
ELECTRONICS
arxiv.org

DiffuseVAE: Efficient, Controllable and High-Fidelity Generation from Low-Dimensional Latents

Diffusion Probabilistic models have been shown to generate state-of-the-art results on several competitive image synthesis benchmarks but lack a low-dimensional, interpretable latent space, and are slow at generation. On the other hand, Variational Autoencoders (VAEs) typically have access to a low-dimensional latent space but exhibit poor sample quality. Despite recent advances, VAEs usually require high-dimensional hierarchies of the latent codes to generate high-quality samples. We present DiffuseVAE, a novel generative framework that integrates VAE within a diffusion model framework, and leverage this to design a novel conditional parameterization for diffusion models. We show that the resulting model can improve upon the unconditional diffusion model in terms of sampling efficiency while also equipping diffusion models with the low-dimensional VAE inferred latent code. Furthermore, we show that the proposed model can generate high-resolution samples and exhibits synthesis quality comparable to state-of-the-art models on standard benchmarks. Lastly, we show that the proposed method can be used for controllable image synthesis and also exhibits out-of-the-box capabilities for downstream tasks like image super-resolution and denoising. For reproducibility, our source code is publicly available at \url{this https URL}.
COMPUTERS
arxiv.org

A General Framework for Treatment Effect Estimation in Semi-Supervised and High Dimensional Settings

In this article, we aim to provide a general and complete understanding of semi-supervised (SS) causal inference for treatment effects. Specifically, we consider two such estimands: (a) the average treatment effect and (b) the quantile treatment effect, as prototype cases, in an SS setting, characterized by two available data sets: (i) a labeled data set of size $n$, providing observations for a response and a set of high dimensional covariates, as well as a binary treatment indicator; and (ii) an unlabeled data set of size $N$, much larger than $n$, but without the response observed. Using these two data sets, we develop a family of SS estimators which are ensured to be: (1) more robust and (2) more efficient than their supervised counterparts based on the labeled data set only. Beyond the 'standard' double robustness results (in terms of consistency) that can be achieved by supervised methods as well, we further establish root-n consistency and asymptotic normality of our SS estimators whenever the propensity score in the model is correctly specified, without requiring specific forms of the nuisance functions involved. Such an improvement of robustness arises from the use of the massive unlabeled data, so it is generally not attainable in a purely supervised setting. In addition, our estimators are shown to be semi-parametrically efficient as long as all the nuisance functions are correctly specified. Moreover, as an illustration of the nuisance estimators, we consider inverse-probability-weighting type kernel smoothing estimators involving unknown covariate transformation mechanisms, and establish in high dimensional scenarios novel results on their uniform convergence rates, which should be of independent interest. Numerical results on both simulated and real data validate the advantage of our methods over their supervised counterparts with respect to both robustness and efficiency.
SCIENCE
arxiv.org

Scalable semi-supervised dimensionality reduction with GPU-accelerated EmbedSOM

By Adam Šmelko, Soňa Molnárová, Miroslav Kratochvíl, Abhishek Koladiya, Jan Musil, Martin Kruliš, Jiří Vondrášek
Dimensionality reduction methods have found vast application as visualization tools in diverse areas of science. Although many different methods exist, their performance is often insufficient for providing quick insight into many contemporary datasets, and the unsupervised mode of use prevents the users from utilizing the methods for dataset exploration and fine-tuning the details for improved visualization quality. We present BlosSOM, a high-performance semi-supervised dimensionality reduction software for interactive user-steerable visualization of high-dimensional datasets with millions of individual data points. BlosSOM builds on a GPU-accelerated implementation of the EmbedSOM algorithm, complemented by several landmark-based algorithms for interfacing the unsupervised model learning algorithms with the user supervision. We show the application of BlosSOM on realistic datasets, where it helps to produce high-quality visualizations that incorporate user-specified layout and focus on certain features. We believe the semi-supervised dimensionality reduction will improve the data visualization possibilities for science areas such as single-cell cytometry, and provide a fast and efficient base methodology for new directions in dataset exploration and annotation.
CODING & PROGRAMMING
arxiv.org

Similarity reductions of peakon equations: the $b$-family

The $b$-family is a one-parameter family of Hamiltonian partial differential equations of non-evolutionary type, which arises in shallow water wave theory. It admits a variety of solutions, including the celebrated peakons, which are weak solutions in the form of peaked solitons with a discontinuous first derivative at the peaks, as well as other interesting solutions that have been obtained in exact form and/or numerically. In each of the special cases $b=2,3$ (the Camassa-Holm and Degasperis-Procesi equations, respectively) the equation is completely integrable, in the sense that it admits a Lax pair and an infinite hierarchy of commuting local symmetries, but for other values of the parameter $b$ it is non-integrable. After a discussion of travelling waves via the use of a reciprocal transformation, which reduces to a hodograph transformation at the level of the ordinary differential equation satisfied by these solutions, we apply the same technique to the scaling similarity solutions of the $b$-family, and show that when $b=2$ or $3$ this similarity reduction is related by a hodograph transformation to particular cases of the Painlevé III equation, while for all other choices of $b$ the resulting ordinary differential equation is not of Painlevé type.
MATHEMATICS
arxiv.org

Singularity models in the three-dimensional Ricci flow

The Ricci flow is a natural evolution equation for Riemannian metrics on a given manifold. The main goal is to understand singularity formation. In his spectacular 2002 breakthrough, Perelman achieved a qualitative understanding of singularity formation in dimension $3$. More precisely, Perelman showed that every finite-time singularity to the Ricci flow in dimension $3$ is modeled on an ancient $\kappa$-solution. Moreover, Perelman proved a structure theorem for ancient $\kappa$-solutions in dimension $3$.
SCIENCE
arxiv.org

Dissipative structure of one-dimensional isothermal compressible fluids of Korteweg type

This paper studies the dissipative structure of the system of equations that describes the motion of a compressible, isothermal, viscous-capillar fluid of Korteweg type in one space dimension. It is shown that the system satisfies the genuine coupling condition of Humpherys (J. Hyperbolic Differ. Equ. 2, 2005, no. 4, 963-974), which is, in turn, an extension to higher order systems of the classical condition by Kawashima and Shizuta (Tohoku Math. J. 40, 1988, no. 3, 449-464; Hokkaido Math. J. 14, 1985, no. 2, 249-275) for second order systems. It is proved that genuine coupling implies the decay of solutions to the linearized system around a constant equilibrium state. For that purpose, the symmetrizability of the Fourier symbol is used in order to construct an appropriate compensating matrix. These linear decay estimates imply the global decay of perturbations to constant equilibrium states as solutions to the full nonlinear system, via a standard continuation argument.
SCIENCE
arxiv.org

On the spectrum of Schrödinger-type operators on two dimensional lattices

We consider a family $$ \widehat H_{a,b}(\mu)=\widehat H_0 +\mu \widehat V_{a,b}\quad \mu>0, $$ of Schrödinger-type operators on the two dimensional lattice $\mathbb{Z}^2,$ where $\widehat H_0$ is a Laurent-Toeplitz-type convolution operator with a given Hopping matrix $\hat{e}$ and $\widehat V_{a,b}$ is a potential taking into account only the zero-range and one-range interactions, i.e., a multiplication operator by a function $\hat v$ such that $\hat v(0)=a,$ $\hat v(x)=b$ for $|x|=1$ and $\hat v(x)=0$ for $|x|\ge2,$ where $a,b\in\mathbb{R}\setminus\{0\}.$ Under certain conditions on the regularity of $\hat{e}$ we completely describe the discrete spectrum of $\hat H_{a,b}(\mu)$ lying above the essential spectrum and study the dependence of eigenvalues on parameters $\mu,$ $a$ and $b.$ Moreover, we characterize the threshold eigenfunctions and resonances.
MATHEMATICS
Phys.org

Researchers detect two-dimensional kagome surface states

Kagome lattices have become a new focus in the study of condensed matter physics for their novel features. However, due to the in-plane and interlayer interactions in materials, the intrinsic features of the 2D kagome lattices are often affected or even destroyed, causing the bulk states of the material to be inconsistent with its characteristic structure in theoretical calculation.
PHYSICS
arxiv.org

$m^\ast$ of two-dimensional electron gas: a neural canonical transformation study

The quasiparticle effective mass $m^\ast$ of interacting electrons is a fundamental quantity in the Fermi liquid theory. However, the precise value of the effective mass of uniform electron gas is still elusive after decades of research. The newly developed neural canonical transformation approach arXiv:2105.08644 offers a principled way to extract the effective mass of electron gas by directly calculating the thermal entropy at low temperature. The approach models a variational many-electron density matrix using two generative neural networks: an autoregressive model for momentum occupation and a normalizing flow for electron coordinates. Our calculation reveals a suppression of effective mass in the two-dimensional spin-polarized electron gas, which is more pronounced than previous reports in the low-density strong-coupling region. This prediction calls for verification in two-dimensional electron gas experiments.
MATHEMATICS
arxiv.org

Compilation and scaling strategies for a silicon quantum processor with sparse two-dimensional connectivity

Inspired by the challenge of scaling up existing silicon quantum hardware, we investigate compilation strategies for sparsely-connected 2d qubit arrangements and propose a spin-qubit architecture with minimal compilation overhead. Our architecture is based on silicon nanowire split-gate transistors which can form finite 1d chains of spin-qubits and allow the execution of two-qubit operations such as Swap gates among neighbors. Adding to this, we describe a novel silicon junction which can couple up to four nanowires into 2d arrangements via spin shuttling and Swap operations. Given these hardware elements, we propose a modular sparse 2d spin-qubit architecture with unit cells consisting of diagonally-oriented squares with nanowires along the edges and junctions on the corners. We show that this architecture allows for compilation strategies which outperform the best-in-class compilation strategy for 1d chains, not only asymptotically, but also down to the minimal structure of a single square. The proposed architecture exhibits favorable scaling properties which allow for balancing the trade-off between compilation overhead and co-location of classical control electronics within each square by adjusting the length of the nanowires. An appealing feature of the proposed architecture is its manufacturability using complementary-metal-oxide-semiconductor (CMOS) fabrication processes. Finally, we note that our compilation strategies, while being inspired by spin-qubits, are equally valid for any other quantum processor with sparse 2d connectivity.
COMPUTERS
arxiv.org

Estimating Heterogeneous Causal Effects of High-Dimensional Treatments: Application to Conjoint Analysis

Estimation of heterogeneous treatment effects is an active area of research in causal inference. Most of the existing methods, however, focus on estimating the conditional average treatment effects of a single, binary treatment given a set of pre-treatment covariates. In this paper, we propose a method to estimate the heterogeneous causal effects of high-dimensional treatments, which poses unique challenges in terms of estimation and interpretation. The proposed approach is based on a Bayesian mixture of regularized regressions to identify groups of units who exhibit similar patterns of treatment effects. By directly modeling cluster membership with covariates, the proposed methodology allows one to explore the unit characteristics that are associated with different patterns of treatment effects. Our motivating application is conjoint analysis, which is a popular survey experiment in social science and marketing research and is based on a high-dimensional factorial design. We apply the proposed methodology to the conjoint data, where survey respondents are asked to select one of two immigrant profiles with randomly selected attributes. We find that a group of respondents with a relatively high degree of prejudice appears to discriminate against immigrants from non-European countries like Iraq. An open-source software package is available for implementing the proposed methodology.
SCIENCE
arxiv.org

High-dimensional variable selection with heterogeneous signals: A precise asymptotic perspective

We study the problem of exact support recovery for high-dimensional sparse linear regression when the signals are weak, rare and possibly heterogeneous. Specifically, we fix the minimum signal magnitude at the information-theoretic optimal rate and investigate the asymptotic selection accuracy of best subset selection (BSS) and marginal screening (MS) procedures under independent Gaussian design. Despite the ideal setup, somewhat surprisingly, marginal screening can fail to achieve exact recovery with probability converging to one in the presence of heterogeneous signals, whereas BSS enjoys model consistency whenever the minimum signal strength is above the information-theoretic threshold. To mitigate the computational issue of BSS, we also propose a surrogate two-stage algorithm called ETS (Estimate Then Screen) based on iterative hard thresholding and gradient coordinate screening, and we show that ETS shares exactly the same asymptotic optimality in terms of exact recovery as BSS. Finally, we present a simulation study comparing ETS with LASSO and marginal screening. The numerical results echo our asymptotic theory even for realistic values of the sample size, dimension and sparsity.
SCIENCE
arxiv.org

A novel framework for the three-dimensional NLTE inverse problem

Inversion of spectropolarimetric observations of the solar upper atmosphere is one of the most challenging goals in solar physics. If we account for all relevant ingredients of the spectral line formation process such as three-dimensional (3D) radiative transfer out of local thermodynamic equilibrium (NLTE), the task becomes extremely computationally expensive. Instead of generalizing 1D methods to 3D, we develop a new approach to the inverse problem. In our meshfree method we do not consider the requirement of 3D NLTE consistency as an obstacle, but as a natural regularization with respect to the traditional pixel-by-pixel methods. This leads to more robust and less ambiguous solutions. We solve the 3D NLTE inverse problem as an unconstrained global minimization problem avoiding repetitive evaluations of the $\Lambda$~operator. Apart from 3D NLTE consistency, the method makes it easy to include additional conditions of physical consistency such as zero divergence of the magnetic field. Stochastic ingredients make the method less prone to ending up in local minima of the loss function. Our method is capable of solving the inverse problem orders of magnitude faster than would be possible using grid-based methods. The method can provide accurate and physically consistent results if sufficient computing time is available, but also approximate solutions in case of very complex plasma structures or limited computing time.
SCIENCE
arxiv.org

A reduced variational approach for searching cycles in high-dimensional systems

Searching for recurrent patterns in complex systems with high-dimensional phase spaces is an important task in diverse fields. In the current work, an improved scheme is proposed to accelerate the recently designed variational approach for finding periodic orbits in systems with chaotic dynamics based on the existence of an inertial manifold widely observed in various spatially extended systems, especially those with high dimensions. On the premise of keeping exponential convergence of the variational method, an effective loop evolution equation is derived to greatly reduce the storage and computing time. With repeated modification of local coordinates and evolution of the guess loop being carried out alternately, the rapid convergence and the stability of the reduction scheme are effectively achieved. The dimension of local coordinate subspaces is generally larger than the number of nonnegative Lyapunov exponents to ensure the exponential convergence. The proposed scheme is successfully demonstrated on several well-known examples and is expected to supply a powerful tool in the exploration of high-dimensional nonlinear systems.
SCIENCE
arxiv.org

An approximate Bayes factor based high dimensional MANOVA using Random Projections

The high-dimensional mean vector testing problem for two or more groups remains a very active research area. In these settings, traditional tests are not applicable because they involve the inversion of a rank-deficient group covariance matrix. In current approaches, this problem is addressed by simply looking at a test assuming a sparse or diagonal covariance matrix, potentially ignoring complex dependency between features. In this paper, we develop a Bayes factor (BF) based testing procedure for comparing two or more population means in (very) high dimensional settings. Two versions of the Bayes factor based test statistics are considered, which are based on a random projection (RP) approach. RPs are appealing since they make no assumption about the form of the dependency across features in the data. The final test statistic is based on an ensemble of Bayes factors corresponding to multiple replications of randomly projected data. Both proposed test statistics are compared through a battery of simulation settings. Finally, they are applied to the analysis of a publicly available genomic single cell RNA-seq (scRNA-seq) dataset.
SCIENCE

