Closed linear spaces consisting of strongly norm attaining Lipschitz mappings

By Vladimir Kadets, Óscar Roldán
arxiv.org
 2 days ago

Vladimir Kadets (1), Óscar Roldán (2)
(1) V. N. Karazin Kharkiv National University, (2) Universitat de València

Given a pointed metric space $M$, we study when we can form non-trivial linear subspaces of $\operatorname{Lip}_0(M)$ consisting...

arxiv.org

On Linear Separability under Linear Compression with Applications to Hard Support Vector Machine

This paper investigates the theoretical problem of maintaining linear separability of the data-generating distribution under linear compression. While it has been long known that linear separability may be maintained by linear transformations that approximately preserve the inner products between the domain points, the limit to which the inner products are preserved in order to maintain linear separability was unknown. In this paper, we show that linear separability is maintained as long as the distortion of the inner products is smaller than the squared margin of the original data-generating distribution. The proof is mainly based on the geometry of hard support vector machines (SVM) extended from the finite set of training examples to the (possibly) infinite domain of the data-generating distribution. As applications, we derive bounds on the (i) compression length of random sub-Gaussian matrices; and (ii) generalization error for compressive learning with hard-SVM.
MATHEMATICS
arxiv.org

S-duality and the universal isometries of q-map spaces

The tree-level q-map assigns to a projective special real (PSR) manifold of dimension $n-1\geq 0$, a quaternionic Kähler (QK) manifold of dimension $4n+4$. It is known that the resulting QK manifold admits a $(3n+5)$-dimensional universal group of isometries (i.e. independently of the choice of PSR manifold). On the other hand, in the context of Calabi-Yau compactifications of type IIB string theory, the large volume limit of the hypermultiplet moduli space metric is an instance of a tree-level q-map space, and it is known from the physics literature that such a metric has an $\mathrm{SL}(2,\mathbb{R})$ group of isometries related to the $\mathrm{SL}(2,\mathbb{Z})$ S-duality symmetry of the full 10d theory. We present a purely mathematical proof that any tree-level q-map space admits such an $\mathrm{SL}(2,\mathbb{R})$ action by isometries, enlarging the previous universal group of isometries to a $(3n+6)$-dimensional group $G$. As part of this analysis, we describe how the $(3n+5)$-dimensional subgroup interacts with the $\mathrm{SL}(2,\mathbb{R})$-action, and find a codimension one normal subgroup of $G$ that is unimodular. By taking a quotient with respect to a lattice in the unimodular group, we obtain a quaternionic Kähler manifold fibering over a projective special real manifold with fibers of finite volume, and compute the volume as a function of the base. We furthermore provide a mathematical treatment of results from the physics literature concerning the twistor space of the tree-level q-map space and the holomorphic lift of the $(3n+6)$-dimensional group of universal isometries to the twistor space.
SCIENCE
arxiv.org

Lipschitz-constrained Unsupervised Skill Discovery

We study the problem of unsupervised skill discovery, whose goal is to learn a set of diverse and useful skills with no external reward. There have been a number of skill discovery methods based on maximizing the mutual information (MI) between skills and states. However, we point out that their MI objectives usually prefer static skills to dynamic ones, which may hinder the application for downstream tasks. To address this issue, we propose Lipschitz-constrained Skill Discovery (LSD), which encourages the agent to discover more diverse, dynamic, and far-reaching skills. Another benefit of LSD is that its learned representation function can be utilized for solving goal-following downstream tasks even in a zero-shot manner - i.e., without further training or complex planning. Through experiments on various MuJoCo robotic locomotion and manipulation environments, we demonstrate that LSD outperforms previous approaches in terms of skill diversity, state space coverage, and performance on seven downstream tasks including the challenging task of following multiple goals on Humanoid. Our code and videos are available at this https URL.
ARTIFICIAL INTELLIGENCE
arxiv.org

Euclidean algorithm for a class of linear orders

Borrowing inspiration from Marcone and Montalbán's one-to-one correspondence between the class of signed trees and the equimorphism classes of indecomposable scattered linear orders, we find a subclass of signed trees which has an analogous correspondence with equimorphism classes of indecomposable finite rank discrete linear orders. We also introduce the...
MATHEMATICS
arxiv.org

Lipschitz continuity and Bochner-Eells-Sampson inequality for harmonic maps from $\mathrm{RCD}(K,N)$ spaces to $\mathrm{CAT}(0)$ spaces

We establish Lipschitz regularity of harmonic maps from $\mathrm{RCD}(K,N)$ metric measure spaces with lower Ricci curvature bounds and dimension upper bounds in synthetic sense with values into $\mathrm{CAT}(0)$ metric spaces with non-positive sectional curvature. Under the same assumptions, we obtain a Bochner-Eells-Sampson inequality with a Hessian type-term. This gives a fairly complete generalization of the classical theory for smooth source and target spaces to their natural synthetic counterparts and an affirmative answer to a question raised several times in the recent literature.
MATHEMATICS
arxiv.org

Linear Model with Local Differential Privacy

Scientific collaborations benefit from collaborative learning of distributed sources, but remain difficult to achieve when data are sensitive. In recent years, privacy-preserving techniques have been widely studied to analyze distributed data across different agencies while protecting sensitive information. Secure multiparty computation has been widely studied for privacy protection, offering a high privacy level but at an intense computation cost. There are also other security techniques that sacrifice partial data utility to reduce disclosure risk. A major challenge is to balance data utility and disclosure risk while maintaining high computation efficiency. In this paper, a matrix masking technique is applied to encrypt data such that the schemes are secure against malicious adversaries while achieving local differential privacy. The proposed schemes are designed for linear models and can be implemented for both vertical and horizontal partitioning scenarios. Moreover, cross validation is studied to prevent overfitting and select optimal parameters without additional communication cost. Simulation results demonstrate the efficiency of the proposed schemes in analyzing datasets with millions of records and high-dimensional data (n << p).
COMPUTERS
arxiv.org

Reduced order modeling with Barlow Twins self-supervised learning: Navigating the space between linear and nonlinear solution manifolds

We propose a unified data-driven reduced order model (ROM) that bridges the performance gap between linear and nonlinear manifold approaches. Deep learning ROM (DL-ROM) using deep-convolutional autoencoders (DC-AE) has been shown to capture nonlinear solution manifolds but fails to perform adequately when linear subspace approaches such as proper orthogonal decomposition (POD) would be optimal. Besides, most DL-ROM models rely on convolutional layers, which might limit their application to only a structured mesh. The proposed framework in this study relies on the combination of an autoencoder (AE) and Barlow Twins (BT) self-supervised learning, where BT maximizes the information content of the embedding with the latent space through a joint embedding architecture. Through a series of benchmark problems of natural convection in porous media, BT-AE performs better than the previous DL-ROM framework by providing comparable results to POD-based approaches for problems where the solution lies within a linear subspace as well as DL-ROM autoencoder-based techniques where the solution lies on a nonlinear manifold; consequently, it bridges the gap between linear and nonlinear reduced manifolds. Furthermore, this BT-AE framework can operate on unstructured meshes, which provides flexibility in its application to standard numerical solvers, on-site measurements, experimental data, or a combination of these sources.
COMPUTERS
arxiv.org

New Penalized Stochastic Gradient Methods for Linearly Constrained Strongly Convex Optimization

For minimizing a strongly convex objective function subject to linear inequality constraints, we consider a penalty approach that allows one to utilize stochastic methods for problems with a large number of constraints and/or objective function terms. We provide upper bounds on the distance between the solutions to the original constrained problem and the penalty reformulations, guaranteeing the convergence of the proposed approach. We give a nested accelerated stochastic gradient method and propose a novel way for updating the smoothness parameter of the penalty function and the step-size. The proposed algorithm requires at most $\tilde O(1/\sqrt{\epsilon})$ expected stochastic gradient iterations to produce a solution within an expected distance of $\epsilon$ to the optimal solution of the original problem, which is the best complexity for this problem class to the best of our knowledge. We also show how to query an approximate dual solution after stochastically solving the penalty reformulations, leading to results on the convergence of the duality gap. Moreover, the nested structure of the algorithm and upper bounds on the distance to the optimal solutions allows one to safely eliminate constraints that are inactive at an optimal solution throughout the algorithm, which leads to improved complexity results. Finally, we present computational results that demonstrate the effectiveness and robustness of our algorithm.
SCIENCE
arxiv.org

The 3-dimensional complex projective space admits no special generic maps

The main theorem of the present paper is that the 3-dimensional complex projective space does not admit special generic maps. Special generic maps are generalized versions of Morse functions on spheres with exactly two singular points. The canonical projections of unit spheres belong to this class. The paper mainly focuses on special generic maps on 6-dimensional closed and simply-connected manifolds.
MATHEMATICS
arxiv.org

Covertly Controlling a Linear System

Consider the problem of covertly controlling a linear system. In this problem, Alice desires to control (stabilize or change the parameters of) a linear system, while keeping an observer, Willie, unable to decide if the system is indeed being controlled or not. We formally define the problem, under two different...
COMPUTERS
arxiv.org

Exploiting deterministic algorithms to perform global sensitivity analysis for continuous-time Markov chain compartmental models with application to epidemiology

Henri Mermoz Kouye (INRAE, MaIAGE, AIRSEA), Gildas Mazo (INRAE, MaIAGE), Clémentine Prieur (UGA, CNRS, Grenoble INP, AIRSEA), Elisabeta Vergu (INRAE, MaIAGE)

In this paper, we develop an approach of global sensitivity analysis for compartmental models based on continuous-time Markov chains. We propose to measure the sensitivity of quantities of interest by representing the Markov chain as a deterministic function of the uncertain parameters and a random variable with known distribution modeling intrinsic randomness. This representation is exact and does not rely on meta-modeling. An application to a SARS-CoV-2 epidemic model is included to illustrate the practical impact of our approach.
SCIENCE
arxiv.org

L2C2: Locally Lipschitz Continuous Constraint towards Stable and Smooth Reinforcement Learning

This paper proposes a new regularization technique for reinforcement learning (RL) towards making policy and value functions smooth and stable. RL is known for the instability of the learning process and the sensitivity of the acquired policy to noise. Several methods have been proposed to resolve these problems, and in summary, the smoothness of policy and value functions learned mainly in RL contributes to these problems. However, if these functions are extremely smooth, their expressiveness would be lost, resulting in not obtaining the global optimal solution. This paper therefore considers RL under local Lipschitz continuity constraint, so-called L2C2. By designing the spatio-temporal locally compact space for L2C2 from the state transition at each time step, the moderate smoothness can be achieved without loss of expressiveness. Numerical noisy simulations verified that the proposed L2C2 outperforms the task performance while smoothing out the robot action generated from the learned policy.
COMPUTERS
arxiv.org

Random Feature Amplification: Feature Learning and Generalization in Neural Networks

In this work, we provide a characterization of the feature-learning process in two-layer ReLU networks trained by gradient descent on the logistic loss following random initialization. We consider data with binary labels that are generated by an XOR-like function of the input features. We permit a constant fraction of the training labels to be corrupted by an adversary. We show that, although linear classifiers are no better than random guessing for the distribution we consider, two-layer ReLU networks trained by gradient descent achieve generalization error close to the label noise rate, refuting the conjecture of Malach and Shalev-Shwartz that 'deeper is better only when shallow is good'. We develop a novel proof technique that shows that at initialization, the vast majority of neurons function as random features that are only weakly correlated with useful features, and the gradient descent dynamics 'amplify' these weak, random features to strong, useful features.
COMPUTERS
arxiv.org

Contrasting pseudo-criticality in the classical two-dimensional Heisenberg and $\mathrm{RP}^2$ models: zero-temperature phase transition versus finite-temperature crossover

Tensor-network methods are used to perform a comparative study of the two-dimensional classical Heisenberg and $\mathrm{RP}^2$ models. We demonstrate that uniform matrix product states (MPS) with explicit $\mathrm{SO}(3)$ symmetry can probe correlation lengths up to $\mathcal{O}(10^3)$ sites accurately, and we study the scaling of entanglement entropy and universal features of MPS entanglement spectra. For the Heisenberg model, we find no signs of a finite-temperature phase transition, supporting the scenario of asymptotic freedom. For the $\mathrm{RP}^2$ model we observe an abrupt onset of scaling behaviour, consistent with hints of a finite-temperature phase transition reported in previous studies. A careful analysis of the softening of the correlation length divergence, the scaling of the entanglement entropy and the MPS entanglement spectra shows that our results are inconsistent with true criticality, but are rather in agreement with the scenario of a crossover to a pseudo-critical region which exhibits strong signatures of nematic quasi-long-range order at length scales below the true correlation length. Our results reveal a fundamental difference in scaling behaviour between the Heisenberg and $\mathrm{RP}^2$ models: Whereas the emergence of scaling in the former shifts to zero temperature if the bond dimension is increased, it occurs at a finite bond-dimension independent crossover temperature in the latter.
SCIENCE
arxiv.org

Optically-induced magnetization switching in NiCo2O4 thin films using ultrafast lasers

Recently, all-optical magnetization control has been garnering considerable attention in realizing next-generation ultrafast magnetic information devices. Here, employing a magneto-optical Kerr effect (MOKE) microscope, we observed the laser-induced magnetization switching of ferrimagnetic oxide NiCo2O4 (NCO) epitaxial thin films with perpendicular magnetic anisotropy, where the sample was pumped with 1030-nm laser pulses, and magnetic domain images were acquired via the MOKE microscope with a white light emitting diode. Laser pulses irradiated an NCO thin film at various temperatures from 300 K to 400 K while the pulse interval, fluence, and number of pulses were varied in the absence of an external magnetic field. We observed accumulative all-optical switching at 380 K and above. Our observation of oxide NCO thin films facilitates the realization of chemically stable magnetization switching using ultrafast lasers, without applying a magnetic field.
PHYSICS
arxiv.org

Identifying strongly correlated groups of sections in a large motorway network

In a motorway network, correlations between the different links, i.e. between the parts of (different) motorways, are of considerable interest. Knowledge of fluxes and velocities on individual motorways is not sufficient, rather, their correlations determine or reflect, respectively, the functionality of and the dynamics on the network as a whole. These correlations are time dependent as the dynamics on the network is highly non-stationary, as it strongly varies during the day and over the week. Correlations are indispensable to detect risks of failure in a traffic network. Discovery of alternative routes less correlated with the vulnerable ones helps to make the traffic network robust and to avoid a collapse. Hence, the identification of, especially, groups of strongly correlated road sections is needed. To this end, we employ an optimized $k$-means clustering method. A major ingredient is the spectral information of certain correlation matrices in which the leading collective motion of the network has been removed. We identify strongly correlated groups of sections in the large motorway network of North Rhine-Westphalia (NRW), Germany. The groups classify the motorway sections in terms of spectral and geographic features as well as of traffic phases during different time periods. The representation and visualization of the groups on the real topology, i.e. on the road map, provides new results on the dynamics on the motorway network. Our approach is very general and can also be applied to other correlated complex systems.
TRAFFIC
arxiv.org

Laser-patterned submicron Bi2Se3-WS2 pixels with tunable circular polarization at room temperature

Zachariah Hennighausen, Darshana Wickramaratne, Kathleen M. McCreary, Bethany M. Hudak, Todd Brintlinger, Hsun-Jen Chuang, Mehmet A. Noyan, Berend T. Jonker, Rhonda M. Stroud, Olaf M. vant Erve.

Characterizing and manipulating the circular polarization of light is central to numerous emerging technologies, including spintronics and quantum computing. Separately, monolayer tungsten disulfide...
SCIENCE
arxiv.org

TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm

Approximating distributions from their samples is a canonical statistical-learning problem. One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial, where $t\ge1$ and $d\ge0$. Letting $c_{t,d}$ denote the smallest such factor, clearly $c_{1,0}=1$, and it can be shown that $c_{t,d}\ge 2$ for all other $t$ and $d$. Yet current computationally efficient algorithms show only $c_{t,1}\le 2.25$ and the bound rises quickly to $c_{t,d}\le 3$ for $d\ge 9$. We derive a near-linear-time and essentially sample-optimal estimator that establishes $c_{t,d}=2$ for all $(t,d)\ne(1,0)$. Additionally, for many practical distributions, the lowest approximation distance is achieved by polynomials with vastly varying number of pieces. We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation. Experiments combining the two techniques confirm improved performance over existing methodologies.
COMPUTERS
arxiv.org

Flowformer: Linearizing Transformers with Conservation Flows

Transformers based on the attention mechanism have achieved impressive success in various areas. However, the attention mechanism has a quadratic complexity, significantly impeding Transformers from dealing with numerous tokens and scaling up to bigger models. Previous methods mainly utilize the similarity decomposition and the associativity of matrix multiplication to devise linear-time attention mechanisms. They avoid degeneration of attention to a trivial distribution by reintroducing inductive biases such as the locality, thereby at the expense of model generality and expressiveness. In this paper, we linearize Transformers free from specific inductive biases based on the flow network theory. We cast attention as the information flow aggregated from the sources (values) to the sinks (results) through the learned flow capacities (attentions). Within this framework, we apply the property of flow conservation with attention and propose the Flow-Attention mechanism of linear complexity. By respectively conserving the incoming flow of sinks for source competition and the outgoing flow of sources for sink allocation, Flow-Attention inherently generates informative attentions without using specific inductive biases. Empowered by the Flow-Attention, Flowformer yields strong performance in linear time for wide areas, including long sequence, time series, vision, natural language, and reinforcement learning.
TECHNOLOGY
arxiv.org

Control and stabilization of geometrically exact beams

We study well-posedness, stabilization and control problems involving freely vibrating beams that may undergo motions of large magnitude -- i.e. large displacements of the reference line and large rotations of the cross sections. Such beams, shearable and very flexible, are often called geometrically exact beams and are especially needed in modern highly flexible light-weight structures, where one cannot neglect these large motions. We view these beams from two perspectives. The first perspective is one in which the beam is described in terms of the position of its reference line and the orientation of its cross sections (expressed in some fixed coordinate system). This is the generally encountered model, due to Eric Reissner and Juan C. Simo. Of second order in time and space, it is a quasilinear system of six equations. The second perspective is one in which the beam is rather described by intrinsic variables -- velocities and strains or internal forces and moments -- which are moreover expressed in a moving coordinate system attached to the beam. This system, proposed in its most general form by Dewey H. Hodges, consists of twice as many equations, but is of first order in time and space, hyperbolic and only semilinear (quadratic). From the definition of the state of the latter model, one can see that both perspectives are linked by a nonlinear transformation. The questions of well-posedness, stabilization and control are addressed for beams governed by the intrinsic model, while by using the transformation we also prove that the existence and uniqueness of a classical solution to the intrinsic model implies that of a classical solution to the model written in terms of positions and rotations. In particular, this enables us to deduce corresponding results for the latter model. We also address these questions for networks of beams attached to each other by means of rigid joints.
SCIENCE

