Mathematics

Permutation invariant matrix statistics and computational language tasks

By Manuel Accettulli Huber, Adriana Correia, Sanjaye Ramgoolam, Mehrnoosh Sadrzadeh
arxiv.org
 2 days ago

The Linguistic Matrix Theory programme introduced by Kartsaklis, Ramgoolam and Sadrzadeh is an approach to the statistics of matrices that are generated in type-driven distributional semantics, based on permutation invariant polynomial functions...
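
As a rough illustration of what "permutation invariant polynomial functions" of a matrix can mean, the sketch below evaluates a few low-degree polynomials of a square matrix M and checks that they are unchanged when rows and columns are relabelled by the same permutation (M_ij -> M_sigma(i)sigma(j)). The function names and the particular choice of polynomials are assumptions made for this example, not the specific observables used in the paper.

```python
import random

def invariants(m):
    """A few low-degree polynomials of a square matrix that are invariant
    under simultaneous permutation of its rows and columns."""
    d = len(m)
    idx = range(d)
    return {
        "sum_diag": sum(m[i][i] for i in idx),                      # linear
        "sum_all": sum(m[i][j] for i in idx for j in idx),          # linear
        "sum_diag_sq": sum(m[i][i] ** 2 for i in idx),              # quadratic
        "sum_all_sq": sum(m[i][j] ** 2 for i in idx for j in idx),  # quadratic
        "sum_ij_ji": sum(m[i][j] * m[j][i] for i in idx for j in idx),
    }

def permute(m, sigma):
    """Relabel indices: the result has entries m[sigma[i]][sigma[j]]."""
    d = len(m)
    return [[m[sigma[i]][sigma[j]] for j in range(d)] for i in range(d)]

if __name__ == "__main__":
    rng = random.Random(0)
    d = 5
    # Integer entries keep the comparison exact (no floating-point rounding).
    m = [[rng.randint(-9, 9) for _ in range(d)] for _ in range(d)]
    sigma = list(range(d))
    rng.shuffle(sigma)
    assert invariants(permute(m, sigma)) == invariants(m)
    print(invariants(m))
```

Statistics of such quantities over a collection of matrices do not depend on how the underlying basis elements are ordered, which is the property the word "invariant" refers to here.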

Related
Searchengineland.com

Microsoft rolls out portfolio bid strategies and automated integration with Google Tag Manager

Portfolio bid strategies are now available globally, Microsoft Advertising announced on Tuesday. The platform’s automated integration with Google Tag Manager (GTM) is also now generally available. Portfolio bid strategies. This feature automatically adjusts bidding across multiple campaigns to balance under- and over-performing campaigns that share the same bidding strategy...
SOFTWARE
mit.edu

A new programming language for high-performance computers

High-performance computing is needed for an ever-growing number of tasks — such as image processing or various deep learning applications on neural nets — where one must plow through immense piles of data, and do so reasonably quickly, or else it could take ridiculous amounts of time. It’s widely believed that, in carrying out operations of this sort, there are unavoidable trade-offs between speed and reliability. If speed is the top priority, according to this view, then reliability will likely suffer, and vice versa.
CODING & PROGRAMMING
Wyoming News

Automate tasks

Automation is the way of the future—and the numbers show why. According to WorkMarket’s 2020 In(Sight) Report, more than 50% of employees believe automation could save them 240 hours a year. However, it can be hard to discern which tasks should be automated and which deserve a more personal approach. While things like managing employees, working with clients, and networking may require time, automating back-end tasks such as bookkeeping and metrics may actually improve business. For example, 40% of business owners reported spending more than 80 hours annually on tax and accounting tasks, with most of them spending more than 41 hours a year. Using software to help track invoices, sending out payment reminders, and automatically paying bills can save a lot of time and may even help avoid accounting errors.
TECHNOLOGY
arxiv.org

The separating variety for 2x2 matrix invariants

Let $G$ be a linear algebraic group acting linearly on a $G$-variety $\mathcal{V}$, and let $k[\mathcal{V}]^G$ be the corresponding algebra of invariant polynomial functions. A separating set $S \subseteq k[\mathcal{V}]^G$ is a set of polynomials with the property that for all $v,w \in \mathcal{V}$, if there exists $f \in k[\mathcal{V}]^G$ separating $v$ and $w$, then there exists $f \in S$ separating $v$ and $w$.
MATHEMATICS
arxiv.org

Faster Exact Permutation Testing: Using a Representative Subgroup

Non-parametric tests based on permutation, rotation or sign-flipping are examples of so-called group-invariance tests. These tests rely on invariance of the null distribution under a set of transformations that has a group structure, in the algebraic sense. Such groups are often huge, which makes it computationally infeasible to use the entire group. Hence, it is standard practice to test using a randomly sampled set of transformations from the group. This random sample still needs to be substantial to obtain good power and replicability. We improve upon the standard practice by using a well-designed subgroup of transformations instead of a random sample. We show this can yield a more powerful and fully replicable test with the same number of transformations. For a normal location model and a particular design of the subgroup, we show that the power improvement is equivalent to the power difference between a Monte Carlo $Z$-test and Monte Carlo $t$-test. In our simulations, we find that we can obtain the same power as a test based on sampling with just half the number of transformations, or equivalently, more power for the same computation time. These benefits come entirely 'for free', as our methodology relies on an assumption of invariance under the subgroup, which is implied by invariance under the entire group.
MATHEMATICS
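
The "standard practice" that the abstract above refers to, testing with a random sample of transformations from the group, can be sketched for the sign-flipping case as follows. This is a minimal illustration, not the subgroup method proposed in the paper; the function name, the choice of the sample sum as test statistic, and the default number of flips are assumptions made for the example.

```python
import random

def sign_flip_pvalue(x, n_flips=999, seed=0):
    """One-sided Monte Carlo sign-flipping test of 'the data are symmetric
    about zero' against 'the location is positive'.

    Uses the sample sum as test statistic and n_flips randomly drawn sign
    vectors; the observed data count as one extra transformation (the
    identity), which gives the usual (1 + count) / (1 + n_flips) p-value.
    """
    rng = random.Random(seed)
    observed = sum(x)
    count = 0
    for _ in range(n_flips):
        # Flip the sign of each observation independently with probability 1/2.
        flipped = sum(v if rng.random() < 0.5 else -v for v in x)
        if flipped >= observed:
            count += 1
    return (1 + count) / (1 + n_flips)

if __name__ == "__main__":
    shifted = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]       # clearly positive location
    balanced = [1, -1, 2, -2, 3, -3]               # symmetric about zero
    print(sign_flip_pvalue(shifted))
    print(sign_flip_pvalue(balanced))
```

Because the sign vectors are drawn at random, the result depends on the seed; replacing the random draw with a fixed, designed set of transformations is the kind of change the abstract describes.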
arxiv.org

Context-Aware Discrimination Detection in Job Vacancies using Computational Language Models

Discriminatory job vacancies are disapproved of worldwide, but remain persistent. Discrimination in job vacancies can be explicit by directly referring to demographic memberships of candidates. More implicit forms of discrimination are also present that may not always be illegal but still influence the diversity of applicants. Explicit written discrimination is still present in numerous job vacancies, as was recently observed in the Netherlands. Current efforts for the detection of explicit discrimination concern the identification of job vacancies containing potentially discriminating terms such as "young" or "male". However, automatic detection is inefficient due to low precision: e.g. "we are a young company" or "working with mostly male patients" are phrases that contain explicit terms, while the context shows that these do not reflect discriminatory content.
TECHNOLOGY
wolfram.com

Tsallis Statistics, a quick computational overview

Complex microscopic dynamics, e.g., with strong correlations, usually drag the system into out-of-equilibrium states. In these cases, Boltzmann-Gibbs (BG) statistical mechanics fails and some extension of the entropy measure is needed. In 1988, Tsallis proposed nonextensive statistical mechanics based on the concept of a nonadditive entropy. In this short computational essay, we explore it in Wolfram Mathematica, using the resource function TsallisEntropy.
MATHEMATICS
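
For readers who want to see the quantity itself, the Tsallis entropy of a discrete distribution p with index q is S_q = (1 - sum_i p_i^q) / (q - 1) in units where the constant k is 1, and it reduces to the Boltzmann-Gibbs-Shannon form -sum_i p_i ln p_i as q approaches 1. The sketch below is a plain Python illustration of that formula; it is not the Wolfram resource function mentioned above, and the function name and tolerance are assumptions made for the example.

```python
import math

def tsallis_entropy(p, q):
    """Tsallis entropy S_q of a discrete probability distribution p (k = 1).

    For q == 1 the limiting Boltzmann-Gibbs-Shannon entropy is returned.
    Zero-probability outcomes are skipped, since they contribute nothing.
    """
    if abs(q - 1.0) < 1e-12:
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return (1.0 - sum(pi ** q for pi in p if pi > 0)) / (q - 1.0)

if __name__ == "__main__":
    uniform = [0.25, 0.25, 0.25, 0.25]
    for q in (0.5, 1.0, 2.0):
        print(q, tsallis_entropy(uniform, q))
```

For two independent subsystems A and B the entropy satisfies S_q(A+B) = S_q(A) + S_q(B) + (1 - q) S_q(A) S_q(B), which is the sense in which it is called nonadditive.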
arxiv.org

Sparse superposition codes under VAMP decoding with generic rotational invariant coding matrices

Sparse superposition codes were originally proposed as a capacity-achieving communication scheme over the Gaussian channel, whose coding matrices were made of i.i.d. Gaussian entries. We extend this coding scheme to more generic ensembles of rotational invariant coding matrices with arbitrary spectrum, which include the Gaussian ensemble as a special case. We further introduce and analyse a decoder based on vector approximate message-passing (VAMP). Our main findings, based on both a standard replica symmetric potential theory and state evolution analysis, are the superiority of certain structured ensembles of coding matrices (such as partial row-orthogonal) when compared to i.i.d. matrices, as well as a spectrum-independent upper bound on VAMP's threshold. Most importantly, we derive a simple "spectral criterion" for the scheme to be at the same time capacity-achieving while having the best possible algorithmic threshold, in the "large section size" asymptotic limit. Our results therefore provide practical design principles for the coding matrices in this promising communication scheme.
MATHEMATICS
arxiv.org

Universal Statistical Simulator

The Quantum Fourier Transform is a famous example in quantum computing for being the first demonstration of a useful algorithm in which a quantum computer is exponentially faster than a classical computer. However, when giving an explanation of the speed-up, the computational complexity of the classical calculation has to be taken on faith. Moreover, the explanation also comes with the caveat that the current classical calculations might be improved. In this paper we present a quantum computer code for a Galton Board Simulator that is exponentially faster than a classical calculation, using an example that can be intuitively understood without requiring an understanding of computational complexity. We demonstrate a straightforward implementation on a quantum computer, using only three types of quantum gate, which calculates $2^n$ trajectories using $\mathcal{O} (n^2)$ resources. The circuit presented here also benefits from having a lower depth than previous Quantum Galton Boards, and in addition, we show that it can be extended to a universal statistical simulator, which is achieved by removing pegs and altering the left-right ratio for each peg.
CODING & PROGRAMMING
arxiv.org

Rotationally-Invariant Circuits: Universality with the exchange interaction and two ancilla qubits

Universality of local unitary transformations is one of the cornerstones of quantum computing with many applications and implications that go beyond this field. However, it has been recently shown that this universality does not hold in the presence of continuous symmetries: generic symmetric unitaries on a composite system cannot be implemented, even approximately, using local symmetric unitaries on the subsystems [I. Marvian, Nature Physics (2022)]. In this work, we study qubit circuits formed from k-local rotationally-invariant unitaries and fully characterize the constraints imposed by locality on the realizable unitaries. We also present an interpretation of these constraints in terms of the average energy of states with a fixed angular momentum. Interestingly, despite these constraints, we show that, using a pair of ancilla qubits, any rotationally-invariant unitary can be realized with the Heisenberg exchange interaction, which is 2-local and rotationally-invariant. We also show that a single ancilla is not enough to achieve universality. Finally, we discuss applications of these results for quantum computing with semiconductor quantum dots, quantum reference frames, and resource theories.
SCIENCE
arxiv.org

Identifiability of Label Noise Transition Matrix

The noise transition matrix plays a central role in the problem of learning from noisy labels. Among many other reasons, a significant number of existing solutions rely on access to it. Estimating the transition matrix without using ground truth labels is a critical and challenging task. When label noise transition depends on each instance, the problem of identifying the instance-dependent noise transition matrix becomes substantially more challenging. Despite recent works proposing solutions for learning from instance-dependent noisy labels, we lack a unified understanding of when such a problem remains identifiable, and therefore learnable. This paper seeks to provide answers to a sequence of related questions: What are the primary factors that contribute to the identifiability of a noise transition matrix? Can we explain the observed empirical successes? When a problem is not identifiable, what can we do to make it so? We will relate our theoretical findings to the literature and hope to provide guidelines for developing effective solutions for battling instance-dependent label noise.
MATHEMATICS
arxiv.org

Computation of lattice isomorphisms and the integral matrix similarity problem

Let $K$ be a number field, let $A$ be a finite-dimensional $K$-algebra, let $\mathrm{J}(A)$ denote the Jacobson radical of $A$, and let $\Lambda$ be an $\mathcal{O}_{K}$-order in $A$. Suppose that each simple component of the semisimple $K$-algebra $A/{\mathrm{J}(A)}$ is isomorphic to a matrix ring over a field. Under this hypothesis on $A$, we give an efficient algorithm that given two $\Lambda$-lattices $X$ and $Y$, determines whether $X$ and $Y$ are isomorphic, and if so, computes an explicit isomorphism $X \rightarrow Y$. As an application, we give an efficient algorithm for the following long-standing problem: given a number field $K$, a positive integer $n$ and two matrices $A,B \in \mathrm{Mat}_{n}(\mathcal{O}_{K})$, determine whether $A$ and $B$ are similar over $\mathcal{O}_{K}$, and if so, return a matrix $C \in \mathrm{GL}_{n}(\mathcal{O}_{K})$ such that $B= CAC^{-1}$. We give explicit examples that show that their implementations for $\mathcal{O}_{K}=\mathbb{Z}$ vastly outperform implementations of all previous algorithms, as predicted by our complexity analysis.
MATHEMATICS
arxiv.org

Solving matrix nearness problems via Hamiltonian systems, matrix factorization, and optimization

In these lecture notes, we review our recent works addressing various problems of finding the nearest stable system to an unstable one. After the introduction, we provide some preliminary background, namely, defining Port-Hamiltonian systems and dissipative Hamiltonian systems and their properties, briefly discussing matrix factorizations, and describing the optimization methods that we will use in these notes. In the third chapter, we present our approach to tackle the distance to stability for standard continuous linear time invariant (LTI) systems. The main idea is to rely on the characterization of stable systems as dissipative Hamiltonian systems. We show how this idea can be generalized to compute the nearest $\Omega$-stable matrix, where the eigenvalues of the sought system matrix $A$ are required to belong to a rather general set $\Omega$. We also show how these ideas can be used to compute minimal-norm static feedbacks, that is, stabilize a system by choosing a proper input $u(t)$ that linearly depends on $x(t)$ (static-state feedback), or on $y(t)$ (static-output feedback). In the fourth chapter, we present our approach to tackle the distance to passivity. The main idea is to rely on the characterization of stable systems as port-Hamiltonian systems. We also discuss in more detail the special case of computing the nearest stable matrix pairs. In the last chapter, we focus on discrete-time LTI systems. As in the continuous case, we propose a parametrization that allows us to efficiently compute the nearest stable system (for matrices and matrix pairs), and hence the distance to stability. We show how this idea can be used in data-driven system identification, that is, given a set of input-output pairs, identify the system $A$.
COMPUTERS
arxiv.org

Robust Training of Neural Networks using Scale Invariant Architectures

In contrast to SGD, adaptive gradient methods like Adam allow robust training of modern deep networks, especially large language models. However, the use of adaptivity not only comes at the cost of extra memory but also raises the fundamental question: can non-adaptive methods like SGD enjoy similar benefits? In this paper, we provide an affirmative answer to this question by proposing to achieve both robust and memory-efficient training via the following general recipe: (1) modify the architecture and make it scale invariant, i.e. the scale of the parameters doesn't affect the output of the network, (2) train with SGD and weight decay, and optionally (3) clip the global gradient norm proportional to weight norm multiplied by $\sqrt{\tfrac{2\lambda}{\eta}}$, where $\eta$ is learning rate and $\lambda$ is weight decay. We show that this general approach is robust to rescaling of parameters and loss by proving that its convergence only depends logarithmically on the scale of initialization and loss, whereas the standard SGD might not even converge for many initializations. Following our recipe, we design a scale invariant version of BERT, called SIBERT, which when trained simply by vanilla SGD achieves performance comparable to BERT trained by adaptive methods like Adam on downstream tasks.
CODING & PROGRAMMING
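
Step (3) of the recipe in the abstract above, clipping the global gradient norm at a threshold proportional to the weight norm times $\sqrt{2\lambda/\eta}$, might look roughly like the sketch below. The proportionality constant c, the function name, and the flat-list representation of parameters are assumptions made for this illustration; the abstract does not give them, and this is not the authors' implementation.

```python
import math

def clip_gradient(grad, weights, lr, weight_decay, c=1.0):
    """Rescale grad so its global norm is at most
    c * ||weights|| * sqrt(2 * weight_decay / lr).

    c is a hypothetical proportionality constant; grad and weights are flat
    lists of floats standing in for all parameters of the network.
    """
    w_norm = math.sqrt(sum(w * w for w in weights))
    g_norm = math.sqrt(sum(g * g for g in grad))
    threshold = c * w_norm * math.sqrt(2.0 * weight_decay / lr)
    if g_norm == 0.0 or g_norm <= threshold:
        return list(grad)  # already within the allowed norm
    scale = threshold / g_norm
    return [g * scale for g in grad]

if __name__ == "__main__":
    weights = [3.0, 4.0]   # norm 5
    grad = [6.0, 8.0]      # norm 10
    # Threshold is about 1.0 here, so the gradient is scaled down by about 10x.
    print(clip_gradient(grad, weights, lr=0.5, weight_decay=0.01))
```

Tying the threshold to the weight norm means that rescaling the parameters rescales the allowed gradient norm with them, consistent with the scale-invariance theme of the abstract.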
arxiv.org

Longitudinal regression of covariance matrix outcomes

In this study, a longitudinal regression model for covariance matrix outcomes is introduced. The proposal considers a multilevel generalized linear model for regressing covariance matrices on (time-varying) predictors. This model simultaneously identifies covariate-associated components from covariance matrices, estimates regression coefficients, and estimates the within-subject variation in the covariance matrices. Optimal estimators are proposed for both low-dimensional and high-dimensional cases by maximizing the (approximated) hierarchical likelihood function and are proved to be asymptotically consistent, where the proposed estimator is the most efficient under the low-dimensional case and achieves the uniformly minimum quadratic loss among all linear combinations of the identity matrix and the sample covariance matrix under the high-dimensional case. Through extensive simulation studies, the proposed approach achieves good performance in identifying the covariate-related components and estimating the model parameters. Applied to a longitudinal resting-state fMRI dataset from the Alzheimer's Disease Neuroimaging Initiative (ADNI), the proposed approach identifies brain networks that demonstrate the difference between males and females at different disease stages. The findings are in line with existing knowledge of AD and the method improves the statistical power over the analysis of cross-sectional data.
SCIENCE
arxiv.org

Classical shadows with Pauli-invariant unitary ensembles

The classical shadow estimation protocol is a noise-resilient and sample-efficient quantum algorithm for learning the properties of quantum systems. Its performance depends on the choice of a unitary ensemble, which must be chosen by a user in advance. What is the weakest assumption that can be made on the chosen unitary ensemble that would still yield meaningful and interesting results? To address this question, we consider the class of Pauli-invariant unitary ensembles, i.e. unitary ensembles that are invariant under multiplication by a Pauli operator. This class includes many previously studied ensembles like the local and global Clifford ensembles as well as locally scrambled unitary ensembles. For this class of ensembles, we provide an explicit formula for the reconstruction map corresponding to the shadow channel and give explicit sample complexity bounds. In addition, we provide two applications of our results. Our first application is to locally scrambled unitary ensembles, where we give explicit formulas for the reconstruction map and sample complexity bounds that circumvent the need to solve an exponential-sized linear system. Our second application is to the classical shadow tomography of quantum channels with Pauli-invariant unitary ensembles. Our results pave the way for more efficient or robust protocols for predicting important properties of quantum states, such as their fidelity, entanglement entropy, and quantum Fisher information.
MATHEMATICS
arxiv.org

A Note on Holevo quantity of $SU(2)$-invariant states

The Holevo quantity and the $SU(2)$-invariant states have particular importance in quantum information processing. We calculate analytically the Holevo quantity for bipartite systems composed of spin-$j$ and spin-$\frac{1}{2}$ subsystems with $SU(2)$ symmetry, when the projective measurements are performed on the spin-$\frac{1}{2}$ subsystem. The relations among the Holevo quantity, the maximal values of the Holevo quantity and the states are analyzed in detail. In particular, we show that the Holevo quantity increases in the parameter region $F<F_d$ and decreases in the region $F>F_d$ when $j$ increases, where $F$ is a function of temperature in thermal equilibrium and $F_d=j/(2j+1)$, and the maximum value of the Holevo quantity is attained at $F=1$ for all $j$. Moreover, when the dimension of the system increases, the maximal value of the Holevo quantity decreases.
PHYSICS
arxiv.org

Deep invariant networks with differentiable augmentation layers

Designing learning systems which are invariant to certain data transformations is critical in machine learning. Practitioners can typically enforce a desired invariance on the trained model through the choice of a network architecture, e.g. using convolutions for translations, or using data augmentation. Yet, enforcing true invariance in the network can be difficult, and data invariances are not always known a priori. State-of-the-art methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems, which are complex to solve and often computationally demanding. In this work we investigate new ways of learning invariances only from the training data. Using learnable augmentation layers built directly in the network, we demonstrate that our method is very versatile. It can incorporate any type of differentiable augmentation and be applied to a broad class of learning problems beyond computer vision. We provide empirical evidence showing that our approach is easier and faster to train than modern automatic data augmentation techniques based on bilevel optimization, while achieving comparable results. Experiments show that while the invariances transferred to a model through automatic data augmentation are limited by the model expressivity, the invariance yielded by our approach is insensitive to it by design.
CODING & PROGRAMMING
arxiv.org

Moduli of relative stable maps to $\mathbb{P}^1$: cut-and-paste invariants

We study constructible invariants of the moduli space $\overline{\mathcal{M}}(\boldsymbol{x})$ of stable maps from genus zero curves to $\mathbb{P}^1$, relative to $0$ and $\infty$, with ramification profiles specified by ${\boldsymbol{x}\in \mathbb{Z}^n}$. These spaces are central to the enumerative geometry of $\mathbb{P}^1$, and provide a large family of birational models of the Deligne--Mumford--Knudsen moduli space $\overline{\mathcal{M}}_{0,n}$. For the sequence of vectors $\boldsymbol{x}$ corresponding to maps which are maximally ramified over $0$ and unramified over $\infty$, we prove that a generating function for the topological Euler characteristics of these spaces satisfies a differential equation which allows for its recursive calculation. We also show that the class of the moduli space in the Grothendieck ring of varieties is constant as $\boldsymbol{x}$ varies within a fixed chamber in the resonance decomposition of $\mathbb{Z}^n$. We conclude by suggesting several further directions in the study of these spaces, giving conjectures on (1) the asymptotic behavior of the Euler characteristic and (2) a potential chamber structure for the Chern numbers.
MATHEMATICS
arxiv.org

Infinite-horizon risk-sensitive performance criteria for translation invariant networks of linear quantum stochastic systems

This paper is concerned with networks of identical linear quantum stochastic systems which interact with each other and external bosonic fields in a translation invariant fashion. The systems are associated with sites of a multidimensional lattice and are governed by coupled linear quantum stochastic differential equations (QSDEs). The block Toeplitz coefficients of these QSDEs are specified by the energy and coupling matrices which quantify the Hamiltonian and coupling operators for the component systems. We discuss the invariant Gaussian quantum state of the network when it satisfies a stability condition and is driven by statistically independent vacuum fields. A quadratic-exponential functional (QEF) is considered as a risk-sensitive performance criterion for a finite fragment of the network over a bounded time interval. This functional involves a quadratic function of dynamic variables of the component systems with a block Toeplitz weighting matrix. Assuming the invariant state, we study the spatio-temporal asymptotic rate of the QEF per unit time and per lattice site in the thermodynamic limit of unboundedly growing time horizons and fragments of the lattice. A spatio-temporal frequency-domain formula is obtained for the QEF rate in terms of two spectral functions associated with the real and imaginary parts of the invariant quantum covariance kernel of the network variables. A homotopy method and asymptotic expansions for evaluating the QEF rate are also discussed.
SCIENCE