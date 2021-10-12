Tensor decompositions and algorithms, with applications to tensor learning
By Felipe Bottega Diniz
A new algorithm for the canonical polyadic decomposition (CPD) is presented here. It features lower computational complexity and memory usage than the available state-of-the-art implementations. We begin with some examples of CPD applications to real-world problems. A short summary of the main contributions in this work
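As background for the CPD mentioned above, the following is a minimal sketch of what a rank-R decomposition of a three-way tensor represents: every entry is a sum of R products of factor-matrix entries. It is not the algorithm announced in the abstract; the function names `cpd_reconstruct` and `storage_counts` and the example factors are hypothetical and chosen only for illustration.

```python
def cpd_reconstruct(A, B, C):
    """Rebuild T[i][j][k] = sum_r A[i][r] * B[j][r] * C[k][r] from CPD factors."""
    rank = len(A[0])
    return [[[sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]


def storage_counts(shape, rank):
    """Return (dense entries, factor entries) for a tensor of the given shape."""
    dense = 1
    for n in shape:
        dense *= n
    factors = rank * sum(shape)
    return dense, factors


# Hypothetical rank-2 factors of a 2 x 3 x 2 tensor (illustrative values only).
A = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]]
C = [[1.0, 1.0], [2.0, 0.0]]
T = cpd_reconstruct(A, B, C)

# Storage comparison for a larger, hypothetical 100 x 100 x 100 tensor of rank 5.
dense, factors = storage_counts((100, 100, 100), 5)
```

The storage comparison illustrates why low-rank factors are attractive when the rank is small relative to the dimensions: the dense tensor has a number of entries equal to the product of its dimensions, while the factors need only the rank times their sum.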
Raw data is impervious to cognitive bias, devoid of human emotion and predisposition. Guided by this precept, Billy Beane’s 2001 Oakland A’s designed a new statistical blueprint for championship-aspiring ballclubs, and precipitated a data analytics revolution that spread like wildfire throughout all facets of professional sports. As Beane discovered, baseball’s wealth of data makes it conducive to predictive analytics. The problem I have chosen to explore is employing machine learning to predict outcomes of individual games. My final logistic regression and random forest models achieved test accuracies among the higher levels found in existing scientific literature, and outperformed the Vegas betting odds on a two-year test period. My results confirm that the betting odds are highly efficient; however, my findings also suggest that machine learning methods may provide an incremental informational edge over the wisdom of the masses, which could translate to meaningful insights in the long run. My model also shed light on the importance of team pitching, namely that the strength of a team’s bullpen is much more indicative of its ability to win than the quality of its offense. Overall, the underlying methodologies and insights drawn from my algorithm may be useful to the strategic decision making of Major League Baseball front offices, or various other sports analytics entities.
In attempts to understand the very nature of our reality, physicists sure have some mind-bending theories. Like what if information is a tangible and fundamental aspect of physical reality itself – alongside matter and energy? Or, alternatively, what if information is the fifth state of matter?
Information is, after all, something all matter and energy measurably possess. The rules that govern their existence, like their mass, speed, or charge, are all bits of information they contain.
So to allow experimental probing of such ideas, physicist Melvin Vopson from the University of Portsmouth in the UK estimated how much information a single elementary...
This is one of multiple articles that will be covering algorithms in detail. Developers struggle with these and I want to simplify them as much as possible, from basic to complex. Whether you want to land a high-paying job as a software developer or want to land those exclusive gold...
Differentiable programming is a new programming paradigm that enables large-scale optimization through the automatic calculation of gradients, also known as auto-differentiation. This concept emerged from deep learning and has also been generalized to tensor network optimization. Here, we extend differentiable programming to tensor networks with isometric constraints, with applications to the multiscale entanglement renormalization ansatz (MERA) and tensor network renormalization (TNR). By introducing several gradient-based optimization methods for isometric tensor networks and comparing them with the Evenbly-Vidal method, we show that auto-differentiation performs better in both stability and accuracy. We numerically tested our methods on the 1D critical quantum Ising spin chain and the 2D classical Ising model. We calculate the ground state energy for the 1D quantum model, the internal energy for the classical model, and the scaling dimensions of scaling operators, and find that they all agree well with theory.
A deep dive into the math details, with illustrations. When designing neural networks, we are often faced with the need of tensor reshaping. The spatial shape of a tensor has to be altered with a certain layer to be able to fit the downstream layers. Like the special wedge-shaped lego blocks with differently shaped top and bottom surfaces, we also need some adaptor blocks in neural network.
Hussam Al Daas, Grey Ballard, Paul Cazeaux, Eric Hallman, Agnieszka Miedlar, Mirjeta Pasha, Tim W. Reid, Arvind K. Saibaba. The Tensor-Train (TT) format is a highly compact low-rank representation for high-dimensional tensors. TT is particularly useful when representing approximations to the solutions of certain types of parametrized partial differential equations. For many of these problems, computing the solution explicitly would require an infeasible amount of memory and computational time. While the TT format makes these problems tractable, iterative techniques for solving the PDEs must be adapted to perform arithmetic while maintaining the implicit structure. The fundamental operation used to maintain feasible memory and computational time is called rounding, which truncates the internal ranks of a tensor already in TT format. We propose several randomized algorithms for this task that are generalizations of randomized low-rank matrix approximation algorithms and provide significant reduction in computation compared to deterministic TT-rounding algorithms. Randomization is particularly effective in the case of rounding a sum of TT-tensors (where we observe 20x speedup), which is the bottleneck computation in the adaptation of GMRES to vectors in TT format. We present the randomized algorithms and compare their empirical accuracy and computational time with deterministic alternatives.
The tensor renormalization group method is a promising approach to lattice field theories, which is free from the sign problem unlike standard Monte Carlo methods. One of the remaining issues is the application to gauge theories, which is so far limited to U(1) and SU(2) gauge groups. In the case of higher rank, it becomes highly nontrivial to restrict the number of representations in the character expansion to be used in constructing the fundamental tensor. We propose a practical strategy to accomplish this and demonstrate it in 2D U($N$) and SU($N$) gauge theories, which are exactly solvable. Using this strategy, we obtain the singular-value spectrum of the fundamental tensor, which turns out to have a definite profile in the large-$N$ limit. For the U($N$) case, in particular, we show that the large-$N$ behavior of the singular-value spectrum changes qualitatively at the critical coupling of the Gross-Witten-Wadia phase transition. As an interesting consequence, we find a new type of volume independence in the large-$N$ limit of the 2D U($N$) gauge theory with the $\theta$ term in the strong coupling phase, which goes beyond the Eguchi-Kawai reduction.
Proteins are the molecular machines of all living cells and have been exploited for use in many applications, including therapeutics and industrial catalysts. To overcome the limitations of naturally occurring proteins, protein engineering is used to improve protein characteristics such as stability and functionality. In a new study, researchers demonstrate a machine learning algorithm that accelerates the protein engineering process. The study is reported in the journal Nature Communications.
The early detection of terrorist threat objects, such as guns and knives, through improved metal detection, has the potential to reduce the number of attacks and improve public safety and security. To achieve this, there is considerable potential to use the fields applied and measured by a metal detector to discriminate between different shapes and different metals since, hidden within the field perturbation, is object characterisation information. The magnetic polarizability tensor (MPT) offers an economical characterisation of metallic objects and its spectral signature provides additional object characterisation information. The MPT spectral signature can be determined from measurements of the induced voltage over a range of frequencies in a metal signature for a hidden object. With classification in mind, it can also be computed in advance for different threat and non-threat objects. In the article, we evaluate the performance of probabilistic and non-probabilistic machine learning algorithms, trained using a dictionary of computed MPT spectral signatures, to classify objects for metal detection. We discuss the importance of using appropriate features and selecting an appropriate algorithm depending on the classification problem being solved and we present numerical results for a range of practically motivated metal detection classification problems.
Tensor Network States (TNS) offer an efficient representation for the ground state of quantum many-body systems and play an important role in their simulation. Numerous TNS have been proposed in the past few decades. However, due to the high cost of TNS for two-dimensional systems, a balance between the encoded entanglement and computational complexity of TNS is yet to be reached. In this work we introduce a new Tree Tensor Network (TTN) based TNS, dubbed the Fully-Augmented Tree Tensor Network (FATTN), by releasing the constraint in the Augmented Tree Tensor Network (ATTN). When disentanglers are augmented in the physical layer of TTN, FATTN can provide more entanglement than TTN and ATTN. At the same time, FATTN maintains the scaling of computational cost with bond dimension in TTN and ATTN. Benchmark results on the ground state energy for the transverse Ising model are provided to demonstrate the improvement in accuracy of FATTN over TTN and ATTN. Moreover, FATTN is quite flexible: it can be constructed as an interpolation between the Tree Tensor Network and the Multiscale Entanglement Renormalization Ansatz (MERA) to reach a balance between the encoded entanglement and the computational cost.
The matrix normal model, the family of Gaussian matrix-variate distributions whose covariance matrix is the Kronecker product of two lower dimensional factors, is frequently used to model matrix-variate data. The tensor normal model generalizes this family to Kronecker products of three or more factors. We study the estimation of the Kronecker factors of the covariance matrix in the matrix and tensor models. We show nonasymptotic bounds for the error achieved by the maximum likelihood estimator (MLE) in several natural metrics. In contrast to existing bounds, our results do not rely on the factors being well-conditioned or sparse. For the matrix normal model, all our bounds are minimax optimal up to logarithmic factors, and for the tensor normal model our bound for the largest factor and overall covariance matrix are minimax optimal up to constant factors provided there are enough samples for any estimator to obtain constant Frobenius error. In the same regimes as our sample complexity bounds, we show that an iterative procedure to compute the MLE known as the flip-flop algorithm converges linearly with high probability. Our main tool is geodesic strong convexity in the geometry on positive-definite matrices induced by the Fisher information metric. This strong convexity is determined by the expansion of certain random quantum channels. We also provide numerical evidence that combining the flip-flop algorithm with a simple shrinkage estimator can improve performance in the undersampled regime.
The integration of algorithmic components into neural architectures has gained increased attention recently, as it allows training neural networks with new forms of supervision such as ordering constraints or silhouettes instead of using ground truth labels. Many approaches in the field focus on the continuous relaxation of a specific task and show promising results in this context. But the focus on single tasks also limits the applicability of the proposed concepts to a narrow range of applications. In this work, we build on those ideas to propose an approach that allows algorithms to be integrated into end-to-end trainable neural network architectures based on a general approximation of discrete conditions. To this end, we relax these conditions in control structures such as conditional statements, loops, and indexing, so that the resulting algorithms are smoothly differentiable. To obtain meaningful gradients, each relevant variable is perturbed via logistic distributions and the expectation value under this perturbation is approximated. We evaluate the proposed continuous relaxation model on four challenging tasks and show that it can keep up with relaxations specifically designed for each individual task.
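To make the relaxation idea concrete, here is a minimal sketch for the simplest possible case: a single `if x > threshold` with constant branch values. If logistic noise with a given scale is added to `x`, the probability that the condition holds is a sigmoid of `(x - threshold) / scale`, so the expected result is a sigmoid-weighted blend of the two branches. The names `soft_if_greater` and `scale` are hypothetical, and the sketch does not reproduce the paper's treatment of loops, indexing, or nested conditions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_if_greater(x, threshold, then_value, else_value, scale=1.0):
    """Expected value of `then_value if x + eps > threshold else else_value`,
    where eps follows a logistic distribution with the given scale (assumed)."""
    p_true = sigmoid((x - threshold) / scale)
    return p_true * then_value + (1.0 - p_true) * else_value


# At the threshold the two branches are blended equally.
at_threshold = soft_if_greater(0.0, 0.0, 1.0, 0.0)
# Far from the threshold the relaxed result approaches the hard branch.
far_above = soft_if_greater(50.0, 0.0, 1.0, 0.0)
far_below = soft_if_greater(-50.0, 0.0, 1.0, 0.0)
```

Because the blend varies smoothly with `x`, the output has a non-zero derivative with respect to `x` near the threshold, which a hard `if` lacks. A smaller `scale` brings the relaxed result closer to the hard conditional.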
Hybrid quantum-classical algorithms based on variational circuits are a promising approach to quantum machine learning problems for near-term devices, but the selection of the variational ansatz is an open issue. Recently, tensor network-inspired circuits have been proposed as a natural choice for such ansatz. Their employment on binary classification tasks provided encouraging results. However, their effectiveness on more difficult tasks is still unknown. Here, we present numerical experiments on multi-class classifiers based on tree tensor network and multiscale entanglement renormalization ansatz circuits. We conducted experiments on image classification with the MNIST dataset and on quantum phase recognition with the XXZ model by Cirq and TensorFlow Quantum. In the former case, we reduced the number of classes to four to match the aimed output based on 2 qubits. The quantum data of the XXZ model consist of three classes of ground states prepared by a checkerboard circuit used for the ansatz of the variational quantum eigensolver, corresponding to three distinct quantum phases. Test accuracy turned out to be 59%-93% and 82%-96% respectively, depending on the model architecture and on the type of preprocessing.
We evaluate the Hadamard function and the vacuum expectation values (VEVs) of the field squared and energy-momentum tensor for a massless conformally coupled scalar field in $(D+1)$-dimensional de Sitter (dS) spacetime foliated by spatial sections of negative constant curvature. It is assumed that the field is prepared in the hyperbolic vacuum state. An integral representation for the difference of the Hadamard functions corresponding to the hyperbolic and Bunch-Davies vacua is provided that is well adapted for the evaluation of the expectation values in the coincidence limit. It is shown that the Bunch-Davies state is interpreted as thermal with respect to the hyperbolic vacuum. An expression for the corresponding density of states is provided. The relations obtained for the difference in the VEVs for the Bunch-Davies and hyperbolic vacua are compared with the corresponding relations for the Fulling-Rindler and Minkowski vacua in flat spacetime. The similarity between those relations is explained by the conformal connection of dS spacetime with hyperbolic foliation and Rindler spacetime. As a limiting case, the VEVs for the conformal vacuum in the Milne universe are discussed.
We prove a general result that relates certain pushouts of $E_k$-algebras to relative tensors over $E_{k+1}$-algebras. Specializations include a number of established results on classifying spaces, resolutions of modules, and (co)homology theories for ring spectra. The main results apply when the category in question has centralizers. Among our applications, we...
Out with the old, and in with the new. More specifically, CatBoost [2] may be replacing XGBoost for many data scientists and ML engineers moving forward. Not only is this a great algorithm for data science competitions, but it is also very beneficial for professional data scientists and ML engineers for a variety of reasons. Oftentimes, complex machine learning algorithms can take what seems like forever to train, and then lack critical plotting tools that can help to explain features as well as the model training itself. Perhaps the biggest benefit of CatBoost is in its name, which we will expound upon below. With that being said, let’s take a deeper dive into three of the main benefits of CatBoost.
Remote sensing is the image acquisition of a target without having physical contact with it. Nowadays remote sensing data is widely preferred due to its reduced image acquisition period. The remote sensing of ground targets is more challenging because of the various factors that affect the propagation of light through different mediums from a satellite acquisition. Several Convolutional Neural Network-based algorithms are being implemented in the field of remote sensing. Supervised learning is a machine learning technique where the data is labelled according to their classes prior to the training. In order to detect and classify the targets more accurately, YOLOv3, an algorithm based on bounding and anchor boxes, is adopted. In order to handle the various effects of light travelling through the atmosphere, a grayscale-based YOLOv3 configuration is introduced. For better prediction and for solving the Rayleigh scattering effect, RGB-based grayscale algorithms are proposed. The acquired images are analysed and trained with the grayscale-based YOLOv3 algorithm for target detection. The results show that the grayscale-based method can sense the target more accurately and effectively than the traditional YOLOv3 approach.
Asymptotic Safety provides an elegant mechanism for obtaining a consistent high-energy completion of gravity and gravity-matter systems. Following the initial idea by Steven Weinberg, the construction builds on an interacting fixed point of the theory's renormalization group (RG) flow. In this work we use the Wetterich equation for the effective average action to investigate the RG flow of gravity supplemented by a real scalar field. We give a non-perturbative proof that the subspace of interactions respecting the global shift-symmetry of the scalar kinetic term is closed under RG transformations. Subsequently, we compute the beta functions in an approximation comprising the Einstein-Hilbert action supplemented by the shift-symmetric quartic scalar self-interaction and the two lowest order shift-symmetric interactions coupling scalar-bilinears to the spacetime curvature. The computation utilizes the background field method with an arbitrary background, demonstrating that the results are manifestly background independent. Our beta functions exhibit an interacting fixed point suitable for Asymptotic Safety, where all matter interactions are non-vanishing. The presence of this fixed point is rooted in the interplay of the matter couplings which our work tracks for the first time. The relation of our findings with previous results in the literature is discussed in detail and we conclude with a brief outlook on potential phenomenological applications.
We generalize the Quantum Geometric Tensor by replacing a Hamiltonian with a modular Hamiltonian. The symmetric part of the Quantum Geometric Tensor provides a Fubini-Study metric, and its anti-symmetric sector gives a Berry curvature. Now the generalization or Quantum Modular Geometric Tensor gives a Kinematic Space and a modular Berry curvature. Here we demonstrate the emergence by focusing on a spherical entangling surface. We also use the result of the identity Virasoro block to relate the connected correlator of two Wilson lines to the two-point function of a modular Hamiltonian. This result realizes a novel holographic entanglement formula for two intervals of a general separation. This formula does not only hold for a classical gravity sector but also Quantum Gravity. The formula also provides a new Quantum Information interpretation to the connected correlators of Wilson lines as the mutual information. Our study provides an opportunity to explore Quantum Kinematic Space through Quantum Modular Geometric Tensor and hence go beyond symmetry.
Recovering color images and videos from highly undersampled data is a fundamental and challenging task in face recognition and computer vision. Motivated by the multi-dimensional nature of color images and videos, in this paper we propose a novel tensor completion approach, which is able to efficiently explore the sparsity of tensor data under the discrete cosine transform (DCT). Specifically, we introduce two DCT-based tensor completion models as well as two implementable algorithms for their solutions. The first one is a DCT-based weighted nuclear norm minimization model. The second one is called the DCT-based $p$-shrinking tensor completion model, which is a nonconvex model utilizing the $p$-shrinkage mapping for promoting the low-rankness of data. Moreover, we accordingly propose two implementable augmented Lagrangian-based algorithms for solving the underlying optimization models. A series of numerical experiments including color and MRI image inpainting and video data recovery demonstrate that our proposed approach performs better than many existing state-of-the-art tensor completion methods, especially for the case when the ratio of missing data is high.
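As a rough illustration of the $p$-shrinkage mapping mentioned above, the sketch below implements one commonly cited scalar form, usually attributed to Chartrand: $\mathrm{sign}(x)\max(|x| - \lambda^{2-p}|x|^{p-1}, 0)$. Whether the paper uses exactly this form, and how it applies the mapping within its DCT-based model, is an assumption here; the function name `p_shrink` is hypothetical.

```python
import math


def p_shrink(x, lam, p):
    """Scalar p-shrinkage (assumed form): sign(x) * max(|x| - lam**(2-p) * |x|**(p-1), 0).

    For p = 1 this reduces to ordinary soft-thresholding with threshold lam.
    """
    mag = abs(x)
    if mag == 0.0:
        # Avoid 0 ** (negative exponent); zero is mapped to zero.
        return 0.0
    shrunk = max(mag - lam ** (2.0 - p) * mag ** (p - 1.0), 0.0)
    return math.copysign(shrunk, x)


# p = 1: soft-thresholding, every large value is reduced by lam.
soft = p_shrink(3.0, 1.0, 1.0)
# p < 1: large values are penalized less than under soft-thresholding.
less_biased = p_shrink(4.0, 1.0, 0.5)
```

In this form, values with magnitude below the threshold are set to zero for any admissible `p`, while for `p < 1` large magnitudes are shrunk by less than `lam`. In low-rank recovery such mappings are typically applied to singular values rather than to raw entries.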
