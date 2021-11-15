
Spectral Transform Forms Scalable Transformer

By Bingxin Zhou, Xinliang Liu, Yuehua Liu, Yunying Huang, Pietro Liò, YuGuang Wang
 5 days ago

Many real-world relational systems, such as social networks and biological systems, contain dynamic interactions. When learning dynamic graph representation, it is essential to employ sequential temporal information and geometric structure. Mainstream work achieves topological embedding via message passing networks (e.g., GCN, GAT). The temporal evolution, on the other hand, is conventionally...

The Fourier Transform

In 1822, Joseph Fourier published his book The Analytic Theory of Heat in which he showed many signals could be decomposed into sums of sines and cosines. Two hundred years later, Fourier’s ideas are used in many places in modern society, from cryptography, to radios and x-ray machines. As mentioned in the surprisingly entertaining book The World According to Wavelets, “The Fourier Transform is a mathematical prism, breaking up functions into the frequencies that compose it”[1]. The Fourier Transform expresses a signal x(t) as the superposition of sines and cosines, or more compactly in terms of complex exponentials. To obtain the transform, expressed as X(f), the signal is correlated against exponentials of all frequencies. It is a process analogous to tuning a radio where certain frequencies may be more dominant. No information is lost or gained in the conversion, however.
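The correlation step described above can be illustrated with a discrete analogue. The following is a minimal sketch, assuming a sampled signal of N points rather than the continuous x(t); the function names `dft` and `idft` and the 3-cycle test signal are illustrative choices, not taken from the article. It correlates the signal against a complex exponential at each frequency, then inverts the result to show that no information is lost.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform: correlate the signal against
    a complex exponential at each of the N discrete frequencies."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse transform: rebuild the signal as a superposition of exponentials."""
    n = len(spectrum)
    return [
        sum(X * cmath.exp(2j * math.pi * k * t / n) for k, X in enumerate(spectrum)) / n
        for t in range(n)
    ]


# Illustrative signal: a sine wave completing 3 cycles over 32 samples.
N = 32
signal = [math.sin(2 * math.pi * 3 * t / N) for t in range(N)]

spectrum = dft(signal)
magnitudes = [abs(X) for X in spectrum]

# The strongest frequency bin in the lower half of the spectrum.
dominant = max(range(N // 2), key=lambda k: magnitudes[k])

# Round trip: the inverse transform recovers the original samples.
recovered = idft(spectrum)
max_error = max(abs(a - b.real) for a, b in zip(signal, recovered))
```

This direct form costs on the order of N squared operations; practical code would normally use a fast Fourier transform routine, which computes the same quantity more efficiently.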
Navigating a Digital Transformation Project

In digital transformation, the stakes are high and the pace is intense. A key characteristic of transformation is the magnitude of change in terms of both depth and breadth. It is a journey that is designed to achieve high impact at two levels – individual and organizational. As the head...
Riesz transform associated with the fractional Fourier transform and applications

Since Zayed (1998) introduced the fractional Hilbert transform related to the fractional Fourier transform, this transform has attracted wide attention and been applied in the field of signal processing. Recently, Chen and the first, second and fourth authors (Chen et al., 2021) attributed it to the operator corresponding to a fractional multiplier, but only in the 1-dimensional case. This paper naturally considers the high-dimensional situation. We introduce the fractional Riesz transform associated with the fractional Fourier transform, in which the chirp function is the key factor and the technical barrier to be overcome. Furthermore, after equipping with chirp functions, we introduce and investigate the boundedness of singular integral operators, the dual properties of Hardy spaces and BMO spaces, as well as the applications of the theory of fractional multipliers in partial differential equations, which completely match some classical results. Through numerical simulation, we give the physical and geometric interpretation of the high-dimensional fractional multiplier theorem. Finally, we present the application of the fractional Riesz transform in edge detection, which verifies the prediction proposed in (Xu et al., 2016). Moreover, the application presented in this paper can also be considered as the high-dimensional case of the application of the continuous fractional Hilbert transform in edge detection in (Pei and Yeh, 2000).
Hybrid transforms of constructible functions

We introduce a general definition of hybrid transforms for constructible functions. These are integral transforms combining Lebesgue integration and Euler calculus. Lebesgue integration gives access to well-studied kernels and to regularity results, while Euler calculus conveys topological information and allows for compatibility with operations on constructible functions. We conduct a systematic study of such transforms and introduce two new ones: the Euler-Fourier and Euler-Laplace transforms. We show that the first has a left inverse and that the second provides a satisfactory generalization of Govc and Hepworth's persistent magnitude to constructible sheaves, in particular to multi-parameter persistent modules. Finally, we prove index-theoretic formulae expressing a wide class of hybrid transforms as generalized Euler integral transforms. This yields expectation formulae for transforms of constructible functions associated to (sub)level-sets persistence of random Gaussian filtrations.
Theme Transformer: Symbolic Music Generation with Theme-Conditioned Transformer

Attention-based Transformer models have been increasingly employed for automatic music generation. To condition the generation process of such a model with a user-specified sequence, a popular approach is to take that conditioning sequence as a priming sequence and ask a Transformer decoder to generate a continuation. However, this prompt-based conditioning cannot guarantee that the conditioning sequence would develop or even simply repeat itself in the generated continuation. In this paper, we propose an alternative conditioning approach, called theme-based conditioning, that explicitly trains the Transformer to treat the conditioning sequence as a thematic material that has to manifest itself multiple times in its generation result. This is achieved with two main technical contributions. First, we propose a deep learning-based approach that uses contrastive representation learning and clustering to automatically retrieve thematic materials from music pieces in the training data. Second, we propose a novel gated parallel attention module to be used in a sequence-to-sequence (seq2seq) encoder/decoder architecture to more effectively account for a given conditioning thematic material in the generation process of the Transformer decoder. We report on objective and subjective evaluations of variants of the proposed Theme Transformer and the conventional prompt-based baseline, showing that our best model can generate, to some extent, polyphonic pop piano music with repetition and plausible variations of a given condition.
Sliced Recursive Transformer

We present a neat yet effective recursive operation on vision transformers that can improve parameter utilization without involving additional parameters. This is achieved by sharing weights across the depth of transformer networks. The proposed method can obtain a substantial gain (~2%) simply using a naïve recursive operation, requires no special or sophisticated knowledge of network design principles, and introduces minimal computational overhead to the training procedure. To reduce the additional computation caused by the recursive operation while maintaining the superior accuracy, we propose an approximating method through multiple sliced group self-attentions across recursive layers, which can reduce the computational cost by 10~30% with minimal performance loss. We call our model Sliced Recursive Transformer (SReT), which is compatible with a broad range of other designs for efficient vision transformers. Our best model establishes significant improvement on ImageNet over state-of-the-art methods while containing fewer parameters. The proposed sliced recursive operation allows us to build a transformer with more than 100 or even 1000 layers effortlessly while keeping the model small (13~15M), avoiding the optimization difficulties that arise when the model size is too large. The flexible scalability has shown great potential for scaling up and constructing extremely deep and large dimensionality vision transformers. Our code and models are available at this https URL.
Graph Relation Transformer: Incorporating pairwise object features into the Transformer architecture

Previous studies such as VizWiz find that Visual Question Answering (VQA) systems that can read and reason about text in images are useful in application areas such as assisting visually-impaired people. TextVQA is a VQA dataset geared towards this problem, where the questions require answering systems to read and reason about visual objects and text objects in images. One key challenge in TextVQA is the design of a system that effectively reasons not only about visual and text objects individually, but also about the spatial relationships between these objects. This motivates the use of 'edge features', that is, information about the relationship between each pair of objects. Some current TextVQA models address this problem but either only use categories of relations (rather than edge feature vectors) or do not use edge features within the Transformer architectures. In order to overcome these shortcomings, we propose a Graph Relation Transformer (GRT), which uses edge information in addition to node information for graph attention computation in the Transformer. We find that, without using any other optimizations, the proposed GRT method outperforms the accuracy of the M4C baseline model by 0.65% on the val set and 0.57% on the test set. Qualitatively, we observe that the GRT has superior spatial reasoning ability to M4C.
A Survey of Visual Transformers

Yang Liu, Yao Zhang, Yixin Wang, Feng Hou, Jin Yuan, Jiang Tian, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He. Transformer, an attention-based encoder-decoder architecture, has revolutionized the field of natural language processing. Inspired by this significant achievement, some pioneering works have recently been done on adapting Transformer-like architectures to Computer Vision (CV) fields, which have demonstrated their effectiveness on various CV tasks. Relying on competitive modeling capability, visual Transformers have achieved impressive performance on multiple benchmarks such as ImageNet, COCO, and ADE20k as compared with modern Convolutional Neural Networks (CNNs). In this paper, we have provided a comprehensive review of over one hundred different visual Transformers for three fundamental CV tasks (classification, detection, and segmentation), where a taxonomy is proposed to organize these methods according to their motivations, structures, and usage scenarios. Because of the differences in training settings and oriented tasks, we have also evaluated these methods on different configurations for easy and intuitive comparison instead of only various benchmarks. Furthermore, we have revealed a series of essential but unexploited aspects that may empower Transformer to stand out from numerous architectures, e.g., slack high-level semantic embeddings to bridge the gap between visual and sequential Transformers. Finally, three promising future research directions are suggested for further investment.
Fusion research using Azure A100 HPC instances

Fusion simulations have in the past required the use of leadership scale HPC resources to produce advances in physics. One such package is CGYRO, a premier multi-scale plasma turbulence simulation code. CGYRO is a typical HPC application that would not fit into a single node, as it requires O(100 GB) of memory and O(100 TFLOPS) worth of compute for relevant simulations. When distributed across multiple nodes, CGYRO requires high-throughput and low-latency networking to effectively use the compute resources. While in the past such compute may have required hundreds, or even thousands of nodes, recent advances in hardware capabilities allow for just a couple of nodes to deliver the necessary compute power. This paper presents our experience running CGYRO on NVIDIA A100 GPUs on InfiniBand-connected HPC resources in the Microsoft Azure Cloud. A comparison to older generation CPU and GPU Azure resources as well as on-prem resources is also provided.
Transformer-based Image Compression

A Transformer-based Image Compression (TIC) approach is developed which reuses the canonical variational autoencoder (VAE) architecture with paired main and hyper encoder-decoders. Both main and hyper encoders are composed of a sequence of neural transformation units (NTUs) to analyse and aggregate important information for a more compact representation of the input image, while the decoders mirror the encoder-side operations to generate a pixel-domain image reconstruction from the compressed bitstream. Each NTU consists of a Swin Transformer Block (STB) and a convolutional layer (Conv) to best embed both long-range and short-range information; meanwhile, a causal attention module (CAM) is devised for adaptive context modeling of latent features to utilize both hyper and autoregressive priors. The TIC rivals state-of-the-art approaches, including deep convolutional neural network (CNN) based learnt image coding (LIC) methods and the handcrafted rules-based intra profile of the recently approved Versatile Video Coding (VVC) standard, and requires far fewer model parameters, e.g., up to a 45% reduction relative to the leading-performance LIC.
C-OPH: Improving the Accuracy of One Permutation Hashing (OPH) with Circulant Permutations

Minwise hashing (MinHash) is a classical method for efficiently estimating the Jaccard similarity in massive binary (0/1) data. To generate $K$ hash values for each data vector, the standard theory of MinHash requires $K$ independent permutations. Interestingly, the recent work on "circulant MinHash" (C-MinHash) has shown that merely two permutations are needed. The first permutation breaks the structure of the data and the second permutation is re-used $K$ times in a circulant manner. Surprisingly, the estimation variance of C-MinHash is proved to be strictly smaller than that of the original MinHash. The more recent work further demonstrates that practically only one permutation is needed. Note that C-MinHash is different from the well-known work on "One Permutation Hashing (OPH)" published in NIPS'12. OPH and its variants using different "densification" schemes are popular alternatives to the standard MinHash. The densification step is necessary in order to deal with empty bins which exist in One Permutation Hashing.
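The two schemes contrasted above can be sketched side by side. This is a rough illustration only, assuming sets of integer indices in a universe of size D; the function names, the shift convention used for the circulant re-use of the second permutation, and the toy sets are all assumptions made here, and the C-MinHash and C-OPH papers should be consulted for the exact definitions.

```python
import random


def random_permutation(d, rng):
    """Return a random permutation of 0..d-1 as a list."""
    p = list(range(d))
    rng.shuffle(p)
    return p


def minhash_signature(items, perms):
    """Standard MinHash: one independent permutation per hash value."""
    return [min(p[i] for i in items) for p in perms]


def c_minhash_signature(items, sigma, pi, k):
    """Circulant sketch: sigma is applied once to break data structure,
    then pi is re-used k times, shifted one more position per hash value."""
    d = len(pi)
    moved = [sigma[i] for i in items]
    return [min(pi[(j - shift) % d] for j in moved) for shift in range(1, k + 1)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of hash positions on which the two signatures collide."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


rng = random.Random(0)
D, K = 200, 64
a = set(range(0, 60))
b = set(range(30, 90))          # overlaps a on 30 of 90 total elements
true_j = len(a & b) / len(a | b)

# K independent permutations for the standard scheme, two for the circulant one.
perms = [random_permutation(D, rng) for _ in range(K)]
sigma, pi = random_permutation(D, rng), random_permutation(D, rng)

est_std = estimate_jaccard(minhash_signature(a, perms), minhash_signature(b, perms))
est_circ = estimate_jaccard(
    c_minhash_signature(a, sigma, pi, K), c_minhash_signature(b, sigma, pi, K)
)
```

Both estimators compare `est_std` and `est_circ` against `true_j`; the circulant version stores two permutations instead of K.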
Recurrent Variational Network: A Deep Learning Inverse Problem Solver applied to the task of Accelerated MRI Reconstruction

Magnetic Resonance Imaging can produce detailed images of the anatomy and physiology of the human body that can assist doctors in diagnosing and treating pathologies such as tumours. However, MRI suffers from very long acquisition times that make it susceptible to patient motion artifacts and limit its potential to deliver dynamic treatments. Conventional approaches such as Parallel Imaging and Compressed Sensing allow for an increase in MRI acquisition speed by reconstructing MR images by acquiring less MRI data using multiple receiver coils. Recent advancements in Deep Learning combined with Parallel Imaging and Compressed Sensing techniques have the potential to produce high-fidelity reconstructions from highly accelerated MRI data. In this work we present a novel Deep Learning-based Inverse Problem solver applied to the task of accelerated MRI reconstruction, called Recurrent Variational Network (RecurrentVarNet) by exploiting the properties of Convolution Recurrent Networks and unrolled algorithms for solving Inverse Problems. The RecurrentVarNet consists of multiple blocks, each responsible for one unrolled iteration of the gradient descent optimization algorithm for solving inverse problems. Contrary to traditional approaches, the optimization steps are performed in the observation domain ($k$-space) instead of the image domain. Each recurrent block of RecurrentVarNet refines the observed $k$-space and is comprised of a data consistency term and a recurrent unit which takes as input a learned hidden state and the prediction of the previous block. Our proposed method achieves new state of the art qualitative and quantitative reconstruction results on 5-fold and 10-fold accelerated data from a public multi-channel brain dataset, outperforming previous conventional and deep learning-based approaches. We will release all models code and baselines on our public repository.
Order recognition by Schubert polynomials generated by optical near-field statistics via nanometre-scale photochromism

Kazuharu Uchiyama, Sota Nakajima, Hirotsugu Suzui, Nicolas Chauvet, Hayato Saigo, Ryoichi Horisaki, Kingo Uchida, Makoto Naruse, Hirokazu Hori. We have previously observed an irregular spatial distribution of photon transmission through a photochromic crystal photoisomerized by a local optical near-field excitation, manifesting complex branching processes via the interplay of deformation of the material and near-field photon transfer therein. Furthermore, by combining such naturally constructed complex photon transmission with a simple photon detection protocol, Schubert polynomials, the foundation of versatile permutation operations in mathematics, have been generated. In this study, we demonstrate an order recognition algorithm inspired by Schubert calculus using optical near-field statistics via nanometre-scale photochromism. More specifically, by utilizing Schubert polynomials generated via optical near-field patterns, we show that the order of slot machines with initially unknown reward probability is successfully recognized. We emphasize that, unlike conventional algorithms in the literature, the proposed principle does not estimate the reward probabilities. Instead, it exploits the inversion relations contained in the Schubert polynomials. To quantitatively evaluate the impact of the Schubert polynomials generated from an optical near-field pattern, order recognition performances are compared with uniformly distributed and spatially strongly skewed probability distributions, where the optical near-field pattern outperforms the others. We found that the number of singularities contained in Schubert polynomials and that of the given problem or considered environment exhibits a clear correspondence, indicating that superior order recognition performances may be attained if the singularity of the given problem is presupposed. This study paves a new way toward nanophotonic intelligent devices and systems by the interplay of complex natural processes and mathematical insights gained by Schubert calculus.
Multidimensional imaging reveals mechanisms controlling label-free biosensing in vertical 2DM-heterostructures

Tetyana Ignatova, Sajedeh Pourianejad, Xinyi Li, Kirby Schmidt, Frederick Aryeetey, Shyam Aravamudhan, Slava V. Rotkin. Two-dimensional materials and their van der Waals heterostructures enable a large range of applications, including label-free biosensing. Lattice mismatch and work function difference in the heterostructure material result in strain and charge transfer, often varying at nanometer scale, that influence device performance. In this work, a multidimensional optical imaging technique is developed in order to map sub-diffractional distributions of doping and strain and to understand their role in modulating the electronic properties of the material. As an example, a vertical heterostructure composed of monolayer graphene and single-layer flakes of the transition metal dichalcogenide MoS$_2$ is fabricated and used for biosensing. Herein, an optical label-free detection of doxorubicin, a common cancer drug, is reported via three independent optical detection channels (photoluminescence shift, Raman shift and Graphene Enhanced Raman Scattering). Non-uniform broadening of components of the multimodal signal correlates with the statistical distribution of local optical properties of the heterostructure. Multidimensional nanoscale imaging allows one to reveal the physical origin of such a local response and to propose the best strategy for mitigating materials variability in future device fabrication.
Cavity Amplified Scattering Spectroscopy reveals the dynamics of proteins and nanoparticles in quasi-transparent and miniature samples

Dynamic light scattering techniques are routinely used for numerous industrial and research applications, because they can give access to the motion spectrum of micro- and nano-objects, and therefore to particle sizes or visco-elastic properties. However, measurements are impossible when samples do not scatter enough light, i.e. when there are too few scattering events due to excessively small scattering cross-sections and/or low concentrations of scatterers. Here, we propose to amplify light scattering efficiency by placing weakly scattering samples inside a Lambertian cavity with high reflectance walls. It produces a 3D isotropic and homogeneous light field that effectively elongates the scattering pathlength by 2 to 3 orders of magnitude, and leads to a dramatic increase in sensitivity. We could indeed measure the diffusion coefficient and size of particles ranging from 5 nm to 20 microns with volume fractions as low as 10^(-9) in volumes as low as 100 microliters, and in solvents with refractive index mismatches down to 0.01. With a 10^(4)-fold increase in sensitivity compared to classical techniques, we considerably expand the applications of light scattering to highly diluted samples, miniaturized microfluidics samples, and samples practically deemed non-scattering. Beyond the realm of current applications of light scattering techniques, our Cavity Amplified Scattering Spectroscopy method (CASS) and its outstanding sensitivity represent a major methodological step towards the study of problems such as the ballistic limit of Brownian motion, the internal dynamics of proteins, or the low frequency dielectric dynamics of liquids.
Coherent feedback cooling of a nanomechanical membrane with atomic spins

Coherent feedback stabilises a system towards a target state without the need of a measurement, thus avoiding the quantum backaction inherent to measurements. Here, we employ optical coherent feedback to remotely cool a nanomechanical membrane using atomic spins as a controller. Direct manipulation of the atoms allows us to tune from strong-coupling to an overdamped regime. Making use of the full coherent control offered by our system, we perform spin-membrane state swaps combined with stroboscopic spin pumping to cool the membrane in a room-temperature environment to ${T}={216}\,\mathrm{mK}$ ($\bar{n}_{m} = 2.3\times 10^3$ phonons) in ${200}\,\mathrm{{\mu}s}$. We furthermore observe and study the effects of delayed feedback on the cooling performance. Starting from a cryogenically pre-cooled membrane, this method would enable cooling of the mechanical oscillator close to its quantum mechanical ground state and the preparation of nonclassical states.
Efficient single-photon pair generation by spontaneous parametric down-conversion in nonlinear plasmonic metasurfaces

Spontaneous parametric down-conversion (SPDC) is one of the most versatile nonlinear optical techniques for the generation of entangled and correlated single-photon pairs. However, it suffers from very poor efficiency, leading to extremely weak photon generation rates. Here we propose a plasmonic metasurface design based on silver nanostripes combined with a bulk lithium niobate (LiNbO3) crystal to realize a new scalable, ultrathin, and efficient SPDC source. By making the fundamental and higher-order resonances of the metasurface coincide with the generated signal and idler frequencies, respectively, the electric field in the nonlinear medium is significantly boosted. This leads to a substantial enhancement in the SPDC process which, subsequently, by using the quantum-classical correspondence principle, translates to very high photon-pair generation rates. The emitted radiation is highly directional and perpendicular to the metasurface, in contrast to comparable dielectric structures. The incorporation of circularly polarized excitation further increases the photon-pair generation efficiency. The presented work will lead to the design of new efficient ultrathin SPDC single-photon nanophotonic sources working at room temperature that are expected to be critical components in free-space quantum optical communications. In a more general context, our findings can find various applications in the emerging field of quantum plasmonics.
Information-theoretic formulation of dynamical systems: causality, modeling, and control

The problems of causality, modeling, and control for chaotic, high-dimensional dynamical systems are formulated in the language of information theory. The central quantity of interest is the Shannon entropy, which measures the amount of information in the states of the system. Within this framework, causality in a dynamical system is quantified by the information flux among the variables of interest. Reduced-order modeling is posed as a problem on the conservation of information, in which models aim at preserving the maximum amount of relevant information from the original system. Similarly, control theory is cast in information-theoretic terms by envisioning the tandem sensor-actuator as a device reducing the unknown information of the state to be controlled. The new formulation is applied to address three problems in the causality, modeling, and control of turbulence, which stands as a primary example of a chaotic, high-dimensional dynamical system. The applications include the causality of the energy transfer in the turbulent cascade, subgrid-scale modeling for large-eddy simulation, and flow control for drag reduction in wall-bounded turbulence.
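For readers unfamiliar with the central quantity named above, the Shannon entropy of a discrete variable with probabilities p_i is H = -sum_i p_i log2(p_i), measured in bits. The snippet below is a minimal sketch, assuming the system state has already been discretized into a finite set of labels and that probabilities are estimated from observed frequencies; the function name and the toy samples are illustrative and not part of the paper, which addresses chaotic, high-dimensional systems.

```python
import math
from collections import Counter


def shannon_entropy(states):
    """Plug-in estimate of Shannon entropy (in bits) from a list of
    observed discrete states, using relative frequencies as probabilities."""
    counts = Counter(states)
    total = len(states)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A state that never changes carries no information.
h_constant = shannon_entropy(["a"] * 8)

# Two equally likely states carry one bit; four carry two bits.
h_two = shannon_entropy(["a", "b"] * 4)
h_four = shannon_entropy(["a", "b", "c", "d"] * 2)
```

Entropy grows as the observed states become more evenly spread, which matches the description of it as a measure of the amount of information in the states of the system.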
Quenching Factor consistency across several NaI(Tl) crystals

D. Cintas, P. An, C. Awe, P. S. Barbeau, E. Barbosa de Souza, S. Hedges, J. H. Jo, M. Martinez, R. H. Maruyama, L. Li, G. C. Rich, J. Runge, M. L. Sarsa, W. G. Thompson. Testing the DAMA/LIBRA annual modulation result independently of dark matter particle and halo models has been a challenge for twenty years. Using the same target material, NaI(Tl), is required, and presently two experiments, ANAIS-112 and COSINE-100, are running with this goal. A precise knowledge of the detector response to nuclear recoils is mandatory because this is the most likely channel in which to find the dark matter signal. The light produced by nuclear recoils is quenched with respect to that produced by electrons by a factor that has to be measured experimentally. However, current quenching factor measurements in NaI(Tl) crystals disagree within the energy region of interest for dark matter searches. Disentangling whether this discrepancy is due to intrinsic differences in the light response among different NaI(Tl) crystals, or has its origin in unaccounted-for systematic effects, will be key in the comparison among the different experiments. We present measurements of the quenching factors for five small NaI(Tl) crystals performed in the same experimental setup to control systematics. Quenching factor results are compatible between crystals and no clear dependence on energy is observed from 10 to 80 keVnr.
