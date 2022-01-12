
Neural Residual Flow Fields for Efficient Video Representations

By Daniel Rho, Junwoo Cho, Jong Hwan Ko, Eunbyung Park
arxiv.org
 3 days ago

Implicit neural representation (INR) has emerged as a powerful paradigm for representing signals, such as images, videos, 3D shapes, etc. Although it has shown the ability to represent fine details, its efficiency as a data representation has not been...


MedicalXpress

T cells fit to tackle Omicron, suggests new study

Research from the Hong Kong University of Science and Technology (HKUST) and the University of Melbourne has revealed that T cells, one of the body's key defenses against COVID-19, are expected to be effective in mounting an immune response against Omicron despite its significantly higher mutations compared to previous variants of concern.
SCIENCE
arxiv.org

Video Summarization Based on Video-text Representation

Modern video summarization methods are based on deep neural networks which require a large amount of annotated data for training. However, existing datasets for video summarization are small-scale, easily leading to over-fitting of the deep models. Considering that the annotation of large-scale datasets is time-consuming, we propose a multimodal self-supervised learning framework to obtain semantic representations of videos, which benefits the video summarization task. Specifically, we explore the semantic consistency between the visual information and text information of videos, for the self-supervised pretraining of a multimodal encoder on a newly-collected dataset of video-text pairs. Additionally, we introduce a progressive video summarization method, where the important content in a video is pinpointed progressively to generate better summaries. Finally, an objective evaluation framework is proposed to measure the quality of video summaries based on video classification. Extensive experiments have proved the effectiveness and superiority of our method in rank correlation coefficients, F-score, and the proposed objective evaluation compared to the state of the art.
COMPUTERS
arxiv.org

An efficient extended block Arnoldi algorithm for feedback stabilization of incompressible Navier-Stokes flow problems

The Navier-Stokes equations are widely used to model incompressible Newtonian fluids, such as air or water. This system of equations is very complex because of its nonlinear term. After linearization and discretization, we obtain an index-2 descriptor system described by a set of differential algebraic equations (DAEs). This paper has two main parts. First, we construct an efficient algorithm based on projection onto an extended block Krylov subspace, which yields a reduced system of the original DAE system. Second, we solve a Linear Quadratic Regulator (LQR) problem based on a Riccati feedback approach. This approach uses numerical solutions of large-scale algebraic Riccati equations. To this end, we use the extended Krylov subspace method, which projects the initial large matrix problem onto a low-order one that is solved by direct methods. These numerical solutions are used to obtain a feedback matrix that stabilizes the original system. We conclude with numerical results that confirm the performance of our proposed method compared to other known methods.
MATHEMATICS
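
The Riccati feedback idea in the abstract above can be illustrated on a one-state toy problem. This is only a sketch under stated assumptions, not the paper's method: the paper targets large-scale matrix Riccati equations reduced by Krylov projection, while here the system x' = a*x + b*u with cost integral of q*x^2 + r*u^2 is scalar, so the algebraic Riccati equation 2*a*p - (b^2/r)*p^2 + q = 0 has a closed-form positive root. The function name `scalar_lqr` and the values of `a`, `b`, `q`, `r` are hypothetical.

```python
import math


def scalar_lqr(a, b, q, r):
    """Solve the scalar continuous-time LQR problem in closed form.

    Returns (p, k): p is the positive root of the scalar algebraic
    Riccati equation 2*a*p - (b**2 / r) * p**2 + q = 0, and k = b*p/r
    is the feedback gain for the control law u = -k * x.
    """
    disc = math.sqrt(a * a + b * b * q / r)
    p = r * (a + disc) / (b * b)
    k = b * p / r
    return p, k


# Hypothetical unstable open-loop system (a > 0), chosen for illustration.
a, b, q, r = 1.0, 1.0, 1.0, 1.0
p, k = scalar_lqr(a, b, q, r)

# The Riccati residual should vanish up to rounding error.
residual = 2 * a * p - (b * b / r) * p * p + q

# The closed-loop pole a - b*k equals -sqrt(a^2 + b^2*q/r), so it is negative.
closed_loop = a - b * k

print(f"p = {p:.6f}, k = {k:.6f}")
print(f"residual = {residual:.2e}, closed-loop pole = {closed_loop:.6f}")
```

In the matrix case the same role is played by the gain K = R^{-1} B^T P, where P solves the matrix Riccati equation; the scalar version only shows why the resulting feedback moves an unstable pole into the stable half-plane.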
#Data Redundancy#Neural Network#Inr
arxiv.org

Efficiently Disentangle Causal Representations

This paper proposes an efficient approach to learning disentangled representations with causal mechanisms based on the difference of conditional probabilities in original and new distributions. We approximate the difference with models' generalization abilities so that it fits in the standard machine learning framework and can be efficiently computed. In contrast to the state-of-the-art approach, which relies on the learner's adaptation speed to a new distribution, the proposed approach only requires evaluating the model's generalization ability. We provide a theoretical explanation for the advantage of the proposed method, and our experiments show that the proposed technique is 1.9--11.0$\times$ more sample efficient and 9.4--32.4$\times$ quicker than the previous method on various tasks. The source code is available at this https URL.
SCIENCE
arxiv.org

Harmonics Virtual Lights : fast projection of luminance field on spherical harmonics for efficient rendering

In this paper, we introduce Harmonics Virtual Lights (HVL) to model indirect light sources for interactive global illumination of dynamic 3D scenes. Virtual Point Lights (VPL) are an efficient approach to define indirect light sources and to evaluate the resulting indirect lighting. Nonetheless, VPL suffer from disturbing artifacts, especially with high-frequency materials. Virtual Spherical Lights (VSL) avoid these artifacts by considering spheres instead of points, but they estimate the lighting integral using Monte Carlo, which results in noise in the final image. We define HVL as an extension of VSL in a Spherical Harmonics (SH) framework, defining a closed form of the lighting integral evaluation. We propose an efficient SH projection of the spherical light contribution that is faster than existing methods. Computing the outgoing luminance requires $\mathcal{O}(n)$ operations when using materials with circularly symmetric lobes, and $\mathcal{O}(n^2)$ operations for the general case, where $n$ is the number of SH bands. HVL can be used with either parametric or measured BRDF without extra cost and offers control over rendering time and image quality by either decreasing or increasing the band limit used for SH projection. Our approach is particularly well suited to rendering medium-frequency one-bounce global illumination with arbitrary BRDF in interactive time.
COMPUTERS
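
As background for the operation counts quoted above, the following sketch counts spherical harmonics coefficients for a given band limit. It relies only on standard SH facts: band l contains 2l + 1 basis functions, and a circularly symmetric (zonal) function needs only the m = 0 coefficient of each band. Reading the abstract's $\mathcal{O}(n)$ and $\mathcal{O}(n^2)$ costs as tracking these counts is an assumption made for illustration, not a statement from the paper, and the function names are hypothetical.

```python
def sh_coefficient_count(num_bands):
    """Total SH coefficients for bands l = 0 .. num_bands - 1 (2l + 1 each)."""
    return sum(2 * l + 1 for l in range(num_bands))


def zonal_coefficient_count(num_bands):
    """Coefficients of a circularly symmetric lobe: one (m = 0) per band."""
    return num_bands


# Map each band limit n to (zonal count, full count).
counts = {
    n: (zonal_coefficient_count(n), sh_coefficient_count(n))
    for n in (1, 2, 4, 8)
}

for n, (zonal, full) in counts.items():
    print(f"n = {n}: zonal = {zonal}, full = {full}")
```

The full count is always n squared, which is why raising the band limit trades rendering time for image quality.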
arxiv.org

Graph Neural Networks: a bibliometrics overview

Recently, graph neural networks have become a hot topic in the machine learning community. This paper presents a Scopus-based bibliometric overview of GNN research since 2004, when GNN papers were first published. The study aims to evaluate GNN research trends, both quantitatively and qualitatively. We provide the trend of research, distribution of subjects, active and influential authors and institutions, sources of publications, most cited documents, and hot topics. Our investigations reveal that the most frequent subject categories in this field are computer science, engineering, telecommunications, linguistics, operations research and management science, information science and library science, business and economics, automation and control systems, robotics, and social sciences. In addition, the most active source of GNN publications is Lecture Notes in Computer Science. The most prolific or impactful institutions are found in the United States, China, and Canada. We also provide must-read papers and future directions. Finally, applications of graph convolutional networks and attention mechanisms are now among the hot topics of GNN research.
COMPUTERS
arxiv.org

Topological Representations of Local Explanations

Local explainability methods -- those which seek to generate an explanation for each prediction -- are becoming increasingly prevalent due to the need for practitioners to rationalize their model outputs. However, comparing local explainability methods is difficult since they each generate outputs in various scales and dimensions. Furthermore, due to the stochastic nature of some explainability methods, it is possible for different runs of a method to produce contradictory explanations for a given observation. In this paper, we propose a topology-based framework to extract a simplified representation from a set of local explanations. We do so by first modeling the relationship between the explanation space and the model predictions as a scalar function. Then, we compute the topological skeleton of this function. This topological skeleton acts as a signature for such functions, which we use to compare different explanation methods. We demonstrate that our framework can not only reliably identify differences between explainability techniques but also provides stable representations. Then, we show how our framework can be used to identify appropriate parameters for local explainability methods. Our framework is simple, does not require complex optimizations, and can be broadly applied to most local explanation methods. We believe the practicality and versatility of our approach will help promote topology-based approaches as a tool for understanding and comparing explanation methods.
SCIENCE
arxiv.org

Efficient-Dyn: Dynamic Graph Representation Learning via Event-based Temporal Sparse Attention Network

Static graph neural networks have been widely used in modeling and representation learning of graph-structured data. However, many real-world problems, such as social networks, financial transactions, and recommendation systems, are dynamic; that is, nodes and edges are added or deleted over time. Therefore, in recent years, dynamic graph neural networks have received more and more attention from researchers. In this work, we propose a novel dynamic graph neural network, Efficient-Dyn. It adaptively encodes temporal information into a sequence of patches with an equal amount of temporal-topological structure. Therefore, while avoiding the information loss caused by the use of snapshots, it also achieves a finer time granularity, which is close to what continuous networks could provide. In addition, we designed a lightweight module, Sparse Temporal Transformer, to compute node representations through both structural neighborhoods and temporal dynamics. Since the fully-connected attention conjunction is simplified, the computation cost is far lower than that of the current state of the art. Link prediction experiments are conducted on both continuous and discrete graph datasets. In comparisons with several state-of-the-art graph embedding baselines, the experimental results demonstrate that Efficient-Dyn has a faster inference speed while having competitive performance.
COMPUTERS
arxiv.org

Two-level Graph Neural Network

Graph Neural Networks (GNNs) are recently proposed neural network structures for the processing of graph-structured data. Due to their employed neighbor aggregation strategy, existing GNNs focus on capturing node-level information and neglect high-level information. Existing GNNs therefore suffer from representational limitations caused by the Local Permutation Invariance (LPI) problem. To overcome these limitations and enrich the features captured by GNNs, we propose a novel GNN framework, referred to as the Two-level GNN (TL-GNN). This merges subgraph-level information with node-level information. Moreover, we provide a mathematical analysis of the LPI problem which demonstrates that subgraph-level information is beneficial to overcoming the problems associated with LPI. A subgraph counting method based on dynamic programming is also proposed; its time complexity is O(n^3), where n is the number of nodes in the graph. Experiments show that TL-GNN outperforms existing GNNs and achieves state-of-the-art performance.
CODING & PROGRAMMING
adafruit.com

Sofia Crespo Explores Creation and Neural Networks in Neural Zoo #ArtTuesday

Sofia Crespo uses AI to create these wondrous, slightly unsettling pieces, via Art in America: These images resemble nature, but an imagined nature that has been rearranged. Our visual cortex recognizes the textures, but the brain is simultaneously aware that those elements don’t belong to any arrangement of reality that it has access to. Computer vision and machine learning could offer a bridge between us and a speculative “natures” that can only be accessed through high levels of parallel computation.
VISUAL ART
arxiv.org

Flow-Guided Sparse Transformer for Video Deblurring

By Jing Lin, Yuanhao Cai, Xiaowan Hu, Haoqian Wang, Youliang Yan, Xueyi Zou, Henghui Ding, Yulun Zhang, Radu Timofte, Luc Van Gool

Exploiting similar and sharper scene patches in spatio-temporal neighborhoods is critical for video deblurring. However, CNN-based methods show limitations in capturing long-range dependencies and modeling non-local self-similarity. In this paper, we propose a novel framework, Flow-Guided Sparse Transformer (FGST), for video deblurring. In FGST, we customize a self-attention module, Flow-Guided Sparse Window-based Multi-head Self-Attention (FGSW-MSA). For each $query$ element on the blurry reference frame, FGSW-MSA enjoys the guidance of the estimated optical flow to globally sample spatially sparse yet highly related $key$ elements corresponding to the same scene patch in neighboring frames. Besides, we present a Recurrent Embedding (RE) mechanism to transfer information from past frames and strengthen long-range temporal dependencies. Comprehensive experiments demonstrate that our proposed FGST outperforms state-of-the-art (SOTA) methods on both DVD and GOPRO datasets and even yields more visually pleasing results in real video deblurring. Code and models will be released to the public.
COMPUTERS
arxiv.org

Auto-Weighted Layer Representation Based View Synthesis Distortion Estimation for 3-D Video Coding

Recently, various view synthesis distortion estimation models have been studied to better serve 3-D video coding. However, they can hardly model quantitatively the relationship among different levels of depth changes, texture degeneration, and the view synthesis distortion (VSD), which is crucial for rate-distortion optimization and rate allocation. In this paper, an auto-weighted layer representation based view synthesis distortion estimation model is developed. Firstly, the sub-VSD (S-VSD) is defined according to the level of depth changes and their associated texture degeneration. After that, a set of theoretical derivations demonstrates that the VSD can be approximately decomposed into the S-VSDs multiplied by their associated weights. To obtain the S-VSDs, a layer-based representation of S-VSD is developed, where all the pixels with the same level of depth changes are represented with a layer to enable efficient S-VSD calculation at the layer level. Meanwhile, a nonlinear mapping function is learnt to accurately represent the relationship between the VSD and S-VSDs, automatically providing weights for S-VSDs during the VSD estimation. To learn such a function, a dataset of VSD and its associated S-VSDs is built. Experimental results show that the VSD can be accurately estimated with the weights learnt by the nonlinear mapping function once its associated S-VSDs are available. The proposed method outperforms the relevant state-of-the-art methods in both accuracy and efficiency. The dataset and source code of the proposed method will be available at this https URL.
CODING & PROGRAMMING
arxiv.org

Bifurcations of a neural network model with symmetry

We analyze a family of clustered excitatory-inhibitory neural networks and the underlying bifurcation structures that arise because of permutation symmetries in the network as the global coupling strength $g$ is varied. We primarily consider two network topologies: an all-to-all connected network which excludes self-connections, and a network in which the excitatory cells are broken into clusters of equal size. Although in both cases the bifurcation structure is determined by symmetries in the system, the behavior of the two systems is qualitatively different. In the all-to-all connected network, the system undergoes Hopf bifurcations leading to periodic orbit solutions; notably, for large $g$, there is a single, stable periodic orbit solution and no stable fixed points. By contrast, in the clustered network, there are no Hopf bifurcations, and there is a family of stable fixed points for large $g$.
CODING & PROGRAMMING
arxiv.org

An application of the splitting-up method for the computation of a neural network representation for the solution for the filtering equations

The filtering equations govern the evolution of the conditional distribution of a signal process given partial, and possibly noisy, observations arriving sequentially in time. Their numerical approximation plays a central role in many real-life applications, including numerical weather prediction, finance and engineering. One of the classical approaches to approximate the solution of the filtering equations is to use a PDE inspired method, called the splitting-up method, initiated by Gyongy, Krylov, LeGland, among other contributors. This method, and other PDE based approaches, have particular applicability for solving low-dimensional problems. In this work we combine this method with a neural network representation. The new methodology is used to produce an approximation of the unnormalised conditional distribution of the signal process. We further develop a recursive normalisation procedure to recover the normalised conditional distribution of the signal process. The new scheme can be iterated over multiple time steps whilst keeping its asymptotic unbiasedness property intact.
COMPUTERS
arxiv.org

Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics

Efficient model selection for identifying a suitable pre-trained neural network to a downstream task is a fundamental yet challenging task in deep learning. Current practice requires expensive computational costs in model training for performance prediction. In this paper, we propose a novel framework for neural network selection by analyzing the governing dynamics over synaptic connections (edges) during training. Our framework is built on the fact that back-propagation during neural network training is equivalent to the dynamical evolution of synaptic connections. Therefore, a converged neural network is associated with an equilibrium state of a networked system composed of those edges. To this end, we construct a network mapping $\phi$, converting a neural network $G_A$ to a directed line graph $G_B$ that is defined on those edges in $G_A$. Next, we derive a neural capacitance metric $\beta_{\rm eff}$ as a predictive measure universally capturing the generalization capability of $G_A$ on the downstream task using only a handful of early training results. We carried out extensive experiments using 17 popular pre-trained ImageNet models and five benchmark datasets, including CIFAR10, CIFAR100, SVHN, Fashion MNIST and Birds, to evaluate the fine-tuning performance of our framework. Our neural capacitance metric is shown to be a powerful indicator for model selection based only on early training results and is more efficient than state-of-the-art methods.
SCIENCE
arxiv.org

Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks

By Shoukang Hu, Xurong Xie, Mingyu Cui, Jiajun Deng, Shansong Liu, Jianwei Yu, Mengzhe Geng, Xunying Liu, Helen Meng

State-of-the-art automatic speech recognition (ASR) system development is data and computation intensive. The optimal design of deep neural networks (DNNs) for these systems often requires expert knowledge and empirical evaluation. In this paper, a range of neural architecture search (NAS) techniques are used to automatically learn two types of hyper-parameters of factored time delay neural networks (TDNN-Fs): i) the left and right splicing context offsets; and ii) the dimensionality of the bottleneck linear projection at each hidden layer. These techniques include the differentiable neural architecture search (DARTS) method integrating architecture learning with lattice-free MMI training; Gumbel-Softmax and pipelined DARTS methods reducing the confusion over candidate architectures and improving the generalization of architecture selection; and Penalized DARTS incorporating resource constraints to balance the trade-off between performance and system complexity. Parameter sharing among TDNN-F architectures allows an efficient search over up to 7^28 different systems. Statistically significant word error rate (WER) reductions of up to 1.2% absolute and relative model size reduction of 31% were obtained over a state-of-the-art 300-hour Switchboard corpus trained baseline LF-MMI TDNN-F system featuring speed perturbation, i-Vector and learning hidden unit contribution (LHUC) based speaker adaptation as well as RNNLM rescoring. Performance contrasts on the same task against recent end-to-end systems reported in the literature suggest the best NAS auto-configured system achieves state-of-the-art WERs of 9.9% and 11.1% on the NIST Hub5'00 and Rt03s test sets respectively with up to 96% model size reduction. Further analysis using Bayesian learning shows that the proposed NAS approaches can effectively minimize the structural redundancy in the TDNN-F systems and reduce their model parameter uncertainty. Consistent performance improvements were also obtained on a UASpeech dysarthric speech recognition task.
COMPUTERS
arxiv.org

BottleFit: Learning Compressed Representations in Deep Neural Networks for Effective and Efficient Split Computing

Although mission-critical applications require the use of deep neural networks (DNNs), their continuous execution at mobile devices results in a significant increase in energy consumption. While edge offloading can decrease energy consumption, erratic patterns in channel quality, network and edge server load can lead to severe disruption of the system's key operations. An alternative approach, called split computing, generates compressed representations within the model (called "bottlenecks"), to reduce bandwidth usage and energy consumption. Prior work has proposed approaches that introduce additional layers, to the detriment of energy consumption and latency. For this reason, we propose a new framework called BottleFit, which, in addition to targeted DNN architecture modifications, includes a novel training strategy to achieve high accuracy even with strong compression rates. We apply BottleFit on cutting-edge DNN models in image classification, and show that BottleFit achieves 77.1% data compression with up to 0.6% accuracy loss on ImageNet dataset, while state of the art such as SPINN loses up to 6% in accuracy. We experimentally measure the power consumption and latency of an image classification application running on an NVIDIA Jetson Nano board (GPU-based) and a Raspberry PI board (GPU-less). We show that BottleFit decreases power consumption and latency respectively by up to 49% and 89% with respect to (w.r.t.) local computing and by 37% and 55% w.r.t. edge offloading. We also compare BottleFit with state-of-the-art autoencoders-based approaches, and show that (i) BottleFit reduces power consumption and execution time respectively by up to 54% and 44% on the Jetson and 40% and 62% on Raspberry PI; (ii) the size of the head model executed on the mobile device is 83 times smaller. The code repository will be published for full reproducibility of the results.
CELL PHONES
towardsdatascience.com

Neural Networks and Neural Autoencoders as Dimensional Reduction Tools: Knime and Python

Now I will explore quite a similar path, but I will use a Neural Network and a Neural Autoencoder, instead of the UMAP algorithm, for dimensional reduction. I will do that both within the Knime environment, with its Keras integration, and with TensorFlow in Python. After dimensional reduction, I will use DBSCAN to verify whether the clusters created by the neural networks can be identified… or not. All code and workflows will be shared.
CODING & PROGRAMMING
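
The final step described above, running DBSCAN on the reduced representation, can be sketched in plain Python. This is a minimal illustration and not the article's Knime workflow or TensorFlow code: the 2-D points stand in for hypothetical autoencoder outputs, and the names `dbscan`, `region_query`, `eps`, and `min_samples` as well as the parameter values are assumptions chosen for the example.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i], including i."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Label each point with a cluster index (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels


# Hypothetical 2-D embedding: two tight groups and one isolated point.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (10.0, 0.0),
]
labels = dbscan(points, eps=0.3, min_samples=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

If the reduced representation separates the data well, DBSCAN recovers the groups without being told how many clusters exist; points in sparse regions are labeled as noise rather than forced into a cluster.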
TechCrunch

The convergence of deep neural networks and immunotherapy

While both are among the most transformational areas of modern science, 30 years ago, these fields were all but ridiculed by the scientific community. As a result, progress in each happened at the sidelines of academia for decades. Between the 1970s and 1990s, some of the most prominent computer scientists,...
CANCER

