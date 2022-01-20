
Informative Pseudo-Labeling for Graph Neural Networks with Few Labels

By Yayong Li, Jie Yin, Ling Chen
 4 days ago

Graph Neural Networks (GNNs) have achieved state-of-the-art results for semi-supervised node classification on graphs. Nevertheless, the challenge of how to effectively learn GNNs with very few labels is still under-explored. As one of the prevalent semi-supervised methods, pseudo-labeling has been proposed to explicitly address the label scarcity problem. It aims to...

towardsdatascience.com

Exploring the LSTM Neural Network Model for Time Series

Practical, straightforward implementation with the scalecast library. One of the most advanced models out there to forecast time series is the Long Short-Term Memory (LSTM) Neural Network. According to Korstanje in his book, Advanced Forecasting with Python: “The LSTM cell adds long-term memory in an even more performant way because...
CODING & PROGRAMMING
arxiv.org

Multi-Scale Adaptive Graph Neural Network for Multivariate Time Series Forecasting

Multivariate time series (MTS) forecasting plays an important role in the automation and optimization of intelligent applications. It is a challenging task, as we need to consider both complex intra-variable dependencies and inter-variable dependencies. Existing works only learn temporal patterns with the help of single inter-variable dependencies. However, there are multi-scale temporal patterns in many real-world MTS. Single inter-variable dependencies make the model prefer to learn one type of prominent and shared temporal pattern. In this paper, we propose a multi-scale adaptive graph neural network (MAGNN) to address the above issue. MAGNN exploits a multi-scale pyramid network to preserve the underlying temporal dependencies at different time scales. Since the inter-variable dependencies may be different under distinct time scales, an adaptive graph learning module is designed to infer the scale-specific inter-variable dependencies without pre-defined priors. Given the multi-scale feature representations and scale-specific inter-variable dependencies, a multi-scale temporal graph neural network is introduced to jointly model intra-variable dependencies and inter-variable dependencies. After that, we develop a scale-wise fusion module to effectively promote the collaboration across different time scales, and automatically capture the importance of contributed temporal patterns. Experiments on four real-world datasets demonstrate that MAGNN outperforms the state-of-the-art methods across various settings.
COMPUTERS
arxiv.org

State Estimation in Electric Power Systems Leveraging Graph Neural Networks

The goal of the state estimation (SE) algorithm is to estimate complex bus voltages as state variables based on the available set of measurements in the power system. Because phasor measurement units (PMUs) are increasingly being used in transmission power systems, there is a need for a fast SE solver that can take advantage of PMU high sampling rates. This paper proposes training a graph neural network (GNN) to learn the estimates given the PMU voltage and current measurements as inputs, with the intent of obtaining fast and accurate predictions during the evaluation phase. The GNN is trained using synthetic datasets, created by randomly sampling sets of measurements in the power system and labelling them with a solution obtained using a linear SE solver with PMUs. The presented results display the accuracy of the GNN predictions in various test scenarios and examine the sensitivity of the predictions to missing input data.
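The abstract does not spell out the linear SE solver used to label the training data. A common textbook formulation, assumed here purely for illustration, is weighted least squares on a linear measurement model, with complex phasors split into real and imaginary parts. The symbols z, H, R, e and x are notation introduced for this sketch, not taken from the paper.

```latex
% Linear measurement model: PMU measurements z are linear in the bus-voltage state x,
% with measurement noise e of covariance R (assumed notation).
\mathbf{z} = \mathbf{H}\mathbf{x} + \mathbf{e}, \qquad \mathbf{e} \sim \mathcal{N}(\mathbf{0}, \mathbf{R})

% Weighted least-squares estimate: closed form, no iterations required.
\hat{\mathbf{x}} = \left(\mathbf{H}^{\top} \mathbf{R}^{-1} \mathbf{H}\right)^{-1} \mathbf{H}^{\top} \mathbf{R}^{-1} \mathbf{z}
```

Under this assumption, each synthetic training sample would pair a sampled measurement vector z with its estimate x̂ as the label.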
ENERGY INDUSTRY
arxiv.org

Neural Approaches to Conversational Information Retrieval

A conversational information retrieval (CIR) system is an information retrieval (IR) system with a conversational interface which allows users to interact with the system to seek information via multi-turn conversations of natural language, in spoken or written form. Recent progress in deep learning has brought tremendous improvements in natural language processing (NLP) and conversational AI, leading to a plethora of commercial conversational services that allow naturally spoken and typed interaction, increasing the need for more human-centric interactions in IR. As a result, we have witnessed a resurgent interest in developing modern CIR systems in both research communities and industry. This book surveys recent advances in CIR, focusing on neural approaches that have been developed in the last few years. This book is based on the authors' tutorial at SIGIR'2020 (Gao et al., 2020b), with IR and NLP communities as the primary target audience. However, audiences with other backgrounds, such as machine learning and human-computer interaction, will also find it an accessible introduction to CIR. We hope that this book will prove a valuable resource for students, researchers, and software developers. This manuscript is a working draft. Comments are welcome.
COMPUTERS
arxiv.org

GraphVAMPNet, using graph neural networks and variational approach to markov processes for dynamical modeling of biomolecules

Finding a low-dimensional representation of data from long-timescale trajectories of biomolecular processes such as protein folding or ligand-receptor binding is of fundamental importance, and kinetic models such as Markov models have proven useful in describing the kinetics of these systems. Recently, an unsupervised machine learning technique called VAMPNet was introduced to learn the low-dimensional representation and a linear dynamical model in an end-to-end manner. VAMPNet is based on the variational approach to Markov processes (VAMP) and relies on neural networks to learn the coarse-grained dynamics. In this contribution, we combine VAMPNet and graph neural networks to generate an end-to-end framework to efficiently learn high-level dynamics and metastable states from long-timescale molecular dynamics trajectories. This method bears the advantages of graph representation learning and uses graph message passing operations to generate an embedding for each datapoint, which is used in VAMPNet to generate a coarse-grained representation. This type of molecular representation results in a higher-resolution and more interpretable Markov model than the standard VAMPNet, enabling a more detailed kinetic study of biomolecular processes. Our GraphVAMPNet approach is also enhanced with an attention mechanism to find the important residues for classification into different metastable states.
COMPUTERS
arxiv.org

Scientific Machine Learning through Physics-Informed Neural Networks: Where we are and What's next

Salvatore Cuomo, Vincenzo Schiano di Cola, Fabio Giampaolo, Gianluigi Rozza, Maziar Raissi, Francesco Piccialli. Physics-Informed Neural Networks (PINNs) are neural networks (NNs) that encode model equations, like Partial Differential Equations (PDEs), as a component of the neural network itself. PINNs are nowadays used to solve PDEs, fractional equations, and integral-differential equations. This novel methodology has arisen as a multi-task learning framework in which a NN must fit observed data while reducing a PDE residual. This article provides a comprehensive review of the literature on PINNs: while the primary goal of the study was to characterize these networks and their related advantages and disadvantages, the review also attempts to incorporate publications on a larger variety of issues, including physics-constrained neural networks (PCNN), where the initial or boundary conditions are directly embedded in the NN structure rather than in the loss functions. The study indicates that most research has focused on customizing the PINN through different activation functions, gradient optimization techniques, neural network structures, and loss function structures. Although PINNs have been applied widely and have been shown to be more feasible in some contexts than classical numerical techniques such as the Finite Element Method (FEM), advancements are still possible, most notably on theoretical issues that remain unresolved.
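The multi-task idea described above (fit observed data while reducing a PDE residual) is often written as a two-term loss. The form below is a generic sketch, not the review's own notation: the network u_θ, the differential operator F, the data points (x_i, u_i), the collocation points x_j, the mean-squared form, and the weight λ are all assumptions made for illustration.

```latex
\mathcal{L}(\theta) =
  \underbrace{\frac{1}{N_d} \sum_{i=1}^{N_d} \bigl| u_\theta(x_i) - u_i \bigr|^2}_{\text{data misfit}}
  \; + \; \lambda \,
  \underbrace{\frac{1}{N_r} \sum_{j=1}^{N_r} \bigl| \mathcal{F}[u_\theta](x_j) \bigr|^2}_{\text{PDE residual}}
```

Here F[u] = 0 stands for the governing equation, so the second term vanishes only when the network output satisfies the PDE at the collocation points.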
COMPUTERS
arxiv.org

Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification

Hierarchical multi-granularity classification (HMC) assigns hierarchical multi-granularity labels to each object and focuses on encoding the label hierarchy, e.g., ["Albatross", "Laysan Albatross"] from coarse-to-fine levels. However, the definition of what is fine-grained is subjective, and the image quality may affect the identification. Thus, samples could be observed at any level of the hierarchy, e.g., ["Albatross"] or ["Albatross", "Laysan Albatross"], and examples discerned at coarse categories are often neglected in the conventional setting of HMC. In this paper, we study the HMC problem in which objects are labeled at any level of the hierarchy. The essential designs of the proposed method are derived from two motivations: (1) learning with objects labeled at various levels should transfer hierarchical knowledge between levels; (2) lower-level classes should inherit attributes related to upper-level superclasses. The proposed combinatorial loss maximizes the marginal probability of the observed ground truth label by aggregating information from related labels defined in the tree hierarchy. If the observed label is at the leaf level, the combinatorial loss further imposes the multi-class cross-entropy loss to increase the weight of fine-grained classification loss. Considering the hierarchical feature interaction, we propose a hierarchical residual network (HRN), in which granularity-specific features from parent levels acting as residual connections are added to features of children levels. Experiments on three commonly used datasets demonstrate the effectiveness of our approach compared to the state-of-the-art HMC approaches and fine-grained visual classification (FGVC) methods exploiting the label hierarchy.
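One way to read "maximizes the marginal probability of the observed ground truth label" is that a coarse label's probability is obtained by summing the probabilities of the leaves beneath it. The sketch below illustrates only that reading; it is not the paper's loss. The hierarchy, the extra class names ("Black-footed Albatross", "Gull", "Herring Gull"), the probabilities, and both function names are hypothetical.

```python
import math

# Hypothetical two-level hierarchy. Only "Albatross" and "Laysan Albatross"
# appear in the abstract; the other names are made up for illustration.
CHILDREN = {
    "Albatross": ["Laysan Albatross", "Black-footed Albatross"],
    "Gull": ["Herring Gull"],
}


def marginal_probability(label, leaf_probs):
    """Probability of a label: the sum over the leaf classes beneath it.

    A leaf label has no entry in CHILDREN, so it is its own only leaf.
    """
    leaves = CHILDREN.get(label, [label])
    return sum(leaf_probs[leaf] for leaf in leaves)


def marginal_nll(label, leaf_probs):
    """Negative log of the marginal probability of the observed label."""
    return -math.log(marginal_probability(label, leaf_probs))


# Made-up leaf-level predictions that sum to 1.
leaf_probs = {
    "Laysan Albatross": 0.5,
    "Black-footed Albatross": 0.25,
    "Herring Gull": 0.25,
}

coarse_p = marginal_probability("Albatross", leaf_probs)
fine_p = marginal_probability("Laysan Albatross", leaf_probs)
```

With this reading, a sample labeled only as "Albatross" still produces a training signal for both albatross leaves, which is the kind of knowledge transfer between levels that the first motivation describes.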
SCIENCE
arxiv.org

Disentangled Graph Neural Networks for Session-based Recommendation

Session-based recommendation (SBR) has drawn increasing research attention in recent years, due to its great practical value in exploiting only the limited user behavior history in the current session. Existing methods typically learn the session embedding at the item level, namely, aggregating the embeddings of items with or without the attention weights assigned to items. However, they ignore the fact that a user's intent on adopting an item is driven by certain factors of the item (e.g., the leading actors of a movie). In other words, they have not explored finer-granularity interests of users at the factor level to generate the session embedding, leading to sub-optimal performance. To address the problem, we propose a novel method called Disentangled Graph Neural Network (Disen-GNN) to capture the session purpose with the consideration of factor-level attention on each item. Specifically, we first employ the disentangled learning technique to cast item embeddings into the embeddings of multiple factors, and then use the gated graph neural network (GGNN) to learn the embedding of each factor separately, based on the item adjacent similarity matrix computed for each factor. Moreover, the distance correlation is adopted to enhance the independence between each pair of factors. After representing each item with independent factors, an attention mechanism is designed to learn user intent to different factors of each item in the session. The session embedding is then generated by aggregating the item embeddings with attention weights of each item's factors. To this end, our model takes user intents at the factor level into account to infer the user purpose in a session. Extensive experiments on three benchmark datasets demonstrate the superiority of our method over existing methods.
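Distance correlation, mentioned above as the independence measure between factors, is a standard statistic that can be computed from pairwise distances. The following is a minimal pure-Python sketch of the sample version, assuming each factor is given as one vector per item. How Disen-GNN plugs this quantity into its training objective is not stated in the abstract, and the function names here are illustrative.

```python
import math


def _centered_distances(points):
    """Pairwise Euclidean distance matrix, double-centred (rows and columns)."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    row_mean = [sum(row) / n for row in d]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def _mean_product(u, v):
    """Mean of the element-wise product of two square matrices."""
    n = len(u)
    return sum(u[i][j] * v[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(xs, ys):
    """Sample distance correlation between paired vector samples xs and ys.

    Returns a value in [0, 1]; 0.0 is returned when either sample is constant.
    """
    a, b = _centered_distances(xs), _centered_distances(ys)
    dcov2 = _mean_product(a, b)
    denom = math.sqrt(_mean_product(a, a) * _mean_product(b, b))
    if denom == 0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / denom)


# Toy check: a factor that is a linear function of another is fully dependent.
factor_a = [(float(i),) for i in range(6)]
factor_b = [(2.0 * x[0] + 1.0,) for x in factor_a]
dependent = distance_correlation(factor_a, factor_b)
```

A value near 1 signals strongly dependent factors, so a model that wants independent factors would push this quantity toward 0 for every pair.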
COMPUTERS
arxiv.org

A Kernel-Expanded Stochastic Neural Network

The deep neural network suffers from many fundamental issues in machine learning. For example, it often gets trapped in a local minimum during training, and its prediction uncertainty is hard to assess. To address these issues, we propose the so-called kernel-expanded stochastic neural network (K-StoNet) model, which incorporates support vector regression (SVR) as the first hidden layer and reformulates the neural network as a latent variable model. The former maps the input vector into an infinite-dimensional feature space via a radial basis function (RBF) kernel, ensuring the absence of local minima on its training loss surface. The latter breaks the high-dimensional nonconvex neural network training problem into a series of low-dimensional convex optimization problems, and allows its prediction uncertainty to be easily assessed. The K-StoNet can be easily trained using the imputation-regularized optimization (IRO) algorithm. Compared to traditional deep neural networks, K-StoNet possesses a theoretical guarantee of asymptotic convergence to the global optimum and allows the prediction uncertainty to be easily assessed. The performance of the new model in training, prediction and uncertainty quantification is illustrated by simulated and real data examples.
CODING & PROGRAMMING
arxiv.org

Exploring Fusion Strategies for Accurate RGBT Visual Object Tracking

Zhangyong Tang (1), Tianyang Xu (1), Hui Li (1), Xiao-Jun Wu (1), Xuefeng Zhu (1), Josef Kittler (2) ((1) Jiangnan University, Wuxi, China, (2) University of Surrey, UK). We address the problem of multi-modal object tracking in video and explore various options for fusing the complementary information conveyed by the visible (RGB) and thermal infrared (TIR) modalities, including pixel-level, feature-level and decision-level fusion. Specifically, unlike existing methods, we follow the paradigm of the image fusion task for fusion at the pixel level. Feature-level fusion is performed by an attention mechanism with channels excited optionally. At the decision level, a novel fusion strategy is put forward, since a simple averaging configuration has already shown its superiority. The effectiveness of the proposed decision-level fusion strategy is due to a number of innovative contributions, including a dynamic weighting of the RGB and TIR contributions and a linear template update operation. A variant of this strategy produced the winning tracker at the Visual Object Tracking Challenge 2020 (VOT-RGBT2020). The concurrent exploration of innovative pixel- and feature-level fusion strategies highlights the advantages of the proposed decision-level fusion method. Extensive experimental results on three challenging datasets, i.e., GTOT, VOT-RGBT2019, and VOT-RGBT2020, demonstrate the effectiveness and robustness of the proposed method, compared to the state-of-the-art approaches. Code will be shared at this https URL.
SOFTWARE
arxiv.org

How Can Graph Neural Networks Help Document Retrieval: A Case Study on CORD19 with Concept Map Generation

Graph neural networks (GNNs), as a group of powerful tools for representation learning on irregular data, have manifested superiority in various downstream tasks. With unstructured texts represented as concept maps, GNNs can be exploited for tasks like document retrieval. Intrigued by how GNNs can help document retrieval, we conduct an empirical study on a large-scale multi-discipline dataset, CORD-19. Results show that instead of the complex structure-oriented GNNs such as GINs and GATs, our proposed semantics-oriented graph functions achieve better and more stable performance based on the BM25 retrieved candidates. Our insights in this case study can serve as a guideline for future work to develop effective GNNs with appropriate semantics-oriented inductive biases for textual reasoning tasks like document retrieval and classification. All code for this case study is available at this https URL.
COMPUTERS
towardsdatascience.com

One more approach to optimize neural networks

Talking about Neural Architecture Search and our own algorithm for optimizing neural network hyperparameters. In the last decade, neural network-based solutions have become extremely popular. At the same time, deep learning is quite a complex field, requiring deep theoretical knowledge from experts. The industry needs quite a lot of these specialists, but there are currently not enough of them to meet the demand. With this gap between supply and demand, special tools are emerging. Let’s call these tools “automated tools”.
COMPUTERS
Nature.com

Imaging without the labels

Advances in label-free microscopy are expanding experimental observations, from biophysics to cell biology and beyond. Interferometric scattering microscopy (iSCAT) (Phys. Rev. Lett. 93, 037401, 2004) and the related technique mass photometry (Science, 360, 423–427, 2018) are approaches that image the light scattered by an object with high sensitivity. iSCAT has frequently been applied to image and track nanoparticles, but recent advances have shown its power for imaging unlabeled structures like viruses and even single proteins. Mass photometry has been used to 'weigh' individual proteins and monitor protein assembly and disassembly, even for protein complexes in lipid bilayers (Nat. Methods 18, 1247–1252, 2021; Nat. Methods 18, 1239–1246, 2021). Advances in iSCAT are improving the detection sensitivity both through optical design (Nat. Commun. 12, 1744, 2021) and through image processing (Opt. Express 29, 11070–11083, 2021). iSCAT imaging is also being combined with features such as polarization (Small Methods 5, 2000985, 2021) to enable new discovery. We think these approaches are poised to transform biophysical observation, especially in vitro.
SCIENCE
arxiv.org

Human Activity Recognition models using Limited Consumer Device Sensors and Machine Learning

Human activity recognition has grown in popularity as its applications in daily life and medical environments have increased. Efficient and reliable human activity recognition brings benefits such as accessible use and better allocation of resources, especially in the medical industry. Activity recognition and classification can be achieved using many sophisticated data recording setups, but there is also a need to observe how performance varies among models that are strictly limited to sensor data from easily accessible devices: smartphones and smartwatches. This paper presents the findings for different models that are limited to training on such sensors. The models are trained using either the k-Nearest Neighbor, Support Vector Machine, or Random Forest classifier algorithms. The models are evaluated by comparing their performance across different combinations of mobile sensors and examining how those combinations affect recognition performance. Results show promise for models trained strictly using limited sensor data collected from only smartphones and smartwatches coupled with traditional machine learning concepts and algorithms.
CELL PHONES
Columbia University

What went wrong in the labeling of those cool graphs of y(t) vs. y'(t)?

Last week we discussed the cool graphs in geographer Danny Dorling’s recent book, “Slow Down.” Here’s an example. Dorling is plotting y(t) vs y'(t), tracing over time with a dot for each year, or every few years. I really like this. But commenter Carlos noticed a...
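To make the y(t) vs. y'(t) construction concrete, here is a minimal sketch that turns a yearly series into the points of such a plot. The teaser does not say how the rate of change is computed in the book; the central difference used here is an assumption, and the series and the function name are made up for illustration.

```python
def phase_points(years, values):
    """Return (year, y, dy/dt) triples for the interior years of a series.

    dy/dt is estimated with a central difference, so the first and last
    years are dropped because they lack a neighbour on one side.
    """
    points = []
    for i in range(1, len(years) - 1):
        slope = (values[i + 1] - values[i - 1]) / (years[i + 1] - years[i - 1])
        points.append((years[i], values[i], slope))
    return points


# Made-up illustrative series, one value per year.
years = [2000, 2001, 2002, 2003, 2004]
values = [10.0, 12.0, 15.0, 17.0, 18.0]

pts = phase_points(years, values)
# Each triple gives one dot: y on one axis, dy/dt on the other, labeled by year.
```

Connecting the dots in year order then traces the trajectory over time. A series whose growth is slowing shows dy/dt shrinking toward zero while y still increases.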
JAPAN
arxiv.org

Training Fair Deep Neural Networks by Balancing Influence

Most fair machine learning methods either rely heavily on the sensitive information of the training samples or require large modifications to the target models, which hinders their practical application. To address this issue, we propose a two-stage training algorithm named FAIRIF. It minimizes the loss over the reweighted data set (second stage) where the sample weights are computed to balance the model performance across different demographic groups (first stage). FAIRIF can be applied to a wide range of models trained by stochastic gradient descent without changing the model, while only requiring group annotations on a small validation set to compute sample weights. Theoretically, we show that, in the classification setting, three notions of disparity among different groups can be mitigated by training with the weights. Experiments on synthetic data sets demonstrate that FAIRIF yields models with better fairness-utility trade-offs against various types of bias; and on real-world data sets, we show the effectiveness and scalability of FAIRIF. Moreover, as evidenced by the experiments with pretrained models, FAIRIF is able to alleviate the unfairness issue of pretrained models without hurting their performance.
CODING & PROGRAMMING
arxiv.org

Incompleteness of graph convolutional neural networks for points clouds in three dimensions

Graph convolutional neural networks (GCNN) are very popular methods in machine learning and have been applied very successfully to the prediction of the properties of molecules and materials. First-order GCNNs are well known to be incomplete, i.e., there exist graphs that are distinct but appear identical when seen through the lens of the GCNN. More complicated schemes have thus been designed to increase their resolving power. Applications to molecules (and more generally, point clouds), however, add a geometric dimension to the problem. The most straightforward and prevalent approach to constructing a graph representation for molecules regards atoms as vertices in a graph and draws a bond between each pair of atoms within a certain preselected cutoff. Bonds can be decorated with the distance between atoms, and the resulting "distance graph convolution NNs" (dGCNN) have empirically demonstrated excellent resolving power and are widely used in chemical ML. Here we show that even for the restricted case of graphs induced by 3D atom clouds dGCNNs are not complete. We construct pairs of distinct point clouds that generate graphs that, for any cutoff radius, are equivalent based on a first-order Weisfeiler-Lehman test. This class of degenerate structures includes chemically-plausible configurations, setting an ultimate limit to the expressive power of some of the well-established GCNN architectures for atomistic machine learning. Models that explicitly use angular information in the description of atomic environments can resolve these degeneracies.
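The first-order Weisfeiler-Lehman test referred to above can be sketched in a few lines. The example below uses the classic textbook pair of plain, undecorated graphs (a six-node ring versus two disjoint triangles). It is not the paper's distance-decorated point-cloud construction, and the function name and graph encodings are illustrative.

```python
from collections import Counter


def wl_histograms(graphs, rounds=3):
    """Run first-order WL colour refinement jointly on several graphs.

    Each graph is a dict mapping a node to its list of neighbours. The
    result is one colour histogram per graph; graphs whose histograms
    differ are certainly non-isomorphic, while equal histograms are
    inconclusive.
    """
    colors = [{node: 0 for node in g} for g in graphs]
    for _ in range(rounds):
        # A node's signature: its own colour plus the sorted colours of its neighbours.
        signatures = [
            {v: (c[v], tuple(sorted(c[u] for u in g[v]))) for v in g}
            for g, c in zip(graphs, colors)
        ]
        # One palette shared by all graphs keeps colours comparable between them.
        distinct = sorted({s for sig in signatures for s in sig.values()})
        palette = {s: i for i, s in enumerate(distinct)}
        colors = [{v: palette[s] for v, s in sig.items()} for sig in signatures]
    return [Counter(c.values()) for c in colors]


hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

h_hex, h_tri, h_path = wl_histograms([hexagon, two_triangles, path])
```

The ring and the two triangles are different graphs, yet every node in both has exactly two neighbours of the same colour, so refinement never separates them. The six-node path is told apart immediately because its end nodes have only one neighbour.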
COMPUTERS
IBM - United States

Digit recognition neural networks in R

Interpreting images has been a popular use case in the field of artificial intelligence (AI), and identification of handwritten digits using neural networks is commonly used in mobile applications. In this tutorial, learn how to create a web application to recognize handwritten digits using neural networks on R in Watson...
CODING & PROGRAMMING
The Independent

Scientists develop four-legged robot that hikes difficult terrain faster than average human

A new control technology has been developed by scientists for a four-legged robot that allowed it to achieve the “effortless” superhuman feat of hiking 120 vertical metres in the Alps in 31 minutes without any falls or missteps. The advance may lead to the development of new robots and other kinds of robotic technology that can be used in terrain too dangerous for humans, said the researchers, including those from ETH Zurich in Switzerland. The ANYmal quadrupedal robot successfully finished the hike – which consisted of steep sections on slippery ground, high steps and forest trails full of roots – four minutes...
ENGINEERING

