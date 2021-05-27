

BSNN: Towards Faster and Better Conversion of Artificial Neural Networks to Spiking Neural Networks with Bistable Neurons

By Yang Li, Yi Zeng, Dongcheng Zhao
arxiv.org
 22 days ago

The spiking neural network (SNN) computes and communicates information through discrete binary events. It is considered more biologically plausible and more energy-efficient than artificial neural networks (ANN) in emerging neuromorphic hardware. However, because spike events are discontinuous and non-differentiable, training an SNN is relatively challenging. Recent work has made substantial progress toward strong performance by converting ANNs to SNNs. Because the two kinds of network process information differently, the converted deep SNN usually suffers serious performance loss and large time delay. In this paper, we analyze the reasons for the performance loss and propose a novel bistable spiking neural network (BSNN) that addresses the problem of spikes of inactivated neurons (SIN) caused by the phase lead and phase lag. Also, when ResNet structure-based ANNs are converted, the information of output neurons is incomplete due to the rapid transmission of the shortcut path. We design synchronous neurons (SN) to help efficiently improve performance. Experimental results show that the proposed method needs only 1/4 to 1/10 of the time steps required by previous work to achieve nearly lossless conversion. We demonstrate state-of-the-art ANN-SNN conversion for VGG16, ResNet20, and ResNet34 on challenging datasets including CIFAR-10 (95.16% top-1), CIFAR-100 (78.12% top-1), and ImageNet (72.64% top-1).
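To make the idea of ANN-to-SNN conversion concrete, here is a minimal sketch of the common rate-coding picture: a plain integrate-and-fire neuron whose firing rate approximates a ReLU activation. This is not the bistable neuron or the synchronous neuron proposed in the paper; the function names, the reset-by-subtraction rule, and the threshold of 1.0 are assumptions chosen for illustration.

```python
def relu(x):
    """Reference ANN activation that the spiking neuron is meant to approximate."""
    return max(0.0, x)


def if_neuron_rate(input_current, threshold=1.0, time_steps=100):
    """Simulate one integrate-and-fire neuron driven by a constant input.

    Hypothetical helper: the membrane potential accumulates the input at
    every time step; when it reaches the threshold the neuron emits a spike
    and the threshold is subtracted (reset by subtraction).
    Returns the firing rate, i.e. spikes per time step.
    """
    membrane = 0.0
    spikes = 0
    for _ in range(time_steps):
        membrane += input_current
        if membrane >= threshold:
            spikes += 1
            membrane -= threshold
    return spikes / time_steps


if __name__ == "__main__":
    # Compare the firing rate with ReLU for a few inputs and simulation lengths.
    for current in (-0.5, 0.25, 0.3):
        for steps in (3, 10, 100):
            rate = if_neuron_rate(current, time_steps=steps)
            print(current, steps, rate, relu(current))
```

With few time steps the firing rate is a coarse quantization of the activation (an input of 0.25 produces no spike at all in three steps), which is one simple way to see why converted networks usually need many time steps and why reducing that number is valuable. Inputs above the threshold saturate at a rate of 1 in this toy model.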

Related
Technology | arxiv.org

Spectral Temporal Graph Neural Network for Trajectory Prediction

An effective understanding of the contextual environment and accurate motion forecasting of surrounding agents is crucial for the development of autonomous vehicles and social mobile robots. This task is challenging since the behavior of an autonomous agent is not only affected by its own intention, but also by the static environment and surrounding dynamically interacting agents. Previous works focused on utilizing the spatial and temporal information in time domain while not sufficiently taking advantage of the cues in frequency domain. To this end, we propose a Spectral Temporal Graph Neural Network (SpecTGNN), which can capture inter-agent correlations and temporal dependency simultaneously in frequency domain in addition to time domain. SpecTGNN operates on both an agent graph with dynamic state information and an environment graph with the features extracted from context images in two streams. The model integrates graph Fourier transform, spectral graph convolution and temporal gated convolution to encode history information and forecast future trajectories. Moreover, we incorporate a multi-head spatio-temporal attention mechanism to mitigate the effect of error propagation in a long time horizon. We demonstrate the performance of SpecTGNN on two public trajectory prediction benchmark datasets, which achieves state-of-the-art performance in terms of prediction accuracy.
Science | arxiv.org

A Physics Informed Neural Network for Time-Dependent Nonlinear and Higher Order Partial Differential Equations

A physics informed neural network (PINN) incorporates the physics of a system by satisfying its boundary value problem through a neural network's loss function. The PINN approach has shown great success in approximating the map between the solution of a partial differential equation (PDE) and its spatio-temporal input. However, for strongly non-linear and higher order partial differential equations PINN's accuracy reduces significantly. To resolve this problem, we propose a novel PINN scheme that solves the PDE sequentially over successive time segments using a single neural network. The key idea is to re-train the same neural network for solving the PDE over successive time segments while satisfying the already obtained solution for all previous time segments. Thus it is named as backward compatible PINN (bc-PINN). To illustrate the advantages of bc-PINN, we have used the Cahn Hilliard and Allen Cahn equations, which are widely used to describe phase separation and reaction diffusion systems. Our results show significant improvement in accuracy over the PINN method while using a smaller number of collocation points. Additionally, we have shown that using the phase space technique for a higher order PDE could further improve the accuracy and efficiency of the bc-PINN scheme.
Software | arxiv.org

Classification of Audio Segments in Call Center Recordings using Convolutional Recurrent Neural Networks

Detailed statistical analysis of call center recordings is critical from the customer relationship management point of view. With the recent advances in artificial intelligence, many tasks regarding the calculation of call statistics are now performed automatically. This work proposes a neural network framework where the aim is to correctly identify audio segments and classify them as either customer or agent sections. Accurately identifying these sections gives a fair metric for evaluating agents' performances. We adopted the convolutional recurrent neural network (CRNN) architecture commonly used for such problems as music genre classification. We also tested the same architecture's performance when the previous class information and the gender information of speakers are added to the training data labels. We saw that CRNN could generalize from the training data and perform well on validation data for this problem with and without the gender information. Moreover, even though the training was performed using Turkish speech samples, the trained network achieved high accuracy for call center recordings in other languages such as German and English.
Science | arxiv.org

SpikePropamine: Differentiable Plasticity in Spiking Neural Networks

The adaptive changes in synaptic efficacy that occur between spiking neurons have been demonstrated to play a critical role in learning for biological neural networks. Despite this source of inspiration, many learning focused applications using Spiking Neural Networks (SNNs) retain static synaptic connections, preventing additional learning after the initial training period. Here, we introduce a framework for simultaneously learning the underlying fixed-weights and the rules governing the dynamics of synaptic plasticity and neuromodulated synaptic plasticity in SNNs through gradient descent. We further demonstrate the capabilities of this framework on a series of challenging benchmarks, learning the parameters of several plasticity rules including BCM, Oja's, and their respective set of neuromodulatory variants. The experimental results display that SNNs augmented with differentiable plasticity are sufficient for solving a set of challenging temporal learning tasks that a traditional SNN fails to solve, even in the presence of significant noise. These networks are also shown to be capable of producing locomotion on a high-dimensional robotic learning task, where near-minimal degradation in performance is observed in the presence of novel conditions not seen during the initial training period.
Software | ai-summary.com

Summary: Fraud and Anomaly Detection with Artificial Neural Networks using Python3 and Tensorflow.

Over the last few years, there has been an increasing trend in demand for the application of anomaly detection models within the field of data science, especially when it comes to the detection of fraudulent vs non-fraudulent actions. Within the following dataset, we will explore the use of a number of different predictive models, each with varying complexity. As with every good data science project, we will first examine the dataset, preprocess our data, explore the contents, train a number of models, and finally review and evaluate the results.
Computers | arxiv.org

GraphMI: Extracting Private Graph Data from Graph Neural Networks

As machine learning becomes more widely used for critical applications, the need to study its privacy implications becomes urgent. Given access to the target model and auxiliary information, the model inversion attack aims to infer sensitive features of the training dataset, which leads to great privacy concerns. Despite its success in grid-like domains, directly applying model inversion techniques to non-grid domains such as graphs achieves poor attack performance, because it is difficult to fully exploit the intrinsic properties of graphs and the attributes of nodes used in Graph Neural Networks (GNN). To bridge this gap, we present the Graph Model Inversion attack (GraphMI), which aims to extract private graph data of the training graph by inverting GNN, one of the state-of-the-art graph analysis tools. Specifically, we first propose a projected gradient module to tackle the discreteness of graph edges while preserving the sparsity and smoothness of graph features. Then we design a graph auto-encoder module to efficiently exploit graph topology, node attributes, and target model parameters for edge inference. With the proposed methods, we study the connection between model inversion risk and edge influence and show that edges with greater influence are more likely to be recovered. Extensive experiments over several public datasets demonstrate the effectiveness of our method. We also show that differential privacy in its canonical form can hardly defend against our attack while preserving decent utility.
Science | arxiv.org

Neural Network Surrogate Models for Absorptivity and Emissivity Spectra of Multiple Elements

By Michael D. Vander Wal (1), Ryan G. McClarren (1), Kelli D. Humbird (2) ((1) University of Notre Dame, (2) Lawrence Livermore National Laboratory)

Simulations of high energy density physics are expensive in terms of computational resources. In particular, the computation of opacities of plasmas, which are needed to accurately compute radiation transport in the non-local thermal equilibrium (NLTE) regime, is expensive to the point of easily requiring multiple times the sum-total compute time of all other components of the simulation. As such, there is great interest in finding ways to accelerate NLTE computations. Previous work has demonstrated that a combination of fully-connected autoencoders and a deep jointly-informed neural network (DJINN) can successfully replace the standard NLTE calculations for the opacity of krypton. This work expands this idea to multiple elements, demonstrating that individual surrogate models can also be generated for other elements, with the focus being on creating autoencoders that can accurately encode and decode the absorptivity and emissivity spectra. Furthermore, this work shows that multiple elements across a large range of atomic numbers can be combined into a single autoencoder when using a convolutional autoencoder while maintaining accuracy that is comparable to individual fully-connected autoencoders. Lastly, it is demonstrated that DJINN can effectively learn the latent space of a convolutional autoencoder that can encode multiple elements, allowing the combination to effectively function as a surrogate model.
Coding & Programming | arxiv.org

Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering

Inferring representations of 3D scenes from 2D observations is a fundamental problem of computer graphics, computer vision, and artificial intelligence. Emerging 3D-structured neural scene representations are a promising approach to 3D scene understanding. In this work, we propose a novel neural scene representation, Light Field Networks or LFNs, which represent both geometry and appearance of the underlying 3D scene in a 360-degree, four-dimensional light field parameterized via a neural implicit representation. Rendering a ray from an LFN requires only a *single* network evaluation, as opposed to hundreds of evaluations per ray for ray-marching or volumetric based renderers in 3D-structured neural scene representations. In the setting of simple scenes, we leverage meta-learning to learn a prior over LFNs that enables multi-view consistent light field reconstruction from as little as a single image observation. This results in dramatic reductions in time and memory complexity, and enables real-time rendering. The cost of storing a 360-degree light field via an LFN is two orders of magnitude lower than conventional methods such as the Lumigraph. Utilizing the analytical differentiability of neural implicit representations and a novel parameterization of light space, we further demonstrate the extraction of sparse depth maps from LFNs.
Science | arxiv.org

Hierarchical Temperature Imaging Using Pseudo-Inversed Convolutional Neural Network Aided TDLAS Tomography

As an in situ combustion diagnostic tool, Tunable Diode Laser Absorption Spectroscopy (TDLAS) tomography has been widely used for imaging of two-dimensional temperature distributions in reactive flows. Compared with the computational tomographic algorithms, Convolutional Neural Networks (CNNs) have been shown to be more robust and accurate for image reconstruction, particularly in cases of limited laser-beam access to the Region of Interest (RoI). In practice, the flame in the RoI that needs to be reconstructed with good spatial resolution is commonly surrounded by a low-temperature background. Although the background is not of high interest, spectroscopic absorption still exists there due to heat dissipation and gas convection. Therefore, we propose a Pseudo-Inversed CNN (PI-CNN) for hierarchical temperature imaging that (a) efficiently uses the training and learning resources for temperature imaging in the RoI with good spatial resolution, and (b) reconstructs the less spatially resolved background temperature by adequately addressing the integrity of the spectroscopic absorption model. In comparison with the traditional CNN, the newly introduced pseudo inversion of the RoI sensitivity matrix is more penetrating for revealing the inherent correlation between the projection data and the RoI to be reconstructed, thus prioritising the temperature imaging in the RoI with high accuracy and high computational efficiency. In this paper, the proposed algorithm was validated by both numerical simulation and lab-scale experiment, indicating good agreement between the phantoms and the high-fidelity reconstructions.
Computers | arxiv.org

Predify: Augmenting deep neural networks with brain-inspired predictive coding dynamics

Deep neural networks excel at image classification, but their performance is far less robust to input perturbations than human perception. In this work we explore whether this shortcoming may be partly addressed by incorporating brain-inspired recurrent dynamics in deep convolutional networks. We take inspiration from a popular framework in neuroscience: 'predictive coding'. At each layer of the hierarchical model, generative feedback 'predicts' (i.e., reconstructs) the pattern of activity in the previous layer. The reconstruction errors are used to iteratively update the network's representations across timesteps, and to optimize the network's feedback weights over the natural image dataset, a form of unsupervised training. We show that implementing this strategy into two popular networks, VGG16 and EfficientNetB0, improves their robustness against various corruptions. We hypothesize that other feedforward networks could similarly benefit from the proposed framework. To promote research in this direction, we provide an open-sourced PyTorch-based package called Predify, which can be used to implement and investigate the impacts of the predictive coding dynamics in any convolutional neural network.
Coding & Programming | towardsdatascience.com

Building Neural Network in Swift using Metal shaders

Using the Metal Performance Shaders framework to build a neural network. In the previous article, I implemented a neural network framework from scratch. It supports CPU multi-threading, but doesn’t support GPU computations. In this article I’ll implement a similar framework, but using Metal Performance Shaders. WWDC19 session 614 inspired me to write this article.
Coding & Programming | arxiv.org

SpreadGNN: Serverless Multi-task Federated Learning for Graph Neural Networks

Graph Neural Networks (GNNs) are the first choice methods for graph machine learning problems thanks to their ability to learn state-of-the-art level representations from graph-structured data. However, centralizing a massive amount of real-world graph data for GNN training is prohibitive due to user-side privacy concerns, regulation restrictions, and commercial competition. Federated Learning is the de-facto standard for collaborative training of machine learning models over many distributed edge devices without the need for centralization. Nevertheless, training graph neural networks in a federated setting is vaguely defined and brings statistical and systems challenges. This work proposes SpreadGNN, a novel multi-task federated training framework capable of operating in the presence of partial labels and absence of a central server for the first time in the literature. SpreadGNN extends federated multi-task learning to realistic serverless settings for GNNs, and utilizes a novel optimization algorithm with a convergence guarantee, Decentralized Periodic Averaging SGD (DPA-SGD), to solve decentralized multi-task learning problems. We empirically demonstrate the efficacy of our framework on a variety of non-I.I.D. distributed graph-level molecular property prediction datasets with partial labels. Our results show that SpreadGNN outperforms GNN models trained over a central server-dependent federated learning system, even in constrained topologies. The source code is publicly available at this https URL.
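The serverless setting described above can be illustrated with a toy sketch of decentralized periodic averaging: each worker takes a few local gradient steps and then averages its parameter with its neighbours, with no central server involved. This is not the DPA-SGD algorithm from the paper. The scalar quadratic losses, the ring topology, the uniform 1/3 averaging weights, and all function names are assumptions made for illustration.

```python
def local_step(weight, target, lr):
    """One gradient step on the local loss 0.5 * (weight - target) ** 2."""
    return weight - lr * (weight - target)


def ring_average(weights):
    """Each worker averages its value with its two ring neighbours.

    The mixing is doubly stochastic, so the mean over workers is preserved.
    """
    n = len(weights)
    return [
        (weights[(i - 1) % n] + weights[i] + weights[(i + 1) % n]) / 3.0
        for i in range(n)
    ]


def decentralized_periodic_averaging(targets, rounds=200, local_steps=5, lr=0.1):
    """Alternate local training with neighbour averaging (hypothetical toy setup).

    targets[i] is the minimizer of worker i's private loss; the workers
    never send data to a server, only parameters to their ring neighbours.
    """
    weights = [0.0] * len(targets)
    for _ in range(rounds):
        for _ in range(local_steps):
            weights = [local_step(w, t, lr) for w, t in zip(weights, targets)]
        weights = ring_average(weights)
    return weights


if __name__ == "__main__":
    targets = [1.0, 2.0, 3.0, 6.0]  # stands in for non-I.I.D. local data
    print(decentralized_periodic_averaging(targets))
```

In this toy model the average of the workers' parameters converges to the minimizer of the average loss (the mean of the targets), while individual workers stay within a small neighbourhood of it. How closely they agree depends on the topology, the learning rate, and the number of local steps between averaging rounds.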
Coding & Programming | arxiv.org

Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks

Neural network compression techniques have become increasingly popular as they can drastically reduce the storage and computation requirements for very large networks. Recent empirical studies have illustrated that even simple pruning strategies can be surprisingly effective, and several theoretical studies have shown that compressible networks (in specific senses) should achieve a low generalization error. Yet, a theoretical characterization of the underlying cause that makes the networks amenable to such simple compression schemes is still missing. In this study, we address this fundamental question and reveal that the dynamics of the training algorithm has a key role in obtaining such compressible networks. Focusing our attention on stochastic gradient descent (SGD), our main contribution is to link compressibility to two recently established properties of SGD: (i) as the network size goes to infinity, the system can converge to a mean-field limit, where the network weights behave independently, (ii) for a large step-size/batch-size ratio, the SGD iterates can converge to a heavy-tailed stationary distribution. In the case where these two phenomena occur simultaneously, we prove that the networks are guaranteed to be '$\ell_p$-compressible', and the compression errors of different pruning techniques (magnitude, singular value, or node pruning) become arbitrarily small as the network size increases. We further prove generalization bounds adapted to our theoretical framework, which indeed confirm that the generalization error will be lower for more compressible networks. Our theory and numerical study on various neural networks show that large step-size/batch-size ratios introduce heavy-tails, which, in combination with overparametrization, result in compressibility.
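The link between heavy tails and compressibility can be seen in a small numerical sketch. It applies magnitude pruning to two weight vectors and measures the relative L2 error: one vector decays like a power law, a deterministic stand-in for heavy-tailed weights, and the other has all entries equal. This only illustrates the intuition and is not the paper's theorem or experimental setup; the function names and the specific vectors are assumptions.

```python
def magnitude_prune(weights, keep_ratio):
    """Keep the largest-magnitude fraction of the weights and zero the rest."""
    k = max(1, int(round(keep_ratio * len(weights))))
    # Magnitude of the k-th largest entry acts as the pruning threshold.
    threshold = sorted((abs(w) for w in weights), reverse=True)[k - 1]
    pruned = []
    kept = 0
    for w in weights:
        # The 'kept < k' guard breaks ties so that exactly k entries survive.
        if abs(w) >= threshold and kept < k:
            pruned.append(w)
            kept += 1
        else:
            pruned.append(0.0)
    return pruned


def relative_error(weights, pruned):
    """Relative L2 compression error ||w - pruned|| / ||w||."""
    num = sum((w - p) ** 2 for w, p in zip(weights, pruned)) ** 0.5
    den = sum(w ** 2 for w in weights) ** 0.5
    return num / den


if __name__ == "__main__":
    heavy = [1.0 / (i + 1) ** 2 for i in range(100)]  # a few large, many tiny
    flat = [1.0] * 100                                # all equally important
    for name, vec in (("heavy", heavy), ("flat", flat)):
        print(name, relative_error(vec, magnitude_prune(vec, 0.1)))
```

Keeping 10% of the entries loses very little of the power-law vector's norm but most of the flat vector's norm, which is the sense in which a few dominant weights make a network amenable to simple pruning.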
Computers | arxiv.org

Convolutional Neural Networks with Gated Recurrent Connections

The convolutional neural network (CNN) has become a basic model for solving many computer vision problems. In recent years, a new class of CNNs, recurrent convolution neural network (RCNN), inspired by abundant recurrent connections in the visual systems of animals, was proposed. The critical element of RCNN is the recurrent convolutional layer (RCL), which incorporates recurrent connections between neurons in the standard convolutional layer. With increasing number of recurrent computations, the receptive fields (RFs) of neurons in RCL expand unboundedly, which is inconsistent with biological facts. We propose to modulate the RFs of neurons by introducing gates to the recurrent connections. The gates control the amount of context information inputting to the neurons and the neurons' RFs therefore become adaptive. The resulting layer is called gated recurrent convolution layer (GRCL). Multiple GRCLs constitute a deep model called gated RCNN (GRCNN). The GRCNN was evaluated on several computer vision tasks including object recognition, scene text recognition and object detection, and obtained much better results than the RCNN. In addition, when combined with other adaptive RF techniques, the GRCNN demonstrated competitive performance to the state-of-the-art models on benchmark datasets for these tasks. The codes are released at this https URL.
Computers | arxiv.org

Feature Flow Regularization: Improving Structured Sparsity in Deep Neural Networks

Pruning is a model compression method that removes redundant parameters in deep neural networks (DNNs) while maintaining accuracy. Most available filter pruning methods require complex treatments such as iterative pruning, features statistics/ranking, or additional optimization designs in the training process. In this paper, we propose a simple and effective regularization strategy from a new perspective of evolution of features, which we call feature flow regularization (FFR), for improving structured sparsity and filter pruning in DNNs. Specifically, FFR imposes controls on the gradient and curvature of feature flow along the neural network, which implicitly increases the sparsity of the parameters. The principle behind FFR is that coherent and smooth evolution of features will lead to an efficient network that avoids redundant parameters. The high structured sparsity obtained from FFR enables us to prune filters effectively. Experiments with VGGNets, ResNets on CIFAR-10/100, and Tiny ImageNet datasets demonstrate that FFR can significantly improve both unstructured and structured sparsity. Our pruning results in terms of reduction of parameters and FLOPs are comparable to or even better than those of state-of-the-art pruning methods.
Coding & Programming | arxiv.org

Adam in Private: Secure and Fast Training of Deep Neural Networks with Adaptive Moment Estimation

By Nuttapong Attrapadung, Koki Hamada, Dai Ikarashi, Ryo Kikuchi, Takahiro Matsuda, Ibuki Mishina, Hiraku Morita, Jacob C. N. Schuldt

Privacy-preserving machine learning (PPML) aims at enabling machine learning (ML) algorithms to be used on sensitive data. We contribute to this line of research by proposing a framework that allows efficient and secure evaluation of full-fledged state-of-the-art ML algorithms via secure multi-party computation (MPC). This is in contrast to most prior works, which substitute ML algorithms with approximated "MPC-friendly" variants. A drawback of the latter approach is that fine-tuning of the combined ML and MPC algorithms is required, which might lead to less efficient algorithms or inferior quality ML. This is an issue for secure deep neural networks (DNN) training in particular, as this involves arithmetic algorithms thought to be "MPC-unfriendly", namely, integer division, exponentiation, inversion, and square root. In this work, we propose secure and efficient protocols for the above seemingly MPC-unfriendly computations. Our protocols are three-party protocols in the honest-majority setting, and we propose both passively secure and actively secure with abort variants. A notable feature of our protocols is that they simultaneously provide high accuracy and efficiency. This framework enables us to efficiently and securely compute modern ML algorithms such as Adam and the softmax function "as is", without resorting to approximations. As a result, we obtain secure DNN training that outperforms state-of-the-art three-party systems; our full training is up to 6.7 times faster than just the online phase of the recently proposed FALCON@PETS'21 on a standard benchmark network. We further perform measurements on real-world DNNs, AlexNet and VGG16. The performance of our framework is up to a factor of about 12-14 faster for AlexNet and 46-48 faster for VGG16 to achieve an accuracy of 70% and 75%, respectively, when compared to FALCON.
Astronomy | arxiv.org

SCONE: Supernova Classification with a Convolutional Neural Network

We present a novel method of classifying Type Ia supernovae using convolutional neural networks, a neural network framework typically used for image recognition. Our model is trained on photometric information only, eliminating the need for accurate redshift data. Photometric data is pre-processed via 2D Gaussian process regression into two-dimensional images created from flux values at each location in wavelength-time space. These "flux heatmaps" of each supernova detection, along with "uncertainty heatmaps" of the Gaussian process uncertainty, constitute the dataset for our model. This preprocessing step not only smooths over irregular sampling rates between filters but also allows SCONE to be independent of the filter set on which it was trained. Our model has achieved impressive performance without redshift on the in-distribution SNIa classification problem: $99.73 \pm 0.26$% test accuracy with no over/underfitting on a subset of supernovae from PLAsTiCC's unblinded test dataset. We have also achieved $98.18 \pm 0.3$% test accuracy performing 6-way classification of supernovae by type. The out-of-distribution performance does not fully match the in-distribution results, suggesting that the detailed characteristics of the training sample in comparison to the test sample have a big impact on the performance. We discuss the implication and directions for future work. All of the data processing and model code developed for this paper can be found in the SCONE software package located at this http URL.
Computers | techxplore.com

A bio-inspired technique to mitigate catastrophic forgetting in binarized neural networks

Deep neural networks have achieved highly promising results on several tasks, including image and text classification. Nonetheless, many of these computational methods are prone to what is known as catastrophic forgetting, which essentially means that when they are trained on a new task, they tend to rapidly forget how to complete tasks they were trained to complete in the past.
Computers | arxiv.org

Si microring resonator crossbar array for on-chip inference and training of optical neural network

By Shuhei Ohno, Kasidit Toprasertpong, Shinichi Takagi, Mitsuru Takenaka

Deep learning is one of the most rapidly advancing technologies in various fields. Facing the limits of the current electronics platform, optical neural networks (ONNs) based on Si programmable photonic integrated circuits (PICs) have attracted considerable attention as a novel deep learning scheme with optical-domain matrix-vector multiplication (MVM). However, most of the proposed Si programmable PICs for ONNs have several drawbacks such as low scalability, high power consumption, and lack of frameworks for training. To address these issues, we have proposed a microring resonator (MRR) crossbar array as a Si programmable PIC for an ONN. In this article, we present a prototype of a fully integrated 4 × 4 MRR crossbar array and demonstrate a simple MVM and classification task. Moreover, we propose on-chip backpropagation using the transpose matrix operation of the MRR crossbar array, enabling the on-chip training of the ONN. The proposed ONN scheme can establish a scalable, power-efficient deep learning accelerator for applications in both inference and training tasks.
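The role of the transpose operation in backpropagation can be shown with a purely numerical sketch of one linear layer: the forward pass is a matrix-vector multiplication, and the error is propagated backward by multiplying with the transposed weight matrix. This models only the arithmetic, not the optical device. The function names, the squared-error loss, and the learning rate are assumptions for illustration.

```python
def matvec(matrix, vector):
    """Matrix-vector multiplication, the operation a crossbar array performs."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Swap rows and columns, i.e. drive the crossbar from the other side."""
    return [list(col) for col in zip(*matrix)]


def train_step(weights, x, target, lr=0.1):
    """One gradient step for y = W x with loss 0.5 * ||y - target||^2.

    Returns the updated weights, the gradient with respect to the input
    (what would be passed to an earlier layer), and the forward output.
    """
    y = matvec(weights, x)                           # forward pass: W x
    error = [yi - ti for yi, ti in zip(y, target)]   # dLoss/dy
    input_grad = matvec(transpose(weights), error)   # backward pass: W^T error
    new_weights = [
        [w - lr * e * xj for w, xj in zip(row, x)]   # dLoss/dW = error x^T
        for row, e in zip(weights, error)
    ]
    return new_weights, input_grad, y


if __name__ == "__main__":
    weights = [[1.0, 2.0], [3.0, 4.0]]
    print(train_step(weights, [1.0, 1.0], [2.0, 7.0]))
```

The same weight matrix serves both directions, which is why a crossbar that supports the transposed operation can in principle carry out both inference and the backward pass of training.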
Computers | Posted by HackerNoon

Neural Networks and Deep Learning

Neural networks are a hot topic in the technology industry today, partially because they make a cameo appearance in many everyday devices. From your phone’s camera to an Alexa to even a toothbrush - companies and organizations are jumping on the AI hype train. Whether some of these are appropriate...