
HEAM: High-Efficiency Approximate Multiplier Optimization for Deep Neural Networks

By Su Zheng, Zhen Li, Yao Lu, Jingbo Gao, Jide Zhang, Lingli Wang
arxiv.org
 4 days ago

Deep neural networks (DNNs) are widely applied to artificial intelligence applications, achieving promising performance at the cost of massive computation, large power consumption, and high latency. Diverse solutions have been proposed to cope with the challenge of latency and power consumption, including light-weight neural networks and efficient hardware accelerators....


Related
towardsdatascience.com

Facial Expression Recognition (FER) without Artificial Neural Networks

When it comes to talking about Machine Learning, it’s clear that it is the science (and art) of programming computers that learn from data [1]. However, this definition raises some questions, and the first one is: data? Excel spreadsheets? The first thing people think (or at least that’s the...
COMPUTERS
IBM - United States

Digit recognition neural networks in R

Interpreting images has been a popular use case in the field of artificial intelligence (AI), and identification of handwritten digits using neural networks is commonly used in mobile applications. In this tutorial, learn how to create a web application to recognize handwritten digits using neural networks on R in Watson...
CODING & PROGRAMMING
arxiv.org

Disentangled Graph Neural Networks for Session-based Recommendation

Session-based recommendation (SBR) has drawn increasing research attention in recent years, due to its great practical value in exploiting only the limited user behavior history in the current session. Existing methods typically learn the session embedding at the item level, namely, aggregating the embeddings of items with or without the attention weights assigned to items. However, they ignore the fact that a user's intent on adopting an item is driven by certain factors of the item (e.g., the leading actors of a movie). In other words, they have not explored finer-granularity interests of users at the factor level to generate the session embedding, leading to sub-optimal performance. To address the problem, we propose a novel method called Disentangled Graph Neural Network (Disen-GNN) to capture the session purpose with the consideration of factor-level attention on each item. Specifically, we first employ the disentangled learning technique to cast item embeddings into the embedding of multiple factors, and then use the gated graph neural network (GGNN) to learn the embedding factor-wise based on the item adjacent similarity matrix computed for each factor. Moreover, the distance correlation is adopted to enhance the independence between each pair of factors. After representing each item with independent factors, an attention mechanism is designed to learn the user's intent toward different factors of each item in the session. The session embedding is then generated by aggregating the item embeddings with attention weights of each item's factors. In this way, our model takes user intents at the factor level into account to infer the user purpose in a session. Extensive experiments on three benchmark datasets demonstrate the superiority of our method over existing methods.
COMPUTERS
arxiv.org

SnapFuzz: An Efficient Fuzzing Framework for Network Applications

In recent years, fuzz testing has benefited from increased computational power and important algorithmic advances, leading to systems that have discovered many critical bugs and vulnerabilities in production software. Despite these successes, not all applications can be fuzzed efficiently. In particular, stateful applications such as network protocol implementations are constrained by their low fuzzing throughput and the need to develop fuzzing harnesses that reset their state and isolate their side effects. In this paper, we present SnapFuzz, a novel fuzzing framework for network applications. SnapFuzz offers a robust architecture that transforms slow asynchronous network communication into fast synchronous communication based on UNIX domain sockets, speeds up all file operations by redirecting them to an in-memory filesystem, and removes the need for many fragile modifications, such as configuring time delays or writing cleanup scripts, together with several other improvements. Using SnapFuzz, we fuzzed five popular networking applications: LightFTP, Dnsmasq, LIVE555, TinyDTLS and Dcmqrscp. We report impressive performance speedups of 72.4x, 49.7x, 24.8x, 23.9x, and 8.5x, respectively, with significantly simpler fuzzing harnesses in all cases. Through its performance advantage, SnapFuzz has also found 12 previously-unknown crashes in these applications.
SOFTWARE
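
The SnapFuzz abstract above mentions replacing slow asynchronous network communication with fast synchronous communication over UNIX domain sockets. The sketch below only illustrates that transport idea using Python's standard library; it is not SnapFuzz's implementation, and the function name `echo_once_over_unix_socket` and the uppercase "server" behaviour are hypothetical stand-ins for a real protocol handler.

```python
import socket


def echo_once_over_unix_socket(payload: bytes) -> bytes:
    """Send one request and read one reply over a connected socket pair.

    socket.socketpair() returns two connected sockets; on Unix-like
    systems the default family is AF_UNIX, so no TCP stack, port, or
    time delay is involved.
    """
    client, server = socket.socketpair()
    try:
        # "Fuzzer" side: deliver the test input synchronously.
        client.sendall(payload)
        # "Target" side (hypothetical handler): read and answer in-process.
        request = server.recv(4096)
        server.sendall(request.upper())
        # "Fuzzer" side: the reply is available as soon as it is written.
        return client.recv(4096)
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(echo_once_over_unix_socket(b"user anonymous"))
```

Both ends live in one process here purely to keep the example short; a real harness would put the target application on the other end of the socket.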
Nature.com

Connecting reservoir computing with statistical forecasting and deep neural networks

Among the existing machine learning frameworks, reservoir computing demonstrates fast and low-cost training, and its suitability for implementation in various physical systems. This Comment reports on how aspects of reservoir computing can be applied to classical forecasting methods to accelerate the learning process, and highlights a new approach that makes the hardware implementation of traditional machine learning algorithms practicable in electronic and photonic systems.
CODING & PROGRAMMING
arxiv.org

Energy Efficiency for Proactive Eavesdropping in Cooperative Cognitive Radio Networks

This paper investigates a distant proactive eavesdropping system in cooperative cognitive radio (CR) networks. Specifically, an amplify-and-forward (AF) full-duplex (FD) secondary transmitter helps relay the received signal from suspicious users to a legitimate monitor for wireless information surveillance. In return, the secondary transmitter is permitted to share the spectrum belonging to the suspicious users for its own information transmission. To improve the eavesdropping, the transmitted secondary user's signal can also be used as a jamming signal to moderate the data rate of the suspicious link. We consider two cases, i.e., non-negligible processing delay (NNPD) and negligible processing delay (NPD) at the secondary transmitter. Our target is to maximize network energy efficiency (NEE) by jointly optimizing the AF relay matrix and precoding vector at the secondary transmitter, as well as the receiver combining vector at the monitor, subject to the maximum power constraint at the secondary transmitter and the minimum data rate requirement of the secondary user. We also guarantee that the achievable data rate of the eavesdropping link is no less than that of the suspicious link for efficient surveillance. Due to the non-convexity of the formulated NEE maximization problem, we develop an efficient path-following algorithm and a robust alternating optimization (AO) method as solutions under perfect and imperfect channel state information (CSI) conditions, respectively. We also analyze the convergence and computational complexity of the proposed schemes. Numerical results are provided to validate the effectiveness of our proposed schemes.
arxiv.org

Towards Lightweight Neural Animation : Exploration of Neural Network Pruning in Mixture of Experts-based Animation Models

In the past few years, neural character animation has emerged and offered an automatic method for animating virtual characters. Their motion is synthesized by a neural network. Controlling this movement in real time with a user-defined control signal is also an important task, for example in video games. Solutions based on fully-connected layers (MLPs) and Mixture-of-Experts (MoE) have given impressive results in generating and controlling various movements with close-range interactions between the environment and the virtual character. However, a major shortcoming of fully-connected layers is their computational and memory cost, which may lead to sub-optimal solutions. In this work, we apply pruning algorithms to compress an MLP-MoE neural network in the context of interactive character animation, which reduces its number of parameters and accelerates its computation time with a trade-off between this acceleration and the synthesized motion quality. This work demonstrates that, with the same number of experts and parameters, the pruned model produces fewer motion artifacts than the dense model and the learned high-level motion features are similar for both.
CODING & PROGRAMMING
arxiv.org

Systematic biases when using deep neural networks for annotating large catalogs of astronomical images

Deep convolutional neural networks (DCNNs) have become the most common solution for automatic image annotation due to their non-parametric nature, good performance, and their accessibility through libraries such as TensorFlow. Among other fields, DCNNs are also a common approach to the annotation of large astronomical image databases acquired by digital sky surveys. One of the main downsides of DCNNs is the complex non-intuitive rules that make DCNNs act as a "black box", providing annotations in a manner that is unclear to the user. Therefore, the user is often not able to know what information is used by the DCNNs for the classification. Here we demonstrate that the training of a DCNN is sensitive to the context of the training data such as the location of the objects in the sky. We show that for basic classification of elliptical and spiral galaxies, the sky location of the galaxies used for training affects the behavior of the algorithm, and leads to a small but consistent and statistically significant bias. That bias exhibits itself in the form of cosmological-scale anisotropy in the distribution of basic galaxy morphology. Therefore, while DCNNs are powerful tools for annotating images of extended sources, the construction of training sets for galaxy morphology should take into consideration more aspects than the visual appearance of the object. In any case, catalogs created with deep neural networks that exhibit signs of cosmological anisotropy should be interpreted with the possibility of consistent bias.
SCIENCE
arxiv.org

NSGZero: Efficiently Learning Non-Exploitable Policy in Large-Scale Network Security Games with Neural Monte Carlo Tree Search

How resources are deployed to secure critical targets in networks can be modelled by Network Security Games (NSGs). While recent advances in deep learning (DL) provide a powerful approach to dealing with large-scale NSGs, DL methods such as NSG-NFSP suffer from the problem of data inefficiency. Furthermore, due to centralized control, they cannot scale to scenarios with a large number of resources. In this paper, we propose a novel DL-based method, NSGZero, to learn a non-exploitable policy in NSGs. NSGZero improves data efficiency by performing planning with neural Monte Carlo Tree Search (MCTS). Our main contributions are threefold. First, we design deep neural networks (DNNs) to perform neural MCTS in NSGs. Second, we enable neural MCTS with decentralized control, making NSGZero applicable to NSGs with many resources. Third, we provide an efficient learning paradigm, to achieve joint training of the DNNs in NSGZero. Compared to state-of-the-art algorithms, our method achieves significantly better data efficiency and scalability.
CODING & PROGRAMMING
arxiv.org

An Efficient Algorithm for Generating Directed Networks with Predetermined Assortativity Measures

Assortativity coefficients are important metrics to analyze both directed and undirected networks. In general, it is not guaranteed that the fitted model will always agree with the assortativity coefficients in the given network, and the structure of directed networks is more complicated than that of undirected ones. Therefore, we provide a remedy by proposing a degree-preserving rewiring algorithm, called DiDPR, for generating directed networks with given directed assortativity coefficients. We construct the joint edge distribution of the target network by accounting for the four directed assortativity coefficients simultaneously, provided that they are attainable, and obtain the desired network by solving a convex optimization problem. Our algorithm also helps check the attainability of the given assortativity coefficients. We assess the performance of the proposed algorithm by simulation studies with a focus on two different network models, namely Erdős-Rényi and preferential attachment random networks. We then apply the algorithm to a Facebook wall post network as a real data example. The code for implementing our algorithm is publicly available in the R package wdnet.
COMPUTERS
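
The "degree-preserving rewiring" named in the abstract above can be illustrated with a generic edge swap. This is a minimal sketch of the general technique only: it picks swaps uniformly at random and does not implement DiDPR's rule for steering the network toward target assortativity coefficients. The function names `rewire_once` and `degrees` are hypothetical, and the input is assumed to be a simple directed graph (no self-loops, no duplicate edges).

```python
import random


def rewire_once(edges, rng):
    """Attempt one swap: (a->b, c->d) becomes (a->d, c->b).

    Every node keeps its out-degree and in-degree, because a and c still
    each emit one edge and b and d still each receive one.
    Returns True if the swap was applied, False if it was rejected.
    """
    i, j = rng.sample(range(len(edges)), 2)
    (a, b), (c, d) = edges[i], edges[j]
    existing = set(edges)
    # Reject swaps that would create a self-loop or a duplicate edge.
    if a == d or c == b or (a, d) in existing or (c, b) in existing:
        return False
    edges[i], edges[j] = (a, d), (c, b)
    return True


def degrees(edges):
    """Return (out_degree, in_degree) dictionaries for a directed edge list."""
    out_deg, in_deg = {}, {}
    for u, v in edges:
        out_deg[u] = out_deg.get(u, 0) + 1
        in_deg[v] = in_deg.get(v, 0) + 1
    return out_deg, in_deg


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    before = degrees(edges)
    rng = random.Random(0)
    for _ in range(50):
        rewire_once(edges, rng)
    print(degrees(edges) == before)
```

Because each accepted swap only exchanges edge targets, the degree sequences are invariant while the pairing of sources and targets, and hence the assortativity, can change.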
towardsdatascience.com

Visualizing Backpropagation in Neural Network Training at Any Scale

Using HiPlot to generate parallel coordinate plots to visualize deep learning model training. Understanding and debugging a Neural Network’s performance on a dataset is a critical chapter in the end-to-end lifecycle of a Machine Learning (ML) model. Having the ability to comprehend how a model is training can provide valuable insight into where improvements can be made. In this article, we will walk through creating a simple, yet effective, method of visualizing a process called backpropagation during Neural Network training. The visualization technique we will be using is called parallel coordinate plots. This is generally a technique used to visualize many different features with varying units or types from multiple data points. Below is an outline of the rest of this article:
CODING & PROGRAMMING
arxiv.org

Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics

Efficient model selection for identifying a suitable pre-trained neural network to a downstream task is a fundamental yet challenging task in deep learning. Current practice requires expensive computational costs in model training for performance prediction. In this paper, we propose a novel framework for neural network selection by analyzing the governing dynamics over synaptic connections (edges) during training. Our framework is built on the fact that back-propagation during neural network training is equivalent to the dynamical evolution of synaptic connections. Therefore, a converged neural network is associated with an equilibrium state of a networked system composed of those edges. To this end, we construct a network mapping $\phi$, converting a neural network $G_A$ to a directed line graph $G_B$ that is defined on those edges in $G_A$. Next, we derive a neural capacitance metric $\beta_{\rm eff}$ as a predictive measure universally capturing the generalization capability of $G_A$ on the downstream task using only a handful of early training results. We carried out extensive experiments using 17 popular pre-trained ImageNet models and five benchmark datasets, including CIFAR10, CIFAR100, SVHN, Fashion MNIST and Birds, to evaluate the fine-tuning performance of our framework. Our neural capacitance metric is shown to be a powerful indicator for model selection based only on early training results and is more efficient than state-of-the-art methods.
SCIENCE
arxiv.org

Automatic Sparse Connectivity Learning for Neural Networks

Since sparse neural networks usually contain many zero weights, these unnecessary network connections can potentially be eliminated without degrading network performance. Therefore, well-designed sparse neural networks have the potential to significantly reduce FLOPs and computational resources. In this work, we propose a new automatic pruning method - Sparse Connectivity Learning (SCL). Specifically, a weight is re-parameterized as an element-wise multiplication of a trainable weight variable and a binary mask. Thus, network connectivity is fully described by the binary mask, which is modulated by a unit step function. We theoretically prove the fundamental principle of using a straight-through estimator (STE) for network pruning. This principle is that the proxy gradients of STE should be positive, ensuring that mask variables converge at their minima. After finding Leaky ReLU, Softplus, and Identity STEs can satisfy this principle, we propose to adopt Identity STE in SCL for discrete mask relaxation. We find that mask gradients of different features are very unbalanced, hence, we propose to normalize mask gradients of each feature to optimize mask variable training. In order to automatically train sparse masks, we include the total number of network connections as a regularization term in our objective function. As SCL does not require pruning criteria or hyper-parameters defined by designers for network layers, the network is explored in a larger hypothesis space to achieve optimized sparse connectivity for the best performance. SCL overcomes the limitations of existing automatic pruning methods. Experimental results demonstrate that SCL can automatically learn and select important network connections for various baseline network structures. Deep learning models trained by SCL outperform the SOTA human-designed and automatic pruning methods in sparsity, accuracy, and FLOPs reduction.
CODING & PROGRAMMING
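
The re-parameterization described in the Sparse Connectivity Learning abstract above (a weight as the element-wise product of a trainable weight variable and a binary mask produced by a unit step function, trained through an identity straight-through estimator) can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the authors' code: the function names, the list-based tensors, and the example values are all hypothetical, and gradient normalization and the connection-count regularizer are left out.

```python
def unit_step(x):
    """Unit step: 1.0 where the mask variable is positive, otherwise 0.0."""
    return 1.0 if x > 0 else 0.0


def effective_weights(weights, mask_vars):
    """Element-wise product of trainable weights and their binary masks."""
    return [w * unit_step(m) for w, m in zip(weights, mask_vars)]


def mask_gradients_identity_ste(weights, upstream_grads):
    """Proxy gradient of the loss with respect to each mask variable.

    The true derivative of the unit step is zero almost everywhere, so an
    identity straight-through estimator substitutes 1 for it. The chain
    rule then gives upstream_grad * weight for each mask variable.
    """
    return [g * w for g, w in zip(upstream_grads, weights)]


def num_connections(mask_vars):
    """Count the connections the current masks keep active."""
    return sum(int(unit_step(m)) for m in mask_vars)


if __name__ == "__main__":
    weights = [0.5, -1.2, 0.8]
    mask_vars = [0.3, -0.1, 2.0]  # the second connection is pruned
    print(effective_weights(weights, mask_vars))
    print(num_connections(mask_vars))
```

In this toy, a connection is pruned whenever its mask variable drops to zero or below, so connectivity is fully determined by the sign of the mask variables.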
arxiv.org

Training Fair Deep Neural Networks by Balancing Influence

Most fair machine learning methods either highly rely on the sensitive information of the training samples or require a large modification on the target models, which hinders their practical application. To address this issue, we propose a two-stage training algorithm named FAIRIF. It minimizes the loss over the reweighted data set (second stage) where the sample weights are computed to balance the model performance across different demographic groups (first stage). FAIRIF can be applied on a wide range of models trained by stochastic gradient descent without changing the model, while only requiring group annotations on a small validation set to compute sample weights. Theoretically, we show that, in the classification setting, three notions of disparity among different groups can be mitigated by training with the weights. Experiments on synthetic data sets demonstrate that FAIRIF yields models with better fairness-utility trade-offs against various types of bias; and on real-world data sets, we show the effectiveness and scalability of FAIRIF. Moreover, as evidenced by the experiments with pretrained models, FAIRIF is able to alleviate the unfairness issue of pretrained models without hurting their performance.
CODING & PROGRAMMING
arxiv.org

Training Free Graph Neural Networks for Graph Matching

We present TFGM (Training Free Graph Matching), a framework to boost the performance of Graph Neural Networks (GNNs) based graph matching without training. TFGM sidesteps two crucial problems when training GNNs: 1) the limited supervision due to expensive annotation, and 2) training's computational cost. A basic framework, BasicTFGM, is first proposed by adopting the inference stage of graph matching methods. Our analysis shows that the BasicTFGM is a linear relaxation to the quadratic assignment formulation of graph matching. This guarantees the preservation of structure compatibility and an efficient polynomial complexity. Empirically, we further improve the BasicTFGM by handcrafting two types of matching priors into the architecture of GNNs: comparing node neighborhoods of different localities and utilizing annotation data if available. For evaluation, we conduct extensive experiments on a broad set of settings, including supervised keypoint matching between images, semi-supervised entity alignment between knowledge graphs, and unsupervised alignment between protein interaction networks. Applying TFGM on various GNNs shows promising improvements over baselines. Further ablation studies demonstrate the effective and efficient training-free property of TFGM. Our code is available at this https URL.
CODING & PROGRAMMING
arxiv.org

Avoiding Overfitting: A Survey on Regularization Methods for Convolutional Neural Networks

Several image processing tasks, such as image classification and object detection, have been significantly improved using Convolutional Neural Networks (CNNs). Many architectures, such as ResNet and EfficientNet, achieved outstanding results on at least one dataset at the time of their creation. A critical factor in training concerns the network's regularization, which prevents the structure from overfitting. This work analyzes several regularization methods developed in the last few years, showing significant improvements for different CNN models. The works are classified into three main areas: the first one is called "data augmentation", where all the techniques focus on performing changes in the input data. The second, named "internal changes", aims to describe procedures to modify the feature maps generated by the neural network or the kernels. The last one, called "label", concerns transforming the labels of a given input. This work presents two main differences compared to other available surveys about regularization: (i) the first concerns the papers gathered in the manuscript, which are not older than five years, and (ii) the second distinction is about reproducibility, i.e., all works referred to here have their code available in public repositories or have been directly implemented in some framework, such as TensorFlow or Torch.
COMPUTERS
arxiv.org

Assessing the persistence of chalcogen bonds in solution with neural network potentials

Non-covalent bonding patterns are commonly harvested as a design principle in the field of catalysis, supramolecular chemistry and functional materials to name a few. Yet, their computational description generally neglects finite temperature and environment effects, which promote competing interactions and alter their static gas-phase properties. Recently, neural network potentials (NNPs) trained on Density Functional Theory (DFT) data have become increasingly popular to simulate molecular phenomena in condensed phase with an accuracy comparable to ab initio methods. To date, most applications have centered on solid-state materials or fairly simple molecules made of a limited number of elements. Herein, we focus on the persistence and strength of chalcogen bonds involving a benzotelluradiazole in condensed phase. While the tellurium-containing heteroaromatic molecules are known to exhibit pronounced interactions with anions and lone pairs of different atoms, the relevance of competing intermolecular interactions, notably with the solvent, is complicated to monitor experimentally but also challenging to model at an accurate electronic structure level. Here, we train direct and baselined NNPs to reproduce hybrid DFT energies and forces in order to identify what are the most prevalent non-covalent interactions occurring in a solute-Cl$^-$-THF mixture. The simulations in explicit solvent highlight the clear competition with chalcogen bonds formed with the solvent and the short-range directionality of the interaction with direct consequences for the molecular properties in the solution. The comparison with other potentials (e.g., AMOEBA, direct NNP and continuum solvent model) also demonstrates that baselined NNPs offer a reliable picture of the non-covalent interaction interplay occurring in solution.
MATHEMATICS
arxiv.org

Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks

We study the dynamics of a neural network in function space when optimizing the mean squared error via gradient flow. We show that in the underparameterized regime the network learns eigenfunctions of an integral operator $T_{K^\infty}$ determined by the Neural Tangent Kernel (NTK) at rates corresponding to their eigenvalues. For example, for uniformly distributed data on the sphere $S^{d - 1}$ and rotation invariant weight distributions, the eigenfunctions of $T_{K^\infty}$ are the spherical harmonics. Our results can be understood as describing a spectral bias in the underparameterized regime. The proofs use the concept of "Damped Deviations", where deviations of the NTK matter less for eigendirections with large eigenvalues due to the occurrence of a damping factor. Aside from the underparameterized regime, the damped deviations point of view can be used to track the dynamics of the empirical risk in the overparameterized setting, allowing us to extend certain results in the literature. We conclude that damped deviations offer a simple and unifying perspective of the dynamics when optimizing the squared error.
SCIENCE
arxiv.org

An Efficient Multi-Indicator and Many-Objective Optimization Algorithm based on Two-Archive

Indicator-based algorithms are gaining prominence as traditional multi-objective optimization algorithms based on domination and decomposition struggle to solve many-objective optimization problems. However, previous indicator-based multi-objective optimization algorithms suffer from the following flaws: 1) The environment selection process takes a long time; 2) Additional parameters are usually necessary. As a result, this paper proposes a multi-indicator and multi-objective optimization algorithm based on two-archive (SRA3) that can efficiently select good individuals in environment selection based on indicator performance and uses an adaptive parameter strategy for parental selection without setting additional parameters. Then we normalized the algorithm and compared its performance before and after normalization, finding that normalization improved the algorithm's performance significantly. We also analyzed how normalizing affected the indicator-based algorithm and observed that the normalized $I_{\epsilon+}$ indicator is better at finding extreme solutions and can reduce the influence of each objective's different extent of contribution to the indicator due to its different scope. However, it also has a preference for extreme solutions, which causes the solution set to converge to the extremes. As a result, we give some suggestions for normalization. Then, on the DTLZ and WFG problems, we conducted experiments on 39 problems with 5, 10, and 15 objectives, and the results show that SRA3 has good convergence and diversity while maintaining high efficiency. Finally, we conducted experiments on the DTLZ and WFG problems with 20 and 25 objectives and found that the algorithm proposed in this paper is more competitive than other algorithms as the number of objectives increases.
CODING & PROGRAMMING
arxiv.org

Quantum activation functions for quantum neural networks

The field of artificial neural networks is expected to strongly benefit from recent developments of quantum computers. In particular, quantum machine learning, a class of quantum algorithms which exploit qubits for creating trainable neural networks, will provide more power to solve problems such as pattern recognition, clustering and machine learning in general. The building block of feed-forward neural networks consists of one layer of neurons connected to an output neuron that is activated according to an arbitrary activation function. The corresponding learning algorithm goes under the name of Rosenblatt perceptron. Quantum perceptrons with specific activation functions are known, but a general method to realize arbitrary activation functions on a quantum computer is still lacking. Here we fill this gap with a quantum algorithm which is capable of approximating any analytic activation function to any given order of its power series. Unlike previous proposals providing irreversible measurement-based and simplified activation functions, here we show how to approximate any analytic function to any required accuracy without the need to measure the states encoding the information. Thanks to the generality of this construction, any feed-forward neural network may acquire the universal approximation properties according to Hornik's theorem. Our results recast the science of artificial neural networks in the architecture of gate-model quantum computers.
COMPUTERS
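
The abstract above speaks of approximating an analytic activation function "to any given order of its power series". The classical sketch below shows only what that truncation means numerically, using the Maclaurin series of tanh as an example; it says nothing about the quantum circuit itself. The names `TANH_SERIES` and `truncated_tanh` are hypothetical, and the choice of tanh is an assumption made for illustration.

```python
import math

# Maclaurin coefficients of tanh(x): x - x^3/3 + 2x^5/15 - 17x^7/315 + ...
TANH_SERIES = {1: 1.0, 3: -1.0 / 3.0, 5: 2.0 / 15.0, 7: -17.0 / 315.0}


def truncated_tanh(x, order):
    """Evaluate the tanh power series, keeping terms up to x**order."""
    return sum(c * x**k for k, c in TANH_SERIES.items() if k <= order)


if __name__ == "__main__":
    x = 0.3
    for order in (1, 3, 5, 7):
        error = abs(truncated_tanh(x, order) - math.tanh(x))
        print(order, error)
```

For inputs well inside the series' radius of convergence, each additional order shrinks the error, which is the sense in which a higher truncation order buys a more accurate activation function.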