
JMSNAS: Joint Model Split and Neural Architecture Search for Learning over Mobile Edge Networks

By Yuqing Tian, Zhaoyang Zhang, Zhaohui Yang, Qianqian Yang
arxiv.org
 8 days ago

The main challenge in deploying a deep neural network (DNN) over a mobile edge network is how to split the DNN model so as to match the network architecture as well as all the nodes' computation and communication...


Related
arxiv.org

An Energy Consumption Model for Electrical Vehicle Networks via Extended Federated-learning

Electric vehicles (EVs) are on the rise and promote an eco-sustainable society. Nevertheless, the "range anxiety" of EVs hinders their wider acceptance among customers. This paper proposes a novel solution to range anxiety based on a federated-learning model, which is capable of estimating battery consumption and providing energy-efficient route planning for vehicle networks. Specifically, the new approach extends the federated-learning structure with two components: anomaly detection and sharing policy. The first component identifies factors that impede model learning, while the second component offers guidelines for information sharing among vehicle networks when sharing is necessary to preserve learning efficiency. The two components collaborate to enhance learning robustness against data heterogeneities in networks. Numerical experiments show that, compared with the solutions considered, the proposed approach can provide more accurate battery-consumption estimates for vehicles under heterogeneous data distributions, without increasing the time complexity or transmitting raw data among vehicle networks.
CARS
arxiv.org

Transmission Power Control for Over-the-Air Federated Averaging at Network Edge

Over-the-air computation (AirComp) has emerged as a new analog power-domain non-orthogonal multiple access (NOMA) technique for low-latency model/gradient-update aggregation in federated edge learning (FEEL). By integrating communication and computation into a joint design, AirComp can significantly enhance communication efficiency, but at the cost of aggregation errors caused by channel fading and noise. This paper studies a particular type of FEEL with federated averaging (FedAvg) and AirComp-based model-update aggregation, namely over-the-air FedAvg (Air-FedAvg). We investigate transmission power control to combat the AirComp aggregation errors, with the aim of enhancing the training accuracy and accelerating the training speed of Air-FedAvg. To this end, we first analyze the convergence behavior (in terms of the optimality gap) of Air-FedAvg with aggregation errors at different outer iterations. Then, to enhance the training accuracy, we minimize the optimality gap by jointly optimizing the transmission power control at the edge devices and the denoising factors at the edge server, subject to a series of power constraints at individual edge devices. Furthermore, to accelerate the training speed, we also minimize the training latency of Air-FedAvg for a given target optimality gap, in which learning hyper-parameters, including the numbers of outer iterations and local training epochs, are jointly optimized with the power control. Finally, numerical results show that the proposed transmission power control policy achieves significantly faster convergence for Air-FedAvg than benchmark policies with fixed power transmission or per-iteration mean squared error (MSE) minimization. It is also shown that Air-FedAvg achieves an order-of-magnitude shorter training latency than conventional FedAvg with digital orthogonal multiple access (OMA-FedAvg).
SOFTWARE
arxiv.org

Climate Modeling with Neural Diffusion Equations

Owing to the remarkable development of deep learning technology, there have been a series of efforts to build deep learning-based climate models. Whereas most of them utilize recurrent neural networks and/or graph neural networks, we design a novel climate model based on two concepts: the neural ordinary differential equation (NODE) and the diffusion equation. Many physical processes involving the Brownian motion of particles can be described by the diffusion equation and, as a result, it is widely used for modeling climate. Neural ordinary differential equations (NODEs), on the other hand, learn a latent governing ODE from data. In our method, we combine the two into a single framework and propose a concept called the neural diffusion equation (NDE). Our NDE, equipped with the diffusion equation and one additional neural network to model inherent uncertainty, can learn an appropriate latent governing equation that best describes a given climate dataset. In our experiments with two real-world datasets, one synthetic dataset, and eleven baselines, our method consistently outperforms existing baselines by non-trivial margins.
ENVIRONMENT
arxiv.org

Graph Neural Network Training with Data Tiering

Graph Neural Networks (GNNs) have shown success in learning from graph-structured data, with applications to fraud detection, recommendation, and knowledge graph reasoning. However, training GNNs efficiently is challenging because: 1) GPU memory capacity is limited and can be insufficient for large datasets, and 2) the graph-based data structure causes irregular data access patterns. In this work, we provide a method to statistically analyze and identify more frequently accessed data ahead of GNN training. Our data tiering method uses not only the structure of the input graph but also insight gained from the actual GNN training process to achieve a better prediction result. With our data tiering method, we additionally provide a new data placement and access strategy to further minimize the CPU-GPU communication overhead. We also take multi-GPU GNN training into account and demonstrate the effectiveness of our strategy in a multi-GPU system. The evaluation results show that our work reduces CPU-GPU traffic by 87-95% and improves the training speed of GNNs over existing solutions by 1.6-2.1x on graphs with hundreds of millions of nodes and billions of edges.
CODING & PROGRAMMING
IN THIS ARTICLE
#Neural Networks #Mobile #Network Architecture #JMSNAS #DNN #IEEE ICC
arxiv.org

Pairwise interactions for Potential energy surfaces and Atomic forces with Deep Neural network

Molecular dynamics (MD) simulation, which is considered an important tool for studying physical and chemical processes at the atomic scale, requires accurate calculations of energies and forces. Although reliable energies and forces can be obtained by electronic structure calculations such as those based on density functional theory (DFT), this approach is computationally expensive. In this work, we propose a full-stack model using a deep neural network (NN) to enhance the calculation of force and energy. The NN is designed to extract the embedding features of the pairwise interactions between an atom and its neighbors, which are aggregated to obtain the atom's feature vector for predicting atomic force and potential energy. By designing the features of the pairwise interactions, we can control the performance of the models and take into account the many-body effects and other physics of the atomic interactions. Moreover, we demonstrate that by using the Coulomb matrix of the local structures to complement the pairwise information, we can improve the prediction of force and energy for silicon systems, and we confirm that our models transfer to larger systems with high accuracy.
SCIENCE
towardsdatascience.com

TensorFlow for Computer Vision — How to Train Image Classifier with Artificial Neural Networks

Image classification without convolutions? Here’s why it’s a bad idea. Artificial neural networks aren’t designed for image classification. But how terrible can they be? That’s what we’ll find out today. We’ll train an image classification model on 20,000 images using only Dense layers. So no convolutions and other fancy stuff, we’ll save them for upcoming articles.
CODING & PROGRAMMING
arxiv.org

Machine Learning-Based Optimization of Chiral Photonic Nanostructures: Evolution- and Neural Network-Based Design

Chiral photonics opens new pathways to manipulate light-matter interactions and tailor the optical response of metasurfaces and metamaterials by nanostructuring nontrivial patterns. The chirality of matter, such as that of molecules, and of light, which in the simplest case is given by the handedness of circular polarization, has attracted much attention for applications in chemistry, nanophotonics and optical information processing. We report the design of chiral photonic structures using two machine learning methods, the evolutionary algorithm and the neural network approach, for rapid and efficient optimization of the optical properties of dielectric metasurfaces. The design recipes obtained for visible light in the range of transition-metal dichalcogenide exciton resonances show a frequency-dependent modification in the reflected light's degree of circular polarization, which is represented by the difference between left- and right-circularly polarized intensity. Our results suggest the facile fabrication and characterization of optical nanopatterned reflectors for chirality-sensitive light-matter coupling scenarios, employing tungsten disulfide as a possible active material with features such as the valley Hall effect and optical valley coherence.
COMPUTERS
arxiv.org

An Underexplored Dilemma between Confidence and Calibration in Quantized Neural Networks

Modern convolutional neural networks (CNNs) are known to be overconfident in terms of their calibration on unseen input data. That is to say, they are more confident than they are accurate. This is undesirable if the probabilities predicted are to be used for downstream decision making. When considering accuracy, CNNs are also surprisingly robust to compression techniques, such as quantization, which aim to reduce computational and memory costs. We show that this robustness can be partially explained by the calibration behavior of modern CNNs, and may be improved with overconfidence. This is due to an intuitive result: low confidence predictions are more likely to change post-quantization, whilst being less accurate. High confidence predictions will be more accurate, but more difficult to change. Thus, a minimal drop in post-quantization accuracy is incurred. This presents a potential conflict in neural network design: worse calibration from overconfidence may lead to better robustness to quantization. We perform experiments applying post-training quantization to a variety of CNNs, on the CIFAR-100 and ImageNet datasets.
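The statement that a network is "more confident than it is accurate" can be made concrete with a calibration metric. The sketch below computes the expected calibration error (ECE) with equal-width confidence bins. ECE is a widely used measure, but the excerpt does not say which metric the authors use, so the function name `expected_calibration_error`, the default of 10 bins, and the toy inputs are all assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct: booleans, True where the prediction matched the label.
    Illustrative sketch only; not taken from the paper.
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    # Assign each prediction to an equal-width confidence bin.
    bins = [[] for _ in range(num_bins)]
    for conf, hit in zip(confidences, correct):
        index = min(int(conf * num_bins), num_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_confidence = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, hit in members if hit) / len(members)
        ece += (len(members) / total) * abs(mean_confidence - accuracy)
    return ece


# Hypothetical overconfident model: 90% confident, but right only half the time.
gap = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, False, False])
```

A well-calibrated model would yield a value near zero; an overconfident one yields a gap roughly equal to how far its confidence exceeds its accuracy.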
COMPUTERS
arxiv.org

Parallel Physics-Informed Neural Networks with Bidirectional Balance

As an emerging technology in deep learning, physics-informed neural networks (PINNs) have been widely used to solve various partial differential equations (PDEs) in engineering. However, PDEs based on practical considerations contain multiple physical quantities and complex initial boundary conditions, so PINNs often return incorrect results. Here we take the heat transfer problem in multilayer fabrics as a typical example. It couples multiple strongly correlated temperature fields, and the values of the variables are extremely unbalanced among different dimensions. We clarify the potential difficulties of solving such problems with classic PINNs, and propose parallel physics-informed neural networks with bidirectional balance. In detail, our parallel solving framework synchronously fits the coupled equations through several multilayer perceptrons. Moreover, we design two modules to balance the forward process of data and the back-propagation process of the loss gradient. This bidirectional balance not only enables the whole network to converge stably, but also helps it fully learn the various physical conditions in PDEs. We provide a series of ablation experiments to verify the effectiveness of the proposed methods. The results show that our approach makes problems that classic PINNs cannot solve solvable, and achieves excellent solving accuracy.
ENGINEERING
arxiv.org

Topic-aware latent models for representation learning on networks

Network representation learning (NRL) methods have received significant attention in recent years thanks to their success in several graph analysis problems, including node classification, link prediction, and clustering. Such methods aim to map each vertex of the network into a low-dimensional space in a way that preserves the structural information of the network. Of particular interest are methods based on random walks; such methods transform the network into a collection of node sequences, aiming to learn node representations by predicting the context of each node within the sequence. In this paper, we introduce TNE, a generic framework to enhance the embeddings of nodes acquired by means of random walk-based approaches with topic-based information. Similar to the concept of topical word embeddings in Natural Language Processing, the proposed model first assigns each node to a latent community with the help of various statistical graph models and community detection methods and then learns the enhanced topic-aware representations. We evaluate our methodology in two downstream tasks: node classification and link prediction. The experimental results demonstrate that by incorporating node and community embeddings, we are able to outperform widely known baseline NRL models.
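The random-walk step described above, turning a network into node sequences and then into (node, context) training pairs, can be sketched in a few lines. This is a minimal illustration with uniform walks over a toy graph, not the TNE implementation; the function names `random_walks` and `context_pairs`, the window size, and the example graph are assumptions.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate uniform random walks starting from every node.

    adjacency: dict mapping each node to a list of its neighbors.
    """
    rng = random.Random(seed)  # local generator keeps the sketch reproducible
    walks = []
    for start in adjacency:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # isolated node: stop the walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def context_pairs(walk, window):
    """Return (center, context) pairs at most `window` steps apart in a walk."""
    pairs = []
    for i, center in enumerate(walk):
        start = max(0, i - window)
        stop = min(len(walk), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


# Hypothetical toy graph, stored as an undirected adjacency list.
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
walks = random_walks(graph, walk_length=5, walks_per_node=2)
pairs = [pair for walk in walks for pair in context_pairs(walk, window=2)]
```

The resulting pairs are what a skip-gram style model would be trained on to predict each node's context; a topic-aware variant would add community information on top of this step.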
COMPUTERS
arxiv.org

FaaS Execution Models for Edge Applications

In this paper, we address the problem of supporting stateful workflows following a Function-as-a-Service (FaaS) model in edge networks. In particular, we focus on the problem of data transfer, which can be a performance bottleneck due to the limited speed of communication links in some edge scenarios, and we propose three different schemes: a pure FaaS implementation; StateProp, i.e., propagation of the application state throughout the entire chain of functions; and StateLocal, i.e., a solution where the state is kept local to the workers that run functions and retrieved only as needed. We then extend the proposed schemes to the more general case of applications modeled as Directed Acyclic Graphs (DAGs), which cover a broad range of practical applications, e.g., in the Internet of Things (IoT) area. Our contribution is validated via a prototype implementation. Experiments in emulated conditions show that applying the data locality principle significantly reduces the volume of network traffic required and improves the end-to-end delay performance, especially with local caching on edge nodes and low link speeds.
SOFTWARE
arxiv.org

Neural optimal feedback control with local learning rules

By Johannes Friedrich, Siavash Golkar, Shiva Farashahi, Alexander Genkin, Anirvan M. Sengupta, Dmitri B. Chklovskii
A major problem in motor control is understanding how the brain plans and executes proper movements in the face of delayed and noisy stimuli. A prominent framework for addressing such control problems is Optimal Feedback Control (OFC). OFC generates control actions that optimize behaviorally relevant criteria by integrating noisy sensory stimuli and the predictions of an internal model using the Kalman filter or its extensions. However, a satisfactory neural model of Kalman filtering and control is lacking because existing proposals have the following limitations: not considering the delay of sensory feedback, training in alternating phases, and requiring knowledge of the noise covariance matrices as well as of the system dynamics. Moreover, the majority of these studies considered Kalman filtering in isolation, and not jointly with control. To address these shortcomings, we introduce a novel online algorithm which combines adaptive Kalman filtering with a model-free control approach (i.e., a policy gradient algorithm). We implement this algorithm in a biologically plausible neural network with local synaptic plasticity rules. This network performs system identification and Kalman filtering, without the need for multiple phases with distinct update rules or knowledge of the noise covariances. It can perform state estimation with delayed sensory feedback, with the help of an internal model. It learns the control policy without requiring any knowledge of the dynamics, thus avoiding the need for weight transport. In this way, our implementation of OFC solves the credit assignment problem needed to produce the appropriate sensory-motor control in the presence of stimulus delay.
SCIENCE
bitcoinmagazine.com

A Neural Network Is Developing Between Bitcoin Lightning Network Nodes

The below is a direct excerpt of Marty's Bent Issue #1109: "A neural network is developing between Lightning Nodes." Above is a visualization of the current Lightning Network topography made up of ~16,000 Lightning Nodes with ~140,000 payment channels opened between them. I don't know if I'm simply being duped by some visualization magic, but I can't help but think that we are all witnessing the emergence of something massive. Something that will have a profound effect on humanity that we can't quite comprehend yet. The topography that is emerging on the Lightning Network seems to be mimicking many things we find in nature, as longtime Bitcoin Core maintainer Wladimir van der Laan points out below.
ECONOMY
arxiv.org

On Neural Network Identification for Low-Speed Ship Maneuvering Model

Several studies on ship maneuvering models have been conducted using captive model tests or computational fluid dynamics (CFD) and physical models, such as the maneuvering modeling group (MMG) model. This study proposes a new system identification method for generating a low-speed maneuvering model using recurrent neural networks (RNNs) and free-running model tests. We focus especially on low-speed maneuvers, such as the final phase of berthing, with the aim of achieving automatic berthing control. Accurate dynamic modeling with minimum modeling error is highly desirable for establishing a model-based control system. We propose a new loss function that reduces the effect of the noise included in the training data. We also found that an RNN that ignores the memory before a certain time improves the prediction accuracy compared with the "standard" RNN, and that the random maneuver test is effective in obtaining an accurate berthing maneuver model. In addition, several low-speed free-running model tests were performed with a scale model of the M.V. Esso Osaka. As a result, this paper shows that the proposed method using a neural network model can accurately represent low-speed maneuvering motions.
INDUSTRY
arxiv.org

Soft Sensing Model Visualization: Fine-tuning Neural Network from What Model Learned

The growing availability of data collected from smart manufacturing is changing the paradigms of production monitoring and control. The increasing complexity and content of the wafer manufacturing process, in addition to time-varying unexpected disturbances and uncertainties, make it infeasible to control the process with model-based approaches. As a result, data-driven soft-sensing modeling has become more prevalent in wafer process diagnostics. Recently, deep learning has been utilized in soft-sensing systems with promising performance on highly nonlinear and dynamic time-series data. Despite its successes in soft-sensing systems, however, the underlying logic of the deep learning framework is hard to understand. In this paper, we propose a deep learning-based model for defective wafer detection using a highly imbalanced dataset. To understand how the proposed model works, a deep visualization approach is applied. The model is then fine-tuned under the guidance of the deep visualization. Extensive experiments are performed to validate the effectiveness of the proposed system. The results provide an interpretation of how the model works and an instructive fine-tuning method based on the interpretation.
SOFTWARE
ElectronicsWeekly.com

NXP i.MX 9 gets neural processor for AI at the edge

NXP has announced the first processors of its i.MX 9, with some including its first Arm ‘Ethos-U65’ neural network processor (dubbed a ‘microNPU’), with added security for artificial intelligence at the edge. “These attributes enable developers to address areas from voice-assisted smart home and building systems, to low-power industrial gateways...
COMPUTERS
arxiv.org

Predicting Lattice Phonon Vibrational Frequencies Using Deep Graph Neural Networks

Lattice vibration frequencies are related to many important materials properties such as thermal and electrical conductivity as well as superconductivity. However, calculating vibration frequencies with density functional theory (DFT) methods is too computationally demanding for the large number of samples involved in materials screening. Here we propose a deep graph neural network-based algorithm for predicting crystal vibration frequencies from crystal structures with high accuracy. Our algorithm addresses the variable dimension of the vibration frequency spectrum using a zero-padding scheme. Benchmark studies on two data sets with 15,000 and 35,552 samples show that the aggregated $R^2$ scores of the predictions reach 0.554 and 0.724, respectively. Our work demonstrates the capability of deep graph neural networks to learn to predict phonon spectrum properties of crystal structures, in addition to the phonon density of states (DOS) and electronic DOS, for which the output dimension is constant.
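Zero padding, as mentioned above, is a common way to give variable-length targets a fixed dimension so that a single network output layer can serve every sample. The sketch below pads each frequency list to the length of the longest one in a batch. It illustrates the general idea only; the helper names `pad_spectrum` and `pad_batch`, the choice to pad at the end, and the sample values are assumptions, not details from the paper.

```python
def pad_spectrum(frequencies, target_length, pad_value=0.0):
    """Right-pad a list of frequencies with `pad_value` up to `target_length`."""
    if len(frequencies) > target_length:
        raise ValueError("spectrum is longer than the target length")
    padding = [pad_value] * (target_length - len(frequencies))
    return list(frequencies) + padding


def pad_batch(spectra):
    """Pad every spectrum in a batch to the length of the longest one."""
    target_length = max(len(spectrum) for spectrum in spectra)
    return [pad_spectrum(spectrum, target_length) for spectrum in spectra]


# Hypothetical spectra of different sizes (values are made up).
batch = pad_batch([[1.5, 2.0, 3.25], [4.0], [0.5, 0.75]])
```

In practice a mask recording which entries are padding is usually kept alongside the padded targets so that the padded positions can be excluded from the loss or the evaluation score.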
SCIENCE
dataversity.net

A Brief History of Neural Networks

In the last few decades, neural networks have evolved from an academic curiosity into a vast “deep learning” industry. Deep learning uses neural networks, a data structure design loosely inspired by the layout of biological neurons. These neural networks are constructed in layers, and the outputs of one layer are connected to the inputs of the next layer. Deep learning is a subdivision of machine learning; it allows computers to automatically recognize faces and transcribe spoken words into text, and allows self-driving cars to avoid objects on the street.
COMPUTERS
arxiv.org

DropGNN: Random Dropouts Increase the Expressiveness of Graph Neural Networks

This paper studies Dropout Graph Neural Networks (DropGNNs), a new approach that aims to overcome the limitations of standard GNN frameworks. In DropGNNs, we execute multiple runs of a GNN on the input graph, with some of the nodes randomly and independently dropped in each of these runs. Then, we combine the results of these runs to obtain the final result. We prove that DropGNNs can distinguish various graph neighborhoods that cannot be separated by message passing GNNs. We derive theoretical bounds for the number of runs required to ensure a reliable distribution of dropouts, and we prove several properties regarding the expressive capabilities and limits of DropGNNs. We experimentally validate our theoretical findings on expressiveness. Furthermore, we show that DropGNNs perform competitively on established GNN benchmarks.
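The run-and-combine procedure described above can be sketched with a toy one-layer network. The code below runs a single round of sum-aggregation message passing several times, drops each node independently with a fixed probability in every run, and averages the per-node outputs. It is a simplified illustration, not the authors' implementation: the function names `gnn_run` and `drop_gnn`, the scalar features, zeroing the output of a dropped node, and averaging as the combination rule are all assumptions.

```python
import random


def gnn_run(adjacency, features, dropped):
    """One round of sum aggregation in which dropped nodes send and receive nothing."""
    output = {}
    for node, neighbors in adjacency.items():
        if node in dropped:
            output[node] = 0.0  # assumption: a dropped node contributes a zero embedding
            continue
        received = sum(features[n] for n in neighbors if n not in dropped)
        output[node] = features[node] + received
    return output


def drop_gnn(adjacency, features, runs, drop_prob, seed=0):
    """Average the outputs of `runs` independent runs with random node dropouts."""
    rng = random.Random(seed)
    totals = {node: 0.0 for node in adjacency}
    for _ in range(runs):
        # Each node is dropped independently with probability `drop_prob`.
        dropped = {node for node in adjacency if rng.random() < drop_prob}
        for node, value in gnn_run(adjacency, features, dropped).items():
            totals[node] += value
    return {node: total / runs for node, total in totals.items()}


# Hypothetical toy input: a triangle whose nodes all carry the feature 1.0.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
ones = {0: 1.0, 1: 1.0, 2: 1.0}
averaged = drop_gnn(triangle, ones, runs=50, drop_prob=0.2)
```

With `drop_prob=0.0` the result reduces to a single ordinary message-passing run, which is a convenient sanity check when experimenting with the dropout probability and the number of runs.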
CODING & PROGRAMMING
arxiv.org

Implicit vs Unfolded Graph Neural Networks

It has been observed that graph neural networks (GNNs) sometimes struggle to maintain a healthy balance between modeling long-range dependencies across nodes and avoiding unintended consequences such as oversmoothed node representations. To address this issue (among other things), two separate strategies have recently been proposed, namely implicit and unfolded GNNs. The former treats node representations as the fixed points of a deep equilibrium model that can efficiently facilitate arbitrary implicit propagation across the graph with a fixed memory footprint. In contrast, the latter treats graph propagation as the unfolded descent iterations applied to some graph-regularized energy function. Although the two are motivated differently, in this paper we carefully elucidate the similarities and differences of these methods, quantifying explicit situations where the solutions they produce may actually be equivalent and others where their behavior diverges. This includes the analysis of convergence, representational capacity, and interpretability. We also provide empirical head-to-head comparisons across a variety of synthetic and public real-world benchmarks.
COMPUTERS
