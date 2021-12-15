
Exponential Convergence of Deep Operator Networks for Elliptic Partial Differential Equations

By Carlo Marcati, Christoph Schwab
arxiv.org
 4 days ago

We construct deep operator networks (ONets) between infinite-dimensional spaces that emulate with an exponential rate of convergence the coefficient-to-solution map of elliptic second-order PDEs. In particular, we consider problems set in $d$-dimensional periodic domains, $d=1, 2, \dots$, and...

arxiv.org

DMRVisNet: Deep Multi-head Regression Network for Pixel-wise Visibility Estimation Under Foggy Weather

Scene perception is essential for driving decision-making and traffic safety. However, fog is a common weather condition that appears frequently in the real world, especially in mountainous areas, and makes it difficult to observe the surrounding environment accurately. Precisely estimating visibility under foggy weather can therefore significantly benefit traffic management and safety. Most current methods measure visibility with professional instruments installed at fixed roadside locations; these methods are expensive and inflexible. In this paper, we propose an innovative end-to-end convolutional neural network framework that estimates visibility from image data alone by leveraging Koschmieder's law. The proposed method estimates visibility by integrating the physical model into the framework, instead of directly predicting the visibility value with the convolutional neural network. Moreover, we estimate visibility as a pixel-wise visibility map, in contrast to previous visibility measurement methods, which predict only a single value for an entire image. The estimated result of our method is thus more informative, particularly in uneven fog scenarios, which can benefit the development of a more precise early warning system for foggy weather, thereby better protecting intelligent transportation infrastructure systems and promoting their development. To validate the proposed framework, a virtual dataset, FACI, containing 3,000 foggy images in different concentrations, is collected using the AirSim platform. Detailed experiments show that the proposed method achieves performance competitive with that of state-of-the-art methods.
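
For reference, Koschmieder's law relates the meteorological visibility $V$ to the atmospheric extinction coefficient $\beta$ via $V = -\ln(\varepsilon)/\beta$, where $\varepsilon$ is a contrast threshold (0.02 is a common convention, giving $V \approx 3.912/\beta$). The following is a minimal Python sketch, not the authors' network: it assumes a per-pixel transmission map and depth map are already available and that transmission follows $t = e^{-\beta d}$. All function names are hypothetical.

```python
import math

def visibility_from_extinction(beta, contrast_threshold=0.02):
    """Koschmieder's law: V = -ln(threshold) / beta (beta in 1/m, V in m)."""
    if beta <= 0:
        raise ValueError("extinction coefficient must be positive")
    return -math.log(contrast_threshold) / beta

def extinction_from_transmission(transmission, depth):
    """Invert the assumed scattering model t = exp(-beta * depth) for beta."""
    if not 0 < transmission < 1 or depth <= 0:
        raise ValueError("need 0 < transmission < 1 and depth > 0")
    return -math.log(transmission) / depth

def visibility_map(transmission_map, depth_map, contrast_threshold=0.02):
    """Pixel-wise visibility from per-pixel transmission and depth (nested lists)."""
    return [
        [
            visibility_from_extinction(
                extinction_from_transmission(t, d), contrast_threshold
            )
            for t, d in zip(t_row, d_row)
        ]
        for t_row, d_row in zip(transmission_map, depth_map)
    ]
```

Because each pixel gets its own extinction estimate, a scene with uneven fog yields a non-constant map instead of a single number for the whole image.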
arxiv.org

Interpolating between BSDEs and PINNs -- deep learning for elliptic and parabolic boundary value problems

Solving high-dimensional partial differential equations is a recurrent challenge in economics, science and engineering. In recent years, a great number of computational approaches have been developed, most of them relying on a combination of Monte Carlo sampling and deep learning based approximation. For elliptic and parabolic problems, existing methods can broadly be classified into those resting on reformulations in terms of $\textit{backward stochastic differential equations}$ (BSDEs) and those aiming to minimize a regression-type $L^2$-error ($\textit{physics-informed neural networks}$, PINNs). In this paper, we review the literature and suggest a methodology based on the novel $\textit{diffusion loss}$ that interpolates between BSDEs and PINNs. Our contribution opens the door towards a unified understanding of numerical approaches for high-dimensional PDEs, as well as for implementations that combine the strengths of BSDEs and PINNs. We also provide generalizations to eigenvalue problems and perform extensive numerical studies, including calculations of the ground state for nonlinear Schrödinger operators and committor functions relevant in molecular dynamics.
SCIENCE
towardsdatascience.com

Deep Dive into Neural Network Explanations with Integrated Gradients

Deep neural networks are widely used models that have shown great success in domains such as image processing, natural language processing, and time series. While the efficacy of these models in these specialized domains is unrivaled, neural networks have often been thought of as “black-box” models due to their opacity. Given this,...
arxiv.org

Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring

The success of state-of-the-art video deblurring methods stems mainly from implicit or explicit estimation of alignment among adjacent frames for latent video restoration. However, due to the influence of the blur effect, estimating the alignment information from blurry adjacent frames is not a trivial task, and inaccurate estimates interfere with the subsequent frame restoration. Instead of estimating alignment information, we propose a simple and effective deep Recurrent Neural Network with Multi-scale Bi-directional Propagation (RNN-MBP) to propagate and gather information from unaligned neighboring frames for better video deblurring. Specifically, we build a Multi-scale Bi-directional Propagation (MBP) module with two U-Net RNN cells, which can directly exploit the inter-frame information from unaligned neighboring hidden states by integrating them at different scales. Moreover, to better evaluate the proposed algorithm and existing state-of-the-art methods on real-world blurry scenes, we also create a Real-World Blurry Video Dataset (RBVD) using a well-designed Digital Video Acquisition System (DVAS) and use it as the training and evaluation dataset. Extensive experimental results demonstrate that the proposed RBVD dataset effectively improves the performance of existing algorithms on real-world blurry videos, and the proposed algorithm performs favorably against the state-of-the-art methods on three typical benchmarks. The code is available at this https URL.
COMPUTERS
arxiv.org

Superpixel-Based Building Damage Detection from Post-earthquake Very High Resolution Imagery Using Deep Neural Networks

Building damage detection after natural disasters like earthquakes is crucial for initiating effective emergency response actions. Remotely sensed very high spatial resolution (VHR) imagery can provide vital information due to its ability to map the affected buildings with high geometric precision. Many approaches have been developed to detect buildings damaged by earthquakes. However, little attention has been paid to exploiting the rich features represented in VHR images using Deep Neural Networks (DNN). This paper presents a novel superpixel-based approach that combines a DNN with a modified segmentation method to detect damaged buildings from VHR imagery. Firstly, a modified Fast Scanning and Adaptive Merging method is extended to create an initial over-segmentation. Secondly, the segments are merged based on the Region Adjacent Graph (RAG), using an improved semantic similarity criterion composed of Local Binary Patterns (LBP) texture, spectral, and shape features. Thirdly, a pre-trained DNN using Stacked Denoising Auto-Encoders, called SDAE-DNN, is presented to exploit the rich semantic features for building damage detection. The deep-layer feature abstraction of SDAE-DNN boosts detection accuracy by learning more intrinsic and discriminative features, and it outperforms other methods using state-of-the-art alternative classifiers. We demonstrate the feasibility and effectiveness of our method using a subset of WorldView-2 imagery in the complex urban areas of Bhaktapur, Nepal, which was affected by the Nepal Earthquake of April 25, 2015.
ARTS
arxiv.org

Unified Field Theory for Deep and Recurrent Neural Networks

Understanding capabilities and limitations of different network architectures is of fundamental importance to machine learning. Bayesian inference on Gaussian processes has proven to be a viable approach for studying recurrent and deep networks in the limit of infinite layer width, $n\to\infty$. Here we present a unified and systematic derivation of the mean-field theory for both architectures that starts from first principles by employing established methods from statistical physics of disordered systems. The theory elucidates that while the mean-field equations are different with regard to their temporal structure, they yet yield identical Gaussian kernels when readouts are taken at a single time point or layer, respectively. Bayesian inference applied to classification then predicts identical performance and capabilities for the two architectures. Numerically, we find that convergence towards the mean-field theory is typically slower for recurrent networks than for deep networks and the convergence speed depends non-trivially on the parameters of the weight prior as well as the depth or number of time steps, respectively. Our method exposes that Gaussian processes are but the lowest order of a systematic expansion in $1/n$. The formalism thus paves the way to investigate the fundamental differences between recurrent and deep architectures at finite widths $n$.
COMPUTERS
arxiv.org

Symmetry Perception by Deep Networks: Inadequacy of Feed-Forward Architectures and Improvements with Recurrent Connections

Symmetry is omnipresent in nature and perceived by the visual system of many species, as it facilitates detecting ecologically important classes of objects in our environment. Symmetry perception requires abstraction of non-local spatial dependencies between image regions, and its underlying neural mechanisms remain elusive. In this paper, we evaluate Deep Neural Network (DNN) architectures on the task of learning symmetry perception from examples. We demonstrate that feed-forward DNNs that excel at modelling human performance on object recognition tasks are unable to acquire a general notion of symmetry. This is the case even when the DNNs are architected to capture non-local spatial dependencies, such as through 'dilated' convolutions and the recently introduced 'transformers' design. By contrast, we find that recurrent architectures are capable of learning to perceive symmetry by decomposing the non-local spatial dependencies into a sequence of local operations that are reusable for novel images. These results suggest that recurrent connections likely play an important role in symmetry perception in artificial systems, and possibly, biological ones too.
COMPUTERS
arxiv.org

Regularisation by fractional noise for one-dimensional differential equations with nonnegative distributional drift

We study existence and uniqueness of solutions to the equation $dX_t=b(X_t)dt + dB_t$, where $b$ is a distribution in some Besov space and $B$ is a fractional Brownian motion with Hurst parameter $H\leqslant 1/2$. First, the equation is understood as a nonlinear Young integral equation. The integral is constructed in a $p$-variation space, which is well suited when $b$ is a nonnegative (or nonpositive) distribution. Based on the Besov regularity of $b$, a condition on $H$ is given so that solutions to the equation exist. The construction is deterministic, and $B$ can be replaced by a deterministic path $w$ which has a sufficiently smooth local time.
MATHEMATICS
arxiv.org

Deep differentiable reinforcement learning and optimal trading

In this article, we introduce the differentiable reinforcement learning framework. It is based on the fact that in many reinforcement learning applications, the environment reward and transition functions are not black boxes but known differentiable functions. Incorporating deep learning into this framework, we find more accurate and stable solutions than with more generic actor-critic algorithms. We apply this deep differentiable reinforcement learning (DDRL) algorithm to the problem of optimal trading strategies in various environments where the market dynamics are known. Thanks to the stability of this method, we are able to efficiently find optimal strategies for complex multi-scale market models and for a wide range of environment parameters. This makes it applicable to real-life financial signals and portfolio optimization where the expected return has multiple time scales. In the case of a slow and a fast alpha signal, we find that the optimal trading strategy consists in using the fast signal to time the trades associated with the slow signal.
MARKETS
arxiv.org

Kraken: An Efficient Engine with a Uniform Dataflow for Deep Neural Networks

Deep neural networks (DNNs) have been successfully employed in a multitude of applications with remarkable performance. As such performance is achieved at a significant computational cost, several embedded applications demand fast and efficient hardware accelerators for DNNs. Previously proposed application specific integrated circuit (ASIC) architectures strive to utilize arrays of hundreds of processing elements (PEs) and reduce power-hungry DRAM accesses using multiple dataflows requiring complex PE architectures. These consume significant area and reduce the maximum clock frequency. This paper introduces the Kraken architecture, which optimally processes the convolutional layers, fully-connected layers, and matrix products of any DNN through a hardware-friendly uniform dataflow. This enables maximal data reuse of weights, inputs, and outputs, with a bare-bones PE design and on-the-fly dynamic reconfiguration. Kraken, implemented in 65-nm CMOS technology at 400 MHz, packs 672 PEs in 7.3 mm$^2$, with a peak performance of 537.6 Gops. Kraken processes the convolutional layers of AlexNet, VGG-16, and ResNet-50 at 336.6, 17.5, and 64.2 frames/s, respectively, hence outperforming the state-of-the-art ASIC architectures in terms of overall performance efficiency, DRAM accesses, arithmetic intensity, and throughput, with 5.8x more Gops/mm$^2$ and 1.6x more Gops/W.
COMPUTERS
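
The quoted Kraken peak figure can be checked with quick arithmetic. The sketch below assumes each PE completes one multiply-accumulate (counted as 2 operations) per clock cycle; that per-PE rate is an assumption rather than something the abstract states, but it is consistent with the reported 537.6 Gops for 672 PEs at 400 MHz.

```python
def peak_gops(num_pes, clock_mhz, ops_per_pe_per_cycle=2):
    """Peak throughput in Gops, assuming every PE is busy on every cycle."""
    return num_pes * ops_per_pe_per_cycle * clock_mhz / 1000.0

# Figures taken from the abstract: 672 PEs, 400 MHz, 7.3 mm^2.
peak = peak_gops(num_pes=672, clock_mhz=400)
area_efficiency = peak / 7.3  # Gops per mm^2 at the reported area
print(f"peak = {peak:.1f} Gops, area efficiency = {area_efficiency:.1f} Gops/mm^2")
```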
arxiv.org

A Training Framework for Stereo-Aware Speech Enhancement using Deep Neural Networks

Deep learning-based speech enhancement has shown unprecedented performance in recent years. The most popular mono speech enhancement frameworks are end-to-end networks mapping the noisy mixture into an estimate of the clean speech. With growing computational power and the availability of multichannel microphone recordings, prior works have aimed to incorporate spatial statistics along with spectral information to boost performance. Despite improvements in the enhancement performance of the mono output, spatial image preservation and subjective evaluation have not gained much attention in the literature. This paper proposes a novel stereo-aware framework for speech enhancement, i.e., a training loss for deep learning-based speech enhancement that preserves the spatial image while enhancing the stereo mixture. The proposed framework is model independent, hence it can be applied to any deep learning-based architecture. We provide an extensive objective and subjective evaluation of the trained models through a listening test. We show that by regularizing for an image preservation loss, the overall performance is improved, and the stereo aspect of the speech is better preserved.
COMPUTERS
arxiv.org

Projection methods for Neural Field equations

Neural field models are nonlinear integro-differential equations for the evolution of neuronal activity, and they are a prototypical large-scale, coarse-grained neuronal model in continuum cortices. Neural fields are often simulated heuristically and, in spite of their popularity in mathematical neuroscience, their numerical analysis is not yet fully established. We introduce generic projection methods for neural fields, and derive a-priori error bounds for these schemes. We extend an existing framework for stationary integral equations to the time-dependent case, which is relevant for neuroscience applications. We find that the convergence rate of a projection scheme for a neural field is determined to a great extent by the convergence rate of the projection operator. This abstract analysis, which unifies the treatment of collocation and Galerkin schemes, is carried out in operator form, without resorting to quadrature rules for the integral term, which are introduced only at a later stage, and whose choice is enslaved by the choice of the projector. Using an elementary timestepper as an example, we demonstrate that the error in a time stepper has two separate contributions: one from the projector, and one from the time discretisation. We give examples of concrete projection methods: two collocation schemes (piecewise-linear and spectral collocation) and two Galerkin schemes (finite elements and spectral Galerkin); for each of them we derive error bounds from the general theory, introduce several discrete variants, provide implementation details, and present reproducible convergence tests.
COMPUTERS
arxiv.org

Merging Subject Matter Expertise and Deep Convolutional Neural Network for State-Based Online Machine-Part Interaction Classification

Machine-part interaction classification is a key capability required by Cyber-Physical Systems (CPS), a pivotal enabler of Smart Manufacturing (SM). While previous relevant studies on the subject have primarily focused on time series classification, change point detection is equally important because it provides temporal information on changes in the behavior of the machine. In this work, we address change point detection and time series classification for machine-part interactions with a deep Convolutional Neural Network (CNN) based framework. The CNN in this framework uses a two-stage encoder-classifier structure for efficient feature representation and convenient deployment customization for CPS. Though data-driven, the design and optimization of the framework are guided by Subject Matter Expertise (SME). An SME-defined Finite State Machine (FSM) is incorporated into the framework to prohibit intermittent misclassifications. In the case study, we implement the framework to perform machine-part interaction classification on a milling machine, and the performance is evaluated using a testing dataset and deployment simulations. The implementation achieved an average F1-Score of 0.946 across classes on the testing dataset and an average delay of 0.24 seconds on the deployment simulations.
SOFTWARE
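
One way to picture how a finite state machine can suppress intermittent misclassifications is sketched below. The state names and allowed transitions are hypothetical placeholders, not the SME-defined FSM from the paper: a predicted label is accepted only if it equals the current state or is a permitted successor; otherwise the previous state is held.

```python
# Hypothetical machine-part interaction states and their allowed successors.
ALLOWED = {
    "idle": {"approach"},
    "approach": {"cutting", "idle"},
    "cutting": {"retract"},
    "retract": {"idle"},
}

def fsm_filter(predictions, start="idle", allowed=ALLOWED):
    """Hold the current state whenever the classifier proposes an illegal jump."""
    state = start
    filtered = []
    for label in predictions:
        if label == state or label in allowed[state]:
            state = label  # legal transition (or no change): accept it
        filtered.append(state)  # illegal transition: keep the previous state
    return filtered

# Raw per-window classifier outputs with two spurious labels mixed in.
raw = ["idle", "approach", "retract", "cutting", "idle", "cutting", "retract", "idle"]
print(fsm_filter(raw))
```

In this toy run, the spurious `retract` during `approach` and the spurious `idle` during `cutting` are both rejected because the transition table does not permit them.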
aithority.com

SES Government Solutions Releases New Unified Operational Network

Hydra provides mission assurance with modern, fully customizable situational awareness dashboard for the U.S. Government and military. SES Government Solutions (SES GS), a wholly-owned subsidiary of SES, announced its new Common Operational Picture (COP) platform, Hydra, built exclusively to serve the U.S. Government and military. Managed and operated in-house, Hydra...
TECHNOLOGY
arxiv.org

RamBoAttack: A Robust Query Efficient Deep Neural Network Decision Exploit

Machine learning models are critically susceptible to evasion attacks from adversarial examples. Generally, adversarial examples, modified inputs deceptively similar to the original input, are constructed under whitebox settings by adversaries with full access to the model. However, recent attacks have shown a remarkable reduction in the number of queries needed to craft adversarial examples using blackbox attacks. Particularly alarming is the ability to exploit the classification decision from the access interface of a trained model provided by a growing number of Machine Learning as a Service providers, including Google, Microsoft, and IBM, and used by a plethora of applications incorporating these models. An attack in which the adversary exploits only the predicted label from a model to craft adversarial examples is known as a decision-based attack. In our study, we first take a deep dive into recent state-of-the-art decision-based attacks in ICLR and SP to highlight the costly nature of discovering low-distortion adversarial examples with gradient estimation methods. We develop a robust, query-efficient attack capable of avoiding entrapment in a local minimum and misdirection from the noisy gradients seen in gradient estimation methods. The attack method we propose, RamBoAttack, exploits the notion of Randomized Block Coordinate Descent to explore the hidden classifier manifold, targeting perturbations to manipulate only localized input features to address the issues of gradient estimation methods. Importantly, RamBoAttack is more robust to the different sample inputs available to an adversary and to the targeted class. Overall, for a given target class, RamBoAttack is demonstrated to be more robust at achieving a lower distortion within a given query budget. We curate our extensive results using the large-scale high-resolution ImageNet dataset and open-source our attack, test samples and artifacts on GitHub.
COMPUTERS
arxiv.org

Distributed neural network control with dependability guarantees: a compositional port-Hamiltonian approach

Large-scale cyber-physical systems require that control policies are distributed, that is, that they only rely on local real-time measurements and communication with neighboring agents. Optimal Distributed Control (ODC) problems are, however, highly intractable even in seemingly simple cases. Recent work has thus proposed training Neural Network (NN) distributed controllers. A main challenge of NN controllers is that they are not dependable during and after training, that is, the closed-loop system may be unstable, and the training may fail due to vanishing and exploding gradients. In this paper, we address these issues for networks of nonlinear port-Hamiltonian (pH) systems, whose modeling power ranges from energy systems to non-holonomic vehicles and chemical reactions. Specifically, we embrace the compositional properties of pH systems to characterize deep Hamiltonian control policies with built-in closed-loop stability guarantees, irrespective of the interconnection topology and the chosen NN parameters. Furthermore, our setup enables leveraging recent results on well-behaved neural ODEs to prevent the phenomenon of vanishing gradients by design. Numerical experiments corroborate the dependability of the proposed architecture, while matching the performance of general neural network policies.
COMPUTERS
arxiv.org

Subspace Decomposition based DNN algorithm for elliptic type multi-scale PDEs

While deep learning algorithms demonstrate great potential in scientific computing, their application to multi-scale problems remains a major challenge. This is manifested by the "frequency principle", whereby neural networks tend to learn low-frequency components first. Novel architectures such as the multi-scale deep neural network (MscaleDNN) were proposed to alleviate this problem to some extent. In this paper, we construct a subspace decomposition based DNN (dubbed SD$^2$NN) architecture for a class of multi-scale problems by combining traditional numerical analysis ideas and MscaleDNN algorithms. The proposed architecture includes one low-frequency normal DNN submodule and one (or a few) high-frequency MscaleDNN submodule(s), which are designed to capture the smooth part and the oscillatory part of the multi-scale solutions, respectively. In addition, a novel trigonometric activation function is incorporated in the SD$^2$NN model. We demonstrate the performance of the SD$^2$NN architecture through several benchmark multi-scale problems in regular or irregular geometric domains. Numerical results show that the SD$^2$NN model is superior to existing models such as MscaleDNN.
CODING & PROGRAMMING
spglobal.com

Global 5G Survey: Operators push past COVID-19 to accelerate 5G network upgrades

While the COVID-19 pandemic prompted delays to 5G infrastructure buildouts and associated service deployments, mobile network operators, or MNOs, remained firmly committed to 5G network upgrades, according to Kagan's 2021 global 5G survey of 83 wireless operator decision-makers. The annual survey, completed in September 2021, revealed key data points from respondents, including the 51% of the surveyed MNOs that claim to offer 5G services, up from 38% of respondents in our 2020 5G survey. An additional 28% of respondents plan to offer 5G service in 2022, with a further 15% expecting to deploy 5G in 2023.
PUBLIC HEALTH
arxiv.org

Advancing Residual Learning towards Powerful Deep Spiking Neural Networks

Despite the rapid progress of neuromorphic computing, the inadequate capacity and insufficient representation power of spiking neural networks (SNNs) severely restrict their application scope in practice. Residual learning and shortcuts have proven to be an important approach for training deep neural networks, but previous work has rarely assessed their applicability to the characteristics of spike-based communication and spatiotemporal dynamics. In this paper, we first identify that this neglect leads to impeded information flow and an accompanying degradation problem in previous residual SNNs. We then propose a novel SNN-oriented residual block, MS-ResNet, which significantly extends the depth of directly trained SNNs, e.g. up to 482 layers on CIFAR-10 and 104 layers on ImageNet, without exhibiting any degradation problem. We validate the effectiveness of MS-ResNet on both frame-based and neuromorphic datasets, and MS-ResNet104 achieves a superior result of 76.02% accuracy on ImageNet, a first in the domain of directly trained SNNs. High energy efficiency is also observed: on average, only one spike per neuron is needed to classify an input sample. We believe our powerful and scalable models will provide strong support for further exploration of SNNs.
CODING & PROGRAMMING
arxiv.org

KartalOl: Transfer learning using deep neural network for iris segmentation and localization: New dataset for iris segmentation

By Jalil Nourmohammadi Khiarak, Samaneh Salehi Nasab, Farhang Jaryani, Seyed Naeim Moafinejad, Rana Pourmohamad, Yasin Amini, Morteza Noshad

Iris segmentation and localization in unconstrained environments is challenging due to long distances, illumination variations, limited user cooperation, and moving subjects. To address this problem, we present a method based on a U-Net with a pre-trained MobileNetV2 deep neural network. We employ the pre-trained weights provided with MobileNetV2 for the ImageNet dataset and fine-tune the model on the iris recognition and localization domain. Further, we introduce a new dataset, called KartalOl, to better evaluate detectors in iris recognition scenarios. To provide domain adaptation, we fine-tune the MobileNetV2 model on the data provided for NIR-ISL 2021 from CASIA-Iris-Asia, CASIA-Iris-M1, and CASIA-Iris-Africa, and on our dataset. We also augment the data by applying left-right flips, rotation, zoom, and brightness changes. We chose the binarization threshold for the binary masks by iterating over the images in the provided dataset. The proposed method is trained and tested on CASIA-Iris-Asia, CASIA-Iris-M1, and CASIA-Iris-Africa, along with the KartalOl dataset. The experimental results highlight that our method surpasses state-of-the-art methods on mobile-based benchmarks. The codes and evaluation results are publicly available at this https URL.
TECHNOLOGY
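
As a rough illustration of two of the steps mentioned in the KartalOl abstract, a left-right flip and mask binarization, here is a minimal pure-Python sketch on nested lists. The threshold value 0.5 and all function names are assumptions for illustration; the abstract says only that the threshold was chosen by iterating over the provided images.

```python
def flip_left_right(image):
    """Mirror a 2-D image (list of rows) horizontally."""
    return [list(reversed(row)) for row in image]

def augment_pair(image, mask):
    """Flip image and segmentation mask together so they stay aligned."""
    return flip_left_right(image), flip_left_right(mask)

def binarize(prob_mask, threshold=0.5):
    """Turn per-pixel probabilities into a 0/1 mask (threshold is hypothetical)."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_mask]

image = [[10, 20, 30], [40, 50, 60]]
probs = [[0.1, 0.7, 0.9], [0.4, 0.5, 0.2]]
flipped_image, flipped_mask = augment_pair(image, binarize(probs))
```

The point of `augment_pair` is that a geometric augmentation applied to the input must be applied identically to its mask, otherwise the training labels no longer match the pixels.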

