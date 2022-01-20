
Lightweight Salient Object Detection in Optical Remote Sensing Images via Feature Correlation

By Gongyang Li, Zhi Liu, Zhen Bai, Weisi Lin, and Haibin Ling
arxiv.org
 4 days ago

Salient object detection in optical remote sensing images (ORSI-SOD) has been widely explored for understanding ORSIs. However, previous methods focus mainly on improving the detection accuracy while neglecting the cost in memory and computation, which may hinder their real-world applications. In this paper, we propose a novel lightweight ORSI-SOD solution, named...

arxiv.org

Enhancing Low-Light Images in Real World via Cross-Image Disentanglement

Images captured in low-light conditions suffer from low visibility and various imaging artifacts, e.g., real noise. Existing supervised enlightening algorithms require a large set of pixel-aligned training image pairs, which are hard to prepare in practice. Though weakly-supervised or unsupervised methods can alleviate such challenges without using paired training images, some real-world artifacts inevitably get falsely amplified because of the lack of corresponding supervision. In this paper, instead of using perfectly aligned images for training, we creatively employ misaligned real-world images as the guidance, which are considerably easier to collect. Specifically, we propose a Cross-Image Disentanglement Network (CIDN) to separately extract cross-image brightness and image-specific content features from low/normal-light images. Based on that, CIDN can simultaneously correct the brightness and suppress image artifacts in the feature domain, which largely increases the robustness to pixel shifts. Furthermore, we collect a new low-light image enhancement dataset consisting of misaligned training images with real-world corruptions. Experimental results show that our model achieves state-of-the-art performance on both the newly proposed dataset and other popular low-light datasets.
COMPUTERS
arxiv.org

Sparsely Annotated Object Detection: A Region-based Semi-supervised Approach

Research shows a noticeable drop in performance of object detectors when the training data has missing annotations, i.e. sparsely annotated data. Contemporary methods focus on proxies for missing ground-truth annotations, either in the form of pseudo-labels or by re-weighting gradients for unlabeled boxes during training. In this work, we revisit the formulation of sparsely annotated object detection. We observe that sparsely annotated object detection can be considered a semi-supervised object detection problem at a region level. Building on this insight, we propose a region-based semi-supervised algorithm that automatically identifies regions containing unlabeled foreground objects. Our algorithm then processes the labeled and unlabeled foreground regions differently, a common practice in semi-supervised methods. To evaluate the effectiveness of the proposed approach, we conduct exhaustive experiments on five splits commonly used by sparsely annotated approaches on the PASCAL-VOC and COCO datasets and achieve state-of-the-art performance. In addition, we show that our approach achieves competitive performance on standard semi-supervised setups, demonstrating the strength and broad applicability of our approach.
COMPUTERS
arxiv.org

Drone Object Detection Using RGB/IR Fusion

Object detection using aerial drone imagery has received a great deal of attention in recent years. While visible light images are adequate for detecting objects in most scenarios, thermal cameras can extend the capabilities of object detection to night-time or occluded objects. As such, RGB and Infrared (IR) fusion methods for object detection are useful and important. One of the biggest challenges in applying deep learning methods to RGB/IR object detection is the lack of available training data for drone IR imagery, especially at night. In this paper, we develop several strategies for creating synthetic IR images using the AIRSim simulation engine and CycleGAN. Furthermore, we utilize an illumination-aware fusion framework to fuse RGB and IR images for object detection on the ground. We characterize and test our methods for both simulated and actual data. Our solution is implemented on an NVIDIA Jetson Xavier running on an actual drone, requiring about 28 milliseconds of processing per RGB/IR image pair.
ELECTRONICS
HackerNoon

This AI Removes Unwanted Objects From Your Images!

The task of removing part of an image and replacing it with what should appear behind it has been tackled by many AI researchers for a long time. It is called image inpainting, and it’s extremely challenging. Learn more in the video! Learn how this algorithm can understand images and...
SOFTWARE
arxiv.org

Multilevel T-spline Approximation for Scattered Observations with Application to Land Remote Sensing

In this contribution, we introduce a multilevel approximation method with T-splines for fitting scattered point clouds iteratively, with an application to land remote sensing. This new procedure provides a local surface approximation by an explicit computation of the control points and is called a multilevel T-splines approximation (MTA). It is computationally efficient compared with the traditional global least-squares (LS) approach, which may fail when there is an unfavourable point density from a given refinement level. We validate our approach within a simulated framework and apply it to two real datasets: (i) a surface with holes scanned with a terrestrial laser scanner, and (ii) a patch on a sand-dune in the Netherlands. Both examples highlight the potential of the MTA for rapidly fitting large and noisy point clouds with variable point density and with similar results compared to the global LS approximation.
SCIENCE
arxiv.org

TransVOD: End-to-end Video Object Detection with Spatial-Temporal Transformers

Detection Transformer (DETR) and Deformable DETR have been proposed to eliminate the need for many hand-designed components in object detection while performing as well as previous complex hand-crafted detectors. However, their performance on Video Object Detection (VOD) has not been well explored. In this paper, we present TransVOD, the first end-to-end video object detection system based on spatial-temporal Transformer architectures. The first goal of this paper is to streamline the pipeline of VOD, effectively removing the need for many hand-crafted components for feature aggregation, e.g., optical flow models and relation networks. Moreover, benefiting from the object query design in DETR, our method does not need complicated post-processing methods such as Seq-NMS. In particular, we present a temporal Transformer to aggregate both the spatial object queries and the feature memories of each frame. Our temporal Transformer consists of two components: a Temporal Query Encoder (TQE) to fuse object queries, and a Temporal Deformable Transformer Decoder (TDTD) to obtain current frame detection results. These designs boost the strong baseline Deformable DETR by a significant margin (3%-4% mAP) on the ImageNet VID dataset. Then, we present two improved versions of TransVOD: TransVOD++ and TransVOD Lite. The former fuses object-level information into the object query via dynamic convolution, while the latter models entire video clips as the output to speed up inference. We give a detailed analysis of all three models in the experiment part. In particular, our proposed TransVOD++ sets a new state-of-the-art record in terms of accuracy on ImageNet VID with 90.0% mAP. Our proposed TransVOD Lite also achieves the best speed and accuracy trade-off with 83.7% mAP while running at around 30 FPS on a single V100 GPU device. Code and models will be available for further research.
SOFTWARE
TrendHunter.com

Motion-Sensing Remote Controls

The Wechip W1 Air Mouse remote control is an aftermarket accessory for your choice of device that makes it easier to search for content to stream, games to play and more. The remote control connects to Android TV Boxes, Windows PCs, Linux systems, mini PCs and more to help users enjoy an immersive experience without the need for conventional mice. The four-axis gyration motion system enables users to move characters in games and more, while the full QWERTY keyboard on the rear enables fast entry of information where applicable.
ELECTRONICS
arxiv.org

Small Object Detection using Deep Learning

Nowadays, UAVs such as drones are widely used for purposes such as image capture and target detection from aerial imagery. Easy public access to these small aerial vehicles can cause serious security threats. For instance, critical places may be monitored by spies who blend into the public and use drones. This study proposes an improved and efficient deep-learning-based autonomous system that can detect and track very small drones with great precision. The proposed system builds a custom deep learning model on Tiny YOLOv3, one of the variants of the very fast object detection model You Only Look Once (YOLO), and uses it for detection. The object detection algorithm efficiently detects the drones. The proposed architecture has shown significantly better performance than the previous YOLO version. The improvement is observed in terms of resource usage and time complexity. Performance is measured by recall and precision, which are 93% and 91% respectively.
COMPUTERS
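
The recall and precision figures quoted in the abstract above follow the standard detection definitions. A minimal sketch of that arithmetic is given here; the function name and the counts in the usage lines are hypothetical and chosen only for illustration, they are not taken from the paper.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw detection counts.

    precision = TP / (TP + FP): share of reported detections that are correct.
    recall    = TP / (TP + FN): share of real objects that were detected.
    """
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts (not from the paper): 93 correct detections,
# 9 false alarms, 7 missed drones.
precision, recall = precision_recall(93, 9, 7)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these made-up counts the two values round to 0.91 and 0.93, which shows how a detector can miss few objects while still raising some false alarms.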
arxiv.org

Model-Based Image Signal Processors via Learnable Dictionaries

Digital cameras transform sensor RAW readings into RGB images by means of their Image Signal Processor (ISP). Computational photography tasks such as image denoising and colour constancy are commonly performed in the RAW domain, in part due to the inherent hardware design, but also due to the appealing simplicity of noise statistics that result from the direct sensor readings. Despite this, the availability of RAW images is limited in comparison with the abundance and diversity of available RGB data. Recent approaches have attempted to bridge this gap by estimating the RGB to RAW mapping: handcrafted model-based methods that are interpretable and controllable usually require manual parameter fine-tuning, while end-to-end learnable neural networks require large amounts of training data, at times with complex training procedures, and generally lack interpretability and parametric control. Towards addressing these existing limitations, we present a novel hybrid model-based and data-driven ISP that builds on canonical ISP operations and is both learnable and interpretable. Our proposed invertible model, capable of bidirectional mapping between RAW and RGB domains, employs end-to-end learning of rich parameter representations, i.e. dictionaries, that are free from direct parametric supervision and additionally enable simple and plausible data augmentation. We evidence the value of our data generation process by extensive experiments under both RAW image reconstruction and RAW image denoising tasks, obtaining state-of-the-art performance in both. Additionally, we show that our ISP can learn meaningful mappings from few data samples, and that denoising models trained with our dictionary-based data augmentation are competitive despite having only few or zero ground-truth labels.
COMPUTERS
arxiv.org

Gravitational glint: Detectable gravitational wave tails from stars and compact objects

Observations of a merging neutron star binary in both gravitational waves, by the Laser Interferometer Gravitational-wave Observatory (LIGO), and across the spectrum of electromagnetic radiation, by myriad telescopes, have been used to show that gravitational waves travel in vacuum at a speed that is indistinguishable from that of light to within one part in a quadrillion. However, it has long been expected mathematically that, when electromagnetic or gravitational waves travel through vacuum in a curved spacetime, the waves develop "tails" that travel more slowly. The associated signal has been thought to be undetectably weak. Here we demonstrate that gravitational waves are efficiently scattered by the curvature sourced by ordinary compact objects -- stars, white dwarfs, neutron stars, and planets -- and certain candidates for dark matter, populating the interior of the null cone. The resulting "gravitational glint" should imminently be detectable, and be recognizable (for all but planets) as briefly delayed echoes of the primary signal emanating from extremely near the direction of the primary source. This opens the prospect for using GRAvitational Detection And Ranging (GRADAR) to map the Universe and conduct a comprehensive census of massive compact objects, and ultimately to explore their interiors.
ASTRONOMY
arxiv.org

Sensing performance enhancement via asymmetric gain optimization in the atom-light hybrid interferometer

The SU(1,1)-type atom-light hybrid interferometer (SALHI) is a kind of interferometer that is sensitive to both the optical phase and atomic phase. However, loss has been an unavoidable problem in practical applications and greatly limits the use of interferometers. Visibility is an important parameter to evaluate the sensing performance of interferometers. Here, we experimentally demonstrate the mitigating effect of the loss on visibility of the SALHI via asymmetric gain optimization, where the maximum threshold of loss to visibility close to $100\%$ is increased. Furthermore, we theoretically find that the optimal condition for the largest visibility is the same as that for the enhancement of signal-to-noise ratio (SNR) to the best value in the presence of losses using intensity detection, indicating that the visibility can act as an experimental operational criterion for SNR improvement in practical applications. Improvement of the interference visibility means achievement of SNR enhancement. Our results provide a significant foundation for practical application of the SALHI in radar and ranging measurements.
SCIENCE
New Scientist

Portable laser scanner creates colour 3D images of surfaces or objects

A new portable scanner combines laser scanning technology with cameras to create precise 3D images in colour, and it could be used for everything from infrastructure inspection to construction to robot vision. Lidar measures the distance to surfaces using a laser. Each measurement records a point in space, building a...
ELECTRONICS
arxiv.org

SmartDet: Context-Aware Dynamic Control of Edge Task Offloading for Mobile Object Detection

Mobile devices increasingly rely on object detection (OD) through deep neural networks (DNNs) to perform critical tasks. Due to their high complexity, the execution of these DNNs requires excessive time and energy. Low-complexity object tracking (OT) can be used with OD, where the latter is periodically applied to generate "fresh" references for tracking. However, the frames processed with OD incur large delays, which may make the reference outdated and degrade tracking quality. Herein, we propose to use edge computing in this context, and establish parallel OT (at the mobile device) and OD (at the edge server) processes that are resilient to large OD latency. We propose Katch-Up, a novel tracking mechanism that improves the system resilience to excessive OD delay. However, while Katch-Up significantly improves performance, it also increases the computing load of the mobile device. Hence, we design SmartDet, a low-complexity controller based on deep reinforcement learning (DRL) that learns to control the trade-off between resource utilization and OD performance. SmartDet takes as input context information about the current video content and the current network conditions to optimize frequency and type of OD offloading, as well as Katch-Up utilization. We extensively evaluate SmartDet on a real-world testbed composed of a Jetson Nano as mobile device and a GTX 980 Ti as edge server, connected through a Wi-Fi link. Experimental results show that SmartDet achieves an optimal balance between tracking performance, measured as mean Average Recall (mAR), and resource usage. With respect to a baseline with full Katch-Up usage and maximum channel usage, we still increase mAR by 4% while using 50% less of the channel and 30% power resources associated with Katch-Up. With respect to a fixed strategy using minimal resources, we increase mAR by 20% while using Katch-Up on 1/3 of the frames.
CELL PHONES
Nature.com

Remote photonic detection of human senses using secondary speckle patterns

Neural activity research has recently gained significant attention due to its association with sensory information and behavior control. However, the current methods of brain activity sensing require expensive equipment and physical contact with the tested subject. We propose a novel photonic-based method for remote detection of human senses. Physiological processes associated with hemodynamic activity due to activation of the cerebral cortex affected by different senses have been detected by remote monitoring of nano-vibrations generated by the transient blood flow to the specific regions of the human brain. We have found that a combination of defocused, self-interference random speckle patterns with a spatiotemporal analysis, using Deep Neural Network, allows associating between the activated sense and the seemingly random speckle patterns.
SCIENCE
arxiv.org

Semantic decoupled representation learning for remote sensing image change detection

Contemporary transfer learning-based methods to alleviate the data insufficiency in change detection (CD) are mainly based on ImageNet pre-training. Self-supervised learning (SSL) has recently been introduced to remote sensing (RS) for learning in-domain representations. Here, we propose a semantic decoupled representation learning for RS image CD. Typically, the object of interest (e.g., building) is relatively small compared to the vast background. Different from existing methods expressing an image into one representation vector that may be dominated by irrelevant land-covers, we disentangle representations of different semantic regions by leveraging the semantic mask. We additionally force the model to distinguish different semantic representations, which benefits the recognition of objects of interest in the downstream CD task. We construct a dataset of bitemporal images with semantic masks in an effortless manner for pre-training. Experiments on two CD datasets show our model outperforms ImageNet pre-training, in-domain supervised pre-training, and several recent SSL methods.
SOFTWARE
arxiv.org

End-To-End Optimization of LiDAR Beam Configuration for 3D Object Detection and Localization

Existing learning methods for LiDAR-based applications use 3D points scanned under a pre-determined beam configuration, e.g., the elevation angles of beams are often evenly distributed. Those fixed configurations are task-agnostic, so simply using them can lead to sub-optimal performance. In this work, we take a new route to learn to optimize the LiDAR beam configuration for a given application. Specifically, we propose a reinforcement learning-based learning-to-optimize (RL-L2O) framework to automatically optimize the beam configuration in an end-to-end manner for different LiDAR-based applications. The optimization is guided by the final performance of the target task and thus our method can be integrated easily with any LiDAR-based application as a simple drop-in module. The method is especially useful when a low-resolution (low-cost) LiDAR is needed, for instance, for system deployment at a massive scale. We use our method to search for the beam configuration of a low-resolution LiDAR for two important tasks: 3D object detection and localization. Experiments show that the proposed RL-L2O method improves the performance in both tasks significantly compared to the baseline methods. We believe that a combination of our method with the recent advances of programmable LiDARs can start a new research direction for LiDAR-based active perception. The code is publicly available at this https URL.
COMPUTERS
arxiv.org

Imaging dark charge emitters in diamond via carrier-to-photon conversion

The application of color centers in wide-bandgap semiconductors to nanoscale sensing and quantum information processing largely rests on our knowledge of the surrounding crystalline lattice, often obscured by the countless classes of point defects the material can host. Here we monitor the fluorescence from a negatively charged nitrogen-vacancy (NV-) center in diamond as we illuminate its vicinity. Cyclic charge state conversion of neighboring point defects sensitive to the excitation beam leads to a position-dependent stream of photo-generated carriers whose capture by the probe NV- leads to a fluorescence change. This "charge-to-photon" conversion scheme allows us to image other individual point defects surrounding the probe NV, including non-fluorescent "single-charge emitters" that would otherwise remain unnoticed. Given the ubiquity of color center photo-chromism, this strategy may likely find extensions to material systems other than diamond.
PHYSICS
arxiv.org

Correlative Raman Imaging and Scanning Electron Microscopy: the Role of Single Ga Islands in Surface-Enhanced Raman Spectroscopy of Graphene

By Jakub Piastek, Jindřich Mach, Stanislav Bardy, Zoltán Édes, Miroslav Bartošík, Jaroslav Maniš, Vojtěch Čalkovský, Martin Konečný, Jiří Spousta, and Tomáš Šikola

Surface-enhanced Raman spectroscopy (SERS) is a promising non-destructive analytical technique enabling detection of individual nano-objects, even single molecules. In this paper, we study the morphology of Ga islands deposited on CVD graphene by ultrahigh vacuum (UHV) evaporation and the local optical response of this system by correlative Raman Imaging and Scanning Electron Microscopy (RISE). In contrast to previous papers, where only an integral Raman response from whole non-uniform ensembles of Ga NPs on graphene was investigated, the RISE technique has enabled us to detect graphene Raman peaks enhanced by single Ga islands and, in particular, to correlate the Raman signal with the shape and size of these single particles. In this way, and with the support of numerical simulations, we have proved the plasmonic nature of the Raman signal enhancement, related to localized surface plasmon resonances (LSPR).
CHEMISTRY
arxiv.org

Learning Hierarchical Graph Representation for Image Manipulation Detection

The objective of image manipulation detection is to identify and locate the manipulated regions in images. Recent approaches mostly adopt sophisticated Convolutional Neural Networks (CNNs) to capture the tampering artifacts left in the images and locate the manipulated regions. However, these approaches ignore the feature correlations, i.e., feature inconsistencies, between manipulated regions and non-manipulated regions, leading to inferior detection performance. To address this issue, we propose a hierarchical Graph Convolutional Network (HGCN-Net), which consists of two parallel branches: the backbone network branch and the hierarchical graph representation learning (HGRL) branch for image manipulation detection. Specifically, the feature maps of a given image are extracted by the backbone network branch, and then the feature correlations within the feature maps are modeled as a set of fully-connected graphs for learning the hierarchical graph representation by the HGRL branch. The learned hierarchical graph representation can sufficiently capture the feature correlations across different scales, and thus it provides high discriminability for distinguishing manipulated and non-manipulated regions. Extensive experiments on four public datasets demonstrate that the proposed HGCN-Net not only provides promising detection accuracy, but also achieves strong robustness under a variety of common image attacks in the task of image manipulation detection, compared to state-of-the-art methods.
COMPUTERS
arxiv.org

$\ell_1$-norm constrained multi-block sparse canonical correlation analysis via proximal gradient descent

Multi-block CCA constructs linear relationships explaining coherent variations across multiple blocks of data. We view the multi-block CCA problem as finding leading generalized eigenvectors and propose to solve it via a proximal gradient descent algorithm with $\ell_1$ constraint for high dimensional data. In particular, we use a decaying sequence of constraints over proximal iterations, and show that the resulting estimate is rate-optimal under suitable assumptions. Although several previous works have demonstrated such optimality for the $\ell_0$ constrained problem using iterative approaches, the same level of theoretical understanding for the $\ell_1$ constrained formulation is still lacking. We also describe an easy-to-implement deflation procedure to estimate multiple eigenvectors sequentially. We compare our proposals to several existing methods whose implementations are available on R CRAN, and the proposed methods show competitive performances in both simulations and a real data example.
COMPUTERS
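
For readers unfamiliar with proximal gradient descent, the sketch below shows one generic step with an $\ell_1$ penalty, where the proximal operator is elementwise soft-thresholding. This is an illustration of the general technique only: it uses the penalized form rather than the constrained formulation described in the abstract, it is not the authors' algorithm, and the function names, step size and penalty value are assumptions made for the example.

```python
def soft_threshold(vector, threshold):
    """Proximal operator of threshold * ||x||_1, applied elementwise.

    Each entry is shrunk toward zero by `threshold`; entries whose
    magnitude is at most `threshold` become exactly zero, which is
    what produces sparsity.
    """
    result = []
    for value in vector:
        magnitude = abs(value) - threshold
        if magnitude <= 0:
            result.append(0.0)
        else:
            result.append(magnitude if value > 0 else -magnitude)
    return result


def proximal_gradient_step(x, gradient, step_size, penalty):
    """One proximal gradient step: gradient move, then soft-threshold."""
    moved = [xi - step_size * gi for xi, gi in zip(x, gradient)]
    return soft_threshold(moved, step_size * penalty)


# Hypothetical values, for illustration only.
print(soft_threshold([3.0, -0.5, -2.0], 1.0))
```

The small middle entry is set to zero while the larger entries are only shrunk, which is why this kind of update yields sparse estimates in high-dimensional problems.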

