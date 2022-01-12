
MDS-Net: A Multi-scale Depth Stratification Based Monocular 3D Object Detection Algorithm

By Zhouzhen Xie, Yuying Song, Jingxuan Wu, Zecheng Li, Chunyi Song, Zhiwei Xu
arxiv.org
 3 days ago

Monocular 3D object detection is very challenging in autonomous driving due to the lack of depth information. This paper proposes a one-stage monocular 3D object detection algorithm based on multi-scale depth stratification, which...

#Object Detection#Algorithm#Mds#Stratification#Monocular 3d#Mds Net#Kitti#Bev
arxiv.org

Advancing 3D Medical Image Analysis with Variable Dimension Transform based Supervised 3D Pre-training

The difficulties in both data acquisition and annotation substantially restrict the sample sizes of training datasets for 3D medical imaging applications. As a result, constructing high-performance 3D convolutional neural networks from scratch remains a difficult task in the absence of a sufficient pre-training parameter. Previous efforts on 3D pre-training have frequently relied on self-supervised approaches, which use either predictive or contrastive learning on unlabeled data to build invariant 3D representations. However, because of the unavailability of large-scale supervision information, obtaining semantically invariant and discriminative representations from these learning frameworks remains problematic. In this paper, we revisit an innovative yet simple fully-supervised 3D network pre-training framework to take advantage of semantic supervisions from large-scale 2D natural image datasets. With a redesigned 3D network architecture, reformulated natural images are used to address the problem of data scarcity and develop powerful 3D representations. Comprehensive experiments on four benchmark datasets demonstrate that the proposed pre-trained models can effectively accelerate convergence while also improving accuracy for a variety of 3D medical imaging tasks such as classification, segmentation and detection. In addition, as compared to training from scratch, it can save up to 60% of annotation efforts. On the NIH DeepLesion dataset, it likewise achieves state-of-the-art detection performance, outperforming earlier self-supervised and fully-supervised pre-training approaches, as well as methods that do training from scratch. To facilitate further development of 3D medical models, our code and pre-trained model weights are publicly available at this https URL.
HEALTH
arxiv.org

Feature Selection-based Intrusion Detection System Using Genetic Whale Optimization Algorithm and Sample-based Classification

Preventing and detecting intrusions and attacks on wireless networks has become an important and serious challenge. However, because wireless nodes have limited resources, dedicated monitoring nodes are practically never used for permanent intrusion and attack monitoring in wireless sensor networks. Remote-control systems have therefore been put forward as a solution and have become a topic of interest in various fields. Remote monitoring of node performance and behavior in wireless sensor networks, in addition to detecting malicious nodes within the network, can also predict malicious node behavior in the future. In this research, a network intrusion detection system is proposed that uses feature selection based on a combination of the Whale optimization algorithm (WOA) and a genetic algorithm (GA), together with sample-based classification. The standard KDDCUP1999 data set is used, in which the characteristics of healthy nodes and of the types of malicious nodes are stored according to the type of attack in the network. The proposed method, which combines WOA- and GA-based feature selection with KNN classification, achieves better accuracy than previous methods. This suggests that WOA and GA extract the features related to the class label well, and that the KNN method detects misbehaving nodes well in the intrusion detection data set for wireless networks.
COMPUTERS
arxiv.org

Novelty-based Generalization Evaluation for Traffic Light Detection

The advent of Convolutional Neural Networks (CNNs) has led to their application in several domains. One noteworthy application is the perception system for autonomous driving that relies on the predictions from CNNs. Practitioners evaluate the generalization ability of such CNNs by calculating various metrics on an independent test dataset. A test dataset is often chosen based on only one precondition, i.e., its elements are not a part of the training data. Such a dataset may contain objects that are both similar and novel w.r.t. the training dataset. Nevertheless, existing works do not reckon the novelty of the test samples and treat them all equally for evaluating generalization. Such novelty-based evaluations are of significance to validate the fitness of a CNN in autonomous driving applications. Hence, we propose a CNN generalization scoring framework that considers novelty of objects in the test dataset. We begin with the representation learning technique to reduce the image data into a low-dimensional space. It is on this space we estimate the novelty of the test samples. Finally, we calculate the generalization score as a combination of the test data prediction performance and novelty. We perform an experimental study of the same for our traffic light detection application. In addition, we systematically visualize the results for an interpretable notion of novelty.
TECHNOLOGY
arxiv.org

Equalized Focal Loss for Dense Long-Tailed Object Detection

Despite the recent success of long-tailed object detection, almost all long-tailed object detectors are developed based on the two-stage paradigm. In practice, one-stage detectors are more prevalent in the industry because they have a simple and fast pipeline that is easy to deploy. However, in the long-tailed scenario, this line of work has not been explored so far. In this paper, we investigate whether one-stage detectors can perform well in this case. We discover the primary obstacle that prevents one-stage detectors from achieving excellent performance is: categories suffer from different degrees of positive-negative imbalance problems under the long-tailed data distribution. The conventional focal loss balances the training process with the same modulating factor for all categories, thus failing to handle the long-tailed problem. To address this issue, we propose the Equalized Focal Loss (EFL) that rebalances the loss contribution of positive and negative samples of different categories independently according to their imbalance degrees. Specifically, EFL adopts a category-relevant modulating factor which can be adjusted dynamically by the training status of different categories. Extensive experiments conducted on the challenging LVIS v1 benchmark demonstrate the effectiveness of our proposed method. With an end-to-end training pipeline, EFL achieves 29.2% in terms of overall AP and obtains significant performance improvements on rare categories, surpassing all existing state-of-the-art methods. The code is available at this https URL.
SCIENCE
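
The abstract above does not give the EFL formula, so the following is only a minimal sketch of the idea it builds on: the conventional focal loss, plus a hypothetical variant in which each category gets its own focusing parameter instead of one shared value. The function names, the `gammas` argument, and the default `alpha` are assumptions made for illustration, not the authors' implementation.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Conventional binary focal loss for one prediction p in (0, 1), label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p               # probability assigned to the true label
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    # (1 - p_t) ** gamma is the modulating factor that down-weights easy samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def per_category_focal_loss(probs, labels, gammas, alpha=0.25):
    """Hypothetical variant: one focusing parameter per category (assumption)."""
    return sum(focal_loss(p, y, g, alpha) for p, y, g in zip(probs, labels, gammas))

# One sample scored against three categories; the last category gets a larger gamma.
loss = per_category_focal_loss([0.9, 0.2, 0.6], [1, 0, 0], [2.0, 2.0, 4.0])
```

With a single shared `gamma`, every category is down-weighted by the same rule; letting `gammas` vary is one simple way to picture a "category-relevant modulating factor".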
arxiv.org

Revisiting Open World Object Detection

Open World Object Detection (OWOD), simulating the real dynamic world where knowledge grows continuously, attempts to detect both known and unknown classes and incrementally learn the identified unknown ones. We find that although the only previous OWOD work constructively puts forward the OWOD definition, its experimental settings are unreasonable, with an illogical benchmark, confusing metric calculation, and an inappropriate method. In this paper, we rethink the OWOD experimental setting and propose five fundamental benchmark principles to guide the OWOD benchmark construction. Moreover, we design two fair evaluation protocols specific to the OWOD problem, filling the void of evaluating from the perspective of unknown classes. Furthermore, we introduce a novel and effective OWOD framework containing an auxiliary Proposal ADvisor (PAD) and a Class-specific Expelling Classifier (CEC). The non-parametric PAD could assist the RPN in identifying accurate unknown proposals without supervision, while CEC calibrates the over-confident activation boundary and filters out confusing predictions through a class-specific expelling function. Comprehensive experiments conducted on our fair benchmark demonstrate that our method outperforms other state-of-the-art object detection approaches in terms of both existing and our new metrics. Our benchmark and code are available at this https URL.
COMPUTERS
arxiv.org

Robust Region Feature Synthesizer for Zero-Shot Object Detection

Zero-shot object detection aims at incorporating class semantic vectors to realize the detection of (both seen and) unseen classes given an unconstrained test image. In this study, we reveal the core challenge in this research area: how to synthesize robust region features (for unseen objects) that are as intra-class diverse and inter-class separable as the real samples, so that strong unseen object detectors can be trained upon them. To address this challenge, we build a novel zero-shot object detection framework that contains an Intra-class Semantic Diverging component and an Inter-class Structure Preserving component. The former is used to realize the one-to-more mapping to obtain diverse visual features from each class semantic vector, preventing the misclassification of real unseen objects as image backgrounds. The latter is used to keep the synthesized features from becoming so scattered that they mix up the inter-class and foreground-background relationships. To demonstrate the effectiveness of the proposed approach, comprehensive experiments on PASCAL VOC, COCO, and DIOR datasets are conducted. Notably, our approach achieves new state-of-the-art performance on PASCAL VOC and COCO, and it is the first study to carry out zero-shot object detection in remote sensing imagery.
SCIENCE
arxiv.org

Detecting Human-to-Human-or-Object (H2O) Interactions with DIABOLO

Detecting human interactions is crucial for human behavior analysis. Many methods have been proposed to deal with Human-to-Object Interaction (HOI) detection, i.e., detecting in an image which person and object interact together and classifying the type of interaction. However, Human-to-Human Interactions, such as social and violent interactions, are generally not considered in available HOI training datasets. As we think these types of interactions cannot be ignored and decorrelated from HOI when analyzing human behavior, we propose a new interaction dataset to deal with both types of human interactions: Human-to-Human-or-Object (H2O). In addition, we introduce a novel taxonomy of verbs, intended to be closer to a description of human body attitude in relation to the surrounding targets of interaction, and more independent of the environment. Unlike some existing datasets, we strive to avoid defining synonymous verbs when their use highly depends on the target type or requires a high level of semantic interpretation. As the H2O dataset includes V-COCO images annotated with this new taxonomy, images obviously contain more interactions. This can be an issue for HOI detection methods whose complexity depends on the number of people, targets or interactions. Thus, we propose DIABOLO (Detecting InterActions By Only Looking Once), an efficient subject-centric single-shot method to detect all interactions in one forward pass, with constant inference time independent of image content. In addition, this multi-task network simultaneously detects all people and objects. We show how sharing a network for these tasks not only saves computation resources but also improves performance collaboratively. Finally, DIABOLO is a strong baseline for the newly proposed challenge of H2O Interaction detection, as it outperforms all state-of-the-art methods when trained and evaluated on the HOI dataset V-COCO.
COMPUTERS
arxiv.org

MDFEND: Multi-domain Fake News Detection

Fake news spreads widely on social media in various domains, which leads to real-world threats in many areas such as politics, disasters, and finance. Most existing approaches focus on single-domain fake news detection (SFND), which leads to unsatisfying performance when these methods are applied to multi-domain fake news detection. As an emerging field, multi-domain fake news detection (MFND) is increasingly attracting attention. However, data distributions, such as word frequency and propagation patterns, vary from domain to domain, a phenomenon known as domain shift. Facing the challenge of serious domain shift, existing fake news detection techniques perform poorly in multi-domain scenarios. Therefore, a specialized model for MFND is needed. In this paper, we first design a benchmark fake news dataset for MFND with annotated domain labels, namely Weibo21, which consists of 4,488 fake news items and 4,640 real news items from 9 different domains. We further propose an effective Multi-domain Fake News Detection Model (MDFEND) by utilizing a domain gate to aggregate multiple representations extracted by a mixture of experts. The experiments show that MDFEND can significantly improve the performance of multi-domain fake news detection. Our dataset and code are available at this https URL.
TECHNOLOGY
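
The abstract above only says that a domain gate aggregates the representations produced by a mixture of experts. As a rough illustration of that general pattern, the sketch below turns gate scores into softmax weights and takes the weighted sum of the expert outputs. The function names and the toy numbers are assumptions; how MDFEND computes the gate scores from the domain label is not specified in the abstract and is not modeled here.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gated_mixture(expert_outputs, gate_scores):
    """Weighted sum of expert representations; weights are softmax(gate_scores)."""
    weights = softmax(gate_scores)
    dim = len(expert_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]

# Two hypothetical experts, each producing a 2-dimensional representation.
experts = [[1.0, 0.0], [0.0, 1.0]]
mixed = gated_mixture(experts, gate_scores=[0.0, 0.0])  # equal scores: plain average
```

In a multi-domain setting the gate scores would depend on the domain of the input, so that different domains lean on different experts.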
arxiv.org

A Deeper Understanding of State-Based Critics in Multi-Agent Reinforcement Learning

Centralized Training for Decentralized Execution, where training is done in a centralized offline fashion, has become a popular solution paradigm in Multi-Agent Reinforcement Learning. Many such methods take the form of actor-critic with state-based critics, since centralized training allows access to the true system state, which can be useful during training despite not being available at execution time. State-based critics have become a common empirical choice, albeit one which has had limited theoretical justification or analysis. In this paper, we show that state-based critics can introduce bias in the policy gradient estimates, potentially undermining the asymptotic guarantees of the algorithm. We also show that, even if the state-based critics do not introduce any bias, they can still result in a larger gradient variance, contrary to the common intuition. Finally, we show the effects of the theories in practice by comparing different forms of centralized critics on a wide range of common benchmarks, and detail how various environmental properties are related to the effectiveness of different types of critics.
CODING & PROGRAMMING
arxiv.org

Gravitational glint: Detectable gravitational wave tails from stars and compact objects

Observations of a merging neutron star binary in both gravitational waves, by the Laser Interferometer Gravitational-wave Observatory (LIGO), and across the spectrum of electromagnetic radiation, by myriad telescopes, have been used to show that gravitational waves travel in vacuum at a speed that is indistinguishable from that of light to within one part in a quadrillion. However, it has long been expected mathematically that, when electromagnetic or gravitational waves travel through vacuum in a curved spacetime, the waves develop "tails" that travel more slowly. The associated signal has been thought to be undetectably weak. Here we demonstrate that gravitational waves are efficiently scattered by the curvature sourced by ordinary compact objects -- stars, white dwarfs, neutron stars, and planets -- and certain candidates for dark matter, populating the interior of the null cone. The resulting "gravitational glint" should imminently be detectable, and be recognizable (for all but planets) as briefly delayed echoes of the primary signal emanating from extremely near the direction of the primary source. This opens the prospect for using GRAvitational Detection And Ranging (GRADAR) to map the Universe and conduct a comprehensive census of massive compact objects, and ultimately to explore their interiors.
ASTRONOMY
arxiv.org

The BepiColombo solar conjunction experiments revisited

The BepiColombo ESA/JAXA mission is currently in its 7-year cruise phase towards Mercury. The Mercury orbiter radioscience experiment (MORE), one of the 16 experiments of the mission, will start its scientific investigation during the superior solar conjunction (SSC) in March 2021 with a test of general relativity (GR). Other solar conjunctions will follow during the cruise phase, providing several opportunities to improve the results of the first experiment. The MORE radio tracking system allows precise ranging and Doppler measurements at almost all solar elongation angles (up to 7-8 solar radii), thus providing an accurate measurement of the relativistic time delay and frequency shift experienced by a radio signal during an SSC. The final objective of the experiment is to place new limits on the accuracy of GR as a theory of gravity in the weak-field limit. As in all gravity experiments, non-gravitational accelerations acting on the spacecraft are a major concern. Because of the proximity to the Sun, the spacecraft will undergo severe solar radiation pressure acceleration, and the effect of the random fluctuations of the solar irradiance may become a significant source of spacecraft buffeting. In this paper we address the problem of a realistic estimate of the outcome of the SSC experiments of BepiColombo, by including in the dynamical model the effects of random variations in the solar irradiance. We propose a numerical method to mitigate the impact of the variable solar radiation pressure on the outcome of the experiment. Our simulations show that, with different assumptions on the solar activity and observation coverage, the accuracy attainable in the estimation of $\gamma$ lies in the range $[6, 13]\cdot10^{-6}$.
AEROSPACE & DEFENSE
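
For readers unfamiliar with the parameter $\gamma$ estimated above: in the standard parametrized post-Newtonian description, the extra one-way light travel time of a radio signal passing near the Sun (the Shapiro delay) is commonly written as below. This is the textbook expression, given only as background; the abstract does not state which formulation the authors use. Here $r_1$ and $r_2$ are the heliocentric distances of the two ends of the radio link and $r_{12}$ is the distance between them (symbols chosen for this note).

```latex
\Delta t = (1+\gamma)\,\frac{G M_\odot}{c^{3}}\,
           \ln\!\left(\frac{r_1 + r_2 + r_{12}}{r_1 + r_2 - r_{12}}\right)
```

General relativity predicts $\gamma = 1$, so measuring the delay during a conjunction constrains any deviation of $\gamma$ from unity.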
esri.com

Defining scale-based symbology using ArcGIS Runtime SDK

Unique value and class breaks renderers allow you to specify a criterion to uniquely symbolize data so the viewer can understand trends and patterns within the data. There are situations where a dataset has a lot of unique values or classes. An example is a utility network dataset. For such datasets, it may not make sense to render data at all scales. To avoid that, one option is to set a minimum and a maximum scale on the layer. With this option, however, the entire layer will either be shown or hidden, depending on the map's scale. Another option is to use multiple layers that reference the same data. For the correct symbology to display at the correct scale, each layer must be configured to have its own visible scale range, symbology definition, and definition expression defining the subset of data to be made visible at different scales.
COMPUTERS
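
The multi-layer approach described above can be pictured with a small, SDK-independent sketch. It is plain Python, not ArcGIS Runtime API code: the `ScaledLayer` class, its field names, and the example definition expressions are all hypothetical. It assumes scales are given as denominators (50000 means 1:50,000), that the minimum scale is the most zoomed-out limit and the maximum scale the most zoomed-in limit, and that 0 means "no limit".

```python
from dataclasses import dataclass

@dataclass
class ScaledLayer:
    """Hypothetical stand-in for one layer that references the shared dataset."""
    name: str
    min_scale: float            # most zoomed-out denominator; 0 = no limit
    max_scale: float            # most zoomed-in denominator; 0 = no limit
    definition_expression: str  # subset of the data this layer draws

    def is_visible(self, map_scale: float) -> bool:
        too_far_out = self.min_scale > 0 and map_scale > self.min_scale
        too_far_in = self.max_scale > 0 and map_scale < self.max_scale
        return not (too_far_out or too_far_in)

def visible_layers(layers, map_scale):
    """Names of the layers that would draw at the given map scale."""
    return [layer.name for layer in layers if layer.is_visible(map_scale)]

# Two layers over the same data: coarse features when zoomed out, detail when zoomed in.
layers = [
    ScaledLayer("mains", min_scale=0, max_scale=50000, definition_expression="CLASS = 'main'"),
    ScaledLayer("service_lines", min_scale=50000, max_scale=0, definition_expression="CLASS = 'service'"),
]
```

At a map scale of 1:100,000 only `mains` would draw; at 1:10,000 only `service_lines` would, which is the per-scale switch in symbology that a single layer-wide scale range cannot provide.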
arxiv.org

Salient Object Detection by LTP Texture Characterization on Opposing Color Pairs under SLICO Superpixel Constraint

The effortless detection of salient objects by humans has been the subject of research in several fields, including computer vision, as it has many applications. However, salient object detection remains a challenge for many computer models dealing with color and textured images. Herein, we propose a novel and efficient strategy, through a simple model, almost without internal parameters, which generates a robust saliency map for a natural image. This strategy consists of integrating color information into local textural patterns to characterize a color micro-texture. Most models in the literature that use color and texture features treat them separately. In our case, it is the simple yet powerful LTP (Local Ternary Patterns) texture descriptor, applied to opposing color pairs of a color space, that allows us to achieve this end. Each color micro-texture is represented by a vector whose components are from a superpixel obtained by the SLICO (Simple Linear Iterative Clustering with zero parameter) algorithm, which is simple, fast and exhibits state-of-the-art boundary adherence. The degree of dissimilarity between each pair of color micro-textures is computed by the FastMap method, a fast version of MDS (Multi-dimensional Scaling), which accounts for the non-linearity of the color micro-textures while preserving their distances. These degrees of dissimilarity give us an intermediate saliency map for each of the RGB, HSL, LUV and CMY color spaces. The final saliency map is their combination, which takes advantage of the strengths of each. The MAE (Mean Absolute Error) and F$_{\beta}$ measures of our saliency maps on the complex ECSSD dataset show that our model is both simple and efficient, outperforming several state-of-the-art models.
COMPUTERS
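
As background for the LTP descriptor mentioned above, here is a minimal single-channel sketch of the commonly used Local Ternary Pattern definition on a 3x3 patch: each neighbor is coded +1, 0, or -1 depending on whether it lies above, within, or below a tolerance band of width `t` around the center pixel. The abstract's variant works on opposing color pairs, which this sketch does not model; the function names and the neighbor ordering are assumptions.

```python
def ltp_codes(patch, t):
    """Ternary codes of the 8 neighbors of the center of a 3x3 patch."""
    center = patch[1][1]
    # Clockwise from the top-left neighbor (ordering is an arbitrary choice here).
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = []
    for row, col in order:
        value = patch[row][col]
        if value >= center + t:
            codes.append(1)     # clearly brighter than the center
        elif value <= center - t:
            codes.append(-1)    # clearly darker than the center
        else:
            codes.append(0)     # within the tolerance band
    return codes

def split_upper_lower(codes):
    """Split a ternary pattern into the usual pair of binary pattern integers."""
    upper = sum(1 << i for i, c in enumerate(codes) if c == 1)
    lower = sum(1 << i for i, c in enumerate(codes) if c == -1)
    return upper, lower

patch = [[60, 50, 40],
         [50, 50, 50],
         [30, 50, 70]]
codes = ltp_codes(patch, t=5)
upper, lower = split_upper_lower(codes)
```

Splitting the ternary pattern into an "upper" and a "lower" binary pattern is the usual way to keep histogram sizes manageable.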

