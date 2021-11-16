
Wildlife

Pose Recognition in the Wild: Animal pose estimation using Agglomerative Clustering and Contrastive Learning

By Samayan Bhattacharya, Sk Shahnawaz
arxiv.org
 8 days ago

Animal pose estimation has recently come into the limelight due to its application in biology, zoology, and aquaculture. Deep learning methods have effectively been applied to human pose estimation. However, the major bottleneck to the application of these methods to animal pose estimation is the unavailability of sufficient quantities of labeled...

Related
arxiv.org

6D Pose Estimation with Combined Deep Learning and 3D Vision Techniques for a Fast and Accurate Object Grasping

Real-time robotic grasping, supporting a subsequent precise object-in-hand operation task, is a priority target towards highly advanced autonomous systems. However, an algorithm that grasps with sufficient accuracy and time efficiency has yet to be found. This paper proposes a novel 2-stage method that combines fast 2D object recognition using a deep neural network with a subsequent accurate and fast 6D pose estimation based on the Point Pair Feature framework, forming a real-time 3D object recognition and grasping solution capable of handling multi-object class scenes. The proposed solution has the potential to perform robustly in real-time applications that require both efficiency and accuracy. In order to validate our method, we conducted extensive and thorough experiments involving laborious preparation of our own dataset. The experiment results show that the proposed method scores 97.37% accuracy in the 5cm5deg metric and 99.37% in the Average Distance metric. Experiment results have shown an overall 62% relative improvement (5cm5deg metric) and 52.48% (Average Distance metric) by using the proposed method. Moreover, the pose estimation execution also showed an average improvement of 47.6% in running time. Finally, to illustrate the overall efficiency of the system in real-time operations, a pick-and-place robotic experiment was conducted and showed a convincing success rate of 90%. This experiment video is available at this https URL.
TECHNOLOGY
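The 5cm5deg metric quoted in the grasping abstract above is commonly read as: a predicted pose counts as correct when its translation error is under 5 cm and its rotation error is under 5 degrees. The following is a minimal sketch of that reading, not the authors' evaluation code. The function names, the strict "less than" thresholds, and the assumption that translations are given in metres are all illustrative choices.

```python
import math


def rotation_error_deg(R_pred, R_true):
    """Angle in degrees of the relative rotation between two 3x3 matrices."""
    # trace(R_pred^T R_true) equals the sum of elementwise products.
    trace = sum(R_pred[i][j] * R_true[i][j] for i in range(3) for j in range(3))
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_error_cm(t_pred, t_true):
    """Euclidean distance in centimetres, assuming inputs are in metres."""
    return 100.0 * math.dist(t_pred, t_true)


def is_correct_5cm5deg(R_pred, t_pred, R_true, t_true):
    """True when both errors fall under the assumed 5 cm / 5 degree limits."""
    return (translation_error_cm(t_pred, t_true) < 5.0
            and rotation_error_deg(R_pred, R_true) < 5.0)


def accuracy_5cm5deg(predictions, ground_truths):
    """Fraction of (R, t) predictions that pass the 5cm5deg check."""
    hits = sum(
        is_correct_5cm5deg(R_p, t_p, R_t, t_t)
        for (R_p, t_p), (R_t, t_t) in zip(predictions, ground_truths)
    )
    return hits / len(predictions)


def rot_z(deg):
    """Hypothetical helper: rotation about the z axis, used for the demo."""
    a = math.radians(deg)
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a), math.cos(a), 0.0],
            [0.0, 0.0, 1.0]]
```

For example, `is_correct_5cm5deg(rot_z(0), [0, 0, 0], rot_z(3), [0.03, 0, 0])` compares a prediction that is off by 3 degrees and 3 cm, which passes under this reading.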
arxiv.org

Sparse Steerable Convolutions: An Efficient Learning of SE(3)-Equivariant Features for Estimation and Tracking of Object Poses in 3D Space

As a basic component of SE(3)-equivariant deep feature learning, steerable convolution has recently demonstrated its advantages for 3D semantic analysis. The advantages are, however, brought by expensive computations on dense, volumetric data, which prevent its practical use for efficient processing of 3D data that are inherently sparse. In this paper, we propose a novel design of Sparse Steerable Convolution (SS-Conv) to address the shortcoming; SS-Conv greatly accelerates steerable convolution with sparse tensors, while strictly preserving the property of SE(3)-equivariance. Based on SS-Conv, we propose a general pipeline for precise estimation of object poses, wherein a key design is a Feature-Steering module that takes full advantage of SE(3)-equivariance and is able to conduct an efficient pose refinement. To verify our designs, we conduct thorough experiments on three tasks of 3D object semantic analysis, including instance-level 6D pose estimation, category-level 6D pose and size estimation, and category-level 6D pose tracking. Our proposed pipeline based on SS-Conv outperforms existing methods on almost all the metrics evaluated by the three tasks. Ablation studies also show the superiority of our SS-Conv over alternative convolutions in terms of both accuracy and efficiency. Our code is released publicly at this https URL.
SCIENCE
arxiv.org

Coarse-to-fine Animal Pose and Shape Estimation

Most existing animal pose and shape estimation approaches reconstruct animal meshes with a parametric SMAL model. This is because the low-dimensional pose and shape parameters of the SMAL model make it easier for deep networks to learn the high-dimensional animal meshes. However, the SMAL model is learned from scans of toy animals with limited pose and shape variations, and thus may not be able to represent highly varying real animals well. This may result in poor fittings of the estimated meshes to the 2D evidence, e.g. 2D keypoints or silhouettes. To mitigate this problem, we propose a coarse-to-fine approach to reconstruct a 3D animal mesh from a single image. The coarse estimation stage first estimates the pose, shape and translation parameters of the SMAL model. The estimated meshes are then used as a starting point by a graph convolutional network (GCN) to predict a per-vertex deformation in the refinement stage. This combination of SMAL-based and vertex-based representations benefits from both parametric and non-parametric representations. We design our mesh refinement GCN (MRGCN) as an encoder-decoder structure with hierarchical feature representations to overcome the limited receptive field of traditional GCNs. Moreover, we observe that the global image feature used by existing animal mesh reconstruction works is unable to capture detailed shape information for mesh refinement. We thus introduce a local feature extractor to retrieve a vertex-level feature and use it together with the global feature as the input of the MRGCN. We test our approach on the StanfordExtra dataset and achieve state-of-the-art results. Furthermore, we test the generalization capacity of our approach on the Animal Pose and BADJA datasets. Our code is available at the project website.
WILDLIFE
arxiv.org

Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation

In keypoint estimation tasks such as human pose estimation, heatmap-based regression is the dominant approach despite possessing notable drawbacks: heatmaps intrinsically suffer from quantization error and require excessive computation to generate and post-process. Motivated to find a more efficient solution, we propose a new heatmap-free keypoint estimation method in which individual keypoints and sets of spatially related keypoints (i.e., poses) are modeled as objects within a dense single-stage anchor-based detection framework. Hence, we call our method KAPAO (pronounced "Ka-Pow!") for Keypoints And Poses As Objects. We apply KAPAO to the problem of single-stage multi-person human pose estimation by simultaneously detecting human pose objects and keypoint objects and fusing the detections to exploit the strengths of both object representations. In experiments, we observe that KAPAO is significantly faster and more accurate than previous methods, which suffer greatly from heatmap post-processing. Moreover, the accuracy-speed trade-off is especially favourable in the practical setting when not using test-time augmentation. Our large model, KAPAO-L, achieves an AP of 70.6 on the Microsoft COCO Keypoints validation set without test-time augmentation, which is 2.5x faster and 4.0 AP more accurate than the next best single-stage model. Furthermore, KAPAO excels in the presence of heavy occlusion. On the CrowdPose test set, KAPAO-L achieves new state-of-the-art accuracy for a single-stage method with an AP of 68.9.
SCIENCE
IN THIS ARTICLE
#Unsupervised Learning#Animals#Estimation#Deep Learning#Agglomerative Clustering
arxiv.org

ACR-Pose: Adversarial Canonical Representation Reconstruction Network for Category Level 6D Object Pose Estimation

Recently, category-level 6D object pose estimation has achieved significant improvements with the development of reconstructing canonical 3D representations. However, the reconstruction quality of existing methods is still far from excellent. In this paper, we propose a novel Adversarial Canonical Representation Reconstruction Network named ACR-Pose. ACR-Pose consists of a Reconstructor and a Discriminator. The Reconstructor is primarily composed of two novel sub-modules: Pose-Irrelevant Module (PIM) and Relational Reconstruction Module (RRM). PIM tends to learn canonical-related features to make the Reconstructor insensitive to rotation and translation, while RRM explores essential relational information between different input modalities to generate high-quality features. Subsequently, a Discriminator is employed to guide the Reconstructor to generate realistic canonical representations. The Reconstructor and the Discriminator learn to optimize through adversarial training. Experimental results on the prevalent NOCS-CAMERA and NOCS-REAL datasets demonstrate that our method achieves state-of-the-art performance.
COMPUTERS
arxiv.org

Contrast-reconstruction Representation Learning for Self-supervised Skeleton-based Action Recognition

Skeleton-based action recognition is widely used in varied areas, e.g., surveillance and human-machine interaction. Existing models are mainly learned in a supervised manner, thus heavily depending on large-scale labeled data which could be infeasible when labels are prohibitively expensive. In this paper, we propose a novel Contrast-Reconstruction Representation Learning network (CRRL) that simultaneously captures postures and motion dynamics for unsupervised skeleton-based action recognition. It mainly consists of three parts: Sequence Reconstructor, Contrastive Motion Learner, and Information Fuser. The Sequence Reconstructor learns representation from skeleton coordinate sequence via reconstruction, thus the learned representation tends to focus on trivial postural coordinates and be hesitant in motion learning. To enhance the learning of motions, the Contrastive Motion Learner performs contrastive learning between the representations learned from coordinate sequence and additional velocity sequence, respectively. Finally, in the Information Fuser, we explore varied strategies to combine the Sequence Reconstructor and Contrastive Motion Learner, and propose to capture postures and motions simultaneously via a knowledge-distillation based fusion strategy that transfers the motion learning from the Contrastive Motion Learner to the Sequence Reconstructor. Experimental results on several benchmarks, i.e., NTU RGB+D 60, NTU RGB+D 120, CMU mocap, and NW-UCLA, demonstrate the promise of the proposed CRRL method by far outperforming state-of-the-art approaches.
TECHNOLOGY
arxiv.org

PAM: Pose Attention Module for Pose-Invariant Face Recognition

Pose variation is one of the key challenges in face recognition. Conventional techniques mainly focus on face frontalization or face augmentation in image space. However, transforming face images in image space is not guaranteed to preserve the lossless identity features of the original image. Moreover, these methods incur higher computational cost and memory requirements due to the additional models. We argue that it is more desirable to perform feature transformation in hierarchical feature space rather than image space, which can take advantage of different feature levels and benefit from joint learning with representation learning. To this end, we propose a lightweight and easy-to-implement attention block, named Pose Attention Module (PAM), for pose-invariant face recognition. Specifically, PAM performs frontal-profile feature transformation in hierarchical feature space by learning residuals between pose variations with a soft gate mechanism. We validated the effectiveness of the PAM block design through extensive ablation studies and verified the performance on several popular benchmarks, including LFW, CFP-FP, AgeDB-30, CPLFW, and CALFW. Experimental results show that our method not only outperforms state-of-the-art methods but also effectively reduces memory requirements by more than 75 times. It is noteworthy that our method is not limited to face recognition with large pose variations. By adjusting the soft gate mechanism of PAM to a specific coefficient, this semantic attention block can easily be extended to address other intra-class imbalance problems in face recognition, including large variations in age, illumination, expression, etc.
TECHNOLOGY
CBS Tampa

Deep Sea Mystery; Researchers Recover Ancient Mammoth Tusk Off Central Coast

MONTEREY (CBS SF) — It was a discovery that raised a few eyebrows and even quickened the pulses of the deep sea researchers at the Monterey Bay Aquarium Research Institute. During a deep sea exploration dive 185 miles off the Central California coast in 2019, the camera on their remotely controlled probe flashed on the image of what looked like an elephant’s tusk. Only able to collect a small piece at the time, the researchers returned in July to retrieve the complete specimen from its 10,000-foot-deep resting place and have now discovered that the just over 3-foot tusk is from a Columbian mammoth. The...
WILDLIFE
arxiv.org

Hierarchical Graph Networks for 3D Human Pose Estimation

Recent 2D-to-3D human pose estimation works tend to utilize the graph structure formed by the topology of the human skeleton. However, we argue that this skeletal topology is too sparse to reflect the body structure and suffers from a serious 2D-to-3D ambiguity problem. To overcome these weaknesses, we propose a novel graph convolution network architecture, Hierarchical Graph Networks (HGN). It is based on a denser graph topology generated by our multi-scale graph structure building strategy, thus providing more delicate geometric information. The proposed architecture contains three sparse-to-fine representation subnetworks organized in parallel, in which multi-scale graph-structured features are processed and exchange information through a novel feature fusion strategy, leading to rich hierarchical representations. We also introduce a 3D coarse mesh constraint to further boost detail-related feature learning. Extensive experiments demonstrate that our HGN achieves state-of-the-art performance with reduced network parameters.
SCIENCE
arxiv.org

Exploring Non-Contrastive Representation Learning for Deep Clustering

Existing deep clustering methods rely on contrastive learning for representation learning, which requires negative examples to form an embedding space where all instances are well-separated. However, the negative examples inevitably give rise to the class collision issue, compromising the representation learning for clustering. In this paper, we explore non-contrastive representation learning for deep clustering, termed NCC, which is based on BYOL, a representative method without negative examples. First, we propose to align one augmented view of an instance with the neighbors of another view in the embedding space, called the positive sampling strategy, which avoids the class collision issue caused by the negative examples and hence improves the within-cluster compactness. Second, we propose to encourage alignment between two augmented views of one prototype and uniformity among all prototypes, named prototypical contrastive loss or ProtoCL, which can maximize the inter-cluster distance. Moreover, we formulate NCC in an Expectation-Maximization (EM) framework, in which the E-step utilizes spherical k-means to estimate the pseudo-labels of instances and the distribution of prototypes from a target network, and the M-step leverages the proposed losses to optimize an online network. As a result, NCC forms an embedding space where all clusters are well-separated and within-cluster examples are compact. Experimental results on several clustering benchmark datasets including ImageNet-1K demonstrate that NCC outperforms the state-of-the-art methods by a significant margin.
COMPUTERS
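The NCC abstract above mentions spherical k-means as the E-step that produces pseudo-labels and prototypes. As a rough illustration of that one ingredient only, the sketch below runs plain spherical k-means on embeddings that are assumed to be already computed. It is not the NCC implementation: the function names, the first-k initialisation, and the fixed iteration count are hypothetical simplifications.

```python
import math


def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0.0 else [x / norm for x in v]


def dot(a, b):
    """Inner product; equals cosine similarity for unit vectors."""
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(embeddings, k, iterations=10):
    """Cluster unit-normalised embeddings by cosine similarity.

    Returns (labels, prototypes): one cluster index per embedding and
    one unit-length prototype per cluster.
    """
    data = [normalize(e) for e in embeddings]
    # Assumption: naive initialisation from the first k points.
    prototypes = [list(p) for p in data[:k]]
    labels = [0] * len(data)
    for _ in range(iterations):
        # Assignment: each point takes the most similar prototype as its label.
        labels = [
            max(range(k), key=lambda c: dot(x, prototypes[c])) for x in data
        ]
        # Update: each prototype becomes the re-normalised mean of its members.
        for c in range(k):
            members = [x for x, label in zip(data, labels) if label == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                prototypes[c] = normalize(mean)
    return labels, prototypes
```

Calling `spherical_kmeans([[1, 0], [0, 1], [0.9, 0.1], [0.1, 0.9]], k=2)` groups the two vectors near the x axis together and the two near the y axis together. The re-normalisation in the update step is what keeps prototypes on the unit sphere and distinguishes this from ordinary k-means.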
arxiv.org

The Spatial Evolution of Young Massive Clusters III. Effect of the Gaia Filter on 2D Spatial Distribution Studies

[Context.] Gaia is limited in the optical down to G~21 mag so it is essential to understand the biases introduced by a magnitude limited sample on spatial distribution studies. [Aims.] To ascertain how sample incompleteness in Gaia observations of young clusters affects the local spatial analysis tool INDICATE and subsequently the perceived spatial properties of these clusters. [Methods.] We created a mock Gaia cluster catalogue from a synthetic dataset using the observation generating tool MYOSOTIS. The effect of cluster distance, uniform and variable extinction, binary fraction, population masking by the point spread function wings of high mass members, and contrast sensitivity limits on the trends identified by INDICATE are explored. A comparison of the typical index values derived by INDICATE for members of the synthetic dataset and their corresponding mock Gaia catalogue observations is made to identify any significant changes. [Results.] We typically find only small variations in the pre- and post- observation index values of cluster populations, which can increase as a function of incompleteness percentage and binarity. No significant strengthening, or false signatures, of stellar concentrations are found but real signatures may be diluted. Conclusions drawn about the spatial behaviour of Gaia observed cluster populations which are, and are not, associated with their natal nebulosity are reliable for most clusters but the perceived behaviours of individual members can change so INDICATE should be used as a measure of spatial behaviours between members as a function of their intrinsic properties (mass, age, object type etc.), rather than to draw conclusions about any specific observed member. [Conclusions.] INDICATE is a robust spatial analysis tool to reliably study Gaia observed young cluster populations within 1 kpc, up to a sample incompleteness of 83.3% and binarity of 50%.
ASTRONOMY
arxiv.org

Empirically estimating the distribution of the loudest candidate from a gravitational-wave search

Searches for gravitational-wave signals are often based on maximizing a detection statistic over a bank of waveform templates, covering a given parameter space with a variable level of correlation. Results are often evaluated using a noise-hypothesis test, where the background is characterized by the sampling distribution of the loudest template. In the context of continuous gravitational-wave searches, properly describing said distribution is an open problem: current approaches focus on a particular detection statistic and neglect template-bank correlations. We introduce a new approach using extreme value theory to describe the distribution of the loudest template's detection statistic in an arbitrary template bank. Our new proposal automatically generalizes to a wider class of detection statistics, including (but not limited to) line-robust statistics and transient continuous-wave signal hypotheses, and improves the estimation of the expected maximum detection statistic at a negligible computing cost. The performance of our proposal is demonstrated on simulated data as well as by applying it to different kinds of (transient) continuous-wave searches using O2 Advanced LIGO data. We release an accompanying Python software package, distromax, implementing our new developments.
PHYSICS
Discover Mag

Evidence Shows Humans May Have Introduced Now-Extinct Wolf to the Falkland Islands

A fossil warrah skull found at Spring Point Farm on West Falkland. The skull is housed at the Falkland Islands Museum and National Trust. (Credit: Kit Hamley/Inside Science) (Inside Science) — An unknown population of humans that left few traces on the landscape of the Falkland Islands may have brought large fox-like dogs still present when Europeans first visited the archipelago in the late 17th century.
WILDLIFE
ScienceAlert

We May Finally Understand Why These Gargantuan Mollusks Got So Huge

During the late Cretaceous, around 80 million years ago, monsters roamed the Earth. Not just the tyrannosaurs and titanosaurs. Even smaller animals could be super-sized. It was during this time that the size of a type of marine mollusk peaked, with the largest species of ammonite reaching sizes up to 2.5 meters (8.2 feet) across. No other ammonite ever reached such a prodigious size – and, as with all outliers, scientists have been keen to understand exactly why. Now, after studying the fossilized remains of 154 ammonites across a range of sizes, an international team led by paleontologist Christina Ifrim of Heidelberg University...
WILDLIFE
natureworldnews.com

Coronavirus Outbreak in White-Tailed Deer May Alter the Trajectory of Pandemic

SARS-CoV-2 spreads rapidly in white-tailed deer, according to scientists, and the virus is ubiquitous in this deer population across the United States. Researchers believe the findings are alarming and might have far-reaching repercussions for the coronavirus pandemic's long-term trajectory. COVID Outbreak. Since the initial appearance of SARS-CoV-2, the coronavirus that...
WILDLIFE
huntingdondailynews.com

Pesky Asian Lady Beetles pose no harm

As the days become colder, people are finding what appear to be ladybugs in their homes. They are actually Asian lady beetles, which are not really bugs at all. “Beetles and bugs are different species, like cats and dogs are,” said Dr. Norris Muth, a biology professor at Juniata College. “Beetles have a line down their backs that provide two different types of wings,” he said. “Bugs, like stink bugs, have an X symbol on their back.”
ANIMALS