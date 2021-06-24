

Understanding Zero-Shot Learning — Making ML More Human

By Editors' Picks
towardsdatascience.com
 19 days ago

An intuitive overview of how a model can recognize what it hasn’t seen. Zero-shot learning allows a model to recognize what it hasn’t seen before. Imagine you’re tasked with designing the latest and greatest machine learning model that can classify all animals. Yes, all animals. Using your machine learning knowledge,...
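One common way to make this idea concrete is attribute-based zero-shot classification: every class, including ones with no training images, is described by a vector of attributes, and an input is assigned to the class whose description best matches the attributes predicted for it. The sketch below is a minimal illustration under that assumption, not the article's own method; the attribute names, the class table, and the `predicted` vector (standing in for the output of a model trained only on seen classes) are all hypothetical.

```python
import math

# Hypothetical attribute order: [has_stripes, has_hooves, lives_in_water]
CLASS_ATTRIBUTES = {
    "horse":   [0.0, 1.0, 0.0],  # seen during training
    "tiger":   [1.0, 0.0, 0.0],  # seen during training
    "dolphin": [0.0, 0.0, 1.0],  # seen during training
    "zebra":   [1.0, 1.0, 0.0],  # never seen: known only by its description
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def zero_shot_classify(predicted_attributes, class_attributes=CLASS_ATTRIBUTES):
    """Return the class whose attribute description is most similar."""
    return max(
        class_attributes,
        key=lambda name: cosine(predicted_attributes, class_attributes[name]),
    )


# Stand-in for the attribute scores a trained predictor might output
# for a photo of a striped, hooved animal (assumed values).
predicted = [0.9, 0.8, 0.1]
print(zero_shot_classify(predicted))
```

Because the decision is made in attribute space, adding a new class only requires adding its description to the table, with no new training images.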

Related
Coding & Programming | towardsdatascience.com

Parameter counts in Machine Learning

Public dataset and analysis of the evolution of parameter counts in Machine Learning. In short: we have compiled information about the date of development and trainable parameter counts of n=139 machine learning systems between 1952 and 2021. This is, as far as we know, the biggest public dataset of its kind. You can access our dataset here, and the code to produce an interactive visualization is available here.
Computers | arxiv.org

Understanding Adversarial Attacks on Observations in Deep Reinforcement Learning

Recent works demonstrate that deep reinforcement learning (DRL) models are vulnerable to adversarial attacks which can decrease the victim's total reward by manipulating the observations. Compared with adversarial attacks in supervised learning, it is much more challenging to deceive a DRL model since the adversary has to infer the environmental dynamics. To address this issue, we reformulate the problem of adversarial attacks in function space and separate the previous gradient-based attacks into several subspaces. Following the analysis of the function space, we design a generic two-stage framework in the subspace where the adversary lures the agent to a target trajectory or a deceptive policy. In the first stage, we train a deceptive policy by hacking the environment, and discover a set of trajectories routing to the lowest reward. The adversary then misleads the victim to imitate the deceptive policy by perturbing the observations. Our method provides a tighter theoretical upper bound for the attacked agent's performance than the existing approaches. Extensive experiments demonstrate the superiority of our method and we achieve state-of-the-art performance on both Atari and MuJoCo environments.
Computers | arxiv.org

CLINE: Contrastive Learning with Semantic Negative Examples for Natural Language Understanding

Although pre-trained language models have proven useful for learning high-quality semantic representations, these models are still vulnerable to simple perturbations. Recent works that aim to improve the robustness of pre-trained models mainly focus on adversarial training from perturbed examples with similar semantics, neglecting the utilization of different or even opposite semantics. Unlike in the image processing field, text is discrete and a few word substitutions can cause significant semantic changes. To study the impact on semantics of small perturbations, we conduct a series of pilot experiments and surprisingly find that adversarial training is useless or even harmful for the model to detect these semantic changes. To address this problem, we propose Contrastive Learning with semantIc Negative Examples (CLINE), which constructs semantic negative examples in an unsupervised manner to improve robustness under semantically adversarial attacks. By comparing with similar and opposite semantic examples, the model can effectively perceive the semantic changes caused by small perturbations. Empirical results show that our approach yields substantial improvements on a range of sentiment analysis, reasoning, and reading comprehension tasks. CLINE also ensures compactness within the same semantics and separability across different semantics at the sentence level.
Computers | arxiv.org

Multiagent Deep Reinforcement Learning: Challenges and Directions Towards Human-Like Approaches

This paper surveys the field of multiagent deep reinforcement learning. The combination of deep neural networks with reinforcement learning has gained increased traction in recent years and is slowly shifting the focus from single-agent to multiagent environments. Dealing with multiple agents is inherently more complex as (a) the future rewards depend on the joint actions of multiple players and (b) the computational complexity of functions increases. We present the most common multiagent problem representations and their main challenges, and identify five research areas that address one or more of these challenges: centralised training and decentralised execution, opponent modelling, communication, efficient coordination, and reward shaping. We find that many computational studies rely on unrealistic assumptions or are not generalisable to other settings; they struggle to overcome the curse of dimensionality or nonstationarity. Approaches from psychology and sociology capture promising relevant behaviours such as communication and coordination. We suggest that, for multiagent reinforcement learning to be successful, future research addresses these challenges with an interdisciplinary approach to open up new possibilities for more human-oriented solutions in multiagent reinforcement learning.
Education | arxiv.org

Few-Shot Learning with a Strong Teacher

Few-shot learning (FSL) aims to train a strong classifier using limited labeled examples. Many existing works take the meta-learning approach, sampling few-shot tasks in turn and optimizing the few-shot learner's performance on classifying the query examples. In this paper, we point out two potential weaknesses of this approach. First, the sampled query examples may not provide sufficient supervision for the few-shot learner. Second, the effectiveness of meta-learning diminishes sharply with increasing shots (i.e., the number of training examples per class). To resolve these issues, we propose a novel objective to directly train the few-shot learner to perform like a strong classifier. Concretely, we associate each sampled few-shot task with a strong classifier, which is learned with ample labeled examples. The strong classifier has a better generalization ability and we use it to supervise the few-shot learner. We present an efficient way to construct the strong classifier, making our proposed objective an easily plug-and-play term to existing meta-learning based FSL methods. We validate our approach in combination with many representative meta-learning methods. On several benchmark datasets including miniImageNet and tieredImageNet, our approach leads to a notable improvement across a variety of tasks. More importantly, with our approach, meta-learning based FSL methods can consistently outperform non-meta-learning based ones, even in a many-shot setting, greatly strengthening their applicability.
Software | arxiv.org

Segmenting 3D Hybrid Scenes via Zero-Shot Learning

This work tackles the problem of point cloud semantic segmentation for 3D hybrid scenes under the framework of zero-shot learning. Here, by hybrid, we mean that the scene consists of both seen-class and unseen-class 3D objects, a more general and realistic setting in applications. To our knowledge, this problem has not been explored in the literature. To this end, we propose a network, called PFNet, to synthesize point features for various classes of objects by leveraging the semantic features of both seen and unseen object classes. The proposed PFNet employs a GAN architecture to synthesize point features, where the semantic relationship between seen-class and unseen-class features is consolidated by adapting a new semantic regularizer, and the synthesized features are used to train a classifier for predicting the labels of the testing 3D scene points. In addition, we introduce two benchmarks for algorithmic evaluation by re-organizing the public S3DIS and ScanNet datasets under six different data splits. Experimental results on the two benchmarks validate our proposed method, and we hope the two benchmarks and the methodology can support further research in this new direction.
Computers | arxiv.org

Improving Human Motion Prediction Through Continual Learning

Human motion prediction is an essential component for enabling closer human-robot collaboration. The task of accurately predicting human motion is non-trivial. It is compounded by the variability of human motion, both at a skeletal level due to the varying size of humans and at a motion level due to individual movement's idiosyncrasies. These variables make it challenging for learning algorithms to obtain a general representation that is robust to the diverse spatio-temporal patterns of human motion. In this work, we propose a modular sequence learning approach that allows end-to-end training while also having the flexibility of being fine-tuned. Our approach relies on the diversity of training samples to first learn a robust representation, which can then be fine-tuned in a continual learning setup to predict the motion of new subjects. We evaluated the proposed approach by comparing its performance against state-of-the-art baselines. The results suggest that our approach outperforms other methods over all the evaluated temporal horizons, using a small amount of data for fine-tuning. The improved performance of our approach opens up the possibility of using continual learning for personalized and reliable motion prediction.
Coding & Programming | arxiv.org

Few-shot Learning for Unsupervised Feature Selection

We propose a few-shot learning method for unsupervised feature selection, which is a task to select a subset of relevant features in unlabeled data. Existing methods usually require many instances for feature selection. However, sufficient instances are often unavailable in practice. The proposed method can select a subset of relevant features in a target task given a few unlabeled target instances by training with unlabeled instances in multiple source tasks. Our model consists of a feature selector and decoder. The feature selector outputs a subset of relevant features taking a few unlabeled instances as input such that the decoder can reconstruct the original features of unseen instances from the selected ones. The feature selector uses the Concrete random variables to select features via gradient descent. To encode task-specific properties from a few unlabeled instances to the model, the Concrete random variables and decoder are modeled using permutation-invariant neural networks that take a few unlabeled instances as input. Our model is trained by minimizing the expected test reconstruction error given a few unlabeled instances that is calculated with datasets in source tasks. We experimentally demonstrate that the proposed method outperforms existing feature selection methods.
Software | ScienceBlog.com

AI Learns to Predict Human Behavior from Videos

Predicting what someone is about to do next based on their body language comes naturally to humans but not so for computers. When we meet another person, they might greet us with a hello, handshake, or even a fist bump. We may not know which gesture will be used, but we can read the situation and respond appropriately.
Coding & Programming | arxiv.org

The MineRL BASALT Competition on Learning from Human Feedback

Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca Dragan. The last decade has seen a significant increase of interest in deep learning research, with many public successes that have demonstrated its potential....
Coding & Programming | arxiv.org

Mitigating Generation Shifts for Generalized Zero-Shot Learning

Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize the seen and unseen samples, where unseen classes are not observable during training. It is natural to derive generative models and hallucinate training samples for unseen classes based on the knowledge learned from the seen samples. However, most of these models suffer from "generation shifts", where the synthesized samples may drift from the real distribution of unseen data. In this paper, we conduct an in-depth analysis of this issue and propose a novel Generation Shifts Mitigating Flow (GSMFlow) framework, which is comprised of multiple conditional affine coupling layers for learning unseen data synthesis efficiently and effectively. In particular, we identify three potential problems that trigger the generation shifts, i.e., semantic inconsistency, variance decay, and structural permutation, and address them respectively. First, to reinforce the correlations between the generated samples and the respective attributes, we explicitly embed the semantic information into the transformations in each of the coupling layers. Second, to recover the intrinsic variance of the synthesized unseen features, we introduce a visual perturbation strategy to diversify the intra-class variance of generated data and thereby help adjust the decision boundary of the classifier. Third, to avoid structural permutation in the semantic space, we propose a relative positioning strategy to manipulate the attribute embeddings, guiding them to fully preserve the inter-class geometric structure. Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings. Our code is available at: this https URL.
Health | Nature.com

The powers and perils of using digital data to understand human behaviour

Computational social science is a powerful research tool. But it needs its different disciplines to find a common language. You have full access to this article via your institution. What are the causes of vaccine hesitancy? How can people be encouraged to exercise more? What can governments do to improve...
Computers | arxiv.org

Learn to Learn Metric Space for Few-Shot Segmentation of 3D Shapes

Recent research has seen numerous supervised learning-based methods for 3D shape segmentation, and remarkable performance has been achieved on various benchmark datasets. These supervised methods require a large amount of annotated data to train deep neural networks to ensure the generalization ability on the unseen test set. In this paper, we introduce a meta-learning-based method for few-shot 3D shape segmentation where only a few labeled samples are provided for the unseen classes. To achieve this, we treat the shape segmentation as a point labeling problem in the metric space. Specifically, we first design a meta-metric learner to transform input shapes into embedding space and our model learns to learn a proper metric space for each object class based on point embeddings. Then, for each class, we design a metric learner to extract part-specific prototype representations from a few support shapes and our model performs per-point segmentation over the query shapes by matching each point to its nearest prototype in the learned metric space. A metric-based loss function is used to dynamically modify distances between point embeddings, thus maximizing intra-part similarity while minimizing inter-part similarity. A dual segmentation branch is adopted to make full use of the support information and implicitly encourages consistency between the support and query prototypes. We demonstrate the superior performance of our proposed method on the ShapeNet part dataset under the few-shot scenario, compared with well-established baselines and state-of-the-art semi-supervised methods.
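The nearest-prototype step described in this abstract can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the paper's implementation: point embeddings are taken as given rather than learned, each part prototype is assumed to be the mean of its support embeddings, and distance is plain Euclidean. The function names `part_prototypes` and `segment_points` are hypothetical.

```python
import math
from collections import defaultdict


def part_prototypes(support_embeddings, support_labels):
    """Average the support point embeddings of each part label (assumed prototype rule)."""
    sums = {}
    counts = defaultdict(int)
    for emb, label in zip(support_embeddings, support_labels):
        if label not in sums:
            sums[label] = [0.0] * len(emb)
        sums[label] = [s + x for s, x in zip(sums[label], emb)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def segment_points(query_embeddings, prototypes):
    """Label each query point with the part whose prototype is nearest."""
    return [
        min(prototypes, key=lambda label: math.dist(emb, prototypes[label]))
        for emb in query_embeddings
    ]


# Toy 2-D embeddings for a few labeled support points (made-up values).
support = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
labels = ["leg", "leg", "seat", "seat"]
protos = part_prototypes(support, labels)
print(segment_points([[1.0, 1.0], [9.0, 9.0]], protos))
```

In the method the abstract describes, the embedding and the metric are themselves learned, so the fixed Euclidean distance here stands in for that learned metric space.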
Software | arxiv.org

The Simons Observatory: HoloSim-ML: machine learning applied to the efficient analysis of radio holography measurements of complex optical systems

Near-field radio holography is a common method for measuring and aligning mirror surfaces for millimeter and sub-millimeter telescopes. In instruments with more than a single mirror, degeneracies arise in the holography measurement, requiring multiple measurements and new fitting methods. We present HoloSim-ML, a Python code for beam simulation and analysis of radio holography data from complex optical systems. This code uses machine learning to efficiently determine the position of hundreds of mirror adjusters on multiple mirrors with few micron accuracy. We apply this approach to the example of the Simons Observatory 6m telescope.
Software | arxiv.org

Exploiting the relationship between visual and textual features in social networks for image classification with zero-shot deep learning

One of the main issues related to unsupervised machine learning is the cost of processing and extracting useful information from large datasets. In this work, we propose a classifier ensemble based on the transferable learning capabilities of the CLIP neural network architecture in multimodal environments (image and text) from social media. For this purpose, we used the InstaNY100K dataset and proposed a validation approach based on sampling techniques. Our experiments, based on image classification tasks according to the labels of the Places dataset, are performed by first considering only the visual part, and then adding the associated texts as support. The results demonstrate that trained neural networks such as CLIP can be successfully applied to image classification with little fine-tuning, and that considering the texts associated with the images can help to improve accuracy, depending on the goal. The results point to what seems to be a promising research direction.
Computers | adafruit.com

Raspberry Pi Zero Makes a Xylophone Play Itself

While there are thousands of MIDI files freely available online, very few of them could actually be played by the xylophone. With only 32 notes, the instrument is limited in what it can play without losing any notes. Also, even when a MIDI file uses just 32 consecutive notes, they might not be the same range of 32 notes as the xylophone has, so you need to transpose. Stéphane developed a tool in Python to filter out 32-note tunes from thousands of MIDI files and automatically transpose them so the xylophone can play them. And, yes, everything you need to copy this filtering and transposing function is on GitHub.
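The filter-and-transpose idea can be sketched in a few lines. This is not Stéphane's tool (which the blurb says is on GitHub); it is a minimal illustration that assumes the 32 keys are chromatic, that a tune is already available as a list of MIDI note numbers, and that the instrument's lowest key is MIDI note 60. The function name `fit_to_instrument` and both default values are hypothetical.

```python
def fit_to_instrument(notes, lowest=60, n_keys=32):
    """Transpose MIDI note numbers into the instrument's range.

    Returns the transposed notes, or None when the tune spans more
    semitones than the instrument has keys. `lowest` and `n_keys`
    are assumed values, not the real xylophone's specification.
    """
    if not notes:
        return []
    highest = lowest + n_keys - 1
    # Any shift between these two bounds keeps every note playable.
    min_shift = lowest - min(notes)
    max_shift = highest - max(notes)
    if min_shift > max_shift:
        return None  # tune is wider than the instrument
    # Pick the playable shift closest to zero (no change if it already fits).
    shift = min(max(0, min_shift), max_shift)
    return [n + shift for n in notes]


print(fit_to_instrument([40, 45, 52]))  # low tune, shifted up
print(fit_to_instrument([30, 70]))      # too wide for 32 keys
```

Choosing the shift closest to zero keeps a tune as near to its original pitch as the range allows; a real tool might instead prefer octave shifts to preserve the key.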
Computers | tamu.edu

Applying quantum mechanics to human and machine decision-making

Can we develop models of the cognitive behavior of human-machine collaboration? While this might seem like the stuff of science fiction, researchers at Texas A&M University are currently developing algorithms that interpret situations close to how humans navigate through their daily lives. For example, say you see something that resembles...
