Zero-Shot In-Context Learning Task

From GM-RKB

A Zero-Shot In-Context Learning Task is an in-context learning task in which a pretrained model must make accurate predictions about inputs from classes it has not encountered during training, given only a task description in the prompt and no in-context examples.
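The zero-shot setting can be contrasted with few-shot in-context learning by looking at the prompt alone: a zero-shot prompt contains a task instruction and the input, but no labeled demonstrations. The sketch below is illustrative; the label set and prompt wording are hypothetical, not from any particular system.

```python
# Zero-shot in-context learning: the prompt carries only a task
# instruction and the input to classify -- no labeled demonstrations.
def build_zero_shot_prompt(text: str, labels: list[str]) -> str:
    """Format a classification prompt with no in-context examples."""
    label_str = ", ".join(labels)
    return (
        f"Classify the following review as one of: {label_str}.\n"
        f"Review: {text}\n"
        "Label:"
    )

prompt = build_zero_shot_prompt("The plot was dull.", ["positive", "negative"])
print(prompt)
```

A few-shot variant of the same task would differ only in that the prompt would also include example (review, label) pairs before the query.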



References

2022

  • (Wikipedia, 2022) ⇒ https://en.wikipedia.org/wiki/Zero-shot_learning Retrieved:2022-12-8.
    • Zero-shot learning (ZSL) is a problem setup in machine learning, where at test time, a learner observes samples from classes which were not observed during training, and needs to predict the class that they belong to. Zero-shot methods generally work by associating observed and non-observed classes through some form of auxiliary information, which encodes observable distinguishing properties of objects. For example, given a set of images of animals to be classified, along with auxiliary textual descriptions of what animals look like, an artificial intelligence model which has been trained to recognize horses, but has never been given a zebra, can still recognize a zebra when it also knows that zebras look like striped horses. This problem is widely studied in computer vision, natural language processing, and machine perception.
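The zebra example above relies on auxiliary attribute information shared between seen and unseen classes. A minimal sketch of this attribute-matching idea follows; the attribute names and values are illustrative, not taken from a real dataset.

```python
# Attribute-based zero-shot classification: each class, seen or unseen,
# is described by a vector of observable attributes.
CLASS_ATTRIBUTES = {
    "horse": (0, 1, 1),  # (striped, maned, hoofed) -- seen in training
    "tiger": (1, 0, 0),  # seen in training
    "zebra": (1, 1, 1),  # never seen in training
}

def predict_class(attrs):
    """Return the class whose attribute signature is nearest (L1 distance)."""
    return min(
        CLASS_ATTRIBUTES,
        key=lambda c: sum(abs(a - b) for a, b in zip(CLASS_ATTRIBUTES[c], attrs)),
    )

# An attribute predictor trained only on seen classes can still output
# "striped, maned, hoofed" for a zebra image; matching attribute
# signatures then recovers the unseen class.
print(predict_class((1, 1, 1)))  # -> zebra
```

In practice the attribute vector for a test image is produced by a model trained on the seen classes; only the class-to-attribute mapping for unseen classes must be supplied as auxiliary information.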

2019

  • (Wang, Zheng et al., 2019) ⇒ Wei Wang, Vincent W. Zheng, Han Yu, and Chunyan Miao. (2019). “A Survey of Zero-shot Learning: Settings, Methods, and Applications.” In: ACM Transactions on Intelligent Systems and Technology (TIST), 10(2).
    • ABSTRACT: Most machine-learning methods focus on classifying instances whose classes have already been seen in training. In practice, many applications require classifying instances whose classes have not been seen previously. Zero-shot learning is a powerful and promising learning paradigm, in which the classes covered by training instances and the classes we aim to classify are disjoint. In this paper, we provide a comprehensive survey of zero-shot learning. First of all, we provide an overview of zero-shot learning. According to the data utilized in model optimization, we classify zero-shot learning into three learning settings. Second, we describe different semantic spaces adopted in existing zero-shot learning works. Third, we categorize existing zero-shot learning methods and introduce representative methods under each category. Fourth, we discuss different applications of zero-shot learning. Finally, we highlight promising future research directions of zero-shot learning.
    • ... In zero-shot learning, the goal is to learn the zero-shot classifier fu(⋅). During model learning, if information about the testing instances is involved, the learned model is transductive for these specific testing instances. In zero-shot learning, this transduction can be embodied in two progressive degrees: transductive for specific unseen classes and transductive for specific testing instances. This is different from the well-known transductive setting in semisupervised learning, which is just for the testing instances. In the setting that is transductive for specific unseen classes, information about the unseen classes is involved in model learning, and the model is optimized for these specific unseen classes. In the setting that is transductive for specific testing instances, the transductive degree goes further. The testing instances are also involved in model learning, and the model is optimized for these specific testing instances. Based on the degree of transduction, we categorize zero-shot learning into three learning settings.

    • ... Based on the degree of transduction: (1) in the class-inductive instance-inductive (CIII) setting, only the labeled training instances and seen class prototypes Ts are used in model learning; (2) in the class-transductive instance-inductive (CTII) setting, the unseen class prototypes Tu are also used; (3) in the class-transductive instance-transductive (CTIT) setting, the unlabeled testing instances are additionally used. ...

    • ... We organize the existing works on zero-shot learning from three perspectives: (1) semantic spaces, which contain the semantic information that is important for zero-shot learning; (2) methods, which are different methods for solving zero-shot learning problems under different learning settings; and (3) applications, the application areas in which zero-shot learning is used. ...

2017

  • (Xian et al., 2017) ⇒ Yongqin Xian, Bernt Schiele, and Zeynep Akata. (2017). “Zero-shot Learning-the Good, the Bad and the Ugly.” In: Proceedings of the IEEE conference on computer vision and pattern recognition.
    • ABSTRACT: Due to the importance of zero-shot learning, the number of proposed approaches has increased steadily recently. We argue that it is time to take a step back and to analyze the status quo of the area. The purpose of this paper is three-fold. First, given the fact that there is no agreed upon zero-shot learning benchmark, we first define a new benchmark by unifying both the evaluation protocols and data splits. This is an important contribution as published results are often not comparable and sometimes even flawed due to, e.g. pre-training on zero-shot test classes. Second, we compare and analyze a significant number of the state-of-the-art methods in depth, both in the classic zero-shot setting but also in the more realistic generalized zero-shot setting. Finally, we discuss limitations of the current status of the area which can be taken as a basis for advancing it.

2015

  • (Romera-Paredes & Torr, 2015) ⇒ Bernardino Romera-Paredes, and Philip Torr. (2015). “An Embarrassingly Simple Approach to Zero-shot Learning.” In: International Conference on Machine Learning.
    • ABSTRACT: Zero-shot learning consists in learning how to recognize new concepts by just having a description of them. Many sophisticated approaches have been proposed to address the challenges this problem comprises. In this paper we describe a zero-shot learning approach that can be implemented in just one line of code, yet it is able to outperform state of the art approaches on standard datasets. The approach is based on a more general framework which models the relationships between features, attributes, and classes as a two linear layers network, where the weights of the top layer are not learned but are given by the environment. We further provide a learning bound on the generalization error of this kind of approaches, by casting them as domain adaptation methods. In experiments carried out on three standard real datasets, we found that our approach is able to perform significantly better than the state of art on all of them, obtaining a ratio of improvement up to 17%.
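The "one line of code" the abstract refers to is a closed-form ridge-regression solution for the weights linking features, attributes, and classes. The sketch below is a toy illustration of that closed form, V = (X Xᵀ + γI)⁻¹ X Y Sᵀ (S Sᵀ + λI)⁻¹, with synthetic data; the regularizers γ = λ = 1 and the tiny feature/attribute matrices are arbitrary assumptions for the demo.

```python
import numpy as np

def eszsl_train(X, Y, S, gamma=1.0, lam=1.0):
    """Closed-form weights V = (X X^T + g I)^-1 X Y S^T (S S^T + l I)^-1.
    X: d x m features (columns are instances), Y: m x z labels in {-1, +1},
    S: a x z attribute signatures of the z seen classes."""
    d, a = X.shape[0], S.shape[0]
    left = np.linalg.solve(X @ X.T + gamma * np.eye(d), X @ Y)  # d x z
    right = np.linalg.solve(S @ S.T + lam * np.eye(a), S).T     # z x a
    return left @ right                                         # d x a

def eszsl_predict(V, x, S_unseen):
    """Pick the unseen class whose attribute signature scores highest for x."""
    return int(np.argmax(x @ V @ S_unseen))

# Toy data: 3 attributes, 2 seen classes with signatures (1,0,0) and (0,1,0);
# for simplicity the features are identical to the attributes.
X = np.array([[1., 1., 0., 0.], [0., 0., 1., 1.], [0., 0., 0., 0.]])
Y = np.array([[1., -1.], [1., -1.], [-1., 1.], [-1., 1.]])
S = np.array([[1., 0.], [0., 1.], [0., 0.]])
V = eszsl_train(X, Y, S)

# Two unseen classes described only by signatures (0,1,1) and (1,0,1):
S_unseen = np.array([[0., 1.], [1., 0.], [1., 1.]])
print(eszsl_predict(V, np.array([1., 0., 1.]), S_unseen))  # -> 1
```

At test time only the unseen-class signature matrix changes; the learned V is fixed, which is what makes the approach "embarrassingly simple."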
