Zero-shot Learning via Discriminative Dual Semantic Auto-encoder

Nan Xing, Yang Liu, Hong Zhu, Jing Wang, Jungong Han

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)
134 Downloads (Pure)

Abstract

Zero-shot learning (ZSL) is an effective method for performing recognition without any training samples of specific classes. Most existing ZSL models emphasize learning an embedding between the visual space and the semantic space directly. However, few ZSL models examine whether the human-designed semantic features are discriminative enough to distinguish different classes. Moreover, one-way mapping suffers from the projection domain shift problem. In this article, we propose to learn a Discriminative Dual Semantic Auto-encoder (DDSA) based on the encoder-decoder paradigm to solve this problem. DDSA attempts to construct two bidirectional embeddings that connect the visual space and the semantic space with the help of a learned aligned space, which includes discriminative information from both the visual features and the semantic features. Based on DDSA, we additionally propose a Deep DDSA to capture deep aligned features that are more conducive to zero-shot classification. The key to the proposed framework is that it implicitly extracts the principal information from the visual space and the semantic space to construct aligned features, which are not only semantic-preserving but also discriminative. Extensive experiments on five benchmarks (SUN, CUB, AWA1, AWA2 and aPY) demonstrate the effectiveness of the proposed framework, with state-of-the-art performance obtained in both conventional ZSL and generalized ZSL settings.
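
The encoder-decoder idea mentioned in the abstract can be illustrated with a generic toy sketch. This is not the DDSA objective, which the abstract does not specify. It assumes plain linear maps fitted by ridge regression on synthetic data, with no aligned space and no discriminative term. All names (`ridge_map`, `nearest_prototype`, `mixing`, the dimensions and the noise level) are hypothetical and chosen only for illustration.

```python
import numpy as np

def ridge_map(src, dst, lam=1.0):
    """Closed-form ridge regression: W minimizing ||dst - W @ src||^2 + lam * ||W||^2."""
    d = src.shape[0]
    return dst @ src.T @ np.linalg.inv(src @ src.T + lam * np.eye(d))

def nearest_prototype(queries, prototypes):
    """For each query column, return the index of the closest prototype column."""
    diffs = queries[:, :, None] - prototypes[:, None, :]  # (dim, n_queries, n_classes)
    return np.linalg.norm(diffs, axis=0).argmin(axis=1)

# Synthetic stand-in data (assumption): visual features are a noisy linear
# function of the class-level semantic vector.
rng = np.random.default_rng(0)
k, d = 5, 20                               # semantic and visual dimensions
mixing = rng.normal(size=(d, k))           # hypothetical semantic-to-visual map
seen_protos = rng.normal(size=(k, 8))      # 8 seen classes, used for training
unseen_protos = rng.normal(size=(k, 3))    # 3 unseen classes, used only at test time

def sample(protos, per_class):
    labels = np.repeat(np.arange(protos.shape[1]), per_class)
    sem = protos[:, labels]
    vis = mixing @ sem + 0.01 * rng.normal(size=(d, labels.size))
    return vis, sem, labels

X_train, S_train, _ = sample(seen_protos, 30)
X_test, _, y_test = sample(unseen_protos, 20)

# Two directions: visual -> semantic (encoder) and semantic -> visual (decoder).
W_enc = ridge_map(X_train, S_train)
W_dec = ridge_map(S_train, X_train)

# Conventional ZSL setting: test samples are matched against unseen classes only.
pred = nearest_prototype(W_enc @ X_test, unseen_protos)
accuracy = float((pred == y_test).mean())
print(f"unseen-class accuracy: {accuracy:.2f}")
```

In this sketch the classifier never sees a training sample from the three test classes. It relies only on their semantic prototypes, which is the setting the abstract calls conventional ZSL. Generalized ZSL would instead match test samples against the prototypes of seen and unseen classes together.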

Original language: English
Article number: 9303397
Pages (from-to): 733-742
Number of pages: 10
Journal: IEEE Access
Volume: 9
Early online date: 22 Dec 2020
DOIs
Publication status: Published - 04 Jan 2021

Keywords

  • Zero-shot learning
  • aligned
  • discriminative
  • encoder-decoder
