Zhe Gan


Zhe Gan is a machine learning researcher whose publications span vision-and-language learning, text-to-image generation, model compression, and deep generative models.



References

2021

  • (Lei et al., 2021) ⇒ J Lei, L Li, L Zhou, Zhe Gan, TL Berg, M Bansal, and J Liu. (2021). “Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling.” In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
    • NOTE: It proposes ClipBERT, an efficient video-and-language learning framework that sparsely samples a few short clips per training step instead of densely processing every frame, reducing computational cost while maintaining strong performance (a minimal sampling sketch follows).
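    • SKETCH: A minimal illustration of the sparse-sampling idea, with hypothetical function and parameter names (this is not the authors' code): decode only a few short clips per training step rather than every frame of the video.

        import random

        def sparse_sample_clips(num_frames, clip_len=16, clips_per_step=1):
            """Pick a few short clips (lists of frame indices) from a video,
            rather than decoding all num_frames densely."""
            clips = []
            for _ in range(clips_per_step):
                start = random.randint(0, max(0, num_frames - clip_len))
                clips.append(list(range(start, min(start + clip_len, num_frames))))
            return clips

        # e.g., one 16-frame clip from a 300-frame video per training step
        print(sparse_sample_clips(300))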

2020

  • (Chen et al., 2020) ⇒ YC Chen, L Li, L Yu, A El Kholy, F Ahmed, Zhe Gan, Y Cheng, and J Liu. (2020). “UNITER: Universal Image-Text Representation Learning.” In: European Conference on Computer Vision, Pages 104-120.
    • NOTE: It introduces UNITER, a single-stream transformer that learns joint image-text representations, improving performance across a range of vision-and-language tasks (a toy single-stream encoder is sketched below).
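    • SKETCH: A hedged, toy version of the single-stream idea (made-up dimensions and class names; not UNITER's released code): project image-region features and text tokens into one shared space and run a single transformer over their concatenation.

        import torch
        import torch.nn as nn

        class JointImageTextEncoder(nn.Module):
            """Toy single-stream encoder: concatenate projected image-region
            features with text embeddings and encode both jointly."""
            def __init__(self, img_dim=2048, vocab=30522, d=256):
                super().__init__()
                self.img_proj = nn.Linear(img_dim, d)  # region features -> shared dim
                self.txt_emb = nn.Embedding(vocab, d)  # token ids -> shared dim
                layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
                self.encoder = nn.TransformerEncoder(layer, num_layers=2)

            def forward(self, regions, token_ids):
                x = torch.cat([self.img_proj(regions), self.txt_emb(token_ids)], dim=1)
                return self.encoder(x)  # joint image-text hidden states

        enc = JointImageTextEncoder()
        out = enc(torch.randn(2, 36, 2048), torch.randint(0, 30522, (2, 12)))
        print(out.shape)  # torch.Size([2, 48, 256])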

2019

  • (Sun et al., 2019) ⇒ S Sun, Y Cheng, Zhe Gan, and J Liu. (2019). “Patient Knowledge Distillation for BERT Model Compression.” In: arXiv preprint arXiv:1908.09355.
    • NOTE: It introduces Patient Knowledge Distillation (Patient-KD), which compresses BERT by training a smaller student model to match both the teacher's output predictions and its intermediate-layer representations, reducing model size while largely preserving performance (a simplified distillation objective is sketched below).
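    • SKETCH: A simplified distillation objective for illustration (standard soft-label KD; the paper's "patient" loss additionally aligns intermediate hidden states, which is omitted here):

        import torch
        import torch.nn.functional as F

        def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
            """Blend hard-label cross-entropy with a KL term against the
            teacher's temperature-softened predictions."""
            hard = F.cross_entropy(student_logits, labels)
            soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                            F.softmax(teacher_logits / T, dim=-1),
                            reduction="batchmean") * (T * T)
            return alpha * hard + (1 - alpha) * soft

        loss = distillation_loss(torch.randn(8, 2), torch.randn(8, 2),
                                 torch.randint(0, 2, (8,)))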

2018

  • (Xu et al., 2018) ⇒ T Xu, P Zhang, Q Huang, H Zhang, Zhe Gan, X Huang, and X He. (2018). “AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks.” In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
    • NOTE: It presents AttnGAN, an attentional generative adversarial network for text-to-image generation in which each image sub-region attends to the most relevant words of the description, improving fine-grained detail in the generated images (a toy attention step is sketched below).
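    • SKETCH: A toy version of the word-level attention step (hypothetical shapes; not the released AttnGAN implementation): each image sub-region computes a context vector as a softmax-weighted mix of word features.

        import torch
        import torch.nn.functional as F

        def word_attention(region_feats, word_feats):
            """region_feats: (B, N, D) image sub-region features
            word_feats:   (B, T, D) word embeddings
            Returns one word-context vector per region, shape (B, N, D)."""
            scores = torch.bmm(region_feats, word_feats.transpose(1, 2))  # (B, N, T)
            attn = F.softmax(scores, dim=-1)    # which words each region attends to
            return torch.bmm(attn, word_feats)  # word-weighted context per region

        ctx = word_attention(torch.randn(2, 64, 256), torch.randn(2, 12, 256))
        print(ctx.shape)  # torch.Size([2, 64, 256])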

2016

  • (Pu et al., 2016) ⇒ Y Pu, Zhe Gan, R Henao, X Yuan, C Li, A Stevens, and L Carin. (2016). “Variational Autoencoder for Deep Learning of Images, Labels, and Captions.” In: Advances in Neural Information Processing Systems (NIPS).
    • NOTE: It applies variational autoencoders to learn joint deep representations of images, labels, and captions, an early contribution to multi-modal generative modeling (a minimal ELBO sketch follows).
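    • SKETCH: A minimal, unimodal VAE objective for illustration (the paper's model is multi-modal, but the ELBO has the same structure: a reconstruction term plus a KL regularizer on the latent code):

        import torch
        import torch.nn.functional as F

        def reparameterize(mu, logvar):
            """Sample z = mu + sigma * eps so gradients flow through mu and logvar."""
            return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

        def vae_loss(x, x_recon, mu, logvar):
            """Negative ELBO: reconstruction error plus KL(q(z|x) || N(0, I))."""
            recon = F.mse_loss(x_recon, x, reduction="sum")
            kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
            return recon + kl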