FAIR Fairseq Toolkit

A FAIR Fairseq Toolkit is an open-source neural sequence modeling toolkit developed by Facebook AI Research (FAIR), written in PyTorch, that allows researchers and developers to train models for machine translation, language modeling, summarization, and other text generation tasks.



References

2018

Fairseq provides reference implementations of sequence-to-sequence models described in the following papers:

Convolutional Neural Networks (CNN):
   Dauphin et al. (2017): Language Modeling with Gated Convolutional Networks
   Gehring et al. (2017): Convolutional Sequence to Sequence Learning
   Edunov et al. (2018): Classical Structured Prediction Losses for Sequence to Sequence Learning
   Fan et al. (2018): Hierarchical Neural Story Generation

Long Short-Term Memory (LSTM) networks:
   Luong et al. (2015): Effective Approaches to Attention-based Neural Machine Translation
   Wiseman and Rush (2016): Sequence-to-Sequence Learning as Beam-Search Optimization

Transformer (self-attention) networks:
   Vaswani et al. (2017): Attention Is All You Need
   Ott et al. (2018): Scaling Neural Machine Translation
   Edunov et al. (2018): Understanding Back-Translation at Scale

Fairseq features:

   multi-GPU (distributed) training on one machine or across multiple machines
   fast beam search generation on both CPU and GPU
   large mini-batch training even on a single GPU via delayed updates (sketched in the first example after this list)
   fast half-precision floating point (FP16) training
   extensible: easily register new models, criterions, and tasks (see the registration example after this list)
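
The delayed updates feature amounts to gradient accumulation: gradients from several small mini-batches are summed before a single optimizer step, so the effective batch size grows without extra GPU memory. In fairseq this is controlled by the --update-freq option of fairseq-train; the snippet below is a minimal, generic PyTorch sketch of the idea, not fairseq's actual training loop.

   # Generic PyTorch sketch of "delayed updates" (gradient accumulation).
   # fairseq exposes the same idea via the --update-freq option of fairseq-train.
   import torch
   import torch.nn as nn

   model = nn.Linear(512, 512)        # stand-in for a real sequence model
   optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
   criterion = nn.MSELoss()

   # Synthetic mini-batches standing in for a real data loader.
   loader = [(torch.randn(8, 512), torch.randn(8, 512)) for _ in range(64)]

   update_freq = 16                   # accumulate gradients over 16 mini-batches

   optimizer.zero_grad()
   for step, (x, y) in enumerate(loader):
       # Scale the loss so the accumulated gradient matches one large batch.
       loss = criterion(model(x), y) / update_freq
       loss.backward()                # gradients add up in the .grad buffers
       if (step + 1) % update_freq == 0:
           optimizer.step()           # one parameter update per update_freq batches
           optimizer.zero_grad()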

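The extensibility hooks are registry based: a new model, criterion, or task is declared with a decorator and then becomes selectable by name from the command-line tools. The sketch below registers a toy criterion with the classic decorator API; the base-class constructor and forward signature have changed across fairseq releases, so treat them as assumptions to verify against the installed version.

   # Sketch of fairseq's registry-based extensibility: a custom criterion.
   # Follows the classic FairseqCriterion interface; newer releases may differ.
   import torch.nn.functional as F

   from fairseq.criterions import FairseqCriterion, register_criterion

   @register_criterion('toy_cross_entropy')
   class ToyCrossEntropyCriterion(FairseqCriterion):

       def forward(self, model, sample, reduce=True):
           # Returns (loss, sample_size, logging_output), as fairseq expects.
           net_output = model(**sample['net_input'])
           lprobs = model.get_normalized_probs(net_output, log_probs=True)
           lprobs = lprobs.view(-1, lprobs.size(-1))
           target = model.get_targets(sample, net_output).view(-1)
           loss = F.nll_loss(
               lprobs, target,
               ignore_index=self.padding_idx,
               reduction='sum' if reduce else 'none',
           )
           sample_size = sample['ntokens']
           logging_output = {
               'loss': loss.data,
               'ntokens': sample['ntokens'],
               'sample_size': sample_size,
           }
           return loss, sample_size, logging_output

Once the defining module is importable (for example through fairseq-train's --user-dir option), the criterion can be selected with --criterion toy_cross_entropy; register_model and register_task work analogously for models and tasks.
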
We also provide pre-trained models for several benchmark translation and language modeling datasets.
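
Recent fairseq releases also expose released checkpoints through PyTorch Hub, which is a convenient way to try a pre-trained translation model from Python. The model identifier and tokenizer/BPE arguments below are assumptions modelled on the fairseq README examples and should be checked against the current list of available checkpoints.

   # Load a pre-trained translation model via PyTorch Hub and translate a sentence.
   # The checkpoint name and tokenizer/bpe settings are assumptions to verify
   # against the fairseq documentation; downloading requires network access.
   import torch

   en2de = torch.hub.load(
       'pytorch/fairseq',
       'transformer.wmt16.en-de',   # assumed WMT'16 En-De Transformer checkpoint
       tokenizer='moses',
       bpe='subword_nmt',
   )
   en2de.eval()

   print(en2de.translate('Machine learning is great!'))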