Encoder-Decoder Sequence-to-Sequence Learning Task

An [[Encoder-Decoder Sequence-to-Sequence Learning Task]] is a [[Sequence-to-Sequence Learning Task]] that is based on an [[Encoder-Decoder Neural Network]].
* <B>Context:</B>
** <B>[[Task Input Requirement]]</B>: a [[Sequence Dataset]].
** <B>[[Task Output Requirement]]</B>: a [[Sequence Dataset]].
** <B>[[Task Requirement]]s</B>: an [[Encoder-Decoder Neural Network Model]] (a minimal code sketch follows this definition).
** …
* <B>Example(s):</B>
** a [[Neural Machine Translation System]] that is based on an [[RNN Encoder-Decoder Network]],
** a [[Sequence-to-Sequence Learning System]] that is based on a [[CNN Encoder-Decoder Network]].
** …
* <B>Counter-Example(s):</B>
** a [[Convolutional Sequence-to-Sequence Training Task]];
** a [[Connectionist Sequence Classification Task]];
** a [[Multi-modal Sequence to Sequence Training]];
** a [[Neural seq2seq Model Training]];
** a [[Sequence-to-Sequence Learning with Variational Auto-Encoder]];
** a [[Sequence-to-Sequence Learning via Shared Latent Representation]];
** a [[Sequence-to-Sequence Translation with Attention Mechanism]].
* <B>See:</B> [[Natural Language Processing Task]], [[Sequence Learning Task]], [[Word Sense Disambiguation]], [[LSTM]], [[Deep Neural Network]], [[Memory Augmented Neural Network Training System]], [[Deep Sequence Learning Task]], [[Bidirectional LSTM]].
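The following is a minimal, illustrative sketch of such a task setup (it is not part of the GM-RKB concept definition): a toy [[PyTorch]] [[encoder]]-[[decoder]] that maps a batch of input token [[sequence]]s to a batch of output token sequences. All class names, vocabulary sizes, and dimensions are hypothetical and chosen only to show the sequence-in, sequence-out structure described above.
<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Encodes a variable-length source token sequence into a hidden state."""
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, src):                      # src: (batch, src_len)
        _, h = self.rnn(self.embed(src))         # h: (1, batch, hidden_dim)
        return h

class Decoder(nn.Module):
    """Generates target tokens step by step, conditioned on the encoder state h."""
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt_in, h):                # tgt_in: (batch, tgt_len)
        out, h = self.rnn(self.embed(tgt_in), h)
        return self.out(out), h                  # logits: (batch, tgt_len, vocab)

# Task input/output requirement: paired source and target sequence datasets (toy batches here).
src = torch.randint(0, 1000, (32, 15))           # hypothetical source batch
tgt = torch.randint(0, 800, (32, 12))            # hypothetical target batch

encoder, decoder = Encoder(1000), Decoder(800)
h = encoder(src)                                 # fixed-size summary of the source sequence
logits, _ = decoder(tgt[:, :-1], h)              # teacher forcing on the shifted target
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 800), tgt[:, 1:].reshape(-1))
loss.backward()                                  # one training step of the seq2seq task
</syntaxhighlight>
The teacher-forced step above is only a sketch of how the two [[Sequence Dataset]]s (source and target) are consumed by an [[Encoder-Decoder Neural Network Model]].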
 
----
----
== References ==
=== 2018 ===
* ([[2018_DeepSequenceLearningwithAuxilia|Liao et al., 2018]]) ⇒ [[Binbing Liao]], [[Jingqing Zhang]], [[Chao Wu]], [[Douglas McIlwraith]], [[Tong Chen]], [[Shengwen Yang]], [[Yike Guo]], and [[Fei Wu]]. ([[2018]]). &ldquo;[https://arxiv.org/pdf/1806.07380.pdf Deep Sequence Learning with Auxiliary Information for Traffic Prediction].&rdquo; In: [[Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining]]. ISBN:978-1-4503-5552-0 [http://dx.doi.org/10.1145/3219819.3219895 doi:10.1145/3219819.3219895]
** QUOTE: In this paper, we effectively utilise three kinds of [[auxiliary information]] in an [[encoder-decoder sequence to sequence (Seq2Seq)]] &#91;[[2014_LearningPhraseRepresentationsUs|7]], [[2014_SequencetoSequenceLearningwithN|32]]&#93; learning manner as follows: a [[wide linear model]] is used to [[encode]] the interactions among [[geographical and social attribute]]s, a [[graph convolution neural network]] is used to [[learn]] the [[spatial correlation]] of [[road segment]]s, and the [[query impact]] is quantified and encoded to learn the potential influence of [[online crowd queries]] (...)<P>Figure 4 shows the [[architecture]] of the [[Seq2Seq model]] for [[traffic prediction]]. The [[encoder]] embeds the [[input]] [[traffic speed]] [[sequence]] <math>\{v_1,v_2, \cdots ,v_t \}</math> and the final [[hidden state]] of the [[encoder]] is fed into the [[decoder]], which [[learn]]s to [[predict]] the future traffic speed <math>\{\tilde{v}_{t+1},\tilde{v}_{t+2}, \cdots,\tilde{v}_{t+t'} \}</math>. [[Hybrid model]] that integrates the [[auxiliary information]] will be proposed based on the [[Seq2Seq model]].<P>[[File: 2018_DeepSequenceLearningwithAuxilia_Fig4.png|500px|nothumb|center|]]<P><B>Figure 4:</B> [[Seq2Seq]]: The [[Sequence to Sequence model]] predicts future [[traffic speed]] <math>\{\tilde{v}_{t+1},\tilde{v}_{t+2}, \cdots ,\tilde{v}_{t+t'} \}</math>, given the previous [[traffic speed]] <math>\{v_1,v_2, \cdots ,v_t \}</math>.
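The quoted speed-prediction [[Seq2Seq model]] can be sketched roughly as follows, under assumed shapes and hyperparameters; the class and variable names are hypothetical, and Liao et al.'s actual hybrid model additionally encodes the [[auxiliary information]] streams described above.
<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class SpeedSeq2Seq(nn.Module):
    """Encode past speeds v_1..v_t; decode predicted speeds v_{t+1}..v_{t+t'}."""
    def __init__(self, hidden_dim=64):
        super().__init__()
        self.encoder = nn.GRU(1, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(1, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 1)

    def forward(self, past_speeds, horizon):
        # past_speeds: (batch, t, 1); the final encoder state summarizes the history
        _, h = self.encoder(past_speeds)
        step = past_speeds[:, -1:, :]            # seed the decoder with v_t
        preds = []
        for _ in range(horizon):                 # autoregressive roll-out
            out, h = self.decoder(step, h)
            step = self.out(out)                 # predicted speed: (batch, 1, 1)
            preds.append(step)
        return torch.cat(preds, dim=1)           # (batch, horizon, 1)

model = SpeedSeq2Seq()
history = torch.rand(8, 12, 1)                   # 12 past speed readings per road segment
forecast = model(history, horizon=6)             # predict 6 future time steps
</syntaxhighlight>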


=== 2017 ===
* ([[2017_UnsupervisedPretrainingforSeque|Ramachandran et al., 2017]]) ⇒ [[Prajit Ramachandran]], [[Peter J. Liu]], and [[Quoc V. Le]]. ([[2017]]). &ldquo;[http://www.aclweb.org/anthology/D17-1039 Unsupervised Pretraining for Sequence to Sequence Learning].&rdquo; In: [[Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing]] ([[EMNLP 2017]]). [https://arxiv.org/abs/1611.02683 arXiv:1611.02683]
** QUOTE: Therefore, the basic [[procedure]] of our approach is to [[pretrain]] both the [[seq2seq]] [[Encoder-Decoder Neural Network|encoder and decoder network]]s with [[Natural Language Processing Model|language model]]s, which can be [[Neural Network Training Task|trained]] on large amounts of [[unlabeled text data]]. This can be seen in Figure 1, where the [[parameter]]s in the shaded boxes are [[pretrained]]. In the following we will describe the method in detail using [[machine translation]] as an example application.<P>[[File:2017_UnsupervisedPretrainingforSeque_Fig1.png|650px|nothumb|center|]]<P><B>Figure 1:</B> [[Pretrained]] [[sequence to sequence model]]. The red [[parameter]]s are the [[encoder]] and the blue [[parameter]]s are the [[decoder]]. All parameters in a shaded box are [[pretrained]], either from the [[Input Dataset|source]] side (light red) or [[Output Dataset|target]] side (light blue) [[Natural Language Processing Model|language model]]. Otherwise, they are [[randomly initialized]].
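A rough sketch of the pretraining idea in this quote (the helper names and dimensions are illustrative assumptions, not Ramachandran et al.'s code): [[language model]]s with the same architecture as the seq2seq [[encoder]] and [[decoder]] are trained on unlabeled source-side and target-side text, and their weights are copied in before supervised seq2seq training.
<syntaxhighlight lang="python">
import torch.nn as nn

hidden_dim, vocab = 128, 1000

# A simple language model: embedding + GRU + softmax head, trainable on unlabeled text.
def make_lm():
    return nn.ModuleDict({
        "embed": nn.Embedding(vocab, hidden_dim),
        "rnn": nn.GRU(hidden_dim, hidden_dim, batch_first=True),
        "head": nn.Linear(hidden_dim, vocab),
    })

source_lm = make_lm()   # assumed: pretrained on unlabeled source-side text
target_lm = make_lm()   # assumed: pretrained on unlabeled target-side text

# The seq2seq encoder/decoder share the LM architecture, so their weights can be copied.
encoder = make_lm()
decoder = make_lm()
encoder.load_state_dict(source_lm.state_dict())   # warm-start encoder (the light-red shaded parameters)
decoder.load_state_dict(target_lm.state_dict())   # warm-start decoder (the light-blue shaded parameters)
# Any remaining seq2seq parameters (e.g., attention) would be randomly initialized,
# and the whole model is then fine-tuned on the parallel corpus.
</syntaxhighlight>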


=== 2016 ===
* ([[2016_MultiTaskSequencetoSequenceLear|Luong et al., 2016]]) ⇒ [[Minh-Thang Luong]], [[Quoc V. Le]], [[Ilya Sutskever]], [[Oriol Vinyals]], and [[Lukasz Kaiser]]. ([[2016]]). &ldquo;[https://arxiv.org/pdf/1511.06114 Multi-task Sequence to Sequence Learning].&rdquo; In: Proceedings of the 4th [[International Conference on Learning Representation]]s ([[ICLR-2016]]).
   
   


=== 2014a ===
* ([[2014_SequencetoSequenceLearningwithN|Sutskever et al., 2014]]) ⇒ [[Ilya Sutskever]], [[Oriol Vinyals]], and [[Quoc V. Le]]. ([[2014]]). &ldquo;[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Sequence to Sequence Learning with Neural Networks].&rdquo; In: Advances in Neural Information Processing Systems. [https://arxiv.org/abs/1409.3215 arXiv:1409.3215]
=== 2014b ===
* ([[2014_LearningPhraseRepresentationsUs|Cho et al., 2014]]) ⇒ [[Kyunghyun Cho]], [[Bart van Merrienboer]], [[Caglar Gulcehre]], [[Dzmitry Bahdanau]], [[Fethi Bougares]], [[Holger Schwenk]], and [[Yoshua Bengio]]. ([[2014]]). &ldquo;[http://emnlp2014.org/papers/pdf/EMNLP2014179.pdf Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation].&rdquo; In: [[Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing]] ([[EMNLP-2014]]). [https://arxiv.org/abs/1406.1078 arXiv:1406.1078]
** QUOTE: In this paper, we propose a novel [[neural network architecture]] that learns to encode a [[variable-length]] [[sequence]] into a [[fixed-length]] [[vector representation]] and to decode a given [[fixed-length]] [[vector representation]] back into a [[variable-length]] [[sequence]].
----
__NOTOC__
[[Category:Concept]]
[[Category:Machine Learning]]
[[Category:Computational Linguistics]]
