Theano-Recurrence Training System
A Theano-Recurrence Training System is a Bidirectional LSTM-RNN Training System developed by Yaseen (2016).
- Example(s):
- Counter-Example(s):
- See: Bidirectional LSTM (biLSTM) Network, LSTM Training System, RNN Training System, Artificial Neural Network, PyTorch.
References
2018
- (Yaseen, 2018) ⇒ Usama Yaseen (2016) Theano-Recurrence Training System: https://github.com/uyaseen/theano-recurrence#training Retrieved: 2018-07-01
`train.py` provides a convenient method `train(..)` to train each model. You can select the recurrent model with the `rec_model` parameter, which is set to `gru` by default (possible options are `rnn`, `gru`, `lstm`, `birnn`, `bigru` & `bilstm`). The number of hidden neurons in each layer can be adjusted with the `n_h` parameter of `train(..)`, which defaults to `100` (at the moment only single-layer models are supported to keep things simple, although adding more layers is trivial).
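As a rough sketch of the documented option set, the helper below (hypothetical, not part of the repo) builds keyword arguments for `train(..)` and rejects unknown model names; the option list and defaults come from the README above:

```python
# Documented rec_model options for train(..); helper itself is illustrative.
VALID_REC_MODELS = {"rnn", "gru", "lstm", "birnn", "bigru", "bilstm"}

def make_train_kwargs(rec_model="gru", n_h=100):
    """Build keyword arguments for train(..), validating the model name."""
    if rec_model not in VALID_REC_MODELS:
        raise ValueError(f"unknown rec_model: {rec_model!r}")
    return {"rec_model": rec_model, "n_h": n_h}

# e.g. make_train_kwargs("bilstm", n_h=200) for a 200-unit biLSTM
```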
As the model is trained, it stores the current best state of the model, i.e. the set of weights with the least training error, to `data\models\MODEL-NAME-best_model.pkl`; this stored model can later be used to resume training from the last point, or simply for prediction/sampling. If you don't want to start training from scratch and instead want to use the already trained model, pass `use_existing_model=True` to `train(..)`.
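The "keep only the best weights" checkpointing described above can be sketched with `pickle` (the serialization the `.pkl` extension implies); the weight dict, file name and helper here are stand-ins, not the repo's actual code:

```python
import os
import pickle
import tempfile

def save_best_model(weights, err, best_err, path):
    """Persist weights only when training error improves (best = least error)."""
    if err < best_err:
        with open(path, "wb") as f:
            pickle.dump(weights, f)
        return err
    return best_err

# usage with a stand-in weight dict; path mirrors MODEL-NAME-best_model.pkl
path = os.path.join(tempfile.gettempdir(), "gru-best_model.pkl")
best = float("inf")
best = save_best_model({"W": [0.1, 0.2]}, err=1.5, best_err=best, path=path)
best = save_best_model({"W": [9.9]}, err=2.0, best_err=best, path=path)  # worse: skipped

with open(path, "rb") as f:
    restored = pickle.load(f)  # still the err=1.5 weights
```

Reloading the pickle like this is also how a stored model would serve resumed training or sampling.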
Optimization strategies can also be specified to `train(..)` via the `optimizer` parameter; the currently supported optimizations are `rmsprop`, `adam` and vanilla stochastic gradient descent, and they can be found in `utilities\optimizers.py`.
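The repo's optimizers operate on Theano graphs; as a framework-free sketch of what the three supported update rules do (scalar parameters, illustrative hyperparameters):

```python
import math

def sgd(p, g, lr=0.1):
    """One vanilla stochastic gradient descent step."""
    return p - lr * g

def rmsprop(p, g, cache, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSprop step: scale by a running average of squared gradients."""
    cache = decay * cache + (1 - decay) * g * g
    return p - lr * g / (math.sqrt(cache) + eps), cache

def adam(p, g, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam step: bias-corrected first/second moment estimates."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return p - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

# minimise f(p) = p^2 (gradient 2p) with each rule
p_sgd = 5.0
for _ in range(100):
    p_sgd = sgd(p_sgd, 2 * p_sgd)

p_rms, cache = 5.0, 0.0
for _ in range(200):
    p_rms, cache = rmsprop(p_rms, 2 * p_rms, cache)

p_adam, m, v = 5.0, 0.0, 0.0
for t in range(1, 201):
    p_adam, m, v = adam(p_adam, 2 * p_adam, m, v, t)
```

All three drive the parameter toward the minimum; RMSprop and Adam take roughly constant-magnitude steps because they normalize by gradient statistics, which is why they are popular for recurrent nets with poorly scaled gradients.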
The `b_path`, `learning_rate` and `n_epochs` arguments of `train(..)` specify the base path to store the model (default = `data\models\`), the initial learning rate of the optimizer, and the number of epochs, respectively. During training, some logs (current epoch, sample, cross-entropy error, etc.) are shown on the console to give an idea of how well learning is proceeding; the logging frequency can be specified via `logging_freq` in `train(..)`.
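The cross-entropy error reported in those logs is the mean negative log-probability the model assigns to the target symbols; a minimal sketch (the log-line format below is illustrative, not the repo's exact output):

```python
import math

def cross_entropy(pred_probs, targets):
    """Mean negative log-probability assigned to the target symbols."""
    return -sum(math.log(p[t]) for p, t in zip(pred_probs, targets)) / len(targets)

# two predictions over a 3-symbol vocabulary, targets are indices 0 and 1
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
err = cross_entropy(probs, targets=[0, 1])
print(f"epoch 1, iteration 10, cross-entropy error: {err:.4f}")
```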
At the end of training, a plot of cross-entropy error vs. the number of iterations gives an overview of the overall training process and is also stored under `b_path`.