D2AG-LSTM Neural Network
A Distance-aware DAG-LSTM Neural Network is a DAG-LSTM Neural Network that recursively applies a distance discount to the state updates of each DAG-LSTM unit (a minimal sketch follows the list below).
- AKA: D2AG-LSTM.
- Example(s):
- …
- Counter-Example(s):
- …
- See: DAG Recurrent Neural Network, Bidirectional RNN Recurrent Neural Network, Recursive Neural Network, Support Vector Machine, Directed Acyclic Graph.
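To make the definition concrete, the following is a minimal sketch, not the paper's exact model: it replaces the LSTM unit with a plain tanh recurrence and assumes an exponential discount lam**d_j on predecessor states, where d_j is node j's distance to the DAG's start node; all names (embed_dag, lam) are hypothetical.

```python
import numpy as np

def embed_dag(topo_order, preds, dist, x, W, U, lam=0.5):
    """Recursively embed a DAG with a per-node distance discount.

    topo_order : nodes in topological order (start node first)
    preds      : dict mapping node -> list of its predecessor nodes
    dist       : dict mapping node -> distance d_j to the start node
    x          : dict mapping node -> feature vector x_j (length n)
    W, U       : (m, n) input and (m, m) recurrent weight matrices
    lam        : assumed per-hop distance-discount factor in (0, 1]
    """
    m = U.shape[0]
    h = {}
    for j in topo_order:
        # Sum predecessor states, then discount by node j's distance,
        # so contributions fade as nodes get farther from the start.
        agg = np.zeros(m)
        for k in preds[j]:
            agg += h[k]
        h[j] = np.tanh(W @ x[j] + (lam ** dist[j]) * (U @ agg))
    return h  # per-node states; read off the sink node(s) for the DAG embedding
```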
References
2018
- (Liu et al., 2018) ⇒ Zemin Liu, Vincent W. Zheng, Zhou Zhao, Fanwei Zhu, Kevin Chen-Chuan Chang, Minghui Wu, and Jing Ying (2018). "Distance-Aware DAG Embedding for Proximity Search on Heterogeneous Graphs". In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018).
- QUOTE: Particularly, to avoid the gradient vanishing problem, we exploit an LSTM (long short-term memory) architecture. We propose a novel D2AG-LSTM (Distance-aware DAG-LSTM) model to recursively apply a distance discount when embedding each DAG.
D2AG-LSTM. We use an example DAG to illustrate how we model D2AG-LSTM. As shown in Fig. 3, D2AG-LSTM models two parts of information: 1) the topological structure of each DAG, with node distances to the start node a; 2) the possible feature inputs [math]\displaystyle{ x_j \in \mathbb{R}^n }[/math] for each node [math]\displaystyle{ j }[/math] in the DAG. The topological structure and the node distances are provided in Fig. 2(b). For the feature inputs [math]\displaystyle{ x_j }[/math] of node [math]\displaystyle{ j }[/math], we concatenate the following information: 1) node type is a [math]\displaystyle{ |S| }[/math]-dimensional vector, where only the dimension corresponding to [math]\displaystyle{ j }[/math]'s type is one and the others are zero; 2) node degree is a scalar, indicating [math]\displaystyle{ j }[/math]'s degree in [math]\displaystyle{ G }[/math]; 3) neighbor type distribution is a [math]\displaystyle{ |S| }[/math]-dimensional vector, where each dimension is the number of [math]\displaystyle{ j }[/math]'s neighbors with the corresponding type; 4) neighbor type entropy is a scalar, indicating the entropy of the type distribution over [math]\displaystyle{ j }[/math]'s neighbors.
Similar to a standard LSTM, each of our D2AG-LSTM units has an input gate [math]\displaystyle{ i }[/math], a forget gate [math]\displaystyle{ f }[/math], an output gate [math]\displaystyle{ o }[/math], a memory cell [math]\displaystyle{ C }[/math], and a hidden state [math]\displaystyle{ h }[/math]. Different from a standard LSTM, a D2AG-LSTM unit may have to update its gate vectors and memory cell state from multiple predecessor units, and it also distributes its memory cell state and hidden state to multiple successor units.
Figure 3: D2AG-LSTM
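The four feature groups quoted above are concatenated into the node input [math]\displaystyle{ x_j }[/math]. Below is a minimal sketch of that concatenation, assuming the graph is given as an adjacency dict and that type_index fixes an ordering of the node-type set S; the function and argument names are illustrative, not from the paper.

```python
import math
from collections import Counter

def build_node_features(j, graph, node_type, type_index):
    """Concatenate the four feature groups for node j into x_j.

    graph      : dict mapping node -> list of neighbor nodes (graph G)
    node_type  : dict mapping node -> its type label
    type_index : dict mapping each type in S -> its dimension index
    """
    S = len(type_index)

    # 1) node type: |S|-dimensional one-hot vector for j's type
    one_hot = [0.0] * S
    one_hot[type_index[node_type[j]]] = 1.0

    # 2) node degree: scalar, j's degree in G
    neighbors = graph[j]
    degree = [float(len(neighbors))]

    # 3) neighbor type distribution: |S|-dimensional count vector
    counts = Counter(node_type[v] for v in neighbors)
    dist = [float(counts.get(t, 0)) for t in type_index]

    # 4) neighbor type entropy: scalar entropy of the type counts
    total = sum(counts.values())
    probs = [c / total for c in counts.values()] if total else []
    entropy = [-sum(p * math.log(p) for p in probs if p > 0)]

    return one_hot + degree + dist + entropy  # x_j, length 2|S| + 2
```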
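Reading the quoted unit description through a child-sum Tree-LSTM lens suggests a per-unit update along the following lines. This is a hedged sketch, not the paper's exact equations: the gate aggregation over predecessors and the exponential discount lam**d_j are assumptions, and all weight names (W, U, b) are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def d2ag_lstm_unit(x_j, d_j, preds, W, U, b, lam=0.5):
    """One unit update for node j from multiple predecessor units.

    x_j   : (n,) feature input of node j
    d_j   : distance of node j to the DAG's start node
    preds : list of (h_k, C_k) pairs from predecessor units
    W, U, b : dicts of gate weights/biases, keyed 'i', 'f', 'o', 'u'
    lam   : assumed per-hop distance-discount factor
    """
    m = b['i'].shape[0]
    # Sum the predecessors' hidden states and discount them by node
    # j's distance before they drive the gates.
    h_sum = np.zeros(m)
    for h_k, _ in preds:
        h_sum += h_k
    h_sum *= lam ** d_j

    i = sigmoid(W['i'] @ x_j + U['i'] @ h_sum + b['i'])  # input gate
    o = sigmoid(W['o'] @ x_j + U['o'] @ h_sum + b['o'])  # output gate
    u = np.tanh(W['u'] @ x_j + U['u'] @ h_sum + b['u'])  # candidate

    # One forget gate per predecessor, gating that unit's (discounted)
    # memory cell, in the style of a child-sum Tree-LSTM.
    C = i * u
    for h_k, C_k in preds:
        f_k = sigmoid(W['f'] @ x_j + U['f'] @ h_k + b['f'])
        C += f_k * (lam ** d_j) * C_k

    h = o * np.tanh(C)
    return h, C  # distributed to all of node j's successor units
```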