Vector-based Referencer
A Vector-based Referencer is a concept referencer that is a vector representation.
- AKA: Embedding Representation.
- Example(s):
- Counter-Example(s):
- See: Space Point.
References
2015
- (Vilnis & McCallum, 2015) ⇒ Luke Vilnis, and Andrew McCallum. (2015). “Word Representations via Gaussian Embedding.” In: arXiv preprint arXiv:1412.6623 submitted to ICLR 2015.
- QUOTE: ... Related work in probabilistic matrix factorization (Mnih & Salakhutdinov, 2007) embeds rows and columns as Gaussians, and some forms of this do provide each row and column with its own variance (Salakhutdinov & Mnih, 2008). Given the parallels between embedding models and matrix factorization (Deerwester et al., 1990; Riedel et al., 2013; Levy & Goldberg, 2014), this is relevant to our approach.
2014
- (Levy & Goldberg, 2014) ⇒ Omer Levy, and Yoav Goldberg. (2014). “Neural Word Embedding As Implicit Matrix Factorization.” In: Advances in Neural Information Processing Systems.
- QUOTE: Each word [math]\displaystyle{ w \in V_W }[/math] is associated with a vector [math]\displaystyle{ \vec{\mathcal{w}} }[/math] ... SGNS embeds both words and their contexts into a low-dimensional space [math]\displaystyle{ \R^d }[/math], resulting in the word and context matrices [math]\displaystyle{ W }[/math] and [math]\displaystyle{ C }[/math].
- (Goldberg & Levy, 2014) ⇒ Yoav Goldberg, and Omer Levy. (2014). “word2vec Explained: Deriving Mikolov Et Al.'s Negative-sampling Word-embedding Method.” In: arXiv preprint arXiv:1402.3722.
- QUOTE: One approach for parameterizing the skip-gram model follows the neural-network language models literature, and models the conditional probability [math]\displaystyle{ p(c \mid w; \theta) }[/math] using soft-max: [math]\displaystyle{ p(c \mid w; \theta) = \frac{e^{v_c \cdot v_w}}{\sum_{c' \in C} e^{v_{c'} \cdot v_w}} }[/math] where [math]\displaystyle{ v_c }[/math] and [math]\displaystyle{ v_w \in \mathbb{R}^d }[/math] are vector representations for [math]\displaystyle{ c }[/math] and [math]\displaystyle{ w }[/math] respectively, and [math]\displaystyle{ C }[/math] is the set of all available contexts.
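The soft-max parameterization quoted from (Goldberg & Levy, 2014) can be illustrated with a minimal sketch, in which each word and each context is referenced by a vector and the conditional probability of a context given a word is the normalized exponential of their dot product. The toy vocabulary, the 2-dimensional vector values, and the function names `dot` and `context_probabilities` are hypothetical and chosen only for illustration; they do not come from the cited papers.

```python
import math

# Hypothetical toy data: one word vector and three context vectors in R^2.
# The values are made up for illustration, not learned from any corpus.
word_vectors = {"cat": [0.5, 1.0]}
context_vectors = {
    "purrs": [1.0, 1.0],
    "barks": [-1.0, 0.0],
    "sleeps": [0.0, 0.5],
}


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def context_probabilities(word, word_vectors, context_vectors):
    """Return p(c | w) for every context c as a soft-max over v_c . v_w."""
    v_w = word_vectors[word]
    scores = {c: dot(v_c, v_w) for c, v_c in context_vectors.items()}
    # Subtracting the maximum score leaves the soft-max unchanged
    # but avoids overflow in exp() for large scores.
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    normalizer = sum(exps.values())  # the sum over all contexts c' in C
    return {c: e / normalizer for c, e in exps.items()}


probs = context_probabilities("cat", word_vectors, context_vectors)
for context, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"p({context} | cat) = {p:.3f}")
```

In this sketch the normalizer sums over only three contexts. With a realistic context set the sum becomes expensive to compute, which is the motivation for the negative-sampling approach named in the title of the cited paper.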