0. Reference
https://aclanthology.org/D15-1036/
Tobias Schnabel, Igor Labutov, David Mimno, Thorsten Joachims. "Evaluation Methods for Unsupervised Word Embeddings." Proceedings of EMNLP 2015.

1. Introduction
- This paper deals with how to measure the quality of embedding vectors.
- Existing evaluation methods for word embedding vectors fall into two broad categories: i) Extrinsic …
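To make the intrinsic category concrete, here is a minimal sketch of the standard word-similarity protocol: rank-correlate human similarity judgments with cosine similarities between the corresponding embedding vectors. The vectors and judgment scores below are toy data invented for illustration, not values from the paper.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def spearman(x, y):
    # Spearman rank correlation (no tie handling; illustration only).
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / (np.linalg.norm(rx) * np.linalg.norm(ry)))

# Toy 2-D embeddings; real embeddings are typically 50-300 dimensional.
emb = {
    "cat": np.array([0.9, 0.1]),
    "dog": np.array([0.8, 0.2]),
    "car": np.array([0.1, 0.9]),
}

# (word1, word2, human similarity score) -- hypothetical judgments,
# in the style of datasets such as WordSim-353.
pairs = [("cat", "dog", 8.5), ("cat", "car", 2.0), ("dog", "car", 2.5)]

human = [s for _, _, s in pairs]
model = [cosine(emb[a], emb[b]) for a, b, _ in pairs]
print(round(spearman(human, model), 3))  # high correlation = good embedding
```

A higher rank correlation means the embedding's geometry agrees with human intuitions about similarity, which is exactly what intrinsic evaluations test without any downstream task.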
0. Reference
https://nlp.stanford.edu/pubs/glove.pdf
"GloVe: Global Vectors for Word Representation"

1. Introduction
- Most word embeddings are evaluated on intrinsic qualities such as vector norms or cosine similarity.
- Instead of plain distance, wouldn't evaluating the structure of differences between vectors yield a better measure of quality? "king - queen ~ man - woman"
- In other words, evaluate whether the vectors capture the various dimensions of meaning between words.
- First, methods for learning word vectors fall into two broad families: i) Global matrix factorization (e.g., Latent Semantic Analysis) …
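The "king - queen ~ man - woman" idea can be sketched as an analogy query: find the word whose vector is closest to b - a + c. The tiny 2-D vectors below are made up for illustration and are not real GloVe embeddings.

```python
import numpy as np

# Toy embeddings in which the "gender" offset is roughly shared
# between (king, queen) and (man, woman).
emb = {
    "king":  np.array([0.9, 0.80]),
    "queen": np.array([0.9, 0.10]),
    "man":   np.array([0.3, 0.75]),
    "woman": np.array([0.3, 0.05]),
    "apple": np.array([0.1, 0.50]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogy(a, b, c, emb):
    # Answer "a is to b as c is to ?" by maximizing cos(d, b - a + c),
    # excluding the query words themselves (the standard convention).
    target = emb[b] - emb[a] + emb[c]
    candidates = {w: v for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(candidates[w], target))

print(analogy("man", "woman", "king", emb))  # -> "queen"
```

This is exactly the "structure of differences" evaluation: it passes only if the offset woman - man points in (nearly) the same direction as queen - king.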
0. Reference
https://arxiv.org/abs/1310.4546
"Distributed Representations of Words and Phrases and their Compositionality"

1. Introduction …
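As background for the Skip-gram model this paper extends, here is a minimal sketch of how Skip-gram frames its training data: each center word is paired with every word inside a symmetric context window. The sentence and window size are arbitrary examples.

```python
def skipgram_pairs(tokens, window=2):
    """Yield (center, context) training pairs for a symmetric window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

sent = "the quick brown fox jumps".split()
print(skipgram_pairs(sent, window=1))
# first pairs: ('the', 'quick'), ('quick', 'the'), ('quick', 'brown'), ...
```

The model then learns embeddings by predicting the context word from the center word for each such pair; the paper's contributions (e.g., negative sampling) change how that prediction is scored, not how the pairs are formed.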
0. Reference
https://arxiv.org/abs/1301.3781
"Efficient Estimation of Word Representations in Vector Space"

1. Introduction
- The goal of this paper is, …