Using Pretrained Word Embeddings
00:00 Some popular pretrained embeddings include Word2Vec from Google and GloVe from the NLP group at Stanford University. In this course, you’ll use GloVe because of its smaller size and faster loading. Word2Vec is larger and often more accurate, so you can try swapping it in once you’ve seen GloVe working in your code.
00:20 The next snippet downloads the GloVe data set and extracts it using utility methods from earlier in the course. It’s trained on 6 billion tokens, so the archive is over 800 megabytes and will take a while to download and process.
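The course’s utility methods aren’t shown here, but a minimal stand-in using only the standard library might look like this. The URL points at Stanford’s GloVe 6B archive; the destination folder name and the choice of the 100-dimensional file are assumptions for illustration.

```python
import pathlib
import urllib.request
import zipfile

# Stanford's GloVe archive trained on 6 billion tokens (~822 MB download).
GLOVE_URL = "http://nlp.stanford.edu/data/glove.6B.zip"

def download_glove(dest="glove"):
    """Download and extract the GloVe 6B archive, returning one vectors file."""
    dest = pathlib.Path(dest)
    dest.mkdir(exist_ok=True)
    archive = dest / "glove.6B.zip"
    if not archive.exists():  # skip the download on repeated runs
        urllib.request.urlretrieve(GLOVE_URL, archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    # The archive contains 50d, 100d, 200d, and 300d files; pick one dimension.
    return dest / "glove.6B.100d.txt"
```

Calling `download_glove()` once caches the archive locally, so re-running the notebook doesn’t repeat the 800 MB download.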
Before using this matrix in a model, take a look at the number of non-zero rows it contains: a little more than 95% of the vocabulary is covered by the matrix. To use the matrix in the model, assign the embedding matrix to the weights keyword argument of the Embedding layer, and set the trainable keyword argument to False. The matrix has already been trained, so you don’t need to do anything else for it to work. Train the model, test it, and look at the results, and this is better than before.
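As a toy sketch of those steps, the snippet below builds an embedding matrix for a tiny hand-made vocabulary, measures how much of the vocabulary is covered by non-zero rows, and shows (as a comment) how the matrix would be plugged into a frozen Keras Embedding layer. The vocabulary, the two-dimensional vectors, and the word "zxqv" are all invented stand-ins for the tokenizer output and GloVe file used in the course.

```python
import numpy as np

# Toy stand-ins: in the course these come from the tokenizer and the GloVe file.
vocab = {"<pad>": 0, "movie": 1, "great": 2, "zxqv": 3}  # "zxqv" is out-of-vocabulary
glove = {  # word -> vector, as parsed from e.g. glove.6B.100d.txt
    "movie": np.array([0.1, 0.2]),
    "great": np.array([0.3, 0.4]),
}
embedding_dim = 2

# Rows for words missing from GloVe stay all-zero.
embedding_matrix = np.zeros((len(vocab), embedding_dim))
for word, idx in vocab.items():
    vector = glove.get(word)
    if vector is not None:
        embedding_matrix[idx] = vector

# Coverage: fraction of vocabulary rows that have at least one non-zero element.
nonzero_rows = np.count_nonzero(np.count_nonzero(embedding_matrix, axis=1))
coverage = nonzero_rows / len(vocab)
print(f"{coverage:.0%} of the vocabulary is covered")  # prints "50% ..." here

# In Keras, the pretrained matrix is plugged in and frozen like this (sketch):
# layer = keras.layers.Embedding(
#     input_dim=len(vocab),
#     output_dim=embedding_dim,
#     weights=[embedding_matrix],
#     trainable=False,  # the GloVe weights stay fixed during training
# )
```

With the real GloVe file and the course’s vocabulary, the same coverage check is where the “a little more than 95%” figure comes from.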