Using Pretrained Word Embeddings

Learn Text Classification With Python and Keras Douglas Starnes 02:02

Here are resources for more information about Word2Vec and GloVe:

00:00 Popular pretrained embeddings include Word2Vec from Google and GloVe from the NLP group at Stanford University. In this course, you’ll use GloVe because it’s smaller and faster to work with. Word2Vec is larger, but it’s also more accurate, so you can try it once you’ve seen GloVe working in your code.

00:20 The next snippet will download the GloVe data set and extract it using utility functions from earlier in the course. The embeddings were trained on 6 billion words, so the file is over 800 megabytes and will take a while to process.
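The course's own utility functions aren't shown in this transcript, so here is a minimal sketch of the download-and-extract step using only the standard library. The helper name download_and_extract, the URL, and the archive member name glove.6B.50d.txt are assumptions for illustration, not the course's code.

```python
import zipfile
from pathlib import Path
from urllib.request import urlretrieve

# Assumed location of the 6-billion-token GloVe archive; verify before use.
GLOVE_URL = "http://nlp.stanford.edu/data/glove.6B.zip"


def download_and_extract(url, zip_path, member, dest_dir):
    """Download a zip archive if it is missing, then extract one member.

    Hypothetical helper: both steps are skipped when their output already
    exists, so the large download only happens once.
    """
    zip_path = Path(zip_path)
    dest_dir = Path(dest_dir)

    # Download only when the archive is not already on disk.
    if not zip_path.exists():
        zip_path.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, zip_path)

    # Extract only the requested file instead of the whole archive.
    target = dest_dir / member
    if not target.exists():
        with zipfile.ZipFile(zip_path) as archive:
            archive.extract(member, path=dest_dir)
    return target
```

A call such as download_and_extract(GLOVE_URL, "data/glove.6B.zip", "glove.6B.50d.txt", "data/glove") would then return the path to the 50-dimensional vectors. Extracting a single member keeps disk usage down, since the archive bundles several embedding sizes.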

00:33 I’ll speed it up through the magic of video. Each line of the file contains a word followed by the embedding vector for that word. Use the next code to build a reduced version of the embedding matrix that keeps only the words in your vocabulary.

00:46 The embedding matrix is stored in an array with 1,747 rows, which is the length of the vocabulary, and 50 columns, which is the size of the embedding.
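The reduced matrix described above can be sketched as follows. This is an illustration under stated assumptions rather than the course's exact code: the function name create_embedding_matrix is hypothetical, word_index is assumed to map each vocabulary word to a positive integer index (as a Keras tokenizer would produce), and row 0 is assumed to be reserved for padding.

```python
import numpy as np


def create_embedding_matrix(filepath, word_index, embedding_dim):
    """Build a matrix holding GloVe vectors only for words in word_index.

    Assumes each line of the file is a word followed by its vector
    components, separated by whitespace.
    """
    # One extra row because index 0 is assumed to be reserved for padding.
    vocab_size = len(word_index) + 1
    embedding_matrix = np.zeros((vocab_size, embedding_dim))

    with open(filepath, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            word, vector = parts[0], parts[1:]
            if word in word_index:
                # Copy the pretrained vector into the row for this word.
                embedding_matrix[word_index[word]] = np.asarray(
                    vector[:embedding_dim], dtype=np.float32
                )
    return embedding_matrix
```

Words that don't appear in the GloVe file keep an all-zero row. With the vocabulary from this lesson and 50-dimensional vectors, the returned array's shape attribute should match the 1,747 by 50 dimensions mentioned above.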

01:33 Can you get better results by letting the embedding layer keep training on top of the pretrained weights? Let’s see. Set the trainable keyword argument of the embedding layer to True, then train and test the model once more and plot the results. You’re getting there!
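As a rough sketch of what setting trainable to True looks like, the model below loads a pretrained matrix into a Keras embedding layer and leaves that layer trainable. The layer stack, the sequence length maxlen, and the zero-filled placeholder matrix are assumptions for illustration; the course's actual model may differ, and it may pass the weights to the layer in another way.

```python
import numpy as np
from tensorflow import keras

vocab_size = 1747    # rows of the embedding matrix, from the lesson
embedding_dim = 50   # columns of the embedding matrix, from the lesson
maxlen = 100         # assumed padded sequence length

# Placeholder standing in for the GloVe-derived matrix.
embedding_matrix = np.zeros((vocab_size, embedding_dim), dtype="float32")

# trainable=True lets training fine-tune the pretrained vectors.
embedding_layer = keras.layers.Embedding(
    input_dim=vocab_size,
    output_dim=embedding_dim,
    trainable=True,
)

model = keras.Sequential([
    keras.Input(shape=(maxlen,)),
    embedding_layer,
    keras.layers.GlobalMaxPooling1D(),
    keras.layers.Dense(10, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# The layer is built once it is part of the model, so the pretrained
# weights can now be copied in as the starting point for training.
embedding_layer.set_weights([embedding_matrix])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

With trainable set to False instead, the embedding weights stay frozen and only the dense layers are updated during training.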

01:49 Again, this model needs to be trained for no more than 20 epochs.

01:54 In the next video, you’ll see an advanced type of neural network which can yield even better inferences.
