Convert Text Into a Word List
00:00 Convert Text Into a Word List. You probably already have a word list on your system, and you can download word lists online. But for flexibility and control, you may want to create your own.
00:13 This allows you to create special themed word lists containing programming-related terms, city names, or non-English words, among many other options. You’ll create a script that converts any text file into a nicely formatted word list that you can use in your Wordle clone.
00:30 Add a new file named create_wordlist.py to your project with the content seen on-screen.
00:49 The script uses sys.argv to read information from the command line. In particular, you are expected to provide the path to an existing text file and the location of the new word list file.
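As a rough sketch of how two command-line arguments can be read, assuming the paths are wrapped in pathlib.Path objects: the helper name parse_paths and the usage message are hypothetical and not taken from the on-screen code.

```python
import pathlib
import sys


def parse_paths(argv=None):
    """Return (in_path, out_path) from an argv-style list.

    argv[0] is the script name, so the two paths are argv[1] and argv[2].
    """
    if argv is None:
        argv = sys.argv
    if len(argv) < 3:
        # Hypothetical usage message; the on-screen script may not check this.
        raise SystemExit(f"usage: {argv[0]} INPUT_TEXT OUTPUT_WORDLIST")
    return pathlib.Path(argv[1]), pathlib.Path(argv[2])


# Simulate: python create_wordlist.py wyrdl.py wordlist.txt
in_path, out_path = parse_paths(["create_wordlist.py", "wyrdl.py", "wordlist.txt"])
print(in_path, out_path)
```

Passing the list explicitly keeps the example runnable anywhere; in a real script you would call parse_paths() with no arguments so that it falls back to sys.argv.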
01:01 The first two command-line arguments are converted to paths named in_path and out_path.
01:25 This line acts as a filter in your set comprehension and won’t pass through words that contain any character that isn’t an ASCII letter. In practice, it only allows the letters A to Z, in upper or lower case.
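To see the effect of such a filter, here is a minimal sketch. It assumes the text is split on whitespace and that words are lowercased, neither of which the narration states; the helper name only_ascii_letters and the sample text are made up for illustration.

```python
from string import ascii_letters


def only_ascii_letters(word):
    """Return True if every character in word is one of a-z or A-Z."""
    return all(letter in ascii_letters for letter in word)


text = "Hello, world! The naïve snake_case hello"

# A set comprehension: duplicates collapse, and the condition drops any
# token containing punctuation, underscores, or non-ASCII characters.
words = {word.lower() for word in text.split() if only_ascii_letters(word)}

print(sorted(words))  # ['hello', 'the']
```

Note that "Hello," and "world!" are rejected because of the attached punctuation. A filter like this is one reason why only some of the words in a text end up in the word list.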
01:41 Note that key is a single lambda expression that takes a word and returns a tuple with two elements. The first element is the length of the word, so the words are sorted by length in ascending order. The second is the word itself, allowing words of equal length to be sorted in alphabetical order.
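A small example may make the sort order concrete. The sample words are made up; the key shown is one way to sort by length first and alphabetically second, because Python compares tuples element by element.

```python
words = {"pear", "fig", "apple", "kiwi", "yam"}

# Each word is mapped to (length, word). Tuples are compared left to right,
# so length decides first and the word itself breaks ties.
ordered = sorted(words, key=lambda word: (len(word), word))

print(ordered)  # ['fig', 'yam', 'kiwi', 'pear', 'apple']
```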
01:55 This isn’t strictly necessary, but makes reading the list easier. Finally, the word list is written to the output file at the path specified in the second argument. Once saved, you can use the script to convert the source code of your current Wordle clone into a word list, as seen on-screen.
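The final write step could look roughly like the sketch below, which assumes one word per line and UTF-8 encoding. The function name write_wordlist is hypothetical, and a temporary directory is used so that running the example doesn’t touch your project files.

```python
import pathlib
import tempfile


def write_wordlist(words, out_path):
    """Write one word per line to out_path, replacing any existing file."""
    pathlib.Path(out_path).write_text("\n".join(words), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    target = pathlib.Path(tmp) / "wordlist.txt"
    write_wordlist(["fig", "yam", "kiwi"], target)

    # Read the file back to confirm the format.
    saved = target.read_text(encoding="utf-8")
    print(saved.split("\n"))  # ['fig', 'yam', 'kiwi']
```

Because write_text() replaces the file’s contents, pointing the script at an existing word list overwrites it.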
02:20 This reads wyrdl.py, looks for words, and stores them in wordlist.txt in the current directory. Note that this overwrites the word list which you created manually.
02:31 Take a look at your new word list. You’ll recognize some of the words from your code. However, note that only some words went into the word list. Also, note that you don’t filter out words that are longer or shorter than five letters. You could do that when you are building the word list, but by leaving that job for your wyrdl.py code, you gain some flexibility.
02:56 This makes it possible to use a general word list and to change the word length in the game. Maybe you want to create a Wordle variant that quizzes the user about seven-letter words. You can now generate your personal word list.
03:10 Find any plain text file and run create_wordlist.py on it. You could, for example, download the complete works of Shakespeare to create an old-style word list, or a version of Alice in Wonderland retold in words of one syllable to create a list of simpler words.
03:28 Since your word list now contains words that aren’t five letters long, you should add a filter to your list comprehension that parses the word list. Start by importing ascii_letters.
03:43 Remove .strip() since empty words will be filtered in the next line,
03:50 which filters on length and ensures the words only contain ASCII letters.
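A minimal sketch of such a filter on the game side follows. The names load_words and WORD_LENGTH are hypothetical, and uppercasing the words is an assumption about how the game stores them; the on-screen code may differ.

```python
from string import ascii_letters

WORD_LENGTH = 5  # Hypothetical constant; change it for a seven-letter variant.


def load_words(text, length=WORD_LENGTH):
    """Return uppercased words of the given length made of ASCII letters."""
    return [
        word.upper()
        for word in text.split("\n")
        # The length check also discards empty lines, so no .strip() is needed.
        if len(word) == length and all(letter in ascii_letters for letter in word)
    ]


sample = "snake\nfig\ncrane\nnaïve\n\npython\n"
print(load_words(sample))            # ['SNAKE', 'CRANE']
print(load_words(sample, length=6))  # ['PYTHON']
```

Keeping the length check in the game rather than in create_wordlist.py means the same general word list can serve variants with different word lengths.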
04:05 Remember that you can check your code against the code in the course materials if you have any problems running the game. There are versions of the files included for each numbered section of the course, so you should be able to get back on track if you need to.
04:17 But also remember that it’s worth spending some time looking at code that doesn’t work and seeing if you can figure out what’s happening that isn’t intended. This is a key skill that you’ll use a lot.
04:29 Your game is more interesting now that the secret word is chosen at random from a word list, and later on in the course, you’ll work on making the user experience nicer with more intuitive feedback.
04:42 But first, you’ll reorganize your code so it’ll be easier to work with in the long run.
Dee82 on Sept. 28, 2023
I found the description of line 16 re: the key variable and lambda function confusing. It looks like key will be assigned a tuple containing the length of the word and the word itself, rather than having 2 arguments. The argument to the lambda function is just the word from what I’m seeing. Might be helpful to edit this for clarity.