Using the Recognizer Class

Speech Recognition With Python Darren Jones 03:47

00:00 The Recognizer class. All of the magic in SpeechRecognition happens within the Recognizer class. The primary purpose of each Recognizer instance is, of course, to recognize speech.

00:14 Each instance comes with a variety of settings and functionality for recognizing speech from an audio source. Creating an instance is easy. In your Python REPL, just type the following.

01:14 Of the seven .recognize_*() methods of the Recognizer class, only .recognize_sphinx() works offline, using the CMU Sphinx engine. The other six all require an internet connection, so keep this in mind as you work.

01:26 Due to the complexity of speech recognition, a full discussion of the features and benefits of each API is beyond the scope of this course. Since SpeechRecognition ships with a default API key for the Google Web Speech API, you can get started with it straight away.

01:43 For this reason, you’ll be using the Web Speech API in this course. The other six APIs all require authentication with either an API key or a username/password combination.

01:55 For more information, consult the SpeechRecognition documentation. An important note is that the default key provided by SpeechRecognition is for testing purposes only, and Google may revoke it at any time.

02:10 It is not a good idea to use the Google Web Speech API in production. Even with a valid API key, you’ll be limited to only 50 requests per day, and there is no way to raise this quota. Fortunately, SpeechRecognition’s interface is nearly identical for each API, so what you learn in this course will be easy to translate to a real-world project.

02:34 Each .recognize_*() method will raise a speech_recognition.RequestError exception if the API is unreachable. For .recognize_sphinx(), this could happen as a result of a missing, corrupt, or incompatible Sphinx installation. For the other six methods, RequestError may be raised if quota limits are met, the server is unavailable, or there’s no internet connection.

02:58 With those prerequisites out of the way, let’s get our hands dirty. Go ahead and try to call .recognize_google() in your interpreter session.

03:09 You probably got an error similar to the one onscreen, and you may well have guessed this would happen. After all, how could something be recognized from nothing?

03:19 All seven .recognize_*() methods of the Recognizer class require an audio_data argument. In each case, audio_data must be an instance of SpeechRecognition’s AudioData class.

03:32 There are two ways to create an AudioData instance: either from an audio file or from audio recorded by a microphone. Audio files are a little easier to get started with, so let’s take a look at those in the next section.
