Using the Recognizer Class

00:00 The Recognizer class. All of the magic in SpeechRecognition happens within the Recognizer class. The primary purpose of each Recognizer instance is, of course, to recognize speech.

00:14 Each instance comes with a variety of settings and functionality for recognizing speech from an audio source. Creating an instance is easy: in your Python REPL, import the package and call Recognizer() with no arguments.
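A minimal sketch of that REPL step, assuming the package is installed and imported under its usual name speech_recognition. The sr alias and the guarded import are conveniences added here, not something the lesson requires:

```python
# Guarded import: SpeechRecognition is a third-party package,
# so it may be missing from this environment.
try:
    import speech_recognition as sr
except ImportError:
    sr = None  # package not available

# Recognizer() is called with no arguments.
r = sr.Recognizer() if sr is not None else None
print(type(r).__name__)
```

In an interactive session where the package is known to be installed, the import and the Recognizer() call are all that is needed.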

00:32 Each Recognizer instance has seven methods for recognizing speech from an audio source using various APIs. These are: .recognize_bing(), Microsoft Bing Speech; .recognize_google(), Google Web Speech API; .recognize_google_cloud(), Google Cloud Speech, which requires installation of the google-cloud-speech package; .recognize_houndify(), Houndify by SoundHound; .recognize_ibm(), IBM Speech to Text; .recognize_sphinx(), CMU Sphinx, which requires installation of PocketSphinx; and finally, .recognize_wit(), which uses Wit.ai.

01:14 Of the seven, only .recognize_sphinx() works offline with the CMU Sphinx engine. The other six all require an internet connection, so keep this in mind as you work.
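The seven methods and their online or offline status can be summarized as a small lookup table. This is only a sketch of the list given in the lesson: the names RECOGNIZE_METHODS and offline_methods are hypothetical helpers, and the code does not import or call the library itself.

```python
# Method name -> (service, needs an internet connection), as listed in the lesson.
RECOGNIZE_METHODS = {
    "recognize_bing": ("Microsoft Bing Speech", True),
    "recognize_google": ("Google Web Speech API", True),
    "recognize_google_cloud": ("Google Cloud Speech", True),
    "recognize_houndify": ("Houndify by SoundHound", True),
    "recognize_ibm": ("IBM Speech to Text", True),
    "recognize_sphinx": ("CMU Sphinx", False),
    "recognize_wit": ("Wit.ai", True),
}


def offline_methods(methods=RECOGNIZE_METHODS):
    """Return the method names that work without an internet connection."""
    return [name for name, (_, needs_internet) in methods.items() if not needs_internet]


print(offline_methods())  # ['recognize_sphinx']
```

Other releases of the library may offer a different set of methods, so treat the table as a snapshot of what this lesson lists.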

01:26 Due to the complexity of speech recognition, a full discussion of the features and benefits of each API is beyond the scope of this course. Since SpeechRecognition ships with a default API key for the Google Web Speech API, you can get started with it straight away.

01:43 For this reason, you’ll be using the Web Speech API in this course. The other six APIs all require authentication with either an API key or a username/password combination.

01:55 For more information, consult the SpeechRecognition documentation. An important note is that the default key provided by SpeechRecognition is for testing purposes only, and Google may revoke it at any time.

02:10 It is not a good idea to use the Google Web Speech API in production. Even with a valid API key, you’ll be limited to only 50 requests per day, and there is no way to raise this quota. Fortunately, SpeechRecognition’s interface is nearly identical for each API, so what you learn in this course will be easy to translate to a real-world project.

02:34 Each .recognize_*() method will raise a speech_recognition.RequestError exception if the API is unreachable. For .recognize_sphinx(), this could happen as a result of a missing, corrupt, or incompatible Sphinx installation. For the other six methods, RequestError may be raised if quota limits are met, the server is unavailable, or there’s no internet connection.
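One way to guard against this exception is to wrap the call in a try/except block. The sketch below assumes a hypothetical helper named transcribe that is handed a Recognizer and an AudioData instance. Returning None on failure is a choice made for this illustration, not part of the library.

```python
def transcribe(recognizer, audio_data):
    """Return the transcript, or None if the API could not be reached."""
    # Imported lazily so the helper can be defined even where the
    # package is not installed (a convenience for this sketch).
    import speech_recognition as sr

    try:
        return recognizer.recognize_google(audio_data)
    except sr.RequestError as exc:
        # Quota reached, server unavailable, or no internet connection.
        print(f"API unavailable: {exc}")
        return None
```

A caller can then check for None and decide whether to retry, fall back to another API, or report the failure.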

02:58 With those prerequisites out of the way, let’s get our hands dirty. Go ahead and try to call .recognize_google() in your interpreter session.

03:09 You probably got an error similar to the one onscreen, and you may well have guessed this would happen. After all, how could something be recognized from nothing?

03:19 All seven .recognize_*() methods of the Recognizer class require an audio_data argument. In each case, audio_data must be an instance of SpeechRecognition’s AudioData class.
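The failed call can be reproduced roughly as follows. This is a sketch under the assumption that the package is installed. The exact error message varies between versions, so the code only relies on the exception being a TypeError, which is what Python raises for a missing required argument.

```python
# Guarded import so the sketch also runs where the package is missing.
try:
    import speech_recognition as sr
except ImportError:
    sr = None

error = None
if sr is not None:
    r = sr.Recognizer()
    try:
        r.recognize_google()  # no audio_data supplied
    except TypeError as exc:
        error = exc
        print(f"TypeError: {exc}")
```

No network request is made here, because Python rejects the call before the method body runs.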

03:32 There are two ways to create an AudioData instance: either from an audio file or from audio recorded by a microphone. Audio files are a little easier to get started with, so let’s take a look at those in the next section.
