Using the Recognizer Class
Each Recognizer instance has seven methods for recognizing speech from an audio source using various APIs. These are:
.recognize_bing(), Microsoft Bing Speech;
.recognize_google(), Google Web Speech API;
.recognize_google_cloud(), Google Cloud Speech, which requires installation of the google-cloud-speech package;
.recognize_houndify(), Houndify by SoundHound;
.recognize_ibm(), IBM Speech to Text;
.recognize_sphinx(), CMU Sphinx, which requires installation of PocketSphinx; and finally,
.recognize_wit(), which uses Wit.ai.
01:26 Due to the complexity of speech recognition, a full discussion of the features and benefits of each API is beyond the scope of this course. Since SpeechRecognition ships with a default API key for the Google Web Speech API, you can get started with it straight away.
01:55 For more information, consult the SpeechRecognition documentation. An important note is that the default key provided by SpeechRecognition is for testing purposes only, and Google may revoke it at any time.
02:10 It is not a good idea to use the Google Web Speech API in production. Even with a valid API key, you’ll be limited to only 50 requests per day, and there is no way to raise this quota. Fortunately, SpeechRecognition’s interface is nearly identical for each API, so what you learn in this course will be easy to translate to a real-world project.
Each .recognize_*() method will throw a speech_recognition.RequestError exception if the API is unreachable. For .recognize_sphinx(), this could happen as a result of a missing, corrupt, or incompatible Sphinx installation. For the other six methods, RequestError may be thrown if quota limits are exceeded, the server is unavailable, or there's no internet connection.
There are two ways to create an AudioData instance: either from an audio file or from audio recorded by a microphone. Audio files are a little easier to get started with, so let's take a look at that in the next section.