Here are some resources for more information about topics covered in this lesson:
Understanding the Effect of Noise
To get a feel for how noise can affect speech recognition, download the jackhammer.wav file. As always, make sure you save this to your interpreter session’s working directory. This file has the phrase “the stale smell of old beer lingers” spoken with a loud jackhammer in the background.
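By default, .adjust_for_ambient_noise() reads the first second of the file stream to calibrate the recognizer to the noise level of the audio. Hence, that portion of the stream is consumed before you call .record() to capture the data. You can adjust the time frame that .adjust_for_ambient_noise() uses for analysis with the duration keyword argument.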
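As a rough sketch of how the calibration step fits together with .record(), the function below shortens the analysis window to half a second. The name transcribe_with_calibration and the 0.5-second default are illustrative choices, not something taken from the course, and the code assumes the third-party SpeechRecognition package is installed and the Google Web Speech API is reachable.

```python
def transcribe_with_calibration(path, calibration_seconds=0.5):
    """Transcribe an audio file after calibrating for ambient noise.

    Hypothetical helper for illustration. It needs the third-party
    SpeechRecognition package and network access to the Google Web
    Speech API.
    """
    # Imported here so the third-party dependency is only needed
    # when the function is actually called.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        # Reads the first `calibration_seconds` of the stream to estimate
        # the noise level. Those seconds are consumed from the stream.
        recognizer.adjust_for_ambient_noise(source, duration=calibration_seconds)
        # Captures whatever is left of the stream after calibration.
        audio = recognizer.record(source)
    return recognizer.recognize_google(audio)
```

Calling transcribe_with_calibration("jackhammer.wav") would calibrate on the first half second and transcribe the rest. A shorter window leaves more of the phrase for .record(), but it also gives the recognizer less audio to estimate the noise level from.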
02:19 It looks like that has allowed SpeechRecognition to pass the entire spoken phrase for recognition, but now you’re back to where you were before with a transcription that’s the same as if noise adjustment hadn’t been used.
02:42 If you find yourself running up against these issues frequently, you may have to resort to some pre-processing of the audio. This can be done with audio editing software or a Python package such as SciPy that can apply filters to the files.
02:56 A detailed discussion of this is beyond the scope of this course, but check out Allen Downey’s Think DSP book if you’re interested in it. For now, just be aware that ambient noise in an audio file can cause problems and must be addressed in order to maximize the accuracy of speech recognition.
The .recognize_google() method always returns the most likely transcription unless you force it to give you the full response. You can do this by setting the show_all keyword argument of the .recognize_google() method to True.
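To make the full response easier to inspect, here is a minimal sketch that requests it and pulls out every candidate transcript. The helper names transcript_candidates and recognize_all are made up for this example, and the response layout (a dictionary with an "alternative" list whose entries carry a "transcript" key) is an assumption about what the API returns, so check it against the output you actually get.

```python
def transcript_candidates(response):
    """Return every candidate transcript found in a full API response.

    Assumption: the response is a dict with an "alternative" list of
    dicts that each carry a "transcript" key. Anything else (for
    example an empty result when nothing was recognized) yields [].
    """
    if not isinstance(response, dict):
        return []
    return [
        alternative["transcript"]
        for alternative in response.get("alternative", [])
        if "transcript" in alternative
    ]


def recognize_all(path):
    """Hypothetical helper: list all candidate transcripts for a file."""
    # Third-party SpeechRecognition package, imported on demand.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)
    # show_all=True asks for the full response instead of only the
    # most likely transcription.
    response = recognizer.recognize_google(audio, show_all=True)
    return transcript_candidates(response)
```

Looking through the candidates can help when the most likely transcription is wrong because of noise, since a closer match sometimes appears further down the list.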
You’ve seen how to create an AudioFile instance from an audio file and how to use the .record() method to capture data from the file. You learned how to record segments of a file using the offset and duration keyword arguments of .record(), and you experienced the detrimental effect noise can have on transcription accuracy.
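As a quick recap in code, the sketch below captures only part of a file by passing offset and duration to .record(). The function name transcribe_segment is hypothetical, and the code again assumes the third-party SpeechRecognition package and access to the Google Web Speech API.

```python
def transcribe_segment(path, offset=None, duration=None):
    """Transcribe only a segment of an audio file.

    Hypothetical helper for illustration. `offset` is the number of
    seconds to skip from the current position in the stream, and
    `duration` is the number of seconds to capture after that.
    """
    # Third-party SpeechRecognition package, imported on demand.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        # With both arguments left as None, the whole file is captured.
        audio = recognizer.record(source, offset=offset, duration=duration)
    return recognizer.recognize_google(audio)
```

For example, transcribe_segment("jackhammer.wav", offset=1, duration=2) would skip the first second and send the next two seconds for recognition.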