How to get text from mp4 and wav with Python

Install the SpeechRecognition module

pip install SpeechRecognition

This code grabs the text from a WAV audio file.

It also has a function, get_wav, that gets the WAV out of an mp4. This one uses ffmpeg, so you have to install ffmpeg first. Look into this blog to see how to do it. FFmpeg is a great free tool for manipulating audio and video files. I record my videos with it and do a lot of other things, like joining files.
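
Here is a minimal sketch of how the ffmpeg step could be prepared from Python. The helper names ffmpeg_available and build_wav_command are made up for this example, and the mono, 16 kHz output is only an assumption that is usually fine for speech, not a requirement.

```python
import shutil


def ffmpeg_available() -> bool:
    # shutil.which returns None when the program is not on the PATH
    return shutil.which("ffmpeg") is not None


def build_wav_command(videoname: str, wavname: str = "speech.wav") -> list:
    # Hypothetical helper: it only builds the argument list, it runs nothing.
    return [
        "ffmpeg",
        "-y",             # overwrite the output file if it already exists
        "-i", videoname,  # input video
        "-vn",            # drop the video stream, keep only audio
        "-ac", "1",       # mono (assumption)
        "-ar", "16000",   # 16 kHz sample rate (assumption)
        wavname,
    ]
```

The list can be handed to subprocess.run(command, check=True). Passing a list instead of one long string means file names with spaces need no extra quoting.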

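Very long files can cause trouble if you try to recognize them in one go, so I read the audio in chunks with a duration of 100 seconds, which works well. Each call to r.record continues from where the previous one stopped, so repeating r.record walks through the file chunk by chunk.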

import subprocess
import speech_recognition as sr


def get_wav(videoname: str, wavname: str = "speech.wav"):
    # Extract the audio straight to WAV; -y overwrites an existing file.
    # Passing a list avoids problems with spaces in the file name.
    subprocess.run(["ffmpeg", "-y", "-i", videoname, wavname], check=True)


def wav2ytext(language="en", wavname="speech.wav", chunks=6, duration=100):
    r = sr.Recognizer()
    with sr.AudioFile(wavname) as source:
        for _ in range(chunks):
            # Each call continues where the previous one stopped
            audio = r.record(source, duration=duration)
            if not audio.frame_data:
                break  # reached the end of the file
            try:
                print(r.recognize_google(audio, language=language))
            except sr.UnknownValueError:
                print("[could not understand this chunk]")
            except sr.RequestError as e:
                print(f"Request to the Google API failed: {e}")
                break
    print("Done")


get_wav("marketing_mix.mp4")  # comment this out if speech.wav already exists
wav2ytext("en")
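
How many 100-second chunks a file needs depends on its length. A small sketch with the standard wave module can work that out. The names wav_duration_seconds and chunk_count are invented for this example, and it assumes speech.wav is an ordinary uncompressed PCM WAV file.

```python
import math
import wave


def wav_duration_seconds(path: str) -> float:
    # Duration = number of audio frames divided by frames per second
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def chunk_count(total_seconds: float, chunk_seconds: int = 100) -> int:
    # Round up so that a last, shorter chunk is still counted
    return math.ceil(total_seconds / chunk_seconds)
```

For example, chunk_count(wav_duration_seconds("speech.wav")) gives the number of r.record calls needed to cover the whole file.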


Published by pythonprogramming

Started with BASIC on the Spectrum, loved JavaScript in the '90s and Python in the 2000s; now I am back with Python, still making some JavaScript stuff when needed.