Actually working
Let’s extract all the text from every PowerPoint file in a folder, using python-pptx.
The documentation of python-pptx
If you want to extract text:
- import Presentation from pptx (pip install python-pptx)
- for each file in the directory (using glob module)
- look at every slide, and at every shape in each slide
- if a shape has a text attribute, print shape.text
from pptx import Presentation
import glob

for eachfile in glob.glob("*.pptx"):
    prs = Presentation(eachfile)
    print(eachfile)
    print("----------------------")
    for slide in prs.slides:
        for shape in slide.shapes:
            # Pictures, tables and other shapes without .text are skipped
            if hasattr(shape, "text"):
                print(shape.text)
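If you would rather collect the text than print it, or you also need text that sits inside grouped shapes and tables (neither has a text attribute of its own, so the hasattr check skips them), something along these lines might work. This is only a sketch: the names iter_shape_text and extract_folder_text are made up for this example, and it assumes group shapes expose a nested .shapes collection and table frames expose has_table / table.rows / cells, as python-pptx documents them.

```python
import glob
import os


def iter_shape_text(shapes):
    """Yield non-empty text from shapes, descending into groups and tables.

    Hypothetical helper, not part of python-pptx.
    """
    for shape in shapes:
        if hasattr(shape, "shapes"):
            # Group shape: recurse into its child shapes.
            yield from iter_shape_text(shape.shapes)
        elif getattr(shape, "has_table", False):
            # Table frame: the text lives in the individual cells.
            for row in shape.table.rows:
                for cell in row.cells:
                    if cell.text:
                        yield cell.text
        elif hasattr(shape, "text") and shape.text:
            yield shape.text


def extract_folder_text(folder="."):
    """Map each .pptx path in `folder` to a list of per-slide text lists."""
    # Imported here so the helper above can be used on its own.
    from pptx import Presentation  # pip install python-pptx

    result = {}
    for path in sorted(glob.glob(os.path.join(folder, "*.pptx"))):
        prs = Presentation(path)
        result[path] = [list(iter_shape_text(slide.shapes)) for slide in prs.slides]
    return result
```

Calling extract_folder_text(".") should then give you a dictionary keyed by file name, with one list of strings per slide, which is easier to search or write to a file than printed output.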