Find all the files in a directory
How can we find files on our computer with Python? There are many ways to do it; let's start with the first one.
Using os.listdir()
Import the os module and call os.listdir() to see all the files in a directory. In Python 3, if you pass no argument, it returns a list of all the files and folders in the current directory. In Python 2 you must call os.listdir(".") to see the files in the current directory.
# First try
import os

f_list = os.listdir()
If you want a particular directory, pass its path as an argument.
os.listdir("C:\\documents")
This example lists the files in the documents folder on drive C:.
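If the path does not exist, os.listdir raises FileNotFoundError. Here is a minimal sketch of a wrapper that returns an empty list instead. The name list_dir_safe is my own invention for this example, not something the os module provides, and the demo runs on a temporary folder so that no real path is assumed.

```python
import os
import tempfile


def list_dir_safe(path="."):
    """Return the sorted entries of `path`, or [] if it cannot be listed.

    Hypothetical helper for illustration, not part of the os module.
    """
    try:
        return sorted(os.listdir(path))
    except (FileNotFoundError, NotADirectoryError):
        return []


# Demo on a throwaway directory, so no real path is assumed.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "work.txt"), "w").close()
    os.mkdir(os.path.join(tmp, "documents"))
    existing = list_dir_safe(tmp)
    missing = list_dir_safe(os.path.join(tmp, "no_such_folder"))

print(existing)
print(missing)
```

Whether an empty list or an exception is the better answer depends on your script; swallowing the error is only reasonable when a missing folder is an expected situation.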
Get a list of all the files
os.listdir(): get files in current dir (Python 3)
The simplest way to get the files in the current directory in Python 3 is this: use the os module and its listdir() function. You get the files in that directory, along with any folders it contains, but not the files inside the subdirectories; for those you can use walk, which I will talk about later.
import os

arr = os.listdir()
arr
# output: ['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']
Getting the full path name
As you noticed, the code above does not give you the full path of each file. If you need the absolute path, you can use os.path.abspath, passing each name that os.listdir() returns as the argument. There are other ways to get the full path, as we will see later. (An earlier version of this post used _getfullpathname; I replaced it with abspath, as suggested by mexmex.)
import os

files_path = [os.path.abspath(x) for x in os.listdir()]
files_path
# output: ['F:\\documenti\\applications.txt', 'F:\\documenti\\collections.txt']
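One thing to keep in mind: os.path.abspath resolves a bare name against the current working directory, so it gives the right answer only for names that come from the current directory. For any other folder, a safer approach is to join the folder with each name. This is a small sketch of that idea; full_paths is a hypothetical helper name, and the files are created in a temporary folder just for the demo.

```python
import os
import tempfile


def full_paths(folder):
    """Return the absolute path of every entry in `folder`.

    Hypothetical helper for illustration.
    """
    folder = os.path.abspath(folder)
    return [os.path.join(folder, name) for name in sorted(os.listdir(folder))]


# Demo: two empty files in a temporary folder that is NOT the current directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("applications.txt", "collections.txt"):
        open(os.path.join(tmp, name), "w").close()
    paths = full_paths(tmp)

for p in paths:
    print(p)
```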
Get the full path name of a type of file into all subdirectories with walk
I find this very useful for searching through many directories; it helped me find a file whose name I could not remember:
import os

thisdir = os.getcwd()
# r = current folder, d = its subfolders, f = its files
for r, d, f in os.walk(thisdir):
    for file in f:
        if file.endswith(".docx"):
            print(os.path.join(r, file))
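The same search can be wrapped in a reusable generator. This is only a sketch under my own assumptions: find_by_extension is a made-up name, and I chose to compare the extension case-insensitively, so that report.DOCX is found too. Checking the end of the name with endswith also means that a file like docx_notes.txt is skipped.

```python
import os
import tempfile


def find_by_extension(top, extension=".docx"):
    """Yield the full path of each file under `top` whose name ends with `extension`.

    The comparison ignores case. Hypothetical helper for illustration.
    """
    extension = extension.lower()
    for root, dirs, files in os.walk(top):
        for name in files:
            if name.lower().endswith(extension):
                yield os.path.join(root, name)


# Demo tree: two Word files at different depths and one unrelated txt file.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b"))
    for rel in ("report.DOCX",
                os.path.join("a", "notes.docx"),
                os.path.join("a", "b", "docx_notes.txt")):
        open(os.path.join(tmp, rel), "w").close()
    found = sorted(os.path.relpath(p, tmp) for p in find_by_extension(tmp))

print(found)
```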
os.listdir(): get files in current dir (Python 2)
import os

arr = os.listdir('.')
arr
# output: ['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']
To go up in the directory tree
import os

# method 1: the parent of the current directory
x = os.listdir('..')
# method 2: the root directory
x = os.listdir('/')
os.listdir(): get files in a particular directory (Python 2 and 3)
import os

arr = os.listdir('F:\\python')
arr
# output: ['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']
Get files of a particular subdirectory with os.listdir()
import os

x = os.listdir("./content")
os.walk('.') – current directory
import os

# next(os.walk('.')) returns (root, dirs, files) for the current directory
arr = next(os.walk('.'))[2]
arr
# output: ['5bs_Turismo1.pdf', '5bs_Turismo1.pptx', 'esperienza.txt']
glob module – all files
import glob

print(glob.glob("*"))
# output: ['content', 'start.py']
next(os.walk('.')) and os.path.join('dir', 'file')
import os

arr = []
# next() returns a single (root, dirs, files) tuple for the top folder
r, d, f = next(os.walk("F:\\_python"))
for file in f:
    arr.append(os.path.join(r, file))

for f in arr:
    print(f)
# output:
# F:\_python\dict_class.py
# F:\_python\programmi.txt
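next(os.walk('F:\\')) – get the full path – list comprehension
import os

# the single (root, dirs, files) tuple is wrapped in a list so the comprehension can unpack it
[os.path.join(r, file) for r, d, f in [next(os.walk("F:\\_python"))] for file in f]
# output: ['F:\\_python\\dict_class.py', 'F:\\_python\\programmi.txt']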
os.walk – get full path – all files in sub dirs
import os

x = [os.path.join(r, file) for r, d, f in os.walk("F:\\_python") for file in f]
x
# output: ['F:\\_python\\dict.py', 'F:\\_python\\progr.txt', 'F:\\_python\\readl.py']
os.listdir() – get only txt files
import os

arr_txt = [x for x in os.listdir() if x.endswith(".txt")]
print(arr_txt)
# output: ['work.txt', '3ebooks.txt']
glob – get only txt files
import glob

x = glob.glob("*.txt")
x
# output: ['ale.txt', 'alunni2015.txt', 'assenze.text.txt', 'text2.txt', 'untitled.txt']
Using glob to get the full path of the files
If I need the absolute path of the files:
import os
from glob import glob

# os.path.abspath comes with the standard library, so no third-party module is needed
x = [os.path.abspath(f) for f in glob("F:\\*.txt")]
for f in x:
    print(f)
# output:
# F:\acquistionline.txt
# F:\acquisti_2018.txt
# F:\bootstrap_jquery_ecc.txt
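glob can also look inside subdirectories: from Python 3.5 on, the pattern ** together with recursive=True matches folders at any depth, including the starting one. The sketch below builds a small temporary tree (the file names are invented for the demo) and collects the absolute path of every txt file in it.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Demo tree: txt files at three different depths, plus one .py file.
    os.makedirs(os.path.join(tmp, "sub", "deeper"))
    for rel in ("top.txt",
                os.path.join("sub", "mid.txt"),
                os.path.join("sub", "deeper", "low.txt"),
                os.path.join("sub", "skip.py")):
        open(os.path.join(tmp, rel), "w").close()

    # "**" matches zero or more folders when recursive=True is passed.
    pattern = os.path.join(tmp, "**", "*.txt")
    matches = [os.path.abspath(p) for p in glob.glob(pattern, recursive=True)]
    names = sorted(os.path.basename(p) for p in matches)

print(names)
```

On a big tree this pattern can take a while, because it has to visit every folder.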
Other uses of glob
If I want all the files in the directory:
x = glob.glob("*")
Using os.path.isfile to avoid directories in the list
import os.path

listOfFiles = [f for f in os.listdir() if os.path.isfile(f)]
print(listOfFiles)
# output: ['a simple game.py', 'data.txt', 'decorator.py']
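One caveat: os.path.isfile(f) resolves a bare name against the current directory, so this filter works as written only with os.listdir() called without arguments. For any other folder, the name has to be joined with the folder first. A minimal sketch, where only_files is a hypothetical helper name and the folder is a temporary one created for the demo:

```python
import os
import tempfile


def only_files(folder):
    """Return the sorted names of the regular files directly inside `folder`.

    Hypothetical helper for illustration.
    """
    return sorted(
        name for name in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, name))
    )


# Demo: two files and one subfolder in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("data.txt", "decorator.py"):
        open(os.path.join(tmp, name), "w").close()
    os.mkdir(os.path.join(tmp, "a_folder"))
    files_only = only_files(tmp)

print(files_only)
```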
Using pathlib (from Python 3.4)
import pathlib

flist = []
for p in pathlib.Path('.').iterdir():
    if p.is_file():
        print(p)
        flist.append(p)
# output:
# error.PNG
# exemaker.bat
# guiprova.mp3
# setup.py
# speak_gui2.py
# thumb.PNG
If you want to use a list comprehension:
flist = [p for p in pathlib.Path('.').iterdir() if p.is_file()]
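iterdir() looks only at one level. To include the subdirectories, pathlib offers Path.rglob(), which applies a pattern recursively. This is a short sketch on a temporary tree; the file and folder names are made up for the example.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    # Demo tree: one file at the top, two inside a subfolder.
    (base / "docs").mkdir()
    (base / "setup.py").touch()
    (base / "docs" / "readme.txt").touch()
    (base / "docs" / "notes.txt").touch()

    # rglob("*.txt") searches `base` and every folder below it.
    txt_files = sorted(p.name for p in base.rglob("*.txt") if p.is_file())
    # rglob("*") returns files and folders alike, so folders are filtered out.
    all_files = sorted(
        p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()
    )

print(txt_files)
print(all_files)
```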
Get all and only files with os.walk
import os

# i[2] is the list of file names found in each folder visited by walk
x = [i[2] for i in os.walk('.')]
y = []
for t in x:
    for f in t:
        y.append(f)
print(y)
# output: ['append_to_list.py', 'data.txt', 'data1.txt', 'data2.txt', 'data_180617', ..., 'substitute_words.py', 'sum_data.py', 'data.txt', 'data1.txt', 'data_180617']
Get only files with next and walk in a directory
import os

x = next(os.walk('F://python'))[2]
print(x)
# output: ['calculator.bat', 'calculator.py']
Get only directories with next and walk in a directory
import os

next(os.walk('F://python'))[1]  # for the current dir use ('.')
# output: ['python3', 'others']
Get all the subdir names with walk
import os

for r, d, f in os.walk("F:\\_python"):
    for dirs in d:
        print(dirs)
# output:
# .vscode
# pyexcel
# pyschool.py
# subtitles
# _metaprogramming
# .ipynb_checkpoints
os.scandir() from Python 3.5 on
import os

x = [f.name for f in os.scandir() if f.is_file()]
x
# output: ['calculator.bat', 'calculator.py']

# Another example with scandir (a little variation from docs.python.org).
# This one is more efficient than os.listdir when you also need to know
# whether each entry is a file or a folder.
# In this case, it shows the files only in the current directory
# where the script is executed.
import os

with os.scandir() as i:
    for entry in i:
        if entry.is_file():
            print(entry.name)

# output
"""
ebookmaker.py
error.PNG
exemaker.bat
guiprova.mp3
setup.py
speakgui4.py
speak_gui2.py
speak_gui3.py
thumb.PNG
"""
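Each entry returned by scandir also carries file information, so you can read the size of a file without a separate call on its path. Here is a minimal sketch that maps each file name to its size in bytes. The name file_sizes is hypothetical, and using scandir in a with statement requires Python 3.6 or later.

```python
import os
import tempfile


def file_sizes(folder="."):
    """Map file name -> size in bytes for the regular files directly in `folder`.

    Hypothetical helper for illustration.
    """
    sizes = {}
    with os.scandir(folder) as entries:
        for entry in entries:
            if entry.is_file():
                sizes[entry.name] = entry.stat().st_size
    return sizes


# Demo: one 5-byte file and one subfolder (which is skipped).
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "setup.py"), "wb") as fh:
        fh.write(b"12345")
    os.mkdir(os.path.join(tmp, "subfolder"))
    sizes = file_sizes(tmp)

print(sizes)
```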
Ex. 1: How many files are there in the subdirectories?
In this example, we count the files contained in a directory and in all its subdirectories.
import os

def count(dir, counter=0):
    "returns number of files in dir and subdirs"
    for pack in os.walk(dir):
        # pack[2] is the list of files in the folder being visited
        for f in pack[2]:
            counter += 1
    return dir + " : " + str(counter) + " files"

print(count("F:\\python"))
# output: F:\python : 12057 files
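The same count can be written more compactly by adding up the length of each files list that os.walk produces. This is just a sketch: count_files is a name I made up, and it returns the number itself rather than a formatted string, which makes it easier to reuse.

```python
import os
import tempfile


def count_files(top):
    """Return how many files are in `top` and all its subdirectories.

    Hypothetical helper for illustration.
    """
    return sum(len(files) for root, dirs, files in os.walk(top))


# Demo tree with three files spread over three levels.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b"))
    for rel in ("one.txt",
                os.path.join("a", "two.txt"),
                os.path.join("a", "b", "three.txt")):
        open(os.path.join(tmp, rel), "w").close()
    total = count_files(tmp)

print(total, "files")
```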
Ex. 2: How to copy all files from one dir to another?
A script to tidy up your computer: it finds all the files of a given type (default: pptx) and copies them into a new folder.
import os
import shutil

destination = "F:\\file_copied"
# create the destination folder if it does not exist yet
os.makedirs(destination, exist_ok=True)

def copyfile(dir, filetype='pptx', counter=0):
    "Searches for pptx (or other - pptx is the default) files and copies them"
    for pack in os.walk(dir):
        for f in pack[2]:
            if f.endswith(filetype):
                fullpath = os.path.join(pack[0], f)
                print(fullpath)
                shutil.copy(fullpath, destination)
                counter += 1
    if counter > 0:
        print("------------------------")
        print("\t==> Found in: `" + dir + "` : " + str(counter) + " files\n")

for dir in os.listdir():
    # searches for folders that start with `_`
    if dir[0] == '_':
        # copyfile(dir, filetype='pdf')
        copyfile(dir, filetype='txt')

# Output:
# _compiti18\Compito Contabilità 1\conti.txt
# _compiti18\Compito Contabilità 1\modula4.txt
# _compiti18\Compito Contabilità 1\moduloa4.txt
# ------------------------
#     ==> Found in: `_compiti18` : 3 files
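For comparison, here is a sketch of the same idea with pathlib and shutil.copy2, which also tries to preserve the file's modification time. It rests on a few assumptions of mine: copy_by_type is a hypothetical name, the destination folder is outside the source tree, and files with the same name in different folders simply overwrite each other in the destination.

```python
import shutil
import tempfile
from pathlib import Path


def copy_by_type(source, destination, filetype=".pptx"):
    """Copy every file under `source` ending with `filetype` into `destination`.

    Returns the sorted names of the copied files.
    Hypothetical helper; assumes `destination` is not inside `source`.
    """
    source, destination = Path(source), Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in source.rglob("*" + filetype):
        if path.is_file():
            shutil.copy2(path, destination / path.name)
            copied.append(path.name)
    return sorted(copied)


# Demo: two pptx files at different depths and one txt file that is ignored.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    nested = Path(src) / "_slides" / "2018"
    nested.mkdir(parents=True)
    (Path(src) / "intro.pptx").touch()
    (nested / "lesson.pptx").touch()
    (nested / "notes.txt").touch()

    target = Path(dst) / "file_copied"
    copied = copy_by_type(src, target)
    in_target = sorted(p.name for p in target.iterdir())

print(copied)
```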
Ex. 3: How to write all the file names to a txt file
In case you want to create a txt file with all the file names:
import os

mylist = ""
with open("filelist.txt", "w", encoding="utf-8") as file:
    for eachfile in os.listdir():
        mylist += eachfile + "\n"
    file.write(mylist)
Example: create a txt with all the file names on your hard drive
"""We are going to save a txt file with all the files in your directory.
We will use the function walk()
"""
import os

# see all the methods of os
# print(*dir(os), sep=", ")
listafile = []
percorso = []
with open("lista_file.txt", "w", encoding='utf-8') as testo:
    for root, dirs, files in os.walk("D:\\"):
        for file in files:
            listafile.append(file)
            percorso.append(os.path.join(root, file))
            testo.write(file + "\n")
listafile.sort()
print("N. of files", len(listafile))

with open("lista_file_ordinata.txt", "w", encoding="utf-8") as testo_ordinato:
    for file in listafile:
        testo_ordinato.write(file + "\n")

with open("percorso.txt", "w", encoding="utf-8") as file_percorso:
    for file in percorso:
        file_percorso.write(file + "\n")

# on Windows, this opens each txt file with its default program
os.system("lista_file.txt")
os.system("lista_file_ordinata.txt")
os.system("percorso.txt")
All the files of C:\ in one text file
This is a shorter version of the previous code. Change the starting folder if you need to search from another position. On my computer this code generates a text file of about 50 MB, with a little fewer than 500,000 lines, each one holding a file name with its complete path.
import os

with open("file.txt", "w", encoding="utf-8") as filewrite:
    for r, d, f in os.walk("C:\\"):
        for file in f:
            # os.path.join puts the separator between the folder and the file name
            filewrite.write(os.path.join(r, file) + "\n")
Looking for a type of file on the whole hard drive with os.walk
import os

def searchfiles(extension='.ttf'):
    "Create a txt file with all the files of a type"
    with open("file.txt", "w", encoding="utf-8") as filewrite:
        for r, d, f in os.walk("C:\\"):
            for file in f:
                if file.endswith(extension):
                    filewrite.write(os.path.join(r, file) + "\n")

# looking for ttf files (fonts)
searchfiles('.ttf')
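If you want to try this on a smaller folder before scanning a whole drive, the starting folder and the output file can become parameters. The sketch below is only an illustration: write_file_list and its parameters are names I made up. Note that os.walk by default silently skips the folders it is not allowed to read, so the listing may be incomplete on a system drive.

```python
import os
import tempfile


def write_file_list(top, out_path, extension=None):
    """Write the full path of each file under `top` to `out_path`, one per line.

    If `extension` is given, only names ending with it are written.
    Returns the number of lines written. Hypothetical helper for illustration.
    """
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for root, dirs, files in os.walk(top):
            for name in files:
                if extension is None or name.endswith(extension):
                    out.write(os.path.join(root, name) + "\n")
                    count += 1
    return count


# Demo: a small tree with one txt file and one font file.
with tempfile.TemporaryDirectory() as tmp:
    data = os.path.join(tmp, "data")
    os.makedirs(os.path.join(data, "fonts"))
    for rel in ("a.txt", os.path.join("fonts", "arial.ttf")):
        open(os.path.join(data, rel), "w").close()

    listing = os.path.join(tmp, "file.txt")  # kept outside `data` on purpose
    total = write_file_list(data, listing)
    only_ttf = write_file_list(data, listing, ".ttf")
    with open(listing, encoding="utf-8") as fh:
        lines = fh.read().splitlines()
    expected_font = os.path.join(data, "fonts", "arial.ttf")

print(total, only_ttf)
```

Keeping the output file outside the folder being scanned avoids listing the listing itself.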