Copy files in Python using a pattern | Complete Guide

Do you want to copy files automatically? Do you know a pattern but not the full file name, or the path or folder? Python can find your files and copy them almost instantly!

At work I started getting requests to copy files out of a HUGE archive. I was given a TXT file with a list of keywords, and for each keyword I had to copy every file in the archive whose name contained it. Done by hand this is very time-consuming, but with Python it takes a few minutes, or even seconds!

Preparing the Solution in Python

Tired of doing this manually every day, I decided to write a small Python script that accepts a few inputs and copies the files I am looking for in a matter of seconds, so you can save time too!

import os
import fnmatch
import shutil

We are going to use only 3 modules. All three are part of Python's standard library, so there is nothing to install. For a third-party package you would run the following on the command prompt:

pip install package_name

replacing “package_name” with the name of the package, but that step is not needed here: os, fnmatch and shutil ship with Python.

Defining paths to copy files

Above all, it’s important to define where the project is running and of course where to put the copied files.

ini_path = os.getcwd()
in_f = os.path.join(ini_path, 'Input', 'list.txt')  # file with the patterns
out_f = os.path.join(ini_path, 'Output')  # folder for the copied files
  • ini_path – stores the current working folder
  • in_f – is the path to the text file containing the patterns to find (the user must create the Input folder and the file)
  • out_f – is the folder where the files found are copied to (the user must create this folder)
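Both folders have to exist before the script runs. If you would rather let the script create them, here is a minimal sketch. The helper name ensure_folders is my own assumption and is not part of the original script:

```python
import os

def ensure_folders(base_path):
    """Create the Input and Output folders under base_path if they are missing."""
    # ensure_folders is a hypothetical helper, not part of the original script
    input_dir = os.path.join(base_path, "Input")
    output_dir = os.path.join(base_path, "Output")
    # exist_ok=True makes the call a no-op when the folder is already there
    os.makedirs(input_dir, exist_ok=True)
    os.makedirs(output_dir, exist_ok=True)
    return input_dir, output_dir
```

You could call ensure_folders(os.getcwd()) once at the top of the script. The pattern file itself still has to be written by the user.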

Patterns to copy

Firstly we need to decide what the pattern is. What should we search for? A type of file? A substring of the file name?

Secondly we create the file list.txt inside the folder “Input” (this is the file that the in_f variable points to).

The TXT contains the pattern or patterns of the files that the user wants to copy, one per line.

Pattern   Meaning
*         Matches everything (any sequence of characters)
?         Matches any single character
[seq]     Matches any character in seq
[!seq]    Matches any character not in seq
Source: https://docs.python.org/3/library/fnmatch.html#module-fnmatch
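To see the four wildcards in action, here is a small sketch. The file names are made up for illustration, and I use fnmatch.fnmatchcase so the comparison is case-sensitive on every operating system:

```python
import fnmatch

# (file name, pattern) pairs; the names are invented for this demonstration
examples = [
    ("report.pdf", "*.pdf"),            # * matches any run of characters
    ("file1.txt", "file?.txt"),         # ? matches exactly one character
    ("file12.txt", "file?.txt"),        # two characters, so no match
    ("data_a.csv", "data_[abc].csv"),   # a is one of the listed characters
    ("data_a.csv", "data_[!abc].csv"),  # a is excluded by the !
]

results = [fnmatch.fnmatchcase(name, pattern) for name, pattern in examples]
for (name, pattern), matched in zip(examples, results):
    print(f"{name!r} vs {pattern!r}: {matched}")
```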

Most importantly, the user can change this file by adding more patterns, without needing to change the Python code!

For example, to find all PDF files and copy them through Python to the Output folder you can write:

*.pdf

Or, for instance, PDFs whose name ends with a specific word, or all kinds of files with a given name:

*Nunes.pdf

BigDataCracker.*
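As a quick check of what these three patterns would select, here is a sketch that applies them to a handful of file names with fnmatch.filter. The file names are hypothetical and exist only for this example:

```python
import fnmatch

# Hypothetical file names, used only to illustrate the three patterns
names = ["invoice.pdf", "CV_Nunes.pdf", "BigDataCracker.xlsx", "notes.txt"]
patterns = ["*.pdf", "*Nunes.pdf", "BigDataCracker.*"]

# fnmatch.filter returns the names from the list that match one pattern
matches = {p: fnmatch.filter(names, p) for p in patterns}
for pattern, found in matches.items():
    print(pattern, "->", found)
```

Note that CV_Nunes.pdf is selected by two of the patterns, so overlapping patterns can pick up the same file more than once.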

Creating the files List

The following function will retrieve our patterns to a list:

def create_list(input_file_path):
    # The with block closes the file automatically
    with open(input_file_path, 'r') as file:
        pattern_list = file.read().splitlines()
    return pattern_list

The function receives one parameter, the file with the patterns, which we defined previously as in_f.

We open the file in “read mode”, split its content into lines so that each line becomes one element of a list, and return the list.
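A blank line or a stray space in the TXT file ends up in the list as a pattern that never matches anything useful. Here is a minimal sketch of a more forgiving reader. The name read_patterns and the UTF-8 encoding are my assumptions, not part of the original script:

```python
def read_patterns(input_file_path):
    """Return the non-empty patterns from the file, one per line."""
    # read_patterns is a hypothetical variant of create_list;
    # the file is assumed to be saved as UTF-8
    with open(input_file_path, "r", encoding="utf-8") as file:
        # strip() removes surrounding whitespace and the line ending;
        # lines that are empty after stripping are skipped
        return [line.strip() for line in file if line.strip()]
```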

Python function for search directory

To make the script more “dynamic”, I want to be able to search in a different directory/path each time it runs.

So we add a direct prompt to the user, and the directory they enter becomes the working directory used by the following functions.

search_path = input("Please write or paste the Path where you want to start the Search.")
os.chdir(search_path)
cur_path= os.getcwd()

We ask the user: “Please write or paste the Path where you want to start the Search.”

In other words, whatever the user types becomes the new working directory: chdir from the os module switches to it and getcwd reads it back, so the variable cur_path contains our working directory.

To illustrate, if I want to search my Downloads folder for files, I enter:

C:\Users\jorge\Downloads
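If the path is mistyped, os.chdir stops the script with an error. Here is a sketch of a small check you could run on the typed text first. The helper name resolve_search_path is hypothetical, and stripping quotes is my own assumption about how pasted paths tend to look:

```python
import os

def resolve_search_path(raw_path):
    """Clean up a typed or pasted path and check that it is a folder."""
    # resolve_search_path is a hypothetical helper, not part of the original script
    # Pasted paths often carry surrounding spaces or quotes
    cleaned = raw_path.strip().strip('"').strip("'")
    if not os.path.isdir(cleaned):
        raise NotADirectoryError(f"Not a folder: {cleaned}")
    return os.path.abspath(cleaned)
```

The result could then be used in place of the raw input, for example search_path = resolve_search_path(input("...")).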

Creating a list of the files to copy

During execution the user is prompted for the base path where the script starts searching for the patterns in the file names. The search covers not only the base path but also its subdirectories.

def return_file(pattern):
    file_ret=[]
    for p in pattern:
        for dirpath, _, files in os.walk(cur_path):
            for file in files:
                if fnmatch.fnmatch(file.upper(), p.upper()):
                    stra = os.path.join(dirpath, file)
                    file_ret.append(stra)
    return file_ret

It seems complicated, but in fact it is simple.

  • The pattern list saved previously is passed to this function.
  • We initialize an empty list that will hold all the matching file paths.
  • For each pattern in the list we do…
    • For each folder under the working directory (os.walk visits every subfolder)…
      • For each file found, we call the fnmatch function with both the pattern and the file name in upper case, so the match ignores case.
        • If the name matches, we join the file name with its folder path and append the result to the list.
  • In short, the returned list contains the full paths of the files whose names match a pattern.
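The function above walks the whole tree once per pattern and lists a file twice when it matches two patterns. Here is a sketch of a variant that walks the tree a single time and takes the search folder as a parameter instead of reading the global cur_path. The name find_files is my own assumption:

```python
import fnmatch
import os

def find_files(patterns, root):
    """Return paths under root whose file name matches any of the patterns."""
    # find_files is a hypothetical variant of return_file
    upper_patterns = [p.upper() for p in patterns]
    matches = []
    for dirpath, _, files in os.walk(root):
        for name in files:
            # any() stops at the first matching pattern, so a file that
            # matches several patterns is listed only once
            if any(fnmatch.fnmatch(name.upper(), p) for p in upper_patterns):
                matches.append(os.path.join(dirpath, name))
    return matches
```

Called as find_files(create_list(in_f), cur_path), it should give the same files as return_file, without the repeats.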

Copy files to the directory

For our last task, we only need to copy the files found previously to the output directory, using a small Python function:

def copy_file(file,out_path):
    for f in file:
        shutil.copy(f, out_path)

In other words, for each file found we call shutil.copy, which copies the content of the source file to a destination file or, as in our case, into a destination directory.
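One thing to watch: because the search covers subdirectories, two files with the same name can be found in different folders, and shutil.copy overwrites a file that already exists in the destination. Here is a sketch of one way around that. The helper name copy_without_overwrite and the _1, _2 suffix scheme are my own assumptions:

```python
import os
import shutil

def copy_without_overwrite(paths, out_path):
    """Copy each file into out_path, renaming it when the name is taken."""
    # copy_without_overwrite is a hypothetical variant of copy_file
    copied = []
    for src in paths:
        stem, ext = os.path.splitext(os.path.basename(src))
        target = os.path.join(out_path, stem + ext)
        counter = 1
        # Add _1, _2, ... until the name is free in the output folder
        while os.path.exists(target):
            target = os.path.join(out_path, f"{stem}_{counter}{ext}")
            counter += 1
        shutil.copy(src, target)
        copied.append(target)
    return copied
```

The function returns the paths it wrote, which is handy if you want to print a summary at the end.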

Conclusion

In conclusion, this script will help you spend less time retrieving or copying files, just like it did for me!

As a final touch, you can combine all 3 functions together:

copy_file(return_file(create_list(in_f)), out_f)
  • create_list (retrieves a list of all the patterns present in list.txt inside the Input folder)
  • return_file (returns the path and file name of every file whose name matches one of the patterns)
  • copy_file (copies the files found by the previous function to the Output folder)

You can always download the script from our GitHub repository. If you have any questions or just need help, let me know; you can always reach me through my Contact page.

Thank you very much for your attention!
