DEV Community

Free Python Code
Free Python Code

Posted on

Detect File Changes in Real-Time with Python File Monitoring and Hash Comparison

Hi πŸ™‚πŸ–

In this post, I will show you how to Detect File Changes in Real-Time with Python File Monitoring and Hash Comparison

from time import sleep from hashlib import md5 from os import listdir src_dir = 'myFiles' my_files = listdir(src_dir) # create hash for files def get_file_hash(filename): return md5(open(filename, 'rb').read()).hexdigest() file_data = {} def load_files(): for filename in my_files: file_hash = get_file_hash(f'{src_dir}/{filename}') file_data[filename] = file_hash load_files() def check_changes(): for filename in my_files: file_hash = get_file_hash(f'{src_dir}/{filename}') if file_data[filename] != file_hash: print(f'change detected in file {filename}') sleep(1) check_changes() sleep(1) check_changes() 
Enter fullscreen mode Exit fullscreen mode

This code is a Python script designed to monitor changes in files within a specified directory. Let's break it down step by step:

1. from time import sleep: This imports the sleep function from the time module. It is used to introduce a pause in the execution of the script.

2. from hashlib import md5: This imports the md5 hashing algorithm from the hashlib module. MD5 is used here to generate a hash of the file content.

3. from os import listdir: This imports the listdir function from the os module. It is used to list all files in a directory.

**4. src_dir = 'myFiles': **This specifies the directory (myFiles) where the files to be monitored are located.

5. my_files = listdir(src_dir): This lists all the files in the directory specified by src_dir and stores them in the my_files list.

6. def get_file_hash(filename): This is a function that takes a filename as input, reads the file in binary mode, calculates its MD5 hash, and returns the hexadecimal representation of the hash.

7. file_data = {}: This initializes an empty dictionary file_data where the filename and its corresponding hash will be stored.

8. def load_files(): This function iterates through all files in my_files, calculates the hash of each file using get_file_hash, and stores the filename and its hash in the file_data dictionary.

**9. load_files(): **This call to load_files initializes the file_data dictionary with the initial hashes of all files in the directory.

10. def check_changes(): This function is responsible for monitoring changes in the files. It iterates through all files in my_files, recalculates the hash of each file, and compares it with the hash stored in file_data. If a change is detected, it prints a message indicating the change.

11. Inside check_changes, there's a recursive call to itself wrapped within a sleep(1) statement, which causes the script to pause for 1 second before recursively calling check_changes again. This results in the script continuously checking for changes in the files.

12. The script ends with an initial sleep of 1 second followed by a call to check_changes() to start the monitoring process.

In summary, this script continuously monitors the specified directory for changes in files by calculating their MD5 hashes and comparing them with previously stored hashes. If a change is detected, it prints a message indicating the change.

Top comments (0)