AI-powered face detection and recognition in videos
Face Finder is a powerful tool that uses state-of-the-art AI models to find and track specific people in video files. It supports multiple reference images, consensus-based matching, frame export with bounding boxes, and comprehensive result analysis.
- High Accuracy: Uses DeepFace with ArcFace, Facenet512, and VGG-Face models
- Performance Optimized: Smart pre-filtering and embedding caching for fast processing
- Consensus Matching: Multiple reference images with configurable agreement thresholds
- Frame Export: Export detected frames with red bounding boxes around matched faces
- Progress Tracking: Real-time face count display and processing statistics
- Resume Support: Automatic checkpointing for processing large videos
- Multiple Output Formats: Console output, CSV export, and frame segments
- Highly Configurable: Extensive command-line options for fine-tuning
```bash
# Basic usage - find a person in a video
face-finder person.jpg video.mp4

# Use multiple reference images for better accuracy
face-finder ./reference_photos/ video.mp4 --consensus 0.7

# Export frames with bounding boxes
face-finder person.jpg video.mp4 --export-frames ./detected_faces/
```

Build and install Face Finder as a package for easy distribution:
```bash
# Install build tools
pip install build

# Build the package
./build.sh

# Install the package
pip install dist/face_finder-1.0.0-py3-none-any.whl

# Run from anywhere
face-finder person.jpg video.mp4
```

For development or when you want to modify the code:
```bash
# Clone/download the project
cd face-finder

# Create virtual environment (recommended)
python -m venv face-finder-env
source face-finder-env/bin/activate  # On Windows: face-finder-env\Scripts\activate

# Install in development mode
pip install -e .

# Run the tool
face-finder person.jpg video.mp4
```

Run directly without installation:
```bash
# Install dependencies
pip install -r requirements.txt

# Run the script directly
python face_finder/cli.py person.jpg video.mp4
```

- Build the package (see Option 1)
- Copy the wheel file to your target machine:
```bash
scp dist/face_finder-1.0.0-py3-none-any.whl user@remote-machine:~/
```

- Install on the target machine:
```bash
# Create virtual environment (recommended)
python -m venv face-finder-env
source face-finder-env/bin/activate

# Install the package
pip install face_finder-1.0.0-py3-none-any.whl

# Run the tool
face-finder person.jpg video.mp4
```
- Python 3.8+
- 4GB+ RAM (more for large videos or many reference images)
- Storage: ~500MB for dependencies, plus space for video files and exported frames
- GPU: Optional (CPU-only mode is used by default for stability)
Find timestamps when a specific person appears in a video.
```bash
face-finder <reference_images> <video_file> [options]
```

Or, when running the script directly:

```bash
python face_timestamp_finder.py <reference_images> <video_file> [options]
```

`reference_images`: One of the following:
- Path to a single reference image
- Path to a directory containing multiple reference images
- Comma-separated list of image file paths
`video_file`: Path to the video file to search
- `--interval N`: Check every Nth frame (default: 24). Lower values are more accurate but slower
- `--tolerance X`: Face matching sensitivity from 0.0-1.0 (default: 0.6). Lower values are stricter
- `--fast`: Use fast OpenCV pre-filtering (default: enabled)
- `--slow`: Disable fast pre-filtering for maximum accuracy
- `--csv FILE`: Export results to the specified CSV file (if not provided, a filename is auto-generated)
- `--segments`: Group consecutive detections into continuous appearance segments
- `--no-resume`: Disable resume functionality and start from the beginning (resume is enabled by default)
- `--min-matches N`: Minimum number of reference images that must match (default: 1)
- `--consensus X`: Fraction of reference images that must agree (0.0-1.0, default: 0.5)
- `--strict`: Strict mode: requires ALL reference images to match
- `--max-width N`: Maximum width for frame processing (default: 900). Lower = faster processing
- `--clear-cache`: Clear cached reference embeddings and recreate them
- `--export-frames [DIR]`: Export detected frames with red bounding boxes to a directory (default: `face_output`)
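To make the matching options concrete, here is a minimal sketch of how a per-frame decision could combine `--tolerance`, `--min-matches`, `--consensus`, and `--strict` (illustrative names and logic, not the tool's actual source):

```python
def frame_matches(distances, tolerance=0.6, min_matches=1,
                  consensus=0.5, strict=False):
    """Decide whether a frame matches, given one cosine distance per reference image.

    Illustrative sketch only -- the real tool may combine these options differently.
    """
    hits = sum(1 for d in distances if d <= tolerance)  # references that agree
    if strict:
        return hits == len(distances)                   # --strict: all must match
    agreement = hits / len(distances)                   # fraction of references matched
    return hits >= min_matches and agreement >= consensus
```

For example, with four reference images, `--consensus 0.5 --min-matches 3` would require at least three of them (75%) to fall within the tolerance.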
Package installation examples:
```bash
# Single reference image
face-finder person.jpg video.mp4

# Multiple reference images from directory
face-finder ./reference_photos/ video.mp4

# Group detections into continuous segments (better for long videos)
face-finder ./reference_photos/ video.mp4 --segments

# Reduce false positives with strict matching
face-finder ./reference_photos/ video.mp4 --strict

# Require consensus from multiple reference images
face-finder ./reference_photos/ video.mp4 --consensus 0.8 --min-matches 3

# Export results to custom CSV file
face-finder person.jpg video.mp4 --segments --csv segments.csv

# Faster processing with smaller frame size
face-finder ./reference_photos/ video.mp4 --max-width 640

# Maximum speed (lower quality but much faster)
face-finder ./reference_photos/ video.mp4 --max-width 480

# Clear embedding cache and recreate
face-finder ./reference_photos/ video.mp4 --clear-cache

# Export detection frames with bounding boxes (uses default "face_output" directory)
face-finder person.jpg video.mp4 --export-frames

# Export to custom directory
face-finder person.jpg video.mp4 --export-frames ./detected_faces/

# Combine frame export with segments and CSV
face-finder ./reference_photos/ video.mp4 --segments --csv results.csv --export-frames
```

Direct script examples:
```bash
# Single reference image
python face_timestamp_finder.py person.jpg video.mp4

# Multiple reference images from directory
python face_timestamp_finder.py ./reference_photos/ video.mp4

# Multiple specific reference images
python face_timestamp_finder.py person1.jpg,person2.jpg,person3.jpg video.mp4

# Check every frame for maximum accuracy
python face_timestamp_finder.py ./reference_photos/ video.mp4 --interval 1

# Use stricter matching
python face_timestamp_finder.py person.jpg video.mp4 --tolerance 0.4

# Force start from beginning (ignore any existing checkpoints)
python face_timestamp_finder.py person.jpg video.mp4 --no-resume
```

The tool provides two output modes: individual detections and grouped segments.
Console Output:
```
01:23:45 (83.25s) - Score: 87.5%
01:28:12 (88.12s) - Score: 92.1%

Results exported to: face_detections_video.csv
```

CSV Columns:
- timestamp_hms: Time in HH:MM:SS format (e.g., "01:23:45")
- timestamp_seconds: Precise time in seconds (e.g., 83.25)
- match_score_percent: Confidence score percentage (e.g., 87.5)
- consensus_percent: Percentage of reference images that agreed (e.g., 75.0)
The `--segments` mode is perfect for long videos where a person appears continuously across multiple frames.
Console Output:
```
01:23:45 - 01:25:30 (Duration: 00:01:45) - Avg Score: 89.2% - Detections: 8
01:28:12 - 01:28:45 (Duration: 00:00:33) - Avg Score: 91.5% - Detections: 4

Segments exported to: face_segments_video.csv
```

CSV Columns:
- start_time_hms/start_time_seconds: When the person first appears
- end_time_hms/end_time_seconds: When the person was last detected
- duration_hms/duration_seconds: How long they appeared continuously
- avg_score_percent: Average confidence across all detections in segment
- max_score_percent: Best confidence score in the segment
- avg_consensus_percent: Average consensus across all detections in segment
- max_consensus_percent: Best consensus score in the segment
- detection_count: Number of individual frames detected in this segment
Segment Logic:
- Consecutive detections (based on `--interval`) are grouped together; see the sketch after this list
- A segment ends when the person is not detected in the next expected frame
- Even single-frame appearances count as segments
- Useful for understanding continuous presence rather than individual frame hits
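A minimal sketch of that grouping logic, assuming detections arrive as (frame_number, score) pairs (names and the exact gap rule are assumptions, not the tool's code):

```python
def group_into_segments(detections, interval):
    """Group per-frame detections into continuous segments.

    detections: list of (frame_number, score) sorted by frame_number.
    A gap larger than the sampling interval ends the current segment.
    Illustrative only.
    """
    segments, current = [], []
    for frame_no, score in detections:
        if current and frame_no - current[-1][0] > interval:
            segments.append(current)  # person missed in the next expected frame
            current = []
        current.append((frame_no, score))
    if current:
        segments.append(current)
    return segments
```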
The `--export-frames` option allows you to save the actual video frames where faces are detected, complete with red bounding boxes around the detected faces.
- Automatic detection: Only frames with positive face matches are exported
- Bounding boxes: Red rectangles are drawn around all detected faces in the frame
- Original quality: Exported frames maintain the original video resolution (not downscaled)
- Smart naming: Files include frame number, timestamp, confidence score, and consensus percentage
```
detection_frame_00012345_01h23m45s_score87_consensus75.jpg
```

Where:
- `00012345`: Frame number (8 digits, zero-padded)
- `01h23m45s`: Timestamp in hours, minutes, seconds
- `score87`: Confidence score (rounded to nearest integer)
- `consensus75`: Consensus percentage - how many reference images agreed (rounded to nearest integer)
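A small sketch of how such a filename could be assembled (the exact format string in the tool may differ):

```python
def export_filename(frame_no, seconds, score_pct, consensus_pct):
    """Build a name like detection_frame_00012345_01h23m45s_score87_consensus75.jpg (illustrative)."""
    h, rem = divmod(int(seconds), 3600)
    m, s = divmod(rem, 60)
    return (f"detection_frame_{frame_no:08d}_{h:02d}h{m:02d}m{s:02d}s"
            f"_score{round(score_pct)}_consensus{round(consensus_pct)}.jpg")
```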
```bash
# Basic frame export (uses default "face_output" directory)
face-finder person.jpg video.mp4 --export-frames

# Export with custom output directory
face-finder ./refs/ video.mp4 --export-frames /home/user/detected_faces/

# Combine with other options
face-finder person.jpg video.mp4 --export-frames --segments --csv results.csv

# High-quality frame export (slower but better quality)
face-finder ./refs/ video.mp4 --export-frames ./frames/ --max-width 1920 --slow
```

- Manual verification: Visually confirm detection accuracy
- Dataset creation: Build training datasets from video footage
- Evidence collection: Extract specific moments for documentation
- Content analysis: Study facial expressions or contexts
- Quality assessment: Evaluate detection performance across different conditions
The tool automatically resumes from where it left off if interrupted, making it perfect for processing very large video files.
- Automatic checkpoints: Progress is saved every 10 processed frames
- Smart validation: Only resumes if video file, interval, and tolerance settings match
- Crash recovery: Can resume after unexpected shutdowns, power failures, or interruptions
- No lost work: Preserves all detections found so far
- Auto-cleanup: Removes checkpoint files after successful completion
```bash
# Start processing a large video
python face_timestamp_finder.py ../refs/ large_video.mp4

# If interrupted, simply run the same command again to resume
python face_timestamp_finder.py ../refs/ large_video.mp4

# Force restart from beginning (ignoring checkpoints)
python face_timestamp_finder.py ../refs/ large_video.mp4 --no-resume
```

- Stored as hidden files: `.face_finder_checkpoint_{hash}.json`
- Contain video hash, progress, detections, and settings
- Automatically deleted on successful completion
- Safe to manually delete if you want to start fresh
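Conceptually, a checkpoint is just a small JSON snapshot of progress plus the settings used, roughly as sketched below (the actual file layout and hashing scheme are assumptions for illustration):

```python
import hashlib
import json

def save_checkpoint(video_path, last_frame, detections, settings):
    """Persist progress so a later run with the same settings can resume.

    Illustrative sketch -- the real checkpoint layout and hashing scheme may differ.
    """
    video_hash = hashlib.md5(video_path.encode()).hexdigest()[:12]
    state = {
        "video_hash": video_hash,   # used to validate the checkpoint on resume
        "last_frame": last_frame,   # resume point
        "detections": detections,   # everything found so far
        "settings": settings,       # interval, tolerance, etc. must match to resume
    }
    with open(f".face_finder_checkpoint_{video_hash}.json", "w") as fh:
        json.dump(state, fh)
```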
The tool uses multiple optimization strategies to dramatically speed up face detection:
- Fast mode (default): Uses OpenCV's Haar cascades to quickly detect if any face exists in a frame
- Only processes frames with faces: Skips expensive DeepFace analysis on frames without faces
- Typical speedup: 10-50x faster than processing every frame with DeepFace
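A minimal sketch of this kind of OpenCV pre-filter, using the Haar cascade files bundled with opencv-python (function names are illustrative, not the tool's actual code):

```python
import cv2

# Haar cascade shipped with opencv-python
_face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def frame_has_face(frame_bgr) -> bool:
    """Cheap check: does this frame contain any face-like region at all?

    Only frames that pass go on to the much more expensive DeepFace analysis.
    """
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = _face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(faces) > 0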
- Pre-computed embeddings: Reference images are processed once at startup into facial embeddings
- Automatic caching: Embeddings are cached to disk and reused across runs with the same reference images
- Single frame processing: Each video frame only needs one embedding computation
- Vector comparison: Fast cosine distance calculation between embeddings
- Massive speedup: 5-10x faster than traditional image-to-image comparison
- Cache invalidation: Automatically detects when reference images change and rebuilds cache
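Conceptually, the embedding comparison looks like the sketch below. The `DeepFace.represent` return format has changed across versions, so treat this as an illustration rather than the tool's exact code:

```python
import numpy as np
from deepface import DeepFace

def embed(image, model_name="ArcFace"):
    # DeepFace.represent returns one result per detected face; take the first.
    result = DeepFace.represent(img_path=image, model_name=model_name,
                                enforce_detection=False)
    return np.asarray(result[0]["embedding"])

def cosine_distance(a, b):
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

reference = embed("person.jpg")   # computed once and cached
candidate = embed("frame.jpg")    # one embedding per sampled video frame
is_match = cosine_distance(reference, candidate) <= 0.6  # --tolerance
```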
- No temporary files: Frames are processed directly in memory without disk I/O
- Eliminated bottleneck: Removes file write/read overhead that was slowing down processing
- Additional speedup: 2-3x improvement from removing disk operations
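The in-memory path relies on the fact that recent DeepFace versions accept a BGR NumPy array directly in place of a file path, so frames never have to touch disk (a sketch, assuming a recent DeepFace release):

```python
import cv2
from deepface import DeepFace

cap = cv2.VideoCapture("video.mp4")
ok, frame = cap.read()
if ok:
    # Pass the decoded frame (a BGR numpy array) straight to DeepFace --
    # no temporary image file is written or read.
    reps = DeepFace.represent(img_path=frame, model_name="ArcFace",
                              enforce_detection=False)
cap.release()
```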
- Automatic downscaling: Scales frames to a maximum width before processing (default: 900px)
- Maintains aspect ratio: Preserves image proportions while reducing processing load
- Configurable: Use `--max-width` to control the speed vs. accuracy trade-off
- 4K video (3840px width) → 900px = ~4x faster processing
- 1080p video (1920px width) → 900px = ~2x faster processing
- 720p video (1280px width) → 900px = ~1.4x faster processing
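The downscaling step itself is straightforward; a sketch with OpenCV (illustrative, not the tool's exact code):

```python
import cv2

def downscale(frame, max_width=900):
    """Resize a frame so its width is at most max_width, preserving aspect ratio."""
    h, w = frame.shape[:2]
    if w <= max_width:
        return frame  # already small enough
    scale = max_width / w
    return cv2.resize(frame, (max_width, int(h * scale)),
                      interpolation=cv2.INTER_AREA)
```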
- For maximum speed: Use `--max-width 480` with `--interval 30`
- Balanced speed/quality: Default settings (900px width, interval 24)
- High quality: Use `--max-width 1280` with `--slow` mode and `--interval 10`
- Frame interval: Increase `--interval` for faster processing if you don't need to catch every appearance
- Multiple reference images: The embedding approach scales efficiently with more reference photos
The tool automatically caches reference image embeddings to speed up subsequent runs:
- Cache files: Stored as hidden `.face_finder_embeddings_*.json` files
- Smart invalidation: Cache is rebuilt when reference images are added, removed, or modified
- Instant startup: Cached embeddings load in seconds instead of minutes for large reference sets
- Manual clearing: Use `--clear-cache` to force recreation of embeddings
- No maintenance needed: Cache files are automatically managed and cleaned up
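One simple way to detect that the reference set changed (and thus invalidate the cache) is to fingerprint the image files, roughly as below; the tool's actual invalidation scheme may differ:

```python
import hashlib
import os

def reference_fingerprint(image_paths):
    """Hash the reference set so the embedding cache can be invalidated
    when images are added, removed, or modified. Illustrative scheme only."""
    h = hashlib.sha1()
    for path in sorted(image_paths):
        h.update(path.encode())
        h.update(str(os.path.getmtime(path)).encode())
        h.update(str(os.path.getsize(path)).encode())
    return h.hexdigest()
```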
The consensus system helps ensure reliable detections when using multiple reference images:
```bash
# Require 70% of reference images to agree
face-finder ./refs/ video.mp4 --consensus 0.7

# Require at least 3 reference images to match
face-finder ./refs/ video.mp4 --min-matches 3

# Strict mode: ALL reference images must agree
face-finder ./refs/ video.mp4 --strict
```

Performance tuning examples:

```bash
# Maximum speed (lower quality)
face-finder person.jpg video.mp4 --max-width 480 --interval 30

# Maximum accuracy (slower)
face-finder person.jpg video.mp4 --max-width 1920 --interval 1 --slow

# Balanced (recommended)
face-finder person.jpg video.mp4 --max-width 900 --interval 24
```

Slow processing:
- Reduce `--max-width` (e.g., 640 instead of 900)
- Increase `--interval` (process fewer frames)
- Use `--fast` mode (enabled by default)
Too many false positives:
- Increase `--tolerance` (e.g., 0.7 or 0.8)
- Use more reference images with `--consensus`
- Try `--strict` mode for highest accuracy
Missing detections:
- Decrease `--tolerance` (e.g., 0.4 or 0.5)
- Use `--slow` mode for maximum accuracy
- Add more diverse reference images
Memory issues:
- Reduce `--max-width` significantly
- Process videos in smaller chunks
- Clear embedding cache with `--clear-cache`
Watch the progress bar for insights:
```
Processing frames: 45% |█████     | 450/1000 [02:15<02:30, 3.2frame/s] Faces: 2/4 match
```

- High face counts: Expect slower processing
- Low match ratios: May need better reference images
- "No faces detected": Fast processing, good performance