Posted on Jun 21

File System Walking with WalkDir: Recursive Tree Traversal 4/9

#go #programming #systems #softwaredevelopment

WalkDir Function Comprehensive Guide

The filepath.WalkDir function represents Go's modern approach to recursive directory traversal, replacing the older filepath.Walk function with improved performance and cleaner interface design. Understanding its mechanics is essential for any developer working with file system operations at scale.

Function Signature and Parameters

func WalkDir(root string, fn fs.WalkDirFunc) error

The function signature deliberately keeps things simple. The root parameter accepts any valid file system path - whether it points to a file or directory. When you pass a file path, WalkDir processes only that single file. Directory paths trigger recursive traversal of the entire subtree.

The fn parameter expects a function matching the WalkDirFunc signature, which is declared in the io/fs package (path/filepath does not export its own WalkDirFunc type):

type WalkDirFunc func(path string, d DirEntry, err error) error

This callback executes for every file and directory encountered during traversal. The function receives the full path, a DirEntry interface providing file metadata, and any error that occurred while accessing the entry.

Lexical Ordering Guarantees

WalkDir provides deterministic traversal through lexical ordering. Within each directory, entries are processed in sorted order by name. This predictability proves crucial for testing and debugging file system operations.

```go
// Directory structure:
// /project/
// ├── README.md
// ├── main.go
// └── utils/
//     ├── helper.go
//     └── parser.go

// Traversal order (the root itself is visited first):
// 1. /project
// 2. /project/README.md
// 3. /project/main.go
// 4. /project/utils
// 5. /project/utils/helper.go
// 6. /project/utils/parser.go
```

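The ordering applies within each directory: entries are sorted by file name using plain string comparison, so uppercase names such as README.md sort before lowercase ones. A directory is always visited before its contents, and its whole subtree is walked before WalkDir moves on to the next sibling.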

Memory Usage Considerations

WalkDir optimizes memory usage through several design decisions. Unlike filepath.Walk, it uses DirEntry instead of FileInfo, avoiding expensive stat system calls until you explicitly request detailed file information.

```go
// Memory-efficient approach
err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
	if err != nil {
		return err
	}
	// Only call Info() when you need detailed metadata
	if strings.HasSuffix(path, ".log") {
		info, err := d.Info()
		if err != nil {
			return err
		}
		// Now you have full FileInfo
		processLogFile(path, info)
	}
	return nil
})
```

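The function processes one directory at a time. It reads the complete entry list of the current directory into memory, which is what makes the sorted, deterministic order possible, but it never loads the whole tree at once. Memory use therefore depends on the size of the largest directories and on nesting depth, not on the total number of files, so the approach scales well for deep trees but a single directory with millions of entries is still held in memory in full.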

For large traversals, be mindful of callback function allocations. Avoid creating unnecessary string concatenations or slice allocations within the callback, as these multiply across thousands of file system entries.

WalkDirFunc Callback Patterns

The WalkDirFunc callback serves as your primary interface for processing file system entries during traversal. Mastering its parameter handling and return value semantics gives you precise control over the walking behavior.

Function Parameters: path, DirEntry, error

Each callback invocation receives three parameters that work together to provide complete context about the current file system entry.

The path parameter contains the full file path from the root. This path uses the operating system's native separator and includes the original root prefix:

```go
filepath.WalkDir("/home/user/projects", func(path string, d fs.DirEntry, err error) error {
	if err != nil {
		return err
	}
	// path examples:
	// "/home/user/projects"
	// "/home/user/projects/main.go"
	// "/home/user/projects/src/utils.go"

	// Extract relative path if needed
	relPath, err := filepath.Rel("/home/user/projects", path)
	if err != nil {
		return err
	}
	fmt.Println(relPath)
	return nil
})
```

The DirEntry parameter provides efficient access to basic file metadata without requiring expensive system calls. Use its methods to check file types and names:

```go
func processEntry(path string, d fs.DirEntry, err error) error {
	if err != nil {
		return err // d may be nil when err is set
	}
	if d.IsDir() {
		fmt.Printf("Directory: %s\n", d.Name())
		return nil
	}
	// Check file type without calling Info()
	if d.Type()&fs.ModeSymlink != 0 {
		fmt.Printf("Symlink: %s\n", path)
		return nil
	}
	// Only call Info() when you need full metadata
	if strings.HasSuffix(d.Name(), ".large") {
		info, err := d.Info()
		if err != nil {
			return err
		}
		if info.Size() > 1024*1024 {
			fmt.Printf("Large file: %s (%d bytes)\n", path, info.Size())
		}
	}
	return nil
}
```

The error parameter reports a problem WalkDir hit before it could hand you a usable entry: either the initial os.Lstat on the root failed, or reading a directory failed. Check it first in your callback, before touching d, and decide whether to continue or abort the traversal.

Return Value Meanings and Control Flow

Your callback's return value directly controls traversal behavior. Understanding these return patterns enables sophisticated directory walking logic.

Returning nil continues normal traversal:

```go
func normalProcessing(path string, d fs.DirEntry, err error) error {
	// Process the entry
	processFile(path)
	// Continue traversal
	return nil
}
```

Returning filepath.SkipDir for a directory skips that directory's contents and continues with its siblings:

```go
func skipHiddenDirs(path string, d fs.DirEntry, err error) error {
	if err != nil {
		return err
	}
	// Note: a root of "." also has the name "." and would be skipped here
	if d.IsDir() && strings.HasPrefix(d.Name(), ".") {
		return filepath.SkipDir // Skip .git, .vscode, etc.
	}
	return nil
}
```

Returning filepath.SkipAll (available since Go 1.20) ends the entire traversal immediately, and WalkDir itself returns nil:

```go
func findFirstMatch(target string) fs.WalkDirFunc {
	return func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Name() == target {
			fmt.Printf("Found: %s\n", path)
			return filepath.SkipAll // Stop searching
		}
		return nil
	}
}
```

Any other error value stops traversal and propagates up to the WalkDir caller:

```go
func strictProcessing(path string, d fs.DirEntry, err error) error {
	if err != nil {
		return fmt.Errorf("access error for %s: %w", path, err)
	}
	if err := processFile(path); err != nil {
		return fmt.Errorf("processing failed for %s: %w", path, err)
	}
	return nil
}
```

Error Propagation Strategies

Different applications require different error handling approaches. Consider these common patterns based on your fault tolerance requirements.

The fail-fast approach stops on any error:

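```go
func failFast(path string, d fs.DirEntry, err error) error {
	if err != nil {
		return err // Propagate immediately
	}
	// Here and in the later examples, processEntry stands for an
	// application-specific helper that takes only the path and the entry
	return processEntry(path, d)
}
```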

Advanced Traversal Control

Fine-grained control over directory traversal enables efficient file system operations by avoiding unnecessary work. The key lies in understanding when and how to skip portions of the directory tree based on your specific requirements.

SkipDir vs SkipAll Usage

The distinction between filepath.SkipDir and filepath.SkipAll determines the scope of traversal interruption. Understanding their behavior prevents common mistakes in traversal logic.

SkipDir affects only the current directory when returned for a directory entry:

```go
func skipLargeDirs(path string, d fs.DirEntry, err error) error {
	if err != nil {
		return err
	}
	if d.IsDir() {
		// Check if directory should be skipped
		if d.Name() == "node_modules" || d.Name() == ".git" {
			fmt.Printf("Skipping directory: %s\n", path)
			return filepath.SkipDir
		}
	}
	// Process files normally
	fmt.Printf("Processing: %s\n", path)
	return nil
}

// Directory structure:
// /project/
// ├── src/file1.go
// ├── node_modules/    <- Skipped entirely
// │   └── package/
// ├── docs/
// │   └── readme.md    <- Still processed
// └── main.go

// Output (entries are visited in lexical order):
// Processing: /project
// Processing: /project/docs
// Processing: /project/docs/readme.md
// Processing: /project/main.go
// Skipping directory: /project/node_modules
// Processing: /project/src
// Processing: /project/src/file1.go
```

SkipAll terminates the entire traversal regardless of where it's returned:

```go
func findAndStop(target string) fs.WalkDirFunc {
	return func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		fmt.Printf("Visiting: %s\n", path)
		if d.Name() == target {
			fmt.Printf("Found target: %s\n", path)
			return filepath.SkipAll // Stop everything
		}
		return nil
	}
}

// Usage: find first occurrence of "config.json"
err := filepath.WalkDir("/app", findAndStop("config.json"))
// Traversal stops immediately when config.json is found
```

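Important: SkipDir behaves differently for file entries. Returned for a non-directory entry, it skips the remaining entries in that file's parent directory. When you only want to ignore a single file, return nil instead.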

Conditional Directory Skipping

Complex applications often require dynamic skipping logic based on directory contents, depth, or external conditions. Implement these patterns using closure-captured state.

Depth-based skipping prevents traversal beyond a certain level:

```go
func maxDepthWalker(maxDepth int) fs.WalkDirFunc {
	rootDepth := -1
	return func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		// Calculate current depth; the first call is always for the root
		if rootDepth == -1 {
			rootDepth = strings.Count(path, string(filepath.Separator))
		}
		currentDepth := strings.Count(path, string(filepath.Separator)) - rootDepth
		// Directories at maxDepth are neither printed nor entered
		if d.IsDir() && currentDepth >= maxDepth {
			return filepath.SkipDir
		}
		fmt.Printf("Depth %d: %s\n", currentDepth, path)
		return nil
	}
}

// Usage: traverse only 2 levels deep
err := filepath.WalkDir("/project", maxDepthWalker(2))
```

Content-based skipping examines directory properties before entering:

```go
func skipEmptyOrHidden(path string, d fs.DirEntry, err error) error {
	if err != nil {
		return err
	}
	if d.IsDir() {
		// Skip hidden directories
		if strings.HasPrefix(d.Name(), ".") {
			return filepath.SkipDir
		}
		// Skip empty directories (this reads the directory a second time,
		// in addition to WalkDir's own read)
		entries, err := os.ReadDir(path)
		if err != nil {
			return nil // Can't read, don't skip; WalkDir will report the error
		}
		if len(entries) == 0 {
			fmt.Printf("Skipping empty directory: %s\n", path)
			return filepath.SkipDir
		}
	}
	return nil
}
```

Pattern-based skipping uses matching rules for directory names:

```go
type SkipPattern struct {
	patterns []string
}

func (sp *SkipPattern) shouldSkip(name string) bool {
	for _, pattern := range sp.patterns {
		if matched, _ := filepath.Match(pattern, name); matched {
			return true
		}
	}
	return false
}

func (sp *SkipPattern) walkFunc(path string, d fs.DirEntry, err error) error {
	if err != nil {
		return err
	}
	if d.IsDir() && sp.shouldSkip(d.Name()) {
		return filepath.SkipDir
	}
	// Process entry
	return processEntry(path, d)
}

// Usage
skipper := &SkipPattern{
	patterns: []string{"temp*", "*_backup", "node_modules"},
}
err := filepath.WalkDir("/project", skipper.walkFunc)
```

Early Termination Patterns

Early termination patterns optimize performance by stopping traversal once specific conditions are met. These patterns are essential for search operations and resource-constrained environments.

The first-match pattern stops after finding the first occurrence:

```go
func findFirst(predicate func(string, fs.DirEntry) bool) fs.WalkDirFunc {
	return func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if predicate(path, d) {
			fmt.Printf("First match: %s\n", path)
			// SkipAll ends the walk, so no extra "found" flag is needed
			return filepath.SkipAll
		}
		return nil
	}
}

// Find first .go file
predicate := func(path string, d fs.DirEntry) bool {
	return !d.IsDir() && strings.HasSuffix(path, ".go")
}
err := filepath.WalkDir("/src", findFirst(predicate))
```

The quota-based pattern stops after processing a certain number of entries:

```go
func processWithQuota(quota int) fs.WalkDirFunc {
	processed := 0
	return func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if processed >= quota {
			return filepath.SkipAll
		}
		if !d.IsDir() {
			processed++
			fmt.Printf("Processing %d/%d: %s\n", processed, quota, path)
		}
		return nil
	}
}
```

The timeout-based pattern stops after a time limit:

```go
func processWithTimeout(timeout time.Duration) fs.WalkDirFunc {
	start := time.Now()
	return func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if time.Since(start) > timeout {
			fmt.Println("Timeout reached, stopping traversal")
			return filepath.SkipAll
		}
		return processEntry(path, d)
	}
}
```

Error Handling in Tree Walking

File system traversal encounters various error conditions that require different handling strategies. The WalkDir function provides a two-phase error reporting mechanism that gives you fine-grained control over error recovery and propagation.

Two-Phase Error Reporting

WalkDir reports errors to your callback in two distinct situations: when the root itself cannot be examined, and when a directory's contents cannot be read. Understanding this distinction is crucial for building robust file system tools.

The first case occurs before the walk starts. If the initial os.Lstat on root fails, WalkDir calls your callback once with path set to root, d set to nil, and err set to the Lstat error. The second case occurs when reading a directory fails. In both cases your callback receives the path with a non-nil error parameter:

```go
func handleDirectoryErrors(path string, d fs.DirEntry, err error) error {
	if err != nil {
		// The root could not be examined, or reading this directory failed
		fmt.Printf("Directory read error for %s: %v\n", path, err)
		// Decide whether to continue or abort
		if os.IsPermission(err) {
			fmt.Printf("Skipping due to permissions: %s\n", path)
			return nil // Continue with siblings
		}
		return err // Abort on other errors
	}
	// Normal processing for successfully read entries
	return processEntry(path, d)
}
```

In the second case the callback is invoked twice for the same directory. The first call happens before the read, with err set to nil, which gives you the chance to return SkipDir or SkipAll and avoid the read entirely. The second call happens after the failed read, with d still describing the directory and err set to the read error:

```go
func handleEntryErrors(path string, d fs.DirEntry, err error) error {
	if err != nil {
		// This error is specific to the directory at path
		fmt.Printf("Entry access error for %s: %v\n", path, err)
		// Log and continue with other entries
		logError(path, err)
		return nil
	}
	// Entry is accessible, process normally
	return processEntry(path, d)
}
```

Here's a comprehensive handler that distinguishes between the two cases:

```go
type ErrorHandler struct {
	rootErrors    []error
	readDirErrors []error
}

func (eh *ErrorHandler) handleBothPhases(path string, d fs.DirEntry, err error) error {
	if err != nil {
		// Determine which case we are in from d
		if d == nil {
			// os.Lstat on the root failed; there is nothing to walk
			eh.rootErrors = append(eh.rootErrors, fmt.Errorf("root %s: %w", path, err))
			if os.IsPermission(err) {
				return nil // Skip but continue
			}
			return err // Fail on critical errors
		}
		// Reading the directory described by d failed
		eh.readDirErrors = append(eh.readDirErrors, fmt.Errorf("directory %s: %w", path, err))
		return nil // Continue with other entries
	}
	return processEntry(path, d)
}
```

Pre-read vs Post-read Error Handling

The timing of error detection affects your handling strategy. A pre-read error means the root could not even be examined, so there is no entry to inspect (d is nil). A post-read error means WalkDir already visited a directory successfully and then failed to list its contents (d is non-nil and describes that directory).

Pre-read errors typically indicate that the starting point is missing or inaccessible:

```go
func handlePreReadErrors(path string, d fs.DirEntry, err error) error {
	if err != nil && d == nil {
		// Pre-read error: os.Lstat on the root failed, nothing will be walked
		switch {
		case os.IsPermission(err):
			fmt.Printf("Permission denied: %s\n", path)
			auditLog.LogDeniedAccess(path)
			return nil // WalkDir returns nil
		case os.IsNotExist(err):
			fmt.Printf("Path disappeared: %s\n", path)
			return nil // Handle race conditions gracefully
		case errors.Is(err, syscall.ELOOP):
			fmt.Printf("Symlink loop detected: %s\n", path)
			return nil // Skip problematic symlinks
		default:
			return fmt.Errorf("critical directory error: %w", err)
		}
	}
	return processNormally(path, d, err)
}
```

Post-read errors occur after the directory itself was visited and indicate that its contents could not be listed:

```go
func handlePostReadErrors(path string, d fs.DirEntry, err error) error {
	if err != nil && d != nil {
		// Post-read error: the entry exists but reading it failed
		fmt.Printf("Entry problem %s: %v\n", path, err)
		if d.IsDir() {
			fmt.Printf("  Problematic directory, skipping contents\n")
			return filepath.SkipDir
		}
		// Defensive branch: filepath.WalkDir only reports read errors
		// for directories
		fmt.Printf("  Problematic file, logging for later review\n")
		problemFiles.Add(path, err)
		return nil // Continue with other entries
	}
	return processNormally(path, d, err)
}
```

Recovery Strategies

Different applications require different recovery approaches when file system errors occur. Implement recovery strategies based on your application's fault tolerance requirements.

The retry strategy attempts to recover from transient errors:

```go
func withRetry(maxRetries int, backoff time.Duration) fs.WalkDirFunc {
	return func(path string, d fs.DirEntry, err error) error {
		if err == nil {
			return processEntry(path, d)
		}
		if d == nil {
			// The root could not be examined; there is nothing to re-read
			return err
		}
		// d describes a directory whose contents could not be read
		for attempt := 0; attempt < maxRetries; attempt++ {
			fmt.Printf("Retry %d for %s\n", attempt+1, path)
			time.Sleep(backoff * time.Duration(attempt+1))

			// Attempt to re-read the problematic directory
			entries, retryErr := os.ReadDir(path)
			if retryErr != nil {
				continue
			}
			fmt.Printf("Retry successful for directory %s\n", path)
			// Process recovered entries (direct children only;
			// subdirectories are not descended into)
			for _, entry := range entries {
				entryPath := filepath.Join(path, entry.Name())
				if procErr := processEntry(entryPath, entry); procErr != nil {
					return procErr
				}
			}
			return filepath.SkipDir // Already processed
		}
		// All retries failed
		return fmt.Errorf("failed after %d retries: %w", maxRetries, err)
	}
}
```

The graceful degradation strategy continues operation with reduced functionality:

```go
type GracefulWalker struct {
	processedCount int
	errorCount     int
	maxErrors      int
}

func (gw *GracefulWalker) walkWithDegradation(path string, d fs.DirEntry, err error) error {
	if err != nil {
		gw.errorCount++
		// Log error but continue
		fmt.Printf("Error %d: %s - %v\n", gw.errorCount, path, err)
		// Abort if too many errors
		if gw.errorCount > gw.maxErrors {
			return fmt.Errorf("too many errors (%d), aborting", gw.errorCount)
		}
		return nil // Continue despite error
	}
	gw.processedCount++
	return processEntry(path, d)
}
```

The error isolation strategy quarantines problematic areas while continuing elsewhere:

```go
type IsolatingWalker struct {
	quarantinedPaths map[string]error
	mutex            sync.RWMutex
}

func (iw *IsolatingWalker) walkWithIsolation(path string, d fs.DirEntry, err error) error {
	if err != nil {
		iw.mutex.Lock()
		if iw.quarantinedPaths == nil {
			// Writing to a nil map panics, so initialize on first use
			iw.quarantinedPaths = make(map[string]error)
		}
		iw.quarantinedPaths[path] = err
		iw.mutex.Unlock()
		fmt.Printf("Quarantined: %s (%v)\n", path, err)
		// Skip this subtree but continue elsewhere
		if d != nil && d.IsDir() {
			return filepath.SkipDir
		}
		return nil
	}
	// Check if we're in a quarantined subtree
	iw.mutex.RLock()
	defer iw.mutex.RUnlock()
	for quarantined := range iw.quarantinedPaths {
		// Match on a path boundary so /data/a does not quarantine /data/ab
		if strings.HasPrefix(path, quarantined+string(filepath.Separator)) {
			return filepath.SkipDir
		}
	}
	return processEntry(path, d)
}
```

The error collection approach gathers all errors for batch reporting:

```go
type ErrorCollector struct {
	errors []error
}

func (ec *ErrorCollector) collectErrors(path string, d fs.DirEntry, err error) error {
	if err != nil {
		ec.errors = append(ec.errors, fmt.Errorf("%s: %w", path, err))
		return nil // Continue despite error
	}
	if procErr := processEntry(path, d); procErr != nil {
		ec.errors = append(ec.errors, fmt.Errorf("%s: %w", path, procErr))
	}
	return nil
}
```

The selective error handling approach treats different error types differently:

```go
func selectiveHandling(path string, d fs.DirEntry, err error) error {
	if err != nil {
		// Log permission errors but continue
		if os.IsPermission(err) {
			log.Printf("Permission denied: %s", path)
			return nil
		}
		// Fail on other errors
		return err
	}
	return processEntry(path, d)
}
```

Symbolic Link Behavior

WalkDir implements specific symbolic link handling policies that differ significantly from traditional file system traversal tools. Understanding these behaviors prevents security vulnerabilities and infinite loops while maintaining predictable traversal characteristics.

Root Symlink Resolution

filepath.WalkDir does not follow symbolic links, and that includes the root. It examines root with os.Lstat, so a root that is a symlink to a directory is reported as a single non-directory entry and the walk ends there. (fs.WalkDir in io/fs differs: when its root is a symlink, the target is walked.) To traverse what a symlinked root points to, resolve it with filepath.EvalSymlinks before calling WalkDir.

```go
// File system setup:
// /real/project/
// ├── src/main.go
// └── docs/readme.md
// /links/current -> /real/project

func demonstrateRootResolution() {
	// Resolve the symlink first; walking "/links/current" directly would
	// visit only the link itself
	root, err := filepath.EvalSymlinks("/links/current")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("Walking resolved root:")
	err = filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		fmt.Printf("Path: %s\n", path)
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}

// Output:
// Walking resolved root:
// Path: /real/project
// Path: /real/project/docs
// Path: /real/project/docs/readme.md
// Path: /real/project/src
// Path: /real/project/src/main.go
```

Notice that the callback now receives paths under the resolved target, not under the original link. If your application logic depends on the link path, map each path back onto it.

The mapping uses the path relative to the resolved root, so resolution affects traversal scope without changing what you report:

```go
func analyzeRootSymlink() error {
	symlinkRoot := "/links/project-v2" // -> /real/project-v2
	realRoot, err := filepath.EvalSymlinks(symlinkRoot)
	if err != nil {
		return err
	}
	fmt.Printf("Root %s resolves to %s\n", symlinkRoot, realRoot)

	// The traversal follows the resolved target
	return filepath.WalkDir(realRoot, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		// Rebuild each path under the symlink prefix
		rel, err := filepath.Rel(realRoot, path)
		if err != nil {
			return err
		}
		fmt.Printf("Consistent path: %s\n", filepath.Join(symlinkRoot, rel))
		return nil
	})
}
```

Symlinks also have security implications for applications that perform path-based access control:

```go
func secureWalk(allowedRoot string) fs.WalkDirFunc {
	// Resolve the allowed root to handle symlinks
	resolvedRoot, err := filepath.EvalSymlinks(allowedRoot)
	if err != nil {
		resolvedRoot = allowedRoot // Fallback to original
	}
	return func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		// Verify we're still within allowed boundaries
		resolvedPath, err := filepath.EvalSymlinks(path)
		if err != nil {
			return fmt.Errorf("cannot resolve %s: %w", path, err)
		}
		// Compare on path boundaries; a plain prefix check would accept
		// /data-private when the allowed root is /data
		rel, err := filepath.Rel(resolvedRoot, resolvedPath)
		if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
			return fmt.Errorf("path escaped allowed root: %s", path)
		}
		return processEntry(path, d)
	}
}
```

Directory Symlink Non-Following

WalkDir does not follow symbolic links to directories encountered during traversal. This policy prevents infinite loops and keeps the traversal bounded.

```go
// File system setup:
// /project/
// ├── src/
// │   ├── main.go
// │   └── vendor -> ../../shared/vendor
// ├── docs/
// │   └── api -> ../external/api-docs
// └── backup -> /archive/project-backup

func demonstrateSymlinkNonFollowing() error {
	return filepath.WalkDir("/project", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type()&fs.ModeSymlink != 0 {
			// d.IsDir() is false for every symlink; os.Stat follows the
			// link and tells us what the target is
			if target, statErr := os.Stat(path); statErr == nil && target.IsDir() {
				fmt.Printf("Directory symlink (not followed): %s\n", path)
			} else {
				fmt.Printf("File symlink: %s\n", path)
			}
		} else {
			fmt.Printf("Regular entry: %s\n", path)
		}
		return nil
	})
}

// Output (assuming every link target exists and is a directory):
// Regular entry: /project
// Directory symlink (not followed): /project/backup
// Regular entry: /project/docs
// Directory symlink (not followed): /project/docs/api
// Regular entry: /project/src
// Regular entry: /project/src/main.go
// Directory symlink (not followed): /project/src/vendor
```

A symlink entry always reports d.IsDir() as false, whether it points to a file or to a directory, because the DirEntry describes the link itself. WalkDir reports the link but never dereferences it; use os.Stat when you need to know what the target is:

```go
func handleSymlinks(path string, d fs.DirEntry, err error) error {
	if err != nil {
		return err
	}
	if d.Type()&fs.ModeSymlink != 0 {
		// Get symlink target for informational purposes
		target, linkErr := os.Readlink(path)
		if linkErr != nil {
			fmt.Printf("Unreadable symlink: %s\n", path)
			return nil
		}
		// os.Stat follows the link and fails when the target is missing
		info, statErr := os.Stat(path)
		switch {
		case statErr != nil:
			fmt.Printf("Broken symlink %s -> %s\n", path, target)
		case info.IsDir():
			// WalkDir will not descend into this directory
			fmt.Printf("Directory symlink %s -> %s (not traversed)\n", path, target)
		default:
			// Process as needed, but content comes from target
			fmt.Printf("File symlink %s -> %s\n", path, target)
		}
		return nil
	}
	kind := "file"
	if d.IsDir() {
		kind = "directory"
	}
	fmt.Printf("Regular %s: %s\n", kind, path)
	return nil
}
```

This behavior protects against common symlink attack patterns:

```go
// Dangerous symlink structure that WalkDir handles safely:
// /tmp/safe/
// ├── data/
// │   └── important.txt
// ├── loop -> loop        <- Self-referencing symlink
// ├── escape -> /etc      <- Directory escape attempt
// └── cycle -> ../safe    <- Parent directory cycle

func safeTraversal() {
	count := 0
	err := filepath.WalkDir("/tmp/safe", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		count++
		if count > 1000 { // Safety check
			return fmt.Errorf("traversal too deep, possible loop")
		}
		fmt.Printf("Safe traversal: %s\n", path)
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Completed safely with %d entries\n", count)
}
```

When you need to follow directory symlinks, implement custom logic with cycle detection:

```go
type SymlinkFollower struct {
	visited map[string]bool
	mutex   sync.RWMutex
}

func (sf *SymlinkFollower) followingWalk(root string) error {
	sf.visited = make(map[string]bool)
	// Mark the starting directory so links back to it count as cycles
	if resolved, err := filepath.EvalSymlinks(root); err == nil {
		sf.visited[resolved] = true
	}
	return sf.walkWithFollowing(root)
}

func (sf *SymlinkFollower) walkWithFollowing(root string) error {
	return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		// Check for directory symlinks; d.IsDir() is false for symlinks,
		// so the target has to be examined with os.Stat
		if d.Type()&fs.ModeSymlink != 0 {
			target, err := filepath.EvalSymlinks(path)
			if err != nil {
				return nil // Skip broken symlinks
			}
			info, err := os.Stat(target)
			if err != nil {
				return nil
			}
			if !info.IsDir() {
				return processEntry(path, d)
			}
			sf.mutex.Lock()
			seen := sf.visited[target]
			sf.visited[target] = true
			sf.mutex.Unlock()
			if seen {
				fmt.Printf("Cycle detected, skipping: %s -> %s\n", path, target)
				return nil
			}
			// Recursively walk the symlink target
			fmt.Printf("Following directory symlink: %s -> %s\n", path, target)
			return sf.walkWithFollowing(target)
		}
		return processEntry(path, d)
	})
}
```

Real-World Applications

The practical value of WalkDir emerges through real-world implementations that solve common file system problems. These examples demonstrate how to combine the traversal patterns into production-ready tools.

File Search Implementation

Building efficient file search tools requires combining multiple WalkDir features: pattern matching, early termination, and smart filtering. Here's a comprehensive search implementation:

```go
type FileSearcher struct {
	patterns      []string
	maxResults    int
	maxDepth      int
	skipHidden    bool
	caseSensitive bool
	results       []SearchResult
}

type SearchResult struct {
	Path    string
	Size    int64
	ModTime time.Time
	IsDir   bool
}

// SearchOption configures a FileSearcher (functional options used below)
type SearchOption func(*FileSearcher)

func WithMaxResults(n int) SearchOption    { return func(s *FileSearcher) { s.maxResults = n } }
func WithMaxDepth(n int) SearchOption      { return func(s *FileSearcher) { s.maxDepth = n } }
func WithSkipHidden(skip bool) SearchOption { return func(s *FileSearcher) { s.skipHidden = skip } }

func NewFileSearcher(patterns []string, options ...SearchOption) *FileSearcher {
	// The variable is named s, not fs, so it does not shadow the io/fs package
	s := &FileSearcher{
		patterns:      patterns,
		maxResults:    100,
		maxDepth:      -1,
		caseSensitive: true,
		results:       make([]SearchResult, 0),
	}
	for _, opt := range options {
		opt(s)
	}
	return s
}

func (s *FileSearcher) Search(root string) ([]SearchResult, error) {
	rootDepth := strings.Count(root, string(filepath.Separator))
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			// Log but continue on permission errors
			if os.IsPermission(err) {
				return nil
			}
			return err
		}
		// Check depth limit
		if s.maxDepth >= 0 {
			currentDepth := strings.Count(path, string(filepath.Separator)) - rootDepth
			if d.IsDir() && currentDepth > s.maxDepth {
				return filepath.SkipDir
			}
		}
		// Skip hidden files/directories if requested
		if s.skipHidden && strings.HasPrefix(d.Name(), ".") {
			if d.IsDir() {
				return filepath.SkipDir
			}
			return nil
		}
		// Check if we've hit the result limit
		if len(s.results) >= s.maxResults {
			return filepath.SkipAll
		}
		// Apply pattern matching
		if s.matchesPattern(d.Name()) {
			info, err := d.Info()
			if err != nil {
				return nil // Skip entries we can't stat
			}
			s.results = append(s.results, SearchResult{
				Path:    path,
				Size:    info.Size(),
				ModTime: info.ModTime(),
				IsDir:   d.IsDir(),
			})
		}
		return nil
	})
	return s.results, err
}

func (s *FileSearcher) matchesPattern(name string) bool {
	if !s.caseSensitive {
		name = strings.ToLower(name)
	}
	for _, pattern := range s.patterns {
		searchPattern := pattern
		if !s.caseSensitive {
			searchPattern = strings.ToLower(pattern)
		}
		// Support both glob patterns and substring matching
		if matched, _ := filepath.Match(searchPattern, name); matched {
			return true
		}
		if strings.Contains(name, searchPattern) {
			return true
		}
	}
	return false
}

// Usage example
func demonstrateFileSearch() {
	searcher := NewFileSearcher(
		[]string{"*.go", "*test*"},
		WithMaxResults(50),
		WithMaxDepth(3),
		WithSkipHidden(true),
	)
	results, err := searcher.Search("/project")
	if err != nil {
		log.Fatal(err)
	}
	for _, result := range results {
		fmt.Printf("Found: %s (%d bytes, %s)\n",
			result.Path, result.Size, result.ModTime.Format("2006-01-02"))
	}
}
```

Directory Cleanup Tools

Cleanup tools require careful error handling and confirmation mechanisms to avoid data loss. This implementation provides safe cleanup through a dry-run mode that reports what would be deleted without touching anything:

```go
type CleanupTool struct {
	dryRun       bool
	rules        []CleanupRule
	deletedSize  int64
	deletedCount int
	errors       []error
}

type CleanupRule interface {
	ShouldDelete(path string, d fs.DirEntry, info os.FileInfo) bool
	Description() string
}

type AgeBasedRule struct {
	maxAge  time.Duration
	pattern string
}

func (r *AgeBasedRule) ShouldDelete(path string, d fs.DirEntry, info os.FileInfo) bool {
	if d.IsDir() {
		return false // Don't delete directories by age
	}
	if r.pattern != "" {
		if matched, _ := filepath.Match(r.pattern, d.Name()); !matched {
			return false
		}
	}
	return time.Since(info.ModTime()) > r.maxAge
}

func (r *AgeBasedRule) Description() string {
	return fmt.Sprintf("Files older than %v matching %s", r.maxAge, r.pattern)
}

type SizeBasedRule struct {
	minSize int64
	pattern string
}

func (r *SizeBasedRule) ShouldDelete(path string, d fs.DirEntry, info os.FileInfo) bool {
	if d.IsDir() {
		return false
	}
	if r.pattern != "" {
		if matched, _ := filepath.Match(r.pattern, d.Name()); !matched {
			return false
		}
	}
	return info.Size() > r.minSize
}

func (r *SizeBasedRule) Description() string {
	return fmt.Sprintf("Files larger than %d bytes matching %s", r.minSize, r.pattern)
}

func (ct *CleanupTool) AddRule(rule CleanupRule) {
	ct.rules = append(ct.rules, rule)
}

func (ct *CleanupTool) Clean(root string) error {
	fmt.Printf("Starting cleanup of %s (dry run: %v)\n", root, ct.dryRun)
	for _, rule := range ct.rules {
		fmt.Printf("Rule: %s\n", rule.Description())
	}
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			ct.errors = append(ct.errors, fmt.Errorf("access error %s: %w", path, err))
			return nil // Continue despite errors
		}
		// Skip the root directory
		if path == root {
			return nil
		}
		// Get file info for rule evaluation
		info, err := d.Info()
		if err != nil {
			ct.errors = append(ct.errors, fmt.Errorf("stat error %s: %w", path, err))
			return nil
		}
		// Check if any rule applies
		var matchedRule CleanupRule
		for _, rule := range ct.rules {
			if rule.ShouldDelete(path, d, info) {
				matchedRule = rule
				break
			}
		}
		if matchedRule == nil {
			return nil
		}
		if ct.dryRun {
			fmt.Printf("Would delete: %s (%d bytes) - %s\n", path, info.Size(), matchedRule.Description())
		} else if err := os.Remove(path); err != nil {
			ct.errors = append(ct.errors, fmt.Errorf("delete error %s: %w", path, err))
			return nil // A failed delete is not counted in the totals
		} else {
			fmt.Printf("Deleted: %s (%d bytes)\n", path, info.Size())
		}
		ct.deletedSize += info.Size()
		ct.deletedCount++
		// Skip directory contents if we deleted the directory
		if d.IsDir() {
			return filepath.SkipDir
		}
		return nil
	})
	ct.printSummary()
	return err
}

func (ct *CleanupTool) printSummary() {
	action := "Would delete"
	if !ct.dryRun {
		action = "Deleted"
	}
	fmt.Printf("\nSummary:\n")
	fmt.Printf("%s %d files totaling %d bytes\n", action, ct.deletedCount, ct.deletedSize)
	if len(ct.errors) > 0 {
		fmt.Printf("Encountered %d errors:\n", len(ct.errors))
		for _, err := range ct.errors {
			fmt.Printf("  %v\n", err)
		}
	}
}

// Usage example
func demonstrateCleanup() {
	cleaner := &CleanupTool{dryRun: true}
	// Clean up old temporary files
	cleaner.AddRule(&AgeBasedRule{
		maxAge:  7 * 24 * time.Hour, // 7 days
		pattern: "*.tmp",
	})
	// Clean up large log files
	cleaner.AddRule(&SizeBasedRule{
		minSize: 100 * 1024 * 1024, // 100MB
		pattern: "*.log",
	})
	err := cleaner.Clean("/tmp/project")
	if err != nil {
		log.Fatal(err)
	}
}
```

File System Auditing

Auditing tools analyze file system structure and permissions to identify security issues and compliance violations:

import (
	"fmt"
	"io/fs"
	"log"
	"os"
	"path/filepath"
	"strings"
)

type FileSystemAuditor struct {
	checks     []SecurityCheck
	findings   []SecurityFinding
	totalFiles int
	totalDirs  int
	totalSize  int64
}

type SecurityCheck interface {
	Check(path string, d fs.DirEntry, info os.FileInfo) *SecurityFinding
	Name() string
}

type SecurityFinding struct {
	Check    string
	Path     string
	Severity string
	Message  string
	Details  map[string]interface{}
}

type PermissionCheck struct {
	maxPermissions os.FileMode
}

func (pc *PermissionCheck) Check(path string, d fs.DirEntry, info os.FileInfo) *SecurityFinding {
	// Permissions are bit sets, not ordered numbers: flag any bit that is set
	// beyond the allowed maximum rather than comparing the values with ">".
	perm := info.Mode().Perm()
	if perm&^pc.maxPermissions.Perm() == 0 {
		return nil
	}
	return &SecurityFinding{
		Check:    pc.Name(),
		Path:     path,
		Severity: "Medium",
		Message:  "File has overly permissive permissions",
		Details: map[string]interface{}{
			"current": perm.String(),
			"maximum": pc.maxPermissions.String(),
			"is_dir":  d.IsDir(),
		},
	}
}

func (pc *PermissionCheck) Name() string {
	return "Permission Check"
}

type SuidCheck struct{}

func (sc *SuidCheck) Check(path string, d fs.DirEntry, info os.FileInfo) *SecurityFinding {
	setuid := info.Mode()&os.ModeSetuid != 0
	setgid := info.Mode()&os.ModeSetgid != 0
	if d.IsDir() || (!setuid && !setgid) {
		return nil
	}

	severity := "High"
	if setuid {
		severity = "Critical"
	}
	return &SecurityFinding{
		Check:    sc.Name(),
		Path:     path,
		Severity: severity,
		Message:  "File has setuid/setgid permissions",
		Details: map[string]interface{}{
			"setuid": setuid,
			"setgid": setgid,
			"mode":   info.Mode().String(),
		},
	}
}

func (sc *SuidCheck) Name() string {
	return "SUID/SGID Check"
}

type HiddenFileCheck struct {
	allowedHidden []string
}

func (hfc *HiddenFileCheck) Check(path string, d fs.DirEntry, info os.FileInfo) *SecurityFinding {
	if !strings.HasPrefix(d.Name(), ".") {
		return nil
	}

	// Check if this hidden file is in the allowed list
	for _, allowed := range hfc.allowedHidden {
		if matched, _ := filepath.Match(allowed, d.Name()); matched {
			return nil
		}
	}
	return &SecurityFinding{
		Check:    hfc.Name(),
		Path:     path,
		Severity: "Low",
		Message:  "Unexpected hidden file found",
		Details: map[string]interface{}{
			"name":   d.Name(),
			"is_dir": d.IsDir(),
		},
	}
}

func (hfc *HiddenFileCheck) Name() string {
	return "Hidden File Check"
}

func (fsa *FileSystemAuditor) AddCheck(check SecurityCheck) {
	fsa.checks = append(fsa.checks, check)
}

func (fsa *FileSystemAuditor) Audit(root string) error {
	fmt.Printf("Starting security audit of %s\n", root)

	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			fsa.findings = append(fsa.findings, SecurityFinding{
				Check:    "Access Check",
				Path:     path,
				Severity: "Medium",
				Message:  "Unable to access file system entry",
				Details:  map[string]interface{}{"error": err.Error()},
			})
			return nil // Continue audit despite access errors
		}

		// Update statistics
		if d.IsDir() {
			fsa.totalDirs++
		} else {
			fsa.totalFiles++
		}

		// Fetch file info once for both the size total and the checks
		info, err := d.Info()
		if err != nil {
			return nil // Skip checks if we can't get file info
		}
		if !d.IsDir() {
			fsa.totalSize += info.Size()
		}

		// Run security checks
		for _, check := range fsa.checks {
			if finding := check.Check(path, d, info); finding != nil {
				fsa.findings = append(fsa.findings, *finding)
			}
		}
		return nil
	})

	fsa.generateReport()
	return err
}

func (fsa *FileSystemAuditor) generateReport() {
	fmt.Printf("\n=== Security Audit Report ===\n")
	fmt.Printf("Files scanned: %d\n", fsa.totalFiles)
	fmt.Printf("Directories scanned: %d\n", fsa.totalDirs)
	fmt.Printf("Total size: %d bytes\n", fsa.totalSize)
	fmt.Printf("Security findings: %d\n\n", len(fsa.findings))

	// Group findings by severity
	severityCount := make(map[string]int)
	for _, finding := range fsa.findings {
		severityCount[finding.Severity]++
	}

	// Map iteration order is random, so print in a fixed order
	for _, severity := range []string{"Critical", "High", "Medium", "Low"} {
		if count := severityCount[severity]; count > 0 {
			fmt.Printf("%s: %d findings\n", severity, count)
		}
	}

	fmt.Printf("\nDetailed findings:\n")
	for _, finding := range fsa.findings {
		fmt.Printf("[%s] %s: %s\n", finding.Severity, finding.Check, finding.Path)
		fmt.Printf("  %s\n", finding.Message)
		for key, value := range finding.Details {
			fmt.Printf("    %s: %v\n", key, value)
		}
		fmt.Println()
	}
}

// Usage example
func demonstrateAudit() {
	auditor := &FileSystemAuditor{}

	// Add security checks. Note that a 0644 maximum also flags ordinary
	// directories, which need the execute bit to be traversable.
	auditor.AddCheck(&PermissionCheck{maxPermissions: 0644})
	auditor.AddCheck(&SuidCheck{})
	auditor.AddCheck(&HiddenFileCheck{
		allowedHidden: []string{".git", ".gitignore", ".env"},
	})

	err := auditor.Audit("/var/www")
	if err != nil {
		log.Fatal(err)
	}
}

DEV Community