feat(scanner): implement selective folder scanning and file system watcher improvements (#4674)

* feat: Add selective folder scanning capability

Implement targeted scanning of specific library/folder pairs without
full recursion. This enables efficient rescanning of individual folders
when changes are detected, significantly reducing scan time for large
libraries.

Key changes:
- Add ScanTarget struct and ScanFolders API to Scanner interface
- Implement CLI flag --targets for specifying libraryID:folderPath pairs
- Add FolderRepository.GetByPaths() for batch folder info retrieval
- Create loadSpecificFolders() for non-recursive directory loading
- Scope GC operations to affected libraries only (with TODO for full impl)
- Add comprehensive tests for selective scanning behavior

The selective scan:
- Only processes specified folders (no subdirectory recursion)
- Maintains library isolation
- Runs full maintenance pipeline scoped to affected libraries
- Supports both full and quick scan modes

Examples:
  navidrome scan --targets "1:Music/Rock,1:Music/Jazz"
  navidrome scan --full --targets "2:Classical"

* feat(folder): replace GetByPaths with GetFolderUpdateInfo for improved folder updates retrieval

Signed-off-by: Deluan <deluan@navidrome.org>

* test: update parseTargets test to handle folder names with spaces

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(folder): remove unused LibraryPath struct and update GC logging message

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(folder): enhance external scanner to support target-specific scanning

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): simplify scanner methods

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(watcher): implement folder scanning notifications with deduplication

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(watcher): add resolveFolderPath function for testability

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(watcher): implement path ignoring based on .ndignore patterns

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): implement IgnoreChecker for managing .ndignore patterns

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(ignore_checker): rename scanner to lineScanner for clarity

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): enhance ScanTarget struct with String method for better target representation

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(scanner): validate library ID to prevent negative values

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): simplify GC method by removing library ID parameter

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(scanner): update folder scanning to include all descendants of specified folders

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(subsonic): allow selective scan in the /startScan endpoint

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): update CallScan to handle specific library/folder pairs

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): streamline scanning logic by removing scanAll method

Signed-off-by: Deluan <deluan@navidrome.org>

* test: enhance mockScanner for thread safety and improve test reliability

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): move scanner.ScanTarget to model.ScanTarget

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor: move scanner types to model,implement MockScanner

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): update scanner interface and implementations to use model.Scanner

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(folder_repository): normalize target path handling by using filepath.Clean

Signed-off-by: Deluan <deluan@navidrome.org>

* test(folder_repository): add comprehensive tests for folder retrieval and child exclusion

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): simplify selective scan logic using slice.Filter

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): streamline phase folder and album creation by removing unnecessary library parameter

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): move initialization logic from phase_1 to the scanner itself

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(tests): rename selective scan test file to scanner_selective_test.go

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(configuration): add DevSelectiveWatcher configuration option

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(watcher): enhance .ndignore handling for folder deletions and file changes

Signed-off-by: Deluan <deluan@navidrome.org>

* docs(scanner): comments

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): enhance walkDirTree to support target folder scanning

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(scanner, watcher): handle errors when pushing ignore patterns for folders

Signed-off-by: Deluan <deluan@navidrome.org>

* Update scanner/phase_1_folders.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* refactor(scanner): replace parseTargets function with direct call to scanner.ParseTargets

Signed-off-by: Deluan <deluan@navidrome.org>

* test(scanner): add tests for ScanBegin and ScanEnd functionality

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(library): update PRAGMA optimize to check table sizes without ANALYZE

Signed-off-by: Deluan <deluan@navidrome.org>

* test(scanner): refactor tests

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(ui): add selective scan options and update translations

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(ui): add quick and full scan options for individual libraries

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(ui): add Scan buttonsto the LibraryList

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(scan): update scanning parameters from 'path' to 'target' for selective scans.

* refactor(scan): move ParseTargets function to model package

* test(scan): suppress unused return value from SetUserLibraries in tests

* feat(gc): enhance garbage collection to support selective library purging

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(scanner): prevent race condition when scanning deleted folders

When the watcher detects changes in a folder that gets deleted before
the scanner runs (due to the 10-second delay), the scanner was
prematurely removing these folders from the tracking map, preventing
them from being marked as missing.

The issue occurred because `newFolderEntry` was calling `popLastUpdate`
before verifying the folder actually exists on the filesystem.

Changes:
- Move fs.Stat check before newFolderEntry creation in loadDir to
  ensure deleted folders remain in lastUpdates for finalize() to handle
- Add early existence check in walkDirTree to skip non-existent target
  folders with a warning log
- Add unit test verifying non-existent folders aren't removed from
  lastUpdates prematurely
- Add integration test for deleted folder scenario with ScanFolders

Fixes the issue where deleting entire folders (e.g., /music/AC_DC)
wouldn't mark tracks as missing when using selective folder scanning.

* refactor(scan): streamline folder entry creation and update handling

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(scan): add '@Recycle' (QNAP) to ignored directories list

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(log): improve thread safety in logging level management

* test(scan): move unit tests for ParseTargets function

Signed-off-by: Deluan <deluan@navidrome.org>

* review

Signed-off-by: Deluan <deluan@navidrome.org>

---------

Signed-off-by: Deluan <deluan@navidrome.org>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: deluan <deluan.quintao@mechanical-orchard.com>
This commit is contained in:
Deluan Quintão
2025-11-14 22:15:43 -05:00
committed by GitHub
parent bca76069c3
commit 28d5299ffc
52 changed files with 3221 additions and 374 deletions
+52 -62
View File
@@ -1,7 +1,6 @@
package scanner
import (
"bufio"
"context"
"io/fs"
"maps"
@@ -11,37 +10,69 @@ import (
"strings"
"github.com/navidrome/navidrome/conf"
"github.com/navidrome/navidrome/consts"
"github.com/navidrome/navidrome/log"
"github.com/navidrome/navidrome/model"
"github.com/navidrome/navidrome/utils"
ignore "github.com/sabhiram/go-gitignore"
)
func walkDirTree(ctx context.Context, job *scanJob) (<-chan *folderEntry, error) {
// walkDirTree recursively walks the directory tree starting from the given targetFolders.
// If no targetFolders are provided, it starts from the root folder (".").
// It returns a channel of folderEntry pointers representing each folder found.
func walkDirTree(ctx context.Context, job *scanJob, targetFolders ...string) (<-chan *folderEntry, error) {
results := make(chan *folderEntry)
folders := targetFolders
if len(targetFolders) == 0 {
// No specific folders provided, scan the root folder
folders = []string{"."}
}
go func() {
defer close(results)
err := walkFolder(ctx, job, ".", nil, results)
if err != nil {
log.Error(ctx, "Scanner: There were errors reading directories from filesystem", "path", job.lib.Path, err)
return
for _, folderPath := range folders {
if utils.IsCtxDone(ctx) {
return
}
// Check if target folder exists before walking it
// If it doesn't exist (e.g., deleted between watcher detection and scan execution),
// skip it so it remains in job.lastUpdates and gets handled in following steps
_, err := fs.Stat(job.fs, folderPath)
if err != nil {
log.Warn(ctx, "Scanner: Target folder does not exist.", "path", folderPath, err)
continue
}
// Create checker and push patterns from root to this folder
checker := newIgnoreChecker(job.fs)
err = checker.PushAllParents(ctx, folderPath)
if err != nil {
log.Error(ctx, "Scanner: Error pushing ignore patterns for target folder", "path", folderPath, err)
continue
}
// Recursively walk this folder and all its children
err = walkFolder(ctx, job, folderPath, checker, results)
if err != nil {
log.Error(ctx, "Scanner: Error walking target folder", "path", folderPath, err)
continue
}
}
log.Debug(ctx, "Scanner: Finished reading folders", "lib", job.lib.Name, "path", job.lib.Path, "numFolders", job.numFolders.Load())
log.Debug(ctx, "Scanner: Finished reading target folders", "lib", job.lib.Name, "path", job.lib.Path, "numFolders", job.numFolders.Load())
}()
return results, nil
}
func walkFolder(ctx context.Context, job *scanJob, currentFolder string, ignorePatterns []string, results chan<- *folderEntry) error {
ignorePatterns = loadIgnoredPatterns(ctx, job.fs, currentFolder, ignorePatterns)
func walkFolder(ctx context.Context, job *scanJob, currentFolder string, checker *IgnoreChecker, results chan<- *folderEntry) error {
// Push patterns for this folder onto the stack
_ = checker.Push(ctx, currentFolder)
defer checker.Pop() // Pop patterns when leaving this folder
folder, children, err := loadDir(ctx, job, currentFolder, ignorePatterns)
folder, children, err := loadDir(ctx, job, currentFolder, checker)
if err != nil {
log.Warn(ctx, "Scanner: Error loading dir. Skipping", "path", currentFolder, err)
return nil
}
for _, c := range children {
err := walkFolder(ctx, job, c, ignorePatterns, results)
err := walkFolder(ctx, job, c, checker, results)
if err != nil {
return err
}
@@ -59,50 +90,17 @@ func walkFolder(ctx context.Context, job *scanJob, currentFolder string, ignoreP
return nil
}
func loadIgnoredPatterns(ctx context.Context, fsys fs.FS, currentFolder string, currentPatterns []string) []string {
ignoreFilePath := path.Join(currentFolder, consts.ScanIgnoreFile)
var newPatterns []string
if _, err := fs.Stat(fsys, ignoreFilePath); err == nil {
// Read and parse the .ndignore file
ignoreFile, err := fsys.Open(ignoreFilePath)
if err != nil {
log.Warn(ctx, "Scanner: Error opening .ndignore file", "path", ignoreFilePath, err)
// Continue with previous patterns
} else {
defer ignoreFile.Close()
scanner := bufio.NewScanner(ignoreFile)
for scanner.Scan() {
line := scanner.Text()
if line == "" || strings.HasPrefix(line, "#") {
continue // Skip empty lines and comments
}
newPatterns = append(newPatterns, line)
}
if err := scanner.Err(); err != nil {
log.Warn(ctx, "Scanner: Error reading .ignore file", "path", ignoreFilePath, err)
}
}
// If the .ndignore file is empty, mimic the current behavior and ignore everything
if len(newPatterns) == 0 {
log.Trace(ctx, "Scanner: .ndignore file is empty, ignoring everything", "path", currentFolder)
newPatterns = []string{"**/*"}
} else {
log.Trace(ctx, "Scanner: .ndignore file found ", "path", ignoreFilePath, "patterns", newPatterns)
}
}
// Combine the patterns from the .ndignore file with the ones passed as argument
combinedPatterns := append([]string{}, currentPatterns...)
return append(combinedPatterns, newPatterns...)
}
func loadDir(ctx context.Context, job *scanJob, dirPath string, ignorePatterns []string) (folder *folderEntry, children []string, err error) {
folder = newFolderEntry(job, dirPath)
func loadDir(ctx context.Context, job *scanJob, dirPath string, checker *IgnoreChecker) (folder *folderEntry, children []string, err error) {
// Check if directory exists before creating the folder entry
// This is important to avoid removing the folder from lastUpdates if it doesn't exist
dirInfo, err := fs.Stat(job.fs, dirPath)
if err != nil {
log.Warn(ctx, "Scanner: Error stating dir", "path", dirPath, err)
return nil, nil, err
}
// Now that we know the folder exists, create the entry (which removes it from lastUpdates)
folder = job.createFolderEntry(dirPath)
folder.modTime = dirInfo.ModTime()
dir, err := job.fs.Open(dirPath)
@@ -117,12 +115,11 @@ func loadDir(ctx context.Context, job *scanJob, dirPath string, ignorePatterns [
return folder, children, err
}
ignoreMatcher := ignore.CompileIgnoreLines(ignorePatterns...)
entries := fullReadDir(ctx, dirFile)
children = make([]string, 0, len(entries))
for _, entry := range entries {
entryPath := path.Join(dirPath, entry.Name())
if len(ignorePatterns) > 0 && isScanIgnored(ctx, ignoreMatcher, entryPath) {
if checker.ShouldIgnore(ctx, entryPath) {
log.Trace(ctx, "Scanner: Ignoring entry", "path", entryPath)
continue
}
@@ -234,6 +231,7 @@ func isDirReadable(ctx context.Context, fsys fs.FS, dirPath string) bool {
var ignoredDirs = []string{
"$RECYCLE.BIN",
"#snapshot",
"@Recycle",
"@Recently-Snapshot",
".streams",
"lost+found",
@@ -254,11 +252,3 @@ func isDirIgnored(name string) bool {
func isEntryIgnored(name string) bool {
return strings.HasPrefix(name, ".") && !strings.HasPrefix(name, "..")
}
func isScanIgnored(ctx context.Context, matcher *ignore.GitIgnore, entryPath string) bool {
matches := matcher.MatchesPath(entryPath)
if matches {
log.Trace(ctx, "Scanner: Ignoring entry matching .ndignore: ", "path", entryPath)
}
return matches
}