feat(scanner): implement selective folder scanning and file system watcher improvements (#4674)

* feat: Add selective folder scanning capability

Implement targeted scanning of specific library/folder pairs without
full recursion. This enables efficient rescanning of individual folders
when changes are detected, significantly reducing scan time for large
libraries.

Key changes:
- Add ScanTarget struct and ScanFolders API to Scanner interface
- Implement CLI flag --targets for specifying libraryID:folderPath pairs
- Add FolderRepository.GetByPaths() for batch folder info retrieval
- Create loadSpecificFolders() for non-recursive directory loading
- Scope GC operations to affected libraries only (with TODO for full impl)
- Add comprehensive tests for selective scanning behavior

The selective scan:
- Only processes specified folders (no subdirectory recursion)
- Maintains library isolation
- Runs full maintenance pipeline scoped to affected libraries
- Supports both full and quick scan modes

Examples:
  navidrome scan --targets "1:Music/Rock,1:Music/Jazz"
  navidrome scan --full --targets "2:Classical"
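
A minimal sketch of how a `--targets` value like the ones above could be parsed, assuming comma-separated `libraryID:folderPath` pairs and folder paths that contain no commas. The names `scanTarget` and `parseTargets` are hypothetical and used only for illustration; this is not the code from this commit.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// scanTarget is a hypothetical stand-in for the ScanTarget struct named above.
type scanTarget struct {
	LibraryID  int
	FolderPath string
}

// parseTargets splits a comma-separated list of libraryID:folderPath pairs.
// Assumption: folder paths may contain spaces but not commas.
func parseTargets(spec string) ([]scanTarget, error) {
	var targets []scanTarget
	for _, part := range strings.Split(spec, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		// Split on the first colon only, so the path keeps any later colons.
		idStr, path, found := strings.Cut(part, ":")
		if !found {
			return nil, fmt.Errorf("invalid target %q: expected libraryID:folderPath", part)
		}
		id, err := strconv.Atoi(idStr)
		if err != nil || id <= 0 {
			return nil, fmt.Errorf("invalid library ID in target %q", part)
		}
		targets = append(targets, scanTarget{LibraryID: id, FolderPath: path})
	}
	return targets, nil
}

func main() {
	targets, err := parseTargets("1:Music/Rock,1:Music/Jazz")
	if err != nil {
		panic(err)
	}
	for _, t := range targets {
		fmt.Printf("%d:%s\n", t.LibraryID, t.FolderPath)
	}
}
```

Rejecting non-positive library IDs here mirrors the later "validate library ID" fix in this series, but the exact validation rules in the real parser may differ.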

* feat(folder): replace GetByPaths with GetFolderUpdateInfo for improved folder updates retrieval

Signed-off-by: Deluan <deluan@navidrome.org>

* test: update parseTargets test to handle folder names with spaces

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(folder): remove unused LibraryPath struct and update GC logging message

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(folder): enhance external scanner to support target-specific scanning

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): simplify scanner methods

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(watcher): implement folder scanning notifications with deduplication

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(watcher): add resolveFolderPath function for testability

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(watcher): implement path ignoring based on .ndignore patterns

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): implement IgnoreChecker for managing .ndignore patterns

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(ignore_checker): rename scanner to lineScanner for clarity

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): enhance ScanTarget struct with String method for better target representation

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(scanner): validate library ID to prevent negative values

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): simplify GC method by removing library ID parameter

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(scanner): update folder scanning to include all descendants of specified folders

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(subsonic): allow selective scan in the /startScan endpoint

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): update CallScan to handle specific library/folder pairs

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): streamline scanning logic by removing scanAll method

Signed-off-by: Deluan <deluan@navidrome.org>

* test: enhance mockScanner for thread safety and improve test reliability

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): move scanner.ScanTarget to model.ScanTarget

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor: move scanner types to model, implement MockScanner

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): update scanner interface and implementations to use model.Scanner

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(folder_repository): normalize target path handling by using filepath.Clean

Signed-off-by: Deluan <deluan@navidrome.org>

* test(folder_repository): add comprehensive tests for folder retrieval and child exclusion

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): simplify selective scan logic using slice.Filter

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): streamline phase folder and album creation by removing unnecessary library parameter

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): move initialization logic from phase_1 to the scanner itself

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(tests): rename selective scan test file to scanner_selective_test.go

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(configuration): add DevSelectiveWatcher configuration option

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(watcher): enhance .ndignore handling for folder deletions and file changes

Signed-off-by: Deluan <deluan@navidrome.org>

* docs(scanner): comments

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): enhance walkDirTree to support target folder scanning

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(scanner, watcher): handle errors when pushing ignore patterns for folders

Signed-off-by: Deluan <deluan@navidrome.org>

* Update scanner/phase_1_folders.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* refactor(scanner): replace parseTargets function with direct call to scanner.ParseTargets

Signed-off-by: Deluan <deluan@navidrome.org>

* test(scanner): add tests for ScanBegin and ScanEnd functionality

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(library): update PRAGMA optimize to check table sizes without ANALYZE

Signed-off-by: Deluan <deluan@navidrome.org>

* test(scanner): refactor tests

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(ui): add selective scan options and update translations

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(ui): add quick and full scan options for individual libraries

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(ui): add Scan buttons to the LibraryList

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(scan): update scanning parameters from 'path' to 'target' for selective scans.

* refactor(scan): move ParseTargets function to model package

* test(scan): suppress unused return value from SetUserLibraries in tests

* feat(gc): enhance garbage collection to support selective library purging

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(scanner): prevent race condition when scanning deleted folders

When the watcher detects changes in a folder that gets deleted before
the scanner runs (due to the 10-second delay), the scanner was
prematurely removing these folders from the tracking map, preventing
them from being marked as missing.

The issue occurred because `newFolderEntry` was calling `popLastUpdate`
before verifying the folder actually exists on the filesystem.

Changes:
- Move fs.Stat check before newFolderEntry creation in loadDir to
  ensure deleted folders remain in lastUpdates for finalize() to handle
- Add early existence check in walkDirTree to skip non-existent target
  folders with a warning log
- Add unit test verifying non-existent folders aren't removed from
  lastUpdates prematurely
- Add integration test for deleted folder scenario with ScanFolders

Fixes the issue where deleting entire folders (e.g., /music/AC_DC)
wouldn't mark tracks as missing when using selective folder scanning.

* refactor(scan): streamline folder entry creation and update handling

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(scan): add '@Recycle' (QNAP) to ignored directories list

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(log): improve thread safety in logging level management

* test(scan): move unit tests for ParseTargets function

Signed-off-by: Deluan <deluan@navidrome.org>

* review

Signed-off-by: Deluan <deluan@navidrome.org>

---------

Signed-off-by: Deluan <deluan@navidrome.org>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: deluan <deluan.quintao@mechanical-orchard.com>
Deluan Quintão
2025-11-14 22:15:43 -05:00
committed by GitHub
parent bca76069c3
commit 28d5299ffc
52 changed files with 3221 additions and 374 deletions
+5 -1
@@ -337,8 +337,12 @@ on conflict (user_id, item_id, item_type) do update
 	return r.executeSQL(query)
 }
-func (r *albumRepository) purgeEmpty() error {
+func (r *albumRepository) purgeEmpty(libraryIDs ...int) error {
 	del := Delete(r.tableName).Where("id not in (select distinct(album_id) from media_file)")
+	// If libraryIDs are specified, only purge albums from those libraries
+	if len(libraryIDs) > 0 {
+		del = del.Where(Eq{"library_id": libraryIDs})
+	}
 	c, err := r.executeSQL(del)
 	if err != nil {
 		return fmt.Errorf("purging empty albums: %w", err)
+49 -3
@@ -4,7 +4,10 @@ import (
 	"context"
 	"encoding/json"
 	"fmt"
+	"os"
+	"path/filepath"
 	"slices"
+	"strings"
 	"time"
 	. "github.com/Masterminds/squirrel"
@@ -91,8 +94,47 @@ func (r folderRepository) CountAll(opt ...model.QueryOptions) (int64, error) {
 	return r.count(query)
 }
-func (r folderRepository) GetLastUpdates(lib model.Library) (map[string]model.FolderUpdateInfo, error) {
-	sq := r.newSelect().Columns("id", "updated_at", "hash").Where(Eq{"library_id": lib.ID, "missing": false})
+func (r folderRepository) GetFolderUpdateInfo(lib model.Library, targetPaths ...string) (map[string]model.FolderUpdateInfo, error) {
+	where := And{
+		Eq{"library_id": lib.ID},
+		Eq{"missing": false},
+	}
+	// If specific paths are requested, include those folders and all their descendants
+	if len(targetPaths) > 0 {
+		// Collect folder IDs for exact target folders and path conditions for descendants
+		folderIDs := make([]string, 0, len(targetPaths))
+		pathConditions := make(Or, 0, len(targetPaths)*2)
+		for _, targetPath := range targetPaths {
+			if targetPath == "" || targetPath == "." {
+				// Root path - include everything in this library
+				pathConditions = Or{}
+				folderIDs = nil
+				break
+			}
+			// Clean the path to normalize it. Paths stored in the folder table do not have leading/trailing slashes.
+			cleanPath := strings.TrimPrefix(targetPath, string(os.PathSeparator))
+			cleanPath = filepath.Clean(cleanPath)
+			// Include the target folder itself by ID
+			folderIDs = append(folderIDs, model.FolderID(lib, cleanPath))
+			// Include all descendants: folders whose path field equals or starts with the target path
+			// Note: Folder.Path is the directory path, so children have path = targetPath
+			pathConditions = append(pathConditions, Eq{"path": cleanPath})
+			pathConditions = append(pathConditions, Like{"path": cleanPath + "/%"})
+		}
+		// Combine conditions: exact folder IDs OR descendant path patterns
+		if len(folderIDs) > 0 {
+			where = append(where, Or{Eq{"id": folderIDs}, pathConditions})
+		} else if len(pathConditions) > 0 {
+			where = append(where, pathConditions)
+		}
+	}
+	sq := r.newSelect().Columns("id", "updated_at", "hash").Where(where)
 	var res []struct {
 		ID        string
 		UpdatedAt time.Time
@@ -149,7 +191,7 @@ func (r folderRepository) GetTouchedWithPlaylists() (model.FolderCursor, error)
 	}, nil
 }
-func (r folderRepository) purgeEmpty() error {
+func (r folderRepository) purgeEmpty(libraryIDs ...int) error {
 	sq := Delete(r.tableName).Where(And{
 		Eq{"num_audio_files": 0},
 		Eq{"num_playlists": 0},
@@ -157,6 +199,10 @@ func (r folderRepository) purgeEmpty() error {
 		ConcatExpr("id not in (select parent_id from folder)"),
 		ConcatExpr("id not in (select folder_id from media_file)"),
 	})
+	// If libraryIDs are specified, only purge folders from those libraries
+	if len(libraryIDs) > 0 {
+		sq = sq.Where(Eq{"library_id": libraryIDs})
+	}
 	c, err := r.executeSQL(sq)
 	if err != nil {
 		return fmt.Errorf("purging empty folders: %w", err)
+213
@@ -0,0 +1,213 @@
package persistence
import (
"context"
"fmt"
"github.com/navidrome/navidrome/log"
"github.com/navidrome/navidrome/model"
"github.com/navidrome/navidrome/model/request"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
"github.com/pocketbase/dbx"
)
var _ = Describe("FolderRepository", func() {
var repo model.FolderRepository
var ctx context.Context
var conn *dbx.DB
var testLib, otherLib model.Library
BeforeEach(func() {
ctx = request.WithUser(log.NewContext(context.TODO()), model.User{ID: "userid"})
conn = GetDBXBuilder()
repo = newFolderRepository(ctx, conn)
// Use existing library ID 1 from test fixtures
libRepo := NewLibraryRepository(ctx, conn)
lib, err := libRepo.Get(1)
Expect(err).ToNot(HaveOccurred())
testLib = *lib
// Create a second library with its own folder to verify isolation
otherLib = model.Library{Name: "Other Library", Path: "/other/path"}
Expect(libRepo.Put(&otherLib)).To(Succeed())
})
AfterEach(func() {
// Clean up only test folders created by our tests (paths starting with "Test")
// This prevents interference with fixture data needed by other tests
_, _ = conn.NewQuery("DELETE FROM folder WHERE library_id = 1 AND path LIKE 'Test%'").Execute()
_, _ = conn.NewQuery(fmt.Sprintf("DELETE FROM library WHERE id = %d", otherLib.ID)).Execute()
})
Describe("GetFolderUpdateInfo", func() {
Context("with no target paths", func() {
It("returns all folders in the library", func() {
// Create test folders with unique names to avoid conflicts
folder1 := model.NewFolder(testLib, "TestGetLastUpdates/Folder1")
folder2 := model.NewFolder(testLib, "TestGetLastUpdates/Folder2")
err := repo.Put(folder1)
Expect(err).ToNot(HaveOccurred())
err = repo.Put(folder2)
Expect(err).ToNot(HaveOccurred())
otherFolder := model.NewFolder(otherLib, "TestOtherLib/Folder")
err = repo.Put(otherFolder)
Expect(err).ToNot(HaveOccurred())
// Query all folders (no target paths) - should only return folders from testLib
results, err := repo.GetFolderUpdateInfo(testLib)
Expect(err).ToNot(HaveOccurred())
// Should include folders from testLib
Expect(results).To(HaveKey(folder1.ID))
Expect(results).To(HaveKey(folder2.ID))
// Should NOT include folders from other library
Expect(results).ToNot(HaveKey(otherFolder.ID))
})
})
Context("with specific target paths", func() {
It("returns folder info for existing folders", func() {
// Create test folders with unique names
folder1 := model.NewFolder(testLib, "TestSpecific/Rock")
folder2 := model.NewFolder(testLib, "TestSpecific/Jazz")
folder3 := model.NewFolder(testLib, "TestSpecific/Classical")
err := repo.Put(folder1)
Expect(err).ToNot(HaveOccurred())
err = repo.Put(folder2)
Expect(err).ToNot(HaveOccurred())
err = repo.Put(folder3)
Expect(err).ToNot(HaveOccurred())
// Query specific paths
results, err := repo.GetFolderUpdateInfo(testLib, "TestSpecific/Rock", "TestSpecific/Classical")
Expect(err).ToNot(HaveOccurred())
Expect(results).To(HaveLen(2))
// Verify folder IDs are in results
Expect(results).To(HaveKey(folder1.ID))
Expect(results).To(HaveKey(folder3.ID))
Expect(results).ToNot(HaveKey(folder2.ID))
// Verify update info is populated
Expect(results[folder1.ID].UpdatedAt).ToNot(BeZero())
Expect(results[folder1.ID].Hash).To(Equal(folder1.Hash))
})
It("includes all child folders when querying parent", func() {
// Create a parent folder with multiple children
parent := model.NewFolder(testLib, "TestParent/Music")
child1 := model.NewFolder(testLib, "TestParent/Music/Rock/Queen")
child2 := model.NewFolder(testLib, "TestParent/Music/Jazz")
otherParent := model.NewFolder(testLib, "TestParent2/Music/Jazz")
Expect(repo.Put(parent)).To(Succeed())
Expect(repo.Put(child1)).To(Succeed())
Expect(repo.Put(child2)).To(Succeed())
// Query the parent folder - should return parent and all children
results, err := repo.GetFolderUpdateInfo(testLib, "TestParent/Music")
Expect(err).ToNot(HaveOccurred())
Expect(results).To(HaveLen(3))
Expect(results).To(HaveKey(parent.ID))
Expect(results).To(HaveKey(child1.ID))
Expect(results).To(HaveKey(child2.ID))
Expect(results).ToNot(HaveKey(otherParent.ID))
})
It("excludes children from other libraries", func() {
// Create parent in testLib
parent := model.NewFolder(testLib, "TestIsolation/Parent")
child := model.NewFolder(testLib, "TestIsolation/Parent/Child")
Expect(repo.Put(parent)).To(Succeed())
Expect(repo.Put(child)).To(Succeed())
// Create similar path in other library
otherParent := model.NewFolder(otherLib, "TestIsolation/Parent")
otherChild := model.NewFolder(otherLib, "TestIsolation/Parent/Child")
Expect(repo.Put(otherParent)).To(Succeed())
Expect(repo.Put(otherChild)).To(Succeed())
// Query should only return folders from testLib
results, err := repo.GetFolderUpdateInfo(testLib, "TestIsolation/Parent")
Expect(err).ToNot(HaveOccurred())
Expect(results).To(HaveLen(2))
Expect(results).To(HaveKey(parent.ID))
Expect(results).To(HaveKey(child.ID))
Expect(results).ToNot(HaveKey(otherParent.ID))
Expect(results).ToNot(HaveKey(otherChild.ID))
})
It("excludes missing children when querying parent", func() {
// Create parent and children, mark one as missing
parent := model.NewFolder(testLib, "TestMissingChild/Parent")
child1 := model.NewFolder(testLib, "TestMissingChild/Parent/Child1")
child2 := model.NewFolder(testLib, "TestMissingChild/Parent/Child2")
child2.Missing = true
Expect(repo.Put(parent)).To(Succeed())
Expect(repo.Put(child1)).To(Succeed())
Expect(repo.Put(child2)).To(Succeed())
// Query parent - should only return parent and non-missing child
results, err := repo.GetFolderUpdateInfo(testLib, "TestMissingChild/Parent")
Expect(err).ToNot(HaveOccurred())
Expect(results).To(HaveLen(2))
Expect(results).To(HaveKey(parent.ID))
Expect(results).To(HaveKey(child1.ID))
Expect(results).ToNot(HaveKey(child2.ID))
})
It("handles mix of existing and non-existing target paths", func() {
// Create folders for one path but not the other
existingParent := model.NewFolder(testLib, "TestMixed/Exists")
existingChild := model.NewFolder(testLib, "TestMixed/Exists/Child")
Expect(repo.Put(existingParent)).To(Succeed())
Expect(repo.Put(existingChild)).To(Succeed())
// Query both existing and non-existing paths
results, err := repo.GetFolderUpdateInfo(testLib, "TestMixed/Exists", "TestMixed/DoesNotExist")
Expect(err).ToNot(HaveOccurred())
Expect(results).To(HaveLen(2))
Expect(results).To(HaveKey(existingParent.ID))
Expect(results).To(HaveKey(existingChild.ID))
})
It("handles empty folder path as root", func() {
// Test querying for root folder without creating it (fixtures should have one)
rootFolderID := model.FolderID(testLib, ".")
results, err := repo.GetFolderUpdateInfo(testLib, "")
Expect(err).ToNot(HaveOccurred())
// Should return the root folder if it exists
if len(results) > 0 {
Expect(results).To(HaveKey(rootFolderID))
}
})
It("returns empty map for non-existent folders", func() {
results, err := repo.GetFolderUpdateInfo(testLib, "NonExistent/Path")
Expect(err).ToNot(HaveOccurred())
Expect(results).To(BeEmpty())
})
It("skips missing folders", func() {
// Create a folder and mark it as missing
folder := model.NewFolder(testLib, "TestMissing/Folder")
folder.Missing = true
err := repo.Put(folder)
Expect(err).ToNot(HaveOccurred())
results, err := repo.GetFolderUpdateInfo(testLib, "TestMissing/Folder")
Expect(err).ToNot(HaveOccurred())
Expect(results).To(BeEmpty())
})
})
})
})
+3 -1
@@ -177,7 +177,9 @@ func (r *libraryRepository) ScanEnd(id int) error {
 		return err
 	}
 	// https://www.sqlite.org/pragma.html#pragma_optimize
-	_, err = r.executeSQL(Expr("PRAGMA optimize=0x10012;"))
+	// Use mask 0x10000 to check table sizes without running ANALYZE
+	// Running ANALYZE can cause query planner issues with expression-based collation indexes
+	_, err = r.executeSQL(Expr("PRAGMA optimize=0x10000;"))
 	return err
 }
+58
@@ -142,4 +142,62 @@ var _ = Describe("LibraryRepository", func() {
Expect(libAfter.TotalSize).To(Equal(sizeRes.Sum))
Expect(libAfter.TotalDuration).To(Equal(durationRes.Sum))
})
Describe("ScanBegin and ScanEnd", func() {
var lib *model.Library
BeforeEach(func() {
lib = &model.Library{
ID: 0,
Name: "Test Scan Library",
Path: "/music/test-scan",
}
err := repo.Put(lib)
Expect(err).ToNot(HaveOccurred())
})
DescribeTable("ScanBegin",
func(fullScan bool, expectedFullScanInProgress bool) {
err := repo.ScanBegin(lib.ID, fullScan)
Expect(err).ToNot(HaveOccurred())
updatedLib, err := repo.Get(lib.ID)
Expect(err).ToNot(HaveOccurred())
Expect(updatedLib.LastScanStartedAt).ToNot(BeZero())
Expect(updatedLib.FullScanInProgress).To(Equal(expectedFullScanInProgress))
},
Entry("sets FullScanInProgress to true for full scan", true, true),
Entry("sets FullScanInProgress to false for quick scan", false, false),
)
Context("ScanEnd", func() {
BeforeEach(func() {
err := repo.ScanBegin(lib.ID, true)
Expect(err).ToNot(HaveOccurred())
})
It("sets LastScanAt and clears FullScanInProgress and LastScanStartedAt", func() {
err := repo.ScanEnd(lib.ID)
Expect(err).ToNot(HaveOccurred())
updatedLib, err := repo.Get(lib.ID)
Expect(err).ToNot(HaveOccurred())
Expect(updatedLib.LastScanAt).ToNot(BeZero())
Expect(updatedLib.FullScanInProgress).To(BeFalse())
Expect(updatedLib.LastScanStartedAt).To(BeZero())
})
It("sets LastScanAt to be after LastScanStartedAt", func() {
libBefore, err := repo.Get(lib.ID)
Expect(err).ToNot(HaveOccurred())
err = repo.ScanEnd(lib.ID)
Expect(err).ToNot(HaveOccurred())
libAfter, err := repo.Get(lib.ID)
Expect(err).ToNot(HaveOccurred())
Expect(libAfter.LastScanAt).To(BeTemporally(">=", libBefore.LastScanStartedAt))
})
})
})
})
+9 -3
@@ -157,7 +157,7 @@ func (s *SQLStore) WithTxImmediate(block func(tx model.DataStore) error, scope .
 	}, scope...)
 }
-func (s *SQLStore) GC(ctx context.Context) error {
+func (s *SQLStore) GC(ctx context.Context, libraryIDs ...int) error {
 	trace := func(ctx context.Context, msg string, f func() error) func() error {
 		return func() error {
 			start := time.Now()
@@ -167,11 +167,17 @@ func (s *SQLStore) GC(ctx context.Context) error {
 		}
 	}
+	// If libraryIDs are provided, scope operations to those libraries where possible
+	scoped := len(libraryIDs) > 0
+	if scoped {
+		log.Debug(ctx, "GC: Running selective garbage collection", "libraryIDs", libraryIDs)
+	}
 	err := run.Sequentially(
-		trace(ctx, "purge empty albums", func() error { return s.Album(ctx).(*albumRepository).purgeEmpty() }),
+		trace(ctx, "purge empty albums", func() error { return s.Album(ctx).(*albumRepository).purgeEmpty(libraryIDs...) }),
 		trace(ctx, "purge empty artists", func() error { return s.Artist(ctx).(*artistRepository).purgeEmpty() }),
 		trace(ctx, "mark missing artists", func() error { return s.Artist(ctx).(*artistRepository).markMissing() }),
-		trace(ctx, "purge empty folders", func() error { return s.Folder(ctx).(*folderRepository).purgeEmpty() }),
+		trace(ctx, "purge empty folders", func() error { return s.Folder(ctx).(*folderRepository).purgeEmpty(libraryIDs...) }),
 		trace(ctx, "clean album annotations", func() error { return s.Album(ctx).(*albumRepository).cleanAnnotations() }),
 		trace(ctx, "clean artist annotations", func() error { return s.Artist(ctx).(*artistRepository).cleanAnnotations() }),
 		trace(ctx, "clean media file annotations", func() error { return s.MediaFile(ctx).(*mediaFileRepository).cleanAnnotations() }),