Koito/internal/catalog
safierinx-a 8ce6ec494d Add bulk import optimization: track_lookup cache, batch inserts, BulkSubmitter
Adopts ListenBrainz-inspired patterns to speed up imports from ~24h to
under 30 minutes for 49k scrobbles.

Phase 1 - track_lookup cache table:
- New migration (000006) adds persistent entity lookup cache
- Maps normalized (artist, track, album) → (artist_id, album_id, track_id)
- SubmitListen fast path: a cache hit cuts DB work from 18 queries to 2
- Cache populated after entity resolution, invalidated on merge/delete
- Benefits both live scrobbles and imports
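
A minimal sketch of the lookup-cache idea described above, kept in memory for illustration only (the commit describes a persistent table added by migration 000006). The normalization rule (lowercase, trim, collapse whitespace), the key separator, and the names EntityIDs, LookupKey, LookupCache and InvalidateTrack are assumptions for this example, not the actual contents of lookup_key.go.

```go
package main

import (
	"fmt"
	"strings"
)

// EntityIDs is a hypothetical value type: the three resolved IDs a cache row maps to.
type EntityIDs struct {
	ArtistID, AlbumID, TrackID int32
}

// normalize lowercases the input and collapses runs of whitespace.
// Assumed rule; the real normalization may differ.
func normalize(s string) string {
	return strings.Join(strings.Fields(strings.ToLower(s)), " ")
}

// LookupKey joins the normalized fields with a separator that is
// unlikely to appear in metadata (ASCII unit separator).
func LookupKey(artist, track, album string) string {
	parts := []string{normalize(artist), normalize(track), normalize(album)}
	return strings.Join(parts, "\x1f")
}

// LookupCache is an in-memory stand-in for the track_lookup table.
type LookupCache struct {
	entries map[string]EntityIDs
}

func NewLookupCache() *LookupCache {
	return &LookupCache{entries: make(map[string]EntityIDs)}
}

// Get is the fast path: a hit means entity resolution can be skipped.
func (c *LookupCache) Get(artist, track, album string) (EntityIDs, bool) {
	ids, ok := c.entries[LookupKey(artist, track, album)]
	return ids, ok
}

// Put records the IDs after entities have been resolved the slow way.
func (c *LookupCache) Put(artist, track, album string, ids EntityIDs) {
	c.entries[LookupKey(artist, track, album)] = ids
}

// InvalidateTrack drops every entry pointing at trackID, as would be
// needed after a merge or delete.
func (c *LookupCache) InvalidateTrack(trackID int32) {
	for k, ids := range c.entries {
		if ids.TrackID == trackID {
			delete(c.entries, k)
		}
	}
}

func main() {
	c := NewLookupCache()
	c.Put("Boards of Canada", "Roygbiv", "Music Has the Right to Children", EntityIDs{1, 2, 3})

	// Differently cased and spaced input still hits the same key.
	ids, ok := c.Get("  boards of  canada ", "ROYGBIV", "music has the right to children")
	fmt.Println(ids, ok) // {1 2 3} true

	c.InvalidateTrack(3)
	_, ok = c.Get("Boards of Canada", "Roygbiv", "Music Has the Right to Children")
	fmt.Println(ok) // false
}
```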

Phase 2 - SaveListensBatch:
- New batch listen insert using pgx CopyFrom → temp table → INSERT ON CONFLICT
- Thousands of inserts per second, versus one INSERT per listen

Phase 3 - BulkSubmitter:
- Reusable import accelerator for all importers
- Pre-deduplicates scrobbles by (artist, track, album) in memory
- Worker pool (4 goroutines) for parallel entity creation on cache miss
- Batch listen insertion via SaveListensBatch
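
The dedup-then-fan-out pattern above can be sketched roughly as follows. This is an illustration under stated assumptions, not the BulkSubmitter code itself: Scrobble, Dedupe, ResolveAll and the resolve callback are hypothetical names, and the raw strings are used as the dedup key where the real code presumably normalizes them first.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Scrobble is a hypothetical, comparable record used as a map key.
type Scrobble struct {
	Artist, Track, Album string
}

// Dedupe returns each distinct (artist, track, album) once, in first-seen order.
func Dedupe(scrobbles []Scrobble) []Scrobble {
	seen := make(map[Scrobble]struct{}, len(scrobbles))
	unique := make([]Scrobble, 0, len(scrobbles))
	for _, s := range scrobbles {
		if _, ok := seen[s]; ok {
			continue
		}
		seen[s] = struct{}{}
		unique = append(unique, s)
	}
	return unique
}

// ResolveAll calls resolve once per unique scrobble on a fixed pool of
// worker goroutines and collects the resulting IDs.
func ResolveAll(unique []Scrobble, workers int, resolve func(Scrobble) int) map[Scrobble]int {
	jobs := make(chan Scrobble)
	results := make(map[Scrobble]int, len(unique))
	var mu sync.Mutex
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for s := range jobs {
				id := resolve(s) // the expensive entity creation on a cache miss
				mu.Lock()
				results[s] = id
				mu.Unlock()
			}
		}()
	}
	for _, s := range unique {
		jobs <- s
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	scrobbles := []Scrobble{
		{"A", "T1", "X"}, {"A", "T1", "X"}, {"B", "T2", "Y"}, {"A", "T1", "X"},
	}
	unique := Dedupe(scrobbles)

	var next int32
	ids := ResolveAll(unique, 4, func(Scrobble) int {
		return int(atomic.AddInt32(&next, 1)) // stand-in for a DB-assigned ID
	})
	fmt.Printf("%d scrobbles, %d unique, %d resolved\n", len(scrobbles), len(unique), len(ids))
}
```

Resolving each distinct combination once means the number of slow entity lookups scales with the size of the library, not the number of listens; the listens themselves can then be mapped to IDs in memory and written in one batch.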

Phase 4 - Migrate importers:
- Maloja, Spotify, LastFM, ListenBrainz importers use BulkSubmitter
- Koito importer left as-is (already fast with pre-resolved IDs)

Phase 5 - Skip image lookups during import:
- GetArtistImage/GetAlbumImage calls fully skipped when SkipCacheImage=true
- Background tasks (FetchMissingArtistImages/FetchMissingAlbumImages) backfill the images later

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 04:17:50 +05:30
associate_album.go     Add bulk import optimization: track_lookup cache, batch inserts, BulkSubmitter  2026-03-25 04:17:50 +05:30
associate_artists.go   Add bulk import optimization: track_lookup cache, batch inserts, BulkSubmitter  2026-03-25 04:17:50 +05:30
associate_track.go     fix: ensure listen activity correctly sums listen activity in step (#139)       2026-01-14 21:35:01 -05:00
catalog.go             Add bulk import optimization: track_lookup cache, batch inserts, BulkSubmitter  2026-03-25 04:17:50 +05:30
catalog_test.go        Pre-release version v0.0.14 (#96)                                               2025-11-19 20:26:56 -05:00
duration.go            fix: correctly cycle tracks in backfill (#138)                                  2026-01-14 12:46:17 -05:00
duration_test.go       fix: correctly cycle tracks in backfill (#138)                                  2026-01-14 12:46:17 -05:00
images.go              fix: improve subsonic image searching (#164)                                    2026-01-21 14:54:52 -05:00
images_test.go         chore: static -> test_assets                                                    2025-06-13 16:23:43 -04:00
lookup_key.go          Add bulk import optimization: track_lookup cache, batch inserts, BulkSubmitter  2026-03-25 04:17:50 +05:30
submit_listen_test.go  Add MusicBrainz search-by-name enrichment for scrobbles without IDs             2026-03-25 00:01:24 +05:30