Commit graph

10 commits

Author SHA1 Message Date
safierinx-a
8ce6ec494d Add bulk import optimization: track_lookup cache, batch inserts, BulkSubmitter
Adopts ListenBrainz-inspired patterns to speed up imports from ~24h to
under 30 minutes for 49k scrobbles.

Phase 1 - track_lookup cache table:
- New migration (000006) adds persistent entity lookup cache
- Maps normalized (artist, track, album) → (artist_id, album_id, track_id)
- SubmitListen fast path: cache hit skips 18 DB queries → 2 queries
- Cache populated after entity resolution, invalidated on merge/delete
- Benefits both live scrobbles and imports

Phase 2 - SaveListensBatch:
- New batch listen insert using pgx CopyFrom → temp table → INSERT ON CONFLICT
- Thousands of inserts per second vs one-at-a-time

Phase 3 - BulkSubmitter:
- Reusable import accelerator for all importers
- Pre-deduplicates scrobbles by (artist, track, album) in memory
- Worker pool (4 goroutines) for parallel entity creation on cache miss
- Batch listen insertion via SaveListensBatch

Phase 4 - Migrate importers:
- Maloja, Spotify, LastFM, ListenBrainz importers use BulkSubmitter
- Koito importer left as-is (already fast with pre-resolved IDs)

Phase 5 - Skip image lookups during import:
- GetArtistImage/GetAlbumImage calls fully skipped when SkipCacheImage=true
- Background tasks (FetchMissingArtistImages/FetchMissingAlbumImages) backfill

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 04:17:50 +05:30
safierinx-a
c2d393aa03 Fix Maloja importer resilience: nil MBZ client, null album, error handling
Root cause of the panic at ~1,693 items: importers created
`&mbz.MusicBrainzClient{}` (nil requestQueue) instead of using the
engine's properly initialized client. When any code path called an MBZ
method, it panicked on the nil channel.

Changes:
- Pass engine's MBZ client to Maloja and Spotify importers
- Change MalojaTrack.Album to pointer type to handle null album JSON
- Continue on error instead of aborting the entire import
- Accept both Maloja export formats ("scrobbles" and "list" keys)
- Extract per-file import into importFile() with its own defer/recover
- Add progress logging every 500 items

Test fixtures:
- maloja_import_null_album_test.json (null album, valid album, empty artists)
- maloja_api_format_test.json (API "list" format)

New tests: TestImportMaloja_NullAlbum, TestImportMaloja_ApiFormat

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 23:58:45 +05:30
80b6f4deaa feat: v0.0.8 2025-06-16 21:55:39 -04:00
7ff317756f feat: version v0.0.2 2025-06-14 19:14:30 -04:00
c14df8c2fb feat: time-gated imports via cfg 2025-06-13 05:36:41 -04:00
d0d7bd9a4a feat: listenbrainz import 2025-06-12 23:35:17 -04:00
d2277aea32 feat: add last fm importer 2025-06-12 21:54:35 -04:00
0030628db1 fix: use utc time for maloja importer 2025-06-12 02:42:27 -04:00
f7621d0383 feat: throttle variable for imports 2025-06-11 22:40:36 -04:00
fc9054b78c chore: initial public commit 2025-06-11 19:45:39 -04:00