API Reference¶
Auto-generated from docstrings via mkdocstrings. The public surface is small — most users only touch the CLI — but if you're embedding musickit in another tool, these are the entry points.
musickit.metadata¶
Read source audio tags (FLAC / MP3 / generic) and write MP4 ALAC / AAC / MP3 tags.
metadata
¶
Read source audio tags (FLAC / MP3 / generic) and write MP4 (ALAC / AAC) and MP3 tags.
Public API is split across submodules; this module re-exports the names
the rest of the project (and tests) import from musickit.metadata.
Attributes¶
SUPPORTED_AUDIO_EXTS = frozenset({'.flac', '.mp3', '.m4a', '.m4b', '.mp4', '.aac', '.ogg', '.opus', '.wav', '.aiff', '.aif'})
module-attribute
¶
Classes¶
SourceTrack
¶
Bases: BaseModel
Tag bundle read from a single source audio file.
Source code in src/musickit/metadata/models.py
AlbumSummary
¶
Bases: BaseModel
Album-level rollup derived by majority-vote across the album's tracks.
Source code in src/musickit/metadata/models.py
MusicBrainzIds
¶
Bases: BaseModel
Album-level MusicBrainz IDs supplied by an --enrich provider.
Per-track recording MBIDs live on SourceTrack.mb_recording_id —
they vary per track and don't belong on an album-scope object.
Source code in src/musickit/metadata/models.py
TagOverrides
¶
Bases: BaseModel
Optional tag overrides applied in-place by apply_tag_overrides.
A field left as None means "leave the existing tag alone". Pass an empty
string to clear a tag explicitly (rare; typically you just leave it).
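The None / empty-string convention above can be sketched with a plain dataclass and a dict of tags. This is only an illustration: OverridesSketch, its field names, and apply_sketch are hypothetical stand-ins, and the real TagOverrides is a pydantic model applied to files rather than dicts.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class OverridesSketch:
    # Hypothetical field names; the real TagOverrides fields are not listed here.
    album: Optional[str] = None
    genre: Optional[str] = None


def apply_sketch(tags: dict[str, str], overrides: OverridesSketch) -> dict[str, str]:
    """Return a copy of tags with the overrides applied."""
    result = dict(tags)
    for field in fields(overrides):
        value = getattr(overrides, field.name)
        if value is None:
            continue  # None: leave the existing tag alone
        if value == "":
            result.pop(field.name, None)  # empty string: clear the tag
        else:
            result[field.name] = value  # anything else: overwrite
    return result
```

With tags {"album": "A", "genre": "Rock"}, OverridesSketch(album="B") changes only the album, while OverridesSketch(genre="") removes the genre and keeps the album.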
Source code in src/musickit/metadata/models.py
Functions¶
read_source(path, *, light=False, measure_pictures=False)
¶
Read tags + embedded cover from a single audio file.
Source values that arrive entirely lowercase are smart-title-cased here
so downstream filenames + tags display consistently. Anything with real
casing (AC/DC, ABBA, iPhone, R.E.M.) is left alone.
light=True skips the two expensive operations the convert pipeline
needs but the library scanner / TUI doesn't:
- Pillow decode of the embedded picture (for cover_pixels)
- A second mutagen open to read info.length (for duration_s)
has_cover still works in light mode (presence is checked without
touching the bytes); only the pixel measurement is skipped.
measure_pictures=True re-enables the Pillow decode even under
light=True, so audit modes that need low-res-cover detection can
pay just that cost without also paying the duration probe.
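The casing rule described above (title-case only values that arrive entirely lowercase) can be approximated in a few lines. This is a minimal sketch, not the library's implementation: smart_title_case is a hypothetical name, and string.capwords is an assumed stand-in for whatever title-casing musickit actually uses.

```python
import string


def smart_title_case(value: str) -> str:
    """Title-case a tag only when it arrived entirely lowercase."""
    # str.islower() is True only when there is at least one cased character
    # and every cased character is lowercase.
    if value.islower():
        return string.capwords(value)
    # Mixed or upper case is treated as intentional and left untouched.
    return value
```

Under this sketch "dark side of the moon" becomes "Dark Side Of The Moon", while AC/DC, iPhone, and R.E.M. pass through unchanged.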
Source code in src/musickit/metadata/read.py
summarize_album(tracks)
¶
Build an album-level summary by majority-vote across tracks.
For multi-disc albums the album-name vote is biased toward disc 1 — bonus
discs often carry tags like Album (CD2) Live In ... that would otherwise
win on count and produce a misleading combined name.
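One way to picture the disc-1 bias is a vote restricted to disc-1 tracks. The sketch below is an assumption about how such a bias could work, not the actual algorithm: vote_album_name is a hypothetical helper that takes (disc_no, album_tag) pairs instead of SourceTrack objects.

```python
from collections import Counter
from typing import Optional


def vote_album_name(tracks: list[tuple[Optional[int], str]]) -> Optional[str]:
    """Pick an album name from (disc_no, album_tag) pairs.

    Assumption: the bias is modelled by voting among disc-1 tracks only
    (a missing disc number counts as disc 1), falling back to all tracks
    when no disc-1 track carries an album tag.
    """
    disc_one = [album for disc, album in tracks if album and disc in (None, 1)]
    pool = disc_one or [album for _, album in tracks if album]
    if not pool:
        return None
    # most_common(1) returns the single highest-count (value, count) pair.
    return Counter(pool).most_common(1)[0][0]
```

With two disc-1 tracks tagged "Roses" and five disc-2 tracks tagged "Roses (CD2) Live In Madrid", a plain majority would pick the bonus-disc name; this sketch returns "Roses".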
Source code in src/musickit/metadata/album.py
clean_album_title(album)
¶
Clean disc markers, scene-rip dot-separators, and VA - prefixes from an album tag.
Strips:
- trailing [CDx] / (Disc x) / - CD 1 / [CD.1] markers
- embedded (CDx) markers (Cranberries Roses (CD2) Live In Madrid shape)
- trailing (1) / (2) (bare-paren disc index, no keyword)
- dots / underscores used as word-separator instead of spaces
(Absolute.Music.60, Absolute_Music_45 → Absolute Music 60/45);
preserves single-letter acronyms like R.E.M.
- leading VA - / VA.-. / Various - prefixes, once the dots have become spaces
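Two of the rules listed above (trailing disc markers and dot / underscore word separators) can be sketched with regular expressions. The patterns are illustrative assumptions, not the ones in album.py: clean_title_sketch is a hypothetical name, and the "two alphanumerics on each side" test is just one possible way to leave R.E.M. alone.

```python
import re

# Trailing [CD2] / (Disc 1) / [CD.1] / "- CD 1" markers (assumed patterns).
_TRAILING_DISC = re.compile(
    r"\s*(?:[\[(]\s*(?:cd|disc)[\s.]*\d+\s*[\])]|-\s*(?:cd|disc)\s*\d+)\s*$",
    re.IGNORECASE,
)
# A dot or underscore counts as a word separator only when at least two
# alphanumerics sit on each side, so single-letter acronyms are not touched.
_WORD_SEPARATOR = re.compile(r"(?<=[^\W_]{2})[._](?=[^\W_]{2})")


def clean_title_sketch(album: str) -> str:
    """Apply the two simplified rules: disc marker first, then separators."""
    album = _TRAILING_DISC.sub("", album)
    album = _WORD_SEPARATOR.sub(" ", album)
    return album.strip()
```

The sketch deliberately skips embedded (CDx) markers, bare (1) / (2) suffixes, and VA prefixes, which the real function also handles.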
Source code in src/musickit/metadata/album.py
write_tags(path, track, album, *, cover_bytes, cover_mime, musicbrainz=None)
¶
Write the target tag set to path, dispatching by file extension.
Source code in src/musickit/metadata/write.py
write_mp4_tags(path, track, album, *, cover_bytes, cover_mime, musicbrainz=None)
¶
Write the full target tag set to an existing ALAC/AAC .m4a file.
Source code in src/musickit/metadata/write.py
write_id3_tags(path, track, album, *, cover_bytes, cover_mime, musicbrainz=None)
¶
Write the full target tag set to an MP3 file as ID3v2.4.
Source code in src/musickit/metadata/write.py
embed_cover_only(path, *, cover_bytes, cover_mime)
¶
Replace the cover of an existing audio file without touching other tags.
Supports .m4a/.mp4/.m4b, .mp3, and .flac. Used by musickit cover
to retrofit album art onto already-converted files. All previous pictures
are dropped first so we don't end up with multiple covers.
Source code in src/musickit/metadata/write.py
apply_tag_overrides(path, overrides)
¶
Apply overrides to path in-place; leave unspecified tags untouched.
Supports .m4a/.mp4/.m4b, .mp3, .flac. Track totals get merged into
the existing (track, total) tuple so we don't lose the per-track number.
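The track-total merge mentioned above amounts to replacing only the second element of the (track, total) pair. A minimal sketch, assuming a hypothetical helper merge_track_total and using 0 to stand for "unknown":

```python
from typing import Optional


def merge_track_total(existing: Optional[tuple[int, int]], new_total: int) -> tuple[int, int]:
    """Replace the total while keeping the per-track number."""
    # When no pair exists yet, 0 stands in for an unknown track number.
    track_no = existing[0] if existing else 0
    return (track_no, new_total)
```

For example, an existing (3, 10) merged with a new total of 12 gives (3, 12), so the track number 3 survives the override.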
Source code in src/musickit/metadata/overrides.py
musickit.library¶
Walk a converted-output directory, build an Artist→Album→Track index, audit it, fix the deterministic warnings, and persist it as a SQLite cache at <root>/.musickit/index.db.
library
¶
Walk a converted-output directory, build an Artist→Album→Track index, audit it.
Public surface re-exported here so callers keep using
from musickit import library / from musickit.library import ….
The leading-underscore helpers _audit_cover and _split_dir_year
are also re-exported because tests/CLI import them directly.
Attributes¶
SCHEMA_VERSION = 1
module-attribute
¶
Bumped when _SCHEMA changes shape; mismatched DBs are unlinked + rebuilt.
INDEX_DIR_NAME = '.musickit'
module-attribute
¶
INDEX_DB_NAME = 'index.db'
module-attribute
¶
ScanProgressCallback = Callable[[Path, int, int], None]
module-attribute
¶
Classes¶
LibraryTrack
¶
Bases: BaseModel
Track-level summary used by LibraryIndex.
Source code in src/musickit/library/models.py
LibraryAlbum
¶
Bases: BaseModel
Album-level rollup with audit warnings populated by audit().
Source code in src/musickit/library/models.py
LibraryIndex
¶
Bases: BaseModel
Full library index, sorted by (artist_dir, album_dir).
Source code in src/musickit/library/models.py
ValidationResult
¶
Counts returned by validate() for one-line logging.
Source code in src/musickit/library/scan.py
Functions¶
audit_album(album)
¶
Replace album.warnings with a fresh audit pass for one album.
Warnings are sorted alphabetically at the end so the in-memory
LibraryIndex produced by scan_full matches the one produced by
load (SQLite returns album_warnings rows ORDER BY warning).
Source code in src/musickit/library/audit.py
fix_index(index, *, dry_run=False, console=None, year_lookup=None, prefer_dirname=False, on_album=None)
¶
Apply deterministic fixes to every flagged album in index.
Returns a list of human-readable action lines. year_lookup is the
MusicBrainz year-lookup callable (defaults to
enrich.musicbrainz.lookup_release_year — injectable for tests).
prefer_dirname=True inverts the tag/path-mismatch resolution: tags
get rewritten from the dir name instead of the dir being renamed from
the tag. Use this when you've hand-curated the directory layout and
want the tags to follow.
on_album(album, idx, total) fires once per FLAGGED album right
before its fixes run; clean albums (no warnings) are skipped silently
and don't count against the total. Used by the CLI to drive a
progress bar through the slow MB lookups.
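The on_album contract (fires only for flagged albums, with a total that excludes clean ones) can be illustrated as follows. This is a sketch under assumptions: fix_flagged is hypothetical, albums are plain dicts rather than LibraryAlbum models, and the 1-based idx is a guess since the docstring does not say where counting starts.

```python
from typing import Callable, Optional

AlbumSketch = dict  # stand-in for LibraryAlbum: {"name": str, "warnings": list[str]}


def fix_flagged(
    albums: list[AlbumSketch],
    on_album: Optional[Callable[[AlbumSketch, int, int], None]] = None,
) -> list[str]:
    """Visit only albums with warnings and return human-readable action lines."""
    flagged = [album for album in albums if album["warnings"]]
    actions: list[str] = []
    for idx, album in enumerate(flagged, start=1):  # 1-based index is an assumption
        if on_album is not None:
            on_album(album, idx, len(flagged))  # fires before the fixes run
        actions.extend(f"{album['name']}: fix {warning}" for warning in album["warnings"])
    return actions
```

A progress bar driven by this callback reaches 100% after the last flagged album, however many clean albums the index contains.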
Source code in src/musickit/library/fix.py
fix_album(album, *, dry_run=False, console=None, year_lookup, prefer_dirname=False)
¶
Apply fixes to one album. Returns the action lines performed (or planned).
Source code in src/musickit/library/fix.py
db_path(root)
¶
open_db(root)
¶
Open or create the index DB for root.
If the existing DB has a stale schema_version or was written for a
different library_root_abs, the file (and any WAL sidecars) is
unlinked and a fresh schema is created.
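The stale-DB handling described above might look roughly like this with the standard sqlite3 module. It is a sketch under assumptions: open_db_sketch, _is_current, and the key/value meta table are hypothetical, since the real _SCHEMA is not shown here. The -wal and -shm suffixes are SQLite's standard sidecar file names.

```python
import sqlite3
from pathlib import Path

SCHEMA_VERSION = 1  # mirrors the documented module attribute


def _is_current(db_file: Path, root_abs: str) -> bool:
    """True when the DB records the expected schema version and library root."""
    conn = sqlite3.connect(db_file)
    try:
        rows = dict(conn.execute("SELECT key, value FROM meta"))
    except sqlite3.DatabaseError:
        return False  # missing table or unreadable file: treat as stale
    finally:
        conn.close()
    return (
        rows.get("schema_version") == str(SCHEMA_VERSION)
        and rows.get("library_root_abs") == root_abs
    )


def open_db_sketch(db_file: Path, library_root: Path) -> sqlite3.Connection:
    """Open db_file, deleting and recreating it when its metadata is stale."""
    root_abs = str(library_root.resolve())
    if db_file.exists() and not _is_current(db_file, root_abs):
        # Remove the DB together with its WAL sidecar files.
        for suffix in ("", "-wal", "-shm"):
            Path(str(db_file) + suffix).unlink(missing_ok=True)
    db_file.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(db_file)
    # Hypothetical metadata table standing in for the real schema.
    conn.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
    conn.executemany(
        "INSERT OR REPLACE INTO meta VALUES (?, ?)",
        [("schema_version", str(SCHEMA_VERSION)), ("library_root_abs", root_abs)],
    )
    conn.commit()
    return conn
```

Deleting the file rather than migrating it keeps the cache logic simple: the index can always be rebuilt from the filesystem.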
Source code in src/musickit/library/db.py
is_empty(conn)
¶
True when the DB has no album rows yet (fresh schema, never scanned).
load_or_scan(root, *, use_cache=True, force=False, on_album=None, measure_pictures=False)
¶
Return a LibraryIndex for root, using the on-disk cache when available.
use_cache=False skips the DB entirely (in-memory scan + audit). Used
when .musickit/ cannot be created (read-only mount) or when the
caller passes --no-cache.
force=True ignores any existing cache and runs a full rescan,
rewriting every row. Maps to the --full-rescan CLI flag and the
startScan Subsonic endpoint.
Without force, a warm cache is loaded and a validate() pass
reconciles the DB against any filesystem-level adds/removes/tag-edits
that happened while no watcher was running.
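The three paths described above reduce to a small decision function. This sketch only names the branch that would run; choose_strategy and its db_is_empty parameter are hypothetical, and the cold-start branch follows the scan_full description on this page.

```python
def choose_strategy(*, use_cache: bool, force: bool, db_is_empty: bool) -> str:
    """Name the code path a load_or_scan-style function would take."""
    if not use_cache:
        return "in-memory scan + audit"  # the DB is never touched
    if force or db_is_empty:
        return "scan_full"  # rewrite every row
    return "load + validate"  # warm cache, reconciled against the filesystem
```

Note that use_cache=False wins over force=True in this sketch, because there is no cache to rewrite.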
Source code in src/musickit/library/load.py
scan_full(root, conn, *, on_album=None, measure_pictures=False)
¶
Walk root, audit, and write the full result to the index DB.
Used on cold start when the DB is empty and after a startScan. Wipes
every album/track/genre/warning row first so the DB matches the
filesystem exactly. Returns the same LibraryIndex that callers used
to get from scan() + audit().
Source code in src/musickit/library/scan.py
validate(root, conn, *, measure_pictures=False, on_album=None)
¶
Diff the filesystem against DB rows and apply add/remove/update deltas.
Catches changes that happened while no serve/watcher was running:
new albums dropped in, removed albums, tag edits applied with another
tool. Each affected album is re-scanned in full and re-audited; rows
for vanished albums are dropped.
Returns a ValidationResult so callers can log a one-line summary.
Source code in src/musickit/library/scan.py
rescan_albums(root, conn, album_dirs, *, measure_pictures=False, on_album=None, _db_album_dirs=None)
¶
Re-scan + re-audit each album dir; drop rows for any that vanished.
Reusable by the cold-start validate() pass and (in PR 2) the
filesystem watcher. The DB is updated under one transaction so a
crash mid-rescan can't half-apply changes.
Source code in src/musickit/library/scan.py
musickit.serve¶
FastAPI factory + auth + config for the Subsonic-compatible HTTP server.
serve
¶
Subsonic-compatible HTTP server for the converted musickit library.
musickit serve [DIR] launches a FastAPI app that exposes the library
via the Subsonic API (v1.16.1). Any Subsonic client (Symfonium, play:Sub,
Feishin, DSub, etc.) can browse, search, and stream from it.
Classes¶
ServeConfig
¶
Functions¶
create_app(*, root, cfg, use_cache=True)
¶
Build the FastAPI app for root with the given credentials.
use_cache=False disables the persistent <root>/.musickit/index.db
and falls back to in-memory scan on every rebuild.
Source code in src/musickit/serve/app.py
resolve_credentials(*, cli_user, cli_password)
¶
CLI flags win over the TOML. Falls back to admin/admin when nothing is set.
Returns (cfg, used_defaults) so the caller can warn the user when the
insecure defaults are in play.
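The precedence described above (CLI, then TOML, then admin/admin) can be sketched as below. Several details are assumptions: CredsSketch and resolve_sketch are hypothetical, the TOML values are passed in as parameters instead of being read from a file, and used_defaults is set whenever either field falls back, which the docstring does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CredsSketch:  # stand-in for ServeConfig
    username: str
    password: str


def resolve_sketch(
    cli_user: Optional[str],
    cli_password: Optional[str],
    toml_user: Optional[str] = None,
    toml_password: Optional[str] = None,
) -> tuple[CredsSketch, bool]:
    """Return (credentials, used_defaults) with CLI values taking precedence."""
    user = cli_user or toml_user
    password = cli_password or toml_password
    # Assumption: flag the result when either field falls back to "admin".
    used_defaults = user is None or password is None
    return CredsSketch(user or "admin", password or "admin"), used_defaults
```

A caller would typically print a warning when the second element is True, since the server is then reachable with well-known credentials.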
Source code in src/musickit/serve/config.py
musickit.naming¶
Filesystem-safe folder + filename builders.
naming
¶
Filesystem-safe name building for artist / album / track output paths.
Attributes¶
VARIOUS_ARTISTS = 'Various Artists'
module-attribute
¶
Functions¶
artist_folder(album_artist, fallback_artist, *, is_compilation=False)
¶
Folder name for the artist level. Maps VA / compilation albums to Various Artists.
Three triggers route to the canonical Various Artists folder:
- album_artist tag is a VA alias (VA, V.A., Various, …)
- fallback_artist (per-track majority) is itself a VA alias — some rips
stamp VA as the per-track artist and leave album_artist empty
- is_compilation is True (album-level signal: distinct per-track artists
with no shared album_artist tag, e.g. an MP3 mix labelled only by
filename)
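The three triggers above can be sketched as a single condition. The alias set, the "Unknown Artist" fallback, and the function names are assumptions for illustration; the real function presumably also sanitizes the result for the filesystem.

```python
from typing import Optional

VARIOUS_ARTISTS = "Various Artists"
# Assumed alias set; the docstring lists VA, V.A., Various and leaves the rest open.
_VA_ALIASES = {"va", "v.a.", "various", "various artists"}


def _is_va_alias(name: Optional[str]) -> bool:
    return bool(name) and name.strip().lower() in _VA_ALIASES


def artist_folder_sketch(
    album_artist: Optional[str],
    fallback_artist: Optional[str],
    *,
    is_compilation: bool = False,
) -> str:
    """Route VA-like albums to the canonical folder, else use the artist name."""
    if _is_va_alias(album_artist) or _is_va_alias(fallback_artist) or is_compilation:
        return VARIOUS_ARTISTS
    # "Unknown Artist" is a hypothetical last resort, not taken from the source.
    return album_artist or fallback_artist or "Unknown Artist"
```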
Source code in src/musickit/naming.py
album_folder(album, year)
¶
Folder name for the album level.
Format: YYYY - Album so directory listings inside an artist folder sort
chronologically. Year is omitted if unknown, falling back to just Album.
A year that's part of the album title (e.g. Vocal Trance Hits 2024,
Taylor Swift's 1989) is intentionally left in place — it's the actual
title.
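The format rule above is small enough to show in full. A minimal sketch, assuming a hypothetical album_folder_sketch that takes the year as a string and skips the filesystem sanitizing the real function presumably applies:

```python
from typing import Optional


def album_folder_sketch(album: str, year: Optional[str]) -> str:
    """Prefix the year when known; never touch digits inside the title."""
    return f"{year} - {album}" if year else album
```

For Taylor Swift's 1989, released in 2014, this gives "2014 - 1989": the title keeps its own year and the release year only appears as the prefix.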
Source code in src/musickit/naming.py
track_filename(track_no, title, *, artist=None, disc_no=None, disc_total=None, track_total=None, extension='.m4a')
¶
Output filename for a single track.
Default format: 01 - Title<ext>. When the album spans multiple discs
(disc_total > 1), the disc number is prefixed: 01-01 - Title<ext>.
When artist is provided (typically only for compilations / VA albums)
it is inserted between track number and title: 01-05 - Artist - Title<ext>.
Track-number width grows with track_total so albums with ≥100 tracks
sort alphabetically correctly: a 100-track album yields 001, 002,
010, 099, 100 instead of breaking at the 2/3-digit boundary.
Disc-number width is fixed at 2 (no realistic disc count needs more).
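The width and prefix rules above can be sketched as follows. track_filename_sketch mirrors the documented signature but is only an approximation: it assumes integer track and disc numbers and does not sanitize the title.

```python
from typing import Optional


def track_filename_sketch(
    track_no: int,
    title: str,
    *,
    artist: Optional[str] = None,
    disc_no: Optional[int] = None,
    disc_total: Optional[int] = None,
    track_total: Optional[int] = None,
    extension: str = ".m4a",
) -> str:
    """Build 'NN - Title.ext', with optional disc prefix and artist."""
    # Track width is at least 2 and grows with the digit count of track_total.
    width = max(2, len(str(track_total))) if track_total else 2
    prefix = f"{track_no:0{width}d}"
    if disc_no is not None and disc_total is not None and disc_total > 1:
        prefix = f"{disc_no:02d}-{prefix}"  # disc width is fixed at 2
    parts = [prefix]
    if artist:
        parts.append(artist)
    parts.append(title)
    return " - ".join(parts) + extension
```

Calling it with track 5 on disc 1 of 2 and artist="Artist" yields "01-05 - Artist - Title.m4a", and track 7 of a 100-track album is numbered 007.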
Source code in src/musickit/naming.py
clean_folder_album_name(name)
¶
Strip codec/quality tags + edition annotations + extract year.
Returns (cleaned_album_name, year_or_None). Used as a fallback when an
album has no ALBUM tag and we have to lean on the folder name.
Strip order
- Edition annotations ((Deluxe Edition), [Remastered], (2018 Reissue), (40th Anniversary Edition)). These would otherwise leak into _FOLDER_YEAR_RE and pollute the year pick.
- Year extraction from any remaining (YYYY) / bare digits.
- Codec/quality tags ([FLAC], [16Bit-44.1kHz]).
- VA - / Various - prefix.
Live annotations ((Live), (Live in Madrid 2019)) are kept — a live
album is a distinct work from its studio counterpart and the audience
expects to see it labeled.
Source code in src/musickit/naming.py
leading_year_from_folder(name)
¶
Return the 4-digit year iff name starts with one followed by a separator.
Used by the convert pipeline to override reissue years that survive in
track tags when the input dir is hand-named with the original year (e.g.
1983. Album! [2018 Reissue] should yield 1983, not 2018).
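A regular expression covers the rule above. The sketch makes two assumptions: the set of accepted separators (whitespace, dot, underscore, hyphen) and the string return type are guesses, and leading_year_sketch is a hypothetical name.

```python
import re
from typing import Optional

# Four digits at the very start, followed by a separator character.
# The separator set is an assumption, not taken from the source.
_LEADING_YEAR = re.compile(r"^(\d{4})(?=[\s._\-])")


def leading_year_sketch(name: str) -> Optional[str]:
    """Return the leading 4-digit year, or None when the name does not start with one."""
    match = _LEADING_YEAR.match(name)
    return match.group(1) if match else None
```

The lookahead requires a separator, so a folder named just 1989, or one starting with a five-digit number, yields None.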
Source code in src/musickit/naming.py
is_various_artists(album_artist)
¶
Return True if album_artist indicates a Various-Artists compilation.
sanitize_component(value)
¶
Make value safe to use as a single path component on any OS.
Replaces forbidden characters, collapses whitespace, NFC-normalizes unicode, strips trailing dots/spaces, and caps the encoded length at 180 bytes.
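The steps listed above can be sketched with the standard library. The forbidden-character set and the underscore replacement are assumptions, and sanitize_sketch is a hypothetical name; only the 180-byte cap and the list of steps come from the description.

```python
import re
import unicodedata

# Characters rejected by at least one common filesystem (assumed set).
_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_sketch(value: str, max_bytes: int = 180) -> str:
    """Make value usable as a single path component."""
    value = unicodedata.normalize("NFC", value)
    value = re.sub(r"\s+", " ", value).strip()  # collapse whitespace
    value = _FORBIDDEN.sub("_", value)  # replacement character is an assumption
    # Cap the UTF-8 length; errors="ignore" drops a character cut in half.
    value = value.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")
    return value.rstrip(". ")  # trailing dots and spaces are stripped last
```

Truncating on encoded bytes rather than characters matters because filesystem name limits are usually expressed in bytes, and non-ASCII titles use more than one byte per character.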
Source code in src/musickit/naming.py
musickit.cover¶
Cover-art candidates, picker, normaliser.
cover
¶
Locate, normalize, and embed album cover art.
Attributes¶
DEFAULT_MAX_EDGE = 1000
module-attribute
¶
Classes¶
CoverCandidate
¶
Bases: BaseModel
A candidate cover image, before normalization.
Source code in src/musickit/cover.py
CoverSource
¶
Bases: str, Enum
Where the cover came from. Used for reporting + tie-breaking under --enrich.
Source code in src/musickit/cover.py
Cover
¶
Bases: BaseModel
A normalized album cover ready to embed into every track of an album.
Source code in src/musickit/cover.py
Functions¶
collect_candidates(album_dir, tracks)
¶
Gather every plausible cover candidate for an album (offline only).
Source code in src/musickit/cover.py
pick_best(candidates)
¶
Pick the highest-quality candidate.
"Quality" = pixel area first, then file size, then source order
(online > folder > embedded). The source-order tiebreaker matters under
--enrich so that an online provider returning the same dimensions as
a 600×600 scanned folder.jpg still wins.
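The ordering above maps naturally onto a tuple sort key. In this sketch CandidateSketch and its fields are hypothetical stand-ins for CoverCandidate, and the source names are plain strings rather than CoverSource members.

```python
from dataclasses import dataclass
from typing import Optional

# Higher rank wins ties: online > folder > embedded, as stated above.
_SOURCE_RANK = {"embedded": 0, "folder": 1, "online": 2}


@dataclass
class CandidateSketch:
    source: str
    width: int
    height: int
    size_bytes: int


def pick_best_sketch(candidates: list[CandidateSketch]) -> Optional[CandidateSketch]:
    """Pick by pixel area, then file size, then source rank."""
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda c: (c.width * c.height, c.size_bytes, _SOURCE_RANK[c.source]),
    )
```

Because tuples compare element by element, the source rank only matters when both the area and the file size are equal.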
Source code in src/musickit/cover.py
normalize(candidate, *, max_edge=DEFAULT_MAX_EDGE)
¶
Decode + recompress the chosen candidate.
Output is JPEG ≤ max_edge px on the long side, RGB, quality 92 — except
for PNGs that already fit, which are passed through unchanged.
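The sizing and pass-through decision above can be shown without decoding any image. This sketch covers only the arithmetic: plan_normalize is a hypothetical helper, the action labels are invented for illustration, and the actual resize and JPEG encoding are left out.

```python
def plan_normalize(
    width: int, height: int, fmt: str, max_edge: int = 1000
) -> tuple[str, tuple[int, int]]:
    """Return (action, output_size) for an image of the given size and format."""
    fits = max(width, height) <= max_edge
    if fmt.upper() == "PNG" and fits:
        return ("passthrough", (width, height))  # already-fitting PNGs are kept as-is
    if fits:
        return ("reencode-jpeg", (width, height))  # no upscaling
    # Scale so the long side equals max_edge, preserving the aspect ratio.
    scale = max_edge / max(width, height)
    return ("reencode-jpeg", (max(1, round(width * scale)), max(1, round(height * scale))))
```

A 3000×1500 JPEG would be planned as a re-encode at 1000×500, while a 600×600 PNG is passed through untouched.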