Searching the music collection¶
The music collection supports a flexible, tag-based search language that can return artists, albums, and tracks in a single request. Search results are grouped, scored, highlighted, and designed for lazy navigation via follow-up queries.
Searching is implemented in two layers:
- Core search engine – MusicCollection.search_grouped(...) (in reader.py)
- UI-facing API – MusicCollectionUI.search_highlighting(...) (in ui.py)
Most applications should call the UI-facing method.
🚀 Entry points¶
UI-facing search (recommended)¶
Returns a single list of result objects, ready for rendering in a UI.
Each result object has a type field (artist, album, or track) and includes highlighted text and navigation metadata.
Core grouped search (lower-level)¶
Returns:
- A dictionary with three lists: artists, albums, tracks
- The parsed search terms, grouped by category
This method is primarily used internally by the UI layer.
🔎 Query language¶
The query parser recognizes tagged terms and general free-text terms.
Supported tags¶
| Tag | Example | Meaning |
|---|---|---|
| artist: | artist:Prince | Restrict results to a specific artist |
| album: | album:"Purple Rain" | Restrict results to a specific album |
| track: / song: | track:"When Doves Cry" | Restrict results to track titles |
Free-text terms¶
Any term not prefixed by a tag is treated as free-text and matched against:
- artist
- album
- track title
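Example: the query prince rain contains no tags, so both words are matched against artist names, album titles, and track titles.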
Quoting and escaping¶
- Single or double quotes allow multi-word values
- Backslashes can escape special characters inside quoted values
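The tag, quoting, and escaping rules above can be illustrated with a small tokenizer. This is a minimal sketch and not the library's parser: the function name `tokenize_query`, the regular expression, and the `general` label for free-text terms are assumptions made for illustration.

```python
from __future__ import annotations

import re

# Tags from the "Supported tags" table; "song" is treated as an alias of "track".
TAG_ALIASES = {"artist": "artist", "album": "album", "track": "track", "song": "track"}

# One token is an optional "tag:" prefix followed by a quoted or bare value.
TOKEN_RE = re.compile(
    r"""(?:(?P<tag>[A-Za-z_]+):)?       # optional tag prefix
        (?:"(?P<dq>(?:\\.|[^"\\])*)"    # double-quoted value, backslash escapes allowed
          |'(?P<sq>(?:\\.|[^'\\])*)'    # single-quoted value, backslash escapes allowed
          |(?P<bare>[^\s"']+))          # unquoted single word
    """,
    re.VERBOSE,
)


def tokenize_query(query: str) -> list[tuple[str, str]]:
    """Split a query into (category, value) pairs in the order they appear."""
    tokens = []
    for match in TOKEN_RE.finditer(query):
        raw_tag = match.group("tag")
        tag = TAG_ALIASES.get(raw_tag.lower()) if raw_tag else None
        if raw_tag and tag is None:
            # Unknown prefix such as "foo:bar": keep the whole token as free text.
            tokens.append(("general", match.group(0)))
            continue
        quoted = match.group("dq") if match.group("dq") is not None else match.group("sq")
        if quoted is not None:
            value = re.sub(r"\\(.)", r"\1", quoted)  # drop the escaping backslashes
        else:
            value = match.group("bare")
        tokens.append((tag or "general", value))
    return tokens
```

For example, `tokenize_query('artist:Prince album:"Purple Rain" doves')` yields one `artist` term, one `album` term, and one free-text term.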
⚙️ Parsed term structure¶
The query is normalized into a dictionary of parsed terms, grouped by category.
This structure is returned alongside the search results and reused for:
- scoring
- highlighting
- UI explanation ("why did this match?")
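This page does not show the exact layout of the parsed-term dictionary, so the following is only a sketch of one plausible shape. The key names (`artist`, `album`, `track`, `general`), the case folding, and the helper name `group_terms` are all assumptions.

```python
from __future__ import annotations

# Hypothetical category names; the real dictionary keys may differ.
CATEGORIES = ("artist", "album", "track", "general")


def group_terms(tokens: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (category, value) pairs into one list of normalized terms per category."""
    grouped: dict[str, list[str]] = {category: [] for category in CATEGORIES}
    for category, value in tokens:
        # Collapse runs of whitespace and fold case so matching can be case-insensitive.
        normalized = " ".join(value.split()).casefold()
        if normalized and normalized not in grouped[category]:
            grouped[category].append(normalized)
    return grouped
```

Keeping one list per category lets scoring, highlighting, and match explanations read the same structure without parsing the query again.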
⚡ Search execution model¶
Pass-one candidate collection¶
The search engine performs a first pass to collect candidates:
- Artists are scored by how well they match artist terms
- Albums are scored by album name and artist context
- Tracks are scored by title and tag matches
The engine may reuse candidates from the previous search session if the new query is a refinement (e.g. clicking an artist).
Scoring (simplified)¶
Matches are weighted using:
- Exact matches
- Prefix matches
- Substring matches
- Tag bonuses (explicit artist:, album:, track:)
This produces ranked candidate sets for artists, albums, and tracks.
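The weighting idea can be sketched as a single scoring function. The numeric weights and the name `match_score` are illustrative assumptions; the page only states that exact, prefix, and substring matches are weighted differently and that explicit tags earn a bonus.

```python
def match_score(term: str, candidate: str, tagged: bool = False) -> float:
    """Score one search term against one candidate name (higher is better)."""
    term, candidate = term.casefold(), candidate.casefold()
    if not term:
        return 0.0
    if candidate == term:
        score = 1.0          # exact match
    elif candidate.startswith(term):
        score = 0.7          # prefix match
    elif term in candidate:
        score = 0.4          # substring match
    else:
        return 0.0           # no match, so no tag bonus either
    # Hypothetical bonus for terms given with an explicit tag such as artist:.
    return score + (0.5 if tagged else 0.0)
```

Sorting candidates by this score in descending order gives a ranked list per category.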
Performance optimization¶
To minimize database overhead, the search engine batches related queries:
- Album counts for all matched artists are fetched in a single query
- Track counts for all matched albums are fetched in a single query
- Compilation status for all matched albums is checked in a single query
This reduces the number of database round trips from O(n+m) to O(1), where n is the number of artists and m is the number of albums.
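Batching of this kind is usually done with a single `IN (...)` query grouped by the key. The sketch below assumes a hypothetical `tracks(artist, album, title)` table, which may not match the real schema, and shows the album count for several artists fetched in one round trip.

```python
from __future__ import annotations

import sqlite3


def album_counts(conn: sqlite3.Connection, artists: list[str]) -> dict[str, int]:
    """Return {artist: number of distinct albums} using one query for all artists."""
    if not artists:
        return {}
    # One "?" placeholder per artist keeps the query parameterized.
    placeholders = ",".join("?" for _ in artists)
    rows = conn.execute(
        "SELECT artist, COUNT(DISTINCT album) FROM tracks "
        f"WHERE artist IN ({placeholders}) GROUP BY artist",
        list(artists),
    )
    return dict(rows.fetchall())
```

Artists without any rows are simply absent from the result, so callers may want to use `counts.get(artist, 0)`.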
🏘️ Result grouping and hierarchy¶
After scoring, results are assembled into a hierarchical structure:
- Artists
- Albums
- Tracks
The engine decides which groups to include based on the query:
| Query type | Included sections |
|---|---|
| Free-text only | Artists, albums, and tracks |
| artist: present | Artists + related albums/tracks |
| album: present | Albums + tracks |
| track: present | Tracks only |
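The table can be read as a small decision function over the parsed terms. In this sketch the dictionary keys and the function name `included_sections` are assumptions, and so is the precedence rule when several tags appear in one query (the most specific tag wins here); the table itself does not say how combined tags are handled.

```python
from __future__ import annotations


def included_sections(terms: dict[str, list[str]]) -> list[str]:
    """Pick the result sections to return for a parsed query."""
    if terms.get("track"):
        return ["tracks"]
    if terms.get("album"):
        return ["albums", "tracks"]
    # An artist: tag and a free-text-only query both return all three sections;
    # with artist: the albums and tracks are the ones related to that artist.
    return ["artists", "albums", "tracks"]
```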
🎯 UI result model¶
The UI layer converts grouped results into a single flat list of result objects via the search_highlighting method.
Each object has a type field and a shape appropriate for rendering.
Artist results¶
{
"type": "artist",
"artist": "<mark>Prince</mark>",
"raw_artist": "Prince",
"reasons": [
{ "type": "album", "text": "3 album(s)" },
{ "type": "track", "text": "12 nummer(s)" }
],
"load_on_demand": true,
"clickable": true,
"click_query": "artist:'Prince'"
}
Characteristics:
- Summary only (no albums or tracks included)
- Always lazy-loaded
- Clicking triggers a new search using click_query
Album results¶
{
"type": "album",
"artist": "Prince",
"album": "<mark>Purple Rain</mark>",
"is_compilation": false,
"cover": "covers/prince_purplerain.jpg",
"reasons": [
{ "type": "track", "text": "5 nummer(s)" }
],
"load_on_demand": true,
"clickable": true,
"click_query": "release_dir:'/Prince/Purple Rain'"
}
Characteristics:
- Summary only
- Tracks are loaded on demand
- Albums with tracks by more than three artists are shown as "Various Artists"
- Includes a cover field with a relative URL to the cached cover image
Track results¶
{
"type": "track",
"artist": "Prince",
"album": "Purple Rain",
"track": "<mark>When Doves Cry</mark>",
"duration": "5:54",
"path": "Prince/Purple Rain/01 - When Doves Cry.flac",
"cover": "covers/prince_purplerain.jpg",
"artist_click_query": "artist:'Prince'",
"album_click_query": "album:'Purple Rain'"
}
Characteristics:
- Fully populated (no lazy loading)
- Includes navigation queries for artist and album
- Includes a cover field with a relative URL to the cached cover image
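Because every result carries a type field, a client can dispatch on it when rendering. The following sketch turns the three documented shapes into one-line text summaries; the function name `render_result` and the output format are illustrative choices, not part of the library.

```python
def render_result(result: dict) -> str:
    """Build a one-line summary for an artist, album, or track result."""
    kind = result["type"]
    if kind == "artist":
        reasons = ", ".join(reason["text"] for reason in result.get("reasons", []))
        suffix = f" ({reasons})" if reasons else ""
        return f"[artist] {result['artist']}{suffix}"
    if kind == "album":
        return f"[album] {result['artist']} - {result['album']}"
    if kind == "track":
        return f"[track] {result['artist']} - {result['track']} ({result['duration']})"
    raise ValueError(f"unknown result type: {kind!r}")
```

Note that highlighted fields already contain `<mark>` markup, so an HTML client would insert them as markup rather than escape them again.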
🖼️ Cover art management¶
The music collection provides automatic cover art extraction, caching, and serving with support for size-optimized variants for responsive clients like Android Auto.
Basic cover retrieval¶
# Get cover URL for a release directory
cover_url = mc.get_cover("Artist/Album")
# Returns: "covers/artist_album.jpg" or "covers/_fallback.jpg"
Behavior:
- Searches for common cover image files (cover.jpg, folder.jpg, etc.)
- Extracts embedded artwork from audio files if no standalone image found
- Optimizes images to max 800×800px, 85% quality, ≤500KB
- Caches extracted covers in DATA_ROOT/cache/covers/
- Returns fallback image if no cover found
Size-optimized cover variants¶
For bandwidth-conscious applications (mobile, Android Auto), request specific sizes:
# Get multiple size variants
cover_sizes = mc.get_cover_sizes("Artist/Album")
# Returns:
# {
# "96x96": "covers/artist_album_96x96.jpg",
# "128x128": "covers/artist_album_128x128.jpg",
# "192x192": "covers/artist_album_192x192.jpg",
# "256x256": "covers/artist_album_256x256.jpg",
# "384x384": "covers/artist_album_384x384.jpg",
# "512x512": "covers/artist_album_512x512.jpg"
# }
Behavior:
- Generates size variants on-demand (lazy generation)
- Caches variants permanently for future requests
- Falls back to main cover if variant generation fails
- Returns fallback URLs for all sizes if no cover found
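The lazy-generation behavior above can be sketched without any imaging library by passing the resize step in as a callable. Everything here is an assumption for illustration: the function name `cover_variants`, the `generate(src, dst, size)` callback, and the way the cache directory is laid out on disk.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable

STANDARD_SIZES = (96, 128, 192, 256, 384, 512)
FALLBACK_URL = "covers/_fallback.jpg"


def cover_variants(
    cache_dir: Path,
    slug: str,
    generate: Callable[[Path, Path, int], None],
) -> dict[str, str]:
    """Return {"96x96": url, ...}, creating missing variants on first request."""
    main = cache_dir / f"{slug}.jpg"
    if not main.exists():
        # No cover at all: every size points at the fallback image.
        return {f"{s}x{s}": FALLBACK_URL for s in STANDARD_SIZES}
    urls = {}
    for size in STANDARD_SIZES:
        key = f"{size}x{size}"
        variant = cache_dir / f"{slug}_{key}.jpg"
        if not variant.exists():
            try:
                generate(main, variant, size)  # hypothetical resize callback
            except Exception:
                pass  # generation failed; handled by the check below
        if not variant.exists():
            variant = main  # fall back to the main cover
        urls[key] = f"covers/{variant.name}"
    return urls
```

A second call for the same release finds the cached files and skips generation, which is what makes the first-request cost a one-time cost.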
Standard sizes:
| Size | Use case | Typical file size |
|---|---|---|
| 96×96 | Thumbnails, lists | 5-8 KB |
| 128×128 | Small tiles | 8-12 KB |
| 192×192 | Medium tiles | 15-20 KB |
| 256×256 | Android Auto (optimal) | 30-50 KB |
| 384×384 | High-DPI displays | 60-90 KB |
| 512×512 | Full-screen player | 100-150 KB |
Flask API endpoints¶
Two routes are available for serving cover images:
Direct file serving (existing):
Serves cached cover files directly. Used by existing UI code.
Size-parameterized API (new):
Serves size-specific cover variants. Generates on-demand if needed.
Example usage:
# Android Auto - request optimal size
GET /api/covers/Artist%2FAlbum?size=256x256
# Without size parameter - returns main cover
GET /api/covers/Artist%2FAlbum
# Invalid size - returns error JSON
GET /api/covers/Artist%2FAlbum?size=999x999
# {"error": "Invalid size parameter", "valid_sizes": [...]}
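A client needs to percent-encode the release directory, including the slash, before placing it in the URL path. The sketch below builds such URLs and mirrors the size check on the server side. The helper names and the HTTP 400 status are assumptions; the page only states that an invalid size returns error JSON.

```python
from __future__ import annotations

from urllib.parse import quote

VALID_SIZES = ("96x96", "128x128", "192x192", "256x256", "384x384", "512x512")


def cover_api_url(release_dir: str, size: str | None = None) -> str:
    """Build the size-parameterized cover URL for a release directory."""
    # safe="" makes quote() encode "/" as %2F so the directory stays one path segment.
    url = "/api/covers/" + quote(release_dir, safe="")
    return f"{url}?size={size}" if size else url


def check_size(size: str | None) -> tuple[int, dict]:
    """Return (status, error payload); the payload is empty when the size is acceptable."""
    if size is None or size in VALID_SIZES:
        return 200, {}
    # The 400 status code is an assumption; the payload shape follows the example above.
    return 400, {"error": "Invalid size parameter", "valid_sizes": list(VALID_SIZES)}
```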
Cover extraction details¶
The extraction process follows this priority:
- Common image files in release directory:
    - cover.jpg, folder.jpg, album.jpg, front.jpg
    - cover.png, folder.png
- Embedded artwork from audio files:
    - Extracted from first audio file with embedded art
    - Supports all formats handled by TinyTag
- Optimization:
    - Converts all images to RGB JPEG
    - Resizes to max 800×800px (maintains aspect ratio)
    - Compresses to 85% quality initially
    - Reduces quality iteratively if file >500KB
    - Handles transparency by compositing on white background
- Caching:
    - Sanitizes release directory to safe filename slug
    - Stores in DATA_ROOT/cache/covers/{slug}.jpg
    - Size variants stored as {slug}_{size}x{size}.jpg
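The caching step depends on turning a release directory into a safe filename. The exact sanitizing rule is not documented here; the sketch below is one rule that happens to reproduce the examples on this page ("Artist/Album" becomes artist_album, "Prince/Purple Rain" becomes prince_purplerain). The function names are hypothetical.

```python
from __future__ import annotations

import re


def cover_slug(release_dir: str) -> str:
    """Lowercase each path segment, drop non-alphanumerics, join segments with "_"."""
    parts = [
        re.sub(r"[^a-z0-9]+", "", part.lower())
        for part in release_dir.strip("/").split("/")
    ]
    return "_".join(part for part in parts if part)


def cover_filename(release_dir: str, size: int | None = None) -> str:
    """Return the cache filename for the main cover or for one square variant."""
    slug = cover_slug(release_dir)
    return f"{slug}.jpg" if size is None else f"{slug}_{size}x{size}.jpg"
```

A rule this simple drops non-ASCII letters entirely, so two different directories could map to the same slug; a real implementation would need to handle that case.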
Performance characteristics¶
Storage impact:
- Main cover: 300-500 KB per album
- All 6 size variants: roughly 220-330 KB total per album (the sum of the typical file sizes listed above)
- Lazy generation: variants only created when requested
Bandwidth savings:
| Client | Original (800px) | Optimized | Savings |
|---|---|---|---|
| Android Auto | 300-500 KB | 30-50 KB (256×256) | ~90% |
| Mobile web | 300-500 KB | 15-25 KB (128×128) | ~95% |
| List thumbnails | 300-500 KB | 5-8 KB (96×96) | ~98% |
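The savings column follows from the ratio between the optimized and the original file size. As a quick check, the helper below recomputes the percentages; the midpoints used in the usage note (400 KB, 40 KB, and so on) are chosen here for illustration and are not measurements.

```python
def savings_percent(original_kb: float, optimized_kb: float) -> float:
    """Percentage of bandwidth saved by serving the optimized file instead."""
    return round(100 * (original_kb - optimized_kb) / original_kb, 1)
```

With a 400 KB original, a 40 KB variant saves 90%, a 20 KB variant saves 95%, and a 6.5 KB thumbnail saves about 98%, which matches the table.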
Generation performance:
- First request with variants: +100-200ms (one-time cost)
- Subsequent requests: no generation cost (served from cache)
- Main cover extraction: typically <100ms
💤 Lazy loading¶
Artist and album results include a click_query field.
When the user clicks such a result:
- The UI issues a new search
- Using the stored click_query
- Which returns a more specific result set
This keeps the API stateless and avoids nested payloads.
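A click_query is an ordinary tagged query with a quoted value, so building one is mostly a matter of quoting. In this sketch the helper name `click_query` and the escaping convention (a backslash before embedded quotes and backslashes) are assumptions based on the quoting rules of the query language.

```python
def click_query(tag: str, value: str) -> str:
    """Build a follow-up query such as artist:'Prince' from a tag and a raw value."""
    # Escape backslashes first so the backslashes added for quotes are not doubled.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"{tag}:'{escaped}'"
```

Because the query string carries everything needed to repeat the search, the server does not have to remember which result the user clicked.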
✨ Highlighting¶
All matched terms are automatically highlighted:
- Implemented in _highlight_text(...)
- Case-insensitive
- Wrapped in <mark>...</mark>
Highlighting applies to:
- Artist names
- Album titles
- Track titles
This behavior is UI-specific and not part of the core search engine.
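Case-insensitive highlighting can be sketched with a single compiled pattern. This is not the body of _highlight_text; the function name `highlight` and the decision to HTML-escape the surrounding text are assumptions made so that the example produces safe markup.

```python
from __future__ import annotations

import html
import re


def highlight(text: str, terms: list[str]) -> str:
    """Wrap every case-insensitive occurrence of a term in <mark>...</mark>."""
    terms = [term for term in terms if term]
    if not terms:
        return html.escape(text)
    # Longest terms first so "purple rain" wins over "rain" at the same position.
    pattern = re.compile(
        "|".join(re.escape(term) for term in sorted(terms, key=len, reverse=True)),
        re.IGNORECASE,
    )
    pieces, position = [], 0
    for match in pattern.finditer(text):
        pieces.append(html.escape(text[position:match.start()]))
        pieces.append(f"<mark>{html.escape(match.group(0))}</mark>")
        position = match.end()
    pieces.append(html.escape(text[position:]))
    return "".join(pieces)
```

Escaping each piece separately, rather than escaping the whole string first, avoids accidentally matching a term inside an entity such as `&amp;`.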
💡 Match explanations¶
Each result may include a reasons list explaining why it matched:
- Matching artist name
- Number of matching albums
- Number of matching tracks
These are intended for UI hints, badges, or tooltips.
👀 Real‑time monitoring¶
MusicCollection.start_monitoring() creates a watchdog.observers.Observer that uses the EnhancedWatcher class (defined in src/musiclib/_watcher.py).
The enhanced watcher adds two important behaviours that differ from a naïve FileSystemEventHandler:
| Feature | What it does | Why it matters |
|---|---|---|
| Debounce delay (DEBOUNCE_DELAY = 2.0 s) | After the last change to a given file, the watcher waits 2 seconds before queuing an INDEX_FILE or DELETE_FILE event. | Prevents a burst of rapid edits (e.g., a tag‑editing batch) from generating many separate index operations, which could corrupt the DB. |
| Coalescing | Multiple created/modified events for the same path are merged into a single INDEX_FILE event; a later deleted event overrides any pending modified events. | Guarantees that the final state of the file is what gets indexed. |
| Graceful shutdown (shutdown() method) | Cancels all pending timers and flushes any remaining events to the write queue before the observer is stopped. | Ensures no file‑system changes are lost when the application exits. |
The rest of the monitoring flow (observer start/stop, queue → writer thread) remains exactly as described in the original diagram.
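The three behaviours in the table can be sketched with standard-library timers. This is not the EnhancedWatcher class and does not use watchdog: the class name `DebouncedEvents`, its method names, and the tuple format placed on the queue are assumptions for illustration.

```python
from __future__ import annotations

import queue
import threading

DEBOUNCE_DELAY = 2.0  # seconds, the value given in the table above


class DebouncedEvents:
    """Keep one pending event per path; emit it after a quiet period."""

    def __init__(self, out: queue.Queue, delay: float = DEBOUNCE_DELAY):
        self._out = out
        self._delay = delay
        self._lock = threading.Lock()
        self._pending: dict[str, tuple[str, threading.Timer]] = {}

    def on_change(self, path: str, kind: str) -> None:
        """Record a created/modified/deleted event and restart the path's timer."""
        event = "DELETE_FILE" if kind == "deleted" else "INDEX_FILE"
        with self._lock:
            previous = self._pending.pop(path, None)
            if previous:
                previous[1].cancel()  # coalesce: the newest event replaces the old one
            timer = threading.Timer(self._delay, self._fire, args=(path,))
            timer.daemon = True
            self._pending[path] = (event, timer)
            timer.start()

    def _fire(self, path: str) -> None:
        with self._lock:
            entry = self._pending.get(path)
            # Ignore a timer that was superseded while it was waiting for the lock.
            if entry is None or entry[1] is not threading.current_thread():
                return
            del self._pending[path]
        self._out.put((entry[0], path))

    def shutdown(self) -> None:
        """Cancel all timers and flush whatever is still pending."""
        with self._lock:
            pending, self._pending = self._pending, {}
        for path, (event, timer) in pending.items():
            timer.cancel()
            self._out.put((event, path))
```

Calling `shutdown()` before stopping the observer flushes every pending path exactly once, so a burst of edits followed by an immediate exit still reaches the write queue.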
📄 Summary¶
In short, searching works as follows:
- Parse the query into tagged and free-text terms
- Collect and score artist, album, and track candidates
- Build hierarchical grouped results
- Flatten results into UI-friendly objects
- Highlight matches and attach navigation queries
- Support lazy exploration through follow-up searches
Cover art management works as follows:
- Extract covers from release directories or embedded artwork
- Optimize and cache at 800×800px
- Generate size variants on-demand for bandwidth efficiency
- Serve via direct file URLs or size-parameterized API
- Fall back gracefully when covers unavailable
This design allows the UI to deliver a fast, expressive, and navigable search experience without embedding deep hierarchies in a single response, while efficiently serving cover art to clients with varying bandwidth and display requirements.
🔌 API¶
Only the following methods are considered stable public APIs:
MusicCollection.search_grouped, MusicCollectionUI.search_highlighting, MusicCollection.rebuild, MusicCollection.resync, MusicCollection.close, MusicCollection.get_collection_stats, MusicCollection.get_cover, MusicCollection.get_cover_sizes.
MusicCollection(music_root, db_path, logger=None)¶
Manages a music collection database and provides search and detail retrieval functionality. Handles lazy loading, background indexing, and query parsing for artists, albums, and tracks.
Initializes a MusicCollection backed by a SQLite database and music root directory. Sets up logging, collection extraction, and schedules any required initial indexing or resync operations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| music_root | Path \| str | The root directory containing the music files to be indexed. | required |
| db_path | Path \| str | The path to the SQLite database file used to store collection metadata. | required |
| logger | Logger | Optional logger instance for recording informational and error messages. | None |
Methods:
| Name | Description |
|---|---|
| rebuild | Triggers a full rebuild of the music collection database. |
| resync | Performs a resync of the music collection database. |
| close | Stops monitoring and closes resources associated with the music collection. |
| count | Returns the total number of tracks in the music collection database. |
| search_grouped | Searches the music collection and returns grouped results by artists, albums, and tracks. |
| get_artist_details | Retrieves detailed information about an artist, including their albums and tracks. |
| get_album_details | Retrieves detailed information about an album given its release directory. |
| get_track | Retrieves metadata for a single track identified by its path. |
| get_cover | Retrieves or generates the cover image path for a given album release directory. |
| get_cover_sizes | Returns URLs for multiple size variants of a cover image. |
| get_collection_stats | Returns high-level statistics about the music collection. |
Source code in src/musiclib/reader.py
rebuild()¶
Triggers a full rebuild of the music collection database. Rebuilds the database from scratch using the current music files.
Returns:
| Type | Description |
|---|---|
| None | None |
resync()¶
Performs a resync of the music collection database. Updates the database to reflect changes in the music files without a full rebuild.
Returns:
| Type | Description |
|---|---|
| None | None |
close()¶
Stops monitoring and closes resources associated with the music collection. Cleans up background tasks and releases any held resources.
Returns:
| Type | Description |
|---|---|
| None | None |
count()¶
Returns the total number of tracks in the music collection database. Executes a query to count all tracks currently indexed.
Returns:
| Name | Type | Description |
|---|---|---|
| int | int | The number of tracks in the database. |
Source code in src/musiclib/reader.py
search_grouped(query, limit=30)¶
Searches the music collection and returns grouped results by artists, albums, and tracks. Also returns the parsed terms for highlighting.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| query | str | The search query string. | required |
| limit | int | Maximum number of results to return per group. | 30 |
Returns:
| Type | Description |
|---|---|
| tuple[dict, dict] | A tuple of (grouped results, parsed terms). |
Source code in src/musiclib/reader.py
get_artist_details(artist)¶
Retrieves detailed information about an artist, including their albums and tracks. Returns a dictionary with the artist name and a list of albums containing track details.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| artist | str | The name of the artist to retrieve details for. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | Dictionary containing the artist name and a list of albums with track information. |
Source code in src/musiclib/reader.py
get_album_details(release_dir)¶
Retrieves detailed information about an album given its release directory. Returns a dictionary with album details, including artist, tracks, compilation status, and release directory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| release_dir | str | The release directory relative to the music root. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | Dictionary containing album details, track list, and compilation status. |
Source code in src/musiclib/reader.py
get_track(path)¶
Retrieves metadata for a single track identified by its path.
Queries the music library database for the track and returns a normalized metadata dictionary if found.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | The full or relative path of the track to look up. | required |
Returns:
| Type | Description |
|---|---|
| dict \| None | A dictionary of track metadata if the track exists, otherwise None. |
Source code in src/musiclib/reader.py
get_cover(release_dir)¶
Retrieves or generates the cover image path for a given album release directory. Returns a relative path to a cached or newly extracted cover image if available, or a fallback image path if no cover is found.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| release_dir | str | The release directory identifier used to locate or derive the cover image. | required |
Returns:
| Type | Description |
|---|---|
| str \| None | Relative path to the cover image within the covers directory, or fallback image path if not found. |
Source code in src/musiclib/reader.py
get_cover_sizes(release_dir)¶
Returns URLs for multiple size variants of a cover image. Generates size variants on-demand if not already cached.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| release_dir | str | The release directory identifier for which to retrieve cover variants. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, str] | Dictionary mapping size strings (e.g., "96x96") to relative URL paths. |
Source code in src/musiclib/reader.py
get_collection_stats()¶
Returns high-level statistics about the music collection.
This method queries the tracks table to compute aggregate counts of distinct artists, distinct artist/album combinations, total tracks, and the timestamp of the most recently added track.
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | A dictionary containing the counts of distinct artists, distinct artist/album combinations, and total tracks, plus the timestamp of the most recently added track. |
Source code in src/musiclib/reader.py
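The aggregates described for get_collection_stats can be computed in a single SQL statement. The sketch below is not the library's query: the column names (`artist`, `album`, `added_at`), the result keys, and the function name `collection_stats` are all assumptions.

```python
import sqlite3


def collection_stats(conn: sqlite3.Connection) -> dict:
    """Compute artist, album, and track counts plus the newest added timestamp."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT artist),
               (SELECT COUNT(*) FROM (SELECT DISTINCT artist, album FROM tracks)),
               COUNT(*),
               MAX(added_at)
        FROM tracks
        """
    ).fetchone()
    # Hypothetical key names; the real method may label these differently.
    return {
        "artists": row[0],
        "albums": row[1],
        "tracks": row[2],
        "last_added": row[3],
    }
```

Counting distinct artist/album pairs rather than distinct album titles keeps two different albums that share a title from being merged.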
MusicCollectionUI(music_root, db_path, logger=None)¶
Bases: MusicCollection
Extends MusicCollection to provide UI-specific search and highlighting features. Adds methods for formatting, escaping, and highlighting search results for user interfaces.
Initializes the MusicCollectionUI with the given music root, database path, and optional logger. Sets up the UI-specific extension of the music collection functionality.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| music_root | Path | Path to the root directory containing music files. | required |
| db_path | Path | Path to the SQLite database file. | required |
| logger | Logger | Optional logger instance. | None |
Returns:
| Type | Description |
|---|---|
| None | None |
Methods:
| Name | Description |
|---|---|
| search_highlighting | Performs a search and returns results with highlighted matching terms for UI display. |
Source code in src/musiclib/ui.py
search_highlighting(query, limit=30)¶
Performs a search and returns results with highlighted matching terms for UI display. Groups and highlights artists, albums, and tracks based on the search query and terms.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| query | str | The search query string. | required |
| limit | int | Maximum number of results to return. | 30 |
Returns:
| Type | Description |
|---|---|
| list[dict] | List of dictionaries containing highlighted search results for artists, albums, and tracks. |
