
Audio Caching System

The audio‑caching subsystem automatically converts large lossless audio files (FLAC, WAV, AIFF, …) into smaller MP3 streams, dramatically reducing bandwidth while preserving a pleasant listening experience.

TL;DR – The cache turns a 40 MB FLAC track into a ~5 MB MP3 (≈ 87 % bandwidth saving) and serves the MP3 via HTTP range requests.

📖 Overview

When streaming lossless audio over the web, bandwidth quickly becomes a bottleneck:

| Format | Approx. size (4-min track) | Typical bitrate |
|---|---|---|
| FLAC (original) | 40–50 MB | ~1,000 kbps |
| MP3 – High | ≈ 8 MB | 256 kbps |
| MP3 – Medium | ≈ 5 MB | 192 kbps |
| MP3 – Low | ≈ 3 MB | 128 kbps |

Result: A 4‑minute track drops from ~45 MB to ~5 MB (≈ 87 % bandwidth reduction) with negligible audible loss for casual listening.

The cache is transparent to the rest of the application:

  • The UI requests a track → the Flask route asks AudioCache for a cached version.
  • If a suitable MP3 exists, it is streamed via HTTP range requests.
  • If not, the original file is streamed (or a background job creates the cache for the next request).

🏛️ Architecture Overview

graph TD
    A["AudioCache (core)"] --> B["Cache Path Generation"]
    A --> C["Transcoding (ffmpeg)"]
    A --> D["Cache Management (size, cleanup)"]

    E[CacheWorker] --> A
    E --> F["ThreadPool (parallel batch)"]
    E --> G["ProgressTracker (SSE)"]

    H[ProgressTracker] --> I["Frontend (EventSource)"]
  • audio_cache.py – core logic (hash‑based filenames, transcoding, cache look‑ups).
  • cache_worker.py – batch processing, thread‑pool parallelism, progress callbacks.
  • progress_tracker.py – Server‑Sent Events (SSE) emitter that feeds the UI’s “caching progress” modal.

✨ Key Features

| Feature | Description |
|---|---|
| Automatic transcoding | FLAC, WAV, AIFF, APE, ALAC → MP3 (high/medium/low). |
| Multiple quality levels | high (256 kbps), medium (192 kbps), low (128 kbps). |
| Smart caching | Only creates a cached file when the source is lossless and the cache is missing or out-of-date. |
| Pre-caching on upload | When a mixtape is saved, the system can generate caches automatically. |
| Parallel batch processing | Thread pool (configurable workers) for fast bulk transcoding. |
| Progress tracking | Real-time SSE updates displayed in a Bootstrap modal. |
| Cache management utilities | Size calculation, age-based cleanup, full purge. |
| Config-driven | All knobs live in src/config/config.py (AUDIO_CACHE_*). |

📋 How It Works (Step‑by‑Step)

Cache Path Generation

flowchart LR
    A[Original file path] --> B[Normalize & resolve]
    B --> C[MD5 hash of full path]
    C --> D["Compose filename: <hash>_<quality>_<bitrate>.mp3"]
    D --> E["Cache directory (`AUDIO_CACHE_DIR`)"]
  • The hash guarantees collision‑free filenames, even for identically named tracks in different folders.
  • Example: the original /music/Radiohead/OK Computer/01 Airbag.flac hashes to a1b2c3…, yielding the cache file a1b2c3_medium_192k.mp3.
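The naming scheme can be reproduced in a few lines. The quality table here is an illustrative stand-in for the real QUALITY_SETTINGS dict in audio_cache.py, and the sketch skips the path-normalization step the production code performs before hashing:

```python
import hashlib
from pathlib import Path

# Illustrative stand-in for QUALITY_SETTINGS; the exact values are
# assumptions matching the bitrates documented above.
QUALITY_SETTINGS = {
    "high": {"bitrate": "256k", "format": "mp3"},
    "medium": {"bitrate": "192k", "format": "mp3"},
    "low": {"bitrate": "128k", "format": "mp3"},
}

def cache_filename(original_path: Path, quality: str = "medium") -> str:
    """Hash the full (already normalized) path into a collision-free name."""
    path_hash = hashlib.md5(str(original_path).encode()).hexdigest()
    s = QUALITY_SETTINGS[quality]
    return f"{path_hash}_{quality}_{s['bitrate']}.{s['format']}"

name = cache_filename(Path("/music/Radiohead/OK Computer/01 Airbag.flac"))
```

Because the hash covers the whole path, two tracks both named `01 Airbag.flac` in different folders never collide.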

Transcoding Flow

sequenceDiagram
    participant UI
    participant Flask
    participant CacheWorker
    participant AudioCache
    participant ffmpeg

    UI->>Flask: Request play (quality=medium)
    Flask->>AudioCache: get_cached_or_original()
    alt Cached version exists
        AudioCache-->>Flask: Return cached path
    else No cache
        AudioCache->>CacheWorker: transcode_file()
        CacheWorker->>ffmpeg: Run ffmpeg command
        ffmpeg-->>CacheWorker: MP3 file created
        CacheWorker->>AudioCache: Store in cache dir
        AudioCache-->>Flask: Return newly cached path
    end
    Flask->>UI: Stream MP3
  • If a cached file is present, it is served immediately.
  • Otherwise the worker spawns ffmpeg, writes the MP3, and returns the new path.
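Under the hood the worker shells out to ffmpeg; the argument list below mirrors the command built in transcode_file() (shown in full in the API reference):

```python
from pathlib import Path

def build_ffmpeg_cmd(src: Path, dst: Path, bitrate: str) -> list[str]:
    """Mirror of the ffmpeg invocation used by transcode_file()."""
    return [
        "ffmpeg",
        "-y",                    # overwrite any partial output
        "-i", str(src),
        "-vn",                   # drop any embedded video/cover stream
        "-ar", "44100",          # resample to 44.1 kHz
        "-ac", "2",              # downmix to stereo
        "-b:a", bitrate,         # target bitrate, e.g. "192k"
        "-map_metadata", "0",    # carry over tags
        "-id3v2_version", "3",   # ID3v2.3 for wide player support
        str(dst),
    ]

cmd = build_ffmpeg_cmd(Path("in.flac"), Path("out.mp3"), "192k")
```

Forcing ID3v2.3 (rather than the newer 2.4) keeps the metadata readable by older players and browsers.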

Playback Flow

graph LR
    A[User clicks Play] --> B{Quality selected?}
    B -->|Original| C[Serve original FLAC]
    B -->|High/Med/Low| D{Is source lossless?}
    D -->|No| C
    D -->|Yes| E{Cache exists?}
    E -->|Yes| F[Serve cached MP3]
    E -->|No| G[Log warning → fall back to original]
    F --> H[User streams small file]
    C --> I[User streams large file]
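The decision chart above reduces to a small pure function (a sketch; the real route also logs a warning on the cache-miss fallback):

```python
def resolve_playback(quality: str, is_lossless: bool, cache_exists: bool) -> str:
    """Pure-function mirror of the playback decision flowchart."""
    if quality == "original":
        return "original"
    if not is_lossless:       # already small (MP3/M4A/...): nothing to gain
        return "original"
    if cache_exists:
        return "cached_mp3"
    return "original"         # cache miss: fall back to the large file
```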

🔌 API Reference

AudioCache (core)

AudioCache(cache_dir, logger=None)

Manages audio file transcoding and caching for bandwidth optimization.

Provides methods to check for cached versions, generate transcoded files, and manage the cache directory.

Initialize the AudioCache manager.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `cache_dir` | `Path` | Directory where cached transcoded files will be stored. | *required* |
| `logger` | `Logger \| None` | Optional logger for tracking operations. | `None` |

Methods:

| Name | Description |
|---|---|
| `get_cache_path` | Generate a cache filename based on the original path and quality level. |
| `should_transcode` | Determine if a file should be transcoded based on its format. |
| `is_cached` | Check if a cached version exists and is up-to-date. |
| `transcode_file` | Transcode an audio file to a cached version. |
| `get_cached_or_original` | Get the cached version if available, otherwise return the original path. |
| `precache_file` | Pre-generate cached versions at multiple quality levels. |
| `get_cache_size` | Calculate the total size of the cache directory in bytes. |
| `clear_cache` | Clear cached files, optionally only those older than a given number of days. |

Source code in src/audio_cache/audio_cache.py
def __init__(self, cache_dir: Path, logger: Logger | None = None):
    """
    Initialize the AudioCache manager.

    Args:
        cache_dir: Directory where cached transcoded files will be stored.
        logger: Optional logger for tracking operations.
    """
    self.cache_dir = Path(cache_dir)
    self.cache_dir.mkdir(parents=True, exist_ok=True)
    self.logger = logger or NullLogger()
get_cache_path(original_path, quality='medium')

Generate a cache filename based on the original path and quality level.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `original_path` | `Path` | Path to the original audio file. | *required* |
| `quality` | `QualityLevel` | Quality level for transcoding (high, medium, low). | `'medium'` |

Returns:

| Type | Description |
|---|---|
| `Path` | Path to the cached file location. |

Source code in src/audio_cache/audio_cache.py
def get_cache_path(
    self, original_path: Path, quality: QualityLevel = "medium"
) -> Path:
    """
    Generate a cache filename based on the original path and quality level.

    Args:
        original_path: Path to the original audio file.
        quality: Quality level for transcoding (high, medium, low).

    Returns:
        Path to the cached file location.
    """
    if quality == "original":
        return original_path

    # Create unique hash from the normalized file path
    path_str = self._normalize_path(original_path)
    path_hash = hashlib.md5(path_str.encode()).hexdigest()

    # Get quality settings
    settings = QUALITY_SETTINGS[quality]
    bitrate = settings["bitrate"]
    ext = settings["format"]

    cache_filename = f"{path_hash}_{quality}_{bitrate}.{ext}"

    # Log for debugging
    self.logger.debug(
        f"Cache path generation: {original_path.name} -> {cache_filename} "
        f"(hash of: {path_str})"
    )

    return self.cache_dir / cache_filename
should_transcode(file_path)

Determine if a file should be transcoded based on its format.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `file_path` | `Path` | Path to the audio file. | *required* |

Returns:

| Type | Description |
|---|---|
| `bool` | True if the file should be transcoded, False otherwise. |

Source code in src/audio_cache/audio_cache.py
def should_transcode(self, file_path: Path) -> bool:
    """
    Determine if a file should be transcoded based on its format.

    Args:
        file_path: Path to the audio file.

    Returns:
        True if the file should be transcoded, False otherwise.
    """
    # Transcode lossless formats that are bandwidth-heavy
    transcode_formats = {".flac", ".wav", ".aiff", ".ape", ".alac"}
    return file_path.suffix.lower() in transcode_formats
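A standalone run of the same suffix rule shows which files qualify:

```python
from pathlib import Path

# Same case-insensitive suffix rule as should_transcode() above.
TRANSCODE_FORMATS = {".flac", ".wav", ".aiff", ".ape", ".alac"}

def should_transcode(file_path: Path) -> bool:
    return file_path.suffix.lower() in TRANSCODE_FORMATS

lossless = should_transcode(Path("album/01 Airbag.FLAC"))  # lossless source
lossy = should_transcode(Path("album/01 Airbag.mp3"))      # already compact
```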
is_cached(original_path, quality='medium')

Check if a cached version exists and is up-to-date.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `original_path` | `Path` | Path to the original audio file. | *required* |
| `quality` | `QualityLevel` | Quality level to check. | `'medium'` |

Returns:

| Type | Description |
|---|---|
| `bool` | True if a valid cached version exists, False otherwise. |

Source code in src/audio_cache/audio_cache.py
def is_cached(self, original_path: Path, quality: QualityLevel = "medium") -> bool:
    """
    Check if a cached version exists and is up-to-date.

    Args:
        original_path: Path to the original audio file.
        quality: Quality level to check.

    Returns:
        True if a valid cached version exists, False otherwise.
    """
    if quality == "original" or not self.should_transcode(original_path):
        return True

    cache_path = self.get_cache_path(original_path, quality)

    if not cache_path.exists():
        self.logger.debug(f"Cache miss: {cache_path.name} does not exist")
        return False

    # Check if cache is newer than original
    try:
        cache_mtime = cache_path.stat().st_mtime

        # Only check mtime if original file exists
        if original_path.exists():
            original_mtime = original_path.stat().st_mtime
            if cache_mtime < original_mtime:
                self.logger.debug(
                    f"Cache outdated: {cache_path.name} is older than source"
                )
                return False

        self.logger.debug(f"Cache hit: {cache_path.name}")
        return True

    except OSError as e:
        self.logger.warning(f"Error checking cache status for {cache_path}: {e}")
        return False
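The freshness rule can be exercised in isolation; `cache_is_fresh` below is a hypothetical helper mirroring the mtime comparison above, not part of AudioCache:

```python
import os
import tempfile
import time
from pathlib import Path

def cache_is_fresh(original: Path, cached: Path) -> bool:
    """Freshness rule used by is_cached(): the cache must exist and be
    no older than its source file."""
    if not cached.exists():
        return False
    if original.exists() and cached.stat().st_mtime < original.stat().st_mtime:
        return False
    return True

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "song.flac"
    mp3 = Path(tmp) / "song.mp3"
    src.write_bytes(b"flac")
    mp3.write_bytes(b"mp3")

    # Backdate the cache so it predates the source: stale.
    hour_ago = time.time() - 3600
    os.utime(mp3, (hour_ago, hour_ago))
    stale = cache_is_fresh(src, mp3)

    # Touch the cache so it is newest again: fresh.
    os.utime(mp3, None)
    fresh = cache_is_fresh(src, mp3)
```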
transcode_file(original_path, quality='medium', overwrite=False)

Transcode an audio file to a cached version.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `original_path` | `Path` | Path to the original audio file. | *required* |
| `quality` | `QualityLevel` | Quality level for transcoding. | `'medium'` |
| `overwrite` | `bool` | If True, regenerate cache even if it exists. | `False` |

Returns:

| Type | Description |
|---|---|
| `Path` | Path to the transcoded file (or original if no transcoding needed). |

Raises:

| Type | Description |
|---|---|
| `subprocess.CalledProcessError` | If ffmpeg transcoding fails. |
| `FileNotFoundError` | If the original file doesn't exist. |

Source code in src/audio_cache/audio_cache.py
def transcode_file(
    self,
    original_path: Path,
    quality: QualityLevel = "medium",
    overwrite: bool = False
) -> Path:
    """
    Transcode an audio file to a cached version.

    Args:
        original_path: Path to the original audio file.
        quality: Quality level for transcoding.
        overwrite: If True, regenerate cache even if it exists.

    Returns:
        Path to the transcoded file (or original if no transcoding needed).

    Raises:
        subprocess.CalledProcessError: If ffmpeg transcoding fails.
        FileNotFoundError: If the original file doesn't exist.
    """
    if quality == "original" or not self.should_transcode(original_path):
        return original_path

    if not original_path.exists():
        raise FileNotFoundError(f"Original file not found: {original_path}")

    cache_path = self.get_cache_path(original_path, quality)

    # Check if we need to transcode
    if not overwrite and self.is_cached(original_path, quality):
        self.logger.debug(f"Using existing cache: {cache_path}")
        return cache_path

    # Get transcoding settings
    settings = QUALITY_SETTINGS[quality]
    bitrate = settings["bitrate"]

    self.logger.debug(f"Transcoding {original_path.name} to {quality} quality ({bitrate})")

    # Build ffmpeg command
    cmd = [
        "ffmpeg",
        "-y",  # Overwrite output file
        "-i", str(original_path),
        "-vn",  # No video
        "-ar", "44100",  # Sample rate
        "-ac", "2",  # Stereo
        "-b:a", bitrate,  # Target bitrate
        "-map_metadata", "0",  # Copy metadata
        "-id3v2_version", "3",  # ID3v2.3 for better compatibility
        str(cache_path),
    ]

    try:
        subprocess.run(
            cmd,
            capture_output=True,
            check=True,
            text=True,
        )
        self.logger.debug(f"Successfully cached: {cache_path}")
        return cache_path

    except subprocess.CalledProcessError as e:
        self.logger.error(f"Transcoding failed for {original_path}: {e.stderr}")
        # Clean up partial file if it exists
        if cache_path.exists():
            cache_path.unlink()
        raise
get_cached_or_original(original_path, quality='medium')

Get the cached version if available, otherwise return original path.

This method does NOT generate a cache if it doesn't exist. Use transcode_file() for that purpose.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `original_path` | `Path` | Path to the original audio file. | *required* |
| `quality` | `QualityLevel` | Quality level to retrieve. | `'medium'` |

Returns:

| Type | Description |
|---|---|
| `Path` | Path to cached version if available, otherwise original path. |

Source code in src/audio_cache/audio_cache.py
def get_cached_or_original(
    self, original_path: Path, quality: QualityLevel = "medium"
) -> Path:
    """
    Get the cached version if available, otherwise return original path.

    This method does NOT generate a cache if it doesn't exist.
    Use transcode_file() for that purpose.

    Args:
        original_path: Path to the original audio file.
        quality: Quality level to retrieve.

    Returns:
        Path to cached version if available, otherwise original path.
    """
    if quality == "original" or not self.should_transcode(original_path):
        return original_path

    cache_path = self.get_cache_path(original_path, quality)

    return cache_path if self.is_cached(original_path, quality) else original_path
precache_file(original_path, qualities=None)

Pre-generate cached versions at multiple quality levels.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `original_path` | `Path` | Path to the original audio file. | *required* |
| `qualities` | `list[QualityLevel]` | List of quality levels to generate. Defaults to `["medium"]`. | `None` |

Returns:

| Type | Description |
|---|---|
| `dict[QualityLevel, Path]` | Dictionary mapping quality levels to their cached paths. |

Source code in src/audio_cache/audio_cache.py
def precache_file(
    self,
    original_path: Path,
    qualities: list[QualityLevel] | None = None
) -> dict[QualityLevel, Path]:
    """
    Pre-generate cached versions at multiple quality levels.

    Args:
        original_path: Path to the original audio file.
        qualities: List of quality levels to generate. Defaults to ["medium"].

    Returns:
        Dictionary mapping quality levels to their cached paths.
    """
    if qualities is None:
        qualities = ["medium"]

    results = {}

    for quality in qualities:
        if quality == "original":
            results[quality] = original_path
            continue

        try:
            cached_path = self.transcode_file(original_path, quality)
            results[quality] = cached_path
        except Exception as e:
            self.logger.error(f"Failed to precache {quality} version: {e}")
            results[quality] = original_path

    return results
get_cache_size()

Calculate total size of the cache directory in bytes.

Returns:

| Type | Description |
|---|---|
| `int` | Total size in bytes. |

Source code in src/audio_cache/audio_cache.py
def get_cache_size(self) -> int:
    """
    Calculate total size of the cache directory in bytes.

    Returns:
        Total size in bytes.
    """
    total_size = sum(
        file_path.stat().st_size
        for file_path in self.cache_dir.rglob("*")
        if file_path.is_file()
    )
    return total_size
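When surfacing this number in a settings page, a rendering helper is handy; `human_size` is a hypothetical helper, not part of AudioCache:

```python
def human_size(num_bytes: int) -> str:
    """Render a byte count (e.g. from get_cache_size()) for display."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if size < 1024 or unit == "TB":
            return f"{size:.1f} {unit}"
        size /= 1024
```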
clear_cache(older_than_days=None)

Clear cached files, optionally only those older than specified days.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `older_than_days` | `int \| None` | If specified, only delete files older than this many days. | `None` |

Returns:

| Type | Description |
|---|---|
| `int` | Number of files deleted. |

Source code in src/audio_cache/audio_cache.py
def clear_cache(self, older_than_days: int | None = None) -> int:
    """
    Clear cached files, optionally only those older than specified days.

    Args:
        older_than_days: If specified, only delete files older than this many days.

    Returns:
        Number of files deleted.
    """
    import time

    deleted_count = 0
    current_time = time.time()

    for file_path in self.cache_dir.rglob("*.mp3"):
        if file_path.is_file():
            should_delete = True

            if older_than_days is not None:
                file_age_days = (current_time - file_path.stat().st_mtime) / 86400
                should_delete = file_age_days > older_than_days

            if should_delete:
                try:
                    file_path.unlink()
                    deleted_count += 1
                    self.logger.debug(f"Deleted cached file: {file_path}")
                except OSError as e:
                    self.logger.error(f"Failed to delete {file_path}: {e}")

    self.logger.info(f"Cache cleanup: deleted {deleted_count} files")
    return deleted_count
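The age-based branch can be exercised standalone; `clear_old` is a hypothetical mirror of the method above:

```python
import os
import tempfile
import time
from pathlib import Path

def clear_old(cache_dir: Path, older_than_days: int) -> int:
    """Standalone mirror of the age-based branch of clear_cache()."""
    now = time.time()
    deleted = 0
    for mp3 in cache_dir.rglob("*.mp3"):
        if not mp3.is_file():
            continue
        age_days = (now - mp3.stat().st_mtime) / 86400
        if age_days > older_than_days:
            mp3.unlink()
            deleted += 1
    return deleted

with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "old.mp3"
    new = Path(tmp) / "new.mp3"
    old.write_bytes(b"x")
    new.write_bytes(b"x")

    # Backdate one file by ten days, then purge anything older than a week.
    ten_days_ago = time.time() - 10 * 86400
    os.utime(old, (ten_days_ago, ten_days_ago))

    removed = clear_old(Path(tmp), older_than_days=7)
    survivors = [p.name for p in Path(tmp).iterdir()]
```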

CacheWorker (batch & async)

CacheWorker(audio_cache, logger=None, max_workers=4)

Worker for pre-caching audio files in the background.

Provides methods to cache individual files or entire mixtapes at specified quality levels using thread pools for parallel processing.

Initialize the cache worker.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `audio_cache` | `AudioCache` | AudioCache instance for transcoding operations. | *required* |
| `logger` | `Logger \| None` | Optional logger for tracking operations. | `None` |
| `max_workers` | `int` | Maximum number of parallel transcoding threads. | `4` |

Methods:

| Name | Description |
|---|---|
| `cache_single_file` | Cache a single audio file at specified quality levels. |
| `cache_mixtape` | Cache all audio files in a mixtape. |
| `cache_mixtape_async` | Cache all audio files in a mixtape using parallel processing. |
| `verify_mixtape_cache` | Verify which tracks in a mixtape have valid cached versions. |
| `regenerate_outdated_cache` | Regenerate cached versions that are older than their source files. |
Source code in src/audio_cache/cache_worker.py
def __init__(
    self,
    audio_cache: AudioCache,
    logger: Logger | None = None,
    max_workers: int = 4,
):
    """
    Initialize the cache worker.

    Args:
        audio_cache: AudioCache instance for transcoding operations.
        logger: Optional logger for tracking operations.
        max_workers: Maximum number of parallel transcoding threads.
    """
    self.audio_cache = audio_cache
    self.logger = logger or NullLogger()
    self.max_workers = max_workers
cache_single_file(file_path, qualities=None)

Cache a single audio file at specified quality levels.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `file_path` | `Path` | Path to the audio file. | *required* |
| `qualities` | `list[QualityLevel]` | List of quality levels to cache. Defaults to `["medium"]`. | `None` |

Returns:

| Type | Description |
|---|---|
| `dict[QualityLevel, bool]` | Dictionary mapping quality levels to success status. |

Source code in src/audio_cache/cache_worker.py
def cache_single_file(
    self,
    file_path: Path,
    qualities: list[QualityLevel] | None = None,
) -> dict[QualityLevel, bool]:
    """
    Cache a single audio file at specified quality levels.

    Args:
        file_path: Path to the audio file.
        qualities: List of quality levels to cache. Defaults to ["medium"].

    Returns:
        Dictionary mapping quality levels to success status.
    """
    if qualities is None:
        qualities = ["medium"]

    results = {}

    for quality in qualities:
        if quality == "original":
            results[quality] = True
            continue

        try:
            self.audio_cache.transcode_file(file_path, quality)
            results[quality] = True
            self.logger.debug(f"Cached {file_path.name} at {quality} quality")
        except Exception as e:
            results[quality] = False
            self.logger.error(f"Failed to cache {file_path.name} at {quality}: {e}")

    return results
cache_mixtape(track_paths, qualities=None, progress_callback=None)

Cache all audio files in a mixtape.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `track_paths` | `list[Path]` | List of paths to audio files in the mixtape. | *required* |
| `qualities` | `list[QualityLevel]` | Quality levels to cache. Defaults to `["medium"]`. | `None` |
| `progress_callback` | `Callable[[int, int], None] \| None` | Optional `callback(current, total)` for progress updates. | `None` |

Returns:

| Type | Description |
|---|---|
| `dict[str, dict]` | Dictionary with results for each file. |

Source code in src/audio_cache/cache_worker.py
def cache_mixtape(
    self,
    track_paths: list[Path],
    qualities: list[QualityLevel] | None = None,
    progress_callback: Callable[[int, int], None] | None = None,
) -> dict[str, dict]:
    """
    Cache all audio files in a mixtape.

    Args:
        track_paths: List of paths to audio files in the mixtape.
        qualities: Quality levels to cache. Defaults to ["medium"].
        progress_callback: Optional callback function(current, total) for progress updates.

    Returns:
        Dictionary with results for each file.
    """
    if qualities is None:
        qualities = ["medium"]

    total_files = len(track_paths)
    results = {}

    self.logger.debug(
        f"Starting cache generation for {total_files} tracks at {qualities} quality levels"
    )

    for idx, file_path in enumerate(track_paths, 1):
        if not self.audio_cache.should_transcode(file_path):
            self.logger.debug(f"Skipping {file_path.name} (no transcoding needed)")
            results[str(file_path)] = {"skipped": True}
            continue

        file_results = self.cache_single_file(file_path, qualities)
        results[str(file_path)] = file_results

        if progress_callback:
            progress_callback(idx, total_files)

    self.logger.debug(f"Cache generation complete for {total_files} tracks")
    return results
cache_mixtape_async(track_paths, qualities=None, progress_callback=None)

Cache all audio files in a mixtape using parallel processing.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `track_paths` | `list[Path]` | List of paths to audio files in the mixtape. | *required* |
| `qualities` | `list[QualityLevel]` | Quality levels to cache. Defaults to `["medium"]`. | `None` |
| `progress_callback` | `Callable[[int, int], None] \| None` | Optional `callback(current, total)` for progress updates. | `None` |

Returns:

| Type | Description |
|---|---|
| `dict[str, dict]` | Dictionary with results for each file. |

Source code in src/audio_cache/cache_worker.py
def cache_mixtape_async(
    self,
    track_paths: list[Path],
    qualities: list[QualityLevel] | None = None,
    progress_callback: Callable[[int, int], None] | None = None,
) -> dict[str, dict]:
    """
    Cache all audio files in a mixtape using parallel processing.

    Args:
        track_paths: List of paths to audio files in the mixtape.
        qualities: Quality levels to cache. Defaults to ["medium"].
        progress_callback: Optional callback function(current, total) for progress updates.

    Returns:
        Dictionary with results for each file.
    """
    if qualities is None:
        qualities = ["medium"]

    # Filter files that need transcoding
    files_to_cache = [
        path for path in track_paths if self.audio_cache.should_transcode(path)
    ]

    # Track skipped files (non-FLAC that don't need transcoding)
    skipped_files = [
        path for path in track_paths if not self.audio_cache.should_transcode(path)
    ]

    total_files = len(files_to_cache)
    # Report skipped files via progress callback
    if progress_callback and hasattr(progress_callback, 'track_skipped'):
        for path in skipped_files:
            progress_callback.track_skipped(path.name, reason="No transcoding needed (MP3/M4A/etc)")

    results = {
        str(path): {"skipped": True, "reason": "No transcoding needed"}
        for path in skipped_files
    }
    if total_files == 0:
        self.logger.debug(f"No files need transcoding ({len(skipped_files)} files skipped)")
        # Emit final progress if all files were skipped
        if progress_callback and hasattr(progress_callback, '__call__'):
            progress_callback(len(track_paths), len(track_paths))
        return results

    completed = 0

    self.logger.debug(
        f"Starting parallel cache generation for {total_files} tracks "
        f"at {qualities} quality levels (max workers: {self.max_workers})"
    )

    with ThreadPoolExecutor(max_workers=self.max_workers) as executor:
        # Submit all jobs
        future_to_path = {
            executor.submit(self.cache_single_file, path, qualities): path
            for path in files_to_cache
        }

        # Process completed jobs
        for future in as_completed(future_to_path):
            path = future_to_path[future]
            completed += 1

            try:
                file_results = future.result()
                results[str(path)] = file_results
                self.logger.debug(
                    f"[{completed}/{total_files}] Cached {path.name}"
                )
            except Exception as e:
                results[str(path)] = {"error": str(e)}
                self.logger.error(f"Failed to cache {path.name}: {e}")

            if progress_callback:
                progress_callback(completed, total_files)

    self.logger.debug(f"Parallel cache generation complete for {total_files} tracks")
    return results
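The submit/as_completed pattern at the heart of the method, with the transcoding stubbed out:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fake_transcode(name: str) -> str:
    """Stand-in for cache_single_file(); the real work happens in ffmpeg."""
    return name.replace(".flac", ".mp3")

tracks = [f"{i:02d}.flac" for i in range(1, 6)]
results: dict[str, str] = {}
progress: list[tuple[int, int]] = []

with ThreadPoolExecutor(max_workers=4) as executor:
    # Submit every track, then consume results as they finish, exactly
    # like the future_to_path mapping in cache_mixtape_async().
    future_to_track = {executor.submit(fake_transcode, t): t for t in tracks}
    for done, future in enumerate(as_completed(future_to_track), 1):
        results[future_to_track[future]] = future.result()
        progress.append((done, len(tracks)))  # shape of progress_callback(current, total)
```

Because `as_completed` yields futures in finish order (not submit order), progress updates arrive as soon as any track is done.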
verify_mixtape_cache(track_paths, quality='medium')

Verify which tracks in a mixtape have valid cached versions.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `track_paths` | `list[Path]` | List of paths to audio files in the mixtape. | *required* |
| `quality` | `QualityLevel` | Quality level to check. | `'medium'` |

Returns:

| Type | Description |
|---|---|
| `dict[str, bool]` | Dictionary mapping file paths to cache availability status. |

Source code in src/audio_cache/cache_worker.py
def verify_mixtape_cache(
    self, track_paths: list[Path], quality: QualityLevel = "medium"
) -> dict[str, bool]:
    """
    Verify which tracks in a mixtape have valid cached versions.

    Args:
        track_paths: List of paths to audio files in the mixtape.
        quality: Quality level to check.

    Returns:
        Dictionary mapping file paths to cache availability status.
    """
    results = {
        str(path): (
            self.audio_cache.is_cached(path, quality) if self.audio_cache.should_transcode(path) else True
        )
        for path in track_paths
    }
    return results
regenerate_outdated_cache(track_paths, qualities=None)

Regenerate cached versions that are older than their source files.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `track_paths` | `list[Path]` | List of paths to audio files. | *required* |
| `qualities` | `list[QualityLevel]` | Quality levels to regenerate. | `None` |

Returns:

| Type | Description |
|---|---|
| `dict[str, dict]` | Dictionary with regeneration results for each file. |

Source code in src/audio_cache/cache_worker.py
def regenerate_outdated_cache(
    self, track_paths: list[Path], qualities: list[QualityLevel] | None = None
) -> dict[str, dict]:
    """
    Regenerate cached versions that are older than their source files.

    Args:
        track_paths: List of paths to audio files.
        qualities: Quality levels to regenerate.

    Returns:
        Dictionary with regeneration results for each file.
    """
    if qualities is None:
        qualities = ["medium"]

    files_to_regenerate = []

    for path in track_paths:
        if not self.audio_cache.should_transcode(path):
            continue

        # Check each quality level
        for quality in qualities:
            if not self.audio_cache.is_cached(path, quality):
                files_to_regenerate.append((path, quality))
                self.logger.debug(
                    f"Cache outdated or missing: {path.name} at {quality}"
                )

    if not files_to_regenerate:
        self.logger.info("All caches are up-to-date")
        return {}

    results = {}

    for path, quality in files_to_regenerate:
        try:
            self.audio_cache.transcode_file(path, quality, overwrite=True)
            key = f"{path}_{quality}"
            results[key] = {"success": True}
            self.logger.debug(f"Regenerated cache: {path.name} at {quality}")
        except Exception as e:
            key = f"{path}_{quality}"
            results[key] = {"success": False, "error": str(e)}
            self.logger.error(f"Failed to regenerate {path.name}: {e}")

    return results

Convenience Scheduler

schedule_mixtape_caching(mixtape_tracks, music_root, audio_cache, logger=None, qualities=None, async_mode=True, progress_callback=None)

Convenience function to schedule caching for a mixtape's tracks.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `mixtape_tracks` | `list[dict]` | List of track dictionaries with `'path'` keys. | *required* |
| `music_root` | `Path` | Root directory for music files. | *required* |
| `audio_cache` | `AudioCache` | AudioCache instance. | *required* |
| `logger` | `Logger \| None` | Optional logger. | `None` |
| `qualities` | `list[QualityLevel]` | Quality levels to cache. Defaults to `["medium"]`. | `None` |
| `async_mode` | `bool` | If True, use parallel processing. | `True` |
| `progress_callback` | `Callable[[int, int], None] \| None` | Optional `callback(current, total)` for progress updates. | `None` |

Returns:

| Type | Description |
|---|---|
| `dict` | Dictionary with caching results. |

Source code in src/audio_cache/cache_worker.py
def schedule_mixtape_caching(
    mixtape_tracks: list[dict],
    music_root: Path,
    audio_cache: AudioCache,
    logger: Logger | None = None,
    qualities: list[QualityLevel] | None = None,
    async_mode: bool = True,
    progress_callback: Callable[[int, int], None] | None = None,
) -> dict:
    """
    Convenience function to schedule caching for a mixtape's tracks.

    Args:
        mixtape_tracks: List of track dictionaries with 'path' keys.
        music_root: Root directory for music files.
        audio_cache: AudioCache instance.
        logger: Optional logger.
        qualities: Quality levels to cache. Defaults to ["medium"].
        async_mode: If True, use parallel processing.
        progress_callback: Optional callback function(current, total) for progress updates.

    Returns:
        Dictionary with caching results.
    """
    if qualities is None:
        qualities = ["medium"]

    # Convert track dictionaries to Path objects
    track_paths = [music_root / track["path"] for track in mixtape_tracks]

    # Filter out non-existent files
    valid_paths = [path for path in track_paths if path.exists()]

    if len(valid_paths) < len(track_paths):
        missing = len(track_paths) - len(valid_paths)
        if logger:
            logger.warning(f"{missing} track files not found, skipping")

    # Create worker and cache files
    worker = CacheWorker(audio_cache, logger)

    if async_mode:
        return worker.cache_mixtape_async(valid_paths, qualities, progress_callback)
    else:
        return worker.cache_mixtape(valid_paths, qualities, progress_callback)
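The path-resolution and missing-file filtering steps in isolation, with a hypothetical two-track payload:

```python
import tempfile
from pathlib import Path

# Hypothetical mixtape payload; each track dict carries a 'path' key
# relative to the music root, as schedule_mixtape_caching() expects.
mixtape_tracks = [{"path": "a/one.flac"}, {"path": "b/missing.flac"}]

with tempfile.TemporaryDirectory() as tmp:
    music_root = Path(tmp)
    (music_root / "a").mkdir()
    (music_root / "a" / "one.flac").write_bytes(b"flac")

    # Same resolution + existence filtering as in the function above.
    track_paths = [music_root / t["path"] for t in mixtape_tracks]
    valid_paths = [p for p in track_paths if p.exists()]
    missing = len(track_paths) - len(valid_paths)
```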

Progress Tracker (SSE)

get_progress_tracker(logger=None)

Get or create the global progress tracker instance.

Source code in src/audio_cache/progress_tracker.py
def get_progress_tracker(logger: Logger | None = None) -> ProgressTracker:
    """Get or create the global progress tracker instance."""
    global _progress_tracker
    if _progress_tracker is None:
        _progress_tracker = ProgressTracker(logger)
    return _progress_tracker

ProgressTracker(logger=None)

Tracks progress of long-running operations and broadcasts updates via SSE.

Thread-safe implementation that allows multiple operations to report progress while clients listen for updates.

Initializes a new progress tracker with optional logging.

Sets up internal, thread-safe queues for tracking task-specific progress events that can be streamed to clients.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `logger` | `Logger \| None` | Optional logger instance used to record progress tracker activity. | `None` |

Methods:

| Name | Description |
| --- | --- |
| `create_task` | Create a new task for tracking. |
| `emit` | Emit a progress event. |
| `listen` | Generator that yields SSE-formatted progress events. |
| `cleanup_task` | Remove a task and its queue. |

Source code in src/audio_cache/progress_tracker.py
def __init__(self, logger: Logger | None = None):
    """Initializes a new progress tracker with optional logging.

    Sets up internal, thread-safe queues for tracking task-specific progress events that can be streamed to clients.

    Args:
        logger: Optional logger instance used to record progress tracker activity.
    """
    self.logger = logger or NullLogger()
    self._queues: dict[str, Queue] = {}
    self._lock = threading.Lock()

create_task(task_id)

Create a new task for tracking.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `task_id` | `str` | Unique identifier for this task (e.g., mixtape slug) | required |

Source code in src/audio_cache/progress_tracker.py
def create_task(self, task_id: str) -> None:
    """
    Create a new task for tracking.

    Args:
        task_id: Unique identifier for this task (e.g., mixtape slug)
    """
    with self._lock:
        if task_id not in self._queues:
            self._queues[task_id] = Queue()
            self.logger.debug(f"Created progress task: {task_id}")

emit(task_id, step, status, message, current=0, total=0)

Emit a progress event.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `task_id` | `str` | Task identifier | required |
| `step` | `str` | Name of the current step (e.g., "saving", "caching_track") | required |
| `status` | `ProgressStatus` | Current status | required |
| `message` | `str` | Human-readable message | required |
| `current` | `int` | Current progress count | `0` |
| `total` | `int` | Total items to process | `0` |

Source code in src/audio_cache/progress_tracker.py
def emit(
    self,
    task_id: str,
    step: str,
    status: ProgressStatus,
    message: str,
    current: int = 0,
    total: int = 0
) -> None:
    """
    Emit a progress event.

    Args:
        task_id: Task identifier
        step: Name of the current step (e.g., "saving", "caching_track")
        status: Current status
        message: Human-readable message
        current: Current progress count
        total: Total items to process
    """
    event = ProgressEvent(
        task_id=task_id,
        step=step,
        status=status,
        message=message,
        current=current,
        total=total
    )

    with self._lock:
        if task_id in self._queues:
            self._queues[task_id].put(event)
            self.logger.debug(f"[{task_id}] {step}: {message} ({current}/{total})")

listen(task_id, timeout=300)

Generator that yields SSE-formatted progress events.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `task_id` | `str` | Task identifier to listen to | required |
| `timeout` | `int` | Maximum time to wait for events (seconds) | `300` |

Yields:

| Type | Description |
| --- | --- |
| `str` | SSE-formatted event strings |

Source code in src/audio_cache/progress_tracker.py
def listen(self, task_id: str, timeout: int = 300):
    """
    Generator that yields SSE-formatted progress events.

    Args:
        task_id: Task identifier to listen to
        timeout: Maximum time to wait for events (seconds)

    Yields:
        str: SSE-formatted event strings
    """
    self.create_task(task_id)

    start_time = time.time()

    # Send initial connection event
    yield f"data: {json.dumps({'type': 'connected', 'task_id': task_id})}\n\n"

    while True:
        # Check timeout
        if time.time() - start_time > timeout:
            self.logger.warning(f"Progress stream timeout for task: {task_id}")
            break

        try:
            with self._lock:
                queue = self._queues.get(task_id)

            if queue is None:
                break

            # Get event with short timeout to allow checking for completion
            event = queue.get(timeout=1)

            yield event.to_sse()

            # If task is completed or failed, clean up and stop
            if event.status in (ProgressStatus.COMPLETED, ProgressStatus.FAILED):
                self.logger.debug(f"Task completed: {task_id}")
                break

        except Empty:
            # Send keepalive
            yield ": keepalive\n\n"
            continue

    # Cleanup
    self.cleanup_task(task_id)
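
The queue-drain loop in listen() can be exercised in isolation with a plain Queue. This standalone sketch (simplified names, no per-task registry) reproduces the connected frame, keepalive comments, and a sentinel-based shutdown:

```python
import json
import queue
import threading
import time

# Standalone sketch of the listen() pattern: a producer thread pushes
# events onto a Queue while a generator drains it, yielding
# SSE-formatted frames and keepalive comments when the queue is idle.
def sse_stream(q, timeout=5.0):
    start = time.time()
    yield f"data: {json.dumps({'type': 'connected'})}\n\n"
    while time.time() - start < timeout:
        try:
            event = q.get(timeout=0.1)
        except queue.Empty:
            yield ": keepalive\n\n"  # SSE comment line, ignored by clients
            continue
        if event is None:            # sentinel: producer finished
            break
        yield f"data: {json.dumps(event)}\n\n"

q = queue.Queue()

def produce():
    q.put({"step": "caching", "current": 1, "total": 1})
    q.put(None)

threading.Thread(target=produce).start()
frames = list(sse_stream(q))
print(frames[0].strip())  # data: {"type": "connected"}
```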

cleanup_task(task_id)

Remove a task and its queue.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `task_id` | `str` | Task identifier to clean up | required |

Source code in src/audio_cache/progress_tracker.py
def cleanup_task(self, task_id: str) -> None:
    """
    Remove a task and its queue.

    Args:
        task_id: Task identifier to clean up
    """
    with self._lock:
        if task_id in self._queues:
            del self._queues[task_id]
            self.logger.debug(f"Cleaned up progress task: {task_id}")

ProgressCallback(task_id, tracker, total_tracks)

Callback wrapper for audio caching progress.

Translates cache worker progress updates into SSE events.

Initialize the progress callback.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `task_id` | `str` | Task identifier | required |
| `tracker` | `ProgressTracker` | ProgressTracker instance | required |
| `total_tracks` | `int` | Total number of tracks to cache | required |

Methods:

| Name | Description |
| --- | --- |
| `__call__` | Called by cache worker with progress updates. |
| `track_cached` | Records that a track has been successfully cached. |
| `track_skipped` | Records that a track was intentionally skipped during caching. |
| `track_failed` | Records that caching a track has failed. |

Source code in src/audio_cache/progress_tracker.py
def __init__(self, task_id: str, tracker: ProgressTracker, total_tracks: int):
    """
    Initialize the progress callback.

    Args:
        task_id: Task identifier
        tracker: ProgressTracker instance
        total_tracks: Total number of tracks to cache
    """
    self.task_id = task_id
    self.tracker = tracker
    self.total_tracks = total_tracks
    self.cached_count = 0
    self.skipped_count = 0
    self.failed_count = 0

__call__(current, total)

Called by cache worker with progress updates.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `current` | `int` | Current file number | required |
| `total` | `int` | Total files to process | required |

Source code in src/audio_cache/progress_tracker.py
def __call__(self, current: int, total: int) -> None:
    """
    Called by cache worker with progress updates.

    Args:
        current: Current file number
        total: Total files to process
    """
    self.tracker.emit(
        task_id=self.task_id,
        step="caching",
        status=ProgressStatus.IN_PROGRESS,
        message=f"Caching track {current} of {total}",
        current=current,
        total=total
    )

track_cached(track_name)

Records that a track has been successfully cached.

Increments the count of cached tracks and emits a progress event reflecting the updated completion state.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `track_name` | `str` | The display name or identifier of the cached track. | required |

Source code in src/audio_cache/progress_tracker.py
def track_cached(self, track_name: str) -> None:
    """Records that a track has been successfully cached.

    Increments the count of cached tracks and emits a progress event reflecting the updated completion state.

    Args:
        track_name: The display name or identifier of the cached track.
    """
    self.cached_count += 1
    self.tracker.emit(
        task_id=self.task_id,
        step="track_cached",
        status=ProgressStatus.IN_PROGRESS,
        message=f"✓ Cached: {track_name}",
        current=self.cached_count,
        total=self.total_tracks
    )

track_skipped(track_name, reason='already cached')

Records that a track was intentionally skipped during caching.

Increments the skipped count and emits a progress event explaining why the track was not processed.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `track_name` | `str` | The display name or identifier of the skipped track. | required |
| `reason` | `str` | Human-readable explanation for why the track was skipped. | `'already cached'` |

Source code in src/audio_cache/progress_tracker.py
def track_skipped(self, track_name: str, reason: str = "already cached") -> None:
    """Records that a track was intentionally skipped during caching.

    Increments the skipped count and emits a progress event explaining why the track was not processed.

    Args:
        track_name: The display name or identifier of the skipped track.
        reason: Human-readable explanation for why the track was skipped.
    """
    self.skipped_count += 1
    self.tracker.emit(
        task_id=self.task_id,
        step="track_skipped",
        status=ProgressStatus.SKIPPED,
        message=f"⊘ Skipped: {track_name} ({reason})",
        current=self.cached_count + self.skipped_count,
        total=self.total_tracks
    )

track_failed(track_name, error)

Records that caching a track has failed.

Increments the failed count and emits a progress event describing the error that occurred.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `track_name` | `str` | The display name or identifier of the track that failed to cache. | required |
| `error` | `str` | Human-readable error description explaining the failure. | required |

Source code in src/audio_cache/progress_tracker.py
def track_failed(self, track_name: str, error: str) -> None:
    """Records that caching a track has failed.

    Increments the failed count and emits a progress event describing the error that occurred.

    Args:
        track_name: The display name or identifier of the track that failed to cache.
        error: Human-readable error description explaining the failure.
    """
    self.failed_count += 1
    self.tracker.emit(
        task_id=self.task_id,
        step="track_failed",
        status=ProgressStatus.FAILED,
        message=f"✗ Failed: {track_name} - {error}",
        current=self.cached_count + self.skipped_count + self.failed_count,
        total=self.total_tracks
    )
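
A standalone sketch of how a worker loop might drive this callback API; the TallyCallback below is hypothetical and only counts outcomes, whereas the real ProgressCallback also emits SSE events through a ProgressTracker:

```python
# Hypothetical, counting-only stand-in for ProgressCallback.
class TallyCallback:
    def __init__(self, total_tracks: int):
        self.total_tracks = total_tracks
        self.cached_count = self.skipped_count = self.failed_count = 0

    def track_cached(self, name: str) -> None:
        self.cached_count += 1

    def track_skipped(self, name: str, reason: str = "already cached") -> None:
        self.skipped_count += 1

    def track_failed(self, name: str, error: str) -> None:
        self.failed_count += 1

# A worker loop reports one outcome per track.
results = {"a.flac": "cached", "b.flac": "skipped", "c.flac": "cached"}
cb = TallyCallback(total_tracks=len(results))
for name, outcome in results.items():
    if outcome == "cached":
        cb.track_cached(name)
    elif outcome == "skipped":
        cb.track_skipped(name)
    else:
        cb.track_failed(name, "transcode error")

print(cb.cached_count, cb.skipped_count, cb.failed_count)  # 2 1 0
```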

🛠️ Configuration Options

| Option | Default | Description |
| --- | --- | --- |
| `AUDIO_CACHE_DIR` | `"cache/audio"` | Directory where MP3 caches are stored (relative to DATA_ROOT). |
| `AUDIO_CACHE_ENABLED` | `True` | Master switch – set to False to bypass the entire subsystem. |
| `AUDIO_CACHE_DEFAULT_QUALITY` | `"medium"` | Quality used when a client does not specify one. |
| `AUDIO_CACHE_MAX_WORKERS` | `4` | Number of parallel threads for batch transcoding. |
| `AUDIO_CACHE_PRECACHE_ON_UPLOAD` | `True` | Auto-cache mixtape tracks when a mixtape is saved. |
| `AUDIO_CACHE_PRECACHE_QUALITIES` | `["medium"]` | List of qualities to pre-generate (e.g., `["low", "medium", "high"]`). |

These values are defined in src/config/config.py and can be overridden with environment variables (e.g., AUDIO_CACHE_MAX_WORKERS=8).
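
A hypothetical sketch of how such an override might be read at startup (the actual parsing in src/config/config.py may differ):

```python
import os

# Hypothetical sketch of reading the overrides above from the
# environment; src/config/config.py may parse them differently.
os.environ["AUDIO_CACHE_MAX_WORKERS"] = "8"  # e.g. set in the shell

max_workers = int(os.environ.get("AUDIO_CACHE_MAX_WORKERS", "4"))
cache_enabled = os.environ.get("AUDIO_CACHE_ENABLED", "True").lower() == "true"

print(max_workers, cache_enabled)  # 8 True
```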


⏳ Progress Tracking (SSE)

The progress modal in the editor UI subscribes to the endpoint:

GET /editor/progress/<slug>

The server returns a Server‑Sent Events stream. Each event looks like:

{
  "task_id": "summer-vibes",
  "step": "caching",
  "status": "in_progress",
  "message": "Caching track 3 of 15",
  "current": 3,
  "total": 15,
  "timestamp": "2024-09-28T12:34:56.789012"
}

The modal updates the progress bar, logs messages, and shows a final summary when the status becomes completed or failed.

Implementation note: ProgressCallback.track_cached(), track_skipped(), and track_failed() are called from CacheWorker to emit the above events.
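
On the consumer side these frames are simple to parse. This standalone sketch splits a captured stream into JSON events (a real client would read the streaming HTTP response line by line):

```python
import json

# Parse SSE frames like those emitted by /editor/progress/<slug>.
# Frames are separated by blank lines; lines starting with ":" are
# keepalive comments and carry no data.
raw = (
    'data: {"task_id": "summer-vibes", "step": "caching", '
    '"status": "in_progress", "message": "Caching track 3 of 15", '
    '"current": 3, "total": 15}\n\n'
    ': keepalive\n\n'
)

events = [
    json.loads(frame[len("data: "):])
    for frame in raw.split("\n\n")
    if frame.startswith("data: ")
]

print(events[0]["current"], events[0]["total"])  # 3 15
```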

🔧 Troubleshooting FAQ

Cache Misses – “Why isn’t my file being cached?”

Symptom Check Fix
Cache miss warning in logs grep -i "cache miss" app.log Verify AUDIO_CACHE_ENABLED=True and that the file’s suffix is in should_transcode (FLAC, WAV, AIFF, APE, ALAC).
Cache file exists but not found ls collection-data/cache/audio/ Ensure the hash matches the current absolute path. If you moved the music folder, run python debug_cache.py <MUSIC_ROOT> <REL_PATH> <CACHE_DIR> (see debug_cache.py).
Cache never generated AUDIO_CACHE_PRECACHE_ON_UPLOAD=False Enable pre-caching or trigger it manually via schedule_mixtape_caching.
ffmpeg not found ffmpeg -version Install ffmpeg on the host (Ubuntu: apt install ffmpeg; Alpine: apk add ffmpeg).
Permission denied on cache dir ls -ld collection-data/cache/audio The Flask process must have write permission (owner UID = the container user).
High CPU usage during batch caching top while caching Reduce AUDIO_CACHE_MAX_WORKERS (e.g., export AUDIO_CACHE_MAX_WORKERS=2).
Stale cache after source file change Compare timestamps (stat -c %Y file) Run cache.clear_cache() or set overwrite=True in transcode_file.

Transcoding Failures – “ffmpeg exited with error code 1”

  1. Inspect the ffmpeg stderr – it is logged by AudioCache.transcode_file.
  2. Common culprits:
     - Corrupt source file – try re‑encoding the source with ffmpeg -i input.flac -c copy output.flac.
     - Unsupported codec – ensure the source is a supported lossless format.
     - Insufficient disk space – check free space on the cache volume.
  3. Manual test:

        ffmpeg -i "/music/Artist/Album/BadTrack.flac" -b:a 192k -y "/tmp/test.mp3"

     If this works, the problem is likely in the path handling (hash mismatch).
  4. Fix path mismatches – run debug_cache.py (see the script in the repo) to compare the hash generated by the app vs. the one you expect.
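
The manual ffmpeg test above can also be mirrored in Python with subprocess, which is handy for inspecting stderr the same way AudioCache.transcode_file does. Paths are illustrative, and the call itself is commented out since it requires ffmpeg on PATH:

```python
import subprocess

# Build the same command as the manual test (illustrative paths).
cmd = [
    "ffmpeg",
    "-i", "/music/Artist/Album/BadTrack.flac",
    "-b:a", "192k",
    "-y", "/tmp/test.mp3",
]

# result = subprocess.run(cmd, capture_output=True, text=True)
# if result.returncode != 0:
#     print(result.stderr)  # the same stderr that AudioCache logs

print(" ".join(cmd))
```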