Audio Caching System¶

The audio‑caching subsystem automatically converts large lossless audio files (FLAC, WAV, AIFF, …) into smaller MP3 streams, dramatically reducing bandwidth while preserving a pleasant listening experience.

TL;DR – The cache turns a 40 MB FLAC track into a ~5 MB MP3 (≈ 87 % bandwidth saving) and serves the MP3 via HTTP range requests.

📖 Overview¶

When streaming lossless audio over the web, bandwidth quickly becomes a bottleneck:

Format	Approx. size (4‑min track)	Typical bitrate
FLAC (original)	40‑50 MB	~1 000 kbps
MP3 – High (256 kbps)	≈ 8 MB	256 kbps
MP3 – Medium (192 kbps)	≈ 5 MB	192 kbps
MP3 – Low (128 kbps)	≈ 3 MB	128 kbps

Result: A 4‑minute track drops from ~45 MB to ~5 MB (≈ 87 % bandwidth reduction) with negligible audible loss for casual listening.

The cache is transparent to the rest of the application:

The UI requests a track → the Flask route asks AudioCache for a cached version.
If a suitable MP3 exists, it is streamed via HTTP range requests.
If not, the original file is streamed (or a background job creates the cache for the next request).

🏛️ Architecture Overview¶

graph TD
    A["AudioCache (core)"] --> B["Cache Path Generation"]
    A --> C["Transcoding (ffmpeg)"]
    A --> D["Cache Management (size, cleanup)"]

    E[CacheWorker] --> A
    E --> F["ThreadPool (parallel batch)"]
    E --> G["ProgressTracker (SSE)"]

    H[ProgressTracker] --> I["Frontend (EventSource)"]

Hold "Alt" / "Option" to enable pan & zoom

audio_cache.py – core logic (hash‑based filenames, transcoding, cache look‑ups).
cache_worker.py – batch processing, thread‑pool parallelism, progress callbacks.
progress_tracker.py – Server‑Sent Events (SSE) emitter that feeds the UI’s “caching progress” modal.

✨ Key Features¶

Feature	Description
Automatic transcoding	FLAC, WAV, AIFF, APE, ALAC → MP3 (high/medium/low).
Multiple quality levels	`high` (256 kbps), `medium` (192 kbps), `low` (128 kbps).
Smart caching	Only creates a cached file when the source is lossless and the cache is missing/out-of-date.
Pre-caching on upload	When a mixtape is saved, the system can generate caches automatically.
Parallel batch processing	Thread-pool (configurable workers) for fast bulk transcoding.
Progress tracking	Real-time SSE updates displayed in a Bootstrap modal.
Cache management utilities	Size calculation, age-based cleanup, full purge.
Config-driven	All knobs live in `src/config/config.py` (`AUDIO_CACHE_*`).

📋 How It Works (Step‑by‑Step)¶

Cache Path Generation¶

flowchart LR
    A[Original file path] --> B[Normalize & resolve]
    B --> C[MD5 hash of full path]
    C --> D[Compose filename: `<hash>_<quality>_<bitrate>.mp3`]
    D --> E["Cache directory (`AUDIO_CACHE_DIR`)"]

Hold "Alt" / "Option" to enable pan & zoom

The hash guarantees collision‑free filenames, even for identically named tracks in different folders.
Example: Original: /music/Radiohead/OK Computer/01 Airbag.flac Hash: a1b2c3… → Cache file a1b2c3_medium_192k.mp3.

Transcoding Flow¶

sequenceDiagram
    participant UI
    participant Flask
    participant CacheWorker
    participant AudioCache
    participant ffmpeg

    UI->>Flask: Request play (quality=medium)
    Flask->>AudioCache: get_cached_or_original()
    alt Cached version exists
        AudioCache-->>Flask: Return cached path
    else No cache
        AudioCache->>CacheWorker: transcode_file()
        CacheWorker->>ffmpeg: Run ffmpeg command
        ffmpeg-->>CacheWorker: MP3 file created
        CacheWorker->>AudioCache: Store in cache dir
        AudioCache-->>Flask: Return newly cached path
    end
    Flask->>UI: Stream MP3

Hold "Alt" / "Option" to enable pan & zoom

If a cached file is present, it is served immediately.
Otherwise the worker spawns ffmpeg, writes the MP3, and returns the new path.

Playback Flow¶

graph LR
    A[User clicks Play] --> B{Quality selected?}
    B -->|Original| C[Serve original FLAC]
    B -->|High/Med/Low| D{Is source lossless?}
    D -->|No| C
    D -->|Yes| E{Cache exists?}
    E -->|Yes| F[Serve cached MP3]
    E -->|No| G[Log warning → fall back to original]
    F --> H[User streams small file]
    C --> I[User streams large file]

Hold "Alt" / "Option" to enable pan & zoom

🔌 API Reference¶

AudioCache (core)¶

`AudioCache(cache_dir, logger=None)` ¶

Manages audio file transcoding and caching for bandwidth optimization.

Provides methods to check for cached versions, generate transcoded files, and manage the cache directory.

Initialize the AudioCache manager.

Parameters:

Name	Type	Description	Default
`cache_dir`	`Path`	Directory where cached transcoded files will be stored.	required
`logger`	`Logger \| None`	Optional logger for tracking operations.	`None`

Methods:

Name	Description
`get_cache_path`	Generate a cache filename based on the original path and quality level.
`should_transcode`	Determine if a file should be transcoded based on its format.
`is_cached`	Check if a cached version exists and is up-to-date.
`transcode_file`	Transcode an audio file to a cached version.
`get_cached_or_original`	Get the cached version if available, otherwise return original path.
`precache_file`	Pre-generate cached versions at multiple quality levels.
`get_cache_size`	Calculate total size of the cache directory in bytes.
`clear_cache`	Clear cached files, optionally only those older than specified days.

Source code in src/audio_cache/audio_cache.py

def __init__(self, cache_dir: Path, logger: Logger | None = None):
    """
    Initialize the AudioCache manager.

    Args:
        cache_dir: Directory where cached transcoded files will be stored.
        logger: Optional logger for tracking operations.
    """
    self.cache_dir = Path(cache_dir)
    self.cache_dir.mkdir(parents=True, exist_ok=True)
    self.logger = logger or NullLogger()

`get_cache_path(original_path, quality='medium')` ¶

Generate a cache filename based on the original path and quality level.

Parameters:

Name	Type	Description	Default
`original_path`	`Path`	Path to the original audio file.	required
`quality`	`QualityLevel`	Quality level for transcoding (high, medium, low).	`'medium'`

Returns:

Type	Description
`Path`	Path to the cached file location.

Source code in src/audio_cache/audio_cache.py

def get_cache_path(
    self, original_path: Path, quality: QualityLevel = "medium"
) -> Path:
    """
    Generate a cache filename based on the original path and quality level.

    Args:
        original_path: Path to the original audio file.
        quality: Quality level for transcoding (high, medium, low).

    Returns:
        Path to the cached file location.
    """
    if quality == "original":
        return original_path

    # Create unique hash from the normalized file path
    path_str = self._normalize_path(original_path)
    path_hash = hashlib.md5(path_str.encode()).hexdigest()

    # Get quality settings
    settings = QUALITY_SETTINGS[quality]
    bitrate = settings["bitrate"]
    ext = settings["format"]

    cache_filename = f"{path_hash}_{quality}_{bitrate}.{ext}"

    # Log for debugging
    self.logger.debug(
        f"Cache path generation: {original_path.name} -> {cache_filename} "
        f"(hash of: {path_str})"
    )

    return self.cache_dir / cache_filename

`should_transcode(file_path)` ¶

Determine if a file should be transcoded based on its format.

Parameters:

Name	Type	Description	Default
`file_path`	`Path`	Path to the audio file.	required

Returns:

Type	Description
`bool`	True if the file should be transcoded, False otherwise.

Source code in src/audio_cache/audio_cache.py

def should_transcode(self, file_path: Path) -> bool:
    """
    Determine if a file should be transcoded based on its format.

    Args:
        file_path: Path to the audio file.

    Returns:
        True if the file should be transcoded, False otherwise.
    """
    # Transcode lossless formats that are bandwidth-heavy
    transcode_formats = {".flac", ".wav", ".aiff", ".ape", ".alac"}
    return file_path.suffix.lower() in transcode_formats

`is_cached(original_path, quality='medium')` ¶

Check if a cached version exists and is up-to-date.

Parameters:

Name	Type	Description	Default
`original_path`	`Path`	Path to the original audio file.	required
`quality`	`QualityLevel`	Quality level to check.	`'medium'`

Returns:

Type	Description
`bool`	True if a valid cached version exists, False otherwise.

Source code in src/audio_cache/audio_cache.py

def is_cached(self, original_path: Path, quality: QualityLevel = "medium") -> bool:
    """
    Check if a cached version exists and is up-to-date.

    Args:
        original_path: Path to the original audio file.
        quality: Quality level to check.

    Returns:
        True if a valid cached version exists, False otherwise.
    """
    if quality == "original" or not self.should_transcode(original_path):
        return True

    cache_path = self.get_cache_path(original_path, quality)

    if not cache_path.exists():
        self.logger.debug(f"Cache miss: {cache_path.name} does not exist")
        return False

    # Check if cache is newer than original
    try:
        cache_mtime = cache_path.stat().st_mtime

        # Only check mtime if original file exists
        if original_path.exists():
            original_mtime = original_path.stat().st_mtime
            if cache_mtime < original_mtime:
                self.logger.debug(
                    f"Cache outdated: {cache_path.name} is older than source"
                )
                return False

        self.logger.debug(f"Cache hit: {cache_path.name}")
        return True

    except OSError as e:
        self.logger.warning(f"Error checking cache status for {cache_path}: {e}")
        return False

`transcode_file(original_path, quality='medium', overwrite=False)` ¶

Transcode an audio file to a cached version.

Parameters:

Name	Type	Description	Default
`original_path`	`Path`	Path to the original audio file.	required
`quality`	`QualityLevel`	Quality level for transcoding.	`'medium'`
`overwrite`	`bool`	If True, regenerate cache even if it exists.	`False`

Returns:

Type	Description
`Path`	Path to the transcoded file (or original if no transcoding needed).

Raises:

Type	Description
`CalledProcessError`	If ffmpeg transcoding fails.
`FileNotFoundError`	If the original file doesn't exist.

Source code in src/audio_cache/audio_cache.py

def transcode_file(
    self,
    original_path: Path,
    quality: QualityLevel = "medium",
    overwrite: bool = False
) -> Path:
    """
    Transcode an audio file to a cached version.

    Args:
        original_path: Path to the original audio file.
        quality: Quality level for transcoding.
        overwrite: If True, regenerate cache even if it exists.

    Returns:
        Path to the transcoded file (or original if no transcoding needed).

    Raises:
        subprocess.CalledProcessError: If ffmpeg transcoding fails.
        FileNotFoundError: If the original file doesn't exist.
    """
    if quality == "original" or not self.should_transcode(original_path):
        return original_path

    if not original_path.exists():
        raise FileNotFoundError(f"Original file not found: {original_path}")

    cache_path = self.get_cache_path(original_path, quality)

    # Check if we need to transcode
    if not overwrite and self.is_cached(original_path, quality):
        self.logger.debug(f"Using existing cache: {cache_path}")
        return cache_path

    # Get transcoding settings
    settings = QUALITY_SETTINGS[quality]
    bitrate = settings["bitrate"]

    self.logger.debug(f"Transcoding {original_path.name} to {quality} quality ({bitrate})")

    # Build ffmpeg command
    cmd = [
        "ffmpeg",
        "-y",  # Overwrite output file
        "-i", str(original_path),
        "-vn",  # No video
        "-ar", "44100",  # Sample rate
        "-ac", "2",  # Stereo
        "-b:a", bitrate,  # Target bitrate
        "-map_metadata", "0",  # Copy metadata
        "-id3v2_version", "3",  # ID3v2.3 for better compatibility
        str(cache_path),
    ]

    try:
        subprocess.run(
            cmd,
            capture_output=True,
            check=True,
            text=True,
        )
        self.logger.debug(f"Successfully cached: {cache_path}")
        return cache_path

    except subprocess.CalledProcessError as e:
        self.logger.error(f"Transcoding failed for {original_path}: {e.stderr}")
        # Clean up partial file if it exists
        if cache_path.exists():
            cache_path.unlink()
        raise

`get_cached_or_original(original_path, quality='medium')` ¶

Get the cached version if available, otherwise return original path.

This method does NOT generate a cache if it doesn't exist. Use transcode_file() for that purpose.

Parameters:

Name	Type	Description	Default
`original_path`	`Path`	Path to the original audio file.	required
`quality`	`QualityLevel`	Quality level to retrieve.	`'medium'`

Returns:

Type	Description
`Path`	Path to cached version if available, otherwise original path.

Source code in src/audio_cache/audio_cache.py

def get_cached_or_original(
    self, original_path: Path, quality: QualityLevel = "medium"
) -> Path:
    """
    Get the cached version if available, otherwise return original path.

    This method does NOT generate a cache if it doesn't exist.
    Use transcode_file() for that purpose.

    Args:
        original_path: Path to the original audio file.
        quality: Quality level to retrieve.

    Returns:
        Path to cached version if available, otherwise original path.
    """
    if quality == "original" or not self.should_transcode(original_path):
        return original_path

    cache_path = self.get_cache_path(original_path, quality)

    return cache_path if self.is_cached(original_path, quality) else original_path

`precache_file(original_path, qualities=None)` ¶

Pre-generate cached versions at multiple quality levels.

Parameters:

Name	Type	Description	Default
`original_path`	`Path`	Path to the original audio file.	required
`qualities`	`list[QualityLevel]`	List of quality levels to generate. Defaults to ["medium"].	`None`

Returns:

Type	Description
`dict[QualityLevel, Path]`	Dictionary mapping quality levels to their cached paths.

Source code in src/audio_cache/audio_cache.py

def precache_file(
    self,
    original_path: Path,
    qualities: list[QualityLevel] = None
) -> dict[QualityLevel, Path]:
    """
    Pre-generate cached versions at multiple quality levels.

    Args:
        original_path: Path to the original audio file.
        qualities: List of quality levels to generate. Defaults to ["medium"].

    Returns:
        Dictionary mapping quality levels to their cached paths.
    """
    if qualities is None:
        qualities = ["medium"]

    results = {}

    for quality in qualities:
        if quality == "original":
            results[quality] = original_path
            continue

        try:
            cached_path = self.transcode_file(original_path, quality)
            results[quality] = cached_path
        except Exception as e:
            self.logger.error(f"Failed to precache {quality} version: {e}")
            results[quality] = original_path

    return results

`get_cache_size()` ¶

Calculate total size of the cache directory in bytes.

Returns:

Type	Description
`int`	Total size in bytes.

Source code in src/audio_cache/audio_cache.py

def get_cache_size(self) -> int:
    """
    Calculate total size of the cache directory in bytes.

    Returns:
        Total size in bytes.
    """
    total_size = sum(
        file_path.stat().st_size
        for file_path in self.cache_dir.rglob("*")
        if file_path.is_file()
    )
    return total_size

`clear_cache(older_than_days=None)` ¶

Clear cached files, optionally only those older than specified days.

Parameters:

Name	Type	Description	Default
`older_than_days`	`int \| None`	If specified, only delete files older than this many days.	`None`

Returns:

Type	Description
`int`	Number of files deleted.

Source code in src/audio_cache/audio_cache.py

def clear_cache(self, older_than_days: int | None = None) -> int:
    """
    Clear cached files, optionally only those older than specified days.

    Args:
        older_than_days: If specified, only delete files older than this many days.

    Returns:
        Number of files deleted.
    """
    import time

    deleted_count = 0
    current_time = time.time()

    for file_path in self.cache_dir.rglob("*.mp3"):
        if file_path.is_file():
            should_delete = True

            if older_than_days is not None:
                file_age_days = (current_time - file_path.stat().st_mtime) / 86400
                should_delete = file_age_days > older_than_days

            if should_delete:
                try:
                    file_path.unlink()
                    deleted_count += 1
                    self.logger.debug(f"Deleted cached file: {file_path}")
                except OSError as e:
                    self.logger.error(f"Failed to delete {file_path}: {e}")

    self.logger.info(f"Cache cleanup: deleted {deleted_count} files")
    return deleted_count

CacheWorker (batch & async)¶

`CacheWorker(audio_cache, logger=None, max_workers=4)` ¶

Worker for pre-caching audio files in the background.

Provides methods to cache individual files or entire mixtapes at specified quality levels using thread pools for parallel processing.

Initialize the cache worker.

Parameters:

Name	Type	Description	Default
`audio_cache`	`AudioCache`	AudioCache instance for transcoding operations.	required
`logger`	`Logger \| None`	Optional logger for tracking operations.	`None`
`max_workers`	`int`	Maximum number of parallel transcoding threads.	`4`

Methods:

Name	Description
`cache_single_file`	Cache a single audio file at specified quality levels.
`cache_mixtape`	Cache all audio files in a mixtape.
`cache_mixtape_async`	Cache all audio files in a mixtape using parallel processing.
`verify_mixtape_cache`	Verify which tracks in a mixtape have valid cached versions.
`regenerate_outdated_cache`	Regenerate cached versions that are older than their source files.

Source code in src/audio_cache/cache_worker.py

def __init__(
    self,
    audio_cache: AudioCache,
    logger: Logger | None = None,
    max_workers: int = 4,
):
    """
    Initialize the cache worker.

    Args:
        audio_cache: AudioCache instance for transcoding operations.
        logger: Optional logger for tracking operations.
        max_workers: Maximum number of parallel transcoding threads.
    """
    self.audio_cache = audio_cache
    self.logger = logger or NullLogger()
    self.max_workers = max_workers

`cache_single_file(file_path, qualities=None)` ¶

Cache a single audio file at specified quality levels.

Parameters:

Name	Type	Description	Default
`file_path`	`Path`	Path to the audio file.	required
`qualities`	`list[QualityLevel]`	List of quality levels to cache. Defaults to ["medium"].	`None`

Returns:

Type	Description
`dict[QualityLevel, bool]`	Dictionary mapping quality levels to success status.

Source code in src/audio_cache/cache_worker.py

def cache_single_file(
    self,
    file_path: Path,
    qualities: list[QualityLevel] = None,
) -> dict[QualityLevel, bool]:
    """
    Cache a single audio file at specified quality levels.

    Args:
        file_path: Path to the audio file.
        qualities: List of quality levels to cache. Defaults to ["medium"].

    Returns:
        Dictionary mapping quality levels to success status.
    """
    if qualities is None:
        qualities = ["medium"]

    results = {}

    for quality in qualities:
        if quality == "original":
            results[quality] = True
            continue

        try:
            self.audio_cache.transcode_file(file_path, quality)
            results[quality] = True
            self.logger.debug(f"Cached {file_path.name} at {quality} quality")
        except Exception as e:
            results[quality] = False
            self.logger.error(f"Failed to cache {file_path.name} at {quality}: {e}")

    return results

`cache_mixtape(track_paths, qualities=None, progress_callback=None)` ¶

Cache all audio files in a mixtape.

Parameters:

Name	Type	Description	Default
`track_paths`	`list[Path]`	List of paths to audio files in the mixtape.	required
`qualities`	`list[QualityLevel]`	Quality levels to cache. Defaults to ["medium"].	`None`
`progress_callback`	`Callable[[int, int], None] \| None`	Optional callback function(current, total) for progress updates.	`None`

Returns:

Type	Description
`dict[str, dict]`	Dictionary with results for each file.

Source code in src/audio_cache/cache_worker.py

def cache_mixtape(
    self,
    track_paths: list[Path],
    qualities: list[QualityLevel] = None,
    progress_callback: Callable[[int, int], None] | None = None,
) -> dict[str, dict]:
    """
    Cache all audio files in a mixtape.

    Args:
        track_paths: List of paths to audio files in the mixtape.
        qualities: Quality levels to cache. Defaults to ["medium"].
        progress_callback: Optional callback function(current, total) for progress updates.

    Returns:
        Dictionary with results for each file.
    """
    if qualities is None:
        qualities = ["medium"]

    total_files = len(track_paths)
    results = {}

    self.logger.debug(
        f"Starting cache generation for {total_files} tracks at {qualities} quality levels"
    )

    for idx, file_path in enumerate(track_paths, 1):
        if not self.audio_cache.should_transcode(file_path):
            self.logger.debug(f"Skipping {file_path.name} (no transcoding needed)")
            results[str(file_path)] = {"skipped": True}
            continue

        file_results = self.cache_single_file(file_path, qualities)
        results[str(file_path)] = file_results

        if progress_callback:
            progress_callback(idx, total_files)

    self.logger.debug(f"Cache generation complete for {total_files} tracks")
    return results

`cache_mixtape_async(track_paths, qualities=None, progress_callback=None)` ¶

Cache all audio files in a mixtape using parallel processing.

Parameters:

Name	Type	Description	Default
`track_paths`	`list[Path]`	List of paths to audio files in the mixtape.	required
`qualities`	`list[QualityLevel]`	Quality levels to cache. Defaults to ["medium"].	`None`
`progress_callback`	`Callable[[int, int], None] \| None`	Optional callback function(current, total) for progress updates.	`None`

Returns:

Type	Description
`dict[str, dict]`	Dictionary with results for each file.

Source code in src/audio_cache/cache_worker.py

def cache_mixtape_async(
    self,
    track_paths: list[Path],
    qualities: list[QualityLevel] = None,
    progress_callback: Callable[[int, int], None] | None = None,
) -> dict[str, dict]:
    """
    Cache all audio files in a mixtape using parallel processing.

    Args:
        track_paths: List of paths to audio files in the mixtape.
        qualities: Quality levels to cache. Defaults to ["medium"].
        progress_callback: Optional callback function(current, total) for progress updates.

    Returns:
        Dictionary with results for each file.
    """
    if qualities is None:
        qualities = ["medium"]

    # Filter files that need transcoding
    files_to_cache = [
        path for path in track_paths if self.audio_cache.should_transcode(path)
    ]

    # Track skipped files (non-FLAC that don't need transcoding)
    skipped_files = [
        path for path in track_paths if not self.audio_cache.should_transcode(path)
    ]

    total_files = len(files_to_cache)
    # Report skipped files via progress callback
    if progress_callback and hasattr(progress_callback, 'track_skipped'):
        for path in skipped_files:
            progress_callback.track_skipped(path.name, reason="No transcoding needed (MP3/M4A/etc)")

    results = {
        str(path): {"skipped": True, "reason": "No transcoding needed"}
        for path in skipped_files
    }
    if total_files == 0:
        self.logger.debug(f"No files need transcoding ({len(skipped_files)} files skipped)")
        # Emit final progress if all files were skipped
        if progress_callback and hasattr(progress_callback, '__call__'):
            progress_callback(len(track_paths), len(track_paths))
        return results

    completed = 0

    self.logger.debug(
        f"Starting parallel cache generation for {total_files} tracks "
        f"at {qualities} quality levels (max workers: {self.max_workers})"
    )

    with ThreadPoolExecutor(max_workers=self.max_workers) as executor:
        # Submit all jobs
        future_to_path = {
            executor.submit(self.cache_single_file, path, qualities): path
            for path in files_to_cache
        }

        # Process completed jobs
        for future in as_completed(future_to_path):
            path = future_to_path[future]
            completed += 1

            try:
                file_results = future.result()
                results[str(path)] = file_results
                self.logger.debug(
                    f"[{completed}/{total_files}] Cached {path.name}"
                )
            except Exception as e:
                results[str(path)] = {"error": str(e)}
                self.logger.error(f"Failed to cache {path.name}: {e}")

            if progress_callback:
                progress_callback(completed, total_files)

    self.logger.debug(f"Parallel cache generation complete for {total_files} tracks")
    return results

`verify_mixtape_cache(track_paths, quality='medium')` ¶

Verify which tracks in a mixtape have valid cached versions.

Parameters:

Name	Type	Description	Default
`track_paths`	`list[Path]`	List of paths to audio files in the mixtape.	required
`quality`	`QualityLevel`	Quality level to check.	`'medium'`

Returns:

Type	Description
`dict[str, bool]`	Dictionary mapping file paths to cache availability status.

Source code in src/audio_cache/cache_worker.py

def verify_mixtape_cache(
    self, track_paths: list[Path], quality: QualityLevel = "medium"
) -> dict[str, bool]:
    """
    Verify which tracks in a mixtape have valid cached versions.

    Args:
        track_paths: List of paths to audio files in the mixtape.
        quality: Quality level to check.

    Returns:
        Dictionary mapping file paths to cache availability status.
    """
    results = {
        str(path): (
            self.audio_cache.is_cached(path, quality) if self.audio_cache.should_transcode(path) else True
        )
        for path in track_paths
    }
    return results

`regenerate_outdated_cache(track_paths, qualities=None)` ¶

Regenerate cached versions that are older than their source files.

Parameters:

Name	Type	Description	Default
`track_paths`	`list[Path]`	List of paths to audio files.	required
`qualities`	`list[QualityLevel]`	Quality levels to regenerate.	`None`

Returns:

Type	Description
`dict[str, dict]`	Dictionary with regeneration results for each file.

Source code in src/audio_cache/cache_worker.py

def regenerate_outdated_cache(
    self, track_paths: list[Path], qualities: list[QualityLevel] = None
) -> dict[str, dict]:
    """
    Regenerate cached versions that are older than their source files.

    Args:
        track_paths: List of paths to audio files.
        qualities: Quality levels to regenerate.

    Returns:
        Dictionary with regeneration results for each file.
    """
    if qualities is None:
        qualities = ["medium"]

    files_to_regenerate = []

    for path in track_paths:
        if not self.audio_cache.should_transcode(path):
            continue

        # Check each quality level
        for quality in qualities:
            if not self.audio_cache.is_cached(path, quality):
                files_to_regenerate.append((path, quality))
                self.logger.debug(
                    f"Cache outdated or missing: {path.name} at {quality}"
                )

    if not files_to_regenerate:
        self.logger.info("All caches are up-to-date")
        return {}

    results = {}

    for path, quality in files_to_regenerate:
        try:
            self.audio_cache.transcode_file(path, quality, overwrite=True)
            key = f"{path}_{quality}"
            results[key] = {"success": True}
            self.logger.debug(f"Regenerated cache: {path.name} at {quality}")
        except Exception as e:
            key = f"{path}_{quality}"
            results[key] = {"success": False, "error": str(e)}
            self.logger.error(f"Failed to regenerate {path.name}: {e}")

    return results

Convenience Scheduler¶

`schedule_mixtape_caching(mixtape_tracks, music_root, audio_cache, logger=None, qualities=None, async_mode=True, progress_callback=None)` ¶

Convenience function to schedule caching for a mixtape's tracks.

Parameters:

Name	Type	Description	Default
`mixtape_tracks`	`list[dict]`	List of track dictionaries with 'path' keys.	required
`music_root`	`Path`	Root directory for music files.	required
`audio_cache`	`AudioCache`	AudioCache instance.	required
`logger`	`Logger \| None`	Optional logger.	`None`
`qualities`	`list[QualityLevel]`	Quality levels to cache. Defaults to ["medium"].	`None`
`async_mode`	`bool`	If True, use parallel processing.	`True`
`progress_callback`	`Callable[[int, int], None] \| None`	Optional callback function(current, total) for progress updates.	`None`

Returns:

Type	Description
`dict`	Dictionary with caching results.

Source code in src/audio_cache/cache_worker.py

def schedule_mixtape_caching(
    mixtape_tracks: list[dict],
    music_root: Path,
    audio_cache: AudioCache,
    logger: Logger | None = None,
    qualities: list[QualityLevel] = None,
    async_mode: bool = True,
    progress_callback: Callable[[int, int], None] | None = None,
) -> dict:
    """
    Convenience function to schedule caching for a mixtape's tracks.

    Args:
        mixtape_tracks: List of track dictionaries with 'path' keys.
        music_root: Root directory for music files.
        audio_cache: AudioCache instance.
        logger: Optional logger.
        qualities: Quality levels to cache. Defaults to ["medium"].
        async_mode: If True, use parallel processing.
        progress_callback: Optional callback function(current, total) for progress updates.

    Returns:
        Dictionary with caching results.
    """
    if qualities is None:
        qualities = ["medium"]

    # Convert track dictionaries to Path objects
    track_paths = [music_root / track["path"] for track in mixtape_tracks]

    # Filter out non-existent files
    valid_paths = [path for path in track_paths if path.exists()]

    if len(valid_paths) < len(track_paths):
        missing = len(track_paths) - len(valid_paths)
        if logger:
            logger.warning(f"{missing} track files not found, skipping")

    # Create worker and cache files
    worker = CacheWorker(audio_cache, logger)

    if async_mode:
        return worker.cache_mixtape_async(valid_paths, qualities, progress_callback)
    else:
        return worker.cache_mixtape(valid_paths, qualities, progress_callback)

Progress Tracker (SSE)¶

`get_progress_tracker(logger=None)` ¶

Get or create the global progress tracker instance.

Source code in src/audio_cache/progress_tracker.py

def get_progress_tracker(logger: Logger | None = None) -> ProgressTracker:
    """Get or create the global progress tracker instance."""
    global _progress_tracker
    if _progress_tracker is None:
        _progress_tracker = ProgressTracker(logger)
    return _progress_tracker

`ProgressTracker(logger=None)` ¶

Tracks progress of long-running operations and broadcasts updates via SSE.

Thread-safe implementation that allows multiple operations to report progress while clients listen for updates.

Initializes a new progress tracker with optional logging.

Sets up internal, thread-safe queues for tracking task-specific progress events that can be streamed to clients.

Parameters:

Name	Type	Description	Default
`logger`	`Logger \| None`	Optional logger instance used to record progress tracker activity.	`None`

Methods:

Name	Description
`create_task`	Create a new task for tracking.
`emit`	Emit a progress event.
`listen`	Generator that yields SSE-formatted progress events.
`cleanup_task`	Remove a task and its queue.

Source code in src/audio_cache/progress_tracker.py

def __init__(self, logger: Logger | None = None):
    """Initializes a new progress tracker with optional logging.

    Sets up internal, thread-safe queues for tracking task-specific progress events that can be streamed to clients.

    Args:
        logger: Optional logger instance used to record progress tracker activity.
    """
    self.logger = logger or NullLogger()
    self._queues: dict[str, Queue] = {}
    self._lock = threading.Lock()

`create_task(task_id)` ¶

Create a new task for tracking.

Parameters:

Name	Type	Description	Default
`task_id`	`str`	Unique identifier for this task (e.g., mixtape slug)	required

Source code in src/audio_cache/progress_tracker.py

def create_task(self, task_id: str) -> None:
    """
    Create a new task for tracking.

    Args:
        task_id: Unique identifier for this task (e.g., mixtape slug)
    """
    with self._lock:
        if task_id not in self._queues:
            self._queues[task_id] = Queue()
            self.logger.debug(f"Created progress task: {task_id}")

`emit(task_id, step, status, message, current=0, total=0)` ¶

Emit a progress event.

Parameters:

Name	Type	Description	Default
`task_id`	`str`	Task identifier	required
`step`	`str`	Name of the current step (e.g., "saving", "caching_track")	required
`status`	`ProgressStatus`	Current status	required
`message`	`str`	Human-readable message	required
`current`	`int`	Current progress count	`0`
`total`	`int`	Total items to process	`0`

Source code in src/audio_cache/progress_tracker.py

def emit(
    self,
    task_id: str,
    step: str,
    status: ProgressStatus,
    message: str,
    current: int = 0,
    total: int = 0
) -> None:
    """
    Emit a progress event.

    Args:
        task_id: Task identifier
        step: Name of the current step (e.g., "saving", "caching_track")
        status: Current status
        message: Human-readable message
        current: Current progress count
        total: Total items to process
    """
    event = ProgressEvent(
        task_id=task_id,
        step=step,
        status=status,
        message=message,
        current=current,
        total=total
    )

    with self._lock:
        if task_id in self._queues:
            self._queues[task_id].put(event)
            self.logger.debug(f"[{task_id}] {step}: {message} ({current}/{total})")

`listen(task_id, timeout=300)` ¶

Generator that yields SSE-formatted progress events.

Parameters:

Name	Type	Description	Default
`task_id`	`str`	Task identifier to listen to	required
`timeout`	`int`	Maximum time to wait for events (seconds)	`300`

Yields:

Name	Type	Description
`str`		SSE-formatted event strings

Source code in src/audio_cache/progress_tracker.py

def listen(self, task_id: str, timeout: int = 300):
    """
    Generator that yields SSE-formatted progress events.

    Args:
        task_id: Task identifier to listen to
        timeout: Maximum time to wait for events (seconds)

    Yields:
        str: SSE-formatted event strings
    """
    self.create_task(task_id)

    start_time = time.time()

    # Send initial connection event
    yield f"data: {json.dumps({'type': 'connected', 'task_id': task_id})}\n\n"

    while True:
        # Check timeout
        if time.time() - start_time > timeout:
            self.logger.warning(f"Progress stream timeout for task: {task_id}")
            break

        try:
            with self._lock:
                queue = self._queues.get(task_id)

            if queue is None:
                break

            # Get event with short timeout to allow checking for completion
            event = queue.get(timeout=1)

            yield event.to_sse()

            # If task is completed or failed, clean up and stop
            if event.status in (ProgressStatus.COMPLETED, ProgressStatus.FAILED):
                self.logger.debug(f"Task completed: {task_id}")
                break

        except Empty:
            # Send keepalive
            yield ": keepalive\n\n"
            continue

    # Cleanup
    self.cleanup_task(task_id)

`cleanup_task(task_id)` ¶

Remove a task and its queue.

Parameters:

Name	Type	Description	Default
`task_id`	`str`	Task identifier to clean up	required

Source code in src/audio_cache/progress_tracker.py

def cleanup_task(self, task_id: str) -> None:
    """
    Remove a task and its queue.

    Args:
        task_id: Task identifier to clean up
    """
    with self._lock:
        if task_id in self._queues:
            del self._queues[task_id]
            self.logger.debug(f"Cleaned up progress task: {task_id}")

`ProgressCallback(task_id, tracker, total_tracks)` ¶

Callback wrapper for audio caching progress.

Translates cache worker progress updates into SSE events.

Initialize the progress callback.

Parameters:

Name	Type	Description	Default
`task_id`	`str`	Task identifier	required
`tracker`	`ProgressTracker`	ProgressTracker instance	required
`total_tracks`	`int`	Total number of tracks to cache	required

Methods:

Name	Description
`__call__`	Called by cache worker with progress updates.
`track_cached`	Records that a track has been successfully cached.
`track_skipped`	Records that a track was intentionally skipped during caching.
`track_failed`	Records that caching a track has failed.

Source code in src/audio_cache/progress_tracker.py

def __init__(self, task_id: str, tracker: ProgressTracker, total_tracks: int):
    """
    Initialize the progress callback.

    Args:
        task_id: Task identifier
        tracker: ProgressTracker instance
        total_tracks: Total number of tracks to cache
    """
    self.task_id = task_id
    self.tracker = tracker
    self.total_tracks = total_tracks
    self.cached_count = 0
    self.skipped_count = 0
    self.failed_count = 0

`call(current, total)` ¶

Called by cache worker with progress updates.

Parameters:

Name	Type	Description	Default
`current`	`int`	Current file number	required
`total`	`int`	Total files to process	required

Source code in src/audio_cache/progress_tracker.py

def __call__(self, current: int, total: int) -> None:
    """
    Called by cache worker with progress updates.

    Args:
        current: Current file number
        total: Total files to process
    """
    self.tracker.emit(
        task_id=self.task_id,
        step="caching",
        status=ProgressStatus.IN_PROGRESS,
        message=f"Caching track {current} of {total}",
        current=current,
        total=total
    )

`track_cached(track_name)` ¶

Records that a track has been successfully cached.

Increments the count of cached tracks and emits a progress event reflecting the updated completion state.

Parameters:

Name	Type	Description	Default
`track_name`	`str`	The display name or identifier of the cached track.	required

Source code in src/audio_cache/progress_tracker.py

def track_cached(self, track_name: str) -> None:
    """Records that a track has been successfully cached.

    Increments the count of cached tracks and emits a progress event reflecting the updated completion state.

    Args:
        track_name: The display name or identifier of the cached track.
    """
    self.cached_count += 1
    self.tracker.emit(
        task_id=self.task_id,
        step="track_cached",
        status=ProgressStatus.IN_PROGRESS,
        message=f"✓ Cached: {track_name}",
        current=self.cached_count,
        total=self.total_tracks
    )

`track_skipped(track_name, reason='already cached')` ¶

Records that a track was intentionally skipped during caching.

Increments the skipped count and emits a progress event explaining why the track was not processed.

Parameters:

Name	Type	Description	Default
`track_name`	`str`	The display name or identifier of the skipped track.	required
`reason`	`str`	Human-readable explanation for why the track was skipped.	`'already cached'`

Source code in src/audio_cache/progress_tracker.py

def track_skipped(self, track_name: str, reason: str = "already cached") -> None:
    """Records that a track was intentionally skipped during caching.

    Increments the skipped count and emits a progress event explaining why the track was not processed.

    Args:
        track_name: The display name or identifier of the skipped track.
        reason: Human-readable explanation for why the track was skipped.
    """
    self.skipped_count += 1
    self.tracker.emit(
        task_id=self.task_id,
        step="track_skipped",
        status=ProgressStatus.SKIPPED,
        message=f"⊘ Skipped: {track_name} ({reason})",
        current=self.cached_count + self.skipped_count,
        total=self.total_tracks
    )

`track_failed(track_name, error)` ¶

Records that caching a track has failed.

Increments the failed count and emits a progress event describing the error that occurred.

Parameters:

Name	Type	Description	Default
`track_name`	`str`	The display name or identifier of the track that failed to cache.	required
`error`	`str`	Human-readable error description explaining the failure.	required

Source code in src/audio_cache/progress_tracker.py

def track_failed(self, track_name: str, error: str) -> None:
    """Records that caching a track has failed.

    Increments the failed count and emits a progress event describing the error that occurred.

    Args:
        track_name: The display name or identifier of the track that failed to cache.
        error: Human-readable error description explaining the failure.
    """
    self.failed_count += 1
    self.tracker.emit(
        task_id=self.task_id,
        step="track_failed",
        status=ProgressStatus.FAILED,
        message=f"✗ Failed: {track_name} - {error}",
        current=self.cached_count + self.skipped_count + self.failed_count,
        total=self.total_tracks
    )

🛠️ Configuration Options¶

Option	Default	Description
`AUDIO_CACHE_DIR`	`"cache/audio"`	Directory where MP3 caches are stored (relative to DATA_ROOT).
`AUDIO_CACHE_ENABLED`	`True`	Master switch – set to `False` to bypass the entire subsystem.
`AUDIO_CACHE_DEFAULT_QUALITY`	`"medium"`	Quality used when a client does not specify one.
`AUDIO_CACHE_MAX_WORKERS`	`4`	Number of parallel threads for batch transcoding.
`AUDIO_CACHE_PRECACHE_ON_UPLOAD`	`True`	Auto-cache mixtape tracks when a mixtape is saved.
`AUDIO_CACHE_PRECACHE_QUALITIES`	`["medium"]`	List of qualities to pre-generate (e.g., `["low", "medium", "high"]`).

These values are defined in src/config/config.py and can be overridden with environment variables (e.g., AUDIO_CACHE_MAX_WORKERS=8).

⏳ Progress Tracking (SSE)¶

The progress modal in the editor UI subscribes to the endpoint:

GET /editor/progress/<slug>

The server returns a Server‑Sent Events stream. Each event looks like:

{
  "task_id": "summer-vibes",
  "step": "caching",
  "status": "in_progress",
  "message": "Caching track 3 of 15",
  "current": 3,
  "total": 15,
  "timestamp": "2024-09-28T12:34:56.789012"
}

The modal updates the progress bar, logs messages, and shows a final summary when the status becomes completed or failed.

Implementation note:ProgressCallback.track_cached(), track_skipped(), and track_failed() are called from CacheWorker to emit the above events.

🔧 Troubleshooting FAQ¶

Cache Misses – “Why isn’t my file being cached?”¶

Symptom	Check	Fix
Cache miss warning in logs	`grep -i "cache miss" app.log`	Verify `AUDIO_CACHE_ENABLED=True` and that the file’s suffix is in `should_transcode` (FLAC, WAV, AIFF, APE, ALAC).
Cache file exists but not found	`ls collection-data/cache/audio/`	Ensure the hash matches the current absolute path. If you moved the music folder, run `python debug_cache.py <MUSIC_ROOT> <REL_PATH> <CACHE_DIR>` (see debug_cache.py).
Cache never generated	`AUDIO_CACHE_PRECACHE_ON_UPLOAD=False`	Enable pre-caching or trigger it manually via `schedule_mixtape_caching`.
ffmpeg not found	`ffmpeg -version`	Install ffmpeg on the host (Ubuntu: `apt install ffmpeg`; Alpine: `apk add ffmpeg`).
Permission denied on cache dir	`ls -ld collection-data/cache/audio`	The Flask process must have write permission (owner UID = the container user).
High CPU usage during batch caching	`top while caching`	Reduce `AUDIO_CACHE_MAX_WORKERS` (e.g., `export AUDIO_CACHE_MAX_WORKERS=2`).
Stale cache after source file change	Compare timestamps (`stat -c %Y file`)	Run `cache.clear_cache()` or set `overwrite=True` in `transcode_file`.

Transcoding Failures – “ffmpeg exited with error code 1”"¶

Inspect the ffmpeg stderr – it is logged by AudioCache.transcode_file.
Common culprits:
Corrupt source file – try re‑encoding the source with ffmpeg -i inut.flac -c copy output.flac.
Unsupported codec – ensure the source is a supported lossless format.
Insufficient disk space – check free space on the cache volume.
Manual test:
```
ffmpeg -i "/music/Artist/Album/BadTrack.flac" -b:a 192k -y "/tmp/test.mp3"
```
If this works, the problem is likely in the path handling (hash mismatch).
Fix path mismatches – run debug_cache.py (see the script in the repo) to compare the hash generated by the app vs. the one you expect.

Audio Caching System¶

📖 Overview¶

🏛️ Architecture Overview¶

✨ Key Features¶

📋 How It Works (Step‑by‑Step)¶

Cache Path Generation¶

Transcoding Flow¶

Playback Flow¶

🔌 API Reference¶

AudioCache (core)¶

AudioCache(cache_dir, logger=None) ¶

get_cache_path(original_path, quality='medium') ¶

should_transcode(file_path) ¶

is_cached(original_path, quality='medium') ¶

transcode_file(original_path, quality='medium', overwrite=False) ¶

get_cached_or_original(original_path, quality='medium') ¶

precache_file(original_path, qualities=None) ¶

get_cache_size() ¶

clear_cache(older_than_days=None) ¶

CacheWorker (batch & async)¶

CacheWorker(audio_cache, logger=None, max_workers=4) ¶

cache_single_file(file_path, qualities=None) ¶

cache_mixtape(track_paths, qualities=None, progress_callback=None) ¶

cache_mixtape_async(track_paths, qualities=None, progress_callback=None) ¶

verify_mixtape_cache(track_paths, quality='medium') ¶

regenerate_outdated_cache(track_paths, qualities=None) ¶

Convenience Scheduler¶

schedule_mixtape_caching(mixtape_tracks, music_root, audio_cache, logger=None, qualities=None, async_mode=True, progress_callback=None) ¶

Progress Tracker (SSE)¶

get_progress_tracker(logger=None) ¶

ProgressTracker(logger=None) ¶

create_task(task_id) ¶

emit(task_id, step, status, message, current=0, total=0) ¶

listen(task_id, timeout=300) ¶

cleanup_task(task_id) ¶

ProgressCallback(task_id, tracker, total_tracks) ¶

__call__(current, total) ¶

track_cached(track_name) ¶

track_skipped(track_name, reason='already cached') ¶

track_failed(track_name, error) ¶

🛠️ Configuration Options¶

⏳ Progress Tracking (SSE)¶

🔧 Troubleshooting FAQ¶

Cache Misses – “Why isn’t my file being cached?”¶

Transcoding Failures – “ffmpeg exited with error code 1”"¶

`AudioCache(cache_dir, logger=None)` ¶

`get_cache_path(original_path, quality='medium')` ¶

`should_transcode(file_path)` ¶

`is_cached(original_path, quality='medium')` ¶

`transcode_file(original_path, quality='medium', overwrite=False)` ¶

`get_cached_or_original(original_path, quality='medium')` ¶

`precache_file(original_path, qualities=None)` ¶

`get_cache_size()` ¶

`clear_cache(older_than_days=None)` ¶

`CacheWorker(audio_cache, logger=None, max_workers=4)` ¶

`cache_single_file(file_path, qualities=None)` ¶

`cache_mixtape(track_paths, qualities=None, progress_callback=None)` ¶

`cache_mixtape_async(track_paths, qualities=None, progress_callback=None)` ¶

`verify_mixtape_cache(track_paths, quality='medium')` ¶

`regenerate_outdated_cache(track_paths, qualities=None)` ¶

`schedule_mixtape_caching(mixtape_tracks, music_root, audio_cache, logger=None, qualities=None, async_mode=True, progress_callback=None)` ¶

`get_progress_tracker(logger=None)` ¶

`ProgressTracker(logger=None)` ¶

`create_task(task_id)` ¶

`emit(task_id, step, status, message, current=0, total=0)` ¶

`listen(task_id, timeout=300)` ¶

`cleanup_task(task_id)` ¶

`ProgressCallback(task_id, tracker, total_tracks)` ¶

`call(current, total)` ¶

`track_cached(track_name)` ¶

`track_skipped(track_name, reason='already cached')` ¶

`track_failed(track_name, error)` ¶