Multi-Provider System¶

AI Music Studio uses a pluggable provider architecture so that different audio generation backends can be swapped or combined without changing the rest of the codebase.

Overview¶

The audio_engineer.providers package defines:

Class	Role
`AudioProvider`	Abstract base class every backend must implement
`ProviderCapability`	Enum of capabilities a provider can declare
`TrackRequest`	Pydantic model describing a single track to generate
`TrackResult`	Pydantic model returned by every provider
`ProviderRegistry`	Manages registered providers and routes requests
`MidiProvider`	Built-in algorithmic MIDI backend (zero extra dependencies)
`LLMMidiProvider`	LLM-driven MIDI backend; falls back to `MidiProvider` on parse failure
`GeminiLyriaProvider`	Google Lyria 3 audio backend (requires `[gemini]` extra)

Provider Capabilities¶

ProviderCapability is a string enum. Declare which capabilities your provider supports by returning them from the capabilities property:

Capability	Description
`midi_generation`	Produces `.mid` files
`audio_generation`	Produces rendered audio (WAV/MP3)
`vocals`	Can generate or synthesise vocals
`sound_design`	Sound effects and SFX
`audio_analysis`	Transcription, genre/mood detection
`source_separation`	Stem separation
`effects_processing`	DSP / FX chains
`text_to_speech`	Spoken narration

Built-in Providers¶

`MidiProvider`¶

Name: midi_engine
Capabilities: midi_generation
Availability: always available
What it does: wraps the existing SessionOrchestrator to generate algorithmic MIDI using the full agent pipeline (22 genres, 26 instruments)

`LLMMidiProvider`¶

Name: llm_midi
Capabilities: midi_generation
Availability: when an LLM callable is injected
What it does: constructs a structured prompt from the TrackRequest (genre, key, tempo, instrument, style hints) and asks the LLM to return a JSON array of {pitch, velocity, start_beat, duration_beats} objects. Falls back to MidiProvider if the response is unparseable.
Priority: registered at highest priority in ProviderRegistry when configured; use preferred_provider="llm_midi" to force it.

from audio_engineer.providers.llm_midi_provider import LLMMidiProvider
from audio_engineer.providers.base import TrackRequest

provider = LLMMidiProvider(llm=lambda prompt: openai_client.complete(prompt))

request = TrackRequest(
    track_name="jazz_piano",
    description="Comping jazz piano chords",
    instrument="keys",
    genre="jazz",
    tempo=140,
)
result = provider.generate_track(request)
# result.provider_used == "llm_midi"  or  "midi_engine_fallback" if LLM failed

core/llm_prompts.py provides the underlying helpers:

Helper	Description
`build_midi_prompt(request)`	Builds the canonical MIDI generation prompt including JSON schema
`parse_midi_json(text)`	Tolerant JSON parser — strips Markdown fences, returns `None` on failure
`validate_midi_events(events)`	Guards against out-of-range pitches/velocities and beat positions
`events_to_note_events(events, channel, ticks_per_beat, bar_offset_ticks)`	Converts validated dicts to `NoteEvent` objects

`GeminiLyriaProvider`¶

Name: gemini_lyria
Capabilities: audio_generation
Availability: requires AUDIO_ENGINEER_GEMINI_API_KEY and pip install -e ".[gemini]"
What it does: calls Google Lyria 3 to produce full-length AI-generated audio

ProviderRegistry¶

ProviderRegistry manages a collection of providers and selects the best one for each request.

Routing Priority¶

If TrackRequest.preferred_provider is set and available, use it.
If TrackRequest.required_capabilities are set, find the first available provider that supports all of them.
Fall back to the first available provider.

API¶

from audio_engineer.providers import ProviderRegistry, MidiProvider

registry = ProviderRegistry()
registry.register(MidiProvider())

# List all registered providers
print(registry.list_providers())    # ['midi_engine']
print(registry.list_available())    # ['midi_engine']

# Find providers by capability
from audio_engineer.providers import ProviderCapability
midi_providers = registry.find_by_capability(ProviderCapability.MIDI_GENERATION)

# Generate a track
from audio_engineer.providers import TrackRequest
request = TrackRequest(
    track_name="rhythm_guitar",
    description="Driving rhythm guitar for a blues track",
    genre="blues",
    key="A",
    tempo=100,
    required_capabilities=[ProviderCapability.MIDI_GENERATION],
)
result = registry.generate(request)
print(result.success, result.provider_used, result.track)

Writing a Custom Provider¶

Subclass AudioProvider and implement the four required methods:

from audio_engineer.providers.base import AudioProvider, ProviderCapability, TrackRequest, TrackResult

class MyCustomProvider(AudioProvider):
    @property
    def name(self) -> str:
        return "my_provider"

    @property
    def capabilities(self) -> list[ProviderCapability]:
        return [ProviderCapability.AUDIO_GENERATION]

    def is_available(self) -> bool:
        # Return False if required credentials/packages are missing
        return True

    def generate_track(self, request: TrackRequest) -> TrackResult:
        # ... your generation logic ...
        return TrackResult(success=True, provider_used=self.name, track=audio_track)

Then register it:

from audio_engineer.providers import ProviderRegistry

registry = ProviderRegistry()
registry.register(MyCustomProvider())

You can also inject a registry into SessionOrchestrator:

from audio_engineer.agents.orchestrator import SessionOrchestrator

orchestrator = SessionOrchestrator(output_dir="./output")
orchestrator.provider_registry.register(MyCustomProvider())

Configuration¶

Variable	Description	Default
`AUDIO_ENGINEER_DEFAULT_AUDIO_PROVIDER`	Provider used when no preference is specified	`midi_engine`
`AUDIO_ENGINEER_DEFAULT_MIDI_PROVIDER`	Provider used for MIDI-specific requests	`midi_engine`
`AUDIO_ENGINEER_ENABLE_GEMINI_PROVIDER`	Auto-register `GeminiLyriaProvider` on startup	`true`

See Configuration for the full settings reference.