Agents¶
Every instrument is driven by a dedicated agent class that extends BaseMusician (for performers) or BaseEngineer (for processing roles). All agents receive a SessionContext and return structured MidiTrackData or configuration objects.
Agent Hierarchy¶
classDiagram
class BaseAgent {
+name: str
+llm: LLM | None
+generate(context, progression) MidiTrackData
}
class BaseMusician {
+instrument: str
+midi_channel: int
}
class BaseEngineer {
+process(tracks) dict
}
class DrummerAgent
class BassistAgent
class GuitaristAgent
class LeadGuitarAgent
class KeyboardistAgent
class StringsAgent
class BrassAgent
class SynthAgent
class PercussionAgent
class MixerAgent
class MasteringAgent
BaseAgent <|-- BaseMusician
BaseAgent <|-- BaseEngineer
BaseMusician <|-- DrummerAgent
BaseMusician <|-- BassistAgent
BaseMusician <|-- GuitaristAgent
BaseMusician <|-- LeadGuitarAgent
BaseMusician <|-- KeyboardistAgent
BaseMusician <|-- StringsAgent
BaseMusician <|-- BrassAgent
BaseMusician <|-- SynthAgent
BaseMusician <|-- PercussionAgent
BaseEngineer <|-- MixerAgent
BaseEngineer <|-- MasteringAgent
Musician Agents¶
DrummerAgent¶
Module: audio_engineer.agents.musician.drummer
MIDI channel: 9 (General MIDI drum channel)
Generates kick, snare, and hi-hat patterns based on the genre preset and section type (intro, verse, chorus, outro).
Key behaviors:
- Patterns are drawn from the genre pattern library (core/patterns.py) — 22 genres, 40+ patterns
- All 40 PAS Standard Drum Rudiments are available as DRUM_RUDIMENTS in core/patterns.py
- Velocity varies by beat position to simulate human dynamics
BassistAgent¶
Module: audio_engineer.agents.musician.bassist
Generates root-note bass lines that follow the chord progression and rhythmically lock to the kick drum pattern from the drummer.
Key behaviors:
- Plays the chord root on down beats
- Adds passing tones and octave jumps based on genre feel
- 8 genre-specific BassPattern presets available: jazz walking, funk slap, reggae skank, Latin tumbao, Motown, country boom-chick, R&B two-feel, pop root-fifth
GuitaristAgent¶
Module: audio_engineer.agents.musician.guitarist
Generates rhythm guitar parts — strummed chords, power chords, or arpeggios depending on genre.
Key behaviors:
- Chord voicings derived from music_theory.get_chord_voicing()
- Genre determines strum pattern and palm-muting style
- Can generate both rhythm and lead layers
LeadGuitarAgent¶
Module: audio_engineer.agents.musician.lead_guitar
MIDI channel: 4
Generates pentatonic licks, scale runs, and fills over chord changes.
Key behaviors:
- Derives lick pitches from the pentatonic minor or blues scale of the current key
- High intensity sections produce denser runs; low intensity plays sparse phrases
- Uses LEAD_GUITAR instrument by default; GM program 29 (overdriven guitar)
KeyboardistAgent¶
Module: audio_engineer.agents.musician.keyboardist
Generates chord pads, voicings, and sustained notes. Enabled with --with-keys or with_keys=True.
Key behaviors: - Avoids doubling guitar voicings by choosing wider spread chords - Sustains chords across bar boundaries for pad-like texture - Reduces velocity relative to guitar to sit in the background
StringsAgent¶
Module: audio_engineer.agents.musician.strings
MIDI channel: 5
Generates sustained legato string lines following chord tones, with style varying by intensity.
Key behaviors:
- Low intensity — short pizzicato-style notes (staccato feel)
- High intensity — long tremolo/sustain notes across the whole bar
- Mid intensity — one note per beat following chord tones
- Supports STRINGS (program 48, string ensemble 1) and VIOLIN (program 40) instruments
BrassAgent¶
Module: audio_engineer.agents.musician.brass
MIDI channel: 6
Generates stab chords, long tones, and phrase fall-offs for brass instruments.
Key behaviors: - Stabs are short quarter-note hits on beat 1 and beat 3 - Long tones span the full bar at moderate intensity - At high intensity, adds additional stab notes mid-phrase - Program selection by instrument: trumpet (56), trombone (57), tenor sax (66), alto sax (65), brass section (61)
SynthAgent¶
Module: audio_engineer.agents.musician.synth
MIDI channel: 7
Generates sustained pads and 16th-note arpeggio patterns for synthesizer instruments.
Key behaviors:
- PAD instrument — long whole-note sustains across each bar (programs 88–95)
- SYNTHESIZER instrument — 16th-note arpeggio cycling through chord tones (programs 80–87)
- Arpeggio direction: ascending by default
PercussionAgent¶
Module: audio_engineer.agents.musician.percussion
MIDI channel: 9
Generates Latin/Afro-Cuban hand drum patterns for ethnic percussion instruments.
Key behaviors:
- Uses dedicated GM percussion note numbers (conga, bongo pitches from GM_DRUMS)
- Supports CONGA, BONGO, DJEMBE, and generic PERCUSSION instruments
- Patterns are genre-aware (Latin patterns for Genre.LATIN, Genre.BOSSA_NOVA; funk patterns otherwise)
Engineering Agents¶
MixerAgent¶
Module: audio_engineer.agents.engineer.mixer
Assigns volume, pan, and EQ metadata to each track. Returns a MixConfig dict that is embedded as MIDI control change events.
| Track | Volume | Pan | Notes |
|---|---|---|---|
| Drums | 100/127 | center | Kick louder than hats |
| Bass | 90/127 | slight left | Low-cut on guitar side |
| Guitar | 85/127 | slight right | — |
| Keys | 75/127 | center | Filtered for space |
| Strings | 72/127 | slight right | — |
| Brass | 80/127 | center | — |
| Synth | 70/127 | center | — |
| Percussion | 85/127 | center | — |
MasteringAgent¶
Module: audio_engineer.agents.engineer.mastering
Applies final loudness targets and embeds metadata (tempo, key, genre) into the MIDI file header.
SessionContext¶
All agents receive a SessionContext object that carries:
| Field | Type | Description |
|---|---|---|
session_id |
str |
UUID for this generation run |
genre |
str |
Genre preset |
key |
str |
Root note |
mode |
str |
Scale mode |
tempo |
int |
BPM |
sections |
list[SectionConfig] |
Per-section chord progressions and lengths |
tracks |
dict[str, MidiTrackData] |
Completed tracks from earlier agents |
The tracks field grows as each agent completes, giving later agents access to previous results.
Orchestration Order¶
The SessionOrchestrator generates tracks in a fixed order so each agent can react to what came before:
DrummerAgent— establishes the rhythmic foundationBassistAgent— locks to the kick drumGuitaristAgent— rhythm guitarLeadGuitarAgent— melodic fills and licksKeyboardistAgent— chord padsStringsAgent— orchestral colorBrassAgent— horn stabs and linesSynthAgent— pads and arpeggiosPercussionAgent— hand drums
LLM Integration¶
Every musician agent accepts an optional llm parameter:
from langchain_openai import ChatOpenAI
from audio_engineer.agents.musician.drummer import DrummerAgent
agent = DrummerAgent(llm=ChatOpenAI(model="gpt-4o"))
When an LLM is present, the agent sends a structured prompt describing the genre, key, and section, then parses the response into MIDI note data. Without an LLM, the agent falls back to deterministic algorithmic generation.
For provider-level LLM MIDI generation (rather than per-agent), see LLMMidiProvider.
Adding a New Agent¶
- Create
src/audio_engineer/agents/musician/my_agent.py - Extend
BaseMusician - Implement
generate(context: SessionContext, progression: list) -> MidiTrackData - Register the agent in
SessionOrchestrator._build_agents() - Export it from
agents/musician/__init__.py - Add tests in
tests/agents/test_my_agent.py