LLM Call
Make a single LLM API call (Tier 2 execution). Cheaper than spawning a full agent session. Supports freeform text or structured JSON output via output_schema (inline dict or named schema from work_buddy/llm/schemas/). Routes to Claude via 'tier' or to a local/remote OpenAI-compatible server (LM Studio, vLLM, Ollama) via 'profile'. Handles caching and cost tracking automatically.
MCP name: llm_call
Category: llm
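As a rough illustration of the two output modes described above, the sketch below builds two argument payloads for `llm_call`: one freeform call routed by `tier`, and one structured call routed by `profile` with an inline `output_schema`. The prompts and the `sentiment_schema` contents are made up for illustration, and how the payload is actually sent (MCP client, gateway, or otherwise) is not specified here, so the example only constructs and prints the arguments.

```python
import json

# Hypothetical arguments for a freeform-text call routed to a cloud tier.
freeform_args = {
    "system": "You are a concise assistant.",
    "user": "Summarize this note in one sentence: the deploy is delayed to Friday.",
    "tier": "haiku",
}

# Illustrative inline JSON Schema, passed as a dict via output_schema.
sentiment_schema = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "confidence": {"type": "number"},
    },
    "required": ["sentiment"],
}

# Hypothetical arguments for a structured call routed through a named profile.
# 'profile' and 'tier' are mutually exclusive, so 'tier' is left out here.
structured_args = {
    "system": "Classify the sentiment of the user's message.",
    "user": "The build finally passed!",
    "profile": "local_general",
    "output_schema": sentiment_schema,
}

print(json.dumps(structured_args, indent=2))
```

To use a named schema instead, `output_schema` would be a string naming a schema in work_buddy/llm/schemas/ rather than a dict.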
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| cache_ttl_minutes | int | No | Cache TTL in minutes. None=config default, 0=no cache. |
| max_tokens | int | No | Max response tokens (default: 1024) |
| output_schema | dict\|str | No | JSON Schema for structured output. Pass a dict for inline schemas, or a string name to load from work_buddy/llm/schemas/ |
| profile | str | No | Named local/remote profile (e.g. 'local_general') declared under llm.profiles in config. Routes through the profile's backend instead of Anthropic. Mutually exclusive with 'tier'. |
| system | str | Yes | System prompt |
| temperature | float | No | Sampling temperature (default: 0.0) |
| tier | str | No | Cloud model tier: 'haiku' (default if no profile given), 'sonnet', or 'opus'. Mutually exclusive with 'profile'. |
| user | str | Yes | User message content |
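The constraints in the table (two required fields, `tier` and `profile` being mutually exclusive, and the documented defaults) can be checked on the caller's side before a request is made. The helper below is a minimal sketch of that idea; `build_llm_call_args` is a hypothetical name, not part of work_buddy, and the tool itself may validate differently.

```python
from typing import Any, Dict, Optional, Union

VALID_TIERS = {"haiku", "sonnet", "opus"}


def build_llm_call_args(
    system: str,
    user: str,
    *,
    tier: Optional[str] = None,
    profile: Optional[str] = None,
    output_schema: Optional[Union[dict, str]] = None,
    max_tokens: int = 1024,
    temperature: float = 0.0,
    cache_ttl_minutes: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble llm_call arguments per the parameter table."""
    if not system or not user:
        raise ValueError("'system' and 'user' are both required")
    if tier is not None and profile is not None:
        raise ValueError("'tier' and 'profile' are mutually exclusive")
    if tier is not None and tier not in VALID_TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    if output_schema is not None and not isinstance(output_schema, (dict, str)):
        raise TypeError("'output_schema' must be a dict or a schema name")

    args: Dict[str, Any] = {
        "system": system,
        "user": user,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    if profile is not None:
        args["profile"] = profile
    else:
        # Mirror the documented default: 'haiku' when no profile is given.
        args["tier"] = tier or "haiku"
    if output_schema is not None:
        args["output_schema"] = output_schema
    if cache_ttl_minutes is not None:
        # None is omitted so the config default applies; 0 disables caching.
        args["cache_ttl_minutes"] = cache_ttl_minutes
    return args


args = build_llm_call_args("Be brief.", "What is a TTL?", cache_ttl_minutes=0)
print(args)
```

Omitting `cache_ttl_minutes` when it is None keeps the config default in effect, while passing 0 is preserved explicitly because it carries a distinct meaning (no caching).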