# CLI Configuration

The llmist CLI is configured through TOML files and environment variables.

The CLI loads configuration from `~/.llmist/cli.toml`:

```sh
# Initialize with default config
npx @llmist/cli init
```

`~/.llmist/cli.toml`:

```toml
[complete]
model = "anthropic:claude-sonnet-4-5"
temperature = 0.7

[agent]
model = "anthropic:claude-sonnet-4-5"
max-iterations = 15
gadget = ["~/gadgets/common-tools.ts"]
```

## Inheritance

Sections can inherit settings from parent sections:

```toml
[agent]
model = "anthropic:claude-sonnet-4-5"
max-iterations = 15

[code-review]
inherits = "agent"
temperature = 0.3
system = "You are a code reviewer."
```
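With `inherits = "agent"`, the child section starts from the parent's values and then applies its own. The effective `code-review` configuration above is therefore equivalent to this (illustrative expansion, not literal output of the tool):

```toml
[code-review]
model = "anthropic:claude-sonnet-4-5"  # inherited from [agent]
max-iterations = 15                    # inherited from [agent]
temperature = 0.3                      # own value
system = "You are a code reviewer."    # own value
```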

## Custom Commands

Create custom commands with a `type` field:

```toml
[code-review]
type = "agent"
description = "Review code for bugs and best practices."
system = "You are a senior code reviewer."
max-iterations = 5
gadget = ["~/gadgets/code-tools.ts"]
```

Run with:

```sh
npx @llmist/cli code-review "Review my PR"
```

## Rate Limiting

Control proactive rate limiting to prevent API errors before they occur.

Apply to all commands:

```toml
[rate-limits]
requests-per-minute = 50
tokens-per-minute = 40000
tokens-per-day = 1500000  # Optional daily cap
safety-margin = 0.8       # Start throttling at 80% of limit
enabled = true            # Default: true if any limit is set
```
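The safety margin simply scales each configured limit: throttling begins at `limit × safety-margin` rather than at the hard limit itself. A minimal sketch of that arithmetic (illustrative; the function name is not part of llmist):

```typescript
// Compute the point at which proactive throttling starts,
// given a hard limit and the configured safety margin.
function throttleThreshold(limit: number, safetyMargin: number): number {
  return Math.floor(limit * safetyMargin);
}

// With the config above:
// requests-per-minute = 50    → throttling begins around 40 requests/min
// tokens-per-minute  = 40000  → throttling begins around 32000 tokens/min
const rpmThreshold = throttleThreshold(50, 0.8);
const tpmThreshold = throttleThreshold(40000, 0.8);
```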

Override global settings for specific commands:

```toml
[agent.rate-limits]
requests-per-minute = 15   # More conservative for agent mode
tokens-per-minute = 100000

[complete.rate-limits]
requests-per-minute = 100  # Higher limits for completion
tokens-per-minute = 200000
```

### Provider Presets

Gemini Free Tier:

```toml
[agent.rate-limits]
requests-per-minute = 15
tokens-per-minute = 1000000
tokens-per-day = 1500000
```

Anthropic Tier 1:

```toml
[agent.rate-limits]
requests-per-minute = 50
tokens-per-minute = 40000
```

OpenAI Free Tier:

```toml
[agent.rate-limits]
requests-per-minute = 3
tokens-per-minute = 40000
```

Disable rate limiting globally:

```toml
[rate-limits]
enabled = false
```

Or per-command:

```toml
[agent.rate-limits]
enabled = false
```

## Retry

Control automatic retry behavior for transient failures.

```toml
[retry]
enabled = true              # Default: true
retries = 3                 # Max retry attempts
min-timeout = 1000          # Initial delay (ms)
max-timeout = 30000         # Maximum delay (ms)
factor = 2                  # Exponential backoff multiplier
randomize = true            # Add jitter to prevent thundering herd
respect-retry-after = true  # Honor Retry-After headers
max-retry-after-ms = 120000 # Cap server-requested delays (2 minutes)

[agent.retry]
retries = 5                 # More retries for long-running agents
max-timeout = 60000         # Up to 1 minute between retries

[complete.retry]
retries = 1                 # Fast-fail for completions
max-timeout = 5000
```
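These settings imply the classic exponential-backoff schedule: the first retry waits `min-timeout`, each subsequent delay multiplies by `factor`, and delays are capped at `max-timeout`. A sketch of that schedule (illustrative; jitter from `randomize = true` is omitted, and the names are not llmist internals):

```typescript
interface RetryConfig {
  retries: number;    // max retry attempts
  minTimeout: number; // initial delay in ms
  maxTimeout: number; // delay ceiling in ms
  factor: number;     // exponential multiplier
}

// Delay before retry attempt n (0-based), without jitter.
function backoffDelay(attempt: number, cfg: RetryConfig): number {
  return Math.min(cfg.minTimeout * cfg.factor ** attempt, cfg.maxTimeout);
}

const defaults: RetryConfig = { retries: 3, minTimeout: 1000, maxTimeout: 30000, factor: 2 };

// With the defaults above, the three retries wait 1000, 2000, and 4000 ms.
const delays = Array.from({ length: defaults.retries }, (_, i) => backoffDelay(i, defaults));
```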

## Configuration Priority

Settings are applied in this order (highest to lowest priority):

  1. CLI flags (`--rate-limit-rpm`, `--max-retries`, `--no-retry`, etc.)
  2. Profile-specific TOML config (`[agent.rate-limits]`, `[complete.retry]`)
  3. Global TOML config (`[rate-limits]`, `[retry]`)
  4. Provider defaults (auto-detected from the model)
  5. Built-in defaults

## CLI Flags

Override TOML configuration for individual runs:

Rate Limiting:

```sh
llmist agent "prompt" --rate-limit-rpm 50 --rate-limit-tpm 40000
llmist agent "prompt" --no-rate-limit
```

Retry:

```sh
llmist agent "prompt" --max-retries 5 --retry-max-timeout 60000
llmist agent "prompt" --no-retry
```

For complete flag documentation, run:

```sh
llmist agent --help
llmist complete --help
```

## Prompt Templates

Define reusable prompts with Eta templating:

```toml
[prompts]
base-assistant = "You are a helpful AI assistant."
expert = """
<%~ include("@base-assistant") %>
You are also an expert in <%= it.field %>.
"""

[my-expert]
system = '<%~ include("@expert", {field: "TypeScript"}) %>'
```
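Assuming the includes render verbatim, the `my-expert` command's system prompt would resolve to roughly:

```text
You are a helpful AI assistant.
You are also an expert in TypeScript.
```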

## File Includes

Load prompt content from external files with `includeFile()`:

```toml
[prompts]
# Load entire prompt from file
from-file = '<%~ includeFile("~/.llmist/prompts/custom.md") %>'

# Mix file content with inline content
hybrid = """
<%~ include("@base-assistant") %>
<%~ includeFile("./prompts/project-rules.txt") %>
"""

[code-review]
system = '<%~ includeFile("~/.llmist/prompts/code-review.md") %>'
```

Path resolution:

  • `~` expands to the home directory
  • Relative paths resolve from the config file's location
  • Included files can themselves use `includeFile()` and `include()`
  • Circular includes are detected and prevented
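For example, since relative paths resolve from the config file's location, a config at `~/.llmist/cli.toml` would resolve a relative include like this (illustrative):

```toml
[code-review]
# "./prompts/code-review.md" resolves to ~/.llmist/prompts/code-review.md
system = '<%~ includeFile("./prompts/code-review.md") %>'
```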

## Gadget Approval

The gadget approval system provides a safety layer for potentially dangerous gadget executions.

| Mode                | Behavior                                         |
| ------------------- | ------------------------------------------------ |
| `allowed`           | Gadget executes immediately                      |
| `denied`            | Gadget is rejected; LLM receives denial message  |
| `approval-required` | User is prompted before execution                |

By default, these gadgets require approval:

  • `RunCommand` - Executes shell commands
  • `WriteFile` - Creates or modifies files
  • `EditFile` - Edits existing files

All other gadgets default to `allowed`.
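Expressed as explicit configuration, those defaults are equivalent to (illustrative, not something you need to write):

```toml
[agent]
gadget-approval = { RunCommand = "approval-required", WriteFile = "approval-required", EditFile = "approval-required", "*" = "allowed" }
```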

Configure approval modes per gadget:

```toml
[agent]
gadget-approval = { WriteFile = "allowed", Shell = "denied", ReadFile = "allowed" }
```

Set the default mode for all unconfigured gadgets with the `"*"` wildcard:

```toml
[agent]
gadget-approval = { "*" = "denied", ReadFile = "allowed", FloppyDisk = "allowed" }
```

High-Security Mode (TOML inline tables must fit on one line, so longer maps use a sub-table):

```toml
[agent.gadget-approval]
WriteFile = "denied"
EditFile = "denied"
RunCommand = "denied"
ReadFile = "allowed"
```

Trust All Mode:

```toml
[agent]
gadget-approval = { "*" = "allowed" }
```

Selective Approval:

```toml
[agent.gadget-approval]
RunCommand = "approval-required"
WriteFile = "allowed"
DeleteFile = "denied"
```

For file operations, a colored diff is shown:

```text
🔒 Approval required: Modify src/index.ts

--- src/index.ts (original)
+++ src/index.ts (modified)
@@ -1,3 +1,4 @@
 import { foo } from './foo';
+import { bar } from './bar';

⏎ approve, or type to reject:
```

  • Press Enter or type `y` to approve
  • Type any other text to reject (sent to the LLM as feedback)

When running non-interactively (e.g., in scripts or CI), approval-required gadgets are automatically denied.
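Because prompts are unavailable in such runs, a CI profile should pre-approve exactly the gadgets the task needs rather than leave anything in `approval-required` mode. A sketch (gadget names are illustrative):

```toml
[agent]
# CI: no interactive prompts, so approval-required would auto-deny.
gadget-approval = { ReadFile = "allowed", WriteFile = "allowed", RunCommand = "denied" }
```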

## Environment Variables

API keys:

```sh
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GEMINI_API_KEY="..."
```

Logging:

```sh
export LLMIST_LOG_LEVEL="debug" # silly, trace, debug, info, warn, error, fatal
export LLMIST_LOG_TEE="true"    # Also write to stdout when file logging is active
```

## Global Flags

These flags work with any command:

| Flag                  | Description         |
| --------------------- | ------------------- |
| `--log-level <level>` | Set log level       |
| `--version`           | Show version number |
| `--help`              | Show help           |

## MCP Servers

Configure Model Context Protocol (MCP) servers that the agent should attach to. Each server is its own table.

The MCP config schema is strict: `[mcp]`, `[mcp.servers]`, each server block, and nested `env` / `headers` blocks must have the expected shape. Unknown keys and invalid values fail config validation with a path-specific error instead of silently dropping a server.

```toml
[mcp.servers.fs]
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
# Optional:
# trust = true         # bypass the runtime allowlist for this server
# enabled = false      # temporarily disable without removing the block
# timeout-ms = 30000   # per-call timeout
# env = { TZ = "UTC" }

[mcp.servers.remote]
transport = "http"
url = "https://my-mcp.example.com/mcp"

[mcp.servers.remote.headers]
Authorization = "Bearer xyz"
```
| Field        | Required   | Default    | Notes |
| ------------ | ---------- | ---------- | ----- |
| `transport`  | yes        |            | `"stdio"` or `"http"` (legacy SSE not supported) |
| `command`    | stdio only |            | Executable; basename gated by allowlist unless `trust = true` |
| `args`       | optional   | `[]`       | Arguments to the executable |
| `env`        | optional   | parent env | Environment overrides for the spawned child |
| `url`        | http only  |            | Must include `http://` or `https://` |
| `headers`    | optional   |            | Fixed HTTP headers (e.g. `Authorization`) |
| `trust`      | optional   | `false`    | Bypass the stdio command allowlist for this server |
| `enabled`    | optional   | `true`     | Set `false` to skip the block at startup |
| `timeout-ms` | optional   | none       | Per-operation timeout in milliseconds; must be a non-negative integer, and `0` disables the timeout |

## Importing from Claude Code

If you already have MCP servers configured in `~/.claude.json` (Claude Code, Cursor, Cline), pull them into your llmist config:

```sh
llmist mcp import-claude-code                           # emit blocks to stdout
llmist mcp import-claude-code >> ~/.llmist/config.toml
llmist mcp import-claude-code --write                   # append directly to ~/.llmist/config.toml
```

Set `CLAUDE_CONFIG_HOME` to override the source path.