
# CLI Configuration

The llmist CLI is configured through TOML files and environment variables.

## Configuration File

The CLI loads configuration from `~/.llmist/cli.toml`:

```bash
# Initialize with default config
npx @llmist/cli init
```

Example `~/.llmist/cli.toml`:

```toml
[complete]
model = "anthropic:claude-sonnet-4-5"
temperature = 0.7

[agent]
model = "anthropic:claude-sonnet-4-5"
max-iterations = 15
gadget = ["~/gadgets/common-tools.ts"]
```
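
Each command reads its own section. For example (the prompt here is illustrative):

```bash
# Uses model and temperature from [complete]
npx @llmist/cli complete "Summarize the TOML format in one sentence"
```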

## Section Inheritance

Sections can inherit settings from a parent section via `inherits`:

```toml
[agent]
model = "anthropic:claude-sonnet-4-5"
max-iterations = 15

[code-review]
inherits = "agent"
temperature = 0.3
system = "You are a code reviewer."
```
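
After inheritance, `code-review` runs with the merged settings (an illustrative view, not actual CLI output):

```toml
# What [code-review] resolves to:
model = "anthropic:claude-sonnet-4-5"  # inherited from [agent]
max-iterations = 15                    # inherited from [agent]
temperature = 0.3                      # defined in [code-review]
system = "You are a code reviewer."    # defined in [code-review]
```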

## Custom Commands

Create custom commands by adding a section with a `type` field:

```toml
[code-review]
type = "agent"
description = "Review code for bugs and best practices."
system = "You are a senior code reviewer."
max-iterations = 5
gadget = ["~/gadgets/code-tools.ts"]
```

Run with:

```bash
npx @llmist/cli code-review "Review my PR"
```

## Rate Limiting

Proactive rate limiting prevents API errors before they occur by throttling requests ahead of provider limits.

### Global Limits

Apply to all commands:

```toml
[rate-limits]
requests-per-minute = 50
tokens-per-minute = 40000
tokens-per-day = 1500000  # Optional daily cap
safety-margin = 0.8       # Start throttling at 80% of limit
enabled = true            # Default: true if any limit is set
```
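
With `safety-margin = 0.8`, throttling starts at 80% of each configured limit: here, at 40 requests and 32,000 tokens per minute.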

### Per-Command Overrides

Override global settings for specific commands:

```toml
[agent.rate-limits]
requests-per-minute = 15   # More conservative for agent mode
tokens-per-minute = 100000

[complete.rate-limits]
requests-per-minute = 100  # Higher limits for completion
tokens-per-minute = 200000
```

### Provider Presets

Gemini Free Tier:

```toml
[agent.rate-limits]
requests-per-minute = 15
tokens-per-minute = 1000000
tokens-per-day = 1500000
```

Anthropic Tier 1:

```toml
[agent.rate-limits]
requests-per-minute = 50
tokens-per-minute = 40000
```

OpenAI Free Tier:

```toml
[agent.rate-limits]
requests-per-minute = 3
tokens-per-minute = 40000
```

### Disabling Rate Limiting

Disable globally:

```toml
[rate-limits]
enabled = false
```

Or per-command:

```toml
[agent.rate-limits]
enabled = false
```

## Retry

Control automatic retry behavior for transient failures:

```toml
[retry]
enabled = true               # Default: true
retries = 3                  # Max retry attempts
min-timeout = 1000           # Initial delay (ms)
max-timeout = 30000          # Maximum delay (ms)
factor = 2                   # Exponential backoff multiplier
randomize = true             # Add jitter to prevent thundering herd
respect-retry-after = true   # Honor Retry-After headers
max-retry-after-ms = 120000  # Cap server-requested delays (2 minutes)

[agent.retry]
retries = 5          # More retries for long-running agents
max-timeout = 60000  # Up to 1 minute between retries

[complete.retry]
retries = 1          # Fast-fail for completions
max-timeout = 5000
```
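
With the global settings above, delays start at `min-timeout` and grow by `factor` per attempt: roughly 1 s, 2 s, then 4 s across the three retries, adjusted by jitter when `randomize` is enabled and always capped at `max-timeout`.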

## Configuration Precedence

Settings are applied in this order, from highest to lowest priority (see the example after the list):

1. CLI flags (`--rate-limit-rpm`, `--max-retries`, `--no-retry`, etc.)
2. Profile-specific TOML config (`[agent.rate-limits]`, `[complete.retry]`)
3. Global TOML config (`[rate-limits]`, `[retry]`)
4. Provider defaults (auto-detected from the model)
5. Built-in defaults
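
For example, if `[rate-limits]` sets `requests-per-minute = 50` and `[agent.rate-limits]` lowers it to 15, agent runs use 15; a flag then beats both for a single invocation (the value below is illustrative):

```bash
# The CLI flag outranks both TOML sections for this run
llmist agent "prompt" --rate-limit-rpm 5
```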

## CLI Flag Overrides

Override TOML configuration for individual runs.

Rate Limiting:

```bash
llmist agent "prompt" --rate-limit-rpm 50 --rate-limit-tpm 40000
llmist agent "prompt" --no-rate-limit
```

Retry:

```bash
llmist agent "prompt" --max-retries 5 --retry-max-timeout 60000
llmist agent "prompt" --no-retry
```

For complete flag documentation, run:

```bash
llmist agent --help
llmist complete --help
```

## Prompt Templates

Define reusable prompts with Eta templating:

```toml
[prompts]
base-assistant = "You are a helpful AI assistant."
expert = """
<%~ include("@base-assistant") %>
You are also an expert in <%= it.field %>.
"""

[my-expert]
system = '<%~ include("@expert", {field: "TypeScript"}) %>'
```
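
With these templates, the `my-expert` system prompt renders to roughly:

```text
You are a helpful AI assistant.
You are also an expert in TypeScript.
```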

### File Includes

Load prompt content from external files with `includeFile()`:

```toml
[prompts]
# Load an entire prompt from a file
from-file = '<%~ includeFile("~/.llmist/prompts/custom.md") %>'

# Mix file content with inline content
hybrid = """
<%~ include("@base-assistant") %>
<%~ includeFile("./prompts/project-rules.txt") %>
"""

[code-review]
system = '<%~ includeFile("~/.llmist/prompts/code-review.md") %>'
```

Path resolution:

- `~` expands to the home directory
- Relative paths resolve from the config file's location
- Included files can themselves use `includeFile()` and `include()` (see the example after this list)
- Circular includes are detected and prevented
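
For example, a file loaded with `includeFile()` can itself reference further fragments (the file and its contents below are hypothetical):

```text
<!-- ~/.llmist/prompts/code-review.md -->
You are a senior code reviewer.
<%~ includeFile("./checklists/security.md") %>
```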

## Gadget Approval

The gadget approval system provides a safety layer for potentially dangerous gadget executions. Each gadget runs in one of three modes:

| Mode | Behavior |
| --- | --- |
| `allowed` | Gadget executes immediately |
| `denied` | Gadget is rejected; the LLM receives a denial message |
| `approval-required` | User is prompted before execution |

By default, these gadgets require approval:

- `RunCommand` - Executes shell commands
- `WriteFile` - Creates or modifies files
- `EditFile` - Edits existing files

All other gadgets default to allowed.

Configure modes per gadget:

```toml
[agent]
gadget-approval = { WriteFile = "allowed", Shell = "denied", ReadFile = "allowed" }
```

Set default mode for all unconfigured gadgets:

```toml
[agent]
gadget-approval = { "*" = "denied", ReadFile = "allowed", FloppyDisk = "allowed" }
```

High-Security Mode:

```toml
[agent.gadget-approval]
WriteFile = "denied"
EditFile = "denied"
RunCommand = "denied"
ReadFile = "allowed"
```

Trust All Mode:

```toml
[agent]
gadget-approval = { "*" = "allowed" }
```

Selective Approval:

```toml
[agent.gadget-approval]
RunCommand = "approval-required"
WriteFile = "allowed"
DeleteFile = "denied"
```

### Approval Prompts

For file operations, a colored diff is shown:

```text
🔒 Approval required: Modify src/index.ts
--- src/index.ts (original)
+++ src/index.ts (modified)
@@ -1,3 +1,4 @@
 import { foo } from './foo';
+import { bar } from './bar';
⏎ approve, or type to reject:
```

- Press Enter or type `y` to approve
- Type any other text to reject (the text is sent to the LLM as feedback)

When running non-interactively (e.g., in scripts or CI), `approval-required` gadgets are automatically denied.
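
For unattended runs that still need tools, set modes explicitly instead of relying on prompts (a sketch; adjust the allowed gadgets to your job):

```toml
# CI profile: deny by default, allow only what the job needs
[agent.gadget-approval]
"*" = "denied"
ReadFile = "allowed"
RunCommand = "allowed"
```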

## Environment Variables

API keys:

```bash
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GEMINI_API_KEY="..."
```

Logging:

```bash
# Levels: silly, trace, debug, info, warn, error, fatal
export LLMIST_LOG_LEVEL="debug"
```

## Global Flags

These flags work with any command:

| Flag | Description |
| --- | --- |
| `--log-level <level>` | Set the log level |
| `--version` | Show version number |
| `--help` | Show help |