# Cost Tracking
llmist provides comprehensive cost tracking through the ExecutionTree and ModelRegistry APIs.
## Quick Start

Track costs after an agent run:

```typescript
const answer = await LLMist.createAgent()
  .withModel('sonnet')
  .withGadgets(FloppyDisk)
  .withHooks({
    observers: {
      onAgentComplete: (ctx) => {
        const tree = ctx.tree;
        console.log(`Total cost: $${tree.getTotalCost().toFixed(4)}`);
        console.log(`Total tokens:`, tree.getTotalTokens());
      },
    },
  })
  .askAndCollect('How many floppies for DOOM.ZIP at 50MB?');
```

Output:

```
Total cost: $0.0032
Total tokens: { input: 850, output: 120, cached: 0 }
```

## Cost Estimation (Before Calls)
Estimate costs before making API calls:

```typescript
const client = new LLMist();
const registry = client.modelRegistry;

// Estimate for known token counts
const cost = registry.estimateCost('gpt-5', 10_000, 2_000);
console.log(`Estimated: $${cost.totalCost.toFixed(4)}`);

// Estimate from messages
const messages = [
  { role: 'system', content: 'You are helpful' },
  { role: 'user', content: 'Explain quantum computing in detail...' },
];
const inputTokens = await client.countTokens('openai:gpt-5', messages);
const estimatedCost = registry.estimateCost('gpt-5', inputTokens, 1000);
```

### Cost Breakdown
```typescript
const cost = registry.estimateCost('sonnet', 10_000, 2_000);

console.log(cost.inputCost);  // Cost for input tokens
console.log(cost.outputCost); // Cost for output tokens
console.log(cost.totalCost);  // Combined total
```

## Real-Time Tracking (ExecutionTree)
The ExecutionTree tracks all costs during agent execution:

```typescript
const result = await LLMist.createAgent()
  .withModel('sonnet')
  .withHooks({
    observers: {
      onLLMCallComplete: (ctx) => {
        const iterationCost = ctx.tree.getTotalCost();
        console.log(`Running total: $${iterationCost.toFixed(4)}`);
      },
    },
  })
  .askAndCollect('Research task');
```

### Token Breakdown
```typescript
const tokens = tree.getTotalTokens();

console.log(tokens.input);  // Total input tokens
console.log(tokens.output); // Total output tokens
console.log(tokens.cached); // Cached tokens (if supported by provider)
```

### Subagent Costs
Track costs for nested agents (subagents):

```typescript
// Get costs for a specific subtree
const subtreeCost = tree.getSubtreeCost(gadgetNodeId);
const subtreeTokens = tree.getSubtreeTokens(gadgetNodeId);

console.log(`Subagent cost: $${subtreeCost.toFixed(4)}`);
```

Example with a BrowseWeb subagent:
```
ExecutionTree
├── LLM Call #1 (sonnet, 1,200 tokens, $0.003)
│   ├── Gadget: ReadFile
│   └── Gadget: BrowseWeb (subagent)
│       ├── LLM Call #1 (haiku, 800 tokens, $0.001)
│       └── LLM Call #2 (haiku, 600 tokens, $0.001)
└── LLM Call #2 (sonnet, 900 tokens, $0.002)

Total: $0.007
```
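The aggregation behind these numbers is just a recursive sum over the call tree. A minimal sketch (illustrative only, not llmist's actual implementation) of how a subtree total like the BrowseWeb figure above comes together:

```typescript
// Illustrative sketch of subtree cost aggregation, the way
// getSubtreeCost/getTotalCost roll nested call costs up the tree.
interface CallNode {
  cost: number;        // cost incurred at this node itself
  children: CallNode[];
}

function subtreeCost(node: CallNode): number {
  return node.cost + node.children.reduce((sum, c) => sum + subtreeCost(c), 0);
}

// The BrowseWeb subagent above: two haiku calls at $0.001 each.
const browseWeb: CallNode = {
  cost: 0,
  children: [
    { cost: 0.001, children: [] },
    { cost: 0.001, children: [] },
  ],
};

console.log(subtreeCost(browseWeb).toFixed(3)); // → "0.002"
```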
## Model Pricing

Look up model pricing:

```typescript
const client = new LLMist();
const spec = client.modelRegistry.getModelSpec('gpt-5');

console.log(spec.pricing.input);  // $ per 1M input tokens
console.log(spec.pricing.output); // $ per 1M output tokens
```
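Per-1M-token pricing converts to dollars with simple arithmetic. A self-contained sketch of that math (the prices below are made-up placeholders, not any real model's rates):

```typescript
// Placeholder rates in $ per 1M tokens -- not real model pricing.
const pricing = { input: 3.0, output: 15.0 };

function estimate(inputTokens: number, outputTokens: number) {
  const inputCost = (inputTokens / 1_000_000) * pricing.input;
  const outputCost = (outputTokens / 1_000_000) * pricing.output;
  return { inputCost, outputCost, totalCost: inputCost + outputCost };
}

// 10k input and 2k output tokens at the placeholder rates above.
console.log(estimate(10_000, 2_000));
```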
### Find Cheapest Model

```typescript
const cheapest = client.modelRegistry.getCheapestModel(10_000, 2_000);
console.log(`Cheapest: ${cheapest.modelId}`);
```

## Cost-Aware Patterns

### Monitor High Costs
```typescript
const agent = LLMist.createAgent()
  .withModel('opus')
  .withHooks({
    controllers: {
      beforeLLMCall: async (ctx) => {
        const currentCost = ctx.tree.getTotalCost();
        if (currentCost > 0.10) {
          console.warn('⚠️ Cost exceeds $0.10, switching to cheaper model');
          return {
            action: 'proceed',
            modifiedOptions: { model: 'haiku' },
          };
        }
        return { action: 'proceed' };
      },
    },
  });
```

### Token Tracking Preset
```typescript
import { HookPresets } from 'llmist';

await LLMist.createAgent()
  .withModel('sonnet')
  .withHooks(HookPresets.tokenTracking())
  .askAndCollect('Your prompt');

// Logs cumulative token usage after each call
```

### Cost Logging
```typescript
.withHooks({
  observers: {
    onAgentComplete: (ctx) => {
      const cost = ctx.tree.getTotalCost();
      const tokens = ctx.tree.getTotalTokens();

      // Log to your analytics system
      analytics.track('agent_complete', {
        cost,
        tokens,
        model: ctx.options.model,
        iterations: ctx.iteration,
      });
    },
  },
})
```

## Cost Optimization Tips
1. **Use model shortcuts strategically**
   - `haiku` for simple tasks
   - `sonnet` for complex reasoning
   - `opus` only when needed

2. **Leverage caching (Anthropic)**
   - System prompts are cached automatically
   - Repeated context reduces costs

3. **Monitor with hooks**
   - Use `HookPresets.tokenTracking()` in development
   - Set cost alerts in production

4. **Batch operations**
   - Combine related queries into single prompts
   - Use subagents for parallel work
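Tips 1 and 3 combine naturally into a budget guard. A hypothetical helper (not part of llmist) that downgrades the model shortcut once most of a run's budget is spent, suitable for use inside a `beforeLLMCall` controller like the one shown above:

```typescript
// Hypothetical budget guard: pick a model shortcut from the running cost.
// The 'sonnet'/'haiku' names follow the shortcuts used throughout this page.
function pickModel(runningCostUsd: number, budgetUsd: number): string {
  // Downgrade once 80% of the budget is consumed.
  return runningCostUsd >= budgetUsd * 0.8 ? 'haiku' : 'sonnet';
}

console.log(pickModel(0.02, 0.10)); // → 'sonnet'
console.log(pickModel(0.09, 0.10)); // → 'haiku'
```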
## See Also

- Execution Tree - Tree structure and navigation
- Model Catalog - Model specs and features
- Hooks Guide - Lifecycle monitoring