The moment an LLM can use tools, it stops being a fancy autocomplete and starts being genuinely useful. Instead of just generating text about what it would do, it actually does things. That's the magic of AI agents.
But building agents that work reliably? That's where it gets tricky. Let me show you what actually works.
What Even Is a Tool?
A tool is just a function the AI can call. You describe what it does, what parameters it needs, and let the model decide when to use it. Pretty simple concept, surprisingly powerful in practice.
const weatherTool = {
  name: "get_weather",
  description: "Get current weather for a city",
  parameters: {
    type: "object",
    properties: {
      city: {
        type: "string",
        description: "City name, e.g. 'San Francisco'"
      }
    },
    required: ["city"]
  }
};
When you give this to the model along with "What's the weather in Tokyo?", it outputs a structured request. Your code executes it and feeds results back. The model then formulates a human response.
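Concretely, the round trip looks something like this. The exact field names vary by provider, so treat these shapes as illustrative, and getWeather here is a hypothetical helper:

// The model emits a structured call rather than prose (shape is illustrative):
const toolCall = { name: "get_weather", arguments: { city: "Tokyo" } };

// Your code runs the matching function and feeds the result back:
const result = await getWeather(toolCall.arguments.city); // hypothetical helper
messages.push({ role: "tool", content: JSON.stringify(result) });
// The model reads that result and writes the final answer for the user.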
The Basic Agent Loop
Every agent follows this pattern. Once you understand it, everything else is just variations:
async function runAgent(userMessage: string) {
  const messages = [{ role: "user", content: userMessage }];

  while (true) {
    const response = await llm.chat({
      messages,
      tools: availableTools,
    });

    // If no tool call, we're done
    if (!response.toolCall) {
      return response.content;
    }

    // Execute the tool
    const result = await executeTool(
      response.toolCall.name,
      response.toolCall.arguments
    );

    // Add to conversation and continue
    messages.push({
      role: "assistant",
      toolCall: response.toolCall
    });
    messages.push({
      role: "tool",
      content: JSON.stringify(result)
    });
  }
}
The loop keeps going until the model decides it has enough info to answer. Simple but powerful.
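Using it is a single call; everything else happens inside the loop (the output here is just an example):

const answer = await runAgent("What's the weather in Tokyo right now?");
console.log(answer); // e.g. "It's currently mild and cloudy in Tokyo."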
Designing Good Tools
This is where most people mess up. A well-designed tool makes the agent smart. A poorly designed tool makes it confused and unreliable.
Rule 1: One Tool, One Job
Bad:
// Too many responsibilities - model gets confused
const databaseTool = {
  name: "database",
  description: "Query, insert, update, or delete from database"
};
Good:
const queryUsersTool = {
  name: "query_users",
  description: "Search for users by name or email"
};

const getUserOrdersTool = {
  name: "get_user_orders",
  description: "Get all orders for a specific user ID"
};
The model picks tools based on descriptions. Focused tools with clear descriptions = better decisions.
Rule 2: Descriptions Are Everything
The model decides which tool to use based almost entirely on its name and description. Make it crystal clear what the tool does and when to use it.
Bad:
description: "Gets data" // Gets what data? When? Why?
Good:
description: "Retrieves current stock price for a ticker symbol. Returns price in USD and change percentage. Use when user asks about stock prices."
Rule 3: Always Validate, Always Handle Errors
Never trust the model to send valid parameters. It will surprise you.
async function executeTool(name: string, args: unknown) {
  const tool = tools[name];
  if (!tool) {
    return { error: `Unknown tool: ${name}` };
  }

  // Validate with Zod
  const parsed = tool.schema.safeParse(args);
  if (!parsed.success) {
    return { error: `Invalid arguments: ${parsed.error.message}` };
  }

  try {
    return await tool.execute(parsed.data);
  } catch (err) {
    // Return error as result - model can often recover
    const message = err instanceof Error ? err.message : String(err);
    return { error: `Tool failed: ${message}` };
  }
}
Return errors as results instead of throwing. The model can often recover if you tell it what went wrong. I've seen agents self-correct after getting "User not found" errors by asking for different search criteria.
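For reference, here's one way the tools registry used above might be wired up. This is a sketch assuming Zod for schemas; searchUsers and fetchWeather are hypothetical stand-ins for your real data layer:

import { z } from "zod";

type Tool = {
  schema: z.ZodTypeAny;
  execute: (args: any) => Promise<unknown>;
};

// Hypothetical stand-ins for your real data layer:
declare function searchUsers(query: string): Promise<unknown[]>;
declare function fetchWeather(city: string): Promise<unknown>;

const tools: Record<string, Tool> = {
  query_users: {
    schema: z.object({ query: z.string().min(1) }),
    execute: ({ query }) => searchUsers(query),
  },
  get_weather: {
    schema: z.object({ city: z.string() }),
    execute: ({ city }) => fetchWeather(city),
  },
};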
Multi-Step Tasks
Real agents need to chain multiple tool calls. Here's where it gets interesting:
The model naturally chains calls if you set it up right:
const systemPrompt = `You are a helpful assistant with access to tools.
When completing a task:
1. Break it into steps
2. Use tools to gather information
3. Use more tools if needed
4. Only respond when you have everything
Think step by step about what information you need.`;
I've seen agents do 5-6 tool calls in sequence to complete complex tasks. The key is giving them permission to take multiple steps.
Error Recovery Strategies
Agents will fail. Tools will timeout. APIs will return garbage. Plan for it.
Let the model retry with different parameters:
if (result.error) {
messages.push({
role: "tool",
content: JSON.stringify({
error: result.error,
suggestion: "Try different parameters or another approach"
})
});
// Continue loop - model will adapt
}
Provide fallback tools:
const tools = [
  primarySearchTool,
  {
    name: "fallback_search",
    description: "Use if primary search fails. Slower but more comprehensive."
  }
];
Set limits to prevent infinite loops:
if (toolCallCount > 10) {
  return "I'm having trouble. Here's what I found so far: " +
    summarizeProgress(messages);
}
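The counter itself lives in the agent loop. Here's a minimal sketch of where it slots into runAgent from earlier (summarizeProgress is whatever fits your app, e.g. one last LLM call over the transcript):

let toolCallCount = 0;

while (true) {
  const response = await llm.chat({ messages, tools: availableTools });
  if (!response.toolCall) return response.content;

  toolCallCount++;
  if (toolCallCount > 10) {
    return "I'm having trouble. Here's what I found so far: " +
      summarizeProgress(messages);
  }

  // ...execute the tool and push results, as before
}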
Security: Be Paranoid
Agents execute code based on user input. This should terrify you a little bit.
Never give direct database/shell access:
// DON'T DO THIS
const dangerousTool = {
  name: "run_sql",
  description: "Run any SQL query" // Recipe for disaster
};

// DO THIS
const safeTool = {
  name: "get_user_profile",
  description: "Get public profile for a user ID"
  // Internally: SELECT name, avatar FROM users WHERE id = ?
};
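Under the hood, the safe tool runs one fixed, parameterized query. A sketch assuming node-postgres (pg):

import { Pool } from "pg";

const pool = new Pool(); // connection settings come from the environment

async function getUserProfile(userId: string) {
  // The model never sees SQL - the query is fixed and the ID is a bound parameter
  const { rows } = await pool.query(
    "SELECT name, avatar FROM users WHERE id = $1",
    [userId]
  );
  return rows[0] ?? { error: "User not found" };
}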
Rate limit everything:
const rateLimiter = new RateLimiter({
  maxCallsPerMinute: 20,
  maxCallsPerSession: 100
});

async function executeTool(name, args) {
  if (!rateLimiter.allow()) {
    return { error: "Rate limit exceeded" };
  }
  // ...
}
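RateLimiter above is a stand-in, not a specific library. A minimal sliding-window version might look like:

class RateLimiter {
  private calls: number[] = []; // timestamps of recent calls
  private sessionTotal = 0;

  constructor(private opts: { maxCallsPerMinute: number; maxCallsPerSession: number }) {}

  allow(): boolean {
    const now = Date.now();
    // Drop timestamps older than one minute (sliding window)
    this.calls = this.calls.filter((t) => now - t < 60_000);
    if (
      this.calls.length >= this.opts.maxCallsPerMinute ||
      this.sessionTotal >= this.opts.maxCallsPerSession
    ) {
      return false;
    }
    this.calls.push(now);
    this.sessionTotal++;
    return true;
  }
}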
Log everything for audit:
await auditLog.record({
  tool: name,
  arguments: args,
  userId: context.userId,
  timestamp: new Date()
});
Real Example: Support Agent
Here's a practical setup that actually works in production:
const supportAgent = {
  systemPrompt: `You are a customer support agent.

Guidelines:
- Verify customer identity before sharing order details
- Be helpful but concise
- If you can't resolve, explain next steps
- Never share internal notes`,

  tools: [
    {
      name: "lookup_order",
      description: "Find order by ID or customer email"
    },
    {
      name: "check_inventory",
      description: "Check if product is in stock"
    },
    {
      name: "initiate_refund",
      description: "Start refund process. Needs order ID and reason."
    },
    {
      name: "escalate_to_human",
      description: "Transfer to human agent when issue is complex"
    }
  ]
};
The escalation tool is important - agents should know when to give up and ask for help.
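What escalation actually does is up to you; here's one sketch that files a ticket, with createSupportTicket as a hypothetical stand-in for your ticketing system:

// Hypothetical ticketing integration - swap in your own system
declare function createSupportTicket(input: {
  summary: string;
  priority: string;
}): Promise<{ id: string }>;

const escalateToHuman = {
  name: "escalate_to_human",
  description: "Transfer to human agent when issue is complex",
  execute: async ({ summary }: { summary: string }) => {
    const ticket = await createSupportTicket({ summary, priority: "high" });
    return { status: "escalated", ticketId: ticket.id };
  }
};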
Performance Tips
1. Enable parallel tool calls if your model supports it:
const response = await llm.chat({
  messages,
  tools,
  parallelToolCalls: true // Multiple tools at once
});

if ((response.toolCalls?.length ?? 0) > 1) {
  const results = await Promise.all(
    response.toolCalls.map(tc => executeTool(tc.name, tc.arguments))
  );
}
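After the Promise.all, each result still has to be appended to the conversation, matched to the call that produced it. Field names like id and toolCallId vary by provider and are illustrative here:

// Record the assistant turn that requested the calls
messages.push({ role: "assistant", toolCalls: response.toolCalls });

// Then one tool message per result, matched by call ID
response.toolCalls.forEach((tc, i) => {
  messages.push({
    role: "tool",
    toolCallId: tc.id, // illustrative field name
    content: JSON.stringify(results[i])
  });
});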
2. Cache tool results:
import { LRUCache } from "lru-cache"; // npm package: lru-cache

const cache = new LRUCache({ max: 1000, ttl: 60_000 }); // ttl in ms (1 minute)

async function executeTool(name, args) {
  const key = `${name}:${JSON.stringify(args)}`;
  if (cache.has(key)) return cache.get(key);

  const result = await tools[name].execute(args);
  cache.set(key, result);
  return result;
}
Only cache idempotent reads, though - you never want something like initiate_refund served from a cache.
3. Stream responses to users:
// This runs inside an async generator, e.g. async function* respond(...)
for await (const chunk of llm.streamChat({ messages, tools })) {
  if (chunk.type === 'text') {
    yield chunk.content; // Show immediately
  } else if (chunk.type === 'tool_call') {
    yield "Looking that up..."; // Let the user know work is happening
  }
}
Wrapping Up
Building agents is about:
- Designing focused tools with clear descriptions
- Building a robust loop with error handling
- Security first - limit, validate, audit
- Performance through caching and parallelization
Start simple. One tool, one task. Get that working reliably, then expand. The most capable agents I've built started as single-tool systems that grew based on actual user needs.
Tools turn LLMs from text generators into actual assistants. Build them well and they'll surprise you with what they can do.
