Troubleshooting Guide - Juspay Agent Framework (JAF)
Table of Contents
- Common Error Patterns
- Debugging Techniques
- Memory Provider Issues
- Model Provider Troubleshooting
- Tool Execution Debugging
- Server and API Debugging
- Performance Troubleshooting
- Configuration Issues
- Environment Setup Problems
- Log Analysis and Observability
- Frequently Asked Questions (FAQ)
Common Error Patterns
1. MaxTurnsExceeded Error
Error Pattern:
Causes:
- Agent caught in infinite loop calling tools
- Model consistently producing tool calls without completion
- Overly complex reasoning chains
Solutions:
// Adjust maxTurns in run configuration
const runConfig: RunConfig<Ctx> = {
agentRegistry,
modelProvider,
maxTurns: 20, // Reduce from default 50
};
// Add completion condition in agent instructions
const agent: Agent<Ctx, string> = {
name: 'MyAgent',
instructions: (state) => `
${baseInstructions}
IMPORTANT: After using tools, provide a final answer.
Do not continue calling tools unnecessarily.
`,
};
2. ModelBehaviorError
Error Pattern:
Common Causes & Solutions:
// Cause: Invalid model configuration
const modelProvider = makeLiteLLMProvider(
'http://localhost:8000', // Ensure server is running
'valid-api-key' // Check API key
);
// Cause: Network connectivity issues
// Check: curl http://localhost:8000/health
// Cause: Model overload or rate limiting
// Solution: Add retry logic or use different model
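Where the last comment suggests using a different model, one option is to wrap two providers so that the second is tried when the first fails. The following is a minimal sketch rather than JAF's API: `withFallback` and the `Completion` function type are hypothetical names, and a real wrapper would implement JAF's `ModelProvider` interface instead of a bare function.

```typescript
// Hypothetical helper (not part of JAF): try `primary`, fall back to `secondary` on any error.
type Completion<Req, Res> = (request: Req) => Promise<Res>;

function withFallback<Req, Res>(
  primary: Completion<Req, Res>,
  secondary: Completion<Req, Res>,
  onFallback: (error: unknown) => void = () => {}
): Completion<Req, Res> {
  return async (request: Req): Promise<Res> => {
    try {
      return await primary(request);
    } catch (error) {
      // Report why the primary failed, then delegate to the secondary
      onFallback(error);
      return secondary(request);
    }
  };
}
```

Errors from the secondary are not caught, so a failure of both providers still surfaces to the caller.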
3. DecodeError
Error Pattern:
Solutions:
// Define clear output schema
const outputSchema = z.object({
answer: z.string().describe('The final answer'),
confidence: z.number().min(0).max(1).optional()
});
// Update agent instructions for structured output
const agent: Agent<Ctx, OutputType> = {
outputCodec: outputSchema,
instructions: (state) => `
Respond with JSON matching this exact structure:
{
"answer": "your response here",
"confidence": 0.9
}
`
};
4. ToolCallError
Error Pattern:
Debugging Steps:
// Check tool execution
const debugTool: Tool<CalculatorArgs, Ctx> = {
schema: {
name: 'calculator',
description: 'Perform calculations',
parameters: z.object({
expression: z.string()
})
},
execute: async (args, context) => {
try {
console.log('[DEBUG] Tool args:', args);
console.log('[DEBUG] Context:', context);
const result = evaluate(args.expression);
console.log('[DEBUG] Tool result:', result);
return `Result: ${result}`;
} catch (error) {
console.error('[DEBUG] Tool error:', error);
throw error;
}
}
};
5. AgentNotFound Error
Error Pattern:
Solutions:
// Verify agent registry
console.log('Available agents:', Array.from(agentRegistry.keys()));
// Check agent name spelling
const runState: RunState<Ctx> = {
currentAgentName: 'MathTutor', // Ensure exact match
// ... other properties
};
// Use defensive programming
const getAgent = (name: string) => {
const agent = agentRegistry.get(name);
if (!agent) {
throw new Error(`Agent '${name}' not found. Available: ${Array.from(agentRegistry.keys()).join(', ')}`);
}
return agent;
};
6. Guardrail Errors
Input Guardrail:
Output Guardrail:
Implementation:
const contentFilter: Guardrail<string> = async (input) => {
const forbidden = ['password', 'secret', 'token'];
const containsForbidden = forbidden.some(word =>
input.toLowerCase().includes(word)
);
return containsForbidden
? { isValid: false, errorMessage: 'Input contains forbidden content' }
: { isValid: true };
};
const runConfig: RunConfig<Ctx> = {
initialInputGuardrails: [contentFilter],
finalOutputGuardrails: [contentFilter],
// ... other config
};
7. HandoffError
Error Pattern:
Solutions:
// Create the handoff tool first so the agent definition below can reference it
// (a const cannot be used before its declaration)
const handoffTool: Tool<HandoffArgs, Ctx> = {
  execute: async (args, context) => {
    // Return the handoff request in the expected JSON format
    return JSON.stringify({
      handoff_to: args.targetAgent,
      reason: args.reason
    });
  }
};
// Define allowed handoffs explicitly
const mathTutor: Agent<Ctx, string> = {
  name: 'MathTutor',
  handoffs: ['HelperAgent', 'ExpertAgent'], // Allowed targets
  tools: [handoffTool],
  // ... other properties
};
Debugging Techniques
1. Enable Comprehensive Logging
import { ConsoleTraceCollector, FileTraceCollector, createCompositeTraceCollector } from 'jaf';
// Console + File logging
const traceCollector = createCompositeTraceCollector(
new ConsoleTraceCollector(),
new FileTraceCollector('./debug.log')
);
const runConfig: RunConfig<Ctx> = {
onEvent: (event) => traceCollector.collect(event),
// ... other config
};
2. Debug Mode Setup
// Environment variable for debug mode
const DEBUG_MODE = process.env.JAF_DEBUG === 'true';
const debugLog = (message: string, data?: any) => {
if (DEBUG_MODE) {
console.log(`[JAF:DEBUG] ${message}`, data || '');
}
};
// Use in tools and agents
const debugTool: Tool<any, Ctx> = {
execute: async (args, context) => {
debugLog('Tool execution started', { args, context });
try {
const result = await actualExecution(args, context);
debugLog('Tool execution completed', { result });
return result;
} catch (error) {
debugLog('Tool execution failed', { error });
throw error;
}
}
};
3. Step-by-Step Execution Tracing
class DebugModelProvider implements ModelProvider<Ctx> {
constructor(private baseProvider: ModelProvider<Ctx>) {}
async getCompletion(state: RunState<Ctx>, agent: Agent<Ctx, any>, config: RunConfig<Ctx>) {
console.log('[DEBUG] Model call for agent:', agent.name);
console.log('[DEBUG] Message count:', state.messages.length);
console.log('[DEBUG] Last message:', state.messages[state.messages.length - 1]);
const start = Date.now();
const result = await this.baseProvider.getCompletion(state, agent, config);
const duration = Date.now() - start;
console.log('[DEBUG] Model response time:', duration + 'ms');
console.log('[DEBUG] Response type:', result.message?.tool_calls ? 'tool_calls' : 'content');
return result;
}
}
// Wrap your model provider
const modelProvider = new DebugModelProvider(originalProvider);
4. State Inspection Utilities
const inspectRunState = (state: RunState<Ctx>, label: string) => {
console.log(`\n=== ${label} ===`);
console.log('Run ID:', state.runId);
console.log('Agent:', state.currentAgentName);
console.log('Turn:', state.turnCount);
console.log('Messages:', state.messages.length);
console.log('Latest message roles:', state.messages.slice(-3).map(m => m.role));
console.log('Context:', Object.keys(state.context));
console.log('==================\n');
};
// Use before and after critical operations
inspectRunState(initialState, 'BEFORE RUN');
const result = await run(initialState, config);
inspectRunState(result.finalState, 'AFTER RUN');
Memory Provider Issues
1. Connection Problems
In-Memory Provider:
// Issues: None - always works
// Use for: Development, testing
console.log('[MEMORY:InMemory] Always reliable');
Redis Connection Issues:
# Check Redis server status
docker ps | grep redis
redis-cli ping
# Common connection errors
Error: Redis connection refused
// Debug Redis connection
const debugRedisProvider = async (config: RedisConfig) => {
try {
const redis = new Redis({
host: config.host,
port: config.port,
password: config.password,
retryDelayOnFailover: 100,
maxRetriesPerRequest: 3,
});
await redis.ping();
console.log('[REDIS] Connection successful');
return redis;
} catch (error) {
console.error('[REDIS] Connection failed:', error.message);
// Common fixes
console.log('Troubleshooting steps:');
console.log('1. Check if Redis server is running');
console.log('2. Verify host and port configuration');
console.log('3. Check firewall settings');
console.log('4. Verify authentication credentials');
throw error;
}
};
PostgreSQL Connection Issues:
# Check PostgreSQL server
docker ps | grep postgres
pg_isready -h localhost -p 5432
# Connection string format
postgresql://username:password@host:port/database
// Debug PostgreSQL connection
const debugPostgresConnection = async (config: PostgresConfig) => {
const connectionString = config.connectionString ||
`postgresql://${config.username}:${config.password}@${config.host}:${config.port}/${config.database}`;
console.log('[POSTGRES] Attempting connection to:',
connectionString.replace(/:\/\/[^:]+:[^@]+@/, '://***:***@'));
try {
const client = new Client({ connectionString });
await client.connect();
console.log('[POSTGRES] Connection successful');
await client.end();
} catch (error) {
console.error('[POSTGRES] Connection failed:', error.message);
// Error-specific guidance
if (error.code === 'ECONNREFUSED') {
console.log('Fix: Start PostgreSQL server');
} else if (error.code === '28P01') {
console.log('Fix: Check username/password');
} else if (error.code === '3D000') {
console.log('Fix: Create database or check database name');
}
throw error;
}
};
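The fallback template above interpolates the username and password verbatim, which yields an unparseable URI when either contains reserved characters such as `@`, `:` or `/`. A minimal sketch of percent-encoding the credentials before building the string follows; `buildPostgresUrl`, `maskPostgresUrl` and the `PgParts` shape are illustrative names, not part of JAF or `pg`.

```typescript
// Illustrative shape; the field names mirror the config used above but are an assumption.
interface PgParts {
  username: string;
  password: string;
  host: string;
  port: number;
  database: string;
}

// Percent-encode credentials and database name so reserved characters do not break the URI.
function buildPostgresUrl(p: PgParts): string {
  const user = encodeURIComponent(p.username);
  const pass = encodeURIComponent(p.password);
  const db = encodeURIComponent(p.database);
  return `postgresql://${user}:${pass}@${p.host}:${p.port}/${db}`;
}

// Hide credentials before logging.
function maskPostgresUrl(url: string): string {
  return url.replace(/:\/\/[^:@/]+:[^@/]*@/, '://***:***@');
}

const exampleUrl = buildPostgresUrl({
  username: 'jaf',
  password: 'p@ss:word/1',
  host: 'localhost',
  port: 5432,
  database: 'jaf_memory',
});
console.log(maskPostgresUrl(exampleUrl)); // postgresql://***:***@localhost:5432/jaf_memory
```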
2. Memory Performance Issues
Monitoring Memory Usage:
const monitorMemoryProvider = (provider: MemoryProvider) => {
// Bind so the original method keeps its `this` when the provider is class-based
const originalStoreMessages = provider.storeMessages.bind(provider);
provider.storeMessages = async (conversationId, messages, metadata) => {
const start = Date.now();
const result = await originalStoreMessages(conversationId, messages, metadata);
const duration = Date.now() - start;
if (duration > 1000) {
console.warn(`[MEMORY] Slow store operation: ${duration}ms for ${messages.length} messages`);
}
return result;
};
return provider;
};
Memory Cleanup:
// Periodic cleanup for production; returns the timer so callers can clearInterval() on shutdown
const scheduleMemoryCleanup = (provider: MemoryProvider) => {
  return setInterval(async () => {
    try {
      if ('cleanup' in provider && typeof provider.cleanup === 'function') {
        const result = await provider.cleanup(30); // 30 days
        console.log(`[MEMORY] Cleaned up ${result} old conversations`);
      }
    } catch (error) {
      console.error('[MEMORY] Cleanup failed:', error);
    }
  }, 24 * 60 * 60 * 1000); // Daily
};
3. Memory Data Corruption
Validation:
const validateConversationData = (conversation: ConversationMemory): boolean => {
try {
// Check required fields
if (!conversation.conversationId || !conversation.messages) {
return false;
}
// Validate message structure
for (const message of conversation.messages) {
// Accept empty-string content (e.g. assistant messages that only carry tool calls)
if (!message.role || message.content == null) {
return false;
}
if (!['user', 'assistant', 'tool'].includes(message.role)) {
return false;
}
}
return true;
} catch {
return false;
}
};
// Use when retrieving conversations
const safeGetConversation = async (provider: MemoryProvider, id: string) => {
const result = await provider.getConversation(id);
if (result.success && result.data) {
if (!validateConversationData(result.data)) {
console.warn(`[MEMORY] Invalid conversation data for ${id}`);
return { success: false, error: { message: 'Invalid conversation data' } };
}
}
return result;
};
Model Provider Troubleshooting
1. API Connection Issues
OpenAI/LiteLLM Provider Issues:
// Test connectivity
const testModelProvider = async (provider: ModelProvider<Ctx>) => {
const testState: RunState<Ctx> = {
runId: createRunId('test'),
traceId: createTraceId('test'),
messages: [{ role: 'user', content: 'Hello' }],
currentAgentName: 'test',
context: {} as Ctx,
turnCount: 0
};
const testAgent: Agent<Ctx, string> = {
name: 'test',
instructions: () => 'Respond with "OK"'
};
try {
const result = await provider.getCompletion(testState, testAgent, {} as RunConfig<Ctx>);
console.log('[MODEL] Provider test successful');
return true;
} catch (error) {
console.error('[MODEL] Provider test failed:', error.message);
// Specific error handling
if (error.status === 401) {
console.log('Fix: Check API key configuration');
} else if (error.status === 429) {
console.log('Fix: Rate limited - wait and retry');
} else if (error.code === 'ECONNREFUSED') {
console.log('Fix: Check if model server is running');
}
return false;
}
};
2. Model Response Issues
Empty or Invalid Responses:
const robustModelProvider = (baseProvider: ModelProvider<Ctx>): ModelProvider<Ctx> => ({
async getCompletion(state, agent, config) {
const maxRetries = 3;
let lastError: Error | null = null;
for (let i = 0; i < maxRetries; i++) {
try {
const result = await baseProvider.getCompletion(state, agent, config);
// Validate response
if (!result.message) {
throw new Error('Empty model response');
}
if (!result.message.content && !result.message.tool_calls) {
throw new Error('Model response has no content or tool calls');
}
return result;
} catch (error) {
lastError = error as Error;
console.warn(`[MODEL] Attempt ${i + 1} failed:`, error.message);
if (i < maxRetries - 1) {
await new Promise(resolve => setTimeout(resolve, 1000 * (i + 1)));
}
}
}
throw lastError;
}
});
3. Model Configuration Issues
Token Limits:
const calculateTokenUsage = (messages: Message[]): number => {
// Rough estimation: 4 characters ≈ 1 token
return messages.reduce((total, msg) =>
total + Math.ceil(msg.content.length / 4), 0
);
};
const managedModelProvider = (baseProvider: ModelProvider<Ctx>): ModelProvider<Ctx> => ({
async getCompletion(state, agent, config) {
const estimatedTokens = calculateTokenUsage(state.messages);
const maxTokens = agent.modelConfig?.maxTokens || 4000;
if (estimatedTokens > maxTokens * 0.8) {
console.warn(`[MODEL] High token usage: ${estimatedTokens}/${maxTokens}`);
// Truncate older messages
const keepRecent = 5;
const truncatedMessages = state.messages.slice(-keepRecent);
const truncatedState = { ...state, messages: truncatedMessages };
return baseProvider.getCompletion(truncatedState, agent, config);
}
return baseProvider.getCompletion(state, agent, config);
}
});
Tool Execution Debugging
1. Tool Call Failures
Common Tool Issues:
// Issue: Invalid arguments
{
error: "validation_error",
message: "Invalid arguments for calculator: Required property 'expression'",
tool_name: "calculator",
validation_errors: [...]
}
// Issue: Tool not found
{
error: "tool_not_found",
message: "Tool calculator not found",
tool_name: "calculator"
}
// Issue: Execution error
{
error: "execution_error",
message: "Division by zero",
tool_name: "calculator"
}
Robust Tool Implementation:
const robustCalculatorTool: Tool<CalculatorArgs, Ctx> = {
schema: {
name: 'calculator',
description: 'Perform mathematical calculations',
parameters: z.object({
expression: z.string().describe('Mathematical expression to evaluate')
})
},
execute: async (args, context) => {
try {
// Input validation
if (!args.expression || typeof args.expression !== 'string') {
return JSON.stringify({
error: 'invalid_input',
message: 'Expression must be a non-empty string'
});
}
// Security check
const allowedChars = /^[0-9+\-*/().\s]+$/;
if (!allowedChars.test(args.expression)) {
return JSON.stringify({
error: 'invalid_expression',
message: 'Expression contains invalid characters'
});
}
// Evaluate with Function; acceptable only because the allow-list check above rejects everything but arithmetic
const result = Function(`"use strict"; return (${args.expression})`)();
if (typeof result !== 'number' || !isFinite(result)) {
return JSON.stringify({
error: 'invalid_result',
message: 'Expression did not produce a valid number'
});
}
return JSON.stringify({
result: result,
expression: args.expression
});
} catch (error) {
console.error('[TOOL:Calculator] Error:', error);
return JSON.stringify({
error: 'execution_error',
message: error instanceof Error ? error.message : 'Unknown error',
expression: args.expression
});
}
}
};
2. Tool Response Formats
Standardized Tool Responses:
import { ToolResult } from 'jaf';
// Success response
const successTool: Tool<any, Ctx> = {
execute: async (args, context) => {
const result: ToolResult = {
status: 'success',
data: { value: 42 },
message: 'Calculation completed successfully'
};
return result;
}
};
// Error response
const errorTool: Tool<any, Ctx> = {
execute: async (args, context) => {
const result: ToolResult = {
status: 'error',
error: 'division_by_zero',
message: 'Cannot divide by zero'
};
return result;
}
};
// Async operation
const asyncTool: Tool<any, Ctx> = {
execute: async (args, context) => {
const result: ToolResult = {
status: 'pending',
taskId: 'task-123',
message: 'Operation started, check status later'
};
return result;
}
};
3. Tool Debugging Utilities
// Tool execution wrapper for debugging
const debugTool = <T, Ctx>(tool: Tool<T, Ctx>): Tool<T, Ctx> => ({
...tool,
execute: async (args, context) => {
const toolName = tool.schema.name;
const startTime = Date.now();
console.log(`[TOOL:${toolName}] Starting execution`);
console.log(`[TOOL:${toolName}] Args:`, JSON.stringify(args, null, 2));
console.log(`[TOOL:${toolName}] Context keys:`, Object.keys(context));
try {
const result = await tool.execute(args, context);
const duration = Date.now() - startTime;
console.log(`[TOOL:${toolName}] Completed in ${duration}ms`);
console.log(`[TOOL:${toolName}] Result type:`, typeof result);
console.log(`[TOOL:${toolName}] Result preview:`,
typeof result === 'string' ? result.substring(0, 200) + '...' : result);
return result;
} catch (error) {
const duration = Date.now() - startTime;
console.error(`[TOOL:${toolName}] Failed after ${duration}ms`);
console.error(`[TOOL:${toolName}] Error:`, error);
throw error;
}
}
});
// Apply to all tools
const debugAgent: Agent<Ctx, any> = {
...originalAgent,
tools: originalAgent.tools?.map(debugTool)
};
Server and API Debugging
1. Server Startup Issues
Common Server Problems:
// Port already in use
Error: listen EADDRINUSE: address already in use :::3000
// Memory provider not configured
{
success: false,
error: 'Memory provider not configured'
}
// Missing environment variables
Error: Missing required environment variable: JAF_API_KEY
Startup Diagnostics:
const diagnosticServer = (config: ServerConfig<Ctx>) => {
console.log('=== JAF Server Diagnostics ===');
// Check port availability. net.Server reports bind failures through the
// 'error' event; the listen() callback only fires on success.
const net = require('net');
const server = net.createServer();
server.once('error', (err: any) => {
console.error(`❌ Port ${config.port} is not available (${err.code})`);
console.log(`💡 Try: lsof -ti:${config.port} | xargs kill -9`);
});
server.listen(config.port, () => {
console.log(`✅ Port ${config.port} is available`);
server.close();
});
// Check agent registry
console.log(`📋 Agents: ${Array.from(config.agentRegistry.keys()).join(', ')}`);
// Check memory provider
if (config.defaultMemoryProvider) {
console.log('✅ Memory provider configured');
config.defaultMemoryProvider.healthCheck().then(result => {
if (result.success) {
console.log(`✅ Memory provider healthy (${result.data.latencyMs}ms)`);
} else {
console.error('❌ Memory provider unhealthy:', result.error);
}
});
} else {
console.log('⚠️ Memory provider not configured');
}
console.log('===============================');
};
2. API Request/Response Issues
Request Validation:
// Detailed request logging (debug only: headers and bodies may contain credentials)
app.addHook('preHandler', async (request, reply) => {
  // slice() replaces the deprecated substr(); yields a 9-character id
  const reqId = Math.random().toString(36).slice(2, 11);
  console.log(`[REQ:${reqId}] ${request.method} ${request.url}`);
  console.log(`[REQ:${reqId}] Headers:`, request.headers);
  if (request.body) {
    console.log(`[REQ:${reqId}] Body:`, JSON.stringify(request.body, null, 2));
  }
  // Cast because reqId is not declared on the request type
  (request as any).reqId = reqId;
});
app.addHook('onSend', async (request, reply, payload) => {
  const reqId = (request as any).reqId;
  console.log(`[RES:${reqId}] Status: ${reply.statusCode}`);
  // The payload can be null or a non-string value, so convert defensively
  console.log(`[RES:${reqId}] Body preview:`, String(payload ?? '').substring(0, 500));
});
Common API Errors:
// Agent not found
{
success: false,
error: "Agent 'NonexistentAgent' not found. Available agents: MathTutor, RAGAgent"
}
// Invalid request format
{
success: false,
error: "Invalid request body: missing required property 'messages'"
}
// Conversation not found
{
success: false,
error: "Conversation conv-123 not found"
}
3. Streaming and Real-time Issues
Server-Sent Events (SSE) Debugging:
// Server-sent events for real-time updates
app.get('/events/:conversationId', async (request, reply) => {
const conversationId = request.params.conversationId;
reply.raw.writeHead(200, {
'Content-Type': 'text/event-stream',
'Cache-Control': 'no-cache',
'Connection': 'keep-alive'
});
const sendEvent = (event: string, data: any) => {
reply.raw.write(`event: ${event}\n`);
reply.raw.write(`data: ${JSON.stringify(data)}\n\n`);
};
// Send initial connection event
sendEvent('connected', { conversationId });
// Handle disconnection
request.raw.on('close', () => {
console.log(`[STREAM] Client disconnected from ${conversationId}`);
});
});
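To check what the endpoint above actually emits, it can help to parse the raw stream on the client side. The following minimal sketch splits a buffered chunk into events, assuming frames contain only the `event:` and `data:` fields written by `sendEvent` and that each `data:` value is JSON. `parseSseFrames` is a hypothetical helper; it ignores other SSE fields such as `id:` and `retry:` and expects complete frames.

```typescript
interface SseFrame {
  event: string;
  data: unknown;
}

// Parse complete frames (each terminated by a blank line) from a text buffer.
function parseSseFrames(buffer: string): SseFrame[] {
  const frames: SseFrame[] = [];
  for (const block of buffer.split('\n\n')) {
    let event = 'message'; // default event name when no `event:` field is present
    const dataLines: string[] = [];
    for (const line of block.split('\n')) {
      if (line.startsWith('event:')) {
        event = line.slice('event:'.length).trim();
      } else if (line.startsWith('data:')) {
        dataLines.push(line.slice('data:'.length).trim());
      }
    }
    if (dataLines.length === 0) continue; // skip empty blocks between frames
    frames.push({ event, data: JSON.parse(dataLines.join('\n')) });
  }
  return frames;
}
```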
Performance Troubleshooting
1. Response Time Analysis
Performance Monitoring:
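class PerformanceTraceCollector implements TraceCollector {
private metrics = new Map<string, number[]>();
private llmStartTime = 0;
private toolStartTimes = new Map<string, number>();
// Note: any other methods required by TraceCollector are omitted here
collect(event: TraceEvent): void {
if (event.type === 'llm_call_start') {
this.llmStartTime = Date.now();
}
if (event.type === 'llm_call_end') {
const duration = Date.now() - this.llmStartTime;
this.recordMetric('llm_call', duration);
if (duration > 5000) {
console.warn(`[PERF] Slow LLM call: ${duration}ms`);
}
}
if (event.type === 'tool_call_start') {
// Assumes tool_call_start carries the same toolName as tool_call_end
this.toolStartTimes.set(event.data.toolName, Date.now());
}
if (event.type === 'tool_call_end') {
const toolName = event.data.toolName;
// Measure duration from the matching start event, if one was seen
const startedAt = this.toolStartTimes.get(toolName);
if (startedAt !== undefined) {
this.recordMetric(`tool_${toolName}`, Date.now() - startedAt);
this.toolStartTimes.delete(toolName);
}
}
}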
recordMetric(name: string, value: number): void {
if (!this.metrics.has(name)) {
this.metrics.set(name, []);
}
this.metrics.get(name)!.push(value);
}
getStats(name: string) {
const values = this.metrics.get(name) || [];
if (values.length === 0) return null;
values.sort((a, b) => a - b);
return {
count: values.length,
avg: values.reduce((a, b) => a + b) / values.length,
median: values[Math.floor(values.length / 2)],
p95: values[Math.floor(values.length * 0.95)],
max: values[values.length - 1]
};
}
}
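The `getStats` method above uses `Math.floor(n * p)` as the index, which for an even-sized sample returns the upper of the two middle values as the median. For comparison, here is a minimal sketch of the nearest-rank definition; `percentile` is a hypothetical helper, not part of JAF.

```typescript
// Nearest-rank percentile: the smallest value with at least p * 100% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error('percentile of empty sample');
  if (p <= 0 || p > 1) throw new Error('p must be in (0, 1]');
  const sorted = [...values].sort((a, b) => a - b); // copy: do not reorder the caller's array
  const rank = Math.ceil(p * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

const durations = [120, 80, 95, 400, 110, 105, 90, 130, 85, 100];
console.log(percentile(durations, 0.5));  // 100
console.log(percentile(durations, 0.95)); // 400
```

With the same ten samples, the floor-based index selects 105 as the median, so the two definitions differ by one position.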
2. Memory Usage Optimization
Memory Profiling:
const profileMemoryUsage = () => {
  const used = process.memoryUsage();
  console.log('Memory Usage:');
  // Object.entries keeps the key/value typing that a plain for...in loop loses
  for (const [key, bytes] of Object.entries(used)) {
    console.log(`${key}: ${Math.round(bytes / 1024 / 1024 * 100) / 100} MB`);
  }
};
// Monitor during execution
setInterval(profileMemoryUsage, 30000); // Every 30 seconds
Message History Optimization:
const optimizeMessageHistory = (
  messages: Message[],
  maxTokens: number = 4000
): Message[] => {
  // Keep system messages and recent messages
  const systemMessages = messages.filter(m => m.role === 'user' && m.content.startsWith('System:'));
  // Keep the last 10 messages, skipping any already kept as system messages
  const recentMessages = messages.slice(-10).filter(m => !systemMessages.includes(m));
  // Estimate tokens (rough calculation); system messages count against the budget first
  let totalTokens = 0;
  for (const msg of systemMessages) {
    totalTokens += Math.ceil(msg.content.length / 4);
  }
  // Walk recent messages newest-first, keeping as many as fit in the budget
  const keptRecent: Message[] = [];
  for (let i = recentMessages.length - 1; i >= 0; i--) {
    const msg = recentMessages[i];
    const msgTokens = Math.ceil(msg.content.length / 4);
    if (totalTokens + msgTokens > maxTokens) {
      break;
    }
    keptRecent.unshift(msg); // restore chronological order
    totalTokens += msgTokens;
  }
  // System messages first, then the kept recent messages
  return [...systemMessages, ...keptRecent];
};
3. Database Performance
Query Optimization:
// Add database query timing
const timedQuery = async (client: any, sql: string, params: any[]) => {
const start = Date.now();
try {
const result = await client.query(sql, params);
const duration = Date.now() - start;
if (duration > 1000) {
console.warn(`[DB] Slow query (${duration}ms):`, sql.substring(0, 100));
}
return result;
} catch (error) {
const duration = Date.now() - start;
console.error(`[DB] Query failed after ${duration}ms:`, error.message);
throw error;
}
};
// Connection pooling for PostgreSQL
const createOptimizedPostgresProvider = (config: PostgresConfig) => {
const pool = new Pool({
...config,
max: 20, // Maximum pool size
idleTimeoutMillis: 30000, // Close idle connections after 30s
connectionTimeoutMillis: 2000, // Timeout connection attempts after 2s
});
return createPostgresProvider(config, pool);
};
Configuration Issues
1. Environment Variables
Configuration Validation:
const validateEnvironment = () => {
const required = [
'JAF_MODEL_PROVIDER_URL',
'JAF_MEMORY_TYPE'
];
const optional = [
'JAF_API_KEY',
'JAF_DEBUG',
'JAF_LOG_LEVEL'
];
console.log('=== Environment Configuration ===');
// Check required variables
const missing = required.filter(key => !process.env[key]);
if (missing.length > 0) {
console.error('❌ Missing required environment variables:', missing);
process.exit(1);
}
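// Show configuration, masking values whose names look like credentials
const display = (key: string, value: string) =>
/KEY|SECRET|PASSWORD|TOKEN/.test(key) ? '***' : value;
required.forEach(key => {
console.log(`✅ ${key}: ${display(key, process.env[key]!)}`);
});
optional.forEach(key => {
const value = process.env[key];
console.log(`${value ? '✅' : '⚠️ '} ${key}: ${value ? display(key, value) : 'not set'}`);
});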
console.log('================================');
};
2. Agent Configuration
Agent Validation:
const validateAgent = <Ctx, Out>(agent: Agent<Ctx, Out>): string[] => {
const errors: string[] = [];
if (!agent.name || typeof agent.name !== 'string') {
errors.push('Agent name is required and must be a string');
}
if (!agent.instructions) {
errors.push('Agent instructions are required');
}
if (agent.tools) {
agent.tools.forEach((tool, index) => {
if (!tool.schema.name) {
errors.push(`Tool at index ${index} missing name`);
}
if (!tool.schema.description) {
errors.push(`Tool '${tool.schema.name}' missing description`);
}
if (!tool.execute) {
errors.push(`Tool '${tool.schema.name}' missing execute function`);
}
});
}
if (agent.handoffs) {
if (!Array.isArray(agent.handoffs)) {
errors.push('Agent handoffs must be an array');
}
}
return errors;
};
// Validate all agents
const validateAgentRegistry = (registry: Map<string, Agent<any, any>>) => {
console.log('=== Agent Registry Validation ===');
for (const [name, agent] of registry) {
const errors = validateAgent(agent);
if (errors.length === 0) {
console.log(`✅ Agent '${name}' is valid`);
} else {
console.error(`❌ Agent '${name}' has errors:`);
errors.forEach(error => console.error(` - ${error}`));
}
}
console.log('================================');
};
3. Model Configuration
Model Settings Validation:
const validateModelConfig = (config: ModelConfig): string[] => {
const errors: string[] = [];
if (config.temperature !== undefined) {
if (typeof config.temperature !== 'number') {
errors.push('Temperature must be a number');
} else if (config.temperature < 0 || config.temperature > 2) {
errors.push('Temperature must be between 0 and 2');
}
}
if (config.maxTokens !== undefined) {
if (typeof config.maxTokens !== 'number') {
errors.push('maxTokens must be a number');
} else if (config.maxTokens <= 0) {
errors.push('maxTokens must be positive');
} else if (config.maxTokens > 128000) {
errors.push('maxTokens seems unusually high (>128k)');
}
}
return errors;
};
Environment Setup Problems
1. Node.js and Dependencies
Version Compatibility:
# Check Node.js version
node --version # Should be >= 18.0.0
# Check npm version
npm --version # Should be >= 8.0.0
# Clean installation
rm -rf node_modules package-lock.json
npm install
# Check for peer dependency issues
npm ls --depth=0
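The same version check can run at startup so that an unsupported runtime fails fast. The following is a minimal sketch, assuming the `>= 18.0.0` requirement stated above; `isSupportedNode` is a hypothetical helper, and in an application the argument would typically be `process.versions.node`.

```typescript
// Compare a Node.js version string ("18.17.1" or "v18.17.1") against a minimum major version.
function isSupportedNode(version: string, minMajor = 18): boolean {
  const majorText = version.replace(/^v/, '').split('.')[0] ?? '';
  const major = Number.parseInt(majorText, 10);
  // parseInt yields NaN for malformed input, which fails the integer check
  return Number.isInteger(major) && major >= minMajor;
}

console.log(isSupportedNode('v16.20.2')); // false
console.log(isSupportedNode('18.0.0'));   // true
```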
Common Package Issues:
# TypeScript compilation errors
npm run build
# Missing peer dependencies
npm install @types/node typescript ts-node
# ESM/CommonJS issues
npm pkg set type=module # For ESM (updates package.json in place instead of overwriting it)
npm pkg delete type # For CommonJS (removes the field)
2. Docker Environment
Docker Debugging:
# Check Docker services
docker-compose ps
docker-compose logs jaf-redis
docker-compose logs jaf-postgres
# Network connectivity
docker network ls
docker exec jaf-redis redis-cli ping
docker exec jaf-postgres pg_isready
# Volume mounts
docker volume ls
docker exec -it jaf-postgres ls -la /var/lib/postgresql/data
3. Development vs Production
Environment-Specific Configuration:
const getEnvironmentConfig = (): EnvironmentConfig => {
const env = process.env.NODE_ENV || 'development';
const baseConfig = {
host: '0.0.0.0',
port: 3000,
debug: false
};
switch (env) {
case 'development':
return {
...baseConfig,
host: 'localhost',
debug: true,
memory: { type: 'memory' as const }
};
case 'production':
return {
...baseConfig,
debug: false,
memory: {
type: 'postgres' as const,
connectionString: process.env.DATABASE_URL
}
};
case 'test':
return {
...baseConfig,
port: 0, // Random port
memory: { type: 'memory' as const }
};
default:
throw new Error(`Unknown environment: ${env}`);
}
};
Log Analysis and Observability
1. Structured Logging
Comprehensive Logging Setup:
import pino from 'pino';
const logger = pino({
level: process.env.LOG_LEVEL || 'info',
transport: {
target: 'pino-pretty',
options: {
colorize: true,
translateTime: 'SYS:standard',
ignore: 'pid,hostname'
}
}
});
// Structured logging for JAF events
const logJAFEvent = (event: TraceEvent, context?: any) => {
const logData = {
jaf_event: event.type,
timestamp: new Date().toISOString(),
...event.data,
...context
};
switch (event.type) {
case 'run_start':
logger.info(logData, 'JAF run started');
break;
case 'run_end':
if (event.data.outcome.status === 'error') {
logger.error(logData, 'JAF run failed');
} else {
logger.info(logData, 'JAF run completed');
}
break;
case 'tool_call_start':
logger.debug(logData, 'Tool execution started');
break;
case 'tool_call_end':
logger.debug(logData, 'Tool execution completed');
break;
default:
logger.debug(logData, 'JAF event');
}
};
2. Metrics Collection
Custom Metrics:
class JAFMetrics {
private counters = new Map<string, number>();
private histograms = new Map<string, number[]>();
increment(name: string, value = 1): void {
this.counters.set(name, (this.counters.get(name) || 0) + value);
}
recordDuration(name: string, duration: number): void {
if (!this.histograms.has(name)) {
this.histograms.set(name, []);
}
this.histograms.get(name)!.push(duration);
}
getMetrics(): Record<string, any> {
const metrics: Record<string, any> = {};
// Counters
for (const [name, value] of this.counters) {
metrics[name] = value;
}
// Histograms with percentiles
for (const [name, values] of this.histograms) {
if (values.length > 0) {
const sorted = [...values].sort((a, b) => a - b);
metrics[`${name}_count`] = values.length;
metrics[`${name}_avg`] = values.reduce((a, b) => a + b) / values.length;
metrics[`${name}_p50`] = sorted[Math.floor(sorted.length * 0.5)];
metrics[`${name}_p95`] = sorted[Math.floor(sorted.length * 0.95)];
metrics[`${name}_p99`] = sorted[Math.floor(sorted.length * 0.99)];
}
}
return metrics;
}
}
const metrics = new JAFMetrics();
// Use in trace collector
const metricsTraceCollector: TraceCollector = {
collect(event: TraceEvent): void {
metrics.increment(`jaf_event_${event.type}`);
if (event.type === 'run_end') {
if (event.data.outcome.status === 'error') {
metrics.increment('jaf_runs_failed');
} else {
metrics.increment('jaf_runs_completed');
}
}
},
// Implement other required methods...
};
3. Health Monitoring
Comprehensive Health Checks:
const performHealthCheck = async (config: ServerConfig<Ctx>) => {
const health = {
status: 'healthy',
timestamp: new Date().toISOString(),
checks: {} as Record<string, any>
};
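// Check model provider (testModelProvider returns false on failure instead of throwing)
try {
const modelOk = await testModelProvider(config.runConfig.modelProvider);
health.checks.model_provider = { status: modelOk ? 'healthy' : 'unhealthy' };
if (!modelOk) {
health.status = 'unhealthy';
}
} catch (error) {
health.checks.model_provider = {
status: 'unhealthy',
error: error instanceof Error ? error.message : String(error)
};
health.status = 'unhealthy';
}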
// Check memory provider
if (config.defaultMemoryProvider) {
const memoryHealth = await config.defaultMemoryProvider.healthCheck();
health.checks.memory_provider = memoryHealth.success
? { status: 'healthy', ...memoryHealth.data }
: { status: 'unhealthy', error: memoryHealth.error };
if (!memoryHealth.success) {
health.status = 'unhealthy';
}
}
// Check agent registry
health.checks.agents = {
status: 'healthy',
count: config.agentRegistry.size,
agents: Array.from(config.agentRegistry.keys())
};
return health;
};
// Expose health endpoint
app.get('/health/detailed', async (request, reply) => {
const health = await performHealthCheck(config);
const statusCode = health.status === 'healthy' ? 200 : 503;
return reply.code(statusCode).send(health);
});
Frequently Asked Questions (FAQ)
Q: Why is my agent stuck in an infinite loop?
A: This usually happens when:
1. The agent keeps calling tools without reaching a completion condition
2. Tools return ambiguous results that trigger more tool calls
3. The maxTurns limit is too high
Solutions:
- Lower the maxTurns limit (default is 50)
- Add clear completion instructions to your agent
- Implement tool result validation
- Use the debugging techniques to trace the execution flow
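One way to trace a suspected loop is to look for the same tool being called with identical arguments several times in a row. The following is a minimal sketch over a simplified record shape; `ToolCallRecord` and `findRepeatedToolCall` are hypothetical and not part of JAF, and the threshold of 3 is an arbitrary choice.

```typescript
interface ToolCallRecord {
  name: string;
  args: string; // JSON-encoded arguments as sent by the model
}

// Return the first call that occurs `threshold` times consecutively, or null if none does.
function findRepeatedToolCall(
  calls: ToolCallRecord[],
  threshold = 3
): ToolCallRecord | null {
  let streak = 0;
  let previous: ToolCallRecord | null = null;
  for (const call of calls) {
    const same =
      previous !== null && previous.name === call.name && previous.args === call.args;
    streak = same ? streak + 1 : 1; // extend the run or start a new one
    if (streak >= threshold) return call;
    previous = call;
  }
  return null;
}
```

The records could be collected from `tool_call_start` events in an `onEvent` handler and checked after each turn.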
Q: My memory provider keeps failing. What should I check?
A: Common issues:
1. Connection problems: Verify the service is running and accessible
2. Authentication: Check credentials and permissions
3. Schema issues: Ensure database tables exist and have correct structure
4. Resource limits: Check memory/disk usage on the provider service
Debug steps:
# For Redis
redis-cli ping
redis-cli info memory
# For PostgreSQL
pg_isready -h host -p port
psql -h host -p port -d database -c "SELECT version();"
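Before running the commands above, it can help to confirm that the connection string your application actually uses points where you think it does. The following sketch uses the standard `URL` class to pull out the host, port, and database name without printing the password. `summarizeConnectionUrl` and `ConnectionSummary` are hypothetical names, and the sketch assumes your provider is configured through a URL-style connection string.

```typescript
interface ConnectionSummary {
  protocol: string;
  host: string;
  port: string;
  database: string;
  hasPassword: boolean;
}

// Parse a URL-style connection string and report its parts without exposing the password.
// `new URL` throws a TypeError if the string is malformed, which is itself a useful signal.
function summarizeConnectionUrl(rawUrl: string): ConnectionSummary {
  const url = new URL(rawUrl);
  return {
    protocol: url.protocol.replace(/:$/, ''),
    host: url.hostname,
    port: url.port || '(not set)',
    database: url.pathname.replace(/^\//, ''),
    hasPassword: url.password.length > 0,
  };
}
```

The resulting host and port can then be passed to `redis-cli` or `pg_isready` to test the same endpoint the application is configured to use.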
Q: Tool calls are failing with validation errors. How do I fix this?¶
A: Tool validation failures usually indicate:
1. Schema mismatch: Tool parameters don't match the Zod schema
2. Model confusion: The model doesn't understand the tool format
3. Complex schemas: Overly complex parameter structures
Solutions:
- Simplify tool schemas
- Add clear descriptions to all parameters
- Use debugging mode to see exact tool call arguments
- Test tools independently before integrating
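To see why a particular call was rejected, it can be useful to inspect the raw argument string the model produced before it reaches schema validation. The sketch below is a debugging aid only and does not replace the Zod schema. `inspectToolArguments` and `ArgumentReport` are hypothetical names, and the sketch assumes the model's arguments arrive as a JSON string.

```typescript
interface ArgumentReport {
  ok: boolean;
  problems: string[];
}

// Parse the raw argument string and list the required keys that are missing.
function inspectToolArguments(rawArguments: string, requiredKeys: string[]): ArgumentReport {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawArguments);
  } catch {
    return { ok: false, problems: ['arguments are not valid JSON'] };
  }
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    return { ok: false, problems: ['arguments must be a JSON object'] };
  }
  const record = parsed as Record<string, unknown>;
  const problems = requiredKeys
    .filter((key) => record[key] === undefined)
    .map((key) => `missing required parameter "${key}"`);
  return { ok: problems.length === 0, problems };
}
```

A report such as `missing required parameter "expression"` usually points to a parameter name or description the model misread, which is the kind of mismatch that clearer parameter descriptions tend to resolve.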
Q: How do I handle rate limiting from model providers?¶
A: Implement retry logic with exponential backoff:
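const rateLimitedProvider = (baseProvider: ModelProvider<Ctx>): ModelProvider<Ctx> => ({
  async getCompletion(state, agent, config) {
    const maxRetries = 3;
    // Loop exits only by returning a completion or rethrowing the error
    for (let attempt = 0; ; attempt++) {
      try {
        return await baseProvider.getCompletion(state, agent, config);
      } catch (error) {
        // Assumes the provider's error exposes an HTTP `status` field
        const status = (error as { status?: number } | null)?.status;
        // Retry only on HTTP 429 (rate limited), and only while retries remain
        if (status !== 429 || attempt >= maxRetries) throw error;
        const delay = Math.pow(2, attempt) * 1000; // Exponential backoff: 1s, 2s, 4s
        await new Promise(resolve => setTimeout(resolve, delay));
      }
    }
  }
});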
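When many requests hit a rate limit at the same moment, identical backoff delays can cause them all to retry together. A common refinement is to randomize the delay ("jitter") and cap it. The sketch below shows one such calculation; `backoffDelayMs` is a hypothetical helper, and the base and cap values are assumptions you would tune for your provider.

```typescript
// Compute a capped exponential backoff delay with "full jitter":
// a random value between 0 and min(capMs, baseMs * 2^attempt).
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs = 1000,
  capMs = 30000,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

In the retry wrapper above, the fixed `delay` calculation could be replaced with a call such as `backoffDelayMs(attempt)`.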
Q: My server runs out of memory over time. What's causing this?¶
A: Memory leaks often come from:
1. Conversation history accumulation: Large message histories not being cleaned up
2. Event listeners: Trace collectors not being properly cleaned up
3. Database connections: Connection pools not being managed
Solutions:
- Implement conversation cleanup/compression
- Use memory monitoring tools
- Set up proper connection pooling
- Clear old traces periodically
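As a rough illustration of conversation cleanup, the sketch below keeps all system messages plus the most recent N others. `ChatMessage` and `trimHistory` are hypothetical and simplified; JAF's own message type may carry more fields. Note that a plain cut like this can separate a tool result from the tool call that produced it, so a production version would likely need to keep such pairs together.

```typescript
// Simplified, hypothetical message shape for illustration.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string;
}

// Keep every system message and only the last `maxRecent` non-system messages.
function trimHistory(messages: ChatMessage[], maxRecent: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === 'system');
  const rest = messages.filter((m) => m.role !== 'system');
  const recent = maxRecent > 0 ? rest.slice(-maxRecent) : [];
  return [...system, ...recent];
}
```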
Q: How do I debug complex agent interactions and handoffs?¶
A: Use comprehensive tracing:
const debugConfig: RunConfig<Ctx> = {
  ...config,
  onEvent: (event) => {
    console.log(`[TRACE] ${event.type}:`, event.data);
    if (event.type === 'handoff') {
      console.log(`[HANDOFF] ${event.data.from} → ${event.data.to}`);
    }
  }
};
Q: What's the best way to structure error handling in production?¶
A: Use structured error handling:
const handleJAFError = (error: JAFError): ErrorResponse => {
  const severity = JAFErrorHandler.getSeverity(error);
  const isRetryable = JAFErrorHandler.isRetryable(error);
  const message = JAFErrorHandler.format(error);
  // Log based on severity
  if (severity === 'critical') {
    logger.error({ error, severity }, message);
    // Alert operations team
  } else {
    logger.warn({ error, severity }, message);
  }
  return {
    error: error._tag,
    message,
    retryable: isRetryable,
    severity
  };
};
Q: How do I optimize for high-throughput scenarios?¶
A: Several optimizations help:
- Connection pooling for databases
- Request batching for model calls
- Caching for frequently used data
- Horizontal scaling with load balancers
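// Example: Request batching
// Calls are still sent individually; the queue limits how many start per 100ms tick.
const batchedModelProvider = (baseProvider: ModelProvider<Ctx>) => {
  type Args = Parameters<ModelProvider<Ctx>['getCompletion']>;
  type Result = Awaited<ReturnType<ModelProvider<Ctx>['getCompletion']>>;
  const queue: Array<{
    resolve: (value: Result) => void;
    reject: (reason: unknown) => void;
    args: Args;
  }> = [];
  const processBatch = async () => {
    if (queue.length === 0) return;
    const batch = queue.splice(0, 10); // Process 10 at a time
    await Promise.all(batch.map(async ({ resolve, reject, args }) => {
      try {
        resolve(await baseProvider.getCompletion(...args));
      } catch (error) {
        reject(error);
      }
    }));
  };
  const timer = setInterval(processBatch, 100); // Process every 100ms
  return {
    getCompletion(...args: Args) {
      return new Promise<Result>((resolve, reject) => {
        queue.push({ resolve, reject, args });
      });
    },
    // Call on shutdown so the interval does not keep the process alive
    stop: () => clearInterval(timer)
  };
};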
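The "caching for frequently used data" item from the list above can be sketched as a small in-memory cache with a time-to-live. `TtlCache` is a hypothetical class written for illustration, not a JAF API. It assumes a single process and string keys; a shared cache such as Redis would be the usual choice once you scale horizontally.

```typescript
// Minimal in-memory cache whose entries expire after `ttlMs` milliseconds.
// The clock is injectable so expiry can be tested without real waiting.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if it is absent or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // drop expired entries so the map does not grow
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Expired entries are only removed when they are read, so keys that are never requested again stay in the map. For long-running servers, a periodic sweep or a size limit would be a sensible addition.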
Q: How do I set up monitoring and alerting?¶
A: Implement monitoring at multiple levels:
// Application metrics
const alertOnError = (error: JAFError) => {
  const severity = JAFErrorHandler.getSeverity(error);
  if (severity === 'critical') {
    // Send to alerting system (PagerDuty, Slack, etc.)
    sendAlert({
      severity: 'critical',
      message: JAFErrorHandler.format(error),
      service: 'jaf-agent',
      timestamp: new Date().toISOString()
    });
  }
};
// System health monitoring
setInterval(async () => {
  const health = await performHealthCheck(config);
  if (health.status === 'unhealthy') {
    sendAlert({
      severity: 'warning',
      message: 'JAF system health check failed',
      details: health.checks
    });
  }
}, 60000); // Check every minute
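With a one-minute health check, a prolonged outage would send a new alert every minute. One way to reduce that noise is a per-key cooldown. The sketch below is an assumption-laden illustration: `createAlertThrottle` is a hypothetical helper, and the cooldown length is something you would choose to match your on-call process.

```typescript
// Returns a function that answers "may an alert for this key be sent now?"
// and records the send time when the answer is yes.
function createAlertThrottle(cooldownMs: number, now: () => number = Date.now) {
  const lastSent = new Map<string, number>();
  return (key: string): boolean => {
    const previous = lastSent.get(key);
    const current = now();
    if (previous !== undefined && current - previous < cooldownMs) {
      return false; // still inside the cooldown window
    }
    lastSent.set(key, current);
    return true;
  };
}
```

In the health monitor, the `sendAlert` call could then be guarded by something like `shouldAlert('health-check')`, where `shouldAlert` is the function returned by `createAlertThrottle`.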
Conclusion¶
This troubleshooting guide covers the most common issues you'll encounter when working with the Juspay Agent Framework. Remember to:
- Start with basics: Check connections, configurations, and environment setup first
- Use debugging tools: Enable comprehensive logging and tracing
- Monitor proactively: Set up health checks and alerting
- Test incrementally: Validate components individually before integration
- Keep logs structured: Use consistent logging formats for easier analysis
For additional help:
- Check the framework's GitHub issues for known problems
- Review the example implementations for reference patterns
- Use the built-in diagnostic tools and health checks
- Enable debug mode during development
JAF is designed to be observable and debuggable, so use the built-in tools to understand what's happening in your system.