Error Handling for AI Apps: Graceful Failures

Nov 23, 2025

LLM APIs are flaky: rate limits, timeouts, content filters, random 500s. If you don't handle errors gracefully, your AI features will frustrate users constantly.

Common Failures

The usual suspects: rate limits (429), transient server errors (500/502/503/504), timeouts, content-filter blocks, and quota exhaustion. Some are retryable, some aren't, so handle them differently; a rough taxonomy is sketched below.
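
As a TypeScript type (my own labels, not anything from a specific SDK), the split looks roughly like this. The status codes match the retry logic further down:

// Hypothetical classification of the failures discussed in this post.
// The `retryable` flag mirrors the isRetryable() check below.
type LLMFailure =
  | { kind: "rate_limit"; status: 429; retryable: true }
  | { kind: "server_error"; status: 500 | 502 | 503 | 504; retryable: true }
  | { kind: "timeout"; retryable: true }
  | { kind: "content_filter"; retryable: false }
  | { kind: "quota_exceeded"; retryable: false };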

Basic Retry Logic

// Small helper used throughout this post
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function callWithRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3
): Promise<T> {
  let lastError: unknown;

  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;

      // Don't retry non-retryable errors
      if (!isRetryable(err)) throw err;

      // Exponential backoff: 1s, 2s, 4s, ...
      const delay = Math.pow(2, i) * 1000;
      await sleep(delay);
    }
  }

  throw lastError;
}

function isRetryable(err: unknown): boolean {
  // Most SDKs attach the HTTP status (or a code) to the thrown error
  const code = (err as any)?.status ?? (err as any)?.code;
  return [429, 500, 502, 503, 504].includes(code);
}
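
Usage is a one-liner; llm here is a stand-in for whichever client SDK you're actually using:

// Wrap any flaky call; non-retryable errors still surface immediately.
const text = await callWithRetry(() => llm.complete(prompt));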

Rate Limit Handling

Rate limits are guaranteed to happen. Respect the Retry-After header:

async function handleRateLimit(response: Response) {
  // Retry-After is usually given in seconds (it can also be an HTTP date)
  const retryAfter = response.headers.get("retry-after");

  if (retryAfter) {
    const waitMs = parseInt(retryAfter, 10) * 1000;
    await sleep(waitMs);
    return true; // Should retry
  }

  return false;
}
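
Wired into a raw fetch loop, it looks roughly like this (the URL, body, and attempt cap are placeholders):

// Sketch: retry a raw fetch once the Retry-After wait has elapsed.
async function postWithRateLimitRetry(
  url: string,
  body: unknown,
  maxAttempts = 5
): Promise<Response> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetch(url, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(body),
    });

    // 429: wait out the Retry-After window, then try again
    if (response.status === 429 && (await handleRateLimit(response))) continue;

    return response;
  }

  throw new Error("Rate limited too many times");
}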

Better yet, add a simple client-side limiter (a crude, fixed-window take on a token bucket) so you rarely hit the limit at all:

class RateLimiter {
  private tokens: number;
  private lastRefill: number;

  constructor(private tokensPerMinute: number) {
    this.tokens = tokensPerMinute;
    this.lastRefill = Date.now();
  }

  async acquire() {
    this.refill();

    if (this.tokens <= 0) {
      // Wait until next refill
      const waitTime = 60000 - (Date.now() - this.lastRefill);
      await sleep(waitTime);
      this.refill();
    }

    this.tokens--;
  }

  private refill() {
    const now = Date.now();
    const elapsed = now - this.lastRefill;

    if (elapsed >= 60000) {
      this.tokens = this.tokensPerMinute;
      this.lastRefill = now;
    }
  }
}
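
Usage: create one limiter per API key and call acquire() before every request. The 60-per-minute figure below is just an example; use your provider's actual limit:

// Example: allow at most 60 requests per minute (adjust to your plan's limit).
const limiter = new RateLimiter(60);

async function rateLimitedComplete(prompt: string) {
  await limiter.acquire(); // blocks until a slot is available
  return llm.complete(prompt);
}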

Graceful Degradation

When AI fails, have a backup:

async function getAIResponse(prompt: string): Promise<string> {
  try {
    return await callWithRetry(() => llm.complete(prompt));
  } catch (err) {
    // Log for monitoring
    logger.error("AI call failed", { error: err, prompt });

    // Return graceful fallback
    if (isContentFiltered(err)) {
      return "I can't help with that request.";
    }

    if (isQuotaExceeded(err)) {
      return "AI features are temporarily unavailable.";
    }

    return "Something went wrong. Please try again.";
  }
}
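
The snippet above assumes two small classifier helpers. Their exact shape depends on how your provider reports errors; a minimal sketch, assuming the thrown error carries an HTTP status and a code string:

// Assumed error shape: adjust the fields to whatever your SDK actually throws.
function isContentFiltered(err: unknown): boolean {
  return (err as any)?.code === "content_filter";
}

function isQuotaExceeded(err: unknown): boolean {
  const e = err as any;
  return e?.status === 429 && e?.code === "insufficient_quota";
}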

Never show raw error messages to users. They don't care about "429 Too Many Requests".

Timeout Protection

LLM calls can hang. Always set timeouts:

async function withTimeout<T>(
  promise: Promise<T>,
  ms: number
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("Request timeout")), ms);
  });

  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer); // don't leave the timer hanging after the race settles
  }
}

// Usage
const response = await withTimeout(
  llm.complete(prompt),
  30000 // 30 second max
);
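
Note that this only abandons the promise; the underlying HTTP request keeps running. If your client accepts an AbortSignal (fetch-based SDKs usually do, though the option name varies), cancel the request itself. A sketch, with the signal option being an assumption about your SDK:

// Sketch: abort the underlying request after 30s. Assumes the client accepts
// a standard AbortSignal; the option name varies by SDK, so check yours.
async function completeWithAbort(prompt: string): Promise<string> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), 30000);

  try {
    return await llm.complete(prompt, { signal: controller.signal });
  } finally {
    clearTimeout(timer);
  }
}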

Circuit Breaker Pattern

If an API keeps failing, stop hammering it:

class CircuitBreaker {
  private failures = 0;
  private lastFailure: number = 0;
  private state: "closed" | "open" | "half-open" = "closed";

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      // Check if we should try again
      if (Date.now() - this.lastFailure > 30000) {
        this.state = "half-open";
      } else {
        throw new Error("Circuit breaker is open");
      }
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (err) {
      this.onFailure();
      throw err;
    }
  }

  private onSuccess() {
    this.failures = 0;
    this.state = "closed";
  }

  private onFailure() {
    this.failures++;
    this.lastFailure = Date.now();

    if (this.failures >= 5) {
      this.state = "open";
    }
  }
}
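
Putting the pieces together, one reasonable layering looks like this. Exactly where you slot the breaker is a judgment call; here it sees one failure per user-facing call:

// Outermost to innermost: breaker, retry with backoff, hard timeout per attempt.
const breaker = new CircuitBreaker();

async function safeComplete(prompt: string): Promise<string> {
  return breaker.call(() =>
    callWithRetry(() => withTimeout(llm.complete(prompt), 30000))
  );
}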

Quick Checklist

Before shipping AI features:

  • [ ] Retry logic with exponential backoff
  • [ ] Timeout on all LLM calls
  • [ ] User-friendly error messages
  • [ ] Rate limit handling
  • [ ] Fallback for when AI is down
  • [ ] Error logging for debugging
  • [ ] Alerts for quota/billing issues

LLM APIs will fail. Plan for it. Your 2am self will thank you.
