REST API requests are rate limited using a weighted token bucket to ensure fair usage and platform stability.

How It Works

Each user has a token bucket with a fixed capacity that refills at a steady rate. Every API request consumes tokens according to its cost: heavier operations such as uploads consume more tokens than simple reads. A request is rate limited when the bucket does not hold enough tokens to cover its cost.
| Parameter | Value |
| --- | --- |
| Bucket capacity | 400 tokens |
| Refill rate | 100 tokens per second |
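The bucket behavior described above can be modeled on the client side with a few lines of code. The following is a minimal sketch, not the server's implementation: the `TokenBucket` class and its method names are hypothetical, and it assumes the bucket starts full and refills continuously (fractional tokens), which the documentation does not specify.

```javascript
// Client-side model of a weighted token bucket.
// Assumptions (not confirmed by the docs): starts full, refills continuously.
class TokenBucket {
  constructor(capacity = 400, refillPerSec = 100, now = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;             // injectable clock returning milliseconds
    this.tokens = capacity;     // bucket starts full
    this.lastRefillMs = now();
  }

  // Add the tokens earned since the last call, capped at capacity.
  refill() {
    const nowMs = this.now();
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = nowMs;
  }

  // Try to spend `cost` tokens; returns true if the request would be allowed.
  tryConsume(cost) {
    this.refill();
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }
}

// Demonstration with a fake clock so the result does not depend on real time.
let fakeNowMs = 0;
const bucket = new TokenBucket(400, 100, () => fakeNowMs);
console.log(bucket.tryConsume(400)); // true: a full bucket covers the burst
console.log(bucket.tryConsume(1));   // false: the bucket is now empty
fakeNowMs = 500;                     // half a second later, 50 tokens are back
console.log(bucket.tryConsume(20));  // true
```

Injecting the clock keeps the model deterministic, which makes it usable for estimating how a planned request pattern would fare before sending anything.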

Request Costs

Different operations consume different numbers of tokens:
| Operation | Cost (tokens) | Examples |
| --- | --- | --- |
| Metadata | 1 | Get asset details, create album, update person |
| List / Search | 5 | List assets, search albums |
| Thumbnail | 10 | Download a thumbnail |
| Upload / Download | 20 | Upload an asset, download original file |
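Combining these costs with the bucket parameters gives a quick way to budget requests. The sketch below is illustrative arithmetic only: the key names in `REQUEST_COSTS` and the helper functions are hypothetical labels, not API identifiers.

```javascript
// Costs and bucket parameters as documented; key names are illustrative only.
const REQUEST_COSTS = { metadata: 1, listSearch: 5, thumbnail: 10, transfer: 20 };
const BUCKET_CAPACITY = 400;
const REFILL_PER_SEC = 100;

// Requests of one kind that a full bucket can absorb in a single burst.
function maxBurst(kind) {
  return Math.floor(BUCKET_CAPACITY / REQUEST_COSTS[kind]);
}

// Requests per second of one kind that the refill rate can sustain.
function sustainedPerSec(kind) {
  return REFILL_PER_SEC / REQUEST_COSTS[kind];
}

// Total token cost of a planned batch, e.g. { listSearch: 2, thumbnail: 30 }.
function batchCost(plan) {
  return Object.entries(plan).reduce(
    (sum, [kind, count]) => sum + REQUEST_COSTS[kind] * count,
    0
  );
}

console.log(maxBurst('transfer'));                        // 20
console.log(sustainedPerSec('transfer'));                 // 5
console.log(batchCost({ listSearch: 2, thumbnail: 30 })); // 310
```

In other words, a full bucket covers 20 uploads or downloads at once, after which about 5 per second can be sustained.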

MCP Tool Calls

If you use the MCP server, tool calls follow the same weighted cost model as the REST API. Each tool call is charged once using the cost of the closest REST operation. For example:
  • Search and list tools use the list/search cost
  • Write tools use the corresponding mutation cost
  • view_asset counts like a metadata read rather than a full file download

Rate Limit Headers

Rate limit information is included in every response:
X-RateLimit-Limit: 400
X-RateLimit-Remaining: 375
X-RateLimit-Cost: 5
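Judging by the example values, X-RateLimit-Limit appears to be the bucket capacity, X-RateLimit-Remaining the tokens left after the request, and X-RateLimit-Cost the cost charged for the request; that reading is an inference, not something stated explicitly. A small helper for extracting the values might look like this, where `readRateLimit` is a hypothetical name and the code assumes a runtime with the standard `Headers` class (browsers, Node.js 18 or later):

```javascript
// Read the three rate limit headers from a fetch Response's headers
// (or any object with a Headers-like .get()). Missing or malformed
// values are returned as null rather than NaN.
function readRateLimit(headers) {
  const num = (name) => {
    const raw = headers.get(name);
    const value = raw === null ? NaN : Number.parseInt(raw, 10);
    return Number.isNaN(value) ? null : value;
  };
  return {
    limit: num('X-RateLimit-Limit'),
    remaining: num('X-RateLimit-Remaining'),
    cost: num('X-RateLimit-Cost'),
  };
}

const sample = new Headers({
  'X-RateLimit-Limit': '400',
  'X-RateLimit-Remaining': '375',
  'X-RateLimit-Cost': '5',
});
console.log(readRateLimit(sample)); // { limit: 400, remaining: 375, cost: 5 }
```

With a real response, the call would be `readRateLimit(response.headers)`.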

Handling Rate Limits

When rate limited, you’ll receive a 429 response with a Retry-After header indicating how many seconds to wait:
HTTP/1.1 429 Too Many Requests
Retry-After: 3
X-RateLimit-Limit: 400
X-RateLimit-Remaining: 0
X-RateLimit-Cost: 20
Respect the Retry-After value before retrying:
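// Makes up to maxRetries attempts in total, retrying only on HTTP 429.
async function requestWithBackoff(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.status !== 429) return response;
    // Out of attempts: fail now instead of sleeping before the throw.
    if (attempt === maxRetries - 1) break;

    // Retry-After is in seconds; fall back to 0 if it is missing or not a number.
    const retryAfterSec = parseInt(response.headers.get('Retry-After'), 10) || 0;
    // Exponential backoff: 1 s, 2 s, 4 s, ...
    const backoffSec = Math.pow(2, attempt);
    // Wait for whichever is longer, so the server's hint is never undercut.
    const waitSec = Math.max(retryAfterSec, backoffSec);
    await new Promise(resolve => setTimeout(resolve, waitSec * 1000));
  }
  throw new Error('Max retries exceeded');
}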
If you’re using one of our SDKs, automatic retries with exponential backoff are built in.

Best Practices

  • Monitor X-RateLimit-Remaining to proactively throttle requests before hitting limits
  • Batch operations when possible to reduce total token consumption
  • Cache responses to avoid redundant requests
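
The first practice, throttling on X-RateLimit-Remaining, can be sketched as a thin wrapper around `fetch`. This is one possible approach under stated assumptions, not a built-in feature: `waitBeforeNextMs` and `createThrottledFetch` are hypothetical helpers, the caller supplies the expected cost of each request from the cost table, and requests are assumed to be issued one at a time.

```javascript
// Milliseconds to wait so that at least `nextCost` tokens are available,
// given the last X-RateLimit-Remaining value and the documented refill rate.
function waitBeforeNextMs(remaining, nextCost, refillPerSec = 100) {
  const deficit = nextCost - remaining;
  if (deficit <= 0) return 0;
  return Math.ceil((deficit * 1000) / refillPerSec);
}

// Wrap fetch so each call pauses first if the previous response
// reported fewer tokens than the next request needs.
function createThrottledFetch(fetchImpl = fetch) {
  let remaining = Infinity; // unknown until the first response arrives
  return async function throttledFetch(url, options, nextCost = 1) {
    const waitMs = waitBeforeNextMs(remaining, nextCost);
    if (waitMs > 0) {
      await new Promise((resolve) => setTimeout(resolve, waitMs));
    }
    const response = await fetchImpl(url, options);
    const header = response.headers.get('X-RateLimit-Remaining');
    const parsed = header === null ? NaN : Number.parseInt(header, 10);
    if (!Number.isNaN(parsed)) remaining = parsed;
    return response;
  };
}

console.log(waitBeforeNextMs(375, 5)); // 0
console.log(waitBeforeNextMs(5, 20));  // 150
console.log(waitBeforeNextMs(0, 20));  // 200
```

The stored value is only a snapshot: tokens keep refilling after the response is sent, and other clients using the same account also draw from the bucket. The computed wait is therefore an estimate, and a 429 handler is still needed as a fallback.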