How It Works
Each user has a token bucket with a fixed capacity that refills at a steady rate. Every API request consumes tokens based on its cost: heavier operations like uploads consume more tokens than simple reads.

| Parameter | Value |
|---|---|
| Bucket capacity | 400 tokens |
| Refill rate | 100 tokens per second |
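
To make the bucket behaviour concrete, here is a minimal client-side sketch of a token bucket using the capacity and refill rate from the table above. The class name `TokenBucket` and its method are hypothetical and only illustrate the general technique; the server's actual implementation is not described here and may differ.

```python
from dataclasses import dataclass

CAPACITY = 400        # bucket capacity in tokens (from the table above)
REFILL_PER_SEC = 100  # refill rate in tokens per second (from the table above)


@dataclass
class TokenBucket:
    """Hypothetical local model of a per-user bucket (an assumption, not the server code)."""

    tokens: float = CAPACITY  # assume the bucket starts full
    last: float = 0.0         # timestamp of the last refill, in seconds

    def try_consume(self, cost: int, now: float) -> bool:
        # Refill for the elapsed time, never exceeding the capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(CAPACITY, self.tokens + elapsed * REFILL_PER_SEC)
        self.last = now
        # Spend tokens only if the whole cost can be covered.
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Passing `now` explicitly keeps the sketch deterministic; in real use it could come from `time.monotonic()`.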
Request Costs
Different operations consume different amounts of tokens:

| Operation | Cost | Examples |
|---|---|---|
| Metadata | 1 | Get asset details, create album, update person |
| List / Search | 5 | List assets, search albums |
| Thumbnail | 10 | Download a thumbnail |
| Upload / Download | 20 | Upload an asset, download original file |
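
As a rough budgeting aid, the costs can be combined with the bucket parameters to estimate how many requests of each kind fit in a burst and what rate can be sustained. This sketch assumes a standard token bucket that starts full; the dictionary keys and function names are hypothetical labels for the table rows.

```python
CAPACITY = 400        # tokens, from the bucket parameters
REFILL_PER_SEC = 100  # tokens per second, from the bucket parameters

# Costs copied from the table above; the keys are illustrative labels.
COSTS = {
    "metadata": 1,
    "list_search": 5,
    "thumbnail": 10,
    "upload_download": 20,
}


def burst_size(kind: str) -> int:
    """Requests of one kind that a full bucket can absorb at once."""
    return CAPACITY // COSTS[kind]


def sustained_rate(kind: str) -> float:
    """Requests per second of one kind that the refill rate can sustain."""
    return REFILL_PER_SEC / COSTS[kind]
```

Under these assumptions, a full bucket covers 20 uploads in a burst (`burst_size("upload_download")`), after which roughly 5 uploads per second can be sustained (`sustained_rate("upload_download")`).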
MCP Tool Calls
If you use the MCP server, tool calls follow the same weighted cost model as the REST API. Each tool call is charged once using the cost of the closest REST operation. For example:

- Search and list tools use the list/search cost
- Write tools use the corresponding mutation cost
- `view_asset` counts like a metadata read rather than a full file download
Rate Limit Headers
Rate limit information is included in the headers of every response.
Handling Rate Limits
When rate limited, you’ll receive a `429` response with a `Retry-After` header indicating how many seconds to wait. Wait for the `Retry-After` value before retrying.
If you’re using one of our SDKs, automatic retries with exponential backoff are built in.
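
If you are not using an SDK, the retry behaviour described above could be sketched as follows. This is a minimal illustration, not the SDKs' implementation: `call_with_retry`, the `send` callable (which returns a status code and a header mapping), the 1-second fallback delay, and the attempt limit are all assumptions made for the example.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, headers); hypothetical shape


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and retry on 429, waiting for Retry-After seconds each time."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        if attempt == max_attempts - 1:
            break  # out of attempts; hand the 429 back to the caller
        try:
            # Retry-After is documented as a number of seconds.
            delay = float(headers.get("Retry-After", "1"))
        except ValueError:
            delay = 1.0  # assumed fallback if the value is not numeric
        sleep(max(0.0, delay))
    return status, headers
```

The `sleep` parameter is injectable so the function can be exercised in tests without actually waiting.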
Best Practices
- Monitor `X-RateLimit-Remaining` to proactively throttle requests before hitting limits
- Batch operations when possible to reduce total token consumption
- Cache responses to avoid redundant requests
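
The first practice above can be sketched as a small helper that decides how long to pause before the next request. It assumes that `X-RateLimit-Remaining` reports the remaining tokens as an integer and that the bucket refills at 100 tokens per second; the function name `seconds_to_wait` is hypothetical.

```python
from typing import Mapping

REFILL_PER_SEC = 100  # tokens per second, from the bucket parameters


def seconds_to_wait(headers: Mapping[str, str], next_cost: int) -> float:
    """Estimate the pause needed before a request costing next_cost tokens."""
    raw = headers.get("X-RateLimit-Remaining")
    if raw is None:
        return 0.0  # no information available; do not throttle
    try:
        remaining = int(raw)  # assumption: the header holds a token count
    except ValueError:
        return 0.0
    # Wait only for the tokens that are still missing.
    deficit = next_cost - remaining
    return max(0.0, deficit / REFILL_PER_SEC)
```

A client could call this with the headers of the previous response and sleep for the returned duration before sending a heavier request such as an upload.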