# Latency
| Workload | Median | p99 |
|---|---|---|
| Search with filter on a 100k-document index | 6–12 ms | < 30 ms |
| Autocomplete (`/suggest`) | 3–8 ms | < 15 ms |
| Search with AI Query Understanding (cache hit) | 8–14 ms | < 35 ms |
| Search with AI Query Understanding (cache miss) | 1.5–3 s | first request only — cached for 7 days |

These are engine times (`search_time_ms` in the response). Add the network round trip from your client to estimate end-to-end latency.
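
To see how much of a request is engine time and how much is network, one option is to time the call on the client and subtract the reported `search_time_ms`. The sketch below is a minimal Python illustration: `timed_search` and the `fetch` callable are hypothetical names, and it assumes only that the parsed JSON response carries a numeric `search_time_ms` field.

```python
import time
from typing import Any, Callable, Dict, Tuple


def timed_search(fetch: Callable[[], Dict[str, Any]]) -> Tuple[Dict[str, Any], float, float]:
    """Run one search and split wall-clock time into engine and non-engine parts.

    `fetch` is a hypothetical zero-argument callable that performs the HTTP
    request and returns the parsed JSON response.
    """
    start = time.perf_counter()
    response = fetch()
    total_ms = (time.perf_counter() - start) * 1000.0

    engine_ms = float(response["search_time_ms"])
    # Everything that is not engine time: network round trip, TLS, JSON parsing.
    overhead_ms = max(total_ms - engine_ms, 0.0)
    return response, engine_ms, overhead_ms


def fake_fetch() -> Dict[str, Any]:
    # Stand-in for a real request so the sketch runs without a network.
    time.sleep(0.02)
    return {"search_time_ms": 8, "hits": []}


if __name__ == "__main__":
    _, engine_ms, overhead_ms = timed_search(fake_fetch)
    print(f"engine: {engine_ms:.0f} ms, network and client overhead: {overhead_ms:.0f} ms")
```

In a real client, `fetch` would wrap whatever HTTP call you already make; averaging the overhead over many requests gives a steadier estimate than a single sample.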
# Throughput
A single index handles 2,000–5,000 queries per second (QPS) on the default plan tier. We scale by adding read replicas under the hood, so you don't need to do anything; only the price changes.
# What costs time
In rough descending order:
- Wide `query_by` (5+ fields) without weights.
- Many facets with high cardinality.
- Sort + filter combinations that can't use the engine's bitmaps.
- AI Query Understanding cache misses (one Skryx AI call per fresh query).
If you're chasing the last few ms, see the Performance guide.
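
The first item in the list is the easiest to check mechanically. As a rough sketch, a client-side helper can flag queries that search five or more fields without weights before they are sent. `query_by` comes from the list above; the `query_by_weights` parameter name and the comma-separated field format are assumptions made for illustration, so check the API reference for the actual names.

```python
from typing import Dict, List

WIDE_FIELD_THRESHOLD = 5  # "5+ fields" from the list above


def query_warnings(params: Dict[str, str]) -> List[str]:
    """Return human-readable warnings for query shapes that tend to be slow."""
    warnings: List[str] = []

    # Assumption: query_by is a comma-separated list of field names.
    fields = [f.strip() for f in params.get("query_by", "").split(",") if f.strip()]
    # Assumption: weights are passed in a parameter named query_by_weights.
    has_weights = bool(params.get("query_by_weights", "").strip())

    if len(fields) >= WIDE_FIELD_THRESHOLD and not has_weights:
        warnings.append(
            f"query_by lists {len(fields)} fields without weights; "
            "narrow the list or add weights"
        )
    return warnings


if __name__ == "__main__":
    wide = {"q": "shoes", "query_by": "title,brand,description,tags,category"}
    narrow = {"q": "shoes", "query_by": "title,brand"}
    print(query_warnings(wide))    # one warning
    print(query_warnings(narrow))  # no warnings
```

A check like this fits naturally in a test suite or a development-only wrapper, where it can catch wide queries before they reach production traffic.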
# Caching
Three layers protect you from work:
- AI Query cache: rewrites are persisted for 7 days; identical queries are free.
- Synonym set cache: pushed to the engine once per change, then memory-resident.
- HTTP-level caching: the API sets `Cache-Control: private, max-age=2` on identical browse queries; turn it off with `?nocache=1`.
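
To bypass the HTTP-level cache for a single request, the `nocache=1` parameter can be appended on the client. The helper below is a small sketch that sets it while preserving any existing query parameters; the host and path in the usage line are placeholders, not real endpoints.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_nocache(url: str) -> str:
    """Return `url` with nocache=1 set, keeping all other query parameters."""
    parts = urlsplit(url)
    # Drop any existing nocache value so the parameter appears exactly once.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "nocache"
    ]
    query.append(("nocache", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Placeholder URL for illustration only.
    print(with_nocache("https://search.example.com/search?q=shoes&page=2"))
    # prints https://search.example.com/search?q=shoes&page=2&nocache=1
```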
# Indexing throughput
A batch of 1,000 documents typically indexes in ~150 ms. For a million-document re-index, expect ~2–3 minutes with 4 parallel batch streams.
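
A re-index along these lines can be sketched as fixed-size batches fed through a small worker pool. The code below is a minimal Python illustration, not the official client: `send_batch` is a hypothetical callable standing in for whatever call uploads one batch, and the batch size of 1,000 and the 4 streams simply mirror the figures above.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Any, Callable, Dict, Iterable, Iterator, List

Document = Dict[str, Any]


def batched(docs: Iterable[Document], size: int) -> Iterator[List[Document]]:
    """Yield consecutive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def reindex(
    docs: Iterable[Document],
    send_batch: Callable[[List[Document]], int],
    batch_size: int = 1000,
    streams: int = 4,
) -> int:
    """Send all documents in batches over `streams` worker threads.

    `send_batch` is hypothetical: it uploads one batch and returns the number
    of documents accepted. The return value is the total accepted count.
    """
    with ThreadPoolExecutor(max_workers=streams) as pool:
        # Executor.map collects every batch up front, then runs them on the
        # pool; a worker exception is re-raised when its result is consumed.
        return sum(pool.map(send_batch, batched(docs, batch_size)))


if __name__ == "__main__":
    docs = ({"id": str(i)} for i in range(2500))
    print(reindex(docs, send_batch=len))  # fake sender that counts documents
```

Because `Executor.map` materializes all batches before running them, a very large re-index may be better served by reading documents in chunks and calling `reindex` once per chunk, which bounds memory use.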