Performance

Latency, throughput and caching expectations.

# Latency

| Workload | Median | p99 |
| --- | --- | --- |
| Search with filter on a 100k-document index | 6–12 ms | < 30 ms |
| Autocomplete (/suggest) | 3–8 ms | < 15 ms |
| Search with AI Query Understanding (cache hit) | 8–14 ms | < 35 ms |
| Search with AI Query Understanding (cache miss) | 1.5–3 s | first request only; cached for 7 days |

These are engine times (search_time_ms in the response). Add network round-trip from your client.
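To see how much of a request's latency is engine time and how much is network and client overhead, you can time the call yourself and subtract search_time_ms. The sketch below is only an illustration: `send_request` is a hypothetical callable standing in for whatever HTTP client you use, and the only thing assumed about the response is that it parses to a dict containing `search_time_ms`.

```python
import time
from typing import Any, Callable, Dict


def split_latency(send_request: Callable[[], Dict[str, Any]]) -> Dict[str, float]:
    """Time one request and split it into engine time and everything else.

    `send_request` is a hypothetical zero-argument callable that performs the
    search and returns the parsed JSON response as a dict.
    """
    start = time.perf_counter()
    response = send_request()
    total_ms = (time.perf_counter() - start) * 1000.0

    # Engine time as reported by the API in the response body.
    engine_ms = float(response["search_time_ms"])

    return {
        "total_ms": total_ms,
        "engine_ms": engine_ms,
        # Network round-trip plus client-side work; clamped at zero in case
        # of timer jitter on very fast local calls.
        "overhead_ms": max(total_ms - engine_ms, 0.0),
    }
```

If `overhead_ms` dominates, the latency figures in the table are not where your time is going; look at network distance and connection reuse first.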

# Throughput

A single index handles 2,000–5,000 queries per second (QPS) on the default plan tier. We scale by adding read replicas under the hood. You don't need to do anything; only the price changes.

# What costs time

In rough descending order:

  1. Wide query_by (5+ fields) without weights.
  2. Many facets with high cardinality.
  3. Sort + filter combinations that can't use the engine's bitmaps.
  4. AI Query Understanding cache misses (one Skryx AI call per fresh query).

If you're chasing the last few ms, see the Performance guide.
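As an illustration of the first item in the list, here is a minimal sketch of building search parameters with a short, explicitly weighted field list. Only `query_by` comes from this page; the `q` parameter, the `query_by_weights` parameter name, and the field names are assumptions made for the example, so check the API reference for the real names.

```python
from typing import Dict, List, Tuple


def build_search_params(q: str, weighted_fields: List[Tuple[str, int]]) -> Dict[str, str]:
    """Build query parameters from (field, weight) pairs.

    Keeping this list short and weighted avoids the "wide query_by without
    weights" case described above.
    """
    fields = [name for name, _ in weighted_fields]
    weights = [str(weight) for _, weight in weighted_fields]
    return {
        "q": q,  # ASSUMPTION: query text parameter name
        "query_by": ",".join(fields),
        # ASSUMPTION: hypothetical parameter name for per-field weights
        "query_by_weights": ",".join(weights),
    }


# Hypothetical field names, three fields instead of five or more.
params = build_search_params(
    "running shoes",
    [("title", 3), ("brand", 2), ("description", 1)],
)
```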

# Caching

Three caching layers save you from repeated work:

  • AI Query cache — rewrites are persisted for 7 days; identical queries are free.
  • Synonym set cache — pushed to the engine once per change, then memory-resident.
  • HTTP-level caching — the API sets Cache-Control: private, max-age=2 on identical * browse queries; turn it off with ?nocache=1.
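When you need a guaranteed fresh response (for example, right after an index update), you can append nocache=1 to the request URL. A small sketch using only the Python standard library follows; the host and path are placeholders, not the real API endpoint.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_nocache(url: str) -> str:
    """Return `url` with nocache=1 set exactly once in the query string."""
    parts = urlsplit(url)
    # Drop any existing nocache value so repeated calls stay idempotent.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "nocache"
    ]
    query.append(("nocache", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder host and path, for illustration only.
fresh_url = with_nocache("https://search.example.com/search?q=shoes")
```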

# Indexing throughput

A batch of 1,000 documents indexes in ~150 ms typical. For a million-document re-index, expect ~2–3 minutes with 4 parallel batch streams.
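For scale: a million documents in batches of 1,000 is 1,000 batches, which at ~150 ms each comes to roughly 150 s of batch time in total. The sketch below shows one way to drive 4 parallel batch streams from a client. `upload_batch` is a hypothetical function that sends one batch to the indexing endpoint; this is an assumed client-side pattern, not the product's own tooling.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List


def batched(docs: Iterable[dict], size: int = 1000) -> Iterator[List[dict]]:
    """Yield consecutive lists of up to `size` documents."""
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


def reindex(
    docs: Iterable[dict],
    upload_batch: Callable[[List[dict]], None],
    streams: int = 4,
    batch_size: int = 1000,
) -> int:
    """Upload all documents using `streams` worker threads.

    Returns the number of documents uploaded. Any exception raised by
    `upload_batch` propagates to the caller.
    """

    def upload_and_count(batch: List[dict]) -> int:
        upload_batch(batch)
        return len(batch)

    with ThreadPoolExecutor(max_workers=streams) as pool:
        # pool.map submits every batch up front and runs at most
        # `streams` uploads concurrently.
        return sum(pool.map(upload_and_count, batched(docs, batch_size)))
```

Because `pool.map` queues every batch before any result is read, all batches are held in memory at once; for very large re-indexes you may prefer to feed batches from a bounded queue instead.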
