A search index is only as good as the data it sees. This guide is the decision tree for how to feed it — single-document writes vs batches vs full swaps — and the webhook surface for staying in sync.
# The three write modes
| Mode | When | Endpoint |
|---|---|---|
| Single document | Real-time updates: a price changes, a stock count flips, a single record was edited. Sub-second latency, one HTTP call per change. | `POST`/`PATCH /v1/indexes/{name}/documents/{id}` |
| Batch upsert (1,000 docs) | Bulk imports, hourly delta syncs, "everything that changed since the last cron run". One call per 1,000 records, 50–100× faster than the per-document loop. | `POST /v1/indexes/{name}/documents/batch` |
| Full swap (zero-downtime reindex) | Major schema change, taxonomy redesign, full re-ingest from the source of truth. Build a fresh temp index, push everything, then atomically swap. Live search never sees an empty or half-built index. | Zero-downtime reindex pattern |
The wrong choice is rarely catastrophic — but the cost difference is real. Pushing 25,000 documents one-at-a-time takes ~2 hours and costs you 25,000 HTTP requests. The same dataset in batches of 1,000 takes ~3 minutes and costs 25 requests.
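The arithmetic behind those figures can be sketched in a few lines. The helper name `requests_needed` is made up for illustration, and the timings are the approximate ones quoted above, not measurements.

```python
import math


def requests_needed(total_docs: int, batch_size: int = 1) -> int:
    """Number of HTTP calls needed to push total_docs at batch_size docs per call."""
    return math.ceil(total_docs / batch_size)


TOTAL_DOCS = 25_000

single_calls = requests_needed(TOTAL_DOCS)        # one document per call
batch_calls = requests_needed(TOTAL_DOCS, 1000)   # 1,000 documents per call

# Approximate wall-clock figures quoted in the guide, in seconds.
single_seconds = 2 * 60 * 60   # ~2 hours
batch_seconds = 3 * 60         # ~3 minutes

per_single_call = single_seconds / single_calls   # implied time per single-document call
per_batch_call = batch_seconds / batch_calls      # implied time per batch call
speedup = single_seconds / batch_seconds

print(single_calls, batch_calls)
print(f"{per_single_call:.3f} s per single call, {per_batch_call:.1f} s per batch call")
print(f"about {speedup:.0f}x faster end to end")
```

With these rough timings the ratio works out to about 40×. Actual numbers depend on document size and network latency.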
# Picking by traffic pattern
- High-write, low-volume per change (real-time inventory, price ticks). Single-document PATCHes. Skryx upserts by `id`, so any retry is safe.
- Periodic bulk catch-up (nightly or hourly cron from an ERP). Batch upsert in chunks of 1,000, parallelised 4–8 wide. The engine saturates network before it saturates indexing throughput.
- Schema rebuild / category restructure. Always full swap. Mid-flight schema changes are a great way to ship broken search to production.
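The bulk catch-up pattern (chunks of 1,000, sent 4–8 wide) might look roughly like the sketch below. The batch endpoint path comes from the table above. The request body shape (`{"documents": [...]}`), the `Authorization: Bearer` header, and the helper names `chunked`, `post_batch`, and `upsert_all` are assumptions for illustration, not the documented Skryx API.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator

BATCH_SIZE = 1000  # batch size recommended in this guide
WORKERS = 4        # the guide suggests 4-8 parallel senders


def chunked(docs: Iterable[dict], size: int = BATCH_SIZE) -> Iterator[list[dict]]:
    """Yield consecutive lists of at most `size` documents."""
    it = iter(docs)
    while batch := list(islice(it, size)):
        yield batch


def post_batch(base_url: str, index: str, api_key: str, batch: list[dict]) -> int:
    """POST one batch and return the HTTP status. Body shape and auth header are assumed."""
    req = urllib.request.Request(
        f"{base_url}/v1/indexes/{index}/documents/batch",
        data=json.dumps({"documents": batch}).encode("utf-8"),  # assumed body shape
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.status


def upsert_all(
    docs: Iterable[dict],
    send: Callable[[list[dict]], object],
    workers: int = WORKERS,
) -> int:
    """Send every batch through `send` on a thread pool; return the number of batches.

    pool.map submits all batches up front, so very large inputs are held in
    memory. Any exception raised by `send` propagates to the caller.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(send, chunked(docs)))
    return len(results)
```

A caller would bind the connection details once, for example `upsert_all(docs, lambda b: post_batch(base_url, "products", api_key, b))`. Because upserts are keyed by `id`, re-sending a failed batch is safe.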
# Webhooks (coming)
When live, you'll be able to receive a POST to your endpoint whenever:
- A data-source sync completes (success or failure).
- AI Search Coach generates new recommendations.
- A query starts returning consistently zero results ("stock alert"-style).
- An index is swapped via the zero-downtime pattern.
Payload sketch:

    {
      "event": "data_source.sync_completed",
      "tenant_id": 6,
      "index": "products",
      "data_source_id": 12,
      "delivered_at": "2026-09-15T08:14:12Z",
      "data": {
        "imported": 23142,
        "deleted": 18,
        "failed": 0,
        "took_ms": 41200
      }
    }

Subscribe via `/v1/tenants/me/webhooks` (also coming with the same release). Signing follows the Stripe-Signature-style HMAC scheme. Events are delivered at least once and carry idempotency keys, so your handler can stay simple: deduplicate on the key and ignore repeats.
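Since webhooks are not live yet, the exact signing details are not published. The sketch below assumes the Stripe convention: a header of the form `t=<unix timestamp>,v1=<hex digest>`, with the digest being HMAC-SHA256 over `"<t>.<raw body>"`. The header layout, the 300-second tolerance, the `SeenEvents` helper, and where the idempotency key is carried are all assumptions to be checked against the released documentation.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_signature(
    header: str,
    body: bytes,
    secret: bytes,
    tolerance_s: int = 300,
    now: Optional[float] = None,
) -> bool:
    """Check a Stripe-style signature header (assumed format: 't=...,v1=...')."""
    parts = dict(p.strip().split("=", 1) for p in header.split(",") if "=" in p)
    ts, sig = parts.get("t"), parts.get("v1")
    if not ts or not sig or not ts.isdigit():
        return False
    now = time.time() if now is None else now
    if abs(now - int(ts)) > tolerance_s:
        return False  # too old or too far in the future: possible replay
    expected = hmac.new(secret, ts.encode() + b"." + body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), sig.encode())


class SeenEvents:
    """In-memory dedupe for at-least-once delivery (use a shared store in production)."""

    def __init__(self) -> None:
        self._keys: set[str] = set()

    def first_time(self, idempotency_key: str) -> bool:
        """Return True the first time a key is seen, False on redelivery."""
        if idempotency_key in self._keys:
            return False
        self._keys.add(idempotency_key)
        return True
```

A handler would call `verify_signature` on the raw request bytes before parsing the JSON, then skip processing whenever `first_time` returns `False`.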
# What about eventual consistency?
After any write — single, batch, or swap — search read-replicas catch up within a few hundred milliseconds. Two implications:
- A `GET /documents/{id}` immediately after an upsert can occasionally 404. Retry once.
- The total result count from `/query` immediately after a large delete can drift by up to a couple hundred docs for a few seconds before settling.
For most product / content / SaaS search this is irrelevant. If you're using Skryx as a primary store, you're using the wrong tool.
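The "retry once" advice for a read that follows an upsert can be sketched as below. The helper names `fetch_document` and `read_after_write` are illustrative, the full document URL is left to the caller, and the 0.3-second pause is an assumption based on the "few hundred milliseconds" replica lag mentioned above.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_document(url: str, timeout: float = 10.0) -> bytes:
    """GET a document URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def read_after_write(fetch: Callable[[], T], delay_s: float = 0.3) -> T:
    """Call fetch(); on a 404, wait briefly and try exactly one more time.

    Any other HTTP error, or a second 404, is raised to the caller.
    """
    try:
        return fetch()
    except urllib.error.HTTPError as err:
        if err.code != 404:
            raise
    time.sleep(delay_s)  # give the read replicas time to catch up
    return fetch()
```

Usage would be along the lines of `read_after_write(lambda: fetch_document(doc_url))`, where `doc_url` points at the document just written.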