Real-time updates

When to use batch vs single-document writes, and how to keep your search index in step with your database.

A search index is only as good as the data it sees. This guide is the decision tree for how to feed it — single-document writes vs batches vs full swaps — and the webhook surface for staying in sync.

# The three write modes

| Mode | When | Endpoint |
| --- | --- | --- |
| Single document | Real-time updates: a price changes, a stock count flips, a single record was edited. Sub-second latency, one HTTP call per change. | POST/PATCH /v1/indexes/{name}/documents/{id} |
| Batch upsert (1,000 docs) | Bulk imports, hourly delta syncs, "everything that changed since the last cron run". One call per 1,000 records, 50–100× faster than the per-document loop. | POST /v1/indexes/{name}/documents/batch |
| Full swap (zero-downtime reindex) | Major schema change, taxonomy redesign, full re-ingest from the source of truth. Build a fresh temp index, push everything, then atomically swap. Live search never sees an empty or half-built index. | Zero-downtime reindex pattern |

The wrong choice is rarely catastrophic — but the cost difference is real. Pushing 25,000 documents one-at-a-time takes ~2 hours and costs you 25,000 HTTP requests. The same dataset in batches of 1,000 takes ~3 minutes and costs 25 requests.
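The request arithmetic above can be checked with a small chunking helper. This is a minimal sketch: the `chunked` function and the sample documents are illustrative, not part of any Skryx client library.

```python
from itertools import islice
from typing import Iterable, Iterator

BATCH_SIZE = 1000  # batch size taken from the write-modes table


def chunked(docs: Iterable[dict], size: int = BATCH_SIZE) -> Iterator[list]:
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    # islice pulls up to `size` items; an empty list ends the loop.
    while batch := list(islice(it, size)):
        yield batch


# Illustrative data: 25,000 made-up documents.
docs = ({"id": str(i), "price": i} for i in range(25_000))
batches = list(chunked(docs))
print(len(batches))  # 25 batch requests instead of 25,000 single writes
```

Because the helper consumes an iterator lazily, the same loop works when the documents are streamed from a database cursor rather than held in memory.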

# Picking by traffic pattern

  • High-write, low-volume per change (real-time inventory, price ticks). Single-document PATCHes. Skryx upserts by id, so any retry is safe.
  • Periodic bulk catch-up (nightly or hourly cron from an ERP). Batch upsert in chunks of 1,000, with 4–8 requests in flight at once. Network bandwidth becomes the bottleneck before the engine's indexing throughput does.
  • Schema rebuild / category restructure. Always full swap. Mid-flight schema changes are a great way to ship broken search to production.
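The periodic bulk catch-up pattern could be sketched as follows. Only the endpoint path comes from this guide. The base URL, the Bearer-token header, and the `{"documents": [...]}` request body are assumptions made for illustration, so check them against the API reference before use.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

BASE_URL = "https://api.example.invalid"  # hypothetical host; substitute your own
API_KEY = "YOUR_API_KEY"                  # hypothetical Bearer-token auth


def post_batch(index: str, batch: List[dict]) -> int:
    """POST one batch to the batch endpoint and return the HTTP status."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/indexes/{index}/documents/batch",
        # Assumed body shape; the real schema is not shown in this guide.
        data=json.dumps({"documents": batch}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.status


def sync_in_parallel(
    index: str,
    docs: List[dict],
    send: Callable[[str, List[dict]], int] = post_batch,
    batch_size: int = 1000,
    workers: int = 4,
) -> List[int]:
    """Split docs into batches and send them with `workers` requests in flight."""
    batches = [docs[i:i + batch_size] for i in range(0, len(docs), batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves batch order and re-raises the first failure.
        return list(pool.map(lambda b: send(index, b), batches))
```

A call such as `sync_in_parallel("products", changed_docs, workers=8)` would cover the upper end of the 4–8 range. Since writes are upserts by id, re-sending a failed batch should be safe.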

# Webhooks (coming)

When live, you'll be able to receive a POST to your endpoint whenever:

  • A data-source sync completes (success or failure).
  • AI Search Coach generates new recommendations.
  • A query starts consistently returning zero results ("stock alert"-style).
  • An index is swapped via the zero-downtime pattern.

Payload sketch:

{
  "event": "data_source.sync_completed",
  "tenant_id": 6,
  "index": "products",
  "data_source_id": 12,
  "delivered_at": "2026-09-15T08:14:12Z",
  "data": {
    "imported": 23142,
    "deleted": 18,
    "failed": 0,
    "took_ms": 41200
  }
}

Subscribe via /v1/tenants/me/webhooks (also coming with the same release). Signing follows a Stripe-Signature-style HMAC scheme. Events are delivered at least once and carry idempotency keys, so your handler can drop duplicate deliveries with a simple key check.
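Since the webhook surface is not live yet, the following is only a sketch of what a receiving handler might look like. It assumes the header format Stripe uses (`t=<unix timestamp>,v1=<hex HMAC-SHA256 of "<t>.<raw body>">`). The header name, the secret format, and where the idempotency key is carried are not specified here and are assumptions.

```python
import hashlib
import hmac
import time
from typing import Callable, Optional, Set


def verify_signature(
    secret: bytes,
    header: str,
    body: bytes,
    tolerance_s: int = 300,
    now: Optional[float] = None,
) -> bool:
    """Check a Stripe-style 't=...,v1=...' signature header (assumed format)."""
    parts = dict(p.strip().split("=", 1) for p in header.split(",") if "=" in p)
    timestamp, signature = parts.get("t"), parts.get("v1")
    if not timestamp or not signature or not timestamp.isdigit():
        return False
    # Reject stale timestamps to limit replay of captured requests.
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance_s:
        return False
    signed = timestamp.encode() + b"." + body
    expected = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids timing side channels.
    return hmac.compare_digest(expected.encode(), signature.encode())


_seen_keys: Set[str] = set()  # in-memory for illustration; use a durable store


def handle_once(idempotency_key: str, event: dict,
                process: Callable[[dict], None]) -> bool:
    """Run `process` at most once per key; return False for a duplicate."""
    if idempotency_key in _seen_keys:
        return False
    process(event)
    _seen_keys.add(idempotency_key)  # recorded only after success
    return True
```

Recording the key only after `process` succeeds means a crash mid-handler leads to a reprocessed event rather than a lost one, which matches at-least-once delivery.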

# What about eventual consistency?

After any write — single, batch, or swap — search read-replicas catch up within a few hundred milliseconds. Two implications:

  1. A GET /documents/{id} immediately after an upsert can occasionally return a 404. Retry once after a short delay.
  2. The total result count from /query immediately after a large delete can drift by up to a couple hundred docs for a few seconds before settling.
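The retry-once advice in the first point can be sketched like this. The `fetch` callable is hypothetical: it stands in for whatever HTTP client you use, and is assumed to return `None` when the API answers 404.

```python
import time
from typing import Callable, Optional


def get_with_one_retry(
    fetch: Callable[[str], Optional[dict]],
    doc_id: str,
    delay_s: float = 0.5,
) -> Optional[dict]:
    """Fetch a document, retrying exactly once if the first read misses.

    `fetch` is a caller-supplied function (hypothetical) that returns the
    document, or None on a 404. The default delay is chosen to exceed the
    few hundred milliseconds replicas need to catch up.
    """
    doc = fetch(doc_id)
    if doc is None:
        time.sleep(delay_s)
        doc = fetch(doc_id)
    return doc
```

A second miss is returned as `None` rather than retried again, on the assumption that a document still absent after the replication window is genuinely missing.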

For most product / content / SaaS search this is irrelevant. If you're using Skryx as a primary store, you're using the wrong tool.
