🔄 Auto-Sync Feeds

Paste a URL. Pick a cadence. Walk away.

Five formats parsed natively, three sync strategies, five schedule slots, auto-pause after consecutive failures, and a schema detector that maps Romanian / Spanish / German / French / Italian column names without intervention.

products · auto-sync

url: https://shop.example.com/feed.xml

format: gmc_xml (auto-detected)

schedule: every_6h

strategy: upsert_delete_missing

─────────────────────────

Last run · ✓ success

fetched 26,011 · upserted 312

deleted 14 · duration 12.4s

Next run · 2h 14m

Supported formats

Five parsers. One auto-detector.

Skryx ships native parsers for each format — no third-party services in the request path, no AWS Lambda glue. The auto setting sniffs the first 5 KB of the response and picks the right parser.

gmc_xml

Google Merchant Center RSS + g: namespace. Streamed via XMLReader — no full-document load, no 1 GB feed OOMs. Standard fields auto-mapped from g: tags.

meta_xml

Meta / Facebook / Instagram catalog. Structurally identical to GMC — inherits the same streaming parser, same field mapping.

json

Array of objects, or { products / items / data: [...] } envelope. Iterated lazily so memory stays bounded on long feeds.

csv

RFC 4180 with header row. Uses league/csv for streaming. Auto-typed per column: string, int, float, bool, string[], facet candidate (cardinality 2–50).

auto

Sniffs the first 5 KB. Tries gmc_xml, then json, then csv. Returns the first parser that confidently matches. Used by the "Test connection" preview before you commit.

json_ld

Scaffold in place, not yet implemented. For now, convert JSON-LD-shaped feeds to plain JSON, or wait for the parser to ship.
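
To make the auto setting concrete, here is a minimal sketch of a sniffer that tries gmc_xml, then json, then csv on the first 5 KB. The function name and the matching heuristics are assumptions for illustration; the page does not describe how Skryx decides that a parser "confidently matches".

```python
from typing import Optional

SNIFF_BYTES = 5 * 1024  # the page states that the first 5 KB is inspected


def sniff_format(body: bytes) -> Optional[str]:
    """Guess the feed format from the first 5 KB: gmc_xml, then json, then csv."""
    head = body[:SNIFF_BYTES].decode("utf-8", errors="replace")
    head = head.lstrip("\ufeff \t\r\n")  # drop a BOM and leading whitespace
    if not head:
        return None
    # 1. XML: starts like markup and mentions RSS or the g: namespace.
    if head.startswith("<") and ("<rss" in head or "xmlns:g=" in head):
        return "gmc_xml"
    # 2. JSON: a top-level array or object.
    if head[0] in "[{":
        return "json"
    # 3. CSV: the header row holds at least two delimited columns.
    header = head.splitlines()[0]
    if any(sep in header for sep in (",", ";", "\t")):
        return "csv"
    return None
```

Returning None when nothing matches leaves the caller free to surface a warning in the preview rather than guess a parser.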

Schema detection

Auto-maps 14 known roles. In five languages.

The Auto-Pilot schema detector reads your first 100 documents and matches column names against a registry of field roles, including title, sku, brand, category, description, price, stock, image, url, rating, ranking_score, date, and tags, each with localised aliases. You don't write a mapping rule for the common case.

🌍 Multilingual column aliases

"denumire" → title. "pret" → price. Out of the box.

The detector ships with aliases for Romanian, Spanish, German, French, and Italian. Romanian shops commonly export denumire, pret, marca, stoc, categorie — all auto-mapped to their canonical Skryx roles without intervention.

Two-pass matching: exact normalisation, then substring containment
Only roles that actually appear in your feed get mapped — no inventing fields
Override per source via the field_mapping JSON if a column needs a custom destination

// Detected role → actual feed column
{
  "title":        "denumire",
  "sku":          "cod",
  "price":        "pret",
  "brand":        "marca",
  "category":     "categorie",
  "stock":        "stoc",
  "image":        "poza",
  "ranking_score":"popularitate"
}
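
The two-pass matching described above can be sketched roughly as follows. The alias registry here is hypothetical and holds only the Romanian aliases quoted on this page; the normalisation rules (lower-casing, stripping diacritics and punctuation) are also an assumption.

```python
import unicodedata
from typing import Dict, List

# Hypothetical registry: role -> normalised aliases (only aliases quoted on this page).
ROLE_ALIASES = {
    "title": ["title", "denumire"],
    "price": ["price", "pret"],
    "brand": ["brand", "marca"],
    "stock": ["stock", "stoc"],
    "category": ["category", "categorie"],
}


def normalise(name: str) -> str:
    """Lower-case, strip diacritics, and keep only letters and digits."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(c for c in no_marks.lower() if c.isalnum())


def detect_roles(columns: List[str]) -> Dict[str, str]:
    """Map role -> feed column. Roles with no matching column are left out."""
    normalised = {col: normalise(col) for col in columns}
    mapping: Dict[str, str] = {}
    # Pass 1: the normalised column name equals an alias exactly.
    for role, aliases in ROLE_ALIASES.items():
        for col, norm in normalised.items():
            if norm in aliases and col not in mapping.values():
                mapping[role] = col
                break
    # Pass 2: an alias appears inside the column name, for roles still unmapped.
    for role, aliases in ROLE_ALIASES.items():
        if role in mapping:
            continue
        for col, norm in normalised.items():
            if col not in mapping.values() and any(a in norm for a in aliases):
                mapping[role] = col
                break
    return mapping
```

Running the exact pass first means a column named pret always wins over a looser candidate such as pret_vechi, and a column is never assigned to two roles.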

🧬 Value coercion

"1.299,00 EUR" → 1299.00. "in_stock" → true.

Beyond column-name matching, Skryx normalises values per detected type. Currency strings parse to floats. Pipe-separated lists become string arrays. "yes" / "in_stock" / "available" all coerce to bool true. Category paths split on > or |.

Currency: handles thousand separators in EU and US formats
Boolean: standard truthy strings across languages
Tags / categories: pipe / arrow / comma split, deduplicated
A field matching points / score / rank / popularity with numeric values is auto-promoted as the index's ranking_score tiebreaker

// Raw feed value → indexed

"1.299,00 EUR" → 1299.00

"red|blue|green" → ["red", "blue", "green"]

"in_stock" → true

"Audio > Headphones" → ["Audio", "Headphones"]

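As a rough illustration of the coercions listed above, the sketch below handles the four sample values. The helper names are hypothetical, and the rule for a lone separator (three trailing digits means thousands) is an assumption about ambiguous input, not documented Skryx behaviour.

```python
import re
from typing import List, Optional

# "yes", "in_stock" and "available" come from the page; "true" and "1" are assumptions.
TRUTHY = {"yes", "in_stock", "available", "true", "1"}


def parse_price(raw: str) -> Optional[float]:
    """Parse EU ("1.299,00") and US ("1,299.00") formatted amounts to a float."""
    digits = re.sub(r"[^\d.,]", "", raw)
    if not any(c.isdigit() for c in digits):
        return None
    last_dot, last_comma = digits.rfind("."), digits.rfind(",")
    if last_dot >= 0 and last_comma >= 0:
        # Both present: whichever comes last is the decimal separator.
        decimal = "." if last_dot > last_comma else ","
    else:
        sep = "." if last_dot >= 0 else ","
        tail = digits.rsplit(sep, 1)[-1]
        # A lone separator followed by exactly three digits is read as thousands.
        decimal = None if (sep in digits and len(tail) == 3) else sep
    for sep in ".,":
        digits = digits.replace(sep, "." if sep == decimal else "")
    try:
        return float(digits)
    except ValueError:
        return None


def split_list(raw: str) -> List[str]:
    """Split on pipe, arrow, or comma; trim and deduplicate, keeping order."""
    parts = (p.strip() for p in re.split(r"[|>,]", raw))
    return list(dict.fromkeys(p for p in parts if p))


def to_bool(raw: str) -> bool:
    return raw.strip().lower() in TRUTHY
```

Note that a value like "1.299" stays ambiguous between locales; this sketch reads it as 1299, and a real detector would likely use the rest of the column to decide.
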
Schedules & sync strategy

Five cadences. Three strategies. Pick once.

⏰ Schedule slots

Manual · hourly · every 6 h · daily 03:00 · weekly Mon 03:00.

Schedules tighter than the plan ceiling are disabled in the UI — Free can pick manual / daily / weekly; Starter unlocks hourly; Growth and above can sync as often as every hour. There's a "Sync now" button on every source for ad-hoc runs.

Source count limits per plan: Free 1, Starter 3, Growth 10, Business 50, Enterprise unlimited
Dispatcher runs every minute and queues whatever's due
Manual "Sync now" returns 202 and a queued run reference; UI polls every 4 s

{
  "schedule": "every_6h",
  // next run:
  "next_sync_at": "2026-05-25T20:00:00Z",
  "interval":     "PT6H"
}

// Daily / weekly fixed at 03:00 UTC
// to skip your peak traffic window
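
A minimal sketch of how next_sync_at could be derived from the schedule slot. Only every_6h appears as an identifier on this page; the other slot names (manual, hourly, daily, weekly) and the function itself are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_sync_at(schedule: str, after: datetime) -> Optional[datetime]:
    """Return the next run time in UTC, or None for manual sources.

    `after` must be a timezone-aware datetime (typically the last run time).
    """
    after = after.astimezone(timezone.utc)
    if schedule == "manual":
        return None
    if schedule == "hourly":
        return after + timedelta(hours=1)
    if schedule == "every_6h":
        return after + timedelta(hours=6)
    if schedule in ("daily", "weekly"):
        # Daily and weekly runs are fixed at 03:00 UTC.
        candidate = after.replace(hour=3, minute=0, second=0, microsecond=0)
        if candidate <= after:
            candidate += timedelta(days=1)
        if schedule == "weekly":
            # Roll forward to Monday (weekday() == 0).
            candidate += timedelta(days=(-candidate.weekday()) % 7)
        return candidate
    raise ValueError(f"unknown schedule: {schedule}")
```

With a last run at 2026-05-25T14:00:00Z and schedule every_6h, this yields the 2026-05-25T20:00:00Z value shown in the card above.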

🔀 Sync strategies

upsert · upsert_delete_missing · full_replace.

upsert (default) inserts new SKUs, updates existing, leaves the rest alone. upsert_delete_missing additionally drops items not present in the new feed. full_replace wipes the index then re-imports — use only when your feed is authoritative.

Diff is computed in-process: snapshots index IDs, tracks added / updated / deleted per batch
Upserts go through the engine's native upsert action
Per-batch error tracking: the first engine rejection per batch is surfaced in the run log so you see why docs failed, not just a count

upsert

+ new SKUs

~ updated

· existing untouched

─────────────────

upsert_delete_missing

+ new · ~ updated

− rows not in feed

─────────────────

full_replace

drop & re-import
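
The difference between the three strategies can be sketched as a diff over document IDs. This is an illustration only: it assumes both sides are available as ID-to-document dictionaries, whereas the page says only that index IDs are snapshotted, and plan_sync is a hypothetical name.

```python
from typing import Dict, List


def plan_sync(
    strategy: str,
    index_docs: Dict[str, dict],
    feed_docs: Dict[str, dict],
) -> Dict[str, List[str]]:
    """Return the IDs that a sync run would add, update, and delete."""
    index_ids, feed_ids = set(index_docs), set(feed_docs)
    if strategy == "full_replace":
        # Wipe everything, then import the feed from scratch.
        return {"added": sorted(feed_ids), "updated": [], "deleted": sorted(index_ids)}
    if strategy not in ("upsert", "upsert_delete_missing"):
        raise ValueError(f"unknown strategy: {strategy}")
    added = sorted(feed_ids - index_ids)
    updated = sorted(
        doc_id for doc_id in feed_ids & index_ids
        if feed_docs[doc_id] != index_docs[doc_id]
    )
    # Only upsert_delete_missing drops documents absent from the feed.
    deleted = sorted(index_ids - feed_ids) if strategy == "upsert_delete_missing" else []
    return {"added": added, "updated": updated, "deleted": deleted}
```

The only behavioural difference between upsert and upsert_delete_missing is the deleted list, which is why the second is the safer choice for feeds that legitimately retire products.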

Failure handling

Auto-pause after five strikes. No noisy retries.

⚙️ Resilience

Built to survive flaky vendor feeds.

The job has a 1,800-second hard timeout and three retries with backoff (60 s, 5 min, 15 min), and it auto-pauses the source after five consecutive failed runs. You get an alert in the dashboard; the source toggle stays off until you flip it back manually.

Custom auth: basic_auth_username + password (encrypted) and / or custom_headers JSON for Bearer-style tokens
Run history retains products_fetched / added / updated / deleted / failed, error_message (1,000 chars), error_details (5,000-char trace), duration_ms, bytes_fetched
Per-batch error: which row failed and the engine's reason — surfaced in the run detail page

// Last 5 runs

03:00 ✓ 26,011 · 12s

09:00 ✗ timeout · 1800s

15:00 ✗ 502 from upstream

21:00 ✗ 502 from upstream

03:00 ✗ 502 from upstream

─────────────────

Auto-paused after 5 strikes
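
The retry and auto-pause rules can be sketched as a small state machine. The class and function names are hypothetical; the backoff values and the five-failure threshold are the ones stated in this section.

```python
from dataclasses import dataclass
from typing import Optional

RETRY_BACKOFF_SECONDS = [60, 300, 900]  # 60 s, 5 min, 15 min
MAX_CONSECUTIVE_FAILURES = 5


@dataclass
class SourceState:
    consecutive_failures: int = 0
    paused: bool = False


def retry_delay(failed_attempts: int) -> Optional[int]:
    """Seconds to wait before the next retry, or None once retries run out."""
    if 1 <= failed_attempts <= len(RETRY_BACKOFF_SECONDS):
        return RETRY_BACKOFF_SECONDS[failed_attempts - 1]
    return None


def record_run(state: SourceState, succeeded: bool) -> SourceState:
    """Update the failure streak after a run; pause the source at five strikes."""
    if succeeded:
        state.consecutive_failures = 0
    else:
        state.consecutive_failures += 1
        if state.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
            state.paused = True
    return state
```

A successful run resets the streak but never clears paused, which mirrors the rule that the toggle stays off until someone flips it back manually.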

🧪 Test connection

Preview before you commit.

Hit the test endpoint with a URL + format and Skryx fetches a sample, detects format, counts total products, returns a 3-document preview, and surfaces warnings ("No products found", "Some products missing external_id"). The default field mapping it computed is dropped into the form so you only edit the bits that need fixing.

Available both in the dashboard UI and as a public API endpoint
Returns: format_detected, total_products, preview, default_mapping, bytes, warnings
No side effects — nothing is written until you click Save

POST /v1/indexes/products/data-sources/test-connection
{
  "url":    "https://shop…/feed.xml",
  "format": "auto"
}

// →
{
  "format_detected": "gmc_xml",
  "total_products": 26011,
  "preview":         [ /* 3 docs */ ],
  "default_mapping": { /* … */ },
  "warnings":        []
}
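
For calling the endpoint from a script, a minimal sketch using only the standard library might look like this. The base URL, the Bearer authentication header, and the helper name are assumptions; only the path and the url / format body fields come from the example above.

```python
import json
import urllib.request
from typing import Optional


def build_test_connection_request(
    base_url: str,            # assumption: your Skryx API host
    index: str,
    feed_url: str,
    fmt: str = "auto",
    api_key: Optional[str] = None,  # assumption: Bearer-token authentication
) -> urllib.request.Request:
    """Build (but do not send) a test-connection request for one index."""
    endpoint = f"{base_url.rstrip('/')}/v1/indexes/{index}/data-sources/test-connection"
    body = json.dumps({"url": feed_url, "format": fmt}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")
```

Passing the returned object to urllib.request.urlopen would send it; the response fields to look for are the ones listed above (format_detected, total_products, preview, default_mapping, bytes, warnings).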

First-batch Auto-Pilot

50 documents in, recommendations already on the dashboard.

When you sync your first feed, Auto-Pilot runs at the 50-document threshold (not the 1,000-document bulk threshold) and triggers AI Coach early. New tenants see their first ranking-rule, synonym, and stop-word suggestions before the feed even finishes importing, instead of waiting until next Monday's deep pass.

5

Parsers shipped · gmc_xml, meta_xml, json, csv, auto (json_ld in scaffold)

50

Documents needed before Auto-Pilot fires for first-batch syncs

512 MB

Memory cap on the sync job · streamed parsing keeps even huge feeds under it
