🔄 Auto-Sync Feeds

Paste a URL. Pick a cadence. Walk away.

Five formats parsed natively, three sync strategies, five schedule slots, auto-pause after consecutive failures, and a schema detector that maps Romanian / Spanish / German / French / Italian column names without intervention.

Starter Growth Scale Enterprise
products · auto-sync
url: https://shop.example.com/feed.xml
format: gmc_xml (auto-detected)
schedule: every_6h
strategy: upsert_delete_missing
─────────────────────────
Last run · ✓ success
fetched 26,011 · upserted 312
deleted 14 · duration 12.4s
Next run · 2h 14m
Supported formats

Five parsers. One auto-detector.

Skryx ships native parsers for each format — no third-party services in the request path, no AWS Lambda glue. The auto setting sniffs the first 5 KB of the response and picks the right parser.

1. gmc_xml

Google Merchant Center RSS + g: namespace. Streamed via XMLReader — no full-document load, no 1 GB feed OOMs. Standard fields auto-mapped from g: tags.

2. meta_xml

Meta / Facebook / Instagram catalog. Structurally identical to GMC — inherits the same streaming parser, same field mapping.

3. json

Array of objects, or { products / items / data: [...] } envelope. Iterated lazily so memory stays bounded on long feeds.

4. csv

RFC 4180 with header row. Uses league/csv for streaming. Auto-typed per column: string, int, float, bool, string[], facet candidate (cardinality 2–50).

5. auto

Sniffs the first 5 KB. Tries gmc_xml, then json, then csv. Returns the first parser that confidently matches. Used by the "Test connection" preview before you commit.

+ json_ld

Scaffold in place. Not yet implemented. JSON-LD-shaped feeds currently need to be converted to plain JSON, or wait for the parser to ship.
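
To make the auto setting concrete, here is a minimal sketch of a sniffer that looks at the first 5 KB and tries gmc_xml, then json, then csv. The function name and the specific heuristics (looking for the g: namespace, a leading bracket, a delimited header row) are assumptions for illustration, not Skryx's actual detection rules.

```python
from typing import Optional

SNIFF_BYTES = 5 * 1024  # "first 5 KB", as described above


def sniff_format(head: bytes) -> Optional[str]:
    """Guess a feed format from the start of a response (illustrative heuristics)."""
    text = head[:SNIFF_BYTES].decode("utf-8", errors="replace")
    text = text.lstrip("\ufeff \t\r\n")  # drop a BOM and leading whitespace
    if not text:
        return None
    # 1. gmc_xml: markup that uses the Google g: namespace
    if text.startswith("<") and ("xmlns:g=" in text or "<g:" in text):
        return "gmc_xml"
    # 2. json: a top-level array or object
    if text[0] in "[{":
        return "json"
    # 3. csv: a non-markup header row containing a common delimiter
    first_line = text.splitlines()[0]
    if not text.startswith("<") and any(d in first_line for d in ",;\t"):
        return "csv"
    return None  # nothing matched confidently


print(sniff_format(b'{"products": []}'))
```

Returning None rather than a best guess mirrors the "confidently matches" wording: a preview that admits it could not detect the format is easier to act on than a wrong parser.
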

Schema detection

Auto-maps 14 known roles. In five languages.

The Auto-Pilot schema detector reads your first 100 documents and matches column names against a registry of field roles (title, sku, brand, category, description, price, stock, image, url, rating, ranking_score, date, tags, plus localised aliases). You don't write a mapping rule for the common case.

🌍 Multilingual column aliases

"denumire" → title. "pret" → price. Out of the box.

The detector ships with aliases for Romanian, Spanish, German, French, and Italian. Romanian shops commonly export denumire, pret, marca, stoc, categorie — all auto-mapped to their canonical Skryx roles without intervention.

  • Two-pass matching: exact normalisation, then substring containment
  • Only roles that actually appear in your feed get mapped — no inventing fields
  • Override per source via the field_mapping JSON if a column needs a custom destination
// Detected role → actual feed column
{
  "title":        "denumire",
  "sku":          "cod",
  "price":        "pret",
  "brand":        "marca",
  "category":     "categorie",
  "stock":        "stoc",
  "image":        "poza",
  "ranking_score":"popularitate"
}
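
The two-pass matching can be sketched roughly as follows. The alias table is a small hypothetical excerpt, and the normalisation rules (lower-casing, accent stripping, dropping punctuation) are assumptions; the real registry and its rules are not published on this page.

```python
import re
import unicodedata
from typing import Dict, List

# Hypothetical excerpt of an alias registry, for illustration only.
ALIASES: Dict[str, List[str]] = {
    "title": ["title", "name", "denumire"],
    "price": ["price", "pret", "precio", "preis"],
    "brand": ["brand", "marca", "marke"],
    "stock": ["stock", "stoc"],
}


def normalise(column: str) -> str:
    """Lower-case, strip accents, and drop anything that is not a-z or 0-9."""
    decomposed = unicodedata.normalize("NFKD", column)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]", "", no_accents.lower())


def detect_roles(columns: List[str]) -> Dict[str, str]:
    """Map role -> feed column; only roles found in the feed are returned."""
    mapping: Dict[str, str] = {}
    # Pass 1: exact match after normalisation
    for col in columns:
        key = normalise(col)
        for role, aliases in ALIASES.items():
            if role not in mapping and key in aliases:
                mapping[role] = col
    # Pass 2: substring containment, for columns not claimed in pass 1
    claimed = set(mapping.values())
    for col in columns:
        if col in claimed:
            continue
        key = normalise(col)
        for role, aliases in ALIASES.items():
            if role not in mapping and any(alias in key for alias in aliases):
                mapping[role] = col
                claimed.add(col)
                break
    return mapping


print(detect_roles(["Denumire", "marca_produs", "cod"]))
```

Running the exact pass first keeps a loose substring hit from stealing a role that another column matches precisely.
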
🧬 Value coercion

"1.299,00 EUR" → 1299.00. "in_stock" → true.

Beyond column-name matching, Skryx normalises values per detected type. Currency strings parse to floats. Pipe-separated lists become string arrays. "yes" / "in_stock" / "available" all coerce to bool true. Category paths split on > or |.

  • Currency: handles thousand separators in EU and US formats
  • Boolean: standard truthy strings across languages
  • Tags / categories: pipe / arrow / comma split, deduplicated
  • A field matching points / score / rank / popularity with numeric values is auto-promoted as the index's ranking_score tiebreaker
// Raw feed value → indexed
"1.299,00 EUR"1299.00
"red|blue|green"["red", "blue", "green"]
"in_stock"true
"Audio > Headphones"["Audio", "Headphones"]
Schedules & sync strategy

Five cadences. Three strategies. Pick once.

⏰ Schedule slots

Manual · hourly · every 6 h · daily 03:00 · weekly Mon 03:00.

Schedules tighter than the plan ceiling are disabled in the UI: Free can pick manual, daily, or weekly; Starter and above can sync as often as every hour. Every source also has a "Sync now" button for ad-hoc runs.

  • Source count limits per plan: Free 1, Starter 3, Growth 10, Business 50, Enterprise unlimited
  • Dispatcher runs every minute and queues whatever's due
  • Manual "Sync now" returns 202 and a queued run reference; UI polls every 4 s
{
  "schedule": "every_6h",
  // next run:
  "next_sync_at": "2026-05-25T20:00:00Z",
  "interval":     "PT6H"
}

// Daily / weekly fixed at 03:00 UTC
// to skip your peak traffic window
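
As a sketch of how a dispatcher might compute next_sync_at for each slot: only every_6h appears as an identifier on this page, so the other schedule names (manual, hourly, daily, weekly) are assumed, as is counting interval schedules from the current time.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_sync_at(schedule: str, now: datetime) -> Optional[datetime]:
    """Return the next due time in UTC, or None for manual sources."""
    if schedule == "manual":
        return None
    if schedule == "hourly":
        return now + timedelta(hours=1)
    if schedule == "every_6h":
        return now + timedelta(hours=6)
    # Daily and weekly runs use a fixed 03:00 UTC slot
    slot = now.astimezone(timezone.utc).replace(hour=3, minute=0, second=0, microsecond=0)
    if schedule == "daily":
        return slot if slot > now else slot + timedelta(days=1)
    if schedule == "weekly":
        slot += timedelta(days=(0 - slot.weekday()) % 7)  # Monday is weekday 0
        return slot if slot > now else slot + timedelta(days=7)
    raise ValueError(f"unknown schedule: {schedule}")


now = datetime(2026, 5, 25, 14, 0, tzinfo=timezone.utc)
print(next_sync_at("every_6h", now).isoformat())
```

A dispatcher that wakes every minute would then only need to queue sources whose stored next_sync_at is at or before the current time.
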
🔀 Sync strategies

upsert · upsert_delete_missing · full_replace.

upsert (default) inserts new SKUs, updates existing, leaves the rest alone. upsert_delete_missing additionally drops items not present in the new feed. full_replace wipes the index then re-imports — use only when your feed is authoritative.

  • Diff is computed in-process: snapshots index IDs, tracks added / updated / deleted per batch
  • Upserts go through the engine's native upsert action
  • Per-batch error tracking: the first engine rejection per batch is surfaced in the run log so you see why docs failed, not just a count
upsert
+ new SKUs
~ updated
· existing untouched
─────────────────
upsert_delete_missing
+ new · ~ updated
− rows not in feed
─────────────────
full_replace
drop & re-import
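
The difference between the three strategies comes down to set arithmetic on document IDs. A minimal sketch, assuming the index snapshot and the feed are both available as ID sets (plan_sync is a hypothetical helper, not an engine API):

```python
from typing import Dict, Set


def plan_sync(strategy: str, index_ids: Set[str], feed_ids: Set[str]) -> Dict[str, Set[str]]:
    """Work out which IDs each strategy would add, update, and delete."""
    if strategy == "full_replace":
        # Wipe the index, then re-import everything in the feed
        return {"added": set(feed_ids), "updated": set(), "deleted": set(index_ids)}
    added = feed_ids - index_ids
    updated = feed_ids & index_ids
    if strategy == "upsert":
        deleted: Set[str] = set()  # items missing from the feed are left alone
    elif strategy == "upsert_delete_missing":
        deleted = index_ids - feed_ids
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return {"added": added, "updated": updated, "deleted": deleted}


plan = plan_sync("upsert_delete_missing", {"a", "b", "c"}, {"b", "c", "d"})
print(sorted(plan["deleted"]))
```

With upsert_delete_missing, a truncated or partially served feed deletes real products, which is why plain upsert is the safer default.
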
Failure handling

Auto-pause after five strikes. No noisy retries.

⚙️ Resilience

Built to survive flaky vendor feeds.

The job has a 1 800-second hard timeout, three retries with backoff [60s, 5 min, 15 min], and auto-pauses the source after five consecutive failed runs. You get an alert in the dashboard; the source toggle stays off until you flip it back manually.

  • Custom auth: basic_auth_username + password (encrypted) and / or custom_headers JSON for Bearer-style tokens
  • Run history retains products_fetched / added / updated / deleted / failed, error_message (1 000 chars), error_details (5 000 char trace), duration_ms, bytes_fetched
  • Per-batch error: which row failed and the engine's reason — surfaced in the run detail page
// Last 5 runs
03:00 26,011 · 12s
09:00 timeout · 1800s
15:00 502 from upstream
21:00 502 from upstream
03:00 502 from upstream
─────────────────
Auto-paused after 5 strikes
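
The retry and auto-pause rules can be sketched as a small piece of state per source. The backoff values and the five-failure threshold come from the description above; the Source class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

RETRY_BACKOFF_SECONDS = [60, 300, 900]  # 60 s, 5 min, 15 min
PAUSE_AFTER_FAILED_RUNS = 5


@dataclass
class Source:
    enabled: bool = True
    consecutive_failures: int = 0


def retry_delay(attempt: int) -> Optional[int]:
    """Delay before retry number `attempt` (1-based); None once retries run out."""
    if 1 <= attempt <= len(RETRY_BACKOFF_SECONDS):
        return RETRY_BACKOFF_SECONDS[attempt - 1]
    return None


def record_run(source: Source, succeeded: bool) -> None:
    """Update the failure streak after a run has finished (retries included)."""
    if succeeded:
        source.consecutive_failures = 0
        return
    source.consecutive_failures += 1
    if source.consecutive_failures >= PAUSE_AFTER_FAILED_RUNS:
        source.enabled = False  # stays off until someone re-enables it manually


src = Source()
for ok in [True, False, False, False, False, False]:
    record_run(src, ok)
print(src.enabled, src.consecutive_failures)
```

Resetting the counter on any success means only an unbroken streak of failed runs pauses a source; an occasional upstream 502 does not.
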
🧪 Test connection

Preview before you commit.

Hit the test endpoint with a URL + format and Skryx fetches a sample, detects format, counts total products, returns a 3-document preview, and surfaces warnings ("No products found", "Some products missing external_id"). The default field mapping it computed is dropped into the form so you only edit the bits that need fixing.

  • Available both in the dashboard UI and as a public API endpoint
  • Returns: format_detected, total_products, preview, default_mapping, bytes, warnings
  • No side effects — nothing is written until you click Save
POST /v1/indexes/products/data-sources/test-connection
{
  "url":    "https://shop…/feed.xml",
  "format": "auto"
}

// →
{
  "format_detected": "gmc_xml",
  "total_products": 26011,
  "preview":         [ /* 3 docs */ ],
  "default_mapping": { /* … */ },
  "warnings":        []
}
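
For callers scripting this from Python, a request could be assembled along these lines. The API host is a placeholder and no authentication header is shown, because this page does not state either; only the path, the request body, and the response field names come from the example above.

```python
import json
import urllib.request

# Placeholder host: the page gives the path but not the API origin or auth scheme.
BASE_URL = "https://api.example.com"


def build_test_request(index: str, feed_url: str, fmt: str = "auto") -> urllib.request.Request:
    """Build (but do not send) a test-connection request for one index."""
    body = json.dumps({"url": feed_url, "format": fmt}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/indexes/{index}/data-sources/test-connection",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ready_to_save(result: dict) -> bool:
    """Treat a preview as usable when products were found and no warnings came back."""
    return result.get("total_products", 0) > 0 and not result.get("warnings")


req = build_test_request("products", "https://shop.example.com/feed.xml")
print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing it to urllib.request.urlopen with whatever credentials your account requires.
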
First-batch Auto-Pilot

50 documents in, recommendations already on the dashboard.

When you sync your first feed, Auto-Pilot runs at the 50-document threshold (not the 1 000-document bulk threshold) and triggers AI Coach early. New tenants see their first ranking-rule, synonym, and stop-word suggestions before the feed even finishes importing — instead of waiting until next Monday's deep pass.

5 · Parsers shipped · gmc_xml, meta_xml, json, csv, auto (json_ld in scaffold)
50 · Documents needed before Auto-Pilot fires for first-batch syncs
512 MB · Memory cap on the sync job · streamed parsing keeps even huge feeds under it
Try it on your own catalog.

Free tier, no credit card. EU-hosted from day one.