Paste a URL. Pick a cadence. Walk away.
Five formats parsed natively, three sync strategies, five schedule slots, auto-pause after consecutive failures, and a schema detector that maps Romanian / Spanish / German / French / Italian column names without intervention.
Five parsers. One auto-detector.
Skryx ships native parsers for each format — no third-party services in the
request path, no AWS Lambda glue. The auto setting sniffs the
first 5 KB of the response and picks the right parser.
gmc_xml
Google Merchant Center RSS + g: namespace. Streamed via
XMLReader — no full-document load, no 1 GB feed OOMs.
Standard fields auto-mapped from g: tags.
meta_xml
Meta / Facebook / Instagram catalog. Structurally identical to GMC — inherits the same streaming parser, same field mapping.
json
Array of objects, or { products / items / data: [...] }
envelope. Iterated lazily so memory stays bounded on long feeds.
csv
RFC 4180 with header row. Uses league/csv for streaming.
Auto-typed per column: string, int, float, bool, string[], facet
candidate (cardinality 2–50).
auto
Sniffs the first 5 KB. Tries gmc_xml, then json,
then csv. Returns the first parser that confidently
matches. Used by the "Test connection" preview before you commit.
json_ld
Scaffold in place. Not yet implemented. JSON-LD-shaped feeds currently need to be converted to plain JSON, or wait for the parser to ship.
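The auto setting's detection order (gmc_xml, then json, then csv, on the first 5 KB) can be sketched roughly as follows. This is a minimal illustration, not Skryx's actual detector: the function name `sniff_format` and the specific heuristics (looking for the `g:` namespace, checking the first character, counting delimiters) are assumptions.

```python
from typing import Optional

SNIFF_BYTES = 5 * 1024  # "first 5 KB" from the text


def sniff_format(head: bytes) -> Optional[str]:
    """Guess a feed format from the start of a response body.

    Hypothetical heuristics; the order follows the text:
    gmc_xml, then json, then csv.
    """
    text = head[:SNIFF_BYTES].decode("utf-8", errors="replace")
    text = text.lstrip("\ufeff \t\r\n")  # drop BOM and leading whitespace
    if not text:
        return None

    # gmc_xml: markup that mentions the Google "g:" namespace.
    if text.startswith("<"):
        if "xmlns:g=" in text or "<g:" in text:
            return "gmc_xml"
        return None  # some other XML or HTML: no confident match

    # json: a top-level array or object. The sample may be truncated,
    # so only the opening character is checked.
    if text[0] in "[{":
        return "json"

    # csv: a header row containing at least one common delimiter.
    header = text.splitlines()[0]
    if any(delim in header for delim in (",", ";", "\t")):
        return "csv"

    return None
```

Returning `None` when nothing matches confidently leaves it to the caller to ask for an explicit format instead of guessing.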
Auto-maps 14 known roles. In five languages.
The Auto-Pilot schema detector reads your first 100 documents and matches column names against a registry of field roles (title, sku, brand, category, description, price, stock, image, url, rating, ranking_score, date, tags, plus localised aliases). You don't write a mapping rule for the common case.
"denumire" → title. "pret" → price. Out of the box.
The detector ships with aliases for Romanian, Spanish, German, French,
and Italian. Romanian shops commonly export denumire,
pret, marca, stoc, categorie
— all auto-mapped to their canonical Skryx roles without intervention.
- Two-pass matching: exact normalisation, then substring containment
- Only roles that actually appear in your feed get mapped — no inventing fields
- Override per source via the field_mapping JSON if a column needs a custom destination
// Detected role → actual feed column
{
  "title": "denumire",
  "sku": "cod",
  "price": "pret",
  "brand": "marca",
  "category": "categorie",
  "stock": "stoc",
  "image": "poza",
  "ranking_score": "popularitate"
}
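The two-pass matching described above (exact match on normalised names, then substring containment) might look something like this sketch. The alias table is limited to the Romanian examples shown in the text; the names `ROLE_ALIASES`, `normalise`, and `detect_roles`, and the exact normalisation rules, are assumptions made for illustration.

```python
import re
import unicodedata

# Canonical role -> known aliases. Only the aliases from the example
# mapping above; a real registry would cover more roles and languages.
ROLE_ALIASES = {
    "title": ["title", "denumire"],
    "sku": ["sku", "cod"],
    "price": ["price", "pret"],
    "brand": ["brand", "marca"],
    "category": ["category", "categorie"],
    "stock": ["stock", "stoc"],
    "image": ["image", "poza"],
    "ranking_score": ["ranking_score", "popularitate"],
}


def normalise(name: str) -> str:
    """Lowercase, strip diacritics, and drop non-alphanumerics (assumed rules)."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", "", no_marks.lower())


def detect_roles(columns):
    """Return {role: column} for roles that actually appear in the feed."""
    norm = {col: normalise(col) for col in columns}
    mapping, taken = {}, set()

    # Pass 1 uses exact equality, pass 2 falls back to substring containment.
    for exact in (True, False):
        for role, aliases in ROLE_ALIASES.items():
            if role in mapping:
                continue
            wanted = [normalise(a) for a in aliases]
            for col, n in norm.items():
                if col in taken:
                    continue
                hit = n in wanted if exact else any(w in n for w in wanted)
                if hit:
                    mapping[role] = col
                    taken.add(col)
                    break
    return mapping
```

For example, `detect_roles(["Denumire", "Cod", "Preț", "marca_produs", "culoare"])` maps the first three columns in the exact pass and `marca_produs` to `brand` in the substring pass, while `culoare` stays unmapped because no role matches it.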
"1.299,00 EUR" → 1299.00. "in_stock" → true.
Beyond column-name matching, Skryx normalises values per detected
type. Currency strings parse to floats. Pipe-separated lists become
string arrays. "yes" / "in_stock" / "available" all coerce to bool
true. Category paths split on > or |.
- Currency: handles thousand separators in EU and US formats
- Boolean: standard truthy strings across languages
- Tags / categories: pipe / arrow / comma split, deduplicated
- A field matching points / score / rank / popularity with numeric values is auto-promoted as the index's ranking_score tiebreaker
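To make the value normalisation concrete, here is a small sketch of the three coercions mentioned above. The function names are hypothetical, the truthy set contains the three strings from the text plus `true` and `1` as an assumption, and the rule for a lone separator followed by exactly three digits (treated as a thousands separator) is a guess at how an ambiguous value like "1.299" would be read.

```python
import re
from typing import List

# "yes" / "in_stock" / "available" come from the text; "true" and "1" are assumed.
TRUTHY = {"yes", "in_stock", "available", "true", "1"}


def parse_price(raw: str) -> float:
    """Parse EU- or US-formatted currency strings into a float."""
    digits = re.sub(r"[^0-9.,-]", "", raw)
    last_dot, last_comma = digits.rfind("."), digits.rfind(",")
    if last_dot != -1 and last_comma != -1:
        # Both present: whichever comes last is the decimal separator.
        decimal_sep = "." if last_dot > last_comma else ","
    else:
        sep = "." if last_dot != -1 else ","
        tail = digits.rsplit(sep, 1)[-1]
        # Assumption: a repeated separator, or one followed by exactly
        # three digits, groups thousands and is not a decimal point.
        is_thousands = digits.count(sep) > 1 or len(tail) == 3
        decimal_sep = None if is_thousands else sep
    cleaned = digits
    for sep in ".,":
        if sep != decimal_sep:
            cleaned = cleaned.replace(sep, "")
    if decimal_sep:
        cleaned = cleaned.replace(decimal_sep, ".")
    return float(cleaned)


def parse_bool(raw: str) -> bool:
    """Coerce common availability strings to a boolean."""
    return raw.strip().lower().replace(" ", "_") in TRUTHY


def split_list(raw: str) -> List[str]:
    """Split on arrow, pipe, or comma; trim and deduplicate, keeping order."""
    parts = (p.strip() for p in re.split(r"[>|,]", raw))
    return list(dict.fromkeys(p for p in parts if p))
```

With these rules, `parse_price("1.299,00 EUR")` and `parse_price("1,299.00")` both yield `1299.0`, and `split_list("Home > Garden | Tools, Garden")` yields `["Home", "Garden", "Tools"]`.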
Five cadences. Three strategies. Pick once.
Manual · hourly · every 6 h · daily 03:00 · weekly Mon 03:00.
Schedules tighter than the plan ceiling are disabled in the UI — Free
can pick manual / daily / weekly; Starter unlocks hourly;
Growth and above can sync as often as every hour. There's a "Sync now"
button on every source for ad-hoc runs.
- Source count limits per plan: Free 1, Starter 3, Growth 10, Business 50, Enterprise unlimited
- Dispatcher runs every minute and queues whatever's due
- Manual "Sync now" returns 202 and a queued run reference; UI polls every 4 s
{
"schedule": "every_6h",
// next run:
"next_sync_at": "2026-05-25T20:00:00Z",
"interval": "PT6H"
}
// Daily / weekly fixed at 03:00 UTC
// to skip your peak traffic window
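As an illustration of how a dispatcher could derive `next_sync_at` for the five cadences, consider the sketch below. Only `every_6h` appears as an identifier in the text; the other schedule names, the function name, and the choice to measure interval schedules from the current time are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Interval-based schedules. "every_6h" is from the example above;
# "hourly" is an assumed identifier.
INTERVALS = {"hourly": timedelta(hours=1), "every_6h": timedelta(hours=6)}


def next_sync_at(schedule: str, now: datetime) -> Optional[datetime]:
    """Return the next run time in UTC, or None for manual sources."""
    if schedule == "manual":
        return None
    if schedule in INTERVALS:
        return now + INTERVALS[schedule]

    # Daily and weekly runs are pinned to 03:00 UTC.
    candidate = now.replace(hour=3, minute=0, second=0, microsecond=0)
    if schedule == "daily":
        if candidate <= now:
            candidate += timedelta(days=1)
        return candidate
    if schedule == "weekly":
        # Move forward to Monday (weekday() == 0), then skip a week
        # if that slot has already passed.
        candidate += timedelta(days=(0 - candidate.weekday()) % 7)
        if candidate <= now:
            candidate += timedelta(days=7)
        return candidate
    raise ValueError(f"unknown schedule: {schedule}")
```

Called with `every_6h` at 2026-05-25T14:00:00Z, this returns 2026-05-25T20:00:00Z, matching the example payload above.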
upsert · upsert_delete_missing · full_replace.
upsert (default) inserts new SKUs, updates existing, leaves the rest alone. upsert_delete_missing additionally drops items not present in the new feed. full_replace wipes the index then re-imports — use only when your feed is authoritative.
- Diff is computed in-process: snapshots index IDs, tracks added / updated / deleted per batch
- Upserts go through the engine's native upsert action
- Per-batch error tracking: the first engine rejection per batch is surfaced in the run log so you see why docs failed, not just a count
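The difference between the three strategies is easiest to see as a diff over document IDs. The following is a simplified sketch, assuming documents are held as `{id: doc}` dictionaries and that an unchanged document does not count as updated; `plan_sync` is a hypothetical name, not part of the Skryx API.

```python
STRATEGIES = ("upsert", "upsert_delete_missing", "full_replace")


def plan_sync(strategy: str, index_docs: dict, feed_docs: dict) -> dict:
    """Compute which IDs would be added, updated, and deleted."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")

    # full_replace wipes the index first, so nothing counts as existing.
    existing = {} if strategy == "full_replace" else index_docs

    added = [i for i in feed_docs if i not in existing]
    # Assumption: only documents whose content changed count as updated.
    updated = [i for i in feed_docs if i in existing and feed_docs[i] != existing[i]]

    if strategy == "upsert":
        deleted = []  # leaves everything else alone
    elif strategy == "upsert_delete_missing":
        deleted = [i for i in index_docs if i not in feed_docs]
    else:  # full_replace
        deleted = list(index_docs)

    return {"added": added, "updated": updated, "deleted": deleted}
```

Given an index holding a, b, c and a feed holding a (unchanged), b (changed), and d, `upsert` adds d and updates b; `upsert_delete_missing` also deletes c; `full_replace` deletes all three and adds a, b, d.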
Auto-pause after five strikes. No noisy retries.
Built to survive flaky vendor feeds.
The job has a 1 800-second hard timeout, three retries with backoff [60s, 5 min, 15 min], and auto-pauses the source after five consecutive failed runs. You get an alert in the dashboard; the source toggle stays off until you flip it back manually.
- Custom auth: basic_auth_username + password (encrypted) and / or custom_headers JSON for Bearer-style tokens
- Run history retains products_fetched / added / updated / deleted / failed, error_message (1 000 chars), error_details (5 000 char trace), duration_ms, bytes_fetched
- Per-batch error: which row failed and the engine's reason, surfaced in the run detail page
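The retry and auto-pause behaviour can be approximated with the sketch below. The backoff values and the five-failure limit come from the text; the `Source` class, the function names, and the assumptions that a run counts as failed only once its retries are exhausted and that a success resets the counter are illustrative. The 1 800-second timeout is not modelled here.

```python
import time
from dataclasses import dataclass

RETRY_BACKOFF_SECONDS = [60, 300, 900]  # 60 s, 5 min, 15 min
MAX_CONSECUTIVE_FAILURES = 5


@dataclass
class Source:
    """Hypothetical stand-in for a data source record."""
    enabled: bool = True
    consecutive_failures: int = 0


def run_with_retries(job, sleep=time.sleep) -> bool:
    """Run job once, then up to three retries with backoff between them."""
    for attempt in range(1 + len(RETRY_BACKOFF_SECONDS)):
        try:
            job()
            return True
        except Exception:
            # Wait before the next retry; no wait after the final attempt.
            if attempt < len(RETRY_BACKOFF_SECONDS):
                sleep(RETRY_BACKOFF_SECONDS[attempt])
    return False


def record_run(source: Source, succeeded: bool) -> None:
    """Track consecutive failures and pause the source at the limit."""
    if succeeded:
        source.consecutive_failures = 0
        return
    source.consecutive_failures += 1
    if source.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
        source.enabled = False  # stays off until re-enabled manually
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting.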
Preview before you commit.
Hit the test endpoint with a URL + format and Skryx fetches a sample, detects format, counts total products, returns a 3-document preview, and surfaces warnings ("No products found", "Some products missing external_id"). The default field mapping it computed is dropped into the form so you only edit the bits that need fixing.
- Available both in the dashboard UI and as a public API endpoint
- Returns: format_detected, total_products, preview, default_mapping, bytes, warnings
- No side effects: nothing is written until you click Save
POST /v1/indexes/products/data-sources/test-connection
{
"url": "https://shop…/feed.xml",
"format": "auto"
}
// →
{
"format_detected": "gmc_xml",
"total_products": 26011,
"preview": [ /* 3 docs */ ],
"default_mapping": { /* … */ },
"warnings": []
}
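For calling the endpoint from a script, a request along these lines should be close. It is a sketch under assumptions: the base URL is a placeholder, `products` in the path is taken to be the index name, and authentication is not described in this section, so any headers are left to the caller. The function only builds the request; sending it is a separate step.

```python
import json
import urllib.request
from typing import Optional


def build_test_connection_request(
    base_url: str,
    index: str,
    feed_url: str,
    fmt: str = "auto",
    headers: Optional[dict] = None,
) -> urllib.request.Request:
    """Build (but do not send) a test-connection request.

    base_url and headers are supplied by the caller; the auth scheme is
    not specified in the text, so none is assumed here.
    """
    body = json.dumps({"url": feed_url, "format": fmt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/indexes/{index}/data-sources/test-connection",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", **(headers or {})},
    )
```

The resulting object can be passed to `urllib.request.urlopen`, and the JSON response read for `format_detected`, `total_products`, and `warnings` before saving the source.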
50 documents in, recommendations already on the dashboard.
When you sync your first feed, Auto-Pilot runs at the 50-document threshold (not the 1 000-document bulk threshold) and triggers AI Coach early. New tenants see their first ranking-rule, synonym, and stop-word suggestions before the feed even finishes importing — instead of waiting until next Monday's deep pass.