User reported that clicking a price in the matrix opened the OZON home page with the error 'Произошла ошибка!' ('An error occurred!').
Cause: the parsers captured sponsored links of the form /product/?sponsored=1&cpc=Jtiito95..., which expired after a few hours.
Fix:
- ozon.py: skip hrefs containing 'sponsored=1', '/promo/', or 'cpc='; strip the query string from the final URL
- yamarket.py: skip hrefs containing 'sponsored=1', 'cpc=', or 'advUuid' (the Я.Маркет sponsored marker)
- citilink.py: strip the query string from the final URL (defensive)
Now matrix links go to canonical product pages that don't expire.
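A minimal sketch of the skip-and-strip rule described in the fix list, assuming the check is a plain substring match on the href. The names `SPONSORED_MARKERS` and `canonical_product_url` are hypothetical and are not taken from ozon.py; the marker strings are the ones listed for ozon.py above.

```python
from typing import Optional, Sequence
from urllib.parse import urlsplit, urlunsplit

# Markers from the ozon.py fix list; the constant name is an assumption.
SPONSORED_MARKERS = ("sponsored=1", "/promo/", "cpc=")


def canonical_product_url(
    href: str, markers: Sequence[str] = SPONSORED_MARKERS
) -> Optional[str]:
    """Return href without its query string, or None for sponsored links."""
    # Sponsored/promo links expire, so they are skipped entirely.
    if any(marker in href for marker in markers):
        return None
    parts = urlsplit(href)
    # Rebuild the URL with an empty query; the fragment is dropped as well.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```

For yamarket.py the same helper could be called with `markers=("sponsored=1", "cpc=", "advUuid")`; for citilink.py an empty tuple would leave only the query-string stripping.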
DIAGNOSTIC RESULTS:
- OZON: 19 product links via Playwright on a bare VPS IP (no proxy) ✓
- Citilink: 112 data-meta-name Snippets ✓
- Wildberries: JSON API works with delays ✓
- Я.Маркет, DNS: blocked by ASN (a residential proxy is needed)
OZON PARSER:
- Pure Playwright DOM (composer-api dropped because it was blocked)
- Selects a[href*='/product/'], walks up to card div, extracts title/price/img
- Filters out fake 'titles' such as Распродажа ('Sale') and Скидка ('Discount')
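The title filter could look roughly like the sketch below. It assumes a case-insensitive prefix match against a small list of banner words; the actual rule in ozon.py may differ, and `BANNER_WORDS` and `is_product_title` are hypothetical names.

```python
# Banner words named above, lowercased; the real list may be longer.
BANNER_WORDS = ("распродажа", "скидка")


def is_product_title(text: str) -> bool:
    """Return False for empty text or text that starts with a banner word."""
    cleaned = text.strip().lower()
    if not cleaned:
        return False
    return not any(cleaned.startswith(word) for word in BANNER_WORDS)
```

A prefix match is used here so that a real product name that merely mentions a discount later in the string is not discarded.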
CITILINK PARSER (new):
- Selects [data-meta-name*='Snippet'] or ProductCard markers
- Fallback chain of multiple title selectors
- Filters out non-product hits
PARSERS/__init__.py:
- DEFAULT_SOURCES = (ozon, citilink, wb); all three work without a proxy
- Я.Маркет and DNS are kept but are not default; usable once a residential proxy is added
NEW ENDPOINT:
- GET /api/parse_citilink?q=...&limit=N
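A small sketch of how a client might build the request URL for this endpoint. Only the path and the `q` and `limit` parameters come from the line above; the helper name and the base URL in the usage note are assumptions.

```python
from urllib.parse import urlencode


def build_parse_citilink_url(base_url: str, query: str, limit: int) -> str:
    """Build the GET URL for /api/parse_citilink with an encoded query."""
    params = urlencode({"q": query, "limit": limit})
    return f"{base_url.rstrip('/')}/api/parse_citilink?{params}"
```

For example, `build_parse_citilink_url("http://localhost:8000", "rtx 4070", 5)` produces a URL ending in `/api/parse_citilink?q=rtx+4070&limit=5`; the host and port here are placeholders.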