zov-tech/backend-py/app/parsers
wasrusgen 03c95fe13a wb: relevance filter — discard anti-bot trash products (dresses/shoes in fridge search)
WB sometimes responds with 1-2 unrelated products instead of 429 status.
It was returning 'Платье вечернее' ("evening dress") for the query 'Haier холодильник' ("Haier fridge").

Fix: _is_relevant(product, query) checks that at least 1 significant query word (>=3 chars)
appears in product name or brand. Discards full result if zero matches.

Tradeoff: may sometimes reject valid product if query is overly specific (e.g. exact SKU).
But that's OK — we fall through to next query variant.
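
The fall-through to the next query variant could look roughly like this. It is a sketch under assumptions: search_with_fallback, fetch and is_relevant are hypothetical names, and how the real parser generates query variants is not shown in this commit.

```python
from typing import Callable, Iterable

Product = dict


def search_with_fallback(
    variants: Iterable[str],
    fetch: Callable[[str], list[Product]],
    is_relevant: Callable[[Product, str], bool],
) -> list[Product]:
    # Try each query variant in order, most specific first.
    for query in variants:
        relevant = [p for p in fetch(query) if is_relevant(p, query)]
        if relevant:
            return relevant
        # Nothing relevant: either anti-bot trash or an overly specific query
        # (e.g. an exact SKU), so fall through to the next variant.
    return []
```

A wrongly rejected valid product then costs one extra request, not a lost result, as long as a broader variant follows in the list.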
2026-05-11 23:02:37 +03:00
..
__init__.py backend: working parsers — OZON + Citilink (DOM via Playwright) + WB 2026-05-11 13:53:07 +03:00
citilink.py parsers: skip sponsored/ad URLs (cpc/sponsored=1) — they expire in 2-3 hours 2026-05-11 17:20:59 +03:00
dns.py dns+ozon: 4 retries with proxy rotation (residential pool has dirty IPs) 2026-05-11 16:37:28 +03:00
ozon.py parsers: skip sponsored/ad URLs (cpc/sponsored=1) — they expire in 2-3 hours 2026-05-11 17:20:59 +03:00
playwright_engine.py playwright_engine: route through proxy_pool — random residential IP per request 2026-05-11 16:05:36 +03:00
wb.py wb: relevance filter — discard anti-bot trash products (dresses/shoes in fridge search) 2026-05-11 23:02:37 +03:00
yamarket.py parsers: skip sponsored/ad URLs (cpc/sponsored=1) — they expire in 2-3 hours 2026-05-11 17:20:59 +03:00