zov-tech/backend-py
wasrusgen 1a57374020 parsers: better image extraction — real product photos in report cards
CITILINK:
- Now reads data-src / data-original / srcset / src in priority order
- srcset → picks largest size variant (last in comma-list)
- Filters out only _next/static/images placeholders and URLs containing 'placeholder'
- Accepts cs.citilink.ru / c.citilink.ru / images.citilink.ru product photos
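
The Citilink rules above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual parser: the function names (`largest_from_srcset`, `is_placeholder`, `pick_citilink_image`) are hypothetical, and the input is assumed to be a plain dict of an `<img>` tag's attributes.

```python
from typing import Mapping, Optional

# Attribute priority as listed in the commit message.
ATTR_PRIORITY = ("data-src", "data-original", "srcset", "src")


def largest_from_srcset(srcset: str) -> Optional[str]:
    """Return the URL of the last candidate in a srcset comma-list.

    Simplification: splits on commas, so URLs that themselves contain
    commas are not handled.
    """
    candidates = [part.strip() for part in srcset.split(",") if part.strip()]
    if not candidates:
        return None
    # Each candidate looks like "<url> <descriptor>"; keep only the URL.
    return candidates[-1].split()[0]


def is_placeholder(url: str) -> bool:
    """True for the two placeholder patterns named in the commit message."""
    return "_next/static/images" in url or "placeholder" in url.lower()


def pick_citilink_image(attrs: Mapping[str, str]) -> Optional[str]:
    """Walk the attributes in priority order and return the first real photo."""
    for name in ATTR_PRIORITY:
        value = (attrs.get(name) or "").strip()
        if not value:
            continue
        url = largest_from_srcset(value) if name == "srcset" else value
        if url and not is_placeholder(url):
            return url
    return None
```

Because only placeholders are rejected, any host that serves a real photo (cs.citilink.ru, c.citilink.ru, images.citilink.ru) passes through without a host allow-list.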

YANDEX.MARKET:
- Collects all img attrs (data-src, data-original, srcset, data-srcset, src)
- Prefers avatars.mds.yandex.net (real product CDN), skips yastatic (icons/logos)
- Auto-appends /300x300 suffix to avatars.mds URLs without size
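
A rough sketch of the Yandex.Market selection logic described above. The helper names are hypothetical, and two details are assumptions not stated in the commit message: a "size" is taken to be a trailing `/<width>x<height>` path segment, and a non-yastatic URL is used as a fallback when no avatars.mds URL is present.

```python
import re
from typing import List, Mapping, Optional

# Attributes collected from each <img>, as listed in the commit message.
IMG_ATTRS = ("data-src", "data-original", "srcset", "data-srcset", "src")
PRODUCT_CDN = "avatars.mds.yandex.net"
# Assumption: a sized URL ends with a "/<width>x<height>" segment.
SIZE_SUFFIX = re.compile(r"/\d+x\d+/?$")


def candidate_urls(attrs: Mapping[str, str]) -> List[str]:
    """Collect every URL found in the known image attributes."""
    urls: List[str] = []
    for name in IMG_ATTRS:
        value = (attrs.get(name) or "").strip()
        if not value:
            continue
        if name.endswith("srcset"):
            # srcset entries look like "<url> <descriptor>", comma-separated.
            urls.extend(p.split()[0] for p in value.split(",") if p.strip())
        else:
            urls.append(value)
    return urls


def with_default_size(url: str) -> str:
    """Append /300x300 to product-CDN URLs that carry no size suffix."""
    if PRODUCT_CDN in url and not SIZE_SUFFIX.search(url):
        return url.rstrip("/") + "/300x300"
    return url


def pick_yamarket_image(attrs: Mapping[str, str]) -> Optional[str]:
    """Prefer the product CDN, skip yastatic icons and logos."""
    urls = [u for u in candidate_urls(attrs) if "yastatic" not in u]
    for url in urls:
        if PRODUCT_CDN in url:
            return with_default_size(url)
    # Assumed fallback: first remaining non-yastatic URL, if any.
    return urls[0] if urls else None
```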

ENRICH_ONE (aggregator):
- Image picked by source priority: yamarket > wb > ozon > citilink > dns
- Yamarket photos are cleanest (avatars.mds.yandex.net)
- WB has product photos via basket-XX.wbbasket.ru
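
The source-priority rule above amounts to a first-match lookup. A minimal sketch, assuming the aggregator holds one image URL (or nothing) per source in a dict; `pick_image` and the dict shape are hypothetical and not taken from the repository:

```python
from typing import Mapping, Optional

# Priority order from the commit message: yamarket > wb > ozon > citilink > dns.
SOURCE_PRIORITY = ("yamarket", "wb", "ozon", "citilink", "dns")


def pick_image(images_by_source: Mapping[str, Optional[str]]) -> Optional[str]:
    """Return the image from the highest-priority source that has one."""
    for source in SOURCE_PRIORITY:
        url = images_by_source.get(source)
        if url:  # skips both missing keys and empty strings
            return url
    return None
```

Keeping the order in a single tuple means a change in source preference is a one-line edit.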
2026-05-11 23:43:25 +03:00
app parsers: better image extraction — real product photos in report cards 2026-05-11 23:43:25 +03:00
.dockerignore feat(infra): Python FastAPI backend + Docker compose for VPS deploy (GigaChat with Russian root CA) 2026-05-10 17:44:21 +03:00
Dockerfile backend: Playwright + Chromium for JS-rendered sites (Yandex.Market, OZON fallback) 2026-05-11 13:25:05 +03:00
requirements.txt backend: Playwright + Chromium for JS-rendered sites (Yandex.Market, OZON fallback) 2026-05-11 13:25:05 +03:00