WB sometimes responds with 1-2 unrelated products instead of a 429 status.
It returned 'Платье вечернее' (an evening dress) for the query 'Haier холодильник' (Haier fridge).
Fix: _is_relevant(product, query) checks that at least one significant query word (>=3 chars)
appears in the product name or brand. The full result is discarded if there are zero matches.
Tradeoff: may occasionally reject a valid product if the query is overly specific (e.g. an exact SKU).
That is acceptable: we fall through to the next query variant.
DISCOVERED in real test:
- WB API v9 (/exactmatch/ru/common/v9/search) now returns only metadata
(name, query, shardKey, filters, search_result={}); products is empty
- WB API v18 (/exactmatch/ru/common/v18/search) works
Structure: {metadata, products, total}, with products at the TOP level (not data.products)
- Confirmed: query='Haier холодильник' → 100 products via v18
CHANGES:
1. _SEARCH_URL → v18 endpoint
2. Parsing products: first data.products (legacy fallback), then top-level products
3. _build_item: prices are now read from sizes[].price.{product, total, basic}
(v18 format), with a fallback to priceU/salePriceU (v9 legacy)
4. _generate_query_variants: added a brand+category fallback
('Bosch холодильник' if nothing is found by model)
TEST: Haier холодильник → 100 results (first: 'Холодильник двухкамерный C2F619CFU1')
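A rough sketch of changes 2 and 3, assuming the payload is already decoded JSON. The function names `extract_products` and `extract_price` are hypothetical, and the priority order among product/total/basic and salePriceU/priceU is an assumption; values are returned in whatever units the API uses.

```python
from typing import Any, Optional

def extract_products(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the product list from either response layout."""
    # Legacy layout first: {"data": {"products": [...]}}
    legacy = (payload.get("data") or {}).get("products")
    if legacy:
        return legacy
    # v18 layout: {"metadata": ..., "products": [...], "total": ...}
    return payload.get("products") or []

def extract_price(product: dict[str, Any]) -> Optional[int]:
    """Return the first price found, in raw API units, or None."""
    # v18 format: sizes[].price.{product, total, basic}
    for size in product.get("sizes") or []:
        price = size.get("price") or {}
        for key in ("product", "total", "basic"):  # order is an assumption
            if price.get(key):
                return price[key]
    # v9 legacy fields on the product itself
    for key in ("salePriceU", "priceU"):           # order is an assumption
        if product.get(key):
            return product[key]
    return None
```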
A user reported that clicking matrix prices led to 'Произошла ошибка!' ("An error occurred!") on the OZON home page.
Cause: parsers captured /product/?sponsored=1&cpc=Jtiito95... links that died after a few hours.
Fix:
- ozon.py: skip href with 'sponsored=1', '/promo/', 'cpc='. Strip query string from final URL.
- yamarket.py: skip 'sponsored=1', 'cpc=', 'advUuid' (Я.Маркет sponsored marker)
- citilink.py: strip query string from final URL (defensive)
Now matrix links go to canonical product pages that don't expire.
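The link cleanup above could look roughly like this. `canonical_product_url` and `AD_MARKERS` are hypothetical names; the marker list is the ozon.py one from this change, and the example URLs in the usage note are made up.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Substrings that mark a sponsored/promo link (ozon.py list from this change).
AD_MARKERS = ("sponsored=1", "/promo/", "cpc=")

def canonical_product_url(href: str, markers: tuple[str, ...] = AD_MARKERS) -> Optional[str]:
    """Return href without query string/fragment, or None for sponsored links."""
    if any(marker in href for marker in markers):
        return None  # skip the link entirely; it expires after a few hours
    parts = urlsplit(href)
    # Keep scheme, host and path; drop the query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```

For yamarket.py the same function would presumably be called with markers=("sponsored=1", "cpc=", "advUuid").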
1. MODEL COUNT SELECTOR (strategy step):
- new PODBOR_MODEL_COUNTS [3/5/7]
- state.model_count default '5'
- UI on strategy page with description (быстро/оптимально/максимум: fast/optimal/maximum)
2. AI PROMPT EXPANDED:
- new field: manual_search_query — for Google search instruction PDF
- new specs object per model: dimensions_mm/volume_l/weight_kg/noise_db/energy_class/color
- explicit rule: 'specs ОБЯЗАТЕЛЬНЫ для проектирования кухни' (specs are REQUIRED for kitchen design)
- reads checklist.model_count to determine how many models per category
- max_tokens 4000 → 8000 (room for richer responses)
3. MODEL CARD RICHER:
- _renderSpecsBlock — characteristics in 2-col grid, dimensions highlighted
- _renderUtilityLinks: Google search buttons for 'инструкция (PDF)' (manual) + 'Схема установки' (installation diagram)
- Specs critical for ZOV kitchen design (manager needs to verify niche fits)
4. EXPORT BUTTONS:
- 'Скачать HTML' (Download HTML): generates standalone HTML with inline styles, downloads as file
- 'Печать → PDF' (Print → PDF): opens a new window with a cleaned layout and auto-prints
- User can save as PDF via system print dialog
5. PREVIEW updated with realistic specs/manual_query for all 3 fridges
- New isValidPhone(raw): checks 11-digit Russian after normalization (8/7/+7/9-prefix)
- Intro 'Начать' (Start) button now uses a custom click handler instead of data-go
- Validates name (non-empty) and phone (Russian format)
- Inline .field-error red message under invalid field
- .field-hint shows format help under phone input
- Haptic 'warning' feedback on invalid submit
- Phone is auto-normalized to '+7 900 123-45-67' before transition
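The real isValidPhone lives in the frontend JS; below is a Python rendition of the same idea, as a sketch only. The exact handling of each prefix (8/7/+7/9) is an assumption, and `normalize_phone` is a hypothetical helper name.

```python
import re
from typing import Optional

def normalize_phone(raw: str) -> Optional[str]:
    """Return '+7 XXX XXX-XX-XX' for a valid Russian number, else None."""
    digits = re.sub(r"\D", "", raw)           # drop '+', spaces, brackets, dashes
    if len(digits) == 10 and digits.startswith("9"):
        digits = "7" + digits                 # 9XXXXXXXXX -> 79XXXXXXXXX
    elif len(digits) == 11 and digits.startswith("8"):
        digits = "7" + digits[1:]             # 8XXXXXXXXXX -> 7XXXXXXXXXX
    if len(digits) != 11 or not digits.startswith("7"):
        return None
    return f"+7 {digits[1:4]} {digits[4:7]}-{digits[7:9]}-{digits[9:11]}"

def is_valid_phone(raw: str) -> bool:
    """True when the input normalizes to an 11-digit Russian number."""
    return normalize_phone(raw) is not None
```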
- DNS: we used httpx + proxy_pool, but Qrator returned 401 even with residential proxies
→ now Playwright + residential proxy; the browser solves the JS challenge itself
- OZON: now we check only for an exact <title>='Доступ ограничен' match, not for the '/robotcheck/' substring
Я.Маркет renders a SnippetConstructor widget with JSON state INSIDE the a tag.
As a result, link.get_text() returns junk such as {'widgets':{...}}.
Fix:
- copy.copy(card), then remove <script>/<noscript>/<noframes>/<template>
- Title is now taken from the URL slug as the first priority (always clean)
- _slug_to_title: transliteration and capitalization
'bosch-kgn39ul30u-dvukhkamernyy-kholodilnik-no-frost-seryy-metallik' →
'Bosch KGN39UL30U Двухкамерный Холодильник NoFrost Серый Металлик'
- Old /product--{id} URLs deprecated
- Walks up from a[href*='/card/'] to nearest article/zone-div
- Extracts title from link text or h2/h3/itemprop=name
- Price: min from card text (with sanity bounds 100..10M)
- Image: filters out yastatic / _next placeholders
- Rating: '4.7★' or '4.7 N оценок' pattern
- Reviews: 'N отзывов' / 'N оценок'
- Stores count: 'от N магазинов / предложений'
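A sketch of the text-based extraction listed above, using only the card's visible text. The regexes are assumptions (in particular, that prices are followed by '₽'); the function names are hypothetical. Only the 100..10M bounds and the rating/review phrasings come from this change.

```python
import re
from typing import Optional

PRICE_RE = re.compile(r"(\d[\d\s]*)\s*₽")                    # assumed price format
RATING_RE = re.compile(r"(\d[.,]\d)\s*(?:★|\d+\s+оцен)")     # '4.7★' or '4.7 N оценок'
REVIEWS_RE = re.compile(r"(\d+)\s+(?:отзыв|оцен)")           # 'N отзывов' / 'N оценок'

def min_price(text: str, low: int = 100, high: int = 10_000_000) -> Optional[int]:
    """Smallest price in the text that falls inside the sanity bounds."""
    found = []
    for match in PRICE_RE.finditer(text):
        value = int(re.sub(r"\D", "", match.group(1)))  # '54 990 ' -> 54990
        if low <= value <= high:
            found.append(value)
    return min(found) if found else None

def parse_rating(text: str) -> Optional[float]:
    """Rating such as 4.7, accepting either '.' or ',' as the decimal mark."""
    match = RATING_RE.search(text)
    return float(match.group(1).replace(",", ".")) if match else None

def parse_reviews(text: str) -> Optional[int]:
    """Number of reviews/ratings, if the card text mentions one."""
    match = REVIEWS_RE.search(text)
    return int(match.group(1)) if match else None
```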
- New use_proxy param (default True)
- Per-request random proxy from pool
- _parse_proxy_url_for_playwright converts http://user:pass@host:port to playwright.proxy dict
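For reference, a sketch of what the conversion to Playwright's proxy dict (keys server/username/password) might look like. This is a guess at the behavior of _parse_proxy_url_for_playwright, not its actual body; the name below drops the underscore to mark it as illustrative.

```python
from urllib.parse import urlsplit

def parse_proxy_url_for_playwright(proxy_url: str) -> dict[str, str]:
    """Convert http://user:pass@host:port into a Playwright proxy dict."""
    parts = urlsplit(proxy_url)
    server = f"{parts.scheme}://{parts.hostname}"
    if parts.port:
        server += f":{parts.port}"
    proxy = {"server": server}
    # Credentials go into separate keys rather than into the server URL.
    if parts.username:
        proxy["username"] = parts.username
    if parts.password:
        proxy["password"] = parts.password
    return proxy
```

The resulting dict is the shape accepted by the proxy option of Playwright's browser launch and context creation calls.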
- New env: PROXY_LIST_FILE — path to file with one proxy per line
- _normalize_proxy_entry accepts: http://user:pass@host:port, host:port:user:pass (Proxys.io format), host:port
- _load_from_file reads file, dedup with static list
- /api/proxy_status returns file_path, file_loaded count, sample (first 3 masked)
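A sketch of the three accepted formats and the file dedup, under stated assumptions: names without the leading underscore are illustrative, skipping '#' comment lines is my addition, and bare entries are assumed to be http://.

```python
from typing import Iterable, Optional

def normalize_proxy_entry(entry: str) -> Optional[str]:
    """Normalize one line of the proxy file to an http://... URL, or None."""
    entry = entry.strip()
    if not entry or entry.startswith("#"):   # comment skipping is an assumption
        return None
    if "://" in entry:                       # already a full URL
        return entry
    parts = entry.split(":")
    if len(parts) == 4:                      # host:port:user:pass (Proxys.io format)
        host, port, user, password = parts
        return f"http://{user}:{password}@{host}:{port}"
    if len(parts) == 2:                      # host:port
        host, port = parts
        return f"http://{host}:{port}"
    return None

def load_proxy_lines(lines: Iterable[str], known: Iterable[str] = ()) -> list[str]:
    """Normalize file lines, dropping duplicates and entries already in `known`."""
    seen = set(known)
    result = []
    for line in lines:
        url = normalize_proxy_entry(line)
        if url and url not in seen:
            seen.add(url)
            result.append(url)
    return result
```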
WHAT CHANGED:
- New _renderPriceMatrix(models) — table with rows=models, columns=stores
- Inserted as PRIMARY view above model cards (was secondary accordion)
- Columns dynamically include only stores that returned data
- Sticky model column (left) — scrolls horizontally on mobile
- Best price per row highlighted: green bg + ✓ badge + green text
- Empty cells: '—' if no URL, 'смотреть →' (view →) if there is a URL but no price yet
- 'Мин' (Min) column on the far right: explicit cheapest price summary
CSS:
- .report-matrix-wrap with rounded card
- Sticky col-model with box-shadow on right edge
- Cell-price.best with rgba green background
- .best-mark circle badge
PREVIEW:
- Updated mock with 3 fridges + 3 hobs across multiple stores (real pricing spread)
- Demonstrates min-price highlighting working
UX:
- User can now visually compare 'where is it cheapest' at a glance
- Tap any cell with price → opens store page
- Tap empty cell with URL → opens search in store
NEXT: same matrix can become PDF/Excel export for client briefcase
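The real _renderPriceMatrix is frontend JS; the data logic under WHAT CHANGED above can be sketched in Python as follows. The input shape (a per-model "offers" dict keyed by store, with optional price/url) and the function name are assumptions.

```python
def build_price_matrix(models: list[dict]) -> dict:
    """Build rows=models, columns=stores; mark the cheapest store per row."""
    # Columns: only stores that returned a price or a URL for at least one model.
    stores: list[str] = []
    for model in models:
        for store, offer in model.get("offers", {}).items():
            has_data = offer.get("price") is not None or bool(offer.get("url"))
            if has_data and store not in stores:
                stores.append(store)

    rows = []
    for model in models:
        offers = model.get("offers", {})
        prices = {s: offers[s]["price"] for s in stores
                  if offers.get(s, {}).get("price") is not None}
        best = min(prices, key=prices.get) if prices else None
        cells = {}
        for store in stores:
            offer = offers.get(store, {})
            if offer.get("price") is not None:
                cells[store] = str(offer["price"])
            elif offer.get("url"):
                cells[store] = "смотреть →"   # URL known, price not yet
            else:
                cells[store] = "\u2014"        # dash cell: no URL at all
        rows.append({"model": model["name"], "cells": cells, "best_store": best,
                     "min_price": prices[best] if best else None})
    return {"stores": stores, "rows": rows}
```

The best_store field is what would drive the green highlight, and min_price feeds the far-right summary column.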
DIAGNOSTIC RESULTS:
- OZON: 19 product links via Playwright on naked VPS-IP ✓
- Citilink: 112 data-meta-name Snippets ✓
- Wildberries: JSON API works with delays ✓
- Я.Маркет, DNS: blocked by ASN (need residential proxy)
OZON PARSER:
- Pure Playwright DOM (composer-api dropped — was blocked)
- Selects a[href*='/product/'], walks up to card div, extracts title/price/img
- Filters out fake 'titles' like Распродажа (Sale), Скидка (Discount)
CITILINK PARSER (new):
- Selects [data-meta-name*='Snippet'] or ProductCard markers
- Multiple title selectors fallback chain
- Filters out non-product hits
PARSERS/__init__.py:
- DEFAULT_SOURCES = (ozon, citilink, wb) — all work without proxy
- Я.Маркет, DNS kept but not default — usable when residential proxy added
NEW ENDPOINT:
- GET /api/parse_citilink?q=...&limit=N
- proxy_pool now loads from both PROXY_STATIC_LIST (env, comma-separated) and PROXY6_TOKEN (API)
- Static list has priority, merged with API list (dedup by URL)
- /api/proxy_status returns masked proxy URLs for diagnostic (passwords hidden)
- Supports formats: 'http://user:pass@host:port' or 'host:port' (assumed http://)
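A sketch of the merge and masking rules above. Function names are hypothetical; the '***' placeholder is an assumption about how passwords are hidden.

```python
from urllib.parse import urlsplit

def merge_proxy_lists(static: list[str], from_api: list[str]) -> list[str]:
    """Static entries first, then API entries; duplicates by URL are dropped."""
    # dict.fromkeys keeps first-occurrence order, so static entries win.
    return list(dict.fromkeys([*static, *from_api]))

def mask_proxy_url(proxy_url: str) -> str:
    """Hide the password part of a proxy URL for diagnostic output."""
    parts = urlsplit(proxy_url)
    if parts.password is None:
        return proxy_url                     # nothing to hide
    port = f":{parts.port}" if parts.port else ""
    return f"{parts.scheme}://{parts.username}:***@{parts.hostname}{port}"
```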
PROXY POOL (app/proxy_pool.py):
- Loads active proxies from Proxy6.net API every 10 min
- Random rotation per request via proxied_client(timeout, headers)
- Graceful fallback to direct HTTP if PROXY6_TOKEN not set
- Config: PROXY6_TOKEN env var
PARSERS (app/parsers/):
- dns.py — refactored to use proxy_pool with retry+rotation on Qrator block
- wb.py — Wildberries JSON API (search.wb.ru), retries on 429
- ozon.py — OZON composer-api JSON (widgetStates extraction)
- yamarket.py — Я.Маркет HTML + embedded JSON parser
- __init__.py — enrich_one() fans out to all sources, aggregates min/max prices, max rating, sum reviews
- enrich_models() — batch enrich for AI by_category output
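The aggregation step of enrich_one() might look like the sketch below. The per-item keys (price, rating, reviews), the output key names, and the function name `aggregate_offers` are assumptions; the fan-out to sources is omitted.

```python
from typing import Any

def aggregate_offers(by_source: dict[str, list[dict[str, Any]]]) -> dict[str, Any]:
    """Combine per-source results: min/max price, max rating, summed reviews."""
    items = [item for found in by_source.values() for item in found]
    prices = [i["price"] for i in items if i.get("price") is not None]
    ratings = [i["rating"] for i in items if i.get("rating") is not None]
    return {
        "price_min": min(prices, default=None),
        "price_max": max(prices, default=None),
        "rating": max(ratings, default=None),
        # Missing or null review counts contribute zero.
        "reviews": sum(i.get("reviews") or 0 for i in items),
    }
```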
NEW DIAGNOSTIC ENDPOINTS (main.py):
- GET /api/parse_wb?q=...&limit=N
- GET /api/parse_ozon?q=...&limit=N
- GET /api/parse_yamarket?q=...&limit=N
- GET /api/parse_all?q=... — fan-out + aggregate
- GET /api/proxy_status — pool diagnostics (count, token configured, age)
PODBOR (main.py):
- _enrich_ai_with_dns -> _enrich_ai_marketplaces (uses all sources)
DEPLOY: needs PROXY6_TOKEN in /opt/zov-tech/deploy/.env on VPS, then docker compose build + up -d backend
AI PROMPT (ai.py):
- Documents the new checklist shape (per_cat.answers, brand_strategy, single_brand, brands, budget_preset, pick_strategies)
- Asks for 3-5 models for EACH category (not one)
- New response format: by_category[cat].models[] with brand/model/price_min/price_max/search_query/pros/cons/tier
- Detailed rules for brand strategies (single → all appliances from one brand; different → preferred/acceptable/avoid)
- Budget presets with automatic distribution across categories (fridge ~25%, hob ~12%, etc.)
DNS PARSER (parsers/dns.py):
- search_dns(query, limit): HTTP + BeautifulSoup
- Realistic User-Agent, fallback to JSON-LD if the HTML selectors fail
- enrich_models(models): enriches the AI's model list by adding dns: {title, price, image, url, rating, reviews}
- Polite delay of 0.4 s between requests
MAIN.PY:
- /api/parse_dns?q=...: test endpoint for checking the parser
- _handle_podbor now calls _enrich_ai_with_dns for each model after the AI step
- _format_podbor_for_telegram rewritten for the new by_category format; outputs 3-5 models per category with pros/cons
- Fallback to the old items[] format for compatibility
REQUIREMENTS:
- + beautifulsoup4 >= 4.12
- + lxml >= 5.2
DEPLOY: after pushing to the VPS, rebuild the backend container (docker compose up --build -d backend)
CATEGORIES MIGRATED to steps[] schema:
- hob: Heat source → Subtype (multi, optionsBy) → Size → Burners → Features
- oven: Installation → Functions (multi) → Size → Placement (cond:built_in) → Features
- dw: Built-in type → Class (multi) → Width → Baskets → Features
- hood: Form factor → Connection → Width → Color (cond:visible-types) → Features
- microwave: Installation → Functions (multi) → Size (optionsBy) → Features
- coffee: Type → Milk (cond:grinder/manual) → Water (cond:built-in/tap) → Size (cond:built-in) → Features
- washer: Installation → Function → Depth → Load → Capacity → Features
NEW PODBOR.JS FEATURES:
- isStepActive(step, answers) — predicate for condition field
- findNextActiveIdx / findPrevActiveIdx — skip inactive steps in navigation
- Auto-advance through inactive on single-select pick
- Review screen filters inactive steps
- isCategoryFilled checks only active single-steps
- buildPerCatSummary skips inactive
- Clearing dependent answers when condition's parent changes (in addition to optionsBy)
NEXT: pictograms for step 1 of each category (currently text-pin layout)
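The navigation helpers above are JS in podbor.js; a Python rendition of the idea, as a sketch. The shape of the condition field (a "cond" dict with "key" and "any_of") is hypothetical and only mirrors notes like cond:built_in.

```python
from typing import Optional

def is_step_active(step: dict, answers: dict) -> bool:
    """A step without a condition is always active; otherwise check the parent answer."""
    cond = step.get("cond")
    if not cond:
        return True
    given = answers.get(cond["key"])
    given_values = given if isinstance(given, list) else [given]  # multi-select support
    return any(value in cond["any_of"] for value in given_values)

def find_next_active_idx(steps: list[dict], answers: dict, idx: int) -> Optional[int]:
    """Index of the next active step after idx, or None at the end."""
    for i in range(idx + 1, len(steps)):
        if is_step_active(steps[i], answers):
            return i
    return None

def find_prev_active_idx(steps: list[dict], answers: dict, idx: int) -> Optional[int]:
    """Index of the previous active step before idx, or None at the start."""
    for i in range(idx - 1, -1, -1):
        if is_step_active(steps[i], answers):
            return i
    return None
```

With the oven flow, a freestanding answer would skip the placement step in both directions, while built_in would stop on it.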
- Visible on all steps after categories are selected
- Highlights current category when inside its wizard
- Filled categories show checkmark
- Tap chip jumps directly to that category's wizard
- Horizontal scroll if many categories don't fit
- New PODBOR_PARAMS schema with steps[] supporting single/multi + optionsBy branches
- 11 fridge SVG pictograms in podbor.picts.js (style D — 3D perspective with shadow)
- renderCategoryWizard with step-by-step flow, chips for prior answers, review screen
- Legacy renderCategoryDetail still used for other 7 categories until migrated
- Auto-advance on single-select, 'Дальше' (Next) button for multi-select
- Backend-compatible: per_cat[catKey].answers replaces .params/.features