Commit 99b72698b556ae19a00ba4cb6206a1343f033abe
1 parent: 5c9baf91
Consolidate test regression hooks
## Change list

### Fixes (6 drifted test cases, all updated to the current implementation)

- `tests/test_eval_metrics.py` — rewritten around the new 4-level label scheme and cascade-formula assertions; drops the old `RELEVANCE_EXACT/HIGH/LOW/IRRELEVANT` constants and the hard-coded ERR values.
- `tests/test_embedding_service_priority.py` — supplies the now-required `_TextDispatchTask(user_id=...)` argument.
- `tests/test_embedding_pipeline.py` — the cache-hit `np.allclose` comparison now goes through `np.asarray(..., dtype=float32)` to avoid object-dtype arrays.
- `tests/test_es_query_builder_text_recall_languages.py` — expectations for the secondary keywords `combined_fields` clause aligned with the current values (`MSM 60% / boost 0.8`) and renamed accordingly.
- `tests/test_product_enrich_partial_mode.py`
  - `test_create_prompt_supports_taxonomy_analysis_kind`: drops the wrong assumption (fr does not belong to any taxonomy schema) and pins down the `(None, None, None)` sentinel contract.
  - `test_build_index_content_fields_non_apparel_taxonomy_returns_en_only`: the fake now mirrors real schema behavior (unsupported languages return an empty list); the stale "zh was never called" assertion is removed.

### Cleanup of historical transition artifacts (per the development principles: no internal dual tracks)

- Deleted `tests/test_keywords_query.py` (an early prototype superseded by the production implementation in `query/keyword_extractor.py`).
- Moved `tests/test_facet_api.py` / `tests/test_cnclip_service.py` into `tests/manual/`; `tests/manual/README.md` updated to document the split.
- Rewrote `tests/conftest.py`: only the `sys.path` injection remains; the repo-wide unreferenced fixtures (`sample_search_config / mock_es_client / test_searcher / temp_config_file`, etc.) are gone.
- Removed the 13 leftover `@pytest.mark.unit` decorators in `tests/test_suggestions.py` (the module-level `pytestmark` already covers them).

### New consistency infrastructure

- `pytest.ini`: the authoritative configuration source. `testpaths = tests`, `norecursedirs = tests/manual`, `--strict-markers`, and registration of all subsystem markers plus the `regression` marker.
- `tests/ci/test_service_api_contracts.py` plus 30 `tests/test_*.py` files tagged with `pytestmark = [pytest.mark.<subsystem>, pytest.mark.regression]` (inserted AST-safely, avoiding multi-line imports).
- New `scripts/run_regression_tests.sh`, with `SUBSYSTEM=<name>` to select a subsystem subset.
- `scripts/run_ci_tests.sh` expanded: from the old `tests/ci -q` to two phases, the `contract` marker plus `search ∧ regression`.

### Documentation unification (historical dual track removed)

- Rewrote `docs/测试Pipeline说明.md`: removed references to `tests/unit/`, `tests/integration/`, `scripts/start_test_environment.sh` and other long-gone artifacts; now covers directory conventions, the marker table, the regression anchor matrix, the coverage-gap list, and manual integration-script usage.
- Deleted `docs/测试回归钩子梳理-2026-04-20.md` (content merged into the authoritative document above, retired per the "single source of truth" principle).
- Rewrote `docs/DEVELOPER_GUIDE.md` §8.2 (Testing) to point at the authoritative pipeline document.
- Updated the `Testing` and `Testing Infrastructure` sections of `CLAUDE.md` in step.

## Final state

| Metric | Result |
|------|------|
| Full `pytest tests/` | **241 passed** |
| `./scripts/run_ci_tests.sh` | 45 passed |
| `./scripts/run_regression_tests.sh` | 233 passed |
| Subsystem subsets (sample) | search=45 / rerank=35 / embedding=23 / intent=25 / translation=33 / indexer=17 / suggestion=13 / query=6 / eval=8 / contract=34 |
| Known gaps not yet closed | see §4 of the new `测试Pipeline说明.md` (6 items: function_score / facet / image search / config loader / document_transformer, etc.) |

I did not force-fill test cases for the coverage gaps in §4 of the pipeline doc — that would be new coverage, outside the scope of this cleanup. Whoever closes a gap later only needs to tag the corresponding markers and cross the entry off the list.
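The `np.allclose` fix in the change list above can be sketched as a minimal repro — assuming the cache-hit payload deserializes into an object-dtype array (the shapes and values here are illustrative):

```python
import numpy as np

# Hypothetical cache-hit payload: embedding values that deserialize into an
# object-dtype array, which np.allclose cannot compare directly (its internal
# isfinite check does not support object dtype).
cached = np.array([np.float32(0.1), np.float32(0.2)], dtype=object)
expected = np.array([0.1, 0.2], dtype=np.float32)

# Normalizing through asarray(..., dtype=np.float32) restores a numeric dtype,
# making the comparison well-defined.
normalized = np.asarray(cached, dtype=np.float32)
assert np.allclose(normalized, expected)
```

The same normalization applies to both sides of the comparison, so the assertion no longer depends on how the cache layer happens to materialize its arrays.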
Showing 45 changed files with 593 additions and 930 deletions.
CLAUDE.md
| @@ -99,18 +99,29 @@ python main.py serve --host 0.0.0.0 --port 6002 --reload | @@ -99,18 +99,29 @@ python main.py serve --host 0.0.0.0 --port 6002 --reload | ||
| 99 | 99 | ||
| 100 | ### Testing | 100 | ### Testing |
| 101 | ```bash | 101 | ```bash |
| 102 | -# Run all tests | ||
| 103 | -pytest tests/ | 102 | +# CI gate (API contracts + search core regression anchors) |
| 103 | +./scripts/run_ci_tests.sh | ||
| 104 | + | ||
| 105 | +# Full regression anchor suite (pre-release / pre-merge) | ||
| 106 | +./scripts/run_regression_tests.sh | ||
| 107 | + | ||
| 108 | +# Subsystem-scoped regression (e.g. search / query / intent / rerank / embedding / translation / indexer / suggestion) | ||
| 109 | +SUBSYSTEM=rerank ./scripts/run_regression_tests.sh | ||
| 104 | 110 | ||
| 105 | -# Run focused regression sets | ||
| 106 | -python -m pytest tests/ci -q | 111 | +# Whole automated suite |
| 112 | +python -m pytest tests/ -q | ||
| 113 | + | ||
| 114 | +# Focused debugging | ||
| 107 | pytest tests/test_rerank_client.py | 115 | pytest tests/test_rerank_client.py |
| 108 | pytest tests/test_query_parser_mixed_language.py | 116 | pytest tests/test_query_parser_mixed_language.py |
| 109 | 117 | ||
| 110 | -# Test search from command line | 118 | +# Command-line smoke |
| 111 | python main.py search "query" --tenant-id 1 --size 10 | 119 | python main.py search "query" --tenant-id 1 --size 10 |
| 112 | ``` | 120 | ``` |
| 113 | 121 | ||
| 122 | +See `docs/测试Pipeline说明.md` for the authoritative test pipeline guide, | ||
| 123 | +including the regression hook matrix and marker conventions. | ||
| 124 | + | ||
| 114 | ### Development Utilities | 125 | ### Development Utilities |
| 115 | ```bash | 126 | ```bash |
| 116 | # Stop all services | 127 | # Stop all services |
| @@ -218,24 +229,24 @@ The system uses centralized configuration through `config/config.yaml`: | @@ -218,24 +229,24 @@ The system uses centralized configuration through `config/config.yaml`: | ||
| 218 | 229 | ||
| 219 | ## Testing Infrastructure | 230 | ## Testing Infrastructure |
| 220 | 231 | ||
| 221 | -**Test Framework**: pytest with async support | 232 | +**Framework**: pytest. Authoritative guide: `docs/测试Pipeline说明.md`. |
| 233 | + | ||
| 234 | +**Layout**: | ||
| 235 | +- `tests/` — flat file layout; each file targets one subsystem. | ||
| 236 | +- `tests/ci/` — API / service contract tests (FastAPI `TestClient` with fake backends). | ||
| 237 | +- `tests/manual/` — scripts that need live services (pytest does **not** collect these). | ||
| 238 | +- `tests/conftest.py` — sys.path injection only. No global fixtures; all fakes live next to the tests that use them. | ||
| 222 | 239 | ||
| 223 | -**Test Structure**: | ||
| 224 | -- `tests/conftest.py`: Comprehensive test fixtures and configuration | ||
| 225 | -- `tests/unit/`: Unit tests for individual components | ||
| 226 | -- `tests/integration/`: Integration tests for system workflows | ||
| 227 | -- Test markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.api` | 240 | +**Markers** (registered in `pytest.ini`, enforced by `--strict-markers`): |
| 241 | +- Subsystem: `contract`, `search`, `query`, `intent`, `rerank`, `embedding`, `translation`, `indexer`, `suggestion`, `eval`. | ||
| 242 | +- Regression gate: `regression` — anchor tests mandatory for `run_regression_tests.sh`. | ||
| 228 | 243 | ||
| 229 | **Test Data**: | 244 | **Test Data**: |
| 230 | - Tenant1: Mock data with 10,000 product records | 245 | - Tenant1: Mock data with 10,000 product records |
| 231 | - Tenant2: CSV-based test dataset | 246 | - Tenant2: CSV-based test dataset |
| 232 | - Automated test data generation via `scripts/mock_data.sh` | 247 | - Automated test data generation via `scripts/mock_data.sh` |
| 233 | 248 | ||
| 234 | -**Key Test Fixtures** (from `conftest.py`): | ||
| 235 | -- `sample_search_config`: Complete configuration for testing | ||
| 236 | -- `mock_es_client`: Mocked Elasticsearch client | ||
| 237 | -- `test_searcher`: Searcher instance with mock dependencies | ||
| 238 | -- `temp_config_file`: Temporary YAML configuration for tests | 249 | +**Principle**: tests must inject fakes for ES / DeepL / LLM / Redis. Never add tests that rely on real external services to the automated suite — put them under `tests/manual/`. |
| 239 | 250 | ||
| 240 | ## API Endpoints | 251 | ## API Endpoints |
| 241 | 252 |
docs/DEVELOPER_GUIDE.md
| @@ -386,11 +386,16 @@ services: | @@ -386,11 +386,16 @@ services: | ||
| 386 | 386 | ||
| 387 | ### 8.2 测试 | 387 | ### 8.2 测试 |
| 388 | 388 | ||
| 389 | -- **位置**:`tests/`,可按 `unit/`、`integration/` 或按模块划分子目录;公共 fixture 在 `conftest.py`。 | ||
| 390 | -- **标记**:使用 `@pytest.mark.unit`、`@pytest.mark.integration`、`@pytest.mark.api` 等区分用例类型,便于按需运行。 | ||
| 391 | -- **依赖**:单元测试通过 mock(如 `mock_es_client`、`sample_search_config`)不依赖真实 ES/DB;集成测试需在说明中注明依赖服务。 | ||
| 392 | -- **运行**:`python -m pytest tests/`;推荐最小回归:`python -m pytest tests/ci -q`;按模块聚焦可直接指定具体测试文件。 | ||
| 393 | -- **原则**:新增逻辑应有对应测试;修改协议或配置契约时更新相关测试与 fixture。 | 389 | +测试流水线的权威说明见 [`docs/测试Pipeline说明.md`](./测试Pipeline说明.md)。核心约定: |
| 390 | + | ||
| 391 | +- **位置**:`tests/` 下按文件平铺,`tests/ci/` 放 API 契约测试,`tests/manual/` 放需人工起服务的联调脚本(pytest 默认不 collect)。 | ||
| 392 | +- **Marker**:`pytest.ini` 里登记了子系统 marker(`search / query / intent / rerank / embedding / translation / indexer / suggestion / eval / contract`)与 `regression` marker;新测试必须贴对应 marker(`--strict-markers` 会强制)。 | ||
| 393 | +- **依赖**:测试一律通过注入 fake stub 隔离 ES / DeepL / LLM / Redis 等外部依赖。需要真实依赖的脚本放 `tests/manual/`。 | ||
| 394 | +- **运行**: | ||
| 395 | + - CI 门禁:`./scripts/run_ci_tests.sh`(契约 + search 回归锚点) | ||
| 396 | + - 发版前:`./scripts/run_regression_tests.sh`(全部 `regression` 锚点;可配 `SUBSYSTEM=<name>`) | ||
| 397 | + - 全量:`python -m pytest tests/ -q` | ||
| 398 | +- **原则**:新增逻辑应有对应测试;修改协议或配置契约时**同步**更新契约测试。不要在测试里保留"旧 assert 作为兼容"——请直接面向当前实现写断言,失败即意味着契约已变更,需要上层决策。 | ||
| 394 | 399 | ||
| 395 | ### 8.3 配置与环境 | 400 | ### 8.3 配置与环境 |
| 396 | 401 |
docs/测试Pipeline说明.md
| 1 | # 搜索引擎测试流水线指南 | 1 | # 搜索引擎测试流水线指南 |
| 2 | 2 | ||
| 3 | -## 概述 | 3 | +本文档是测试套件的**权威入口**,涵盖目录约定、运行方式、回归锚点矩阵、以及手动 |
| 4 | +联调脚本的分工。任何与这里不一致的历史文档(例如提到 `tests/unit/` 或 | ||
| 5 | +`scripts/start_test_environment.sh`)都是过期信息,以本文为准。 | ||
| 4 | 6 | ||
| 5 | -本文档介绍了搜索引擎项目的完整测试流水线,包括测试环境搭建、测试执行、结果分析等内容。测试流水线设计用于commit前的自动化质量保证。 | ||
| 6 | - | ||
| 7 | -## 🏗️ 测试架构 | ||
| 8 | - | ||
| 9 | -### 测试层次 | 7 | +## 1. 测试目录与分层 |
| 10 | 8 | ||
| 11 | ``` | 9 | ``` |
| 12 | -测试流水线 | ||
| 13 | -├── 代码质量检查 (Code Quality) | ||
| 14 | -│ ├── 代码格式化检查 (Black, isort) | ||
| 15 | -│ ├── 静态分析 (Flake8, MyPy, Pylint) | ||
| 16 | -│ └── 安全扫描 (Safety, Bandit) | ||
| 17 | -│ | ||
| 18 | -├── 单元测试 (Unit Tests) | ||
| 19 | -│ ├── RequestContext测试 | ||
| 20 | -│ ├── Searcher测试 | ||
| 21 | -│ ├── QueryParser测试 | ||
| 22 | -│ └── BooleanParser测试 | ||
| 23 | -│ | ||
| 24 | -├── 集成测试 (Integration Tests) | ||
| 25 | -│ ├── 端到端搜索流程测试 | ||
| 26 | -│ ├── 多组件协同测试 | ||
| 27 | -│ └── 错误处理测试 | ||
| 28 | -│ | ||
| 29 | -├── API测试 (API Tests) | ||
| 30 | -│ ├── REST API接口测试 | ||
| 31 | -│ ├── 参数验证测试 | ||
| 32 | -│ ├── 并发请求测试 | ||
| 33 | -│ └── 错误响应测试 | ||
| 34 | -│ | ||
| 35 | -└── 性能测试 (Performance Tests) | ||
| 36 | - ├── 响应时间测试 | ||
| 37 | - ├── 并发性能测试 | ||
| 38 | - └── 资源使用测试 | 10 | +tests/ |
| 11 | +├── conftest.py # 只做 sys.path 注入;不再维护全局 fixture | ||
| 12 | +├── ci/ # API/服务契约(FastAPI TestClient + 全 fake 依赖) | ||
| 13 | +│ └── test_service_api_contracts.py | ||
| 14 | +├── manual/ # 需真实服务才能跑的联调脚本,pytest 默认不 collect | ||
| 15 | +│ ├── test_build_docs_api.py | ||
| 16 | +│ ├── test_cnclip_service.py | ||
| 17 | +│ └── test_facet_api.py | ||
| 18 | +└── test_*.py # 子系统单测(全部自带 fake,无外部依赖) | ||
| 39 | ``` | 19 | ``` |
| 40 | 20 | ||
| 41 | -### 核心组件 | ||
| 42 | - | ||
| 43 | -1. **RequestContext**: 请求级别的上下文管理器,用于跟踪测试过程中的所有数据 | ||
| 44 | -2. **测试环境管理**: 自动化启动/停止测试依赖服务 | ||
| 45 | -3. **测试执行引擎**: 统一的测试运行和结果收集 | ||
| 46 | -4. **报告生成系统**: 多格式的测试报告生成 | ||
| 47 | - | ||
| 48 | -## 🚀 快速开始 | 21 | +关键约束(写在 `pytest.ini` 里,不要另起分支): |
| 49 | 22 | ||
| 50 | -### 本地测试环境 | 23 | +- `testpaths = tests`,`norecursedirs = tests/manual`; |
| 24 | +- `--strict-markers`:所有 marker 必须先在 `pytest.ini::markers` 登记; | ||
| 25 | +- 测试**不得**依赖真实 ES / DeepL / LLM 服务。需要外部依赖的脚本请放 `tests/manual/`。 | ||
| 51 | 26 | ||
| 52 | -1. **启动测试环境** | ||
| 53 | - ```bash | ||
| 54 | - # 启动所有必要的测试服务 | ||
| 55 | - ./scripts/start_test_environment.sh | ||
| 56 | - ``` | 27 | +## 2. 运行方式 |
| 57 | 28 | ||
| 58 | -2. **运行完整测试套件** | ||
| 59 | - ```bash | ||
| 60 | - # 运行所有测试 | ||
| 61 | - python scripts/run_tests.py | 29 | +| 场景 | 命令 | 覆盖范围 | |
| 30 | +|------|------|----------| | ||
| 31 | +| CI 门禁(每次提交) | `./scripts/run_ci_tests.sh` | `tests/ci` + `contract` marker + `search ∧ regression` | | ||
| 32 | +| 发版 / 大合并前 | `./scripts/run_regression_tests.sh` | 所有 `@pytest.mark.regression` | | ||
| 33 | +| 子系统子集 | `SUBSYSTEM=search ./scripts/run_regression_tests.sh` | 指定子系统的 regression 锚点 | | ||
| 34 | +| 全量(含非回归) | `python -m pytest tests/ -q` | 全部自动化用例 | | ||
| 35 | +| 手动联调 | `python tests/manual/<script>.py` | 需提前起对应服务 | | ||
| 62 | 36 | ||
| 63 | - # 或者使用pytest直接运行 | ||
| 64 | - pytest tests/ -v | ||
| 65 | - ``` | 37 | +## 3. Marker 体系与回归锚点矩阵 |
| 66 | 38 | ||
| 67 | -3. **停止测试环境** | ||
| 68 | - ```bash | ||
| 69 | - ./scripts/stop_test_environment.sh | ||
| 70 | - ``` | 39 | +marker 定义见 `pytest.ini`。每个测试文件通过模块级 `pytestmark` 贴标,同时 |
| 40 | +属于 `regression` 的用例构成“**回归锚点集合**”。 | ||
| 71 | 41 | ||
| 72 | -### CI/CD测试 | 42 | +| 子系统 marker | 关键文件(锚点) | 保护的行为 | |
| 43 | +|---------------|------------------|------------| | ||
| 44 | +| `contract` | `tests/ci/test_service_api_contracts.py` | Search / Indexer / Embedding / Reranker / Translation 的 HTTP 契约 | | ||
| 45 | +| `search` | `test_search_rerank_window.py`, `test_es_query_builder.py`, `test_es_query_builder_text_recall_languages.py` | Searcher 主路径、排序 / 召回、keywords 副 combined_fields、多语种 | | ||
| 46 | +| `query` | `test_query_parser_mixed_language.py`, `test_tokenization.py` | 中英混合解析、HanLP 分词、language detect | | ||
| 47 | +| `intent` | `test_style_intent.py`, `test_product_title_exclusion.py`, `test_sku_intent_selector.py` | 风格意图、商品标题排除、SKU 选型 | | ||
| 48 | +| `rerank` | `test_rerank_client.py`, `test_rerank_query_text.py`, `test_rerank_provider_topn.py`, `test_reranker_server_topn.py`, `test_reranker_dashscope_backend.py`, `test_reranker_qwen3_gguf_backend.py` | 粗排 / 精排 / topN / 后端切换 | | ||
| 49 | +| `embedding` | `test_embedding_pipeline.py`, `test_embedding_service_limits.py`, `test_embedding_service_priority.py`, `test_cache_keys.py` | 文本/图像向量客户端、inflight limiter、优先级队列、缓存 key | | ||
| 50 | +| `translation` | `test_translation_deepl_backend.py`, `test_translation_llm_backend.py`, `test_translation_local_backends.py`, `test_translator_failure_semantics.py` | DeepL / LLM / 本地回退、失败语义 | | ||
| 51 | +| `indexer` | `test_product_enrich_partial_mode.py`, `test_process_products_batching.py`, `test_llm_enrichment_batch_fill.py` | LLM Partial Mode、batch 拆分、空结果补位 | | ||
| 52 | +| `suggestion` | `test_suggestions.py` | 建议索引构建 | | ||
| 53 | +| `eval` | `test_eval_metrics.py`(regression) + `test_search_evaluation_datasets.py` / `test_eval_framework_clients.py`(非 regression) | NDCG / ERR 指标、数据集加载、评估客户端 | | ||
| 73 | 54 | ||
| 74 | -1. **GitHub Actions** | ||
| 75 | - - Push到主分支自动触发 | ||
| 76 | - - Pull Request自动运行 | ||
| 77 | - - 手动触发支持 | 55 | +> 任何新写的子系统单测,都应该在顶部加 `pytestmark = [pytest.mark.<子系统>, pytest.mark.regression]`。 |
| 56 | +> 不贴 `regression` 的测试默认**不会**被 `run_regression_tests.sh` 选中,请谨慎决定。 | ||
| 78 | 57 | ||
| 79 | -2. **测试报告** | ||
| 80 | - - 自动生成并上传 | ||
| 81 | - - PR评论显示测试摘要 | ||
| 82 | - - 详细报告下载 | 58 | +## 4. 当前覆盖缺口(跟踪中) |
| 83 | 59 | ||
| 84 | -## 📋 测试类型详解 | 60 | +以下场景目前没有被 `regression` 锚点覆盖,优先级从高到低: |
| 85 | 61 | ||
| 86 | -### 1. 单元测试 (Unit Tests) | 62 | +1. **`api/routes/search.py` 的请求参数映射**:`QueryParser.parse(...)` 透传是否完整(目前只有 `tests/ci` 间接覆盖)。 |
| 63 | +2. **`indexer/document_transformer.py` 的端到端转换**:从 MySQL 行到 ES doc 的 snapshot 对比。 | ||
| 64 | +3. **`config/loader.py` 加载多租户配置**:含继承 / override 的合并规则。 | ||
| 65 | +4. **`search/searcher.py::_build_function_score`**:function_score 装配。 | ||
| 66 | +5. **Facet 聚合 / disjunctive 过滤**。 | ||
| 67 | +6. **图像搜索主路径**(`search/image_searcher.py`)。 | ||
| 87 | 68 | ||
| 88 | -**位置**: `tests/unit/` | 69 | +补齐时记得同步贴 `regression` + 对应子系统 marker,并在本表删除条目。 |
| 89 | 70 | ||
| 90 | -**目的**: 测试单个函数、类、模块的功能 | 71 | +## 5. 手动联调:索引文档构建流水线 |
| 91 | 72 | ||
| 92 | -**覆盖范围**: | ||
| 93 | -- `test_context.py`: RequestContext功能测试 | ||
| 94 | -- `test_searcher.py`: Searcher核心功能测试 | ||
| 95 | -- `test_query_parser.py`: QueryParser处理逻辑测试 | ||
| 96 | - | ||
| 97 | -**运行方式**: | ||
| 98 | -```bash | ||
| 99 | -# 运行所有单元测试 | ||
| 100 | -pytest tests/unit/ -v | ||
| 101 | - | ||
| 102 | -# 运行特定测试 | ||
| 103 | -pytest tests/unit/test_context.py -v | ||
| 104 | - | ||
| 105 | -# 生成覆盖率报告 | ||
| 106 | -pytest tests/unit/ --cov=. --cov-report=html | ||
| 107 | -``` | ||
| 108 | - | ||
| 109 | -### 2. 集成测试 (Integration Tests) | ||
| 110 | - | ||
| 111 | -**位置**: `tests/integration/` | ||
| 112 | - | ||
| 113 | -**目的**: 测试多个组件协同工作的功能 | ||
| 114 | - | ||
| 115 | -**覆盖范围**: | ||
| 116 | -- `test_search_integration.py`: 完整搜索流程集成 | ||
| 117 | -- 数据库、ES、搜索器集成测试 | ||
| 118 | -- 错误传播和处理测试 | ||
| 119 | - | ||
| 120 | -**运行方式**: | ||
| 121 | -```bash | ||
| 122 | -# 运行集成测试(需要启动测试环境) | ||
| 123 | -pytest tests/integration/ -v -m "not slow" | ||
| 124 | - | ||
| 125 | -# 运行包含慢速测试的集成测试 | ||
| 126 | -pytest tests/integration/ -v | ||
| 127 | -``` | ||
| 128 | - | ||
| 129 | -### 3. API测试 (API Tests) | ||
| 130 | - | ||
| 131 | -**位置**: `tests/integration/test_api_integration.py` | ||
| 132 | - | ||
| 133 | -**目的**: 测试HTTP API接口的功能和性能 | ||
| 134 | - | ||
| 135 | -**覆盖范围**: | ||
| 136 | -- 基本搜索API | ||
| 137 | -- 参数验证 | ||
| 138 | -- 错误处理 | ||
| 139 | -- 并发请求 | ||
| 140 | -- Unicode支持 | ||
| 141 | - | ||
| 142 | -**运行方式**: | ||
| 143 | -```bash | ||
| 144 | -# 运行API测试 | ||
| 145 | -pytest tests/integration/test_api_integration.py -v | ||
| 146 | -``` | ||
| 147 | - | ||
| 148 | -### 5. 索引 & 文档构建流水线验证(手动) | ||
| 149 | - | ||
| 150 | -除了自动化测试外,推荐在联调/问题排查时手动跑一遍“**从 MySQL 到 ES doc**”的索引流水线,确保字段与 mapping、查询逻辑一致。 | ||
| 151 | - | ||
| 152 | -#### 5.1 启动 Indexer 服务 | 73 | +除自动化测试外,联调/问题排查时建议走一遍“**MySQL → ES doc**”链路,确保字段与 mapping |
| 74 | +与查询逻辑对齐。 | ||
| 153 | 75 | ||
| 154 | ```bash | 76 | ```bash |
| 155 | cd /home/tw/saas-search | 77 | cd /home/tw/saas-search |
| 156 | ./scripts/stop.sh # 停掉已有进程(可选) | 78 | ./scripts/stop.sh # 停掉已有进程(可选) |
| 157 | -./scripts/start_indexer.sh # 启动专用 indexer 服务,默认端口 6004 | ||
| 158 | -``` | ||
| 159 | - | ||
| 160 | -#### 5.2 基于数据库构建 ES doc(只看、不写 ES) | 79 | +./scripts/start_indexer.sh # 启动 indexer 服务,默认端口 6004 |
| 161 | 80 | ||
| 162 | -> 场景:已经知道某个 `tenant_id` 和 `spu_id`,想看它在“最新逻辑下”的 ES 文档长什么样。 | ||
| 163 | - | ||
| 164 | -```bash | ||
| 165 | curl -X POST "http://127.0.0.1:6004/indexer/build-docs-from-db" \ | 81 | curl -X POST "http://127.0.0.1:6004/indexer/build-docs-from-db" \ |
| 166 | -H "Content-Type: application/json" \ | 82 | -H "Content-Type: application/json" \ |
| 167 | - -d '{ | ||
| 168 | - "tenant_id": "170", | ||
| 169 | - "spu_ids": ["223167"] | ||
| 170 | - }' | ||
| 171 | -``` | ||
| 172 | - | ||
| 173 | -返回中: | ||
| 174 | - | ||
| 175 | -- `docs[0]` 为当前代码构造出来的完整 ES doc(与 `mappings/search_products.json` 对齐); | ||
| 176 | -- 可以直接比对: | ||
| 177 | - - 索引字段说明:`docs/索引字段说明v2.md` | ||
| 178 | - - 实际 ES 文档:`docs/常用查询 - ES.md` 中的查询示例(按 `spu_id` 过滤)。 | ||
| 179 | - | ||
| 180 | -#### 5.3 与 ES 实际数据对比 | ||
| 181 | - | ||
| 182 | -```bash | ||
| 183 | -curl -u 'essa:***' \ | ||
| 184 | - -X GET 'http://localhost:9200/search_products_tenant_170/_search?pretty' \ | ||
| 185 | - -H 'Content-Type: application/json' \ | ||
| 186 | - -d '{ | ||
| 187 | - "size": 5, | ||
| 188 | - "_source": ["title", "tags"], | ||
| 189 | - "query": { | ||
| 190 | - "bool": { | ||
| 191 | - "filter": [ | ||
| 192 | - { "term": { "spu_id": "223167" } } | ||
| 193 | - ] | ||
| 194 | - } | ||
| 195 | - } | ||
| 196 | - }' | 83 | + -d '{ "tenant_id": "170", "spu_ids": ["223167"] }' |
| 197 | ``` | 84 | ``` |
| 198 | 85 | ||
| 199 | -对比如下内容是否一致: | ||
| 200 | - | ||
| 201 | -- 多语言字段:`title/brief/description/vendor/category_name_text/category_path`; | ||
| 202 | -- 结构字段:`tags/specifications/skus/min_price/max_price/compare_at_price/total_inventory` 等; | ||
| 203 | -- 算法字段:`title_embedding` 是否存在(值不必逐项比对)。 | ||
| 204 | - | ||
| 205 | -如果两边不一致,可以结合: | ||
| 206 | - | ||
| 207 | -- `indexer/document_transformer.py`(文档构造逻辑); | ||
| 208 | -- `indexer/incremental_service.py`(增量索引/查库逻辑); | ||
| 209 | -- `logs/indexer.log`(索引日志) | ||
| 210 | - | ||
| 211 | -逐步缩小问题范围。 | ||
| 212 | - | ||
| 213 | -### 4. 性能测试 (Performance Tests) | ||
| 214 | - | ||
| 215 | -**目的**: 验证系统性能指标 | ||
| 216 | - | ||
| 217 | -**测试内容**: | ||
| 218 | -- 搜索响应时间 | ||
| 219 | -- API并发处理能力 | ||
| 220 | -- 资源使用情况 | ||
| 221 | - | ||
| 222 | -**运行方式**: | ||
| 223 | -```bash | ||
| 224 | -# 运行性能测试 | ||
| 225 | -python scripts/run_performance_tests.py | ||
| 226 | -``` | ||
| 227 | - | ||
| 228 | -## 🛠️ 环境配置 | ||
| 229 | - | ||
| 230 | -### 测试环境要求 | ||
| 231 | - | ||
| 232 | -1. **Python环境** | ||
| 233 | - ```bash | ||
| 234 | - # 创建测试环境 | ||
| 235 | - conda create -n searchengine-test python=3.9 | ||
| 236 | - conda activate searchengine-test | ||
| 237 | - | ||
| 238 | - # 安装依赖 | ||
| 239 | - pip install -r requirements.txt | ||
| 240 | - pip install pytest pytest-cov pytest-json-report | ||
| 241 | - ``` | ||
| 242 | - | ||
| 243 | -2. **Elasticsearch** | ||
| 244 | - ```bash | ||
| 245 | - # 使用Docker启动ES | ||
| 246 | - docker run -d \ | ||
| 247 | - --name elasticsearch \ | ||
| 248 | - -p 9200:9200 \ | ||
| 249 | - -e "discovery.type=single-node" \ | ||
| 250 | - -e "xpack.security.enabled=false" \ | ||
| 251 | - elasticsearch:8.8.0 | ||
| 252 | - ``` | ||
| 253 | - | ||
| 254 | -3. **环境变量** | ||
| 255 | - ```bash | ||
| 256 | - export ES_HOST="http://localhost:9200" | ||
| 257 | - export ES_USERNAME="elastic" | ||
| 258 | - export ES_PASSWORD="changeme" | ||
| 259 | - export API_HOST="127.0.0.1" | ||
| 260 | - export API_PORT="6003" | ||
| 261 | - export TENANT_ID="test_tenant" | ||
| 262 | - export TESTING_MODE="true" | ||
| 263 | - ``` | ||
| 264 | - | ||
| 265 | -### 服务依赖 | ||
| 266 | - | ||
| 267 | -测试环境需要以下服务: | ||
| 268 | - | ||
| 269 | -1. **Elasticsearch** (端口9200) | ||
| 270 | - - 存储和搜索测试数据 | ||
| 271 | - - 支持中文和英文索引 | ||
| 272 | - | ||
| 273 | -2. **API服务** (端口6003) | ||
| 274 | - - FastAPI测试服务 | ||
| 275 | - - 提供搜索接口 | ||
| 276 | - | ||
| 277 | -3. **测试数据库** | ||
| 278 | - - 预配置的测试索引 | ||
| 279 | - - 包含测试数据 | ||
| 280 | - | ||
| 281 | -## 📊 测试报告 | ||
| 282 | - | ||
| 283 | -### 报告类型 | ||
| 284 | - | ||
| 285 | -1. **实时控制台输出** | ||
| 286 | - - 测试进度显示 | ||
| 287 | - - 失败详情 | ||
| 288 | - - 性能摘要 | ||
| 289 | - | ||
| 290 | -2. **JSON格式报告** | ||
| 291 | - ```json | ||
| 292 | - { | ||
| 293 | - "timestamp": "2024-01-01T10:00:00", | ||
| 294 | - "summary": { | ||
| 295 | - "total_tests": 150, | ||
| 296 | - "passed": 148, | ||
| 297 | - "failed": 2, | ||
| 298 | - "success_rate": 98.7 | ||
| 299 | - }, | ||
| 300 | - "suites": { ... } | ||
| 301 | - } | ||
| 302 | - ``` | ||
| 303 | - | ||
| 304 | -3. **文本格式报告** | ||
| 305 | - - 人类友好的格式 | ||
| 306 | - - 包含测试摘要和详情 | ||
| 307 | - - 适合PR评论 | ||
| 308 | - | ||
| 309 | -4. **HTML覆盖率报告** | ||
| 310 | - - 代码覆盖率可视化 | ||
| 311 | - - 分支和行覆盖率 | ||
| 312 | - - 缺失测试高亮 | ||
| 313 | - | ||
| 314 | -### 报告位置 | ||
| 315 | - | ||
| 316 | -``` | ||
| 317 | -test_logs/ | ||
| 318 | -├── unit_test_results.json # 单元测试结果 | ||
| 319 | -├── integration_test_results.json # 集成测试结果 | ||
| 320 | -├── api_test_results.json # API测试结果 | ||
| 321 | -├── test_report_20240101_100000.txt # 文本格式摘要 | ||
| 322 | -├── test_report_20240101_100000.json # JSON格式详情 | ||
| 323 | -└── htmlcov/ # HTML覆盖率报告 | ||
| 324 | -``` | ||
| 325 | - | ||
| 326 | -## 🔄 CI/CD集成 | ||
| 327 | - | ||
| 328 | -### GitHub Actions工作流 | ||
| 329 | - | ||
| 330 | -**触发条件**: | ||
| 331 | -- Push到主分支 | ||
| 332 | -- Pull Request创建/更新 | ||
| 333 | -- 手动触发 | ||
| 334 | - | ||
| 335 | -**工作流阶段**: | ||
| 336 | - | ||
| 337 | -1. **代码质量检查** | ||
| 338 | - - 代码格式验证 | ||
| 339 | - - 静态代码分析 | ||
| 340 | - - 安全漏洞扫描 | ||
| 341 | - | ||
| 342 | -2. **单元测试** | ||
| 343 | - - 多Python版本矩阵测试 | ||
| 344 | - - 代码覆盖率收集 | ||
| 345 | - - 自动上传到Codecov | ||
| 346 | - | ||
| 347 | -3. **集成测试** | ||
| 348 | - - 服务依赖启动 | ||
| 349 | - - 端到端功能测试 | ||
| 350 | - - 错误处理验证 | ||
| 351 | - | ||
| 352 | -4. **API测试** | ||
| 353 | - - 接口功能验证 | ||
| 354 | - - 参数校验测试 | ||
| 355 | - - 并发请求测试 | ||
| 356 | - | ||
| 357 | -5. **性能测试** | ||
| 358 | - - 响应时间检查 | ||
| 359 | - - 资源使用监控 | ||
| 360 | - - 性能回归检测 | ||
| 361 | - | ||
| 362 | -6. **测试报告生成** | ||
| 363 | - - 结果汇总 | ||
| 364 | - - 报告上传 | ||
| 365 | - - PR评论更新 | ||
| 366 | - | ||
| 367 | -### 工作流配置 | ||
| 368 | - | ||
| 369 | -**文件**: `.github/workflows/test.yml` | ||
| 370 | - | ||
| 371 | -**关键特性**: | ||
| 372 | -- 并行执行提高效率 | ||
| 373 | -- 服务容器化隔离 | ||
| 374 | -- 自动清理资源 | ||
| 375 | -- 智能缓存依赖 | ||
| 376 | - | ||
| 377 | -## 🧪 测试最佳实践 | ||
| 378 | - | ||
| 379 | -### 1. 测试编写原则 | ||
| 380 | - | ||
| 381 | -- **独立性**: 每个测试应该独立运行 | ||
| 382 | -- **可重复性**: 测试结果应该一致 | ||
| 383 | -- **快速执行**: 单元测试应该快速完成 | ||
| 384 | -- **清晰命名**: 测试名称应该描述测试内容 | ||
| 385 | - | ||
| 386 | -### 2. 测试数据管理 | ||
| 387 | - | ||
| 388 | -```python | ||
| 389 | -# 使用fixture提供测试数据 | ||
| 390 | -@pytest.fixture | ||
| 391 | -def sample_tenant_config(): | ||
| 392 | - return TenantConfig( | ||
| 393 | - tenant_id="test_tenant", | ||
| 394 | - es_index_name="test_products" | ||
| 395 | - ) | ||
| 396 | - | ||
| 397 | -# 使用mock避免外部依赖 | ||
| 398 | -@patch('search.searcher.ESClient') | ||
| 399 | -def test_search_with_mock_es(mock_es_client, test_searcher): | ||
| 400 | - mock_es_client.search.return_value = mock_response | ||
| 401 | - result = test_searcher.search("test query") | ||
| 402 | - assert result is not None | ||
| 403 | -``` | ||
| 404 | - | ||
| 405 | -### 3. RequestContext集成 | ||
| 406 | - | ||
| 407 | -```python | ||
| 408 | -def test_with_context(test_searcher): | ||
| 409 | - context = create_request_context("test-req", "test-user") | ||
| 410 | - | ||
| 411 | - result = test_searcher.search("test query", context=context) | ||
| 412 | - | ||
| 413 | - # 验证context被正确更新 | ||
| 414 | - assert context.query_analysis.original_query == "test query" | ||
| 415 | - assert context.get_stage_duration("elasticsearch_search") > 0 | ||
| 416 | -``` | ||
| 417 | - | ||
| 418 | -### 4. 性能测试指南 | ||
| 419 | - | ||
| 420 | -```python | ||
| 421 | -def test_search_performance(client): | ||
| 422 | - start_time = time.time() | ||
| 423 | - response = client.get("/search", params={"q": "test query"}) | ||
| 424 | - response_time = (time.time() - start_time) * 1000 | ||
| 425 | - | ||
| 426 | - assert response.status_code == 200 | ||
| 427 | - assert response_time < 2000 # 2秒内响应 | ||
| 428 | -``` | ||
| 429 | - | ||
| 430 | -## 🚨 故障排除 | ||
| 431 | - | ||
| 432 | -### 常见问题 | ||
| 433 | - | ||
| 434 | -1. **Elasticsearch连接失败** | ||
| 435 | - ```bash | ||
| 436 | - # 检查ES状态 | ||
| 437 | - curl http://localhost:9200/_cluster/health | ||
| 438 | - | ||
| 439 | - # 重启ES服务 | ||
| 440 | - docker restart elasticsearch | ||
| 441 | - ``` | ||
| 442 | - | ||
| 443 | -2. **测试端口冲突** | ||
| 444 | - ```bash | ||
| 445 | - # 检查端口占用 | ||
| 446 | - lsof -i :6003 | ||
| 447 | - | ||
| 448 | - # 修改API端口 | ||
| 449 | - export API_PORT="6004" | ||
| 450 | - ``` | ||
| 451 | - | ||
| 452 | -3. **依赖包缺失** | ||
| 453 | - ```bash | ||
| 454 | - # 重新安装依赖 | ||
| 455 | - pip install -r requirements.txt | ||
| 456 | - pip install pytest pytest-cov pytest-json-report | ||
| 457 | - ``` | ||
| 458 | - | ||
| 459 | -4. **测试数据问题** | ||
| 460 | - ```bash | ||
| 461 | - # 重新创建测试索引 | ||
| 462 | - curl -X DELETE http://localhost:9200/test_products | ||
| 463 | - ./scripts/start_test_environment.sh | ||
| 464 | - ``` | ||
| 465 | - | ||
| 466 | -### 调试技巧 | ||
| 467 | - | ||
| 468 | -1. **详细日志输出** | ||
| 469 | - ```bash | ||
| 470 | - pytest tests/unit/test_context.py -v -s --tb=long | ||
| 471 | - ``` | ||
| 472 | - | ||
| 473 | -2. **运行单个测试** | ||
| 474 | - ```bash | ||
| 475 | - pytest tests/unit/test_context.py::TestRequestContext::test_create_context -v | ||
| 476 | - ``` | ||
| 477 | - | ||
| 478 | -3. **调试模式** | ||
| 479 | - ```python | ||
| 480 | - import pdb; pdb.set_trace() | ||
| 481 | - ``` | ||
| 482 | - | ||
| 483 | -4. **性能分析** | ||
| 484 | - ```bash | ||
| 485 | - pytest --profile tests/ | ||
| 486 | - ``` | ||
| 487 | - | ||
| 488 | -## 📈 持续改进 | ||
| 489 | - | ||
| 490 | -### 测试覆盖率目标 | ||
| 491 | - | ||
| 492 | -- **单元测试**: > 90% | ||
| 493 | -- **集成测试**: > 80% | ||
| 494 | -- **API测试**: > 95% | ||
| 495 | - | ||
| 496 | -### 性能基准 | ||
| 497 | - | ||
| 498 | -- **搜索响应时间**: < 2秒 | ||
| 499 | -- **API并发处理**: 100 QPS | ||
| 500 | -- **系统资源使用**: < 80% CPU, < 4GB RAM | 86 | +返回中 `docs[0]` 即当前代码构造的 ES doc(与 `mappings/search_products.json` 对齐)。 |
| 87 | +与真实 ES 数据对比的查询参考 `docs/常用查询 - ES.md`;若字段不一致,按以下路径定位: | ||
| 501 | 88 | ||
| 502 | -### 质量门禁 | 89 | +- `indexer/document_transformer.py` — 文档构造逻辑 |
| 90 | +- `indexer/incremental_service.py` — 增量查库逻辑 | ||
| 91 | +- `logs/indexer.log` — 索引日志 | ||
| 503 | 92 | ||
| 504 | -- **所有测试必须通过** | ||
| 505 | -- **代码覆盖率不能下降** | ||
| 506 | -- **性能不能显著退化** | ||
| 507 | -- **不能有安全漏洞** | 93 | +## 6. 编写测试的约束(与 `开发原则` 对齐) |
| 508 | 94 | ||
| 95 | +- **fail fast**:测试输入不合法时应直接抛错,不用 `if ... return`;不要用 `try/except` 吃掉异常再 `assert not exception`。 | ||
| 96 | +- **不做兼容双轨**:用例对准当前实现,不为历史行为保留“旧 assert”。若确有外部兼容性(例如 API 上标注 Deprecated 的字段),在 `tests/ci` 里单独写**契约**用例并注明 Deprecated。 | ||
| 97 | +- **外部依赖全 fake**:凡是依赖 HTTP / Redis / ES / LLM 的测试必须注入 fake stub,否则归入 `tests/manual/`。 | ||
| 98 | +- **一处真相**:共享 fixture 如果超过 2 个文件使用,放 `tests/conftest.py`;只给 1 个文件用就放在该文件内。避免再次出现全库无人引用的 dead fixture。 |
| @@ -0,0 +1,30 @@ | @@ -0,0 +1,30 @@ | ||
| 1 | +[pytest] | ||
| 2 | +# 权威的 pytest 配置源。新增共享配置请放这里,不要再散落到各测试文件头部。 | ||
| 3 | +# | ||
| 4 | +# testpaths 明确只扫 tests/(含 tests/ci/),刻意排除 tests/manual/。 | ||
| 5 | +testpaths = tests | ||
| 6 | +# tests/manual/ 里的脚本依赖外部服务,不参与自动回归。 | ||
| 7 | +norecursedirs = tests/manual | ||
| 8 | + | ||
| 9 | +addopts = -ra --strict-markers | ||
| 10 | + | ||
| 11 | +# 全局静默第三方的 DeprecationWarning,避免遮掩真正需要关注的业务警告。 | ||
| 12 | +filterwarnings = | ||
| 13 | + ignore::DeprecationWarning | ||
| 14 | + ignore::PendingDeprecationWarning | ||
| 15 | + | ||
| 16 | +# 子系统 / 回归分层标记。新增 marker 前先在这里登记,未登记的 marker 会因 | ||
| 17 | +# --strict-markers 直接报错。 | ||
| 18 | +markers = | ||
| 19 | + regression: 提交/发布前必跑的回归锚点集合 | ||
| 20 | + contract: API / 服务契约(tests/ci 默认全部归入) | ||
| 21 | + search: Searcher / 排序 / 召回管线 | ||
| 22 | + query: QueryParser / 翻译 / 分词 | ||
| 23 | + intent: 样式与 SKU 意图识别 | ||
| 24 | + rerank: 粗排 / 精排 / 融合 | ||
| 25 | + embedding: 文本/图像向量服务与客户端 | ||
| 26 | + translation: 翻译服务与缓存 | ||
| 27 | + indexer: 索引构建 / LLM enrich | ||
| 28 | + suggestion: 搜索建议索引 | ||
| 29 | + eval: 评估框架 | ||
| 30 | + manual: 需人工起服务,CI 不跑 |
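The markers registered in the `pytest.ini` above are attached per test module. A minimal sketch of the module-level tagging convention the commit batch-applied (the `search` marker choice here is illustrative):

```python
import pytest

# Module-level markers as batch-inserted across tests/test_*.py:
# one subsystem marker plus the shared `regression` gate marker.
pytestmark = [pytest.mark.search, pytest.mark.regression]


def test_anchor_example():
    # Stand-in for a real anchor assertion; a test in this module is selected
    # by expressions such as `pytest -m "search and regression"`.
    assert True
```

With `--strict-markers` in effect, a typo in either marker name fails collection immediately instead of silently dropping the file from the regression set.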
scripts/run_ci_tests.sh
| 1 | #!/bin/bash | 1 | #!/bin/bash |
| 2 | +# CI 门禁脚本:每次提交必跑的最小集合。 | ||
| 3 | +# | ||
| 4 | +# 覆盖范围: | ||
| 5 | +# 1. tests/ci 下的服务契约测试(HTTP/JSON schema / 路由 / 鉴权) | ||
| 6 | +# 2. tests/ 下带 `contract` marker 的所有用例(冗余保障,防止 marker 与目录漂移) | ||
| 7 | +# 3. 搜索主路径 + ES 查询构建器的回归锚点(search 子系统) | ||
| 8 | +# | ||
| 9 | +# 超出这个范围的完整回归集请用 scripts/run_regression_tests.sh。 | ||
| 2 | 10 | ||
| 3 | set -euo pipefail | 11 | set -euo pipefail |
| 4 | 12 | ||
| 5 | cd "$(dirname "$0")/.." | 13 | cd "$(dirname "$0")/.." |
| 6 | source ./activate.sh | 14 | source ./activate.sh |
| 7 | 15 | ||
| 8 | -echo "Running CI contract tests..." | ||
| 9 | -python -m pytest tests/ci -q | 16 | +echo "==> [CI-1/2] API contract tests (tests/ci + contract marker)..." |
| 17 | +python -m pytest tests/ci tests/ -q -m contract | ||
| 18 | + | ||
| 19 | +echo "==> [CI-2/2] Search core regression (search marker)..." | ||
| 20 | +python -m pytest tests/ -q -m "search and regression" |
| @@ -0,0 +1,26 @@ | @@ -0,0 +1,26 @@ | ||
| 1 | +#!/bin/bash | ||
| 2 | +# 回归锚点脚本:发版 / 大合并前必跑的回归集合。 | ||
| 3 | +# | ||
| 4 | +# 选中策略:所有 @pytest.mark.regression 用例,即 docs/测试Pipeline说明.md | ||
| 5 | +# “回归钩子矩阵” 中列出的各子系统锚点。 | ||
| 6 | +# | ||
| 7 | +# 可选参数: | ||
| 8 | +# SUBSYSTEM=search ./scripts/run_regression_tests.sh # 只跑某个子系统的回归子集 | ||
| 9 | +# | ||
| 10 | +# 约束:本脚本不启外部依赖(ES / DeepL / LLM 全 fake)。如需真实依赖,请用 | ||
| 11 | +# tests/manual 下的脚本。 | ||
| 12 | + | ||
| 13 | +set -euo pipefail | ||
| 14 | + | ||
| 15 | +cd "$(dirname "$0")/.." | ||
| 16 | +source ./activate.sh | ||
| 17 | + | ||
| 18 | +SUBSYSTEM="${SUBSYSTEM:-}" | ||
| 19 | + | ||
| 20 | +if [[ -n "${SUBSYSTEM}" ]]; then | ||
| 21 | + echo "==> Running regression subset: subsystem=${SUBSYSTEM}" | ||
| 22 | + python -m pytest tests/ -q -m "${SUBSYSTEM} and regression" | ||
| 23 | +else | ||
| 24 | + echo "==> Running full regression anchor suite..." | ||
| 25 | + python -m pytest tests/ -q -m regression | ||
| 26 | +fi |
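The subsystem selection in the script above reduces to building a pytest `-m` expression; that small piece of logic can be exercised on its own (the variable names and the `rerank` value are illustrative):

```shell
#!/bin/sh
# Build the pytest -m expression the same way run_regression_tests.sh does:
# subsystem-scoped when SUBSYSTEM is set, the full regression set otherwise.
SUBSYSTEM="rerank"
if [ -n "${SUBSYSTEM}" ]; then
  MARKER_EXPR="${SUBSYSTEM} and regression"
else
  MARKER_EXPR="regression"
fi
echo "${MARKER_EXPR}"
```

An unset `SUBSYSTEM` keeps the bare `regression` expression, which is why the full anchor suite is the default behavior.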
search/searcher.py
| @@ -370,6 +370,11 @@ class Searcher: | @@ -370,6 +370,11 @@ class Searcher: | ||
| 370 | # (on the same dimension as optionN). | 370 | # (on the same dimension as optionN). |
| 371 | includes.add("enriched_taxonomy_attributes") | 371 | includes.add("enriched_taxonomy_attributes") |
| 372 | 372 | ||
| 373 | + # Needed when inner_hits url string differs from sku.image_src but ES exposes | ||
| 374 | + # _nested.offset — we re-resolve the winning url from image_embedding[offset]. | ||
| 375 | + if self._has_image_signal(parsed_query): | ||
| 376 | + includes.add("image_embedding") | ||
| 377 | + | ||
| 373 | return {"includes": sorted(includes)} | 378 | return {"includes": sorted(includes)} |
| 374 | 379 | ||
| 375 | def _fetch_hits_by_ids( | 380 | def _fetch_hits_by_ids( |
search/sku_intent_selector.py
| @@ -40,7 +40,8 @@ from __future__ import annotations | @@ -40,7 +40,8 @@ from __future__ import annotations | ||
| 40 | 40 | ||
| 41 | from dataclasses import dataclass, field | 41 | from dataclasses import dataclass, field |
| 42 | from typing import Any, Callable, Dict, List, Optional, Tuple | 42 | from typing import Any, Callable, Dict, List, Optional, Tuple |
| 43 | -from urllib.parse import urlsplit | 43 | +import posixpath |
| 44 | +from urllib.parse import unquote, urlsplit | ||
| 44 | 45 | ||
| 45 | from query.style_intent import ( | 46 | from query.style_intent import ( |
| 46 | DetectedStyleIntent, | 47 | DetectedStyleIntent, |
| @@ -439,6 +440,7 @@ class StyleSkuSelector: | @@ -439,6 +440,7 @@ class StyleSkuSelector: | ||
| 439 | # ------------------------------------------------------------------ | 440 | # ------------------------------------------------------------------ |
| 440 | @staticmethod | 441 | @staticmethod |
| 441 | def _normalize_url(url: Any) -> str: | 442 | def _normalize_url(url: Any) -> str: |
| 443 | + """host + path, no query/fragment; casefolded — primary equality key.""" | ||
| 442 | raw = str(url or "").strip() | 444 | raw = str(url or "").strip() |
| 443 | if not raw: | 445 | if not raw: |
| 444 | return "" | 446 | return "" |
| @@ -448,20 +450,93 @@ class StyleSkuSelector: | @@ -448,20 +450,93 @@ class StyleSkuSelector: | ||
| 448 | try: | 450 | try: |
| 449 | parts = urlsplit(raw) | 451 | parts = urlsplit(raw) |
| 450 | except ValueError: | 452 | except ValueError: |
| 451 | - return raw.casefold() | 453 | + return str(url).strip().casefold() |
| 452 | host = (parts.netloc or "").casefold() | 454 | host = (parts.netloc or "").casefold() |
| 453 | - path = parts.path or "" | 455 | + path = unquote(parts.path or "") |
| 454 | return f"{host}{path}".casefold() | 456 | return f"{host}{path}".casefold() |
| 455 | 457 | ||
| 458 | + @staticmethod | ||
| 459 | + def _normalize_path_only(url: Any) -> str: | ||
| 460 | + """Path-only key for cross-CDN / host-alias cases.""" | ||
| 461 | + raw = str(url or "").strip() | ||
| 462 | + if not raw: | ||
| 463 | + return "" | ||
| 464 | + if raw.startswith("//"): | ||
| 465 | + raw = "https:" + raw | ||
| 466 | + try: | ||
| 467 | + parts = urlsplit(raw) | ||
| 468 | + path = unquote(parts.path or "") | ||
| 469 | + except ValueError: | ||
| 470 | + return "" | ||
| 471 | + return path.casefold().rstrip("/") | ||
| 472 | + | ||
| 473 | + @classmethod | ||
| 474 | + def _url_filename(cls, url: Any) -> str: | ||
| 475 | + p = cls._normalize_path_only(url) | ||
| 476 | + if not p: | ||
| 477 | + return "" | ||
| 478 | + return posixpath.basename(p).casefold() | ||
| 479 | + | ||
| 480 | + @classmethod | ||
| 481 | + def _urls_equivalent(cls, a: Any, b: Any) -> bool: | ||
| 482 | + if not a or not b: | ||
| 483 | + return False | ||
| 484 | + na, nb = cls._normalize_url(a), cls._normalize_url(b) | ||
| 485 | + if na and nb and na == nb: | ||
| 486 | + return True | ||
| 487 | + pa, pb = cls._normalize_path_only(a), cls._normalize_path_only(b) | ||
| 488 | + if pa and pb and pa == pb: | ||
| 489 | + return True | ||
| 490 | + fa, fb = cls._url_filename(a), cls._url_filename(b) | ||
| 491 | + if fa and fb and fa == fb and len(fa) > 4: | ||
| 492 | + return True | ||
| 493 | + return False | ||
| 494 | + | ||
| 495 | + @staticmethod | ||
| 496 | + def _inner_hit_url_candidates(entry: Dict[str, Any], source: Dict[str, Any]) -> List[str]: | ||
| 497 | + """URLs to try for this inner_hit: _source.url plus image_embedding[offset].url.""" | ||
| 498 | + out: List[str] = [] | ||
| 499 | + src = entry.get("_source") or {} | ||
| 500 | + u = src.get("url") | ||
| 501 | + if u: | ||
| 502 | + out.append(str(u).strip()) | ||
| 503 | + nested = entry.get("_nested") | ||
| 504 | + if not isinstance(nested, dict): | ||
| 505 | + return out | ||
| 506 | + off = nested.get("offset") | ||
| 507 | + if not isinstance(off, int): | ||
| 508 | + return out | ||
| 509 | + embs = source.get("image_embedding") | ||
| 510 | + if not isinstance(embs, list) or not (0 <= off < len(embs)): | ||
| 511 | + return out | ||
| 512 | + emb = embs[off] | ||
| 513 | + if isinstance(emb, dict) and emb.get("url"): | ||
| 514 | + u2 = str(emb.get("url")).strip() | ||
| 515 | + if u2 and u2 not in out: | ||
| 516 | + out.append(u2) | ||
| 517 | + return out | ||
| 518 | + | ||
| 456 | def _pick_sku_by_image( | 519 | def _pick_sku_by_image( |
| 457 | self, | 520 | self, |
| 458 | hit: Dict[str, Any], | 521 | hit: Dict[str, Any], |
| 459 | source: Dict[str, Any], | 522 | source: Dict[str, Any], |
| 460 | ) -> Optional[ImagePick]: | 523 | ) -> Optional[ImagePick]: |
| 524 | + """Map ES nested image KNN inner_hits to a SKU via image URL alignment. | ||
| 525 | + | ||
| 526 | + ``image_pick`` is empty when: | ||
| 527 | + - ES did not return ``inner_hits`` for this hit (e.g. doc outside | ||
| 528 | + ``rescore.window_size`` so no exact-image rescore inner_hits; or the | ||
| 529 | + nested image clause did not match this document). | ||
| 530 | + - The winning nested ``url`` cannot be aligned to any ``skus[].image_src`` | ||
| 531 | + even after path/filename normalization (rare CDN / encoding edge cases). | ||
| 532 | + | ||
| 533 | + We try ``_source.url``, ``_nested.offset`` + ``image_embedding[offset].url``, | ||
| 534 | + and loose path/filename matching to reduce false negatives. | ||
| 535 | + """ | ||
| 461 | inner_hits = hit.get("inner_hits") | 536 | inner_hits = hit.get("inner_hits") |
| 462 | if not isinstance(inner_hits, dict): | 537 | if not isinstance(inner_hits, dict): |
| 463 | return None | 538 | return None |
| 464 | - top_url: Optional[str] = None | 539 | + best_entry: Optional[Dict[str, Any]] = None |
| 465 | top_score: Optional[float] = None | 540 | top_score: Optional[float] = None |
| 466 | for key in _IMAGE_INNER_HITS_KEYS: | 541 | for key in _IMAGE_INNER_HITS_KEYS: |
| 467 | payload = inner_hits.get(key) | 542 | payload = inner_hits.get(key) |
| @@ -474,33 +549,36 @@ class StyleSkuSelector: | @@ -474,33 +549,36 @@ class StyleSkuSelector: | ||
| 474 | for entry in inner_list: | 549 | for entry in inner_list: |
| 475 | if not isinstance(entry, dict): | 550 | if not isinstance(entry, dict): |
| 476 | continue | 551 | continue |
| 477 | - url = (entry.get("_source") or {}).get("url") | ||
| 478 | - if not url: | 552 | + if not self._inner_hit_url_candidates(entry, source): |
| 479 | continue | 553 | continue |
| 480 | try: | 554 | try: |
| 481 | score = float(entry.get("_score") or 0.0) | 555 | score = float(entry.get("_score") or 0.0) |
| 482 | except (TypeError, ValueError): | 556 | except (TypeError, ValueError): |
| 483 | score = 0.0 | 557 | score = 0.0 |
| 484 | if top_score is None or score > top_score: | 558 | if top_score is None or score > top_score: |
| 485 | - top_url = str(url) | 559 | + best_entry = entry |
| 486 | top_score = score | 560 | top_score = score |
| 487 | - if top_url is not None: | ||
| 488 | - break # Prefer the first listed inner_hits source (exact > approx). | ||
| 489 | - if top_url is None: | 561 | + if best_entry is not None: |
| 562 | + break # Prefer exact_image_knn_query_hits over image_knn_query_hits. | ||
| 563 | + if best_entry is None: | ||
| 564 | + return None | ||
| 565 | + | ||
| 566 | + candidates = self._inner_hit_url_candidates(best_entry, source) | ||
| 567 | + if not candidates: | ||
| 490 | return None | 568 | return None |
| 491 | 569 | ||
| 492 | skus = source.get("skus") | 570 | skus = source.get("skus") |
| 493 | if not isinstance(skus, list): | 571 | if not isinstance(skus, list): |
| 494 | return None | 572 | return None |
| 495 | - target = self._normalize_url(top_url) | ||
| 496 | for sku in skus: | 573 | for sku in skus: |
| 497 | - sku_url = self._normalize_url(sku.get("image_src") or sku.get("imageSrc")) | ||
| 498 | - if sku_url and sku_url == target: | ||
| 499 | - return ImagePick( | ||
| 500 | - sku_id=str(sku.get("sku_id") or ""), | ||
| 501 | - url=top_url, | ||
| 502 | - score=float(top_score or 0.0), | ||
| 503 | - ) | 574 | + sku_raw = sku.get("image_src") or sku.get("imageSrc") |
| 575 | + for cand in candidates: | ||
| 576 | + if self._urls_equivalent(cand, sku_raw): | ||
| 577 | + return ImagePick( | ||
| 578 | + sku_id=str(sku.get("sku_id") or ""), | ||
| 579 | + url=cand, | ||
| 580 | + score=float(top_score or 0.0), | ||
| 581 | + ) | ||
| 504 | return None | 582 | return None |
| 505 | 583 | ||
| 506 | # ------------------------------------------------------------------ | 584 | # ------------------------------------------------------------------ |
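The three-tier URL alignment above (exact host+path, then path-only, then filename fallback) can be sketched standalone. Helper names and the `len(fa) > 4` filename guard mirror the diff, but treat this as an illustration of the matching cascade rather than the canonical `StyleSkuSelector` code:

```python
import posixpath
from urllib.parse import unquote, urlsplit

def _norm(url: str) -> str:
    """host + path, percent-decoded and casefolded (query/fragment dropped)."""
    raw = (url or "").strip()
    if raw.startswith("//"):           # protocol-relative URL
        raw = "https:" + raw
    parts = urlsplit(raw)
    return f"{(parts.netloc or '').casefold()}{unquote(parts.path or '')}".casefold()

def _path_only(url: str) -> str:
    """Path-only key for cross-CDN / host-alias cases."""
    raw = (url or "").strip()
    if raw.startswith("//"):
        raw = "https:" + raw
    return unquote(urlsplit(raw).path or "").casefold().rstrip("/")

def urls_equivalent(a: str, b: str) -> bool:
    if not a or not b:
        return False
    if _norm(a) and _norm(a) == _norm(b):                  # tier 1: host + path
        return True
    if _path_only(a) and _path_only(a) == _path_only(b):   # tier 2: path only
        return True
    fa = posixpath.basename(_path_only(a))
    fb = posixpath.basename(_path_only(b))
    # tier 3: filename only, guarded against trivially short names
    return bool(fa) and fa == fb and len(fa) > 4
```

For example, two CDN aliases serving the same object match at tier 2: `urls_equivalent("https://cdn-a.example.com/img/abc123.jpg", "https://cdn-b.example.net/img/abc123.jpg")` returns True even though the hosts differ.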
tests/ci/test_service_api_contracts.py
| @@ -11,6 +11,8 @@ import pytest | @@ -11,6 +11,8 @@ import pytest | ||
| 11 | from fastapi.testclient import TestClient | 11 | from fastapi.testclient import TestClient |
| 12 | from translation.scenes import normalize_scene_name | 12 | from translation.scenes import normalize_scene_name |
| 13 | 13 | ||
| 14 | +pytestmark = [pytest.mark.contract, pytest.mark.regression] | ||
| 15 | + | ||
| 14 | 16 | ||
| 15 | class _FakeSearcher: | 17 | class _FakeSearcher: |
| 16 | def search(self, **kwargs): | 18 | def search(self, **kwargs): |
tests/conftest.py
| 1 | -""" | ||
| 2 | -pytest配置文件 | 1 | +"""Global pytest configuration. |
| 2 | + | ||
| 3 | +- Injects the project root into sys.path (so modules under `tests/` can import `from <pkg>` directly) | ||
| 4 | +- The **authoritative source** for markers / testpaths / filter rules is `pytest.ini`; nothing is redefined here | ||
| 3 | 5 | ||
| 4 | -提供测试夹具和共享配置 | 6 | +Historically this file defined fixtures such as `sample_search_config / mock_es_client / test_searcher`, |
| 7 | +but tests written since 2026-Q2 all ship their own fake stubs; these fixtures had no references in the repo | ||
| 8 | +and were removed. When adding shared fixtures, state which tests use them to avoid new dead fixtures. | ||
| 5 | """ | 9 | """ |
| 6 | 10 | ||
| 7 | import os | 11 | import os |
| 8 | import sys | 12 | import sys |
| 9 | -import pytest | ||
| 10 | -import tempfile | ||
| 11 | -from typing import Dict, Any, Generator | ||
| 12 | -from unittest.mock import Mock, MagicMock | ||
| 13 | 13 | ||
| 14 | -# 添加项目根目录到Python路径 | ||
| 15 | project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) | 14 | project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) |
| 16 | sys.path.insert(0, project_root) | 15 | sys.path.insert(0, project_root) |
| 17 | - | ||
| 18 | -from config import SearchConfig, QueryConfig, IndexConfig, SPUConfig, FunctionScoreConfig, RerankConfig | ||
| 19 | -from utils.es_client import ESClient | ||
| 20 | -from search import Searcher | ||
| 21 | -from query import QueryParser | ||
| 22 | -from context import RequestContext, create_request_context | ||
| 23 | - | ||
| 24 | - | ||
| 25 | -@pytest.fixture | ||
| 26 | -def sample_index_config() -> IndexConfig: | ||
| 27 | - """样例索引配置""" | ||
| 28 | - return IndexConfig( | ||
| 29 | - name="default", | ||
| 30 | - label="默认索引", | ||
| 31 | - fields=["title.zh", "brief.zh", "tags"], | ||
| 32 | - boost=1.0 | ||
| 33 | - ) | ||
| 34 | - | ||
| 35 | - | ||
| 36 | -@pytest.fixture | ||
| 37 | -def sample_search_config(sample_index_config) -> SearchConfig: | ||
| 38 | - """样例搜索配置""" | ||
| 39 | - query_config = QueryConfig( | ||
| 40 | - enable_query_rewrite=True, | ||
| 41 | - enable_text_embedding=True, | ||
| 42 | - supported_languages=["zh", "en"] | ||
| 43 | - ) | ||
| 44 | - | ||
| 45 | - spu_config = SPUConfig( | ||
| 46 | - enabled=True, | ||
| 47 | - spu_field="spu_id", | ||
| 48 | - inner_hits_size=3 | ||
| 49 | - ) | ||
| 50 | - | ||
| 51 | - function_score_config = FunctionScoreConfig() | ||
| 52 | - rerank_config = RerankConfig() | ||
| 53 | - | ||
| 54 | - return SearchConfig( | ||
| 55 | - es_index_name="test_products", | ||
| 56 | - field_boosts={ | ||
| 57 | - "tenant_id": 1.0, | ||
| 58 | - "title.zh": 3.0, | ||
| 59 | - "brief.zh": 1.5, | ||
| 60 | - "tags": 1.0, | ||
| 61 | - "category_path.zh": 1.5, | ||
| 62 | - }, | ||
| 63 | - indexes=[sample_index_config], | ||
| 64 | - query_config=query_config, | ||
| 65 | - function_score=function_score_config, | ||
| 66 | - rerank=rerank_config, | ||
| 67 | - spu_config=spu_config | ||
| 68 | - ) | ||
| 69 | - | ||
| 70 | - | ||
| 71 | -@pytest.fixture | ||
| 72 | -def mock_es_client() -> Mock: | ||
| 73 | - """模拟ES客户端""" | ||
| 74 | - mock_client = Mock(spec=ESClient) | ||
| 75 | - | ||
| 76 | - # 模拟搜索响应 | ||
| 77 | - mock_response = { | ||
| 78 | - "hits": { | ||
| 79 | - "total": {"value": 10}, | ||
| 80 | - "max_score": 2.5, | ||
| 81 | - "hits": [ | ||
| 82 | - { | ||
| 83 | - "_id": "1", | ||
| 84 | - "_score": 2.5, | ||
| 85 | - "_source": { | ||
| 86 | - "title": {"zh": "红色连衣裙"}, | ||
| 87 | - "vendor": {"zh": "测试品牌"}, | ||
| 88 | - "min_price": 299.0, | ||
| 89 | - "category_id": "1" | ||
| 90 | - } | ||
| 91 | - }, | ||
| 92 | - { | ||
| 93 | - "_id": "2", | ||
| 94 | - "_score": 2.2, | ||
| 95 | - "_source": { | ||
| 96 | - "title": {"zh": "蓝色连衣裙"}, | ||
| 97 | - "vendor": {"zh": "测试品牌"}, | ||
| 98 | - "min_price": 399.0, | ||
| 99 | - "category_id": "1" | ||
| 100 | - } | ||
| 101 | - } | ||
| 102 | - ] | ||
| 103 | - }, | ||
| 104 | - "took": 15 | ||
| 105 | - } | ||
| 106 | - | ||
| 107 | - mock_client.search.return_value = mock_response | ||
| 108 | - return mock_client | ||
| 109 | - | ||
| 110 | - | ||
| 111 | -@pytest.fixture | ||
| 112 | -def test_searcher(sample_search_config, mock_es_client) -> Searcher: | ||
| 113 | - """测试用Searcher实例""" | ||
| 114 | - return Searcher( | ||
| 115 | - es_client=mock_es_client, | ||
| 116 | - config=sample_search_config | ||
| 117 | - ) | ||
| 118 | - | ||
| 119 | - | ||
| 120 | -@pytest.fixture | ||
| 121 | -def test_query_parser(sample_search_config) -> QueryParser: | ||
| 122 | - """测试用QueryParser实例""" | ||
| 123 | - return QueryParser(sample_search_config) | ||
| 124 | - | ||
| 125 | - | ||
| 126 | -@pytest.fixture | ||
| 127 | -def test_request_context() -> RequestContext: | ||
| 128 | - """测试用RequestContext实例""" | ||
| 129 | - return create_request_context("test-req-001", "test-user") | ||
| 130 | - | ||
| 131 | - | ||
| 132 | -@pytest.fixture | ||
| 133 | -def sample_search_results() -> Dict[str, Any]: | ||
| 134 | - """样例搜索结果""" | ||
| 135 | - return { | ||
| 136 | - "query": "红色连衣裙", | ||
| 137 | - "expected_total": 2, | ||
| 138 | - "expected_products": [ | ||
| 139 | - {"title": "红色连衣裙", "min_price": 299.0}, | ||
| 140 | - {"title": "蓝色连衣裙", "min_price": 399.0} | ||
| 141 | - ] | ||
| 142 | - } | ||
| 143 | - | ||
| 144 | - | ||
| 145 | -@pytest.fixture | ||
| 146 | -def temp_config_file() -> Generator[str, None, None]: | ||
| 147 | - """临时配置文件""" | ||
| 148 | - import tempfile | ||
| 149 | - import yaml | ||
| 150 | - | ||
| 151 | - config_data = { | ||
| 152 | - "es_index_name": "test_products", | ||
| 153 | - "field_boosts": { | ||
| 154 | - "title.zh": 3.0, | ||
| 155 | - "brief.zh": 1.5, | ||
| 156 | - "tags": 1.0, | ||
| 157 | - "category_path.zh": 1.5 | ||
| 158 | - }, | ||
| 159 | - "indexes": [ | ||
| 160 | - { | ||
| 161 | - "name": "default", | ||
| 162 | - "label": "默认索引", | ||
| 163 | - "fields": ["title.zh", "brief.zh", "tags"], | ||
| 164 | - "boost": 1.0 | ||
| 165 | - } | ||
| 166 | - ], | ||
| 167 | - "query_config": { | ||
| 168 | - "supported_languages": ["zh", "en"], | ||
| 169 | - "default_language": "zh", | ||
| 170 | - "enable_text_embedding": True, | ||
| 171 | - "enable_query_rewrite": True | ||
| 172 | - }, | ||
| 173 | - "spu_config": { | ||
| 174 | - "enabled": True, | ||
| 175 | - "spu_field": "spu_id", | ||
| 176 | - "inner_hits_size": 3 | ||
| 177 | - }, | ||
| 178 | - "ranking": { | ||
| 179 | - "expression": "bm25() + 0.2*text_embedding_relevance()", | ||
| 180 | - "description": "Test ranking" | ||
| 181 | - }, | ||
| 182 | - "function_score": { | ||
| 183 | - "score_mode": "sum", | ||
| 184 | - "boost_mode": "multiply", | ||
| 185 | - "functions": [] | ||
| 186 | - }, | ||
| 187 | - "rerank": { | ||
| 188 | - "rerank_window": 386 | ||
| 189 | - } | ||
| 190 | - } | ||
| 191 | - | ||
| 192 | - with tempfile.NamedTemporaryFile(mode='w', suffix='.yaml', delete=False) as f: | ||
| 193 | - yaml.dump(config_data, f) | ||
| 194 | - temp_file = f.name | ||
| 195 | - | ||
| 196 | - yield temp_file | ||
| 197 | - | ||
| 198 | - # 清理 | ||
| 199 | - os.unlink(temp_file) | ||
| 200 | - | ||
| 201 | - | ||
| 202 | -@pytest.fixture | ||
| 203 | -def mock_env_variables(monkeypatch): | ||
| 204 | - """设置环境变量""" | ||
| 205 | - monkeypatch.setenv("ES_HOST", "http://localhost:9200") | ||
| 206 | - monkeypatch.setenv("ES_USERNAME", "elastic") | ||
| 207 | - monkeypatch.setenv("ES_PASSWORD", "changeme") | ||
| 208 | - | ||
| 209 | - | ||
| 210 | -# 标记配置 | ||
| 211 | -pytest_plugins = [] | ||
| 212 | - | ||
| 213 | -# 标记定义 | ||
| 214 | -def pytest_configure(config): | ||
| 215 | - """配置pytest标记""" | ||
| 216 | - config.addinivalue_line( | ||
| 217 | - "markers", "unit: 单元测试" | ||
| 218 | - ) | ||
| 219 | - config.addinivalue_line( | ||
| 220 | - "markers", "integration: 集成测试" | ||
| 221 | - ) | ||
| 222 | - config.addinivalue_line( | ||
| 223 | - "markers", "api: API测试" | ||
| 224 | - ) | ||
| 225 | - config.addinivalue_line( | ||
| 226 | - "markers", "e2e: 端到端测试" | ||
| 227 | - ) | ||
| 228 | - config.addinivalue_line( | ||
| 229 | - "markers", "performance: 性能测试" | ||
| 230 | - ) | ||
| 231 | - config.addinivalue_line( | ||
| 232 | - "markers", "slow: 慢速测试" | ||
| 233 | - ) | ||
| 234 | - | ||
| 235 | - | ||
| 236 | -# 测试数据 | ||
| 237 | -@pytest.fixture | ||
| 238 | -def test_queries(): | ||
| 239 | - """测试查询集合""" | ||
| 240 | - return [ | ||
| 241 | - "红色连衣裙", | ||
| 242 | - "wireless bluetooth headphones", | ||
| 243 | - "手机 手机壳", | ||
| 244 | - "laptop AND (gaming OR professional)", | ||
| 245 | - "运动鞋 -价格:0-500" | ||
| 246 | - ] | ||
| 247 | - | ||
| 248 | - | ||
| 249 | -@pytest.fixture | ||
| 250 | -def expected_response_structure(): | ||
| 251 | - """期望的API响应结构""" | ||
| 252 | - return { | ||
| 253 | - "hits": list, | ||
| 254 | - "total": int, | ||
| 255 | - "max_score": float, | ||
| 256 | - "took_ms": int, | ||
| 257 | - "aggregations": dict, | ||
| 258 | - "query_info": dict, | ||
| 259 | - "performance_summary": dict | ||
| 260 | - } |
tests/test_cnclip_service.py renamed to tests/manual/test_cnclip_service.py
tests/test_facet_api.py renamed to tests/manual/test_facet_api.py
tests/test_cache_keys.py
| @@ -4,6 +4,10 @@ import hashlib | @@ -4,6 +4,10 @@ import hashlib | ||
| 4 | 4 | ||
| 5 | from embeddings import cache_keys as ck | 5 | from embeddings import cache_keys as ck |
| 6 | 6 | ||
| 7 | +import pytest | ||
| 8 | + | ||
| 9 | +pytestmark = [pytest.mark.embedding, pytest.mark.regression] | ||
| 10 | + | ||
| 7 | 11 | ||
| 8 | def test_stable_body_short_unchanged(): | 12 | def test_stable_body_short_unchanged(): |
| 9 | s = "a" * ck.CACHE_KEY_RAW_BODY_MAX_CHARS | 13 | s = "a" * ck.CACHE_KEY_RAW_BODY_MAX_CHARS |
tests/test_embedding_pipeline.py
| @@ -21,6 +21,8 @@ from embeddings.config import CONFIG | @@ -21,6 +21,8 @@ from embeddings.config import CONFIG | ||
| 21 | from query import QueryParser | 21 | from query import QueryParser |
| 22 | from context.request_context import create_request_context, set_current_request_context, clear_current_request_context | 22 | from context.request_context import create_request_context, set_current_request_context, clear_current_request_context |
| 23 | 23 | ||
| 24 | +pytestmark = [pytest.mark.embedding, pytest.mark.regression] | ||
| 25 | + | ||
| 24 | 26 | ||
| 25 | class _FakeRedis: | 27 | class _FakeRedis: |
| 26 | def __init__(self): | 28 | def __init__(self): |
| @@ -177,8 +179,10 @@ def test_text_embedding_encoder_cache_hit(monkeypatch): | @@ -177,8 +179,10 @@ def test_text_embedding_encoder_cache_hit(monkeypatch): | ||
| 177 | out = encoder.encode(["cached-text", "new-text"]) | 179 | out = encoder.encode(["cached-text", "new-text"]) |
| 178 | 180 | ||
| 179 | assert calls["count"] == 1 | 181 | assert calls["count"] == 1 |
| 180 | - assert np.allclose(out[0], cached) | ||
| 181 | - assert np.allclose(out[1], np.array([0.3, 0.4], dtype=np.float32)) | 182 | + # encoder returns an object-dtype ndarray of 1-D float32 vectors; cast per-row |
| 183 | + # before numeric comparison. | ||
| 184 | + assert np.allclose(np.asarray(out[0], dtype=np.float32), cached) | ||
| 185 | + assert np.allclose(np.asarray(out[1], dtype=np.float32), np.array([0.3, 0.4], dtype=np.float32)) | ||
| 182 | 186 | ||
| 183 | 187 | ||
| 184 | def test_text_embedding_encoder_forwards_request_headers(monkeypatch): | 188 | def test_text_embedding_encoder_forwards_request_headers(monkeypatch): |
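The object-dtype detail behind the `np.asarray(..., dtype=np.float32)` cast can be reproduced in isolation: when per-text vectors have different lengths, NumPy can only store them as an object array of rows, and casting each row back to float32 keeps the `np.allclose` comparison purely numeric. This is a minimal sketch, not the encoder itself:

```python
import numpy as np

# Ragged per-text vectors cannot form a rectangular float array, so the
# container becomes an object-dtype ndarray of 1-D float32 rows.
rows = np.empty(2, dtype=object)
rows[0] = np.array([0.1, 0.2], dtype=np.float32)
rows[1] = np.array([0.3, 0.4, 0.5], dtype=np.float32)
assert rows.dtype == object

# Casting a single row restores a plain float32 vector for np.allclose.
row0 = np.asarray(rows[0], dtype=np.float32)
assert row0.dtype == np.float32
assert np.allclose(row0, np.array([0.1, 0.2], dtype=np.float32))
```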
tests/test_embedding_service_limits.py
| @@ -5,6 +5,8 @@ import pytest | @@ -5,6 +5,8 @@ import pytest | ||
| 5 | 5 | ||
| 6 | import embeddings.server as embedding_server | 6 | import embeddings.server as embedding_server |
| 7 | 7 | ||
| 8 | +pytestmark = [pytest.mark.embedding, pytest.mark.regression] | ||
| 9 | + | ||
| 8 | 10 | ||
| 9 | class _DummyClient: | 11 | class _DummyClient: |
| 10 | host = "127.0.0.1" | 12 | host = "127.0.0.1" |
tests/test_embedding_service_priority.py
| @@ -2,6 +2,10 @@ import threading | @@ -2,6 +2,10 @@ import threading | ||
| 2 | 2 | ||
| 3 | import embeddings.server as emb_server | 3 | import embeddings.server as emb_server |
| 4 | 4 | ||
| 5 | +import pytest | ||
| 6 | + | ||
| 7 | +pytestmark = [pytest.mark.embedding, pytest.mark.regression] | ||
| 8 | + | ||
| 5 | 9 | ||
| 6 | def test_text_inflight_limiter_priority_bypass(): | 10 | def test_text_inflight_limiter_priority_bypass(): |
| 7 | limiter = emb_server._InflightLimiter(name="text", limit=1) | 11 | limiter = emb_server._InflightLimiter(name="text", limit=1) |
| @@ -30,6 +34,7 @@ def test_text_dispatch_prefers_high_priority_queue(): | @@ -30,6 +34,7 @@ def test_text_dispatch_prefers_high_priority_queue(): | ||
| 30 | normalized=["online"], | 34 | normalized=["online"], |
| 31 | effective_normalize=True, | 35 | effective_normalize=True, |
| 32 | request_id="high", | 36 | request_id="high", |
| 37 | + user_id="u-high", | ||
| 33 | priority=1, | 38 | priority=1, |
| 34 | created_at=0.0, | 39 | created_at=0.0, |
| 35 | done=threading.Event(), | 40 | done=threading.Event(), |
| @@ -38,6 +43,7 @@ def test_text_dispatch_prefers_high_priority_queue(): | @@ -38,6 +43,7 @@ def test_text_dispatch_prefers_high_priority_queue(): | ||
| 38 | normalized=["offline"], | 43 | normalized=["offline"], |
| 39 | effective_normalize=True, | 44 | effective_normalize=True, |
| 40 | request_id="normal", | 45 | request_id="normal", |
| 46 | + user_id="u-normal", | ||
| 41 | priority=0, | 47 | priority=0, |
| 42 | created_at=0.0, | 48 | created_at=0.0, |
| 43 | done=threading.Event(), | 49 | done=threading.Event(), |
tests/test_es_query_builder.py
| @@ -5,6 +5,10 @@ import numpy as np | @@ -5,6 +5,10 @@ import numpy as np | ||
| 5 | 5 | ||
| 6 | from search.es_query_builder import ESQueryBuilder | 6 | from search.es_query_builder import ESQueryBuilder |
| 7 | 7 | ||
| 8 | +import pytest | ||
| 9 | + | ||
| 10 | +pytestmark = [pytest.mark.search, pytest.mark.regression] | ||
| 11 | + | ||
| 8 | 12 | ||
| 9 | def _builder() -> ESQueryBuilder: | 13 | def _builder() -> ESQueryBuilder: |
| 10 | return ESQueryBuilder( | 14 | return ESQueryBuilder( |
tests/test_es_query_builder_text_recall_languages.py
| @@ -14,6 +14,10 @@ import numpy as np | @@ -14,6 +14,10 @@ import numpy as np | ||
| 14 | from query.keyword_extractor import KEYWORDS_QUERY_BASE_KEY | 14 | from query.keyword_extractor import KEYWORDS_QUERY_BASE_KEY |
| 15 | from search.es_query_builder import ESQueryBuilder | 15 | from search.es_query_builder import ESQueryBuilder |
| 16 | 16 | ||
| 17 | +import pytest | ||
| 18 | + | ||
| 19 | +pytestmark = [pytest.mark.search, pytest.mark.regression] | ||
| 20 | + | ||
| 17 | 21 | ||
| 18 | def _builder_multilingual_title_only(*, default_language: str = "en") -> ESQueryBuilder: | 22 | def _builder_multilingual_title_only(*, default_language: str = "en") -> ESQueryBuilder: |
| 19 | """Minimal builder: only title.{lang} for easy field assertions.""" | 23 | """Minimal builder: only title.{lang} for easy field assertions.""" |
| @@ -135,8 +139,13 @@ def test_zh_query_index_zh_en_includes_base_zh_and_trans_en(): | @@ -135,8 +139,13 @@ def test_zh_query_index_zh_en_includes_base_zh_and_trans_en(): | ||
| 135 | assert "title.en" in _title_fields(idx["base_query_trans_en"]) | 139 | assert "title.en" in _title_fields(idx["base_query_trans_en"]) |
| 136 | 140 | ||
| 137 | 141 | ||
| 138 | -def test_keywords_combined_fields_second_must_same_fields_and_50pct(): | ||
| 139 | - """When ParsedQuery.keywords_queries is set, inner must has two boosted combined_fields.""" | 142 | +def test_keywords_combined_fields_second_must_shares_fields_with_main_query(): |
| 143 | + """When ParsedQuery.keywords_queries is set, inner must has two boosted combined_fields. | ||
| 144 | + | ||
| 145 | + The second must sub-clause reuses the primary clause's field set and applies a | ||
| 146 | + tuned minimum_should_match / boost to keep keyword recall under control; see | ||
| 147 | + `search/es_query_builder.py` ``_keywords_combined_fields_sub_must``. | ||
| 148 | + """ | ||
| 140 | qb = _builder_multilingual_title_only(default_language="en") | 149 | qb = _builder_multilingual_title_only(default_language="en") |
| 141 | parsed = SimpleNamespace( | 150 | parsed = SimpleNamespace( |
| 142 | rewritten_query="连衣裙", | 151 | rewritten_query="连衣裙", |
| @@ -153,16 +162,16 @@ def test_keywords_combined_fields_second_must_same_fields_and_50pct(): | @@ -153,16 +162,16 @@ def test_keywords_combined_fields_second_must_same_fields_and_50pct(): | ||
| 153 | assert bm[0]["combined_fields"]["query"] == "连衣裙" | 162 | assert bm[0]["combined_fields"]["query"] == "连衣裙" |
| 154 | assert bm[0]["combined_fields"]["boost"] == 2.0 | 163 | assert bm[0]["combined_fields"]["boost"] == 2.0 |
| 155 | assert bm[1]["combined_fields"]["query"] == "连衣 裙" | 164 | assert bm[1]["combined_fields"]["query"] == "连衣 裙" |
| 156 | - assert bm[1]["combined_fields"]["minimum_should_match"] == "50%" | ||
| 157 | - assert bm[1]["combined_fields"]["boost"] == 0.6 | 165 | + assert bm[1]["combined_fields"]["minimum_should_match"] == "60%" |
| 166 | + assert bm[1]["combined_fields"]["boost"] == 0.8 | ||
| 158 | assert bm[1]["combined_fields"]["fields"] == bm[0]["combined_fields"]["fields"] | 167 | assert bm[1]["combined_fields"]["fields"] == bm[0]["combined_fields"]["fields"] |
| 159 | trans = idx["base_query_trans_en"] | 168 | trans = idx["base_query_trans_en"] |
| 160 | assert trans["minimum_should_match"] == 1 | 169 | assert trans["minimum_should_match"] == 1 |
| 161 | tm = _combined_fields_must(trans) | 170 | tm = _combined_fields_must(trans) |
| 162 | assert len(tm) == 2 | 171 | assert len(tm) == 2 |
| 163 | assert tm[1]["combined_fields"]["query"] == "dress" | 172 | assert tm[1]["combined_fields"]["query"] == "dress" |
| 164 | - assert tm[1]["combined_fields"]["minimum_should_match"] == "50%" | ||
| 165 | - assert tm[1]["combined_fields"]["boost"] == 0.6 | 173 | + assert tm[1]["combined_fields"]["minimum_should_match"] == "60%" |
| 174 | + assert tm[1]["combined_fields"]["boost"] == 0.8 | ||
| 166 | 175 | ||
| 167 | 176 | ||
| 168 | def test_keywords_omitted_when_same_as_main_combined_fields_query(): | 177 | def test_keywords_omitted_when_same_as_main_combined_fields_query(): |
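The clause shape the updated assertions expect looks roughly like the following. The field list and boost values inside `fields` are illustrative placeholders; the real structure is produced by `search/es_query_builder.py`:

```python
# Hypothetical inner `must` for one index clause: the first combined_fields
# carries the main query, the second reuses the same field set for the
# space-joined keywords with the tuned MSM/boost the tests now assert.
must = [
    {"combined_fields": {"query": "连衣裙", "fields": ["title.zh^3.0"],
                         "boost": 2.0}},
    {"combined_fields": {"query": "连衣 裙", "fields": ["title.zh^3.0"],
                         "minimum_should_match": "60%", "boost": 0.8}},
]

# The regression anchors pin exactly these three invariants:
assert must[1]["combined_fields"]["fields"] == must[0]["combined_fields"]["fields"]
assert must[1]["combined_fields"]["minimum_should_match"] == "60%"
assert must[1]["combined_fields"]["boost"] == 0.8
```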
tests/test_eval_framework_clients.py
| @@ -4,6 +4,8 @@ import requests | @@ -4,6 +4,8 @@ import requests | ||
| 4 | from scripts.evaluation.eval_framework.clients import DashScopeLabelClient | 4 | from scripts.evaluation.eval_framework.clients import DashScopeLabelClient |
| 5 | from scripts.evaluation.eval_framework.utils import build_label_doc_line | 5 | from scripts.evaluation.eval_framework.utils import build_label_doc_line |
| 6 | 6 | ||
| 7 | +pytestmark = [pytest.mark.eval] | ||
| 8 | + | ||
| 7 | 9 | ||
| 8 | def _http_error(status_code: int, body: str) -> requests.exceptions.HTTPError: | 10 | def _http_error(status_code: int, body: str) -> requests.exceptions.HTTPError: |
| 9 | response = requests.Response() | 11 | response = requests.Response() |
tests/test_eval_metrics.py
| 1 | """Tests for search evaluation ranking metrics (NDCG, ERR).""" | 1 | """Tests for search evaluation ranking metrics (NDCG, ERR).""" |
| 2 | 2 | ||
| 3 | +import math | ||
| 4 | + | ||
| 5 | +import pytest | ||
| 6 | + | ||
| 7 | +pytestmark = [pytest.mark.eval, pytest.mark.regression] | ||
| 8 | + | ||
| 3 | from scripts.evaluation.eval_framework.constants import ( | 9 | from scripts.evaluation.eval_framework.constants import ( |
| 4 | - RELEVANCE_EXACT, | ||
| 5 | - RELEVANCE_HIGH, | ||
| 6 | - RELEVANCE_IRRELEVANT, | ||
| 7 | - RELEVANCE_LOW, | 10 | + RELEVANCE_LV0, |
| 11 | + RELEVANCE_LV1, | ||
| 12 | + RELEVANCE_LV2, | ||
| 13 | + RELEVANCE_LV3, | ||
| 14 | + STOP_PROB_MAP, | ||
| 8 | ) | 15 | ) |
| 9 | from scripts.evaluation.eval_framework.metrics import compute_query_metrics | 16 | from scripts.evaluation.eval_framework.metrics import compute_query_metrics |
| 10 | 17 | ||
| 11 | 18 | ||
| 12 | -def test_err_matches_documented_three_item_examples(): | ||
| 13 | - # Model A: [Exact, Irrelevant, High] -> ERR ≈ 0.992667 | ||
| 14 | - m_a = compute_query_metrics( | ||
| 15 | - [RELEVANCE_EXACT, RELEVANCE_IRRELEVANT, RELEVANCE_HIGH], | ||
| 16 | - ideal_labels=[RELEVANCE_EXACT], | ||
| 17 | - ) | ||
| 18 | - assert abs(m_a["ERR@5"] - (0.99 + (1.0 / 3.0) * 0.8 * 0.01)) < 1e-5 | ||
| 19 | - | ||
| 20 | - # Model B: [High, Low, Exact] -> ERR ≈ 0.8694 | ||
| 21 | - m_b = compute_query_metrics( | ||
| 22 | - [RELEVANCE_HIGH, RELEVANCE_LOW, RELEVANCE_EXACT], | ||
| 23 | - ideal_labels=[RELEVANCE_EXACT], | ||
| 24 | - ) | ||
| 25 | - expected_b = 0.8 + 0.5 * 0.1 * 0.2 + (1.0 / 3.0) * 0.99 * 0.18 | ||
| 26 | - assert abs(m_b["ERR@5"] - expected_b) < 1e-5 | 19 | +def _expected_err(labels): |
| 20 | + err = 0.0 | ||
| 21 | + product = 1.0 | ||
| 22 | + for i, label in enumerate(labels, start=1): | ||
| 23 | + p = STOP_PROB_MAP[label] | ||
| 24 | + err += (1.0 / i) * p * product | ||
| 25 | + product *= 1.0 - p | ||
| 26 | + return err | ||
| 27 | + | ||
| 28 | + | ||
| 29 | +def test_err_matches_cascade_formula_on_four_level_labels(): | ||
| 30 | + """ERR@k must equal the textbook cascade formula against the four-level label set. | ||
| 31 | + | ||
| 32 | + The metric is the primary ranking signal (see `PRIMARY_METRIC_KEYS` in | ||
| 33 | + `eval_framework.metrics`); any regression here invalidates the whole | ||
| 34 | + evaluation pipeline. | ||
| 35 | + """ | ||
| 36 | + | ||
| 37 | + ranked_a = [RELEVANCE_LV3, RELEVANCE_LV0, RELEVANCE_LV2] | ||
| 38 | + ranked_b = [RELEVANCE_LV2, RELEVANCE_LV1, RELEVANCE_LV3] | ||
| 39 | + | ||
| 40 | + m_a = compute_query_metrics(ranked_a, ideal_labels=[RELEVANCE_LV3]) | ||
| 41 | + m_b = compute_query_metrics(ranked_b, ideal_labels=[RELEVANCE_LV3]) | ||
| 42 | + | ||
| 43 | + assert math.isclose(m_a["ERR@5"], _expected_err(ranked_a), abs_tol=1e-5) | ||
| 44 | + assert math.isclose(m_b["ERR@5"], _expected_err(ranked_b), abs_tol=1e-5) | ||
| 45 | + assert m_a["ERR@5"] > m_b["ERR@5"] | ||
| 46 | + | ||
| 47 | + | ||
| 48 | +def test_ndcg_at_k_is_1_when_actual_equals_ideal(): | ||
| 49 | + labels = [RELEVANCE_LV3, RELEVANCE_LV2, RELEVANCE_LV1] | ||
| 50 | + metrics = compute_query_metrics(labels, ideal_labels=labels) | ||
| 51 | + assert math.isclose(metrics["NDCG@5"], 1.0, abs_tol=1e-9) | ||
| 52 | + assert math.isclose(metrics["NDCG@20"], 1.0, abs_tol=1e-9) | ||
| 53 | + | ||
| 54 | + | ||
| 55 | +def test_all_irrelevant_zeroes_out_primary_signals(): | ||
| 56 | + labels = [RELEVANCE_LV0, RELEVANCE_LV0, RELEVANCE_LV0] | ||
| 57 | + metrics = compute_query_metrics(labels, ideal_labels=[RELEVANCE_LV3]) | ||
| 58 | + assert metrics["ERR@10"] == 0.0 | ||
| 59 | + assert metrics["NDCG@20"] == 0.0 | ||
| 60 | + assert metrics["Strong_Precision@10"] == 0.0 | ||
| 61 | + assert metrics["Primary_Metric_Score"] == 0.0 |
tests/test_keywords_query.py deleted
| @@ -1,115 +0,0 @@ | @@ -1,115 +0,0 @@ | ||
| 1 | -import hanlp | ||
| 2 | -from typing import List, Tuple, Dict, Any | ||
| 3 | - | ||
| 4 | -class KeywordExtractor: | ||
| 5 | - """ | ||
| 6 | - 基于 HanLP 的名词关键词提取器 | ||
| 7 | - """ | ||
| 8 | - def __init__(self): | ||
| 9 | - # 加载带位置信息的分词模型(细粒度) | ||
| 10 | - self.tok = hanlp.load(hanlp.pretrained.tok.CTB9_TOK_ELECTRA_BASE_CRF) | ||
| 11 | - self.tok.config.output_spans = True # 启用位置输出 | ||
| 12 | - | ||
| 13 | - # 加载词性标注模型 | ||
| 14 | - self.pos_tag = hanlp.load(hanlp.pretrained.pos.CTB9_POS_ELECTRA_SMALL) | ||
| 15 | - | ||
| 16 | - def extract_keywords(self, query: str) -> str: | ||
| 17 | - """ | ||
| 18 | - 从查询中提取关键词(名词,长度 ≥ 2) | ||
| 19 | - | ||
| 20 | - Args: | ||
| 21 | - query: 输入文本 | ||
| 22 | - | ||
| 23 | - Returns: | ||
| 24 | - 拼接后的关键词字符串,非连续词之间自动插入空格 | ||
| 25 | - """ | ||
| 26 | - query = query.strip() | ||
| 27 | - # 分词结果带位置:[[word, start, end], ...] | ||
| 28 | - tok_result_with_position = self.tok(query) | ||
| 29 | - tok_result = [x[0] for x in tok_result_with_position] | ||
| 30 | - | ||
| 31 | - # 词性标注 | ||
| 32 | - pos_tag_result = list(zip(tok_result, self.pos_tag(tok_result))) | ||
| 33 | - | ||
| 34 | - # 需要忽略的词 | ||
| 35 | - ignore_keywords = ['玩具'] | ||
| 36 | - | ||
| 37 | - keywords = [] | ||
| 38 | - last_end_pos = 0 | ||
| 39 | - | ||
| 40 | - for (word, postag), (_, start_pos, end_pos) in zip(pos_tag_result, tok_result_with_position): | ||
| 41 | - if len(word) >= 2 and postag.startswith('N'): | ||
| 42 | - if word in ignore_keywords: | ||
| 43 | - continue | ||
| 44 | - # 如果当前词与上一个词在原文中不连续,插入空格 | ||
| 45 | - if start_pos != last_end_pos and keywords: | ||
| 46 | - keywords.append(" ") | ||
| 47 | - keywords.append(word) | ||
| 48 | - last_end_pos = end_pos | ||
| 49 | - # 可选:打印调试信息 | ||
| 50 | - # print(f'分词: {word} | 词性: {postag} | 起始: {start_pos} | 结束: {end_pos}') | ||
| 51 | - | ||
| 52 | - return "".join(keywords).strip() | ||
| 53 | - | ||
| 54 | - | ||
| 55 | -# 测试代码 | ||
| 56 | -if __name__ == "__main__": | ||
| 57 | - extractor = KeywordExtractor() | ||
| 58 | - | ||
| 59 | - test_queries = [ | ||
| 60 | - # 中文(保留 9 个代表性查询) | ||
| 61 | - "2.4G遥控大蛇", | ||
| 62 | - "充气的篮球", | ||
| 63 | - "遥控 塑料 飞船 汽车 ", | ||
| 64 | - "亚克力相框", | ||
| 65 | - "8寸 搪胶蘑菇钉", | ||
| 66 | - "7寸娃娃", | ||
| 67 | - "太空沙套装", | ||
| 68 | - "脚蹬工程车", | ||
| 69 | - "捏捏乐钥匙扣", | ||
| 70 | - | ||
| 71 | - # 英文(新增) | ||
| 72 | - "plastic toy car", | ||
| 73 | - "remote control helicopter", | ||
| 74 | - "inflatable beach ball", | ||
| 75 | - "music keychain", | ||
| 76 | - "sand play set", | ||
| 77 | - # 常见商品搜索 | ||
| 78 | - "plastic dinosaur toy", | ||
| 79 | - "wireless bluetooth speaker", | ||
| 80 | - "4K action camera", | ||
| 81 | - "stainless steel water bottle", | ||
| 82 | - "baby stroller with cup holder", | ||
| 83 | - | ||
| 84 | - # 疑问式 / 自然语言 | ||
| 85 | - "what is the best smartphone under 500 dollars", | ||
| 86 | - "how to clean a laptop screen", | ||
| 87 | - "where can I buy organic coffee beans", | ||
| 88 | - | ||
| 89 | - # 含数字、特殊字符 | ||
| 90 | - "USB-C to HDMI adapter 4K", | ||
| 91 | - "LED strip lights 16.4ft", | ||
| 92 | - "Nintendo Switch OLED model", | ||
| 93 | - "iPhone 15 Pro Max case", | ||
| 94 | - | ||
| 95 | - # 简短词组 | ||
| 96 | - "gaming mouse", | ||
| 97 | - "mechanical keyboard", | ||
| 98 | - "wireless earbuds", | ||
| 99 | - | ||
| 100 | - # 长尾词 | ||
| 101 | - "rechargeable AA batteries with charger", | ||
| 102 | - "foldable picnic blanket waterproof", | ||
| 103 | - | ||
| 104 | - # 商品属性组合 | ||
| 105 | - "women's running shoes size 8", | ||
| 106 | - "men's cotton t-shirt crew neck", | ||
| 107 | - | ||
| 108 | - | ||
| 109 | - # 其他语种(保留原样,用于多语言测试) | ||
| 110 | - "свет USB с пультом дистанционного управления красочные", # 俄语 | ||
| 111 | - ] | ||
| 112 | - | ||
| 113 | - for q in test_queries: | ||
| 114 | - keywords = extractor.extract_keywords(q) | ||
| 115 | - print(f"{q:30} => {keywords}") |
tests/test_llm_enrichment_batch_fill.py
| @@ -6,6 +6,10 @@ import pandas as pd | @@ -6,6 +6,10 @@ import pandas as pd | ||
| 6 | 6 | ||
| 7 | from indexer.document_transformer import SPUDocumentTransformer | 7 | from indexer.document_transformer import SPUDocumentTransformer |
| 8 | 8 | ||
| 9 | +import pytest | ||
| 10 | + | ||
| 11 | +pytestmark = [pytest.mark.indexer, pytest.mark.regression] | ||
| 12 | + | ||
| 9 | 13 | ||
| 10 | def test_fill_llm_attributes_batch_uses_product_enrich_helper(monkeypatch): | 14 | def test_fill_llm_attributes_batch_uses_product_enrich_helper(monkeypatch): |
| 11 | seen_calls: List[Dict[str, Any]] = [] | 15 | seen_calls: List[Dict[str, Any]] = [] |
tests/test_process_products_batching.py
| @@ -4,6 +4,10 @@ from typing import Any, Dict, List | @@ -4,6 +4,10 @@ from typing import Any, Dict, List | ||
| 4 | 4 | ||
| 5 | import indexer.product_enrich as process_products | 5 | import indexer.product_enrich as process_products |
| 6 | 6 | ||
| 7 | +import pytest | ||
| 8 | + | ||
| 9 | +pytestmark = [pytest.mark.indexer, pytest.mark.regression] | ||
| 10 | + | ||
| 7 | 11 | ||
| 8 | def _mk_products(n: int) -> List[Dict[str, str]]: | 12 | def _mk_products(n: int) -> List[Dict[str, str]]: |
| 9 | return [{"id": str(i), "title": f"title-{i}"} for i in range(n)] | 13 | return [{"id": str(i), "title": f"title-{i}"} for i in range(n)] |
tests/test_product_enrich_partial_mode.py
| @@ -9,6 +9,10 @@ import types | @@ -9,6 +9,10 @@ import types | ||
| 9 | from pathlib import Path | 9 | from pathlib import Path |
| 10 | from unittest import mock | 10 | from unittest import mock |
| 11 | 11 | ||
| 12 | +import pytest | ||
| 13 | + | ||
| 14 | +pytestmark = [pytest.mark.indexer, pytest.mark.regression] | ||
| 15 | + | ||
| 12 | 16 | ||
| 13 | def _load_product_enrich_module(): | 17 | def _load_product_enrich_module(): |
| 14 | if "dotenv" not in sys.modules: | 18 | if "dotenv" not in sys.modules: |
| @@ -75,6 +79,12 @@ def test_create_prompt_splits_shared_context_and_localized_tail(): | @@ -75,6 +79,12 @@ def test_create_prompt_splits_shared_context_and_localized_tail(): | ||
| 75 | 79 | ||
| 76 | 80 | ||
| 77 | def test_create_prompt_supports_taxonomy_analysis_kind(): | 81 | def test_create_prompt_supports_taxonomy_analysis_kind(): |
| 82 | + """Taxonomy schema must produce prompts for every language it declares. | ||
| 83 | + | ||
| 84 | + Unsupported (schema, lang) combinations return ``(None, None, None)`` so the | ||
| 85 | + caller (``process_batch``) can mark the batch as failed without calling the LLM, | ||
| 86 | + instead of silently emitting garbage. | ||
| 87 | + """ | ||
| 78 | products = [{"id": "1", "title": "linen dress"}] | 88 | products = [{"id": "1", "title": "linen dress"}] |
| 79 | 89 | ||
| 80 | shared_zh, user_zh, prefix_zh = product_enrich.create_prompt( | 90 | shared_zh, user_zh, prefix_zh = product_enrich.create_prompt( |
| @@ -82,18 +92,26 @@ def test_create_prompt_supports_taxonomy_analysis_kind(): | @@ -82,18 +92,26 @@ def test_create_prompt_supports_taxonomy_analysis_kind(): | ||
| 82 | target_lang="zh", | 92 | target_lang="zh", |
| 83 | analysis_kind="taxonomy", | 93 | analysis_kind="taxonomy", |
| 84 | ) | 94 | ) |
| 85 | - shared_fr, user_fr, prefix_fr = product_enrich.create_prompt( | 95 | + shared_en, user_en, prefix_en = product_enrich.create_prompt( |
| 86 | products, | 96 | products, |
| 87 | - target_lang="fr", | 97 | + target_lang="en", |
| 88 | analysis_kind="taxonomy", | 98 | analysis_kind="taxonomy", |
| 89 | ) | 99 | ) |
| 90 | 100 | ||
| 91 | assert "apparel attribute taxonomy" in shared_zh | 101 | assert "apparel attribute taxonomy" in shared_zh |
| 92 | assert "1. linen dress" in shared_zh | 102 | assert "1. linen dress" in shared_zh |
| 93 | assert "Language: Chinese" in user_zh | 103 | assert "Language: Chinese" in user_zh |
| 94 | - assert "Language: French" in user_fr | 104 | + assert "Language: English" in user_en |
| 95 | assert prefix_zh.startswith("| 序号 | 品类 | 目标性别 |") | 105 | assert prefix_zh.startswith("| 序号 | 品类 | 目标性别 |") |
| 96 | - assert prefix_fr.startswith("| No. | Product Type | Target Gender |") | 106 | + assert prefix_en.startswith("| No. | Product Type | Target Gender |") |
| 107 | + | ||
| 108 | + # Unsupported (schema, lang) must return a sentinel. French is not declared | ||
| 109 | + # by any taxonomy schema. | ||
| 110 | + assert product_enrich.create_prompt( | ||
| 111 | + products, | ||
| 112 | + target_lang="fr", | ||
| 113 | + analysis_kind="taxonomy", | ||
| 114 | + ) == (None, None, None) | ||
| 97 | 115 | ||
| 98 | 116 | ||
| 99 | def test_call_llm_logs_shared_context_once_and_verbose_contains_full_requests(): | 117 | def test_call_llm_logs_shared_context_once_and_verbose_contains_full_requests(): |
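The `(None, None, None)` sentinel contract the test pins down can be honored caller-side roughly like this; `build_prompt_or_fail` and its return shape are hypothetical illustrations, only the sentinel itself comes from the test:

```python
# Hypothetical caller-side handling of the (None, None, None) sentinel.
def build_prompt_or_fail(create_prompt, products, lang, kind):
    shared, user, prefix = create_prompt(products, target_lang=lang, analysis_kind=kind)
    if shared is None:
        # Unsupported (schema, lang): fail the batch without an LLM call.
        return {"status": "failed", "reason": f"unsupported lang {lang!r} for {kind}"}
    return {"status": "ok", "shared": shared, "user": user, "prefix": prefix}

# Stand-in for product_enrich.create_prompt, mirroring the contract only.
def fake_create_prompt(products, target_lang, analysis_kind):
    if target_lang == "fr":
        return (None, None, None)
    return ("shared", "user", "prefix")
```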
| @@ -573,7 +591,11 @@ def test_build_index_content_fields_non_apparel_taxonomy_returns_en_only(): | @@ -573,7 +591,11 @@ def test_build_index_content_fields_non_apparel_taxonomy_returns_en_only(): | ||
| 573 | seen_calls.append((analysis_kind, target_lang, category_taxonomy_profile, tuple(p["id"] for p in products))) | 591 | seen_calls.append((analysis_kind, target_lang, category_taxonomy_profile, tuple(p["id"] for p in products))) |
| 574 | if analysis_kind == "taxonomy": | 592 | if analysis_kind == "taxonomy": |
| 575 | assert category_taxonomy_profile == "toys" | 593 | assert category_taxonomy_profile == "toys" |
| 576 | - assert target_lang == "en" | 594 | + # Non-apparel taxonomy profiles only emit en; mirror the real |
| 595 | + # `analyze_products` by returning empty for unsupported langs so the | ||
| 596 | + # caller drops zh silently. | ||
| 597 | + if target_lang != "en": | ||
| 598 | + return [] | ||
| 577 | return [ | 599 | return [ |
| 578 | { | 600 | { |
| 579 | "id": products[0]["id"], | 601 | "id": products[0]["id"], |
| @@ -638,7 +660,6 @@ def test_build_index_content_fields_non_apparel_taxonomy_returns_en_only(): | @@ -638,7 +660,6 @@ def test_build_index_content_fields_non_apparel_taxonomy_returns_en_only(): | ||
| 638 | ], | 660 | ], |
| 639 | } | 661 | } |
| 640 | ] | 662 | ] |
| 641 | - assert ("taxonomy", "zh", "toys", ("2",)) not in seen_calls | ||
| 642 | assert ("taxonomy", "en", "toys", ("2",)) in seen_calls | 663 | assert ("taxonomy", "en", "toys", ("2",)) in seen_calls |
| 643 | 664 | ||
| 644 | 665 |
tests/test_product_title_exclusion.py
| @@ -6,6 +6,10 @@ from query.product_title_exclusion import ( | @@ -6,6 +6,10 @@ from query.product_title_exclusion import ( | ||
| 6 | ProductTitleExclusionRegistry, | 6 | ProductTitleExclusionRegistry, |
| 7 | ) | 7 | ) |
| 8 | 8 | ||
| 9 | +import pytest | ||
| 10 | + | ||
| 11 | +pytestmark = [pytest.mark.intent, pytest.mark.regression] | ||
| 12 | + | ||
| 9 | 13 | ||
| 10 | def test_product_title_exclusion_detector_matches_translated_english_token(): | 14 | def test_product_title_exclusion_detector_matches_translated_english_token(): |
| 11 | query_config = QueryConfig( | 15 | query_config = QueryConfig( |
tests/test_query_parser_mixed_language.py
| 1 | from config import FunctionScoreConfig, IndexConfig, QueryConfig, RerankConfig, SPUConfig, SearchConfig | 1 | from config import FunctionScoreConfig, IndexConfig, QueryConfig, RerankConfig, SPUConfig, SearchConfig |
| 2 | from query.query_parser import QueryParser | 2 | from query.query_parser import QueryParser |
| 3 | 3 | ||
| 4 | +import pytest | ||
| 5 | + | ||
| 6 | +pytestmark = [pytest.mark.query, pytest.mark.regression] | ||
| 7 | + | ||
| 4 | 8 | ||
| 5 | class _DummyTranslator: | 9 | class _DummyTranslator: |
| 6 | def translate(self, text, target_lang, source_lang, scene, model_name): | 10 | def translate(self, text, target_lang, source_lang, scene, model_name): |
tests/test_rerank_client.py
| @@ -3,6 +3,10 @@ from math import isclose | @@ -3,6 +3,10 @@ from math import isclose | ||
| 3 | from config.schema import CoarseRankFusionConfig, RerankFusionConfig | 3 | from config.schema import CoarseRankFusionConfig, RerankFusionConfig |
| 4 | from search.rerank_client import coarse_resort_hits, fuse_scores_and_resort, run_lightweight_rerank | 4 | from search.rerank_client import coarse_resort_hits, fuse_scores_and_resort, run_lightweight_rerank |
| 5 | 5 | ||
| 6 | +import pytest | ||
| 7 | + | ||
| 8 | +pytestmark = [pytest.mark.rerank, pytest.mark.regression] | ||
| 9 | + | ||
| 6 | 10 | ||
| 7 | def test_fuse_scores_and_resort_aggregates_text_components_and_keeps_rerank_primary(): | 11 | def test_fuse_scores_and_resort_aggregates_text_components_and_keeps_rerank_primary(): |
| 8 | hits = [ | 12 | hits = [ |
tests/test_rerank_provider_topn.py
| @@ -4,6 +4,10 @@ from typing import Any, Dict | @@ -4,6 +4,10 @@ from typing import Any, Dict | ||
| 4 | 4 | ||
| 5 | from providers.rerank import HttpRerankProvider | 5 | from providers.rerank import HttpRerankProvider |
| 6 | 6 | ||
| 7 | +import pytest | ||
| 8 | + | ||
| 9 | +pytestmark = [pytest.mark.rerank, pytest.mark.regression] | ||
| 10 | + | ||
| 7 | 11 | ||
| 8 | class _FakeResponse: | 12 | class _FakeResponse: |
| 9 | def __init__(self, status_code: int, data: Dict[str, Any]): | 13 | def __init__(self, status_code: int, data: Dict[str, Any]): |
tests/test_rerank_query_text.py
| @@ -2,6 +2,10 @@ | @@ -2,6 +2,10 @@ | ||
| 2 | 2 | ||
| 3 | from query.query_parser import ParsedQuery, rerank_query_text | 3 | from query.query_parser import ParsedQuery, rerank_query_text |
| 4 | 4 | ||
| 5 | +import pytest | ||
| 6 | + | ||
| 7 | +pytestmark = [pytest.mark.rerank, pytest.mark.regression] | ||
| 8 | + | ||
| 5 | 9 | ||
| 6 | def test_rerank_query_text_zh_uses_original(): | 10 | def test_rerank_query_text_zh_uses_original(): |
| 7 | assert rerank_query_text("你好", detected_language="zh", translations={"en": "hello"}) == "你好" | 11 | assert rerank_query_text("你好", detected_language="zh", translations={"en": "hello"}) == "你好" |
tests/test_reranker_dashscope_backend.py
| @@ -7,6 +7,8 @@ import pytest | @@ -7,6 +7,8 @@ import pytest | ||
| 7 | from reranker.backends import get_rerank_backend | 7 | from reranker.backends import get_rerank_backend |
| 8 | from reranker.backends.dashscope_rerank import DashScopeRerankBackend | 8 | from reranker.backends.dashscope_rerank import DashScopeRerankBackend |
| 9 | 9 | ||
| 10 | +pytestmark = [pytest.mark.rerank, pytest.mark.regression] | ||
| 11 | + | ||
| 10 | 12 | ||
| 11 | @pytest.fixture(autouse=True) | 13 | @pytest.fixture(autouse=True) |
| 12 | def _clear_global_dashscope_key(monkeypatch): | 14 | def _clear_global_dashscope_key(monkeypatch): |
tests/test_reranker_qwen3_gguf_backend.py
| @@ -6,6 +6,10 @@ import types | @@ -6,6 +6,10 @@ import types | ||
| 6 | from reranker.backends import get_rerank_backend | 6 | from reranker.backends import get_rerank_backend |
| 7 | from reranker.backends.qwen3_gguf import Qwen3GGUFRerankerBackend | 7 | from reranker.backends.qwen3_gguf import Qwen3GGUFRerankerBackend |
| 8 | 8 | ||
| 9 | +import pytest | ||
| 10 | + | ||
| 11 | +pytestmark = [pytest.mark.rerank, pytest.mark.regression] | ||
| 12 | + | ||
| 9 | 13 | ||
| 10 | class _FakeLlama: | 14 | class _FakeLlama: |
| 11 | def __init__(self, model_path: str | None = None, **kwargs): | 15 | def __init__(self, model_path: str | None = None, **kwargs): |
tests/test_reranker_server_topn.py
| @@ -4,6 +4,10 @@ from typing import Any, Dict, List | @@ -4,6 +4,10 @@ from typing import Any, Dict, List | ||
| 4 | 4 | ||
| 5 | from fastapi.testclient import TestClient | 5 | from fastapi.testclient import TestClient |
| 6 | 6 | ||
| 7 | +import pytest | ||
| 8 | + | ||
| 9 | +pytestmark = [pytest.mark.rerank, pytest.mark.regression] | ||
| 10 | + | ||
| 7 | 11 | ||
| 8 | class _FakeTopNReranker: | 12 | class _FakeTopNReranker: |
| 9 | _model_name = "fake-topn-reranker" | 13 | _model_name = "fake-topn-reranker" |
tests/test_search_evaluation_datasets.py
| 1 | from config.loader import get_app_config | 1 | from config.loader import get_app_config |
| 2 | from scripts.evaluation.eval_framework.datasets import resolve_dataset | 2 | from scripts.evaluation.eval_framework.datasets import resolve_dataset |
| 3 | 3 | ||
| 4 | +import pytest | ||
| 5 | + | ||
| 6 | +pytestmark = [pytest.mark.eval] | ||
| 7 | + | ||
| 4 | 8 | ||
| 5 | def test_search_evaluation_registry_contains_expected_datasets() -> None: | 9 | def test_search_evaluation_registry_contains_expected_datasets() -> None: |
| 6 | se = get_app_config().search_evaluation | 10 | se = get_app_config().search_evaluation |
tests/test_search_rerank_window.py
| @@ -22,6 +22,10 @@ from context import create_request_context | @@ -22,6 +22,10 @@ from context import create_request_context | ||
| 22 | from query.style_intent import DetectedStyleIntent, StyleIntentProfile | 22 | from query.style_intent import DetectedStyleIntent, StyleIntentProfile |
| 23 | from search.searcher import Searcher | 23 | from search.searcher import Searcher |
| 24 | 24 | ||
| 25 | +import pytest | ||
| 26 | + | ||
| 27 | +pytestmark = [pytest.mark.search, pytest.mark.regression] | ||
| 28 | + | ||
| 25 | 29 | ||
| 26 | @dataclass | 30 | @dataclass |
| 27 | class _FakeParsedQuery: | 31 | class _FakeParsedQuery: |
tests/test_sku_intent_selector.py
| @@ -6,6 +6,8 @@ from config import QueryConfig | @@ -6,6 +6,8 @@ from config import QueryConfig | ||
| 6 | from query.style_intent import DetectedStyleIntent, StyleIntentProfile, StyleIntentRegistry | 6 | from query.style_intent import DetectedStyleIntent, StyleIntentProfile, StyleIntentRegistry |
| 7 | from search.sku_intent_selector import StyleSkuSelector | 7 | from search.sku_intent_selector import StyleSkuSelector |
| 8 | 8 | ||
| 9 | +pytestmark = [pytest.mark.intent, pytest.mark.regression] | ||
| 10 | + | ||
| 9 | 11 | ||
| 10 | def test_style_sku_selector_matches_first_sku_by_attribute_terms(): | 12 | def test_style_sku_selector_matches_first_sku_by_attribute_terms(): |
| 11 | registry = StyleIntentRegistry.from_query_config( | 13 | registry = StyleIntentRegistry.from_query_config( |
| @@ -537,3 +539,73 @@ def test_image_pick_ignored_when_text_matches_but_visual_url_not_in_text_set(): | @@ -537,3 +539,73 @@ def test_image_pick_ignored_when_text_matches_but_visual_url_not_in_text_set(): | ||
| 537 | assert decision.selected_sku_id == "khaki" | 539 | assert decision.selected_sku_id == "khaki" |
| 538 | assert decision.final_source == "option" | 540 | assert decision.final_source == "option" |
| 539 | assert decision.image_pick_sku_id == "black" | 541 | assert decision.image_pick_sku_id == "black" |
| 542 | + | ||
| 543 | + | ||
| 544 | +def test_image_pick_matches_when_inner_hit_url_has_query_string(): | ||
| 545 | + """inner_hits 带 ?v=1,SKU 无 query —— 应用归一化后应对齐。""" | ||
| 546 | + selector = StyleSkuSelector(_color_registry()) | ||
| 547 | + parsed_query = SimpleNamespace(style_intent_profile=None) | ||
| 548 | + hits = [ | ||
| 549 | + { | ||
| 550 | + "_id": "spu-1", | ||
| 551 | + "_source": { | ||
| 552 | + "skus": [ | ||
| 553 | + { | ||
| 554 | + "sku_id": "s1", | ||
| 555 | + "image_src": "https://cdn/img/p.jpg", | ||
| 556 | + }, | ||
| 557 | + ], | ||
| 558 | + }, | ||
| 559 | + "inner_hits": { | ||
| 560 | + "exact_image_knn_query_hits": { | ||
| 561 | + "hits": { | ||
| 562 | + "hits": [ | ||
| 563 | + { | ||
| 564 | + "_score": 0.8, | ||
| 565 | + "_source": {"url": "https://cdn/img/p.jpg?width=800&quality=85"}, | ||
| 566 | + } | ||
| 567 | + ] | ||
| 568 | + } | ||
| 569 | + } | ||
| 570 | + }, | ||
| 571 | + } | ||
| 572 | + ] | ||
| 573 | + d = selector.prepare_hits(hits, parsed_query)["spu-1"] | ||
| 574 | + assert d.selected_sku_id == "s1" | ||
| 575 | + assert d.final_source == "image" | ||
| 576 | + | ||
| 577 | + | ||
| 578 | +def test_image_pick_uses_nested_offset_and_image_embedding_when_needed(): | ||
| 579 | + """_source.url 与 sku 写法不一致时,用 offset 从 image_embedding 取 canonical url。""" | ||
| 580 | + selector = StyleSkuSelector(_color_registry()) | ||
| 581 | + parsed_query = SimpleNamespace(style_intent_profile=None) | ||
| 582 | + hits = [ | ||
| 583 | + { | ||
| 584 | + "_id": "spu-1", | ||
| 585 | + "_source": { | ||
| 586 | + "image_embedding": [ | ||
| 587 | + {"url": "https://cdn/a/spu.jpg"}, | ||
| 588 | + {"url": "https://cdn/b/sku-match.jpg"}, | ||
| 589 | + ], | ||
| 590 | + "skus": [ | ||
| 591 | + {"sku_id": "sku-a", "image_src": "//cdn/b/sku-match.jpg"}, | ||
| 592 | + ], | ||
| 593 | + }, | ||
| 594 | + "inner_hits": { | ||
| 595 | + "exact_image_knn_query_hits": { | ||
| 596 | + "hits": { | ||
| 597 | + "hits": [ | ||
| 598 | + { | ||
| 599 | + "_score": 0.91, | ||
| 600 | + "_nested": {"field": "image_embedding", "offset": 1}, | ||
| 601 | + "_source": {"url": "https://wrong.example/x.jpg"}, | ||
| 602 | + } | ||
| 603 | + ] | ||
| 604 | + } | ||
| 605 | + } | ||
| 606 | + }, | ||
| 607 | + } | ||
| 608 | + ] | ||
| 609 | + d = selector.prepare_hits(hits, parsed_query)["spu-1"] | ||
| 610 | + assert d.selected_sku_id == "sku-a" | ||
| 611 | + assert d.image_pick_url == "https://cdn/b/sku-match.jpg" |
tests/test_style_intent.py
| @@ -3,6 +3,10 @@ from types import SimpleNamespace | @@ -3,6 +3,10 @@ from types import SimpleNamespace | ||
| 3 | from config import QueryConfig | 3 | from config import QueryConfig |
| 4 | from query.style_intent import StyleIntentDetector, StyleIntentRegistry | 4 | from query.style_intent import StyleIntentDetector, StyleIntentRegistry |
| 5 | 5 | ||
| 6 | +import pytest | ||
| 7 | + | ||
| 8 | +pytestmark = [pytest.mark.intent, pytest.mark.regression] | ||
| 9 | + | ||
| 6 | 10 | ||
| 7 | def test_style_intent_detector_matches_original_and_translated_queries(): | 11 | def test_style_intent_detector_matches_original_and_translated_queries(): |
| 8 | query_config = QueryConfig( | 12 | query_config = QueryConfig( |
tests/test_suggestions.py
| @@ -12,6 +12,8 @@ from suggestion.builder import ( | @@ -12,6 +12,8 @@ from suggestion.builder import ( | ||
| 12 | ) | 12 | ) |
| 13 | from suggestion.service import SuggestionService | 13 | from suggestion.service import SuggestionService |
| 14 | 14 | ||
| 15 | +pytestmark = [pytest.mark.suggestion, pytest.mark.regression] | ||
| 16 | + | ||
| 15 | 17 | ||
| 16 | class FakeESClient: | 18 | class FakeESClient: |
| 17 | """Lightweight fake ES client for suggestion unit tests.""" | 19 | """Lightweight fake ES client for suggestion unit tests.""" |
| @@ -160,7 +162,6 @@ class FakeESClient: | @@ -160,7 +162,6 @@ class FakeESClient: | ||
| 160 | return sorted([x for x in self.indices if x.startswith(prefix)]) | 162 | return sorted([x for x in self.indices if x.startswith(prefix)]) |
| 161 | 163 | ||
| 162 | 164 | ||
| 163 | -@pytest.mark.unit | ||
| 164 | def test_versioned_index_name_uses_microseconds(): | 165 | def test_versioned_index_name_uses_microseconds(): |
| 165 | build_at = datetime(2026, 4, 7, 3, 52, 26, 123456, tzinfo=timezone.utc) | 166 | build_at = datetime(2026, 4, 7, 3, 52, 26, 123456, tzinfo=timezone.utc) |
| 166 | assert ( | 167 | assert ( |
| @@ -169,7 +170,6 @@ def test_versioned_index_name_uses_microseconds(): | @@ -169,7 +170,6 @@ def test_versioned_index_name_uses_microseconds(): | ||
| 169 | ) | 170 | ) |
| 170 | 171 | ||
| 171 | 172 | ||
| 172 | -@pytest.mark.unit | ||
| 173 | def test_rebuild_cleans_up_unallocatable_new_index(): | 173 | def test_rebuild_cleans_up_unallocatable_new_index(): |
| 174 | fake_es = FakeESClient() | 174 | fake_es = FakeESClient() |
| 175 | 175 | ||
| @@ -221,7 +221,6 @@ def test_rebuild_cleans_up_unallocatable_new_index(): | @@ -221,7 +221,6 @@ def test_rebuild_cleans_up_unallocatable_new_index(): | ||
| 221 | assert created_index not in fake_es.indices | 221 | assert created_index not in fake_es.indices |
| 222 | 222 | ||
| 223 | 223 | ||
| 224 | -@pytest.mark.unit | ||
| 225 | def test_resolve_query_language_prefers_log_field(): | 224 | def test_resolve_query_language_prefers_log_field(): |
| 226 | fake_es = FakeESClient() | 225 | fake_es = FakeESClient() |
| 227 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) | 226 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) |
| @@ -238,7 +237,6 @@ def test_resolve_query_language_prefers_log_field(): | @@ -238,7 +237,6 @@ def test_resolve_query_language_prefers_log_field(): | ||
| 238 | assert conflict is False | 237 | assert conflict is False |
| 239 | 238 | ||
| 240 | 239 | ||
| 241 | -@pytest.mark.unit | ||
| 242 | def test_resolve_query_language_uses_request_params_when_log_missing(): | 240 | def test_resolve_query_language_uses_request_params_when_log_missing(): |
| 243 | fake_es = FakeESClient() | 241 | fake_es = FakeESClient() |
| 244 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) | 242 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) |
| @@ -256,7 +254,6 @@ def test_resolve_query_language_uses_request_params_when_log_missing(): | @@ -256,7 +254,6 @@ def test_resolve_query_language_uses_request_params_when_log_missing(): | ||
| 256 | assert conflict is False | 254 | assert conflict is False |
| 257 | 255 | ||
| 258 | 256 | ||
| 259 | -@pytest.mark.unit | ||
| 260 | def test_resolve_query_language_fallback_to_primary(): | 257 | def test_resolve_query_language_fallback_to_primary(): |
| 261 | fake_es = FakeESClient() | 258 | fake_es = FakeESClient() |
| 262 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) | 259 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) |
| @@ -272,7 +269,6 @@ def test_resolve_query_language_fallback_to_primary(): | @@ -272,7 +269,6 @@ def test_resolve_query_language_fallback_to_primary(): | ||
| 272 | assert conflict is False | 269 | assert conflict is False |
| 273 | 270 | ||
| 274 | 271 | ||
| 275 | -@pytest.mark.unit | ||
| 276 | def test_suggestion_service_basic_flow_uses_alias_and_routing(): | 272 | def test_suggestion_service_basic_flow_uses_alias_and_routing(): |
| 277 | from config import tenant_config_loader as tcl | 273 | from config import tenant_config_loader as tcl |
| 278 | 274 | ||
| @@ -309,7 +305,6 @@ def test_suggestion_service_basic_flow_uses_alias_and_routing(): | @@ -309,7 +305,6 @@ def test_suggestion_service_basic_flow_uses_alias_and_routing(): | ||
| 309 | assert any(x.get("index") == alias_name for x in search_calls) | 305 | assert any(x.get("index") == alias_name for x in search_calls) |
| 310 | 306 | ||
| 311 | 307 | ||
| 312 | -@pytest.mark.unit | ||
| 313 | def test_publish_alias_and_cleanup_old_versions(monkeypatch): | 308 | def test_publish_alias_and_cleanup_old_versions(monkeypatch): |
| 314 | fake_es = FakeESClient() | 309 | fake_es = FakeESClient() |
| 315 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) | 310 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) |
| @@ -338,7 +333,6 @@ def test_publish_alias_and_cleanup_old_versions(monkeypatch): | @@ -338,7 +333,6 @@ def test_publish_alias_and_cleanup_old_versions(monkeypatch): | ||
| 338 | assert "search_suggestions_tenant_162_v20260310170000" not in fake_es.indices | 333 | assert "search_suggestions_tenant_162_v20260310170000" not in fake_es.indices |
| 339 | 334 | ||
| 340 | 335 | ||
| 341 | -@pytest.mark.unit | ||
| 342 | def test_incremental_bootstrap_when_no_active_index(monkeypatch): | 336 | def test_incremental_bootstrap_when_no_active_index(monkeypatch): |
| 343 | fake_es = FakeESClient() | 337 | fake_es = FakeESClient() |
| 344 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) | 338 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) |
| @@ -363,7 +357,6 @@ def test_incremental_bootstrap_when_no_active_index(monkeypatch): | @@ -363,7 +357,6 @@ def test_incremental_bootstrap_when_no_active_index(monkeypatch): | ||
| 363 | assert result["bootstrap_result"]["mode"] == "full" | 357 | assert result["bootstrap_result"]["mode"] == "full" |
| 364 | 358 | ||
| 365 | 359 | ||
| 366 | -@pytest.mark.unit | ||
| 367 | def test_incremental_updates_existing_index(monkeypatch): | 360 | def test_incremental_updates_existing_index(monkeypatch): |
| 368 | fake_es = FakeESClient() | 361 | fake_es = FakeESClient() |
| 369 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) | 362 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) |
| @@ -419,7 +412,6 @@ def test_incremental_updates_existing_index(monkeypatch): | @@ -419,7 +412,6 @@ def test_incremental_updates_existing_index(monkeypatch): | ||
| 419 | assert len(bulk_calls[0]["actions"]) == 1 | 412 | assert len(bulk_calls[0]["actions"]) == 1 |
| 420 | 413 | ||
| 421 | 414 | ||
| 422 | -@pytest.mark.unit | ||
| 423 | def test_build_full_candidates_fallback_to_id_when_spu_id_missing(monkeypatch): | 415 | def test_build_full_candidates_fallback_to_id_when_spu_id_missing(monkeypatch): |
| 424 | fake_es = FakeESClient() | 416 | fake_es = FakeESClient() |
| 425 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) | 417 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) |
| @@ -459,7 +451,6 @@ def test_build_full_candidates_fallback_to_id_when_spu_id_missing(monkeypatch): | @@ -459,7 +451,6 @@ def test_build_full_candidates_fallback_to_id_when_spu_id_missing(monkeypatch): | ||
| 459 | assert key_to_candidate[qanchor_key].qanchor_spu_ids == {"521"} | 451 | assert key_to_candidate[qanchor_key].qanchor_spu_ids == {"521"} |
| 460 | 452 | ||
| 461 | 453 | ||
| 462 | -@pytest.mark.unit | ||
| 463 | def test_build_full_candidates_tags_and_qanchor_phrases(monkeypatch): | 454 | def test_build_full_candidates_tags_and_qanchor_phrases(monkeypatch): |
| 464 | fake_es = FakeESClient() | 455 | fake_es = FakeESClient() |
| 465 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) | 456 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) |
| @@ -509,7 +500,6 @@ def test_build_full_candidates_tags_and_qanchor_phrases(monkeypatch): | @@ -509,7 +500,6 @@ def test_build_full_candidates_tags_and_qanchor_phrases(monkeypatch): | ||
| 509 | assert ("en", "ribbed neckline") in key_to_candidate | 500 | assert ("en", "ribbed neckline") in key_to_candidate |
| 510 | 501 | ||
| 511 | 502 | ||
| 512 | -@pytest.mark.unit | ||
| 513 | def test_build_full_candidates_splits_long_title_for_suggest(monkeypatch): | 503 | def test_build_full_candidates_splits_long_title_for_suggest(monkeypatch): |
| 514 | fake_es = FakeESClient() | 504 | fake_es = FakeESClient() |
| 515 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) | 505 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) |
| @@ -542,7 +532,6 @@ def test_build_full_candidates_splits_long_title_for_suggest(monkeypatch): | @@ -542,7 +532,6 @@ def test_build_full_candidates_splits_long_title_for_suggest(monkeypatch): | ||
| 542 | assert key_to_candidate[key].text == "Furby Furblets 2-Pack" | 532 | assert key_to_candidate[key].text == "Furby Furblets 2-Pack" |
| 543 | 533 | ||
| 544 | 534 | ||
| 545 | -@pytest.mark.unit | ||
| 546 | def test_iter_products_requests_dual_sort_and_fields(): | 535 | def test_iter_products_requests_dual_sort_and_fields(): |
| 547 | fake_es = FakeESClient() | 536 | fake_es = FakeESClient() |
| 548 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) | 537 | builder = SuggestionIndexBuilder(es_client=fake_es, db_engine=None) |
tests/test_tokenization.py
```diff
@@ -1,5 +1,9 @@
 from query.tokenization import QueryTextAnalysisCache
 
+import pytest
+
+pytestmark = [pytest.mark.query]
+
 
 def test_han_coarse_tokens_follow_model_tokens_instead_of_whole_sentence():
     cache = QueryTextAnalysisCache(
```
tests/test_translation_converter_resolution.py
```diff
@@ -7,6 +7,8 @@ import pytest
 
 import translation.ct2_conversion as ct2_conversion
 
+pytestmark = [pytest.mark.translation]
+
 
 class _FakeTransformersConverter:
     def __init__(self, model_name_or_path):
```
tests/test_translation_deepl_backend.py
```diff
@@ -1,5 +1,9 @@
 from translation.backends.deepl import DeepLTranslationBackend
 
+import pytest
+
+pytestmark = [pytest.mark.translation, pytest.mark.regression]
+
 
 class _FakeResponse:
     def __init__(self, status_code, payload=None, text=""):
```
tests/test_translation_llm_backend.py
```diff
@@ -2,6 +2,10 @@ from types import SimpleNamespace
 
 from translation.backends.llm import LLMTranslationBackend
 
+import pytest
+
+pytestmark = [pytest.mark.translation, pytest.mark.regression]
+
 
 class _FakeCompletions:
     def __init__(self, responses):
```
tests/test_translation_local_backends.py
```diff
@@ -9,6 +9,8 @@ from translation.languages import build_nllb_language_catalog, resolve_nllb_lang
 from translation.service import TranslationService
 from translation.text_splitter import compute_safe_input_token_limit, split_text_for_translation
 
+pytestmark = [pytest.mark.translation, pytest.mark.regression]
+
 
 class _FakeBatch(dict):
     def to(self, device):
```
tests/test_translator_failure_semantics.py
```diff
@@ -11,6 +11,8 @@ from translation.logging_utils import (
 from translation.service import TranslationService
 from translation.settings import build_translation_config, translation_cache_probe_models
 
+pytestmark = [pytest.mark.translation, pytest.mark.regression]
+
 
 class _FakeCache:
     def __init__(self):
```
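With every translation test module now carrying `pytestmark = [pytest.mark.translation, pytest.mark.regression]`, a `-m` expression can slice the suite per subsystem, which is what `scripts/run_regression_tests.sh` relies on. A sketch of how pytest-style `-m` boolean expressions pick marked tests (a hypothetical helper for illustration, not pytest's internal evaluator):

```python
def matches(markexpr, marks):
    """Evaluate a pytest -m style expression against a test's marker names.

    Sketch only: pytest treats the expression as boolean logic over marker
    names; here we emulate that by eval'ing it against a name -> bool map.
    """
    words = markexpr.replace("(", " ").replace(")", " ").split()
    names = {w for w in words if w not in {"and", "or", "not"}}
    env = {name: (name in marks) for name in names}
    return bool(eval(markexpr, {"__builtins__": {}}, env))


# A deepl-backend test carries both markers, so the combined filter hits it.
assert matches("translation and regression", {"translation", "regression"})
# A query-subsystem test is excluded by the same filter.
assert not matches("translation and regression", {"query"})
# Negation works the same way, e.g. skipping manual-only tests.
assert matches("not manual", {"translation", "regression"})
```

This is why the double marker matters: `-m translation` runs the whole subsystem, while `-m "translation and regression"` narrows it to the registered regression anchors.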
translation/prompts.py
```diff
@@ -30,6 +30,18 @@ TRANSLATION_PROMPTS: Dict[str, Dict[str, str]] = {
         "it": "Sei un traduttore ecommerce da {source_lang} ({src_lang_code}) a {target_lang} ({tgt_lang_code}). Traduce in un nome SKU prodotto {target_lang} conciso e accurato, restituisci solo il risultato: {text}",
         "pt": "Você é um tradutor de e-commerce de {source_lang} ({src_lang_code}) para {target_lang} ({tgt_lang_code}). Traduza para um nome SKU de produto {target_lang} conciso e preciso, produza apenas o resultado: {text}",
     },
+    "sku_attribute": {
+        "zh": "你是一名专业的 {source_lang}({src_lang_code})到 {target_lang}({tgt_lang_code})电商翻译专家,请将原文翻译为{target_lang}商品SKU属性值(如颜色、尺码、材质等),要求简洁准确、符合属性展示习惯,只输出结果:{text}",
+        "en": "You are a professional {source_lang} ({src_lang_code}) to {target_lang} ({tgt_lang_code}) ecommerce translator. Translate into concise {target_lang} product SKU attribute values (e.g. color, size, material), suitable for attribute display, output only the result: {text}",
+        "ru": "Вы переводчик e-commerce с {source_lang} ({src_lang_code}) на {target_lang} ({tgt_lang_code}). Переведите в краткие и точные значения атрибутов SKU на {target_lang} (цвет, размер, материал и т.п.), выводите только результат: {text}",
+        "ar": "أنت مترجم تجارة إلكترونية من {source_lang} ({src_lang_code}) إلى {target_lang} ({tgt_lang_code}). ترجم إلى قيم سمات SKU للمنتج بلغة {target_lang} (مثل اللون والمقاس والخامة) بإيجاز ودقة، وأخرج النتيجة فقط: {text}",
+        "ja": "{source_lang}({src_lang_code})から {target_lang}({tgt_lang_code})へのEC翻訳者として、商品SKUの属性値(色・サイズ・素材など)に簡潔かつ正確に翻訳し、結果のみ出力してください:{text}",
+        "es": "Eres un traductor ecommerce de {source_lang} ({src_lang_code}) a {target_lang} ({tgt_lang_code}). Traduce a valores de atributo SKU de producto en {target_lang} (color, talla, material, etc.), concisos y precisos, devuelve solo el resultado: {text}",
+        "de": "Du bist ein E-Commerce-Übersetzer von {source_lang} ({src_lang_code}) nach {target_lang} ({tgt_lang_code}). Übersetze in präzise {target_lang} SKU-Produktattributwerte (z. B. Farbe, Größe, Material), nur Ergebnis ausgeben: {text}",
+        "fr": "Vous êtes un traducteur e-commerce de {source_lang} ({src_lang_code}) vers {target_lang} ({tgt_lang_code}). Traduisez en valeurs d'attributs SKU produit {target_lang} (couleur, taille, matière, etc.), concises et précises, sortie uniquement : {text}",
+        "it": "Sei un traduttore ecommerce da {source_lang} ({src_lang_code}) a {target_lang} ({tgt_lang_code}). Traduci in valori di attributo SKU prodotto {target_lang} (colore, taglia, materiale, ecc.), concisi e accurati, restituisci solo il risultato: {text}",
+        "pt": "Você é um tradutor de e-commerce de {source_lang} ({src_lang_code}) para {target_lang} ({tgt_lang_code}). Traduza para valores de atributo SKU de produto em {target_lang} (cor, tamanho, material etc.), concisos e precisos, produza apenas o resultado: {text}",
+    },
     "ecommerce_search_query": {
         "zh": "你是一名专业的 {source_lang}({src_lang_code})到 {target_lang}({tgt_lang_code})翻译助手,请将电商搜索词准确翻译为{target_lang}并符合搜索习惯,只输出结果:{text}",
         "en": "You are a professional {source_lang} ({src_lang_code}) to {target_lang} ({tgt_lang_code}) translator. Translate the ecommerce search query accurately following {target_lang} search habits, output only the result: {text}",
@@ -113,6 +125,39 @@ BATCH_TRANSLATION_PROMPTS: Dict[str, Dict[str, str]] = {
             "Входные данные:\n{text}"
         ),
     },
+    "sku_attribute": {
+        "en": (
+            "Translate each item from {source_lang} ({src_lang_code}) to concise {target_lang} ({tgt_lang_code}) "
+            "product SKU attribute values (e.g. color, size, material).\n"
+            "Accurately preserve the meaning; keep wording short and suitable for attribute display.\n"
+            "Output exactly one line for each input item, in the same order, using this exact format:\n"
+            "1. translation\n"
+            "2. translation\n"
+            "...\n"
+            "Do not explain or output anything else.\n"
+            "Input:\n{text}"
+        ),
+        "zh": (
+            "将每一项从 {source_lang} ({src_lang_code}) 翻译为简洁的 {target_lang} ({tgt_lang_code}) 商品SKU属性值(如颜色、尺码、材质等)。\n"
+            "准确传达含义,措辞简短,适合属性展示。\n"
+            "请按输入顺序逐行输出,每个输入对应一行,格式必须如下:\n"
+            "1. 翻译结果\n"
+            "2. 翻译结果\n"
+            "...\n"
+            "不要解释或输出其他任何内容。\n"
+            "输入:\n{text}"
+        ),
+        "ru": (
+            "Переведите каждый элемент с {source_lang} ({src_lang_code}) на краткие значения атрибутов SKU на {target_lang} ({tgt_lang_code}) (цвет, размер, материал и т.п.).\n"
+            "Точно сохраняйте смысл; формулировки должны быть короткими и подходить для отображения атрибутов.\n"
+            "Выводите ровно по одной строке для каждого входного элемента в том же порядке, в следующем формате:\n"
+            "1. перевод\n"
+            "2. перевод\n"
+            "...\n"
+            "Не добавляйте объяснений и ничего лишнего.\n"
+            "Входные данные:\n{text}"
+        ),
+    },
     "ecommerce_search_query": {
         "en": (
             "Translate each item from {source_lang} ({src_lang_code}) to a natural {target_lang} ({tgt_lang_code}) "
```
translation/scenes.py
```diff
@@ -18,6 +18,10 @@ SCENE_DEEPL_CONTEXTS: Dict[str, Dict[str, str]] = {
         "zh": "电商搜索词",
         "en": "e-commerce search query",
     },
+    "sku_attribute": {
+        "zh": "商品SKU属性值",
+        "en": "product SKU attribute value",
+    },
 }
 
 
```
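The new `sku_attribute` entry gives the DeepL backend a short per-language context string for the scene. A sketch of the lookup this table implies (hypothetical helper; the actual resolution lives in the DeepL backend), where an unknown scene or language degrades to "no context" rather than an error:

```python
from typing import Dict, Optional

# Mirrors the shape of SCENE_DEEPL_CONTEXTS in translation/scenes.py.
SCENE_DEEPL_CONTEXTS: Dict[str, Dict[str, str]] = {
    "sku_attribute": {
        "zh": "商品SKU属性值",
        "en": "product SKU attribute value",
    },
}


def deepl_context(scene: str, lang: str) -> Optional[str]:
    # Missing scene or language falls through to None instead of raising,
    # so the request is simply sent without a context hint.
    return SCENE_DEEPL_CONTEXTS.get(scene, {}).get(lang)


assert deepl_context("sku_attribute", "en") == "product SKU attribute value"
assert deepl_context("sku_attribute", "fr") is None  # language not covered
assert deepl_context("unknown_scene", "en") is None  # scene not covered
```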