Commit 47452e1dd6cd19314e6f867e4b5d346ddbc99651
1 parent
317c5d2c
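feat(search): support configurable exact vector rescoring (exact rescore) to fix missing ANN scores within the top N
Changes
1. **New configuration options** (`config/config.yaml`)
   - `exact_knn_rescore_enabled`: whether exact vector rescoring is enabled; defaults to true
   - `exact_knn_rescore_window`: rescore window size; defaults to 160 (decoupled from rerank_window and configurable independently)
2. **ES query layer changes** (`search/searcher.py`)
   - In the first ES search, a rescore phase is injected for the documents within window_size, depending on configuration
   - The rescore_query contains two named script_score clauses:
     - `exact_text_knn_query`: exact dot product over the text vector
     - `exact_image_knn_query`: exact dot product over the image vector
   - Currently uses `score_mode=total` with `rescore_query_weight=0.0`, so it **only fills in scores and does not change the ranking**; the exact scores appear only in `matched_queries`
3. **Unified vector score boost logic** (`search/es_query_builder.py`)
   - Adds `_get_knn_plan()`, which centralizes the boost computation rules for text/image KNN
   - For long queries (token count above the threshold), the text boost is additionally multiplied by 1.4
   - Exact rescore and ANN recall **share the same boost rules**, which keeps the score scales consistent
   - The existing ANN query construction logic is migrated to this unified entry point
4. **Score priority adjustment in the fusion stage** (`search/rerank_client.py`)
   - `_build_hit_signal_bundle()` handles vector score reading in one place
   - Prefers reading `exact_text_knn_query` / `exact_image_knn_query` from `matched_queries`
   - If they are absent, falls back to the original `knn_query` / `image_knn_query` (ANN scores)
   - Covers the coarse_rank, fine_rank, and rerank stages, avoiding duplicated patches
5. **Test coverage**
   - `tests/test_es_query_builder.py`: verifies that ANN and exact share the same boost rules
   - `tests/test_search_rerank_window.py`: verifies that the rescore window and named queries are injected correctly
   - `tests/test_rerank_client.py`: verifies the exact-first, ANN-fallback logic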
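
The query-layer and boost changes in items 2 and 3 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code in `search/searcher.py` or `search/es_query_builder.py`: the function names `knn_boosts` and `build_exact_rescore`, their parameters, and the `exists` inner query are hypothetical; the query names, field names, and default values are taken from this commit message.

```python
from typing import Any, Dict, List, Optional, Sequence


def knn_boosts(token_count: int, text_boost: float = 4.0, image_boost: float = 4.0,
               long_threshold: int = 8, long_factor: float = 1.4) -> Dict[str, float]:
    """Hypothetical stand-in for _get_knn_plan(): one boost rule for ANN and exact."""
    if token_count > long_threshold:  # long query: amplify the text boost only
        text_boost *= long_factor
    return {"text": text_boost, "image": image_boost}


def _exact_clause(name: str, field: str, vector: Sequence[float],
                  boost: float) -> Dict[str, Any]:
    # Same normalization as the Painless script in this commit, times the boost.
    source = f"(dotProduct(params.query_vector, '{field}') + 1.0) / 2.0 * params.boost"
    return {"script_score": {
        # Assumption: only score documents that actually carry the vector field.
        "query": {"exists": {"field": field}},
        "script": {"source": source,
                   "params": {"query_vector": list(vector), "boost": boost}},
        "_name": name,
    }}


def build_exact_rescore(text_vector: Optional[Sequence[float]],
                        image_vector: Optional[Sequence[float]],
                        token_count: int, window: int = 160) -> Optional[Dict[str, Any]]:
    """Build the `rescore` section of the search body, or None if there is no vector."""
    boosts = knn_boosts(token_count)
    should: List[Dict[str, Any]] = []
    if text_vector is not None:
        should.append(_exact_clause("exact_text_knn_query", "title_embedding",
                                    text_vector, boosts["text"]))
    if image_vector is not None:
        should.append(_exact_clause("exact_image_knn_query", "image_embedding.vector",
                                    image_vector, boosts["image"]))
    if not should:
        return None
    return {"window_size": window, "query": {
        "rescore_query": {"bool": {"should": should}},
        "query_weight": 1.0,
        "rescore_query_weight": 0.0,  # contributes nothing to the final score
        "score_mode": "total",
    }}
```

With `score_mode=total`, the final score is `query_weight * original + rescore_query_weight * rescore`, so a weight of 0.0 leaves the ordering untouched while the named clauses still report their scores.
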
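Technical details
- **Exact vector scoring script** (Painless)
  ```painless
  // Text: (dotProduct + 1.0) / 2.0
  (dotProduct(params.query_vector, 'title_embedding') + 1.0) / 2.0
  // Same for images, with the field 'image_embedding.vector'
  ```
  The result is multiplied by the unified boost (from the `knn_text_boost` / `knn_image_boost` settings and the long-query amplification factor).
- **Named query retention**
  - The main query already sets `include_named_queries_score: true`
  - Script scores named in the rescore phase are merged into each hit's `matched_queries`
  - They are extracted by name via `_extract_named_score()`, exactly the same way the original ANN scores are accessed
- **Performance impact** (top160, 6 real queries, average of 3 rounds after warm-up)
  - `elasticsearch_search_primary` latency: 124.71ms → 136.60ms (+11.89ms, +9.53%)
  - `total_search` is heavily affected by jitter in other components and is not used as the main reference
  - This overhead is within an acceptable range; no timeouts or resource bottlenecks were observed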
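
The scoring formula and the exact-first read path described above can be mirrored in plain Python. This is a minimal sketch, assuming that `matched_queries` arrives as a name-to-score mapping when `include_named_queries_score` is on; `extract_named_score` and `vector_score` are hypothetical stand-ins, not the actual `_extract_named_score()` or `_build_hit_signal_bundle()`, and returning 0.0 when both scores are missing is an assumption.

```python
from typing import Any, Dict, Optional, Sequence


def exact_knn_score(query_vector: Sequence[float], doc_vector: Sequence[float],
                    boost: float = 1.0) -> float:
    """Python mirror of the Painless script: boost * (dot + 1) / 2.

    For unit-normalized vectors the dot product lies in [-1, 1],
    so the unboosted score lies in [0, 1].
    """
    dot = sum(q * d for q, d in zip(query_vector, doc_vector))
    return boost * (dot + 1.0) / 2.0


def extract_named_score(hit: Dict[str, Any], name: str) -> Optional[float]:
    """Return the score of a named query, or None if it did not match."""
    matched = hit.get("matched_queries")
    if isinstance(matched, dict) and name in matched:
        return float(matched[name])
    return None  # a plain list of names carries no scores


def vector_score(hit: Dict[str, Any], exact_name: str, ann_name: str) -> float:
    """Prefer the exact score; fall back to the ANN score; else 0.0 (assumption)."""
    exact = extract_named_score(hit, exact_name)
    if exact is not None:
        return exact
    ann = extract_named_score(hit, ann_name)
    return ann if ann is not None else 0.0
```

For example, `vector_score(hit, "exact_text_knn_query", "knn_query")` would be called once for text and once more with the image query names.
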
Configuration example
```yaml
search:
  exact_knn_rescore_enabled: true
  exact_knn_rescore_window: 160
  knn_text_boost: 4.0
  knn_image_boost: 4.0
  long_query_token_threshold: 8
  long_query_text_boost_factor: 1.4
```
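
Reading the two new switches with their stated defaults could look like the sketch below. It follows the `.get(key, default)` pattern visible in the `config/loader.py` diff, but `ExactRescoreSettings` and `load_exact_rescore` are hypothetical names. Note that the example above nests the keys under `search`, while the `config/config.yaml` diff in this commit shows them under `rerank`, so the caller passes in whichever section applies.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ExactRescoreSettings:
    """Hypothetical container for the two new options."""
    enabled: bool = True   # exact_knn_rescore_enabled, default true
    window: int = 160      # exact_knn_rescore_window, default 160


def load_exact_rescore(section: Mapping[str, Any]) -> ExactRescoreSettings:
    """Parse the options from an already-loaded YAML mapping."""
    return ExactRescoreSettings(
        enabled=bool(section.get("exact_knn_rescore_enabled", True)),
        window=int(section.get("exact_knn_rescore_window", 160)),
    )
```

Keeping `window` separate from `rerank_window` reflects the decoupling described in the commit message: the rescore window can be tuned without touching the rerank stage.
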
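Known issues and follow-up plan
- Tuning experiments on the current version show that, with exact rescore enabled, the primary metric for some queries (strong type constraints plus many similar styles/colors) drops by about 0.031 relative to the baseline (exact=false): 0.6009 → 0.5697
- Root cause: exact turns KNN from a sparse auxiliary signal into a dense ranking factor, which changes the ranking semantics of the coarse stage; adjusting the existing `knn_bias/exponent` alone cannot fully recover the metric
- Follow-up direction: **do not force exact in the coarse stage for now** and let only fine/rerank prefer exact; or have coarse adopt an "ANN first, exact only fills in missing scores" strategy, then re-evaluate
Related files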
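
The second follow-up option can be sketched as a selection switch in front of the fusion term. This is an illustration only: the term `(max(score, 0) + bias) ** exponent` comes from the fusion comment in `config/config.yaml`, while `knn_factor`, `coarse_knn_score`, and the `ann_first` flag are hypothetical and do not exist in this commit.

```python
from typing import Optional


def knn_factor(score: float, bias: float, exponent: float) -> float:
    """One multiplicative fusion term: (max(score, 0) + bias) ** exponent."""
    return (max(score, 0.0) + bias) ** exponent


def coarse_knn_score(ann: Optional[float], exact: Optional[float],
                     ann_first: bool = True) -> float:
    """Pick the KNN score fed into coarse fusion.

    ann_first=True  -> "ANN first, exact only fills in missing scores"
    ann_first=False -> exact first with ANN fallback (behaviour of this commit)
    """
    primary, fallback = (ann, exact) if ann_first else (exact, ann)
    if primary is not None:
        return primary
    return fallback if fallback is not None else 0.0
```

Under ANN-only scoring, every hit outside the ANN top-k gets the same constant factor `bias ** exponent`; once exact scores are present for the whole window, each hit gets its own factor, which is the sparse-to-dense shift named as the root cause.
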
- `config/config.yaml`
- `search/searcher.py`
- `search/es_query_builder.py`
- `search/rerank_client.py`
- `tests/test_es_query_builder.py`
- `tests/test_search_rerank_window.py`
- `tests/test_rerank_client.py`
- `scripts/evaluation/exact_rescore_coarse_tuning_round2.json` (tuning experiment record)
Showing 10 changed files with 643 additions and 210 deletions
config/config.yaml
| 1 | -# Unified Configuration for Multi-Tenant Search Engine | ||
| 2 | -# 统一配置文件,所有租户共用一套配置 | ||
| 3 | -# 注意:索引结构由 mappings/search_products.json 定义,此文件只配置搜索行为 | ||
| 4 | -# | ||
| 5 | -# 约定:下列键为必填;进程环境变量可覆盖 infrastructure / runtime 中同名语义项 | ||
| 6 | -#(如 ES_HOST、API_PORT 等),未设置环境变量时使用本文件中的值。 | ||
| 7 | - | ||
| 8 | -# Process / bind addresses (环境变量 APP_ENV、RUNTIME_ENV、ES_INDEX_NAMESPACE 可覆盖前两者的语义) | ||
| 9 | runtime: | 1 | runtime: |
| 10 | environment: prod | 2 | environment: prod |
| 11 | index_namespace: '' | 3 | index_namespace: '' |
| @@ -21,8 +13,6 @@ runtime: | @@ -21,8 +13,6 @@ runtime: | ||
| 21 | translator_port: 6006 | 13 | translator_port: 6006 |
| 22 | reranker_host: 0.0.0.0 | 14 | reranker_host: 0.0.0.0 |
| 23 | reranker_port: 6007 | 15 | reranker_port: 6007 |
| 24 | - | ||
| 25 | -# 基础设施连接(敏感项优先读环境变量:ES_*、REDIS_*、DB_*、DASHSCOPE_API_KEY、DEEPL_AUTH_KEY) | ||
| 26 | infrastructure: | 16 | infrastructure: |
| 27 | elasticsearch: | 17 | elasticsearch: |
| 28 | host: http://localhost:9200 | 18 | host: http://localhost:9200 |
| @@ -49,23 +39,12 @@ infrastructure: | @@ -49,23 +39,12 @@ infrastructure: | ||
| 49 | secrets: | 39 | secrets: |
| 50 | dashscope_api_key: null | 40 | dashscope_api_key: null |
| 51 | deepl_auth_key: null | 41 | deepl_auth_key: null |
| 52 | - | ||
| 53 | -# Elasticsearch Index | ||
| 54 | es_index_name: search_products | 42 | es_index_name: search_products |
| 55 | - | ||
| 56 | -# 检索域 / 索引列表(可为空列表;每项字段均需显式给出) | ||
| 57 | indexes: [] | 43 | indexes: [] |
| 58 | - | ||
| 59 | -# Config assets | ||
| 60 | assets: | 44 | assets: |
| 61 | query_rewrite_dictionary_path: config/dictionaries/query_rewrite.dict | 45 | query_rewrite_dictionary_path: config/dictionaries/query_rewrite.dict |
| 62 | - | ||
| 63 | -# Product content understanding (LLM enrich-content) configuration | ||
| 64 | product_enrich: | 46 | product_enrich: |
| 65 | max_workers: 40 | 47 | max_workers: 40 |
| 66 | - | ||
| 67 | -# 离线 / Web 相关性评估(scripts/evaluation、eval-web) | ||
| 68 | -# CLI 未显式传参时使用此处默认值;search_base_url 未配置时自动为 http://127.0.0.1:{runtime.api_port} | ||
| 69 | search_evaluation: | 48 | search_evaluation: |
| 70 | artifact_root: artifacts/search_evaluation | 49 | artifact_root: artifacts/search_evaluation |
| 71 | queries_file: scripts/evaluation/queries/queries.txt | 50 | queries_file: scripts/evaluation/queries/queries.txt |
| @@ -98,23 +77,18 @@ search_evaluation: | @@ -98,23 +77,18 @@ search_evaluation: | ||
| 98 | rebuild_irrelevant_stop_ratio: 0.799 | 77 | rebuild_irrelevant_stop_ratio: 0.799 |
| 99 | rebuild_irrel_low_combined_stop_ratio: 0.959 | 78 | rebuild_irrel_low_combined_stop_ratio: 0.959 |
| 100 | rebuild_irrelevant_stop_streak: 3 | 79 | rebuild_irrelevant_stop_streak: 3 |
| 101 | - | ||
| 102 | -# ES Index Settings (基础设置) | ||
| 103 | es_settings: | 80 | es_settings: |
| 104 | number_of_shards: 1 | 81 | number_of_shards: 1 |
| 105 | number_of_replicas: 0 | 82 | number_of_replicas: 0 |
| 106 | refresh_interval: 30s | 83 | refresh_interval: 30s |
| 107 | 84 | ||
| 108 | -# 字段权重配置(用于搜索时的字段boost) | ||
| 109 | -# 统一按“字段基名”配置;查询时按实际检索语言动态拼接 .{lang}。 | ||
| 110 | -# 若需要按某个语言单独调权,也可以加显式 key(例如 title.de: 3.2)。 | 85 | +# 统一按“字段基名”配置;查询时按实际检索语言动态拼接 .{lang} |
| 111 | field_boosts: | 86 | field_boosts: |
| 112 | title: 3.0 | 87 | title: 3.0 |
| 113 | # qanchors enriched_tags 在 enriched_attributes.value中也存在,所以其实他的权重为自身权重+enriched_attributes.value的权重 | 88 | # qanchors enriched_tags 在 enriched_attributes.value中也存在,所以其实他的权重为自身权重+enriched_attributes.value的权重 |
| 114 | qanchors: 1.0 | 89 | qanchors: 1.0 |
| 115 | enriched_tags: 1.0 | 90 | enriched_tags: 1.0 |
| 116 | enriched_attributes.value: 1.5 | 91 | enriched_attributes.value: 1.5 |
| 117 | - # enriched_taxonomy_attributes.value: 0.3 | ||
| 118 | category_name_text: 2.0 | 92 | category_name_text: 2.0 |
| 119 | category_path: 2.0 | 93 | category_path: 2.0 |
| 120 | keywords: 2.0 | 94 | keywords: 2.0 |
| @@ -126,38 +100,25 @@ field_boosts: | @@ -126,38 +100,25 @@ field_boosts: | ||
| 126 | description: 1.0 | 100 | description: 1.0 |
| 127 | vendor: 1.0 | 101 | vendor: 1.0 |
| 128 | 102 | ||
| 129 | -# Query Configuration(查询配置) | ||
| 130 | query_config: | 103 | query_config: |
| 131 | - # 支持的语言 | ||
| 132 | supported_languages: | 104 | supported_languages: |
| 133 | - zh | 105 | - zh |
| 134 | - en | 106 | - en |
| 135 | default_language: en | 107 | default_language: en |
| 136 | - | ||
| 137 | - # 功能开关(翻译开关由tenant_config控制) | ||
| 138 | enable_text_embedding: true | 108 | enable_text_embedding: true |
| 139 | enable_query_rewrite: true | 109 | enable_query_rewrite: true |
| 140 | 110 | ||
| 141 | - # 查询翻译模型(须与 services.translation.capabilities 中某项一致) | ||
| 142 | - # 源语种在租户 index_languages 内:主召回可打在源语种字段,用下面三项。 | ||
| 143 | - zh_to_en_model: nllb-200-distilled-600m # "opus-mt-zh-en" | ||
| 144 | - en_to_zh_model: nllb-200-distilled-600m # "opus-mt-en-zh" | ||
| 145 | - default_translation_model: nllb-200-distilled-600m | ||
| 146 | - # zh_to_en_model: deepl | ||
| 147 | - # en_to_zh_model: deepl | ||
| 148 | - # default_translation_model: deepl | ||
| 149 | - # 源语种不在 index_languages:翻译对可检索文本更关键,可单独指定(缺省则与上一组相同) | ||
| 150 | - zh_to_en_model__source_not_in_index: nllb-200-distilled-600m | ||
| 151 | - en_to_zh_model__source_not_in_index: nllb-200-distilled-600m | ||
| 152 | - default_translation_model__source_not_in_index: nllb-200-distilled-600m | ||
| 153 | - # zh_to_en_model__source_not_in_index: deepl | ||
| 154 | - # en_to_zh_model__source_not_in_index: deepl | ||
| 155 | - # default_translation_model__source_not_in_index: deepl | 111 | + zh_to_en_model: deepl # nllb-200-distilled-600m |
| 112 | + en_to_zh_model: deepl | ||
| 113 | + default_translation_model: deepl | ||
| 114 | + # 源语种不在 index_languages时翻译质量比较重要,因此单独配置 | ||
| 115 | + zh_to_en_model__source_not_in_index: deepl | ||
| 116 | + en_to_zh_model__source_not_in_index: deepl | ||
| 117 | + default_translation_model__source_not_in_index: deepl | ||
| 156 | 118 | ||
| 157 | - # 查询解析阶段:翻译与 query 向量并发执行,共用同一等待预算(毫秒)。 | ||
| 158 | - # 检测语言已在租户 index_languages 内:较短;不在索引语言内:较长(翻译对召回更关键)。 | ||
| 159 | - translation_embedding_wait_budget_ms_source_in_index: 300 # 80 | ||
| 160 | - translation_embedding_wait_budget_ms_source_not_in_index: 400 # 200 | 119 | + # 查询解析阶段:翻译与 query 向量并发执行,共用同一等待预算(毫秒) |
| 120 | + translation_embedding_wait_budget_ms_source_in_index: 300 | ||
| 121 | + translation_embedding_wait_budget_ms_source_not_in_index: 400 | ||
| 161 | style_intent: | 122 | style_intent: |
| 162 | enabled: true | 123 | enabled: true |
| 163 | selected_sku_boost: 1.2 | 124 | selected_sku_boost: 1.2 |
| @@ -184,11 +145,8 @@ query_config: | @@ -184,11 +145,8 @@ query_config: | ||
| 184 | product_title_exclusion: | 145 | product_title_exclusion: |
| 185 | enabled: true | 146 | enabled: true |
| 186 | dictionary_path: config/dictionaries/product_title_exclusion.tsv | 147 | dictionary_path: config/dictionaries/product_title_exclusion.tsv |
| 187 | - | ||
| 188 | - # 动态多语言检索字段配置 | ||
| 189 | - # multilingual_fields 会被拼成 title.{lang}/brief.{lang}/... 形式; | ||
| 190 | - # shared_fields 为无语言后缀字段。 | ||
| 191 | search_fields: | 148 | search_fields: |
| 149 | + # 统一按“字段基名”配置;查询时按实际检索语言动态拼接 .{lang} | ||
| 192 | multilingual_fields: | 150 | multilingual_fields: |
| 193 | - title | 151 | - title |
| 194 | - keywords | 152 | - keywords |
| @@ -205,13 +163,14 @@ query_config: | @@ -205,13 +163,14 @@ query_config: | ||
| 205 | # - description | 163 | # - description |
| 206 | # - vendor | 164 | # - vendor |
| 207 | # shared_fields: 无语言后缀字段;示例: tags, option1_values, option2_values, option3_values | 165 | # shared_fields: 无语言后缀字段;示例: tags, option1_values, option2_values, option3_values |
| 166 | + | ||
| 208 | shared_fields: null | 167 | shared_fields: null |
| 209 | core_multilingual_fields: | 168 | core_multilingual_fields: |
| 210 | - title | 169 | - title |
| 211 | - qanchors | 170 | - qanchors |
| 212 | - category_name_text | 171 | - category_name_text |
| 213 | 172 | ||
| 214 | - # 统一文本召回策略(主查询 + 翻译查询) | 173 | + # 文本召回(主查询 + 翻译查询) |
| 215 | text_query_strategy: | 174 | text_query_strategy: |
| 216 | base_minimum_should_match: 60% | 175 | base_minimum_should_match: 60% |
| 217 | translation_minimum_should_match: 60% | 176 | translation_minimum_should_match: 60% |
| @@ -226,14 +185,10 @@ query_config: | @@ -226,14 +185,10 @@ query_config: | ||
| 226 | title: 5.0 | 185 | title: 5.0 |
| 227 | qanchors: 4.0 | 186 | qanchors: 4.0 |
| 228 | phrase_match_boost: 3.0 | 187 | phrase_match_boost: 3.0 |
| 229 | - | ||
| 230 | - # Embedding字段名称 | ||
| 231 | text_embedding_field: title_embedding | 188 | text_embedding_field: title_embedding |
| 232 | image_embedding_field: image_embedding.vector | 189 | image_embedding_field: image_embedding.vector |
| 233 | 190 | ||
| 234 | - # 返回字段配置(_source includes) | ||
| 235 | - # null表示返回所有字段,[]表示不返回任何字段,列表表示只返回指定字段 | ||
| 236 | - # 下列字段与 api/result_formatter.py(SpuResult 填充)及 search/searcher.py(SKU 排序/主图替换)一致 | 191 | + # null表示返回所有字段,[]表示不返回任何字段 |
| 237 | source_fields: | 192 | source_fields: |
| 238 | - spu_id | 193 | - spu_id |
| 239 | - handle | 194 | - handle |
| @@ -255,6 +210,7 @@ query_config: | @@ -255,6 +210,7 @@ query_config: | ||
| 255 | # - enriched_tags | 210 | # - enriched_tags |
| 256 | # - enriched_attributes | 211 | # - enriched_attributes |
| 257 | # - # enriched_taxonomy_attributes.value | 212 | # - # enriched_taxonomy_attributes.value |
| 213 | + | ||
| 258 | - min_price | 214 | - min_price |
| 259 | - compare_at_price | 215 | - compare_at_price |
| 260 | - image_url | 216 | - image_url |
| @@ -274,22 +230,17 @@ query_config: | @@ -274,22 +230,17 @@ query_config: | ||
| 274 | # KNN:文本向量与多模态(图片)向量各自 boost 与召回(k / num_candidates) | 230 | # KNN:文本向量与多模态(图片)向量各自 boost 与召回(k / num_candidates) |
| 275 | knn_text_boost: 4 | 231 | knn_text_boost: 4 |
| 276 | knn_image_boost: 4 | 232 | knn_image_boost: 4 |
| 277 | - | ||
| 278 | - # knn_text_num_candidates = k * 3.4 | ||
| 279 | knn_text_k: 160 | 233 | knn_text_k: 160 |
| 280 | - knn_text_num_candidates: 560 | 234 | + knn_text_num_candidates: 560 # k * 3.4 |
| 281 | knn_text_k_long: 400 | 235 | knn_text_k_long: 400 |
| 282 | knn_text_num_candidates_long: 1200 | 236 | knn_text_num_candidates_long: 1200 |
| 283 | knn_image_k: 400 | 237 | knn_image_k: 400 |
| 284 | knn_image_num_candidates: 1200 | 238 | knn_image_num_candidates: 1200 |
| 285 | 239 | ||
| 286 | -# Function Score配置(ES层打分规则) | ||
| 287 | function_score: | 240 | function_score: |
| 288 | score_mode: sum | 241 | score_mode: sum |
| 289 | boost_mode: multiply | 242 | boost_mode: multiply |
| 290 | functions: [] | 243 | functions: [] |
| 291 | - | ||
| 292 | -# 粗排配置(仅融合 ES 文本/向量信号,不调用模型) | ||
| 293 | coarse_rank: | 244 | coarse_rank: |
| 294 | enabled: true | 245 | enabled: true |
| 295 | input_window: 480 | 246 | input_window: 480 |
| @@ -305,24 +256,20 @@ coarse_rank: | @@ -305,24 +256,20 @@ coarse_rank: | ||
| 305 | knn_text_weight: 1.0 | 256 | knn_text_weight: 1.0 |
| 306 | knn_image_weight: 2.0 | 257 | knn_image_weight: 2.0 |
| 307 | knn_tie_breaker: 0.3 | 258 | knn_tie_breaker: 0.3 |
| 308 | - knn_bias: 0.6 | ||
| 309 | - knn_exponent: 0.4 | ||
| 310 | - | ||
| 311 | -# 精排配置(轻量 reranker) | ||
| 312 | -# enabled=false 时仍进入 fine 阶段,但保序透传,不调用 fine 模型服务 | 259 | + knn_bias: 0.0 |
| 260 | + knn_exponent: 5.6 | ||
| 261 | + knn_text_exponent: 0.0 | ||
| 262 | + knn_image_exponent: 0.0 | ||
| 313 | fine_rank: | 263 | fine_rank: |
| 314 | - enabled: false | 264 | + enabled: false # false 时保序透传 |
| 315 | input_window: 160 | 265 | input_window: 160 |
| 316 | output_window: 80 | 266 | output_window: 80 |
| 317 | timeout_sec: 10.0 | 267 | timeout_sec: 10.0 |
| 318 | rerank_query_template: '{query}' | 268 | rerank_query_template: '{query}' |
| 319 | rerank_doc_template: '{title}' | 269 | rerank_doc_template: '{title}' |
| 320 | service_profile: fine | 270 | service_profile: fine |
| 321 | - | ||
| 322 | -# 重排配置(provider/URL 在 services.rerank) | ||
| 323 | -# enabled=false 时仍进入 rerank 阶段,但保序透传,不调用最终 rerank 服务 | ||
| 324 | rerank: | 271 | rerank: |
| 325 | - enabled: true | 272 | + enabled: false # false 时保序透传 |
| 326 | rerank_window: 160 | 273 | rerank_window: 160 |
| 327 | exact_knn_rescore_enabled: true | 274 | exact_knn_rescore_enabled: true |
| 328 | exact_knn_rescore_window: 160 | 275 | exact_knn_rescore_window: 160 |
| @@ -332,7 +279,6 @@ rerank: | @@ -332,7 +279,6 @@ rerank: | ||
| 332 | rerank_query_template: '{query}' | 279 | rerank_query_template: '{query}' |
| 333 | rerank_doc_template: '{title}' | 280 | rerank_doc_template: '{title}' |
| 334 | service_profile: default | 281 | service_profile: default |
| 335 | - | ||
| 336 | # 乘法融合:fused = Π (max(score,0) + bias) ** exponent(es / rerank / fine / text / knn) | 282 | # 乘法融合:fused = Π (max(score,0) + bias) ** exponent(es / rerank / fine / text / knn) |
| 337 | # 其中 knn_score 先做一层 dis_max: | 283 | # 其中 knn_score 先做一层 dis_max: |
| 338 | # max(knn_text_weight * text_knn, knn_image_weight * image_knn) | 284 | # max(knn_text_weight * text_knn, knn_image_weight * image_knn) |
| @@ -345,30 +291,28 @@ rerank: | @@ -345,30 +291,28 @@ rerank: | ||
| 345 | fine_bias: 0.1 | 291 | fine_bias: 0.1 |
| 346 | fine_exponent: 1.0 | 292 | fine_exponent: 1.0 |
| 347 | text_bias: 0.1 | 293 | text_bias: 0.1 |
| 348 | - text_exponent: 0.25 | ||
| 349 | # base_query_trans_* 相对 base_query 的权重(见 search/rerank_client 中文本 dismax 融合) | 294 | # base_query_trans_* 相对 base_query 的权重(见 search/rerank_client 中文本 dismax 融合) |
| 295 | + text_exponent: 0.25 | ||
| 350 | text_translation_weight: 0.8 | 296 | text_translation_weight: 0.8 |
| 351 | knn_text_weight: 1.0 | 297 | knn_text_weight: 1.0 |
| 352 | knn_image_weight: 2.0 | 298 | knn_image_weight: 2.0 |
| 353 | knn_tie_breaker: 0.3 | 299 | knn_tie_breaker: 0.3 |
| 354 | - knn_bias: 0.6 | ||
| 355 | - knn_exponent: 0.4 | 300 | + knn_bias: 0.0 |
| 301 | + knn_exponent: 5.6 | ||
| 356 | 302 | ||
| 357 | -# 可扩展服务/provider 注册表(单一配置源) | ||
| 358 | services: | 303 | services: |
| 359 | translation: | 304 | translation: |
| 360 | service_url: http://127.0.0.1:6006 | 305 | service_url: http://127.0.0.1:6006 |
| 361 | - # default_model: nllb-200-distilled-600m | ||
| 362 | default_model: nllb-200-distilled-600m | 306 | default_model: nllb-200-distilled-600m |
| 363 | default_scene: general | 307 | default_scene: general |
| 364 | timeout_sec: 10.0 | 308 | timeout_sec: 10.0 |
| 365 | cache: | 309 | cache: |
| 366 | ttl_seconds: 62208000 | 310 | ttl_seconds: 62208000 |
| 367 | sliding_expiration: true | 311 | sliding_expiration: true |
| 368 | - # When false, cache keys are exact-match per request model only (ignores model_quality_tiers for lookups). | ||
| 369 | - enable_model_quality_tier_cache: true | 312 | + # When false, cache keys are exact-match per request model only (ignores model_quality_tiers for lookups) |
| 370 | # Higher tier = better quality. Multiple models may share one tier (同级). | 313 | # Higher tier = better quality. Multiple models may share one tier (同级). |
| 371 | # A request may reuse Redis keys from models with tier > A or tier == A (not from lower tiers). | 314 | # A request may reuse Redis keys from models with tier > A or tier == A (not from lower tiers). |
| 315 | + enable_model_quality_tier_cache: true | ||
| 372 | model_quality_tiers: | 316 | model_quality_tiers: |
| 373 | deepl: 30 | 317 | deepl: 30 |
| 374 | qwen-mt: 30 | 318 | qwen-mt: 30 |
| @@ -462,13 +406,12 @@ services: | @@ -462,13 +406,12 @@ services: | ||
| 462 | num_beams: 1 | 406 | num_beams: 1 |
| 463 | use_cache: true | 407 | use_cache: true |
| 464 | embedding: | 408 | embedding: |
| 465 | - provider: http # http | 409 | + provider: http |
| 466 | providers: | 410 | providers: |
| 467 | http: | 411 | http: |
| 468 | text_base_url: http://127.0.0.1:6005 | 412 | text_base_url: http://127.0.0.1:6005 |
| 469 | image_base_url: http://127.0.0.1:6008 | 413 | image_base_url: http://127.0.0.1:6008 |
| 470 | - # 服务内文本后端(embedding 进程启动时读取) | ||
| 471 | - backend: tei # tei | local_st | 414 | + backend: tei |
| 472 | backends: | 415 | backends: |
| 473 | tei: | 416 | tei: |
| 474 | base_url: http://127.0.0.1:8080 | 417 | base_url: http://127.0.0.1:8080 |
| @@ -508,8 +451,8 @@ services: | @@ -508,8 +451,8 @@ services: | ||
| 508 | request: | 451 | request: |
| 509 | max_docs: 1000 | 452 | max_docs: 1000 |
| 510 | normalize: true | 453 | normalize: true |
| 511 | - default_instance: default | ||
| 512 | # 命名实例:同一套 reranker 代码按实例名读取不同端口 / 后端 / runtime 目录。 | 454 | # 命名实例:同一套 reranker 代码按实例名读取不同端口 / 后端 / runtime 目录。 |
| 455 | + default_instance: default | ||
| 513 | instances: | 456 | instances: |
| 514 | default: | 457 | default: |
| 515 | host: 0.0.0.0 | 458 | host: 0.0.0.0 |
| @@ -551,6 +494,7 @@ services: | @@ -551,6 +494,7 @@ services: | ||
| 551 | enforce_eager: false | 494 | enforce_eager: false |
| 552 | infer_batch_size: 100 | 495 | infer_batch_size: 100 |
| 553 | sort_by_doc_length: true | 496 | sort_by_doc_length: true |
| 497 | + | ||
| 554 | # standard=_format_instruction__standard(固定 yes/no system);compact=_format_instruction(instruction 作 system 且 user 内重复 Instruct) | 498 | # standard=_format_instruction__standard(固定 yes/no system);compact=_format_instruction(instruction 作 system 且 user 内重复 Instruct) |
| 555 | instruction_format: standard # compact standard | 499 | instruction_format: standard # compact standard |
| 556 | # instruction: "Given a query, score the product for relevance" | 500 | # instruction: "Given a query, score the product for relevance" |
| @@ -564,6 +508,7 @@ services: | @@ -564,6 +508,7 @@ services: | ||
| 564 | # instruction: "Rank products by query with category & style match prioritized" | 508 | # instruction: "Rank products by query with category & style match prioritized" |
| 565 | # instruction: "Given a fashion shopping query, retrieve relevant products that answer the query" | 509 | # instruction: "Given a fashion shopping query, retrieve relevant products that answer the query" |
| 566 | instruction: rank products by given query | 510 | instruction: rank products by given query |
| 511 | + | ||
| 567 | # vLLM LLM.score()(跨编码打分)。独立高性能环境 .venv-reranker-score(vllm 0.18 固定版):./scripts/setup_reranker_venv.sh qwen3_vllm_score | 512 | # vLLM LLM.score()(跨编码打分)。独立高性能环境 .venv-reranker-score(vllm 0.18 固定版):./scripts/setup_reranker_venv.sh qwen3_vllm_score |
| 568 | # 与 qwen3_vllm 可共用同一 model_name / HF 缓存;venv 分离以便升级 vLLM 而不影响 generate 后端。 | 513 | # 与 qwen3_vllm 可共用同一 model_name / HF 缓存;venv 分离以便升级 vLLM 而不影响 generate 后端。 |
| 569 | qwen3_vllm_score: | 514 | qwen3_vllm_score: |
| @@ -591,15 +536,10 @@ services: | @@ -591,15 +536,10 @@ services: | ||
| 591 | qwen3_transformers: | 536 | qwen3_transformers: |
| 592 | model_name: Qwen/Qwen3-Reranker-0.6B | 537 | model_name: Qwen/Qwen3-Reranker-0.6B |
| 593 | instruction: rank products by given query | 538 | instruction: rank products by given query |
| 594 | - # instruction: "Score the product’s relevance to the given query" | ||
| 595 | max_length: 8192 | 539 | max_length: 8192 |
| 596 | batch_size: 64 | 540 | batch_size: 64 |
| 597 | use_fp16: true | 541 | use_fp16: true |
| 598 | - # sdpa:默认无需 flash-attn;若已安装 flash_attn 可改为 flash_attention_2 | ||
| 599 | attn_implementation: sdpa | 542 | attn_implementation: sdpa |
| 600 | - # Packed Transformers backend: shared query prefix + custom position_ids/attention_mask. | ||
| 601 | - # For 1 query + many short docs (for example 400 product titles), this usually reduces | ||
| 602 | - # repeated prefix work and padding waste compared with pairwise batching. | ||
| 603 | qwen3_transformers_packed: | 543 | qwen3_transformers_packed: |
| 604 | model_name: Qwen/Qwen3-Reranker-0.6B | 544 | model_name: Qwen/Qwen3-Reranker-0.6B |
| 605 | instruction: Rank products by query with category & style match prioritized | 545 | instruction: Rank products by query with category & style match prioritized |
| @@ -608,8 +548,6 @@ services: | @@ -608,8 +548,6 @@ services: | ||
| 608 | max_docs_per_pack: 0 | 548 | max_docs_per_pack: 0 |
| 609 | use_fp16: true | 549 | use_fp16: true |
| 610 | sort_by_doc_length: true | 550 | sort_by_doc_length: true |
| 611 | - # Packed mode relies on a custom 4D attention mask. "eager" is the safest default. | ||
| 612 | - # If your torch/transformers stack validates it, you can benchmark "sdpa". | ||
| 613 | attn_implementation: eager | 551 | attn_implementation: eager |
| 614 | qwen3_gguf: | 552 | qwen3_gguf: |
| 615 | repo_id: DevQuasar/Qwen.Qwen3-Reranker-4B-GGUF | 553 | repo_id: DevQuasar/Qwen.Qwen3-Reranker-4B-GGUF |
| @@ -617,7 +555,6 @@ services: | @@ -617,7 +555,6 @@ services: | ||
| 617 | cache_dir: ./model_cache | 555 | cache_dir: ./model_cache |
| 618 | local_dir: ./models/reranker/qwen3-reranker-4b-gguf | 556 | local_dir: ./models/reranker/qwen3-reranker-4b-gguf |
| 619 | instruction: Rank products by query with category & style match prioritized | 557 | instruction: Rank products by query with category & style match prioritized |
| 620 | - # T4 16GB / 性能优先配置:全量层 offload,实测比保守配置明显更快 | ||
| 621 | n_ctx: 512 | 558 | n_ctx: 512 |
| 622 | n_batch: 512 | 559 | n_batch: 512 |
| 623 | n_ubatch: 512 | 560 | n_ubatch: 512 |
| @@ -640,8 +577,6 @@ services: | @@ -640,8 +577,6 @@ services: | ||
| 640 | cache_dir: ./model_cache | 577 | cache_dir: ./model_cache |
| 641 | local_dir: ./models/reranker/qwen3-reranker-0.6b-q8_0-gguf | 578 | local_dir: ./models/reranker/qwen3-reranker-0.6b-q8_0-gguf |
| 642 | instruction: Rank products by query with category & style match prioritized | 579 | instruction: Rank products by query with category & style match prioritized |
| 643 | - # 0.6B GGUF / online rerank baseline: | ||
| 644 | - # 实测 400 titles 单请求约 265s,因此它更适合作为低显存功能后备,不适合在线低延迟主路由。 | ||
| 645 | n_ctx: 256 | 580 | n_ctx: 256 |
| 646 | n_batch: 256 | 581 | n_batch: 256 |
| 647 | n_ubatch: 256 | 582 | n_ubatch: 256 |
| @@ -661,20 +596,15 @@ services: | @@ -661,20 +596,15 @@ services: | ||
| 661 | verbose: false | 596 | verbose: false |
| 662 | dashscope_rerank: | 597 | dashscope_rerank: |
| 663 | model_name: qwen3-rerank | 598 | model_name: qwen3-rerank |
| 664 | - # 按地域选择 endpoint: | ||
| 665 | - # 中国: https://dashscope.aliyuncs.com/compatible-api/v1/reranks | ||
| 666 | - # 新加坡: https://dashscope-intl.aliyuncs.com/compatible-api/v1/reranks | ||
| 667 | - # 美国: https://dashscope-us.aliyuncs.com/compatible-api/v1/reranks | ||
| 668 | endpoint: https://dashscope.aliyuncs.com/compatible-api/v1/reranks | 599 | endpoint: https://dashscope.aliyuncs.com/compatible-api/v1/reranks |
| 669 | api_key_env: RERANK_DASHSCOPE_API_KEY_CN | 600 | api_key_env: RERANK_DASHSCOPE_API_KEY_CN |
| 670 | timeout_sec: 10.0 | 601 | timeout_sec: 10.0 |
| 671 | - top_n_cap: 0 # 0 表示 top_n=当前请求文档数;>0 则限制 top_n 上限 | ||
| 672 | - batchsize: 64 # 0 关闭;>0 启用并发小包调度(top_n/top_n_cap 仍生效,分包后全局截断) | 602 | + top_n_cap: 0 # 0 表示 top_n=当前请求文档数 |
| 603 | + batchsize: 64 # 0 关闭;>0 启用并发小包调度(top_n/top_n_cap 仍生效,分包后全局截断) | ||
| 673 | instruct: Given a shopping query, rank product titles by relevance | 604 | instruct: Given a shopping query, rank product titles by relevance |
| 674 | max_retries: 2 | 605 | max_retries: 2 |
| 675 | retry_backoff_sec: 0.2 | 606 | retry_backoff_sec: 0.2 |
| 676 | 607 | ||
| 677 | -# SPU配置(已启用,使用嵌套skus) | ||
| 678 | spu_config: | 608 | spu_config: |
| 679 | enabled: true | 609 | enabled: true |
| 680 | spu_field: spu_id | 610 | spu_field: spu_id |
| @@ -686,7 +616,6 @@ spu_config: | @@ -686,7 +616,6 @@ spu_config: | ||
| 686 | - option2 | 616 | - option2 |
| 687 | - option3 | 617 | - option3 |
| 688 | 618 | ||
| 689 | -# 租户配置(Tenant Configuration) | ||
| 690 | # 每个租户可配置主语言 primary_language 与索引语言 index_languages(主市场语言,商家可勾选) | 619 | # 每个租户可配置主语言 primary_language 与索引语言 index_languages(主市场语言,商家可勾选) |
| 691 | # 默认 index_languages: [en, zh],可配置为任意 SOURCE_LANG_CODE_MAP.keys() 的子集 | 620 | # 默认 index_languages: [en, zh],可配置为任意 SOURCE_LANG_CODE_MAP.keys() 的子集 |
| 692 | tenant_config: | 621 | tenant_config: |
config/loader.py
| @@ -587,6 +587,14 @@ class AppConfigLoader: | @@ -587,6 +587,14 @@ class AppConfigLoader: | ||
| 587 | knn_tie_breaker=float(coarse_fusion_raw.get("knn_tie_breaker", 0.0)), | 587 | knn_tie_breaker=float(coarse_fusion_raw.get("knn_tie_breaker", 0.0)), |
| 588 | knn_bias=float(coarse_fusion_raw.get("knn_bias", 0.6)), | 588 | knn_bias=float(coarse_fusion_raw.get("knn_bias", 0.6)), |
| 589 | knn_exponent=float(coarse_fusion_raw.get("knn_exponent", 0.2)), | 589 | knn_exponent=float(coarse_fusion_raw.get("knn_exponent", 0.2)), |
| 590 | + knn_text_bias=float( | ||
| 591 | + coarse_fusion_raw.get("knn_text_bias", coarse_fusion_raw.get("knn_bias", 0.6)) | ||
| 592 | + ), | ||
| 593 | + knn_text_exponent=float(coarse_fusion_raw.get("knn_text_exponent", 0.0)), | ||
| 594 | + knn_image_bias=float( | ||
| 595 | + coarse_fusion_raw.get("knn_image_bias", coarse_fusion_raw.get("knn_bias", 0.6)) | ||
| 596 | + ), | ||
| 597 | + knn_image_exponent=float(coarse_fusion_raw.get("knn_image_exponent", 0.0)), | ||
| 590 | text_translation_weight=float( | 598 | text_translation_weight=float( |
| 591 | coarse_fusion_raw.get("text_translation_weight", 0.8) | 599 | coarse_fusion_raw.get("text_translation_weight", 0.8) |
| 592 | ), | 600 | ), |
| @@ -636,6 +644,14 @@ class AppConfigLoader: | @@ -636,6 +644,14 @@ class AppConfigLoader: | ||
| 636 | knn_tie_breaker=float(fusion_raw.get("knn_tie_breaker", 0.0)), | 644 | knn_tie_breaker=float(fusion_raw.get("knn_tie_breaker", 0.0)), |
| 637 | knn_bias=float(fusion_raw.get("knn_bias", 0.6)), | 645 | knn_bias=float(fusion_raw.get("knn_bias", 0.6)), |
| 638 | knn_exponent=float(fusion_raw.get("knn_exponent", 0.2)), | 646 | knn_exponent=float(fusion_raw.get("knn_exponent", 0.2)), |
| 647 | + knn_text_bias=float( | ||
| 648 | + fusion_raw.get("knn_text_bias", fusion_raw.get("knn_bias", 0.6)) | ||
| 649 | + ), | ||
| 650 | + knn_text_exponent=float(fusion_raw.get("knn_text_exponent", 0.0)), | ||
| 651 | + knn_image_bias=float( | ||
| 652 | + fusion_raw.get("knn_image_bias", fusion_raw.get("knn_bias", 0.6)) | ||
| 653 | + ), | ||
| 654 | + knn_image_exponent=float(fusion_raw.get("knn_image_exponent", 0.0)), | ||
| 639 | fine_bias=float(fusion_raw.get("fine_bias", 0.00001)), | 655 | fine_bias=float(fusion_raw.get("fine_bias", 0.00001)), |
| 640 | fine_exponent=float(fusion_raw.get("fine_exponent", 1.0)), | 656 | fine_exponent=float(fusion_raw.get("fine_exponent", 1.0)), |
| 641 | text_translation_weight=float( | 657 | text_translation_weight=float( |
config/schema.py
| @@ -119,6 +119,18 @@ class RerankFusionConfig: | @@ -119,6 +119,18 @@ class RerankFusionConfig: | ||
| 119 | knn_tie_breaker: float = 0.0 | 119 | knn_tie_breaker: float = 0.0 |
| 120 | knn_bias: float = 0.6 | 120 | knn_bias: float = 0.6 |
| 121 | knn_exponent: float = 0.2 | 121 | knn_exponent: float = 0.2 |
| 122 | + #: Optional additive floor for the weighted text KNN term. | ||
| 123 | + #: Falls back to knn_bias when omitted in config loading. | ||
| 124 | + knn_text_bias: float = 0.6 | ||
| 125 | + #: Optional extra multiplicative term on weighted text KNN. | ||
| 126 | + #: Uses knn_text_bias as the additive floor. | ||
| 127 | + knn_text_exponent: float = 0.0 | ||
| 128 | + #: Optional additive floor for the weighted image KNN term. | ||
| 129 | + #: Falls back to knn_bias when omitted in config loading. | ||
| 130 | + knn_image_bias: float = 0.6 | ||
| 131 | + #: Optional extra multiplicative term on weighted image KNN. | ||
| 132 | + #: Uses knn_image_bias as the additive floor. | ||
| 133 | + knn_image_exponent: float = 0.0 | ||
| 122 | fine_bias: float = 0.00001 | 134 | fine_bias: float = 0.00001 |
| 123 | fine_exponent: float = 1.0 | 135 | fine_exponent: float = 1.0 |
| 124 | #: 翻译子句 named query 分数相对原文 base_query 的权重(加权后再与原文做 dismax 融合) | 136 | #: 翻译子句 named query 分数相对原文 base_query 的权重(加权后再与原文做 dismax 融合) |
| @@ -143,6 +155,18 @@ class CoarseRankFusionConfig: | @@ -143,6 +155,18 @@ class CoarseRankFusionConfig: | ||
| 143 | knn_tie_breaker: float = 0.0 | 155 | knn_tie_breaker: float = 0.0 |
| 144 | knn_bias: float = 0.6 | 156 | knn_bias: float = 0.6 |
| 145 | knn_exponent: float = 0.2 | 157 | knn_exponent: float = 0.2 |
| 158 | + #: Optional additive floor for the weighted text KNN term. | ||
| 159 | + #: Falls back to knn_bias when omitted in config loading. | ||
| 160 | + knn_text_bias: float = 0.6 | ||
| 161 | + #: Optional extra multiplicative term on weighted text KNN. | ||
| 162 | + #: Uses knn_text_bias as the additive floor. | ||
| 163 | + knn_text_exponent: float = 0.0 | ||
| 164 | + #: Optional additive floor for the weighted image KNN term. | ||
| 165 | + #: Falls back to knn_bias when omitted in config loading. | ||
| 166 | + knn_image_bias: float = 0.6 | ||
| 167 | + #: Optional extra multiplicative term on weighted image KNN. | ||
| 168 | + #: Uses knn_image_bias as the additive floor. | ||
| 169 | + knn_image_exponent: float = 0.0 | ||
| 146 | #: 翻译子句 named query 分数相对原文 base_query 的权重(加权后再与原文做 dismax 融合) | 170 | #: 翻译子句 named query 分数相对原文 base_query 的权重(加权后再与原文做 dismax 融合) |
| 147 | text_translation_weight: float = 0.8 | 171 | text_translation_weight: float = 0.8 |
| 148 | 172 |
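The comments on `knn_text_bias` and `knn_image_bias` say they fall back to `knn_bias` when omitted during config loading. The loader itself is not part of this hunk, so the following is only a minimal sketch of that rule, assuming the fusion section arrives as a plain dict; the helper name `resolve_knn_biases` and the dict layout are hypothetical.

```python
from typing import Any, Dict, Tuple

DEFAULT_KNN_BIAS = 0.6  # default value shown in the dataclasses above


def resolve_knn_biases(raw: Dict[str, Any]) -> Tuple[float, float, float]:
    """Return (knn_bias, knn_text_bias, knn_image_bias).

    Hypothetical helper: a per-modality bias that is missing from the
    raw config inherits the shared knn_bias.
    """
    knn_bias = float(raw.get("knn_bias", DEFAULT_KNN_BIAS))
    text_bias = float(raw.get("knn_text_bias", knn_bias))
    image_bias = float(raw.get("knn_image_bias", knn_bias))
    return knn_bias, text_bias, image_bias


if __name__ == "__main__":
    # Only the text bias is overridden; the image bias inherits knn_bias.
    print(resolve_knn_biases({"knn_bias": 0.3, "knn_text_bias": 0.1}))
```

With this rule, existing configs that only set `knn_bias` keep their behavior, because both per-modality biases resolve to the same value.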
| @@ -0,0 +1,133 @@ | @@ -0,0 +1,133 @@ | ||
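| 1 | +# Overview of Caches in This Project | ||
| 2 | + | ||
| 3 | +This document catalogs the **business-related caches** in the repository: it describes each cache's purpose, keys, and expiration policy, and summarizes the operations scripts. It is organized as "distributed (Redis) → in-process → disk/model → third-party". | ||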
| 4 | + | ||
| 5 | +--- | ||
| 6 | + | ||
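| 7 | +## I. Centralized Redis caches (main production path) | ||
| 8 | + | ||
| 9 | +All caches below connect to **`infrastructure.redis`** by default (`config/config.yaml` and the `REDIS_*` environment variables), and **the database number is normally `db=0`** (scripts can override it via arguments). `snapshot_db` exists only in the configuration, for snapshot/operations use; application code does not switch the business-cache DB based on that field. | ||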
| 10 | + | ||
| 11 | +### 1. Text / image vector cache (Embedding) | ||
| 12 | + | ||
| 13 | +- **Purpose**: caches BGE/TEI text vectors, CN-CLIP image vectors, and CLIP text-tower vectors to avoid repeated inference. | ||
| 14 | +- **Implementation**: `RedisEmbeddingCache` in `embeddings/redis_embedding_cache.py`; for key construction see `embeddings/cache_keys.py`. | ||
| 15 | +- **Key shape** (final Redis key = `prefix` + `optional namespace` + `logical key`): | ||
| 16 | + - **Prefix**: `infrastructure.redis.embedding_cache_prefix` (default `embedding`; can be overridden with `REDIS_EMBEDDING_CACHE_PREFIX`). | ||
| 17 | + - **Namespace**: `embeddings/server.py` and the clients distinguish: | ||
| 18 | + - Text: `namespace=""` → `{prefix}:{embed:norm0|1:...}` | ||
| 19 | + - Image: `namespace="image"` → `{prefix}:image:{embed:model_name:img:norm0|1:...}` | ||
| 20 | + - CLIP text: `namespace="clip_text"` → `{prefix}:clip_text:{embed:model_name:txt:norm0|1:...}` | ||
| 21 | + - The logical key segment contains `embed:`, `norm0/1`, the model name (multimodal only), and, for overly long text or URLs, an `h:sha256:...` digest (see the comments in `cache_keys.py`). | ||
| 22 | +- **Value format**: BF16-compressed bytes (`embeddings/bf16.py`), not JSON. | ||
| 23 | +- **TTL**: `infrastructure.redis.cache_expire_days` (default **720 days**, `REDIS_CACHE_EXPIRE_DAYS`). Writes use `SETEX`; **hits renew the TTL on a sliding basis** (`EXPIRE` resets it to the same duration). | ||
| 24 | +- **Redis client**: `decode_responses=False` (binary). | ||
| 25 | + | ||
| 26 | +**Main code**: `embeddings/server.py`, `embeddings/text_encoder.py`, `embeddings/image_encoder.py`. | ||
| 27 | + | ||
| 28 | +--- | ||
| 29 | + | ||
| 30 | +### 2. Translation result cache (Translation) | ||
| 31 | + | ||
| 32 | +- **Purpose**: caches translations keyed by "translation model + target language + source text"; supports **tiered probing by model quality** (entries written by a high-tier model can be hit by requests of the same or a higher tier; see `translation_cache_probe_models` in `translation/settings.py`). | ||
| 33 | +- **Key shape**: `trans:{model}:{target_lang}:{first 4 characters of text}{sha256 of the full text}` (`build_key` in `translation/cache.py`). | ||
| 34 | +- **Value format**: UTF-8 translated string. | ||
| 35 | +- **TTL**: `services.translation.cache.ttl_seconds` (default **62208000 seconds = 720 days**). If `sliding_expiration: true`, a hit refreshes the TTL. | ||
| 36 | +- **Per-capability switch**: when a `capabilities.*.use_cache` is `false`, that backend does not write to Redis. | ||
| 37 | +- **Redis client**: `decode_responses=True`. | ||
| 38 | + | ||
| 39 | +**Main code**: `translation/cache.py`, `translation/service.py`; translation HTTP service: `api/translator_app.py` (`get_translation_service()` is an `lru_cache` singleton; see the in-process caches below). | ||
| 40 | + | ||
| 41 | +--- | ||
| 42 | + | ||
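| 43 | +### 3. Product content understanding / Anchors and semantic analysis cache (Indexer) | ||
| 44 | + | ||
| 45 | +- **Purpose**: caches the LLM's analysis results (anchors, semantic attributes, etc.) for the **prompt input** assembled from product titles and similar fields, avoiding repeated LLM calls. The key depends on `analysis_kind`, the `prompt` contract version, `target_lang`, and a digest of the input. | ||
| 46 | +- **Key shape**: `{anchor_cache_prefix}:{analysis_kind}:{prompt_contract_hash[:12]}:{target_lang}:{prompt_input[:4]}{md5}` (`_make_analysis_cache_key` in `indexer/product_enrich.py`). | ||
| 47 | +- **Prefix**: `infrastructure.redis.anchor_cache_prefix` (default `product_anchors`, `REDIS_ANCHOR_CACHE_PREFIX`). | ||
| 48 | +- **Value format**: JSON string (the normalized analysis result). | ||
| 49 | +- **TTL**: `anchor_cache_expire_days` (default **30 days**), written in seconds via `SETEX` (**not sliding**, unlike the vector and translation caches). | ||
| 50 | +- **Read path**: no TTL refresh; the content is only checked for being "meaningful" before it is returned. | ||
| 51 | + | ||
| 52 | +**Main code**: `indexer/product_enrich.py`; for notes on alignment with the HTTP side, see the comments in `api/routes/indexer.py`. | ||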
| 53 | + | ||
| 54 | +--- | ||
| 55 | + | ||
| 56 | +## II. In-process caches (not shared; lost on process restart) | ||
| 57 | + | ||
| 58 | +| Name | Purpose | Scope / lifetime | | ||
| 59 | +|------|------|----------------| | ||
| 60 | +| **`get_app_config()`** | Parses and caches the global `AppConfig` | `config/loader.py`: `@lru_cache(maxsize=1)`; `reload_app_config()` can call `cache_clear()` | | ||
| 61 | +| **`TranslationService` singleton** | Reuses backends and the Redis client within the translation service process | `api/translator_app.py`: `get_translation_service()` | | ||
| 62 | +| **`_nllb_tokenizer_code_by_normalized_key`** | NLLB tokenizer language-code mapping | `translation/languages.py`: `@lru_cache(maxsize=1)` | | ||
| 63 | +| **`QueryTextAnalysisCache`** | Reuses tokenization and tokenizer results within a single query parse | `query/tokenization.py`; lives for one `QueryParser` parse | | ||
| 64 | +| **`_SelectionContext` (SKU intent)** | Small dicts of normalized text, tokens, match booleans, etc. | `search/sku_intent_selector.py`; a single selection pass | | ||
| 65 | +| **`incremental_service` transformer cache** | Caches document transformers by `tenant_id` | `indexer/incremental_service.py`; **unbounded**, so watch memory in long-lived multi-tenant processes | | ||
| 66 | +| **`token_count_cache` within an NLLB batch** | Avoids recounting tokens within the same batch | `translation/backends/local_ctranslate2.py` | | ||
| 67 | +| **CLIP tokenizer `@lru_cache`** (third-party) | Simple tokenizer cache | `third-party/clip-as-service/.../simple_tokenizer.py` | | ||
| 68 | + | ||
| 69 | +**Note**: **`DictCache`** in `utils/cache.py` (file-backed JSON; default `.cache/dict_cache.json`) is exported, but the repository contains **no direct `DictCache(` call**; treat it as a reusable or reserved utility, not part of the current main path. | ||
| 70 | + | ||
| 71 | +--- | ||
| 72 | + | ||
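| 73 | +## III. Disk- and model-related "caches" (not Redis) | ||
| 74 | + | ||
| 75 | +| Name | Purpose | Config / location | | ||
| 76 | +|------|------|-----------| | ||
| 77 | +| **Hugging Face / local model directory** | Download and caching of weights for rerankers, local translation models, etc. | `services.rerank.backends.*.cache_dir` and similar; the usual default is **`./model_cache`** (`config/config.yaml`) | | ||
| 78 | +| **vLLM `enable_prefix_caching`** | **Prefix KV cache** inside the rerank service (speeds up batched inference with a shared prefix) | `services.rerank.backends.qwen3_vllm*`, `reranker/backends/qwen3_vllm*.py` | | ||
| 79 | +| **Runtime directory** | Rerank service state/engine files | `services.rerank.instances.*.runtime_dir` (e.g. `./.runtime/reranker/...`) | | ||
| 80 | + | ||
| 81 | +In the translation capabilities, **`use_cache: true`** (e.g. NLLB, Marian) refers, for most backends, to the **inference-time KV cache (Transformer)**, which is a different layer from the Redis translation cache; the Redis translation cache is still controlled by `TranslationCache`. | ||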
| 82 | + | ||
| 83 | +--- | ||
| 84 | + | ||
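| 85 | +## IV. Elasticsearch internal caches | ||
| 86 | + | ||
| 87 | +Index settings such as `refresh_interval` affect near-real-time visibility but are **not application-level key-value caches**. For tuning the ES query cache, node heap, and so on, see the operations documentation and cluster configuration; they are not covered here. | ||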
| 88 | + | ||
| 89 | +--- | ||
| 90 | + | ||
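| 91 | +## V. Operations and inspection scripts (Redis) | ||
| 92 | + | ||
| 93 | +| Script | Purpose | | ||
| 94 | +|------|------| | ||
| 95 | +| `scripts/redis/redis_cache_health_check.py` | Inspects the three prefix families **embedding / translation / anchors**: key-count estimates, TTL sampling, `IDLETIME`, etc. | | ||
| 96 | +| `scripts/redis/redis_cache_prefix_stats.py` | Reports key counts and **MEMORY USAGE** per prefix (multiple DBs supported) | | ||
| 97 | +| `scripts/redis/redis_memory_heavy_keys.py` | Scans for the keys that use the most memory, to help investigate mismatches between per-prefix statistics and total memory | | ||
| 98 | +| `scripts/redis/monitor_eviction.py` | Monitors **eviction**-related events in real time, for capacity and eviction-policy troubleshooting | | ||
| 99 | + | ||
| 100 | +Load the project configuration first (e.g. `source activate.sh`) so that `REDIS_CONFIG` matches production. The script comments include examples of **manual statistics with `redis-cli`** (per-prefix `wc -l`, `MEMORY STATS`, etc.). | ||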
| 101 | + | ||
| 102 | +--- | ||
| 103 | + | ||
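| 104 | +## VI. Summary table (Redis and other cache layers) | ||
| 105 | + | ||
| 106 | +| Cache | Business module | Storage | Key prefix / naming pattern | Expiration | Expiration policy | Value summary | Config key / environment variable | | ||
| 107 | +|----------|----------|------|---------------------|----------|----------|--------|-------------------| | ||
| 108 | +| Text vectors | Search / indexing / Embedding service | Redis, normally `db=0` | `{embedding_cache_prefix}:*` (logical key starts with `embed:norm…`) | `cache_expire_days` (default 720 days) | TTL on write + sliding renewal on hit | BF16 byte vector | `infrastructure.redis.*`; `REDIS_EMBEDDING_CACHE_PREFIX`, `REDIS_CACHE_EXPIRE_DAYS` | | ||
| 109 | +| Image vectors (CLIP image) | Image search / multimodal | Same as above | `{prefix}:image:*` | Same as above | Same as above | BF16 bytes | Same as above | | ||
| 110 | +| CLIP text-tower vectors | Text side of image search | Same as above | `{prefix}:clip_text:*` | Same as above | Same as above | BF16 bytes | Same as above | | ||
| 111 | +| Translations | Query translation, translation service | Same as above | `trans:{model}:{lang}:*` | `services.translation.cache.ttl_seconds` (default 720 days) | Optionally sliding (`sliding_expiration`) | UTF-8 string | `services.translation.cache.*`; per-capability `use_cache` | | ||
| 112 | +| Product analysis / Anchors | Index enrichment, LLM content understanding | Same as above | `{anchor_cache_prefix}:{kind}:{hash}:{lang}:*` | `anchor_cache_expire_days` (default 30 days) | Fixed TTL, not sliding | JSON string | `anchor_cache_prefix`, `anchor_cache_expire_days`; `REDIS_ANCHOR_*` | | ||
| 113 | +| Application config | Whole stack | Process memory | N/A (singleton) | Process lifetime | Cleared by `reload_app_config` | `AppConfig` object | `config/loader.py` | | ||
| 114 | +| Translation service instance | Translation API | Process memory | N/A | Process lifetime | Singleton | `TranslationService` | `api/translator_app.py` | | ||
| 115 | +| Query tokenization cache | Query parsing | Within a single request | N/A | A single parse | - | Tokens and intermediate results | `query/tokenization.py` | | ||
| 116 | +| SKU intent helper dicts | Search ranking helper | Within a single request | N/A | A single selection | - | Small dict | `search/sku_intent_selector.py` | | ||
| 117 | +| Incremental indexing Transformer | Indexing pipeline | Process memory | `tenant_id` string key | Long-lived (unbounded) | No automatic eviction | Transformer tuple | `indexer/incremental_service.py` | | ||
| 118 | +| Rerank / translation model weights | Inference services | Local disk | Directory path | No automatic deletion (manual cleanup) | - | Model files | `cache_dir: ./model_cache` and similar | | ||
| 119 | +| vLLM prefix cache | Rerank (Qwen3, etc.) | GPU / inside the engine | Engine-internal | Managed by the engine | - | KV cache | `enable_prefix_caching` | | ||
| 120 | +| File-backed Dict cache (optional) | General | `.cache/dict_cache.json` | Category + custom key | Persists until deleted | - | JSON-serializable values | `utils/cache.py` (currently no callers) | | ||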
| 121 | + | ||
| 122 | +--- | ||
| 123 | + | ||
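| 124 | +## VII. Maintenance recommendations (brief) | ||
| 125 | + | ||
| 126 | +1. **Capacity**: the three Redis cache families (embedding / trans / anchors) can share one instance; with large tenants or heavy image search, **embedding** and **trans** usually dominate memory, and `redis_cache_prefix_stats.py` can break usage down by prefix. | ||
| 127 | +2. **Key migration**: changing `embedding_cache_prefix`, the CLIP `model_name`, or the prompt contract naturally **isolates a new key space**; old keys go away through TTL expiry or manual bulk deletion. | ||
| 128 | +3. **Consistency**: the vector cache **deletes the key** when it reads an abnormal vector (`RedisEmbeddingCache.get`); anchors rely on `cache_version` and the contract hash to prevent incorrect reuse. | ||
| 129 | +4. **Monitoring**: besides the scripts, the Embedding HTTP service health check reports **`cache_enabled`** for each lane (`embeddings/server.py`). | ||
| 130 | + | ||
| 131 | +--- | ||
| 132 | + | ||
| 133 | +*This document was generated from a code scan; if you add a new Redis use case, update this file and `_load_known_cache_types()` in `scripts/redis/redis_cache_health_check.py` together.* |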
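The cache document above gives the translation key shape as `trans:{model}:{target_lang}:{first 4 characters}{sha256}` and quotes TTLs in both days and seconds. The sketch below only illustrates that shape and the day-to-second arithmetic; it is not the repository's `build_key`. The function names, the use of a hex digest over the UTF-8 bytes, and the sample model name are assumptions.

```python
import hashlib

SECONDS_PER_DAY = 24 * 60 * 60


def sketch_translation_cache_key(model: str, target_lang: str, text: str) -> str:
    """Illustrative re-creation of the documented key shape (hypothetical helper)."""
    # Assumption: the digest is the hex SHA-256 of the UTF-8 encoded full text.
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    # The first 4 characters of the text are kept as a human-readable hint.
    return f"trans:{model}:{target_lang}:{text[:4]}{digest}"


def days_to_ttl_seconds(days: int) -> int:
    """Convert a TTL in days to the seconds value passed to SETEX/EXPIRE."""
    return days * SECONDS_PER_DAY


if __name__ == "__main__":
    # "demo-model" is a placeholder, not a model name from the repository.
    print(sketch_translation_cache_key("demo-model", "en", "summer dress"))
    print(days_to_ttl_seconds(720))  # 62208000, the documented default
```

Hashing the full text keeps keys bounded in length for long inputs, while the short plain-text prefix makes keys easier to recognize when scanning with `redis-cli`.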
search/es_query_builder.py
| @@ -8,6 +8,7 @@ Simplified architecture: | @@ -8,6 +8,7 @@ Simplified architecture: | ||
| 8 | - function_score wrapper for boosting fields | 8 | - function_score wrapper for boosting fields |
| 9 | """ | 9 | """ |
| 10 | 10 | ||
| 11 | +from dataclasses import dataclass | ||
| 11 | from typing import Dict, Any, List, Optional, Tuple | 12 | from typing import Dict, Any, List, Optional, Tuple |
| 12 | 13 | ||
| 13 | import numpy as np | 14 | import numpy as np |
| @@ -114,6 +115,171 @@ class ESQueryBuilder: | @@ -114,6 +115,171 @@ class ESQueryBuilder: | ||
| 114 | self.phrase_match_tie_breaker = float(phrase_match_tie_breaker) | 115 | self.phrase_match_tie_breaker = float(phrase_match_tie_breaker) |
| 115 | self.phrase_match_boost = float(phrase_match_boost) | 116 | self.phrase_match_boost = float(phrase_match_boost) |
| 116 | 117 | ||
| 118 | + @dataclass(frozen=True) | ||
| 119 | + class KNNClausePlan: | ||
| 120 | + field: str | ||
| 121 | + boost: float | ||
| 122 | + k: Optional[int] = None | ||
| 123 | + num_candidates: Optional[int] = None | ||
| 124 | + nested_path: Optional[str] = None | ||
| 125 | + | ||
| 126 | + @staticmethod | ||
| 127 | + def _vector_to_list(vector: Any) -> List[float]: | ||
| 128 | + if vector is None: | ||
| 129 | + return [] | ||
| 130 | + if hasattr(vector, "tolist"): | ||
| 131 | + values = vector.tolist() | ||
| 132 | + else: | ||
| 133 | + values = list(vector) | ||
| 134 | + return [float(v) for v in values] | ||
| 135 | + | ||
| 136 | + @staticmethod | ||
| 137 | + def _query_token_count(parsed_query: Optional[Any]) -> int: | ||
| 138 | + if parsed_query is None: | ||
| 139 | + return 0 | ||
| 140 | + query_tokens = getattr(parsed_query, "query_tokens", None) or [] | ||
| 141 | + return len(query_tokens) | ||
| 142 | + | ||
| 143 | + def get_text_knn_plan(self, parsed_query: Optional[Any] = None) -> Optional[KNNClausePlan]: | ||
| 144 | + if not self.text_embedding_field: | ||
| 145 | + return None | ||
| 146 | + boost = self.knn_text_boost | ||
| 147 | + final_knn_k = self.knn_text_k | ||
| 148 | + final_knn_num_candidates = self.knn_text_num_candidates | ||
| 149 | + if self._query_token_count(parsed_query) >= 5: | ||
| 150 | + final_knn_k = self.knn_text_k_long | ||
| 151 | + final_knn_num_candidates = self.knn_text_num_candidates_long | ||
| 152 | + boost = self.knn_text_boost * 1.4 | ||
| 153 | + return self.KNNClausePlan( | ||
| 154 | + field=str(self.text_embedding_field), | ||
| 155 | + boost=float(boost), | ||
| 156 | + k=int(final_knn_k), | ||
| 157 | + num_candidates=int(final_knn_num_candidates), | ||
| 158 | + ) | ||
| 159 | + | ||
| 160 | + def get_image_knn_plan(self) -> Optional[KNNClausePlan]: | ||
| 161 | + if not self.image_embedding_field: | ||
| 162 | + return None | ||
| 163 | + nested_path, _, _ = str(self.image_embedding_field).rpartition(".") | ||
| 164 | + return self.KNNClausePlan( | ||
| 165 | + field=str(self.image_embedding_field), | ||
| 166 | + boost=float(self.knn_image_boost), | ||
| 167 | + k=int(self.knn_image_k), | ||
| 168 | + num_candidates=int(self.knn_image_num_candidates), | ||
| 169 | + nested_path=nested_path or None, | ||
| 170 | + ) | ||
| 171 | + | ||
| 172 | + def build_text_knn_clause( | ||
| 173 | + self, | ||
| 174 | + query_vector: Any, | ||
| 175 | + *, | ||
| 176 | + parsed_query: Optional[Any] = None, | ||
| 177 | + query_name: str = "knn_query", | ||
| 178 | + ) -> Optional[Dict[str, Any]]: | ||
| 179 | + plan = self.get_text_knn_plan(parsed_query) | ||
| 180 | + if plan is None or query_vector is None: | ||
| 181 | + return None | ||
| 182 | + return { | ||
| 183 | + "knn": { | ||
| 184 | + "field": plan.field, | ||
| 185 | + "query_vector": self._vector_to_list(query_vector), | ||
| 186 | + "k": plan.k, | ||
| 187 | + "num_candidates": plan.num_candidates, | ||
| 188 | + "boost": plan.boost, | ||
| 189 | + "_name": query_name, | ||
| 190 | + } | ||
| 191 | + } | ||
| 192 | + | ||
| 193 | + def build_image_knn_clause( | ||
| 194 | + self, | ||
| 195 | + image_query_vector: Any, | ||
| 196 | + *, | ||
| 197 | + query_name: str = "image_knn_query", | ||
| 198 | + ) -> Optional[Dict[str, Any]]: | ||
| 199 | + plan = self.get_image_knn_plan() | ||
| 200 | + if plan is None or image_query_vector is None: | ||
| 201 | + return None | ||
| 202 | + image_knn_query = { | ||
| 203 | + "field": plan.field, | ||
| 204 | + "query_vector": self._vector_to_list(image_query_vector), | ||
| 205 | + "k": plan.k, | ||
| 206 | + "num_candidates": plan.num_candidates, | ||
| 207 | + "boost": plan.boost, | ||
| 208 | + } | ||
| 209 | + if plan.nested_path: | ||
| 210 | + return { | ||
| 211 | + "nested": { | ||
| 212 | + "path": plan.nested_path, | ||
| 213 | + "_name": query_name, | ||
| 214 | + "query": {"knn": image_knn_query}, | ||
| 215 | + "score_mode": "max", | ||
| 216 | + } | ||
| 217 | + } | ||
| 218 | + return { | ||
| 219 | + "knn": { | ||
| 220 | + **image_knn_query, | ||
| 221 | + "_name": query_name, | ||
| 222 | + } | ||
| 223 | + } | ||
| 224 | + | ||
| 225 | + def build_exact_text_knn_rescore_clause( | ||
| 226 | + self, | ||
| 227 | + query_vector: Any, | ||
| 228 | + *, | ||
| 229 | + parsed_query: Optional[Any] = None, | ||
| 230 | + query_name: str = "exact_text_knn_query", | ||
| 231 | + ) -> Optional[Dict[str, Any]]: | ||
| 232 | + plan = self.get_text_knn_plan(parsed_query) | ||
| 233 | + if plan is None or query_vector is None: | ||
| 234 | + return None | ||
| 235 | + return { | ||
| 236 | + "script_score": { | ||
| 237 | + "_name": query_name, | ||
| 238 | + "query": {"exists": {"field": plan.field}}, | ||
| 239 | + "script": { | ||
| 240 | + "source": ( | ||
| 241 | + f"((dotProduct(params.query_vector, '{plan.field}') + 1.0) / 2.0) * params.boost" | ||
| 242 | + ), | ||
| 243 | + "params": { | ||
| 244 | + "query_vector": self._vector_to_list(query_vector), | ||
| 245 | + "boost": float(plan.boost), | ||
| 246 | + }, | ||
| 247 | + }, | ||
| 248 | + } | ||
| 249 | + } | ||
| 250 | + | ||
| 251 | + def build_exact_image_knn_rescore_clause( | ||
| 252 | + self, | ||
| 253 | + image_query_vector: Any, | ||
| 254 | + *, | ||
| 255 | + query_name: str = "exact_image_knn_query", | ||
| 256 | + ) -> Optional[Dict[str, Any]]: | ||
| 257 | + plan = self.get_image_knn_plan() | ||
| 258 | + if plan is None or image_query_vector is None: | ||
| 259 | + return None | ||
| 260 | + script_score_query = { | ||
| 261 | + "query": {"exists": {"field": plan.field}}, | ||
| 262 | + "script": { | ||
| 263 | + "source": ( | ||
| 264 | + f"((dotProduct(params.query_vector, '{plan.field}') + 1.0) / 2.0) * params.boost" | ||
| 265 | + ), | ||
| 266 | + "params": { | ||
| 267 | + "query_vector": self._vector_to_list(image_query_vector), | ||
| 268 | + "boost": float(plan.boost), | ||
| 269 | + }, | ||
| 270 | + }, | ||
| 271 | + } | ||
| 272 | + if plan.nested_path: | ||
| 273 | + return { | ||
| 274 | + "nested": { | ||
| 275 | + "path": plan.nested_path, | ||
| 276 | + "_name": query_name, | ||
| 277 | + "score_mode": "max", | ||
| 278 | + "query": {"script_score": script_score_query}, | ||
| 279 | + } | ||
| 280 | + } | ||
| 281 | + return {"script_score": {"_name": query_name, **script_score_query}} | ||
| 282 | + | ||
| 117 | def _apply_source_filter(self, es_query: Dict[str, Any]) -> None: | 283 | def _apply_source_filter(self, es_query: Dict[str, Any]) -> None: |
| 118 | """ | 284 | """ |
| 119 | Apply tri-state _source semantics: | 285 | Apply tri-state _source semantics: |
| @@ -250,52 +416,21 @@ class ESQueryBuilder: | @@ -250,52 +416,21 @@ class ESQueryBuilder: | ||
| 250 | # 3. Add KNN search clauses alongside lexical clauses under the same bool.should | 416 | # 3. Add KNN search clauses alongside lexical clauses under the same bool.should |
| 251 | # Text KNN: k / num_candidates from config; long queries use *_long and higher boost | 417 | # Text KNN: k / num_candidates from config; long queries use *_long and higher boost |
| 252 | if has_embedding: | 418 | if has_embedding: |
| 253 | - text_knn_boost = self.knn_text_boost | ||
| 254 | - final_knn_k = self.knn_text_k | ||
| 255 | - final_knn_num_candidates = self.knn_text_num_candidates | ||
| 256 | - if parsed_query: | ||
| 257 | - query_tokens = getattr(parsed_query, 'query_tokens', None) or [] | ||
| 258 | - token_count = len(query_tokens) | ||
| 259 | - if token_count >= 5: | ||
| 260 | - final_knn_k = self.knn_text_k_long | ||
| 261 | - final_knn_num_candidates = self.knn_text_num_candidates_long | ||
| 262 | - text_knn_boost = self.knn_text_boost * 1.4 | ||
| 263 | - recall_clauses.append({ | ||
| 264 | - "knn": { | ||
| 265 | - "field": self.text_embedding_field, | ||
| 266 | - "query_vector": query_vector.tolist(), | ||
| 267 | - "k": final_knn_k, | ||
| 268 | - "num_candidates": final_knn_num_candidates, | ||
| 269 | - "boost": text_knn_boost, | ||
| 270 | - "_name": "knn_query", | ||
| 271 | - } | ||
| 272 | - }) | 419 | + text_knn_clause = self.build_text_knn_clause( |
| 420 | + query_vector, | ||
| 421 | + parsed_query=parsed_query, | ||
| 422 | + query_name="knn_query", | ||
| 423 | + ) | ||
| 424 | + if text_knn_clause: | ||
| 425 | + recall_clauses.append(text_knn_clause) | ||
| 273 | 426 | ||
| 274 | if has_image_embedding: | 427 | if has_image_embedding: |
| 275 | - nested_path, _, _ = str(self.image_embedding_field).rpartition(".") | ||
| 276 | - image_knn_query = { | ||
| 277 | - "field": self.image_embedding_field, | ||
| 278 | - "query_vector": image_query_vector.tolist(), | ||
| 279 | - "k": self.knn_image_k, | ||
| 280 | - "num_candidates": self.knn_image_num_candidates, | ||
| 281 | - "boost": self.knn_image_boost, | ||
| 282 | - } | ||
| 283 | - if nested_path: | ||
| 284 | - recall_clauses.append({ | ||
| 285 | - "nested": { | ||
| 286 | - "path": nested_path, | ||
| 287 | - "_name": "image_knn_query", | ||
| 288 | - "query": {"knn": image_knn_query}, | ||
| 289 | - "score_mode": "max", | ||
| 290 | - } | ||
| 291 | - }) | ||
| 292 | - else: | ||
| 293 | - recall_clauses.append({ | ||
| 294 | - "knn": { | ||
| 295 | - **image_knn_query, | ||
| 296 | - "_name": "image_knn_query", | ||
| 297 | - } | ||
| 298 | - }) | 428 | + image_knn_clause = self.build_image_knn_clause( |
| 429 | + image_query_vector, | ||
| 430 | + query_name="image_knn_query", | ||
| 431 | + ) | ||
| 432 | + if image_knn_clause: | ||
| 433 | + recall_clauses.append(image_knn_clause) | ||
| 299 | 434 | ||
| 300 | # 4. Build main query structure: filters and recall | 435 | # 4. Build main query structure: filters and recall |
| 301 | if recall_clauses: | 436 | if recall_clauses: |
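The exact rescore clauses take their boost from the same plan as the ANN clauses, so both scores share one scale. The arithmetic of the Painless script can be sketched in plain Python, assuming unit-normalized vectors so that the dot product lies in [-1, 1]. The helper names are illustrative; the constants 5 and 1.4 are the literals from `get_text_knn_plan`.

```python
from typing import Sequence

LONG_QUERY_MIN_TOKENS = 5      # literal threshold in get_text_knn_plan
LONG_QUERY_BOOST_FACTOR = 1.4  # literal multiplier in get_text_knn_plan


def text_knn_boost(base_boost: float, token_count: int) -> float:
    """Boost shared by the ANN clause and the exact rescore clause."""
    if token_count >= LONG_QUERY_MIN_TOKENS:
        return base_boost * LONG_QUERY_BOOST_FACTOR
    return base_boost


def exact_knn_score(
    query_vector: Sequence[float],
    doc_vector: Sequence[float],
    boost: float,
) -> float:
    """Mirror of the script: ((dotProduct + 1.0) / 2.0) * boost."""
    dot = sum(q * d for q, d in zip(query_vector, doc_vector))
    return ((dot + 1.0) / 2.0) * boost


if __name__ == "__main__":
    boost = text_knn_boost(4.0, token_count=6)
    print(exact_knn_score([1.0, 0.0], [0.0, 1.0], boost))
```

Mapping the dot product through `(dot + 1) / 2` keeps the exact score in [0, 1] before the boost is applied, which matches how Elasticsearch scores `dot_product` kNN hits.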
search/rerank_client.py
| @@ -396,12 +396,50 @@ def _build_ltr_feature_block( | @@ -396,12 +396,50 @@ def _build_ltr_feature_block( | ||
| 396 | } | 396 | } |
| 397 | 397 | ||
| 398 | 398 | ||
| 399 | +def _maybe_append_weighted_knn_terms( | ||
| 400 | + *, | ||
| 401 | + term_rows: List[Dict[str, Any]], | ||
| 402 | + fusion: CoarseRankFusionConfig | RerankFusionConfig, | ||
| 403 | + knn_components: Optional[Dict[str, Any]], | ||
| 404 | +) -> None: | ||
| 405 | + if not knn_components: | ||
| 406 | + return | ||
| 407 | + | ||
| 408 | + weighted_text_knn_score = _to_score(knn_components.get("weighted_text_knn_score")) | ||
| 409 | + weighted_image_knn_score = _to_score(knn_components.get("weighted_image_knn_score")) | ||
| 410 | + | ||
| 411 | + if float(getattr(fusion, "knn_text_exponent", 0.0)) != 0.0: | ||
| 412 | + text_bias = float(getattr(fusion, "knn_text_bias", fusion.knn_bias)) | ||
| 413 | + term_rows.append( | ||
| 414 | + { | ||
| 415 | + "name": "weighted_text_knn_score", | ||
| 416 | + "raw_score": weighted_text_knn_score, | ||
| 417 | + "bias": text_bias, | ||
| 418 | + "exponent": float(fusion.knn_text_exponent), | ||
| 419 | + "factor": (max(weighted_text_knn_score, 0.0) + text_bias) ** float(fusion.knn_text_exponent), | ||
| 420 | + } | ||
| 421 | + ) | ||
| 422 | + if float(getattr(fusion, "knn_image_exponent", 0.0)) != 0.0: | ||
| 423 | + image_bias = float(getattr(fusion, "knn_image_bias", fusion.knn_bias)) | ||
| 424 | + term_rows.append( | ||
| 425 | + { | ||
| 426 | + "name": "weighted_image_knn_score", | ||
| 427 | + "raw_score": weighted_image_knn_score, | ||
| 428 | + "bias": image_bias, | ||
| 429 | + "exponent": float(fusion.knn_image_exponent), | ||
| 430 | + "factor": (max(weighted_image_knn_score, 0.0) + image_bias) | ||
| 431 | + ** float(fusion.knn_image_exponent), | ||
| 432 | + } | ||
| 433 | + ) | ||
| 434 | + | ||
| 435 | + | ||
| 399 | def _compute_multiplicative_fusion( | 436 | def _compute_multiplicative_fusion( |
| 400 | *, | 437 | *, |
| 401 | es_score: float, | 438 | es_score: float, |
| 402 | text_score: float, | 439 | text_score: float, |
| 403 | knn_score: float, | 440 | knn_score: float, |
| 404 | fusion: RerankFusionConfig, | 441 | fusion: RerankFusionConfig, |
| 442 | + knn_components: Optional[Dict[str, Any]] = None, | ||
| 405 | rerank_score: Optional[float] = None, | 443 | rerank_score: Optional[float] = None, |
| 406 | fine_score: Optional[float] = None, | 444 | fine_score: Optional[float] = None, |
| 407 | style_boost: float = 1.0, | 445 | style_boost: float = 1.0, |
| @@ -427,6 +465,7 @@ def _compute_multiplicative_fusion( | @@ -427,6 +465,7 @@ def _compute_multiplicative_fusion( | ||
| 427 | _add_term("fine_score", fine_score, fusion.fine_bias, fusion.fine_exponent) | 465 | _add_term("fine_score", fine_score, fusion.fine_bias, fusion.fine_exponent) |
| 428 | _add_term("text_score", text_score, fusion.text_bias, fusion.text_exponent) | 466 | _add_term("text_score", text_score, fusion.text_bias, fusion.text_exponent) |
| 429 | _add_term("knn_score", knn_score, fusion.knn_bias, fusion.knn_exponent) | 467 | _add_term("knn_score", knn_score, fusion.knn_bias, fusion.knn_exponent) |
| 468 | + _maybe_append_weighted_knn_terms(term_rows=term_rows, fusion=fusion, knn_components=knn_components) | ||
| 430 | 469 | ||
| 431 | fused = 1.0 | 470 | fused = 1.0 |
| 432 | factors: Dict[str, float] = {} | 471 | factors: Dict[str, float] = {} |
| @@ -450,12 +489,30 @@ def _multiply_coarse_fusion_factors( | @@ -450,12 +489,30 @@ def _multiply_coarse_fusion_factors( | ||
| 450 | es_score: float, | 489 | es_score: float, |
| 451 | text_score: float, | 490 | text_score: float, |
| 452 | knn_score: float, | 491 | knn_score: float, |
| 492 | + knn_components: Dict[str, Any], | ||
| 453 | fusion: CoarseRankFusionConfig, | 493 | fusion: CoarseRankFusionConfig, |
| 454 | -) -> Tuple[float, float, float, float]: | 494 | +) -> Tuple[float, float, float, float, float, float]: |
| 455 | es_factor = (max(es_score, 0.0) + fusion.es_bias) ** fusion.es_exponent | 495 | es_factor = (max(es_score, 0.0) + fusion.es_bias) ** fusion.es_exponent |
| 456 | text_factor = (max(text_score, 0.0) + fusion.text_bias) ** fusion.text_exponent | 496 | text_factor = (max(text_score, 0.0) + fusion.text_bias) ** fusion.text_exponent |
| 457 | knn_factor = (max(knn_score, 0.0) + fusion.knn_bias) ** fusion.knn_exponent | 497 | knn_factor = (max(knn_score, 0.0) + fusion.knn_bias) ** fusion.knn_exponent |
| 458 | - return es_factor, text_factor, knn_factor, es_factor * text_factor * knn_factor | 498 | + text_knn_bias = float(getattr(fusion, "knn_text_bias", fusion.knn_bias)) |
| 499 | + image_knn_bias = float(getattr(fusion, "knn_image_bias", fusion.knn_bias)) | ||
| 500 | + text_knn_factor = ( | ||
| 501 | + (max(_to_score(knn_components.get("weighted_text_knn_score")), 0.0) + text_knn_bias) | ||
| 502 | + ** float(getattr(fusion, "knn_text_exponent", 0.0)) | ||
| 503 | + ) | ||
| 504 | + image_knn_factor = ( | ||
| 505 | + (max(_to_score(knn_components.get("weighted_image_knn_score")), 0.0) + image_knn_bias) | ||
| 506 | + ** float(getattr(fusion, "knn_image_exponent", 0.0)) | ||
| 507 | + ) | ||
| 508 | + return ( | ||
| 509 | + es_factor, | ||
| 510 | + text_factor, | ||
| 511 | + knn_factor, | ||
| 512 | + text_knn_factor, | ||
| 513 | + image_knn_factor, | ||
| 514 | + es_factor * text_factor * knn_factor * text_knn_factor * image_knn_factor, | ||
| 515 | + ) | ||
| 459 | 516 | ||
| 460 | 517 | ||
| 461 | def _has_selected_sku(hit: Dict[str, Any]) -> bool: | 518 | def _has_selected_sku(hit: Dict[str, Any]) -> bool: |
| @@ -481,10 +538,18 @@ def coarse_resort_hits( | @@ -481,10 +538,18 @@ def coarse_resort_hits( | ||
| 481 | knn_components = signal_bundle["knn_components"] | 538 | knn_components = signal_bundle["knn_components"] |
| 482 | text_score = signal_bundle["text_score"] | 539 | text_score = signal_bundle["text_score"] |
| 483 | knn_score = signal_bundle["knn_score"] | 540 | knn_score = signal_bundle["knn_score"] |
| 484 | - es_factor, text_factor, knn_factor, coarse_score = _multiply_coarse_fusion_factors( | 541 | + ( |
| 542 | + es_factor, | ||
| 543 | + text_factor, | ||
| 544 | + knn_factor, | ||
| 545 | + text_knn_factor, | ||
| 546 | + image_knn_factor, | ||
| 547 | + coarse_score, | ||
| 548 | + ) = _multiply_coarse_fusion_factors( | ||
| 485 | es_score=es_score, | 549 | es_score=es_score, |
| 486 | text_score=text_score, | 550 | text_score=text_score, |
| 487 | knn_score=knn_score, | 551 | knn_score=knn_score, |
| 552 | + knn_components=knn_components, | ||
| 488 | fusion=f, | 553 | fusion=f, |
| 489 | ) | 554 | ) |
| 490 | 555 | ||
| @@ -535,6 +600,8 @@ def coarse_resort_hits( | @@ -535,6 +600,8 @@ def coarse_resort_hits( | ||
| 535 | "coarse_es_factor": es_factor, | 600 | "coarse_es_factor": es_factor, |
| 536 | "coarse_text_factor": text_factor, | 601 | "coarse_text_factor": text_factor, |
| 537 | "coarse_knn_factor": knn_factor, | 602 | "coarse_knn_factor": knn_factor, |
| 603 | + "coarse_text_knn_factor": text_knn_factor, | ||
| 604 | + "coarse_image_knn_factor": image_knn_factor, | ||
| 538 | "coarse_score": coarse_score, | 605 | "coarse_score": coarse_score, |
| 539 | "matched_queries": matched_queries, | 606 | "matched_queries": matched_queries, |
| 540 | "ltr_features": ltr_features, | 607 | "ltr_features": ltr_features, |
| @@ -576,7 +643,7 @@ def fuse_scores_and_resort( | @@ -576,7 +643,7 @@ def fuse_scores_and_resort( | ||
| 576 | - _rerank_score: score returned by the rerank service | 643 | - _rerank_score: score returned by the rerank service |
| 577 | - _fused_score: fused score | 644 | - _fused_score: fused score |
| 578 | - _text_score: text relevance score (prefers the base_query score from named queries) | 645 | - _text_score: text relevance score (prefers the base_query score from named queries) |
| 579 | - - _knn_score: KNN score (prefers the knn_query score from named queries) | 646 | + - _knn_score: KNN score (prefers exact named queries; falls back to ANN named queries when missing) |
| 580 | 647 | ||
| 581 | Args: | 648 | Args: |
| 582 | es_hits: list of ES hits (modified in place) | 649 | es_hits: list of ES hits (modified in place) |
| @@ -612,6 +679,7 @@ def fuse_scores_and_resort( | @@ -612,6 +679,7 @@ def fuse_scores_and_resort( | ||
| 612 | text_score=text_score, | 679 | text_score=text_score, |
| 613 | knn_score=knn_score, | 680 | knn_score=knn_score, |
| 614 | fusion=f, | 681 | fusion=f, |
| 682 | + knn_components=knn_components, | ||
| 615 | style_boost=style_boost, | 683 | style_boost=style_boost, |
| 616 | ) | 684 | ) |
| 617 | fused = fusion_result["score"] | 685 | fused = fusion_result["score"] |
| @@ -678,6 +746,8 @@ def fuse_scores_and_resort( | @@ -678,6 +746,8 @@ def fuse_scores_and_resort( | ||
| 678 | "es_factor": fusion_result["factors"].get("es_score"), | 746 | "es_factor": fusion_result["factors"].get("es_score"), |
| 679 | "text_factor": fusion_result["factors"].get("text_score"), | 747 | "text_factor": fusion_result["factors"].get("text_score"), |
| 680 | "knn_factor": fusion_result["factors"].get("knn_score"), | 748 | "knn_factor": fusion_result["factors"].get("knn_score"), |
| 749 | + "text_knn_factor": fusion_result["factors"].get("weighted_text_knn_score"), | ||
| 750 | + "image_knn_factor": fusion_result["factors"].get("weighted_image_knn_score"), | ||
| 681 | "style_intent_selected_sku": sku_selected, | 751 | "style_intent_selected_sku": sku_selected, |
| 682 | "style_intent_selected_sku_boost": style_boost, | 752 | "style_intent_selected_sku_boost": style_boost, |
| 683 | "matched_queries": signal_bundle["matched_queries"], | 753 | "matched_queries": signal_bundle["matched_queries"], |
| @@ -810,6 +880,7 @@ def run_lightweight_rerank( | @@ -810,6 +880,7 @@ def run_lightweight_rerank( | ||
| 810 | text_score=text_score, | 880 | text_score=text_score, |
| 811 | knn_score=knn_score, | 881 | knn_score=knn_score, |
| 812 | fusion=f, | 882 | fusion=f, |
| 883 | + knn_components=signal_bundle["knn_components"], | ||
| 813 | style_boost=style_boost, | 884 | style_boost=style_boost, |
| 814 | ) | 885 | ) |
| 815 | 886 | ||
| @@ -846,6 +917,8 @@ def run_lightweight_rerank( | @@ -846,6 +917,8 @@ def run_lightweight_rerank( | ||
| 846 | "es_factor": fusion_result["factors"].get("es_score"), | 917 | "es_factor": fusion_result["factors"].get("es_score"), |
| 847 | "text_factor": fusion_result["factors"].get("text_score"), | 918 | "text_factor": fusion_result["factors"].get("text_score"), |
| 848 | "knn_factor": fusion_result["factors"].get("knn_score"), | 919 | "knn_factor": fusion_result["factors"].get("knn_score"), |
| 920 | + "text_knn_factor": fusion_result["factors"].get("weighted_text_knn_score"), | ||
| 921 | + "image_knn_factor": fusion_result["factors"].get("weighted_image_knn_score"), | ||
| 849 | "style_intent_selected_sku": sku_selected, | 922 | "style_intent_selected_sku": sku_selected, |
| 850 | "style_intent_selected_sku_boost": style_boost, | 923 | "style_intent_selected_sku_boost": style_boost, |
| 851 | "ltr_features": ltr_features, | 924 | "ltr_features": ltr_features, |
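Every fusion term in this file, including the new weighted text and image KNN terms, contributes a factor of the form `(max(score, 0) + bias) ** exponent`. The following is a minimal sketch of that arithmetic with hypothetical helper names, not the functions in `rerank_client.py`. It shows why the default exponent of 0.0 leaves the fused score unchanged.

```python
from typing import Iterable, Tuple


def fusion_factor(raw_score: float, bias: float, exponent: float) -> float:
    """One multiplicative term: clamp the score at 0, add the bias, apply the exponent."""
    return (max(raw_score, 0.0) + bias) ** exponent


def fuse(terms: Iterable[Tuple[float, float, float]]) -> float:
    """Multiply the factors of (raw_score, bias, exponent) triples."""
    fused = 1.0
    for raw_score, bias, exponent in terms:
        fused *= fusion_factor(raw_score, bias, exponent)
    return fused


if __name__ == "__main__":
    # The second term has exponent 0.0, so its factor is 1.0 and it has no effect.
    print(fuse([(1.5, 0.5, 2.0), (0.9, 0.6, 0.0)]))
```

Because a zero exponent yields a factor of 1.0, skipping the term (as the rerank path does) and multiplying by it (as the coarse path does) give the same fused score.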
search/searcher.py
| @@ -242,67 +242,29 @@ class Searcher: | @@ -242,67 +242,29 @@ class Searcher: | ||
| 242 | return configured | 242 | return configured |
| 243 | return int(self.config.rerank.rerank_window) | 243 | return int(self.config.rerank.rerank_window) |
| 244 | 244 | ||
| 245 | - @staticmethod | ||
| 246 | - def _vector_to_list(vector: Any) -> List[float]: | ||
| 247 | - if vector is None: | ||
| 248 | - return [] | ||
| 249 | - if hasattr(vector, "tolist"): | ||
| 250 | - values = vector.tolist() | ||
| 251 | - else: | ||
| 252 | - values = list(vector) | ||
| 253 | - return [float(v) for v in values] | ||
| 254 | - | ||
| 255 | def _build_exact_knn_rescore( | 245 | def _build_exact_knn_rescore( |
| 256 | self, | 246 | self, |
| 257 | *, | 247 | *, |
| 258 | query_vector: Any, | 248 | query_vector: Any, |
| 259 | image_query_vector: Any, | 249 | image_query_vector: Any, |
| 250 | + parsed_query: Optional[ParsedQuery] = None, | ||
| 260 | ) -> Optional[Dict[str, Any]]: | 251 | ) -> Optional[Dict[str, Any]]: |
| 261 | clauses: List[Dict[str, Any]] = [] | 252 | clauses: List[Dict[str, Any]] = [] |
| 262 | 253 | ||
| 263 | - if query_vector is not None and self.text_embedding_field: | ||
| 264 | - clauses.append( | ||
| 265 | - { | ||
| 266 | - "script_score": { | ||
| 267 | - "_name": "exact_text_knn_query", | ||
| 268 | - "query": {"exists": {"field": self.text_embedding_field}}, | ||
| 269 | - "script": { | ||
| 270 | - # Keep exact score on the same [0, 1]-ish scale as KNN dot_product recall. | ||
| 271 | - "source": ( | ||
| 272 | - f"(dotProduct(params.query_vector, '{self.text_embedding_field}') + 1.0) / 2.0" | ||
| 273 | - ), | ||
| 274 | - "params": {"query_vector": self._vector_to_list(query_vector)}, | ||
| 275 | - }, | ||
| 276 | - } | ||
| 277 | - } | ||
| 278 | - ) | 254 | + text_clause = self.query_builder.build_exact_text_knn_rescore_clause( |
| 255 | + query_vector, | ||
| 256 | + parsed_query=parsed_query, | ||
| 257 | + query_name="exact_text_knn_query", | ||
| 258 | + ) | ||
| 259 | + if text_clause: | ||
| 260 | + clauses.append(text_clause) | ||
| 279 | 261 | ||
| 280 | - if image_query_vector is not None and self.image_embedding_field: | ||
| 281 | - nested_path, _, _ = str(self.image_embedding_field).rpartition(".") | ||
| 282 | - if nested_path: | ||
| 283 | - clauses.append( | ||
| 284 | - { | ||
| 285 | - "nested": { | ||
| 286 | - "path": nested_path, | ||
| 287 | - "_name": "exact_image_knn_query", | ||
| 288 | - "score_mode": "max", | ||
| 289 | - "query": { | ||
| 290 | - "script_score": { | ||
| 291 | - "query": {"exists": {"field": self.image_embedding_field}}, | ||
| 292 | - "script": { | ||
| 293 | - # Keep exact score on the same [0, 1]-ish scale as KNN dot_product recall. | ||
| 294 | - "source": ( | ||
| 295 | - f"(dotProduct(params.query_vector, '{self.image_embedding_field}') + 1.0) / 2.0" | ||
| 296 | - ), | ||
| 297 | - "params": { | ||
| 298 | - "query_vector": self._vector_to_list(image_query_vector), | ||
| 299 | - }, | ||
| 300 | - }, | ||
| 301 | - } | ||
| 302 | - }, | ||
| 303 | - } | ||
| 304 | - } | ||
| 305 | - ) | 262 | + image_clause = self.query_builder.build_exact_image_knn_rescore_clause( |
| 263 | + image_query_vector, | ||
| 264 | + query_name="exact_image_knn_query", | ||
| 265 | + ) | ||
| 266 | + if image_clause: | ||
| 267 | + clauses.append(image_clause) | ||
| 306 | 268 | ||
| 307 | if not clauses: | 269 | if not clauses: |
| 308 | return None | 270 | return None |
| @@ -330,12 +292,14 @@ class Searcher: | @@ -330,12 +292,14 @@ class Searcher: | ||
| 330 | in_rank_window: bool, | 292 | in_rank_window: bool, |
| 331 | query_vector: Any, | 293 | query_vector: Any, |
| 332 | image_query_vector: Any, | 294 | image_query_vector: Any, |
| 295 | + parsed_query: Optional[ParsedQuery] = None, | ||
| 333 | ) -> None: | 296 | ) -> None: |
| 334 | if not in_rank_window or not self.config.rerank.exact_knn_rescore_enabled: | 297 | if not in_rank_window or not self.config.rerank.exact_knn_rescore_enabled: |
| 335 | return | 298 | return |
| 336 | rescore = self._build_exact_knn_rescore( | 299 | rescore = self._build_exact_knn_rescore( |
| 337 | query_vector=query_vector, | 300 | query_vector=query_vector, |
| 338 | image_query_vector=image_query_vector, | 301 | image_query_vector=image_query_vector, |
| 302 | + parsed_query=parsed_query, | ||
| 339 | ) | 303 | ) |
| 340 | if not rescore: | 304 | if not rescore: |
| 341 | return | 305 | return |
| @@ -689,6 +653,7 @@ class Searcher: | @@ -689,6 +653,7 @@ class Searcher: | ||
| 689 | in_rank_window=in_rank_window, | 653 | in_rank_window=in_rank_window, |
| 690 | query_vector=parsed_query.query_vector if enable_embedding else None, | 654 | query_vector=parsed_query.query_vector if enable_embedding else None, |
| 691 | image_query_vector=image_query_vector, | 655 | image_query_vector=image_query_vector, |
| 656 | + parsed_query=parsed_query, | ||
| 692 | ) | 657 | ) |
| 693 | 658 | ||
| 694 | # Add facets for faceted search | 659 | # Add facets for faceted search |
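
As a reading aid for the `search/searcher.py` change above, here is a minimal sketch of the kind of clause the query builder might now return and how it could be wrapped into a rescore block. The helper names `exact_text_clause` and `exact_knn_rescore` are hypothetical, the way `params.boost` is multiplied into the Painless expression is an assumption based on the commit description, and `query_weight=1.0` is assumed. Only `score_mode=total`, `rescore_query_weight=0.0`, and the `params.boost` key are taken from the commit message and tests.

```python
from typing import Any, Dict, List, Optional


def vector_to_list(vector: Any) -> List[float]:
    """Convert a NumPy array (or any iterable of numbers) to a list of floats."""
    if vector is None:
        return []
    values = vector.tolist() if hasattr(vector, "tolist") else list(vector)
    return [float(v) for v in values]


def exact_text_clause(
    vector: Any, field: str, boost: float, name: str
) -> Optional[Dict[str, Any]]:
    """Hypothetical builder for a named exact dot-product script_score clause."""
    if vector is None or not field:
        return None
    # Assumption: the boost is applied inside the script via params.boost.
    source = (
        f"((dotProduct(params.query_vector, '{field}') + 1.0) / 2.0)"
        " * params.boost"
    )
    return {
        "script_score": {
            "_name": name,
            "query": {"exists": {"field": field}},
            "script": {
                "source": source,
                "params": {
                    "query_vector": vector_to_list(vector),
                    "boost": float(boost),
                },
            },
        }
    }


def exact_knn_rescore(
    clauses: List[Dict[str, Any]], window_size: int
) -> Optional[Dict[str, Any]]:
    """Wrap the clauses so they add named scores without changing the order."""
    if not clauses:
        return None
    return {
        "window_size": int(window_size),
        "query": {
            "rescore_query": {"bool": {"should": clauses}},
            "query_weight": 1.0,          # assumed default
            "rescore_query_weight": 0.0,  # exact score does not affect _score
            "score_mode": "total",
        },
    }
```

With a rescore weight of 0.0 the exact scores would only be visible through `matched_queries`, which matches the "add scores, keep ranking" intent stated in the commit message.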
tests/test_es_query_builder.py
| @@ -208,3 +208,36 @@ def test_image_knn_clause_is_added_alongside_base_translation_and_text_knn(): | @@ -208,3 +208,36 @@ def test_image_knn_clause_is_added_alongside_base_translation_and_text_knn(): | ||
| 208 | assert image_knn["path"] == "image_embedding" | 208 | assert image_knn["path"] == "image_embedding" |
| 209 | assert image_knn["score_mode"] == "max" | 209 | assert image_knn["score_mode"] == "max" |
| 210 | assert image_knn["query"]["knn"]["field"] == "image_embedding.vector" | 210 | assert image_knn["query"]["knn"]["field"] == "image_embedding.vector" |
| 211 | + | ||
| 212 | + | ||
| 213 | +def test_text_knn_plan_is_reused_for_ann_and_exact_rescore(): | ||
| 214 | + qb = _builder() | ||
| 215 | + parsed_query = SimpleNamespace(query_tokens=["a", "b", "c", "d", "e"]) | ||
| 216 | + | ||
| 217 | + ann_clause = qb.build_text_knn_clause( | ||
| 218 | + np.array([0.1, 0.2, 0.3]), | ||
| 219 | + parsed_query=parsed_query, | ||
| 220 | + ) | ||
| 221 | + exact_clause = qb.build_exact_text_knn_rescore_clause( | ||
| 222 | + np.array([0.1, 0.2, 0.3]), | ||
| 223 | + parsed_query=parsed_query, | ||
| 224 | + ) | ||
| 225 | + | ||
| 226 | + assert ann_clause is not None | ||
| 227 | + assert exact_clause is not None | ||
| 228 | + assert ann_clause["knn"]["k"] == qb.knn_text_k_long | ||
| 229 | + assert ann_clause["knn"]["num_candidates"] == qb.knn_text_num_candidates_long | ||
| 230 | + assert ann_clause["knn"]["boost"] == qb.knn_text_boost * 1.4 | ||
| 231 | + assert exact_clause["script_score"]["script"]["params"]["boost"] == qb.knn_text_boost * 1.4 | ||
| 232 | + | ||
| 233 | + | ||
| 234 | +def test_image_knn_plan_is_reused_for_ann_and_exact_rescore(): | ||
| 235 | + qb = _builder() | ||
| 236 | + | ||
| 237 | + ann_clause = qb.build_image_knn_clause(np.array([0.4, 0.5, 0.6])) | ||
| 238 | + exact_clause = qb.build_exact_image_knn_rescore_clause(np.array([0.4, 0.5, 0.6])) | ||
| 239 | + | ||
| 240 | + assert ann_clause is not None | ||
| 241 | + assert exact_clause is not None | ||
| 242 | + assert ann_clause["nested"]["query"]["knn"]["boost"] == qb.knn_image_boost | ||
| 243 | + assert exact_clause["nested"]["query"]["script_score"]["script"]["params"]["boost"] == qb.knn_image_boost | ||
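
The two tests above check that ANN recall and exact rescore read the same boost. A minimal sketch of such a shared rule follows, assuming a single function decides the text boost from the token count. The function name `text_knn_boost`, the `long_query_token_threshold` parameter, and the threshold value used in the example are hypothetical. Only the 1.4 long-query factor comes from the commit.

```python
from typing import Optional, Sequence


def text_knn_boost(
    base_boost: float,
    query_tokens: Optional[Sequence[str]],
    long_query_token_threshold: int,
    long_query_factor: float = 1.4,
) -> float:
    """Return the text KNN boost, enlarged for long queries.

    Assumption: a query counts as long when its token count exceeds
    the threshold; the real rule may compare differently.
    """
    token_count = len(query_tokens or [])
    if token_count > long_query_token_threshold:
        return base_boost * long_query_factor
    return base_boost


# Both the ANN clause and the exact rescore clause would call the same
# function, so their boosts cannot drift apart.
tokens = ["a", "b", "c", "d", "e"]
ann_boost = text_knn_boost(20.0, tokens, long_query_token_threshold=4)
exact_boost = text_knn_boost(20.0, tokens, long_query_token_threshold=4)
```

Routing both call sites through one function is what lets the tests compare the two boosts directly.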
tests/test_rerank_client.py
| 1 | from math import isclose | 1 | from math import isclose |
| 2 | 2 | ||
| 3 | -from config.schema import RerankFusionConfig | ||
| 4 | -from search.rerank_client import fuse_scores_and_resort, run_lightweight_rerank | 3 | +from config.schema import CoarseRankFusionConfig, RerankFusionConfig |
| 4 | +from search.rerank_client import coarse_resort_hits, fuse_scores_and_resort, run_lightweight_rerank | ||
| 5 | 5 | ||
| 6 | 6 | ||
| 7 | def test_fuse_scores_and_resort_aggregates_text_components_and_keeps_rerank_primary(): | 7 | def test_fuse_scores_and_resort_aggregates_text_components_and_keeps_rerank_primary(): |
| @@ -257,6 +257,96 @@ def test_fuse_scores_and_resort_applies_knn_dismax_weights_and_tie_breaker(): | @@ -257,6 +257,96 @@ def test_fuse_scores_and_resort_applies_knn_dismax_weights_and_tie_breaker(): | ||
| 257 | assert isclose(debug[0]["knn_support_score"], 0.5, rel_tol=1e-9) | 257 | assert isclose(debug[0]["knn_support_score"], 0.5, rel_tol=1e-9) |
| 258 | 258 | ||
| 259 | 259 | ||
| 260 | +def test_fuse_scores_and_resort_can_add_weighted_text_and_image_knn_factors(): | ||
| 261 | + hits = [ | ||
| 262 | + { | ||
| 263 | + "_id": "a", | ||
| 264 | + "_score": 1.0, | ||
| 265 | + "matched_queries": { | ||
| 266 | + "base_query": 2.0, | ||
| 267 | + "knn_query": 0.4, | ||
| 268 | + "image_knn_query": 0.5, | ||
| 269 | + }, | ||
| 270 | + } | ||
| 271 | + ] | ||
| 272 | + fusion = RerankFusionConfig( | ||
| 273 | + rerank_bias=0.0, | ||
| 274 | + rerank_exponent=1.0, | ||
| 275 | + text_bias=0.0, | ||
| 276 | + text_exponent=1.0, | ||
| 277 | + knn_text_weight=2.0, | ||
| 278 | + knn_image_weight=1.0, | ||
| 279 | + knn_tie_breaker=0.25, | ||
| 280 | + knn_bias=0.1, | ||
| 281 | + knn_exponent=1.0, | ||
| 282 | + knn_text_exponent=2.0, | ||
| 283 | + knn_image_exponent=3.0, | ||
| 284 | + ) | ||
| 285 | + | ||
| 286 | + debug = fuse_scores_and_resort(hits, [0.8], fusion=fusion, debug=True) | ||
| 287 | + | ||
| 288 | + weighted_text_knn = 0.8 | ||
| 289 | + weighted_image_knn = 0.5 | ||
| 290 | + expected_knn = weighted_text_knn + 0.25 * weighted_image_knn | ||
| 291 | + expected_fused = ( | ||
| 292 | + 0.8 | ||
| 293 | + * 2.0 | ||
| 294 | + * (expected_knn + 0.1) | ||
| 295 | + * ((weighted_text_knn + 0.1) ** 2.0) | ||
| 296 | + * ((weighted_image_knn + 0.1) ** 3.0) | ||
| 297 | + ) | ||
| 298 | + | ||
| 299 | + assert isclose(hits[0]["_fused_score"], expected_fused, rel_tol=1e-9) | ||
| 300 | + assert isclose(debug[0]["text_knn_factor"], (weighted_text_knn + 0.1) ** 2.0, rel_tol=1e-9) | ||
| 301 | + assert isclose(debug[0]["image_knn_factor"], (weighted_image_knn + 0.1) ** 3.0, rel_tol=1e-9) | ||
| 302 | + assert "weighted_text_knn_score=" in debug[0]["fusion_summary"] | ||
| 303 | + assert "weighted_image_knn_score=" in debug[0]["fusion_summary"] | ||
| 304 | + | ||
| 305 | + | ||
| 306 | +def test_coarse_resort_hits_can_add_weighted_text_and_image_knn_factors(): | ||
| 307 | + hits = [ | ||
| 308 | + { | ||
| 309 | + "_id": "coarse-a", | ||
| 310 | + "_score": 1.0, | ||
| 311 | + "matched_queries": { | ||
| 312 | + "base_query": 2.0, | ||
| 313 | + "knn_query": 0.4, | ||
| 314 | + "image_knn_query": 0.5, | ||
| 315 | + }, | ||
| 316 | + } | ||
| 317 | + ] | ||
| 318 | + fusion = CoarseRankFusionConfig( | ||
| 319 | + es_bias=0.0, | ||
| 320 | + es_exponent=1.0, | ||
| 321 | + text_bias=0.0, | ||
| 322 | + text_exponent=1.0, | ||
| 323 | + knn_text_weight=2.0, | ||
| 324 | + knn_image_weight=1.0, | ||
| 325 | + knn_tie_breaker=0.25, | ||
| 326 | + knn_bias=0.1, | ||
| 327 | + knn_exponent=1.0, | ||
| 328 | + knn_text_exponent=2.0, | ||
| 329 | + knn_image_exponent=3.0, | ||
| 330 | + ) | ||
| 331 | + | ||
| 332 | + debug = coarse_resort_hits(hits, fusion=fusion, debug=True) | ||
| 333 | + | ||
| 334 | + weighted_text_knn = 0.8 | ||
| 335 | + weighted_image_knn = 0.5 | ||
| 336 | + expected_knn = weighted_text_knn + 0.25 * weighted_image_knn | ||
| 337 | + expected_coarse = ( | ||
| 338 | + 1.0 | ||
| 339 | + * 2.0 | ||
| 340 | + * (expected_knn + 0.1) | ||
| 341 | + * ((weighted_text_knn + 0.1) ** 2.0) | ||
| 342 | + * ((weighted_image_knn + 0.1) ** 3.0) | ||
| 343 | + ) | ||
| 344 | + | ||
| 345 | + assert isclose(hits[0]["_coarse_score"], expected_coarse, rel_tol=1e-9) | ||
| 346 | + assert isclose(debug[0]["coarse_text_knn_factor"], (weighted_text_knn + 0.1) ** 2.0, rel_tol=1e-9) | ||
| 347 | + assert isclose(debug[0]["coarse_image_knn_factor"], (weighted_image_knn + 0.1) ** 3.0, rel_tol=1e-9) | ||
| 348 | + | ||
| 349 | + | ||
| 260 | def test_run_lightweight_rerank_sorts_by_fused_stage_score(monkeypatch): | 350 | def test_run_lightweight_rerank_sorts_by_fused_stage_score(monkeypatch): |
| 261 | hits = [ | 351 | hits = [ |
| 262 | { | 352 | { |
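
The expected values in the two new fusion tests can be reproduced by hand. The sketch below is inferred only from the test arithmetic, not from `search/rerank_client.py` itself. It assumes the KNN scores are multiplied by their weights, combined in a dismax style (larger score plus tie breaker times the smaller), and then multiplied together with the per-signal factors. The function names are hypothetical.

```python
def knn_dismax(weighted_text: float, weighted_image: float, tie_breaker: float) -> float:
    """Larger weighted KNN score plus tie_breaker times the smaller one."""
    high = max(weighted_text, weighted_image)
    low = min(weighted_text, weighted_image)
    return high + tie_breaker * low


def fused_score(
    primary: float,          # rerank score (fine stage) or ES score (coarse stage)
    text_score: float,       # named text score, 2.0 in the tests
    text_knn: float,         # raw knn_query score
    image_knn: float,        # raw image_knn_query score
    *,
    knn_text_weight: float,
    knn_image_weight: float,
    knn_tie_breaker: float,
    knn_bias: float,
    knn_text_exponent: float,
    knn_image_exponent: float,
) -> float:
    """Multiplicative fusion with primary/text bias 0 and exponent 1, as in the tests."""
    weighted_text = text_knn * knn_text_weight
    weighted_image = image_knn * knn_image_weight
    knn = knn_dismax(weighted_text, weighted_image, knn_tie_breaker)
    return (
        primary
        * text_score
        * (knn + knn_bias)
        * (weighted_text + knn_bias) ** knn_text_exponent
        * (weighted_image + knn_bias) ** knn_image_exponent
    )


params = dict(
    knn_text_weight=2.0,
    knn_image_weight=1.0,
    knn_tie_breaker=0.25,
    knn_bias=0.1,
    knn_text_exponent=2.0,
    knn_image_exponent=3.0,
)
fine = fused_score(0.8, 2.0, 0.4, 0.5, **params)    # rerank score 0.8
coarse = fused_score(1.0, 2.0, 0.4, 0.5, **params)  # ES score 1.0
```

Step by step for the fine stage: the weighted scores are 0.8 and 0.5, the combined KNN score is 0.8 + 0.25 * 0.5 = 0.925, and the product is 0.8 * 2.0 * 1.025 * 0.9^2 * 0.6^3, roughly 0.2869. The coarse stage differs only in the leading factor (1.0 instead of 0.8).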
tests/test_search_rerank_window.py
| @@ -1055,6 +1055,7 @@ def test_searcher_attaches_exact_knn_rescore_for_rank_window(monkeypatch): | @@ -1055,6 +1055,7 @@ def test_searcher_attaches_exact_knn_rescore_for_rank_window(monkeypatch): | ||
| 1055 | translations={}, | 1055 | translations={}, |
| 1056 | query_vector=np.array([0.1, 0.2, 0.3], dtype=np.float32), | 1056 | query_vector=np.array([0.1, 0.2, 0.3], dtype=np.float32), |
| 1057 | image_query_vector=np.array([0.4, 0.5, 0.6], dtype=np.float32), | 1057 | image_query_vector=np.array([0.4, 0.5, 0.6], dtype=np.float32), |
| 1058 | + query_tokens=["dress", "formal", "spring", "summer", "floral"], | ||
| 1058 | ) | 1059 | ) |
| 1059 | 1060 | ||
| 1060 | es_client = _FakeESClient(total_hits=5) | 1061 | es_client = _FakeESClient(total_hits=5) |
| @@ -1081,8 +1082,12 @@ def test_searcher_attaches_exact_knn_rescore_for_rank_window(monkeypatch): | @@ -1081,8 +1082,12 @@ def test_searcher_attaches_exact_knn_rescore_for_rank_window(monkeypatch): | ||
| 1081 | es_index_name=base.es_index_name, | 1082 | es_index_name=base.es_index_name, |
| 1082 | es_settings=base.es_settings, | 1083 | es_settings=base.es_settings, |
| 1083 | ) | 1084 | ) |
| 1084 | - searcher = _build_searcher(config, es_client) | ||
| 1085 | - searcher.query_parser = _VectorQueryParser() | 1085 | + searcher = Searcher( |
| 1086 | + es_client=es_client, | ||
| 1087 | + config=config, | ||
| 1088 | + query_parser=_VectorQueryParser(), | ||
| 1089 | + image_encoder=SimpleNamespace(), | ||
| 1090 | + ) | ||
| 1086 | context = create_request_context(reqid="exact-rescore", uid="u-exact") | 1091 | context = create_request_context(reqid="exact-rescore", uid="u-exact") |
| 1087 | 1092 | ||
| 1088 | monkeypatch.setattr( | 1093 | monkeypatch.setattr( |
| @@ -1112,6 +1117,36 @@ def test_searcher_attaches_exact_knn_rescore_for_rank_window(monkeypatch): | @@ -1112,6 +1117,36 @@ def test_searcher_attaches_exact_knn_rescore_for_rank_window(monkeypatch): | ||
| 1112 | elif "nested" in clause: | 1117 | elif "nested" in clause: |
| 1113 | names.append(clause["nested"]["_name"]) | 1118 | names.append(clause["nested"]["_name"]) |
| 1114 | assert names == ["exact_text_knn_query", "exact_image_knn_query"] | 1119 | assert names == ["exact_text_knn_query", "exact_image_knn_query"] |
| 1120 | + recall_query = body["query"] | ||
| 1121 | + if "bool" in recall_query and recall_query["bool"].get("must"): | ||
| 1122 | + recall_query = recall_query["bool"]["must"][0] | ||
| 1123 | + if "function_score" in recall_query: | ||
| 1124 | + recall_query = recall_query["function_score"]["query"] | ||
| 1125 | + recall_should = recall_query["bool"]["should"] | ||
| 1126 | + text_knn_clause = next( | ||
| 1127 | + clause["knn"] | ||
| 1128 | + for clause in recall_should | ||
| 1129 | + if clause.get("knn", {}).get("_name") == "knn_query" | ||
| 1130 | + ) | ||
| 1131 | + image_knn_clause = next( | ||
| 1132 | + clause["nested"]["query"]["knn"] | ||
| 1133 | + for clause in recall_should | ||
| 1134 | + if clause.get("nested", {}).get("_name") == "image_knn_query" | ||
| 1135 | + ) | ||
| 1136 | + exact_text_clause = next( | ||
| 1137 | + clause["script_score"] | ||
| 1138 | + for clause in should | ||
| 1139 | + if clause.get("script_score", {}).get("_name") == "exact_text_knn_query" | ||
| 1140 | + ) | ||
| 1141 | + exact_image_clause = next( | ||
| 1142 | + clause["nested"]["query"]["script_score"] | ||
| 1143 | + for clause in should | ||
| 1144 | + if clause.get("nested", {}).get("_name") == "exact_image_knn_query" | ||
| 1145 | + ) | ||
| 1146 | + assert text_knn_clause["boost"] == 28.0 | ||
| 1147 | + assert exact_text_clause["script"]["params"]["boost"] == text_knn_clause["boost"] | ||
| 1148 | + assert image_knn_clause["boost"] == 20.0 | ||
| 1149 | + assert exact_image_clause["script"]["params"]["boost"] == image_knn_clause["boost"] | ||
| 1115 | 1150 | ||
| 1116 | 1151 | ||
| 1117 | def test_searcher_skips_exact_knn_rescore_outside_rank_window(monkeypatch): | 1152 | def test_searcher_skips_exact_knn_rescore_outside_rank_window(monkeypatch): |