Commit 47452e1dd6cd19314e6f867e4b5d346ddbc99651

Authored by tangwang
1 parent 317c5d2c

feat(search): support configurable exact vector rescoring (exact rescore) to fill in missing ANN scores within the top N

 Changes

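1. **New configuration options** (`config/config.yaml`)
   - `exact_knn_rescore_enabled`: whether exact vector rescoring is enabled; defaults to true
   - `exact_knn_rescore_window`: rescore window size; defaults to 160 (decoupled from `rerank_window`, so it can be configured independently)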

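2. **ES query layer changes** (`search/searcher.py`)
   - In the first ES search, inject a rescore phase for the documents within `window_size`, as configured
   - The `rescore_query` contains two named `script_score` clauses:
     - `exact_text_knn_query`: exact dot product against the text vector
     - `exact_image_knn_query`: exact dot product against the image vector
   - Currently uses `score_mode=total` with `rescore_query_weight=0.0`, which **adds scores without changing the ranking**; the exact scores appear only in `matched_queries`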

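3. **Unified vector-score boost logic** (`search/es_query_builder.py`)
   - Add `get_text_knn_plan()` / `get_image_knn_plan()` to centralize the boost rules for text and image KNN
   - For long queries (token count reaching the threshold), the text boost is multiplied by an additional 1.4
   - Exact rescore and ANN recall **share the same boost rules**, keeping the scores on the same scale
   - The existing ANN query construction logic is moved to the same unified entry point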

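4. **Score priority in the fusion stage** (`search/rerank_client.py`)
   - `_build_hit_signal_bundle()` now handles all vector score reads in one place
   - Prefer `exact_text_knn_query` / `exact_image_knn_query` from `matched_queries`
   - Fall back to the original `knn_query` / `image_knn_query` (ANN scores) when they are absent
   - Applies to all three stages (coarse_rank, fine_rank, rerank), avoiding duplicated patches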

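5. **Test coverage**
   - `tests/test_es_query_builder.py`: verifies that ANN and exact share the same boost rules
   - `tests/test_search_rerank_window.py`: verifies that the rescore window and named queries are injected correctly
   - `tests/test_rerank_client.py`: verifies the exact-first, ANN-fallback logic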
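
As a rough illustration of the query-layer change, the rescore block attached to the first ES search might look like the sketch below. It assumes the standard Elasticsearch `rescore` request structure. `build_exact_rescore` and `exact_clause` are hypothetical names, combining the clauses under `bool`/`should` is an assumption, and the real construction in `search/searcher.py` may differ. Only the text clause is shown.

```python
from typing import Any, Dict, List, Optional


def exact_clause(name: str, field: str, vector: List[float], boost: float) -> Dict[str, Any]:
    """Named script_score clause mirroring the Painless formula in this commit."""
    return {
        "script_score": {
            "_name": name,
            "query": {"exists": {"field": field}},
            "script": {
                "source": (
                    f"((dotProduct(params.query_vector, '{field}') + 1.0) / 2.0)"
                    " * params.boost"
                ),
                "params": {"query_vector": vector, "boost": boost},
            },
        }
    }


def build_exact_rescore(
    clauses: List[Optional[Dict[str, Any]]],
    window_size: int,
) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: wrap named clauses in an ES rescore block."""
    # Drop clauses that could not be built (for example, no query vector).
    should = [c for c in clauses if c is not None]
    if not should:
        return None
    return {
        "window_size": window_size,
        "query": {
            # Assumption: the exact clauses are optional matches.
            "rescore_query": {"bool": {"should": should}},
            # total = 1.0 * first-pass score + 0.0 * rescore score,
            # so the first-pass ranking is left unchanged.
            "query_weight": 1.0,
            "rescore_query_weight": 0.0,
            "score_mode": "total",
        },
    }


rescore = build_exact_rescore(
    [
        exact_clause("exact_text_knn_query", "title_embedding", [0.1, 0.2], 4.0),
        None,  # e.g. this request carries no image query vector
    ],
    window_size=160,
)
```

With a weight of 0.0 the rescore phase contributes nothing to `_score`; its only visible effect is the extra named entries reported per hit.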

 Technical details

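- **Exact vector scoring script** (Painless)
  ```painless
  // Text: (dotProduct + 1.0) / 2.0
  (dotProduct(params.query_vector, 'title_embedding') + 1.0) / 2.0
  // Images use the same formula, with the field 'image_embedding.vector'
  ```
  The result is multiplied by the unified boost (from the `knn_text_boost` / `knn_image_boost` config values and the long-query amplification factor).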

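- **Named query retention**
  - The main query already sets `include_named_queries_score: true`
  - Script scores named in the rescore phase are merged into each hit's `matched_queries`
  - They are extracted by name via `_extract_named_score()`, exactly the same way the original ANN scores are accessed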

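- **Performance impact** (top 160, 6 real queries, average of 3 rounds after warm-up)
  - `elasticsearch_search_primary` latency: 124.71ms → 136.60ms (+11.89ms, +9.53%)
  - `total_search` is strongly affected by jitter in other components and is not used as the main reference
  - The overhead is within acceptable bounds; no timeouts or resource bottlenecks were observed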
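
The scoring and read-back rules above can be sketched in plain Python. This is a minimal sketch under assumptions: `exact_knn_score` and `pick_knn_score` are hypothetical names (not the project's `_extract_named_score()`), and `matched_queries` is assumed to be a name-to-score mapping, which is the shape Elasticsearch returns when `include_named_queries_score` is enabled.

```python
from typing import Mapping, Optional, Sequence


def exact_knn_score(
    query_vec: Sequence[float],
    doc_vec: Sequence[float],
    boost: float,
) -> float:
    """Mirror of the Painless script: ((dot + 1) / 2) * boost."""
    dot = sum(q * d for q, d in zip(query_vec, doc_vec))
    return ((dot + 1.0) / 2.0) * boost


def pick_knn_score(
    matched_queries: Mapping[str, float],
    exact_name: str,
    ann_name: str,
) -> Optional[float]:
    """Prefer the exact score, fall back to the ANN score, else None."""
    if exact_name in matched_queries:
        return float(matched_queries[exact_name])
    if ann_name in matched_queries:
        return float(matched_queries[ann_name])
    return None


# Identical unit vectors have dot product 1.0, so the score equals the boost.
same = exact_knn_score([1.0, 0.0], [1.0, 0.0], boost=4.0)
# Opposite unit vectors have dot product -1.0, so the score is 0.0.
opposite = exact_knn_score([1.0, 0.0], [-1.0, 0.0], boost=4.0)

# A hit with both scores uses the exact one; an ANN-only hit falls back.
both = pick_knn_score(
    {"knn_query": 3.1, "exact_text_knn_query": 3.4},
    "exact_text_knn_query",
    "knn_query",
)
ann_only = pick_knn_score({"knn_query": 3.1}, "exact_text_knn_query", "knn_query")
```

If the embeddings are unit-normalized, the dot product lies in [-1, 1], so the `(dot + 1) / 2` mapping keeps the score in [0, boost].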

 Configuration example

```yaml
search:
  exact_knn_rescore_enabled: true
  exact_knn_rescore_window: 160
  knn_text_boost: 4.0
  knn_image_boost: 4.0
  long_query_token_threshold: 8
  long_query_text_boost_factor: 1.4
```

 Known issues and follow-up plans

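- Tuning experiments on the current version show that, with exact rescore enabled, the primary metric on some queries (strong type constraints plus many similar styles/colors) drops by about 0.031 relative to the baseline (exact=false): 0.6009 → 0.5697
- Root cause: exact scoring turns KNN from a sparse auxiliary signal into a dense ranking factor, which changes the ranking semantics of the coarse stage; tuning the existing `knn_bias/exponent` alone cannot fully recover the loss
- Next iteration: **do not force exact scores in the coarse stage for now**, and prefer exact only in fine/rerank; or have coarse use an "ANN first, exact only fills gaps" strategy, then re-evaluate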
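
To make the root cause more concrete, the sketch below evaluates a single fusion factor of the form `(max(score, 0) + bias) ** exponent`, the shape described in the fusion comment in `config/config.yaml`. It is a simplified illustration only: the function name and the sample scores are made up, a missing KNN score is assumed to be treated as 0, and the real fusion in `search/rerank_client.py` combines weighted text and image KNN terms before this step.

```python
def knn_factor(score: float, bias: float, exponent: float) -> float:
    """One multiplicative fusion factor: (max(score, 0) + bias) ** exponent."""
    return (max(score, 0.0) + bias) ** exponent


# knn_bias / knn_exponent values from the previous config, for illustration.
bias, exponent = 0.6, 0.4

# ANN only (assumed): documents outside the ANN top-k carry no KNN score,
# so they all receive the same floor factor and KNN does not reorder them.
sparse = [knn_factor(s, bias, exponent) for s in (2.8, 0.0, 0.0, 0.0)]

# Exact rescore: every document in the window gets its own score, so the
# factor differs for each document and affects the whole ordering.
dense = [knn_factor(s, bias, exponent) for s in (2.8, 2.5, 2.1, 1.7)]

distinct_sparse = len(set(sparse))  # one boosted value plus the shared floor
distinct_dense = len(set(dense))    # one value per document
```

Under these assumptions the sparse case produces only two distinct factors while the dense case produces one per document, which is one way to read "sparse auxiliary signal" versus "dense ranking factor".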

 Related files

- `config/config.yaml`
- `search/searcher.py`
- `search/es_query_builder.py`
- `search/rerank_client.py`
- `tests/test_es_query_builder.py`
- `tests/test_search_rerank_window.py`
- `tests/test_rerank_client.py`
- `scripts/evaluation/exact_rescore_coarse_tuning_round2.json` (tuning experiment record)
config/config.yaml
1   -# Unified Configuration for Multi-Tenant Search Engine
2   -# 统一配置文件,所有租户共用一套配置
3   -# 注意:索引结构由 mappings/search_products.json 定义,此文件只配置搜索行为
4   -#
5   -# 约定:下列键为必填;进程环境变量可覆盖 infrastructure / runtime 中同名语义项
6   -#(如 ES_HOST、API_PORT 等),未设置环境变量时使用本文件中的值。
7   -
8   -# Process / bind addresses (环境变量 APP_ENV、RUNTIME_ENV、ES_INDEX_NAMESPACE 可覆盖前两者的语义)
9 1 runtime:
10 2 environment: prod
11 3 index_namespace: ''
... ... @@ -21,8 +13,6 @@ runtime:
21 13 translator_port: 6006
22 14 reranker_host: 0.0.0.0
23 15 reranker_port: 6007
24   -
25   -# 基础设施连接(敏感项优先读环境变量:ES_*、REDIS_*、DB_*、DASHSCOPE_API_KEY、DEEPL_AUTH_KEY)
26 16 infrastructure:
27 17 elasticsearch:
28 18 host: http://localhost:9200
... ... @@ -49,23 +39,12 @@ infrastructure:
49 39 secrets:
50 40 dashscope_api_key: null
51 41 deepl_auth_key: null
52   -
53   -# Elasticsearch Index
54 42 es_index_name: search_products
55   -
56   -# 检索域 / 索引列表(可为空列表;每项字段均需显式给出)
57 43 indexes: []
58   -
59   -# Config assets
60 44 assets:
61 45 query_rewrite_dictionary_path: config/dictionaries/query_rewrite.dict
62   -
63   -# Product content understanding (LLM enrich-content) configuration
64 46 product_enrich:
65 47 max_workers: 40
66   -
67   -# 离线 / Web 相关性评估(scripts/evaluation、eval-web)
68   -# CLI 未显式传参时使用此处默认值;search_base_url 未配置时自动为 http://127.0.0.1:{runtime.api_port}
69 48 search_evaluation:
70 49 artifact_root: artifacts/search_evaluation
71 50 queries_file: scripts/evaluation/queries/queries.txt
... ... @@ -98,23 +77,18 @@ search_evaluation:
98 77 rebuild_irrelevant_stop_ratio: 0.799
99 78 rebuild_irrel_low_combined_stop_ratio: 0.959
100 79 rebuild_irrelevant_stop_streak: 3
101   -
102   -# ES Index Settings (基础设置)
103 80 es_settings:
104 81 number_of_shards: 1
105 82 number_of_replicas: 0
106 83 refresh_interval: 30s
107 84  
108   -# 字段权重配置(用于搜索时的字段boost)
109   -# 统一按“字段基名”配置;查询时按实际检索语言动态拼接 .{lang}。
110   -# 若需要按某个语言单独调权,也可以加显式 key(例如 title.de: 3.2)。
  85 +# 统一按“字段基名”配置;查询时按实际检索语言动态拼接 .{lang}
111 86 field_boosts:
112 87 title: 3.0
113 88 # qanchors enriched_tags 在 enriched_attributes.value中也存在,所以其实他的权重为自身权重+enriched_attributes.value的权重
114 89 qanchors: 1.0
115 90 enriched_tags: 1.0
116 91 enriched_attributes.value: 1.5
117   - # enriched_taxonomy_attributes.value: 0.3
118 92 category_name_text: 2.0
119 93 category_path: 2.0
120 94 keywords: 2.0
... ... @@ -126,38 +100,25 @@ field_boosts:
126 100 description: 1.0
127 101 vendor: 1.0
128 102  
129   -# Query Configuration(查询配置)
130 103 query_config:
131   - # 支持的语言
132 104 supported_languages:
133 105 - zh
134 106 - en
135 107 default_language: en
136   -
137   - # 功能开关(翻译开关由tenant_config控制)
138 108 enable_text_embedding: true
139 109 enable_query_rewrite: true
140 110  
141   - # 查询翻译模型(须与 services.translation.capabilities 中某项一致)
142   - # 源语种在租户 index_languages 内:主召回可打在源语种字段,用下面三项。
143   - zh_to_en_model: nllb-200-distilled-600m # "opus-mt-zh-en"
144   - en_to_zh_model: nllb-200-distilled-600m # "opus-mt-en-zh"
145   - default_translation_model: nllb-200-distilled-600m
146   - # zh_to_en_model: deepl
147   - # en_to_zh_model: deepl
148   - # default_translation_model: deepl
149   - # 源语种不在 index_languages:翻译对可检索文本更关键,可单独指定(缺省则与上一组相同)
150   - zh_to_en_model__source_not_in_index: nllb-200-distilled-600m
151   - en_to_zh_model__source_not_in_index: nllb-200-distilled-600m
152   - default_translation_model__source_not_in_index: nllb-200-distilled-600m
153   - # zh_to_en_model__source_not_in_index: deepl
154   - # en_to_zh_model__source_not_in_index: deepl
155   - # default_translation_model__source_not_in_index: deepl
  111 + zh_to_en_model: deepl # nllb-200-distilled-600m
  112 + en_to_zh_model: deepl
  113 + default_translation_model: deepl
  114 + # 源语种不在 index_languages时翻译质量比较重要,因此单独配置
  115 + zh_to_en_model__source_not_in_index: deepl
  116 + en_to_zh_model__source_not_in_index: deepl
  117 + default_translation_model__source_not_in_index: deepl
156 118  
157   - # 查询解析阶段:翻译与 query 向量并发执行,共用同一等待预算(毫秒)。
158   - # 检测语言已在租户 index_languages 内:较短;不在索引语言内:较长(翻译对召回更关键)。
159   - translation_embedding_wait_budget_ms_source_in_index: 300 # 80
160   - translation_embedding_wait_budget_ms_source_not_in_index: 400 # 200
  119 + # 查询解析阶段:翻译与 query 向量并发执行,共用同一等待预算(毫秒)
  120 + translation_embedding_wait_budget_ms_source_in_index: 300
  121 + translation_embedding_wait_budget_ms_source_not_in_index: 400
161 122 style_intent:
162 123 enabled: true
163 124 selected_sku_boost: 1.2
... ... @@ -184,11 +145,8 @@ query_config:
184 145 product_title_exclusion:
185 146 enabled: true
186 147 dictionary_path: config/dictionaries/product_title_exclusion.tsv
187   -
188   - # 动态多语言检索字段配置
189   - # multilingual_fields 会被拼成 title.{lang}/brief.{lang}/... 形式;
190   - # shared_fields 为无语言后缀字段。
191 148 search_fields:
  149 + # 统一按“字段基名”配置;查询时按实际检索语言动态拼接 .{lang}
192 150 multilingual_fields:
193 151 - title
194 152 - keywords
... ... @@ -205,13 +163,14 @@ query_config:
205 163 # - description
206 164 # - vendor
207 165 # shared_fields: 无语言后缀字段;示例: tags, option1_values, option2_values, option3_values
  166 +
208 167 shared_fields: null
209 168 core_multilingual_fields:
210 169 - title
211 170 - qanchors
212 171 - category_name_text
213 172  
214   - # 统一文本召回策略(主查询 + 翻译查询)
  173 + # 文本召回(主查询 + 翻译查询)
215 174 text_query_strategy:
216 175 base_minimum_should_match: 60%
217 176 translation_minimum_should_match: 60%
... ... @@ -226,14 +185,10 @@ query_config:
226 185 title: 5.0
227 186 qanchors: 4.0
228 187 phrase_match_boost: 3.0
229   -
230   - # Embedding字段名称
231 188 text_embedding_field: title_embedding
232 189 image_embedding_field: image_embedding.vector
233 190  
234   - # 返回字段配置(_source includes)
235   - # null表示返回所有字段,[]表示不返回任何字段,列表表示只返回指定字段
236   - # 下列字段与 api/result_formatter.py(SpuResult 填充)及 search/searcher.py(SKU 排序/主图替换)一致
  191 + # null表示返回所有字段,[]表示不返回任何字段
237 192 source_fields:
238 193 - spu_id
239 194 - handle
... ... @@ -255,6 +210,7 @@ query_config:
255 210 # - enriched_tags
256 211 # - enriched_attributes
257 212 # - # enriched_taxonomy_attributes.value
  213 +
258 214 - min_price
259 215 - compare_at_price
260 216 - image_url
... ... @@ -274,22 +230,17 @@ query_config:
274 230 # KNN:文本向量与多模态(图片)向量各自 boost 与召回(k / num_candidates)
275 231 knn_text_boost: 4
276 232 knn_image_boost: 4
277   -
278   - # knn_text_num_candidates = k * 3.4
279 233 knn_text_k: 160
280   - knn_text_num_candidates: 560
  234 + knn_text_num_candidates: 560 # k * 3.4
281 235 knn_text_k_long: 400
282 236 knn_text_num_candidates_long: 1200
283 237 knn_image_k: 400
284 238 knn_image_num_candidates: 1200
285 239  
286   -# Function Score配置(ES层打分规则)
287 240 function_score:
288 241 score_mode: sum
289 242 boost_mode: multiply
290 243 functions: []
291   -
292   -# 粗排配置(仅融合 ES 文本/向量信号,不调用模型)
293 244 coarse_rank:
294 245 enabled: true
295 246 input_window: 480
... ... @@ -305,24 +256,20 @@ coarse_rank:
305 256 knn_text_weight: 1.0
306 257 knn_image_weight: 2.0
307 258 knn_tie_breaker: 0.3
308   - knn_bias: 0.6
309   - knn_exponent: 0.4
310   -
311   -# 精排配置(轻量 reranker)
312   -# enabled=false 时仍进入 fine 阶段,但保序透传,不调用 fine 模型服务
  259 + knn_bias: 0.0
  260 + knn_exponent: 5.6
  261 + knn_text_exponent: 0.0
  262 + knn_image_exponent: 0.0
313 263 fine_rank:
314   - enabled: false
  264 + enabled: false # false 时保序透传
315 265 input_window: 160
316 266 output_window: 80
317 267 timeout_sec: 10.0
318 268 rerank_query_template: '{query}'
319 269 rerank_doc_template: '{title}'
320 270 service_profile: fine
321   -
322   -# 重排配置(provider/URL 在 services.rerank)
323   -# enabled=false 时仍进入 rerank 阶段,但保序透传,不调用最终 rerank 服务
324 271 rerank:
325   - enabled: true
  272 + enabled: false # false 时保序透传
326 273 rerank_window: 160
327 274 exact_knn_rescore_enabled: true
328 275 exact_knn_rescore_window: 160
... ... @@ -332,7 +279,6 @@ rerank:
332 279 rerank_query_template: '{query}'
333 280 rerank_doc_template: '{title}'
334 281 service_profile: default
335   -
336 282 # 乘法融合:fused = Π (max(score,0) + bias) ** exponent(es / rerank / fine / text / knn)
337 283 # 其中 knn_score 先做一层 dis_max:
338 284 # max(knn_text_weight * text_knn, knn_image_weight * image_knn)
... ... @@ -345,30 +291,28 @@ rerank:
345 291 fine_bias: 0.1
346 292 fine_exponent: 1.0
347 293 text_bias: 0.1
348   - text_exponent: 0.25
349 294 # base_query_trans_* 相对 base_query 的权重(见 search/rerank_client 中文本 dismax 融合)
  295 + text_exponent: 0.25
350 296 text_translation_weight: 0.8
351 297 knn_text_weight: 1.0
352 298 knn_image_weight: 2.0
353 299 knn_tie_breaker: 0.3
354   - knn_bias: 0.6
355   - knn_exponent: 0.4
  300 + knn_bias: 0.0
  301 + knn_exponent: 5.6
356 302  
357   -# 可扩展服务/provider 注册表(单一配置源)
358 303 services:
359 304 translation:
360 305 service_url: http://127.0.0.1:6006
361   - # default_model: nllb-200-distilled-600m
362 306 default_model: nllb-200-distilled-600m
363 307 default_scene: general
364 308 timeout_sec: 10.0
365 309 cache:
366 310 ttl_seconds: 62208000
367 311 sliding_expiration: true
368   - # When false, cache keys are exact-match per request model only (ignores model_quality_tiers for lookups).
369   - enable_model_quality_tier_cache: true
  312 + # When false, cache keys are exact-match per request model only (ignores model_quality_tiers for lookups)
370 313 # Higher tier = better quality. Multiple models may share one tier (同级).
371 314 # A request may reuse Redis keys from models with tier > A or tier == A (not from lower tiers).
  315 + enable_model_quality_tier_cache: true
372 316 model_quality_tiers:
373 317 deepl: 30
374 318 qwen-mt: 30
... ... @@ -462,13 +406,12 @@ services:
462 406 num_beams: 1
463 407 use_cache: true
464 408 embedding:
465   - provider: http # http
  409 + provider: http
466 410 providers:
467 411 http:
468 412 text_base_url: http://127.0.0.1:6005
469 413 image_base_url: http://127.0.0.1:6008
470   - # 服务内文本后端(embedding 进程启动时读取)
471   - backend: tei # tei | local_st
  414 + backend: tei
472 415 backends:
473 416 tei:
474 417 base_url: http://127.0.0.1:8080
... ... @@ -508,8 +451,8 @@ services:
508 451 request:
509 452 max_docs: 1000
510 453 normalize: true
511   - default_instance: default
512 454 # 命名实例:同一套 reranker 代码按实例名读取不同端口 / 后端 / runtime 目录。
  455 + default_instance: default
513 456 instances:
514 457 default:
515 458 host: 0.0.0.0
... ... @@ -551,6 +494,7 @@ services:
551 494 enforce_eager: false
552 495 infer_batch_size: 100
553 496 sort_by_doc_length: true
  497 +
554 498 # standard=_format_instruction__standard(固定 yes/no system);compact=_format_instruction(instruction 作 system 且 user 内重复 Instruct)
555 499 instruction_format: standard # compact standard
556 500 # instruction: "Given a query, score the product for relevance"
... ... @@ -564,6 +508,7 @@ services:
564 508 # instruction: "Rank products by query with category & style match prioritized"
565 509 # instruction: "Given a fashion shopping query, retrieve relevant products that answer the query"
566 510 instruction: rank products by given query
  511 +
567 512 # vLLM LLM.score()(跨编码打分)。独立高性能环境 .venv-reranker-score(vllm 0.18 固定版):./scripts/setup_reranker_venv.sh qwen3_vllm_score
568 513 # 与 qwen3_vllm 可共用同一 model_name / HF 缓存;venv 分离以便升级 vLLM 而不影响 generate 后端。
569 514 qwen3_vllm_score:
... ... @@ -591,15 +536,10 @@ services:
591 536 qwen3_transformers:
592 537 model_name: Qwen/Qwen3-Reranker-0.6B
593 538 instruction: rank products by given query
594   - # instruction: "Score the product’s relevance to the given query"
595 539 max_length: 8192
596 540 batch_size: 64
597 541 use_fp16: true
598   - # sdpa:默认无需 flash-attn;若已安装 flash_attn 可改为 flash_attention_2
599 542 attn_implementation: sdpa
600   - # Packed Transformers backend: shared query prefix + custom position_ids/attention_mask.
601   - # For 1 query + many short docs (for example 400 product titles), this usually reduces
602   - # repeated prefix work and padding waste compared with pairwise batching.
603 543 qwen3_transformers_packed:
604 544 model_name: Qwen/Qwen3-Reranker-0.6B
605 545 instruction: Rank products by query with category & style match prioritized
... ... @@ -608,8 +548,6 @@ services:
608 548 max_docs_per_pack: 0
609 549 use_fp16: true
610 550 sort_by_doc_length: true
611   - # Packed mode relies on a custom 4D attention mask. "eager" is the safest default.
612   - # If your torch/transformers stack validates it, you can benchmark "sdpa".
613 551 attn_implementation: eager
614 552 qwen3_gguf:
615 553 repo_id: DevQuasar/Qwen.Qwen3-Reranker-4B-GGUF
... ... @@ -617,7 +555,6 @@ services:
617 555 cache_dir: ./model_cache
618 556 local_dir: ./models/reranker/qwen3-reranker-4b-gguf
619 557 instruction: Rank products by query with category & style match prioritized
620   - # T4 16GB / 性能优先配置:全量层 offload,实测比保守配置明显更快
621 558 n_ctx: 512
622 559 n_batch: 512
623 560 n_ubatch: 512
... ... @@ -640,8 +577,6 @@ services:
640 577 cache_dir: ./model_cache
641 578 local_dir: ./models/reranker/qwen3-reranker-0.6b-q8_0-gguf
642 579 instruction: Rank products by query with category & style match prioritized
643   - # 0.6B GGUF / online rerank baseline:
644   - # 实测 400 titles 单请求约 265s,因此它更适合作为低显存功能后备,不适合在线低延迟主路由。
645 580 n_ctx: 256
646 581 n_batch: 256
647 582 n_ubatch: 256
... ... @@ -661,20 +596,15 @@ services:
661 596 verbose: false
662 597 dashscope_rerank:
663 598 model_name: qwen3-rerank
664   - # 按地域选择 endpoint:
665   - # 中国: https://dashscope.aliyuncs.com/compatible-api/v1/reranks
666   - # 新加坡: https://dashscope-intl.aliyuncs.com/compatible-api/v1/reranks
667   - # 美国: https://dashscope-us.aliyuncs.com/compatible-api/v1/reranks
668 599 endpoint: https://dashscope.aliyuncs.com/compatible-api/v1/reranks
669 600 api_key_env: RERANK_DASHSCOPE_API_KEY_CN
670 601 timeout_sec: 10.0
671   - top_n_cap: 0 # 0 表示 top_n=当前请求文档数;>0 则限制 top_n 上限
672   - batchsize: 64 # 0 关闭;>0 启用并发小包调度(top_n/top_n_cap 仍生效,分包后全局截断)
  602 + top_n_cap: 0 # 0 表示 top_n=当前请求文档数
  603 + batchsize: 64 # 0 关闭;>0 启用并发小包调度(top_n/top_n_cap 仍生效,分包后全局截断)
673 604 instruct: Given a shopping query, rank product titles by relevance
674 605 max_retries: 2
675 606 retry_backoff_sec: 0.2
676 607  
677   -# SPU配置(已启用,使用嵌套skus)
678 608 spu_config:
679 609 enabled: true
680 610 spu_field: spu_id
... ... @@ -686,7 +616,6 @@ spu_config:
686 616 - option2
687 617 - option3
688 618  
689   -# 租户配置(Tenant Configuration)
690 619 # 每个租户可配置主语言 primary_language 与索引语言 index_languages(主市场语言,商家可勾选)
691 620 # 默认 index_languages: [en, zh],可配置为任意 SOURCE_LANG_CODE_MAP.keys() 的子集
692 621 tenant_config:
... ...
config/loader.py
... ... @@ -587,6 +587,14 @@ class AppConfigLoader:
587 587 knn_tie_breaker=float(coarse_fusion_raw.get("knn_tie_breaker", 0.0)),
588 588 knn_bias=float(coarse_fusion_raw.get("knn_bias", 0.6)),
589 589 knn_exponent=float(coarse_fusion_raw.get("knn_exponent", 0.2)),
  590 + knn_text_bias=float(
  591 + coarse_fusion_raw.get("knn_text_bias", coarse_fusion_raw.get("knn_bias", 0.6))
  592 + ),
  593 + knn_text_exponent=float(coarse_fusion_raw.get("knn_text_exponent", 0.0)),
  594 + knn_image_bias=float(
  595 + coarse_fusion_raw.get("knn_image_bias", coarse_fusion_raw.get("knn_bias", 0.6))
  596 + ),
  597 + knn_image_exponent=float(coarse_fusion_raw.get("knn_image_exponent", 0.0)),
590 598 text_translation_weight=float(
591 599 coarse_fusion_raw.get("text_translation_weight", 0.8)
592 600 ),
... ... @@ -636,6 +644,14 @@ class AppConfigLoader:
636 644 knn_tie_breaker=float(fusion_raw.get("knn_tie_breaker", 0.0)),
637 645 knn_bias=float(fusion_raw.get("knn_bias", 0.6)),
638 646 knn_exponent=float(fusion_raw.get("knn_exponent", 0.2)),
  647 + knn_text_bias=float(
  648 + fusion_raw.get("knn_text_bias", fusion_raw.get("knn_bias", 0.6))
  649 + ),
  650 + knn_text_exponent=float(fusion_raw.get("knn_text_exponent", 0.0)),
  651 + knn_image_bias=float(
  652 + fusion_raw.get("knn_image_bias", fusion_raw.get("knn_bias", 0.6))
  653 + ),
  654 + knn_image_exponent=float(fusion_raw.get("knn_image_exponent", 0.0)),
639 655 fine_bias=float(fusion_raw.get("fine_bias", 0.00001)),
640 656 fine_exponent=float(fusion_raw.get("fine_exponent", 1.0)),
641 657 text_translation_weight=float(
... ...
config/schema.py
... ... @@ -119,6 +119,18 @@ class RerankFusionConfig:
119 119 knn_tie_breaker: float = 0.0
120 120 knn_bias: float = 0.6
121 121 knn_exponent: float = 0.2
  122 + #: Optional additive floor for the weighted text KNN term.
  123 + #: Falls back to knn_bias when omitted in config loading.
  124 + knn_text_bias: float = 0.6
  125 + #: Optional extra multiplicative term on weighted text KNN.
  126 + #: Uses knn_text_bias as the additive floor.
  127 + knn_text_exponent: float = 0.0
  128 + #: Optional additive floor for the weighted image KNN term.
  129 + #: Falls back to knn_bias when omitted in config loading.
  130 + knn_image_bias: float = 0.6
  131 + #: Optional extra multiplicative term on weighted image KNN.
  132 + #: Uses knn_image_bias as the additive floor.
  133 + knn_image_exponent: float = 0.0
122 134 fine_bias: float = 0.00001
123 135 fine_exponent: float = 1.0
124 136 #: 翻译子句 named query 分数相对原文 base_query 的权重(加权后再与原文做 dismax 融合)
... ... @@ -143,6 +155,18 @@ class CoarseRankFusionConfig:
143 155 knn_tie_breaker: float = 0.0
144 156 knn_bias: float = 0.6
145 157 knn_exponent: float = 0.2
  158 + #: Optional additive floor for the weighted text KNN term.
  159 + #: Falls back to knn_bias when omitted in config loading.
  160 + knn_text_bias: float = 0.6
  161 + #: Optional extra multiplicative term on weighted text KNN.
  162 + #: Uses knn_text_bias as the additive floor.
  163 + knn_text_exponent: float = 0.0
  164 + #: Optional additive floor for the weighted image KNN term.
  165 + #: Falls back to knn_bias when omitted in config loading.
  166 + knn_image_bias: float = 0.6
  167 + #: Optional extra multiplicative term on weighted image KNN.
  168 + #: Uses knn_image_bias as the additive floor.
  169 + knn_image_exponent: float = 0.0
146 170 #: 翻译子句 named query 分数相对原文 base_query 的权重(加权后再与原文做 dismax 融合)
147 171 text_translation_weight: float = 0.8
148 172  
... ...
docs/caches-inventory.md 0 → 100644
... ... @@ -0,0 +1,133 @@
  1 +# 本项目缓存一览
  2 +
  3 +本文档梳理仓库内**与业务相关的各类缓存**:说明用途、键与过期策略,并汇总运维脚本。按「分布式(Redis)→ 进程内 → 磁盘/模型 → 第三方」组织。
  4 +
  5 +---
  6 +
  7 +## 一、Redis 集中式缓存(生产主路径)
  8 +
  9 +所有下列缓存默认连接 **`infrastructure.redis`**(`config/config.yaml` 与 `REDIS_*` 环境变量),**数据库编号一般为 `db=0`**(脚本可通过参数覆盖)。`snapshot_db` 仅在配置中存在,供快照/运维场景选用,应用代码未按该字段切换业务缓存的 DB。
  10 +
  11 +### 1. 文本 / 图像向量缓存(Embedding)
  12 +
  13 +- **作用**:缓存 BGE/TEI 文本向量与 CN-CLIP 图像向量、CLIP 文本塔向量,避免重复推理。
  14 +- **实现**:`embeddings/redis_embedding_cache.py` 的 `RedisEmbeddingCache`;键构造见 `embeddings/cache_keys.py`。
  15 +- **Key 形态**(最终 Redis 键 = `前缀` + `可选 namespace` + `逻辑键`):
  16 + - **前缀**:`infrastructure.redis.embedding_cache_prefix`(默认 `embedding`,可用 `REDIS_EMBEDDING_CACHE_PREFIX` 覆盖)。
  17 + - **命名空间**:`embeddings/server.py` 与客户端中分为:
  18 + - 文本:`namespace=""` → `{prefix}:{embed:norm0|1:...}`
  19 + - 图像:`namespace="image"` → `{prefix}:image:{embed:模型名:txt:norm0|1:...}`
  20 + - CLIP 文本:`namespace="clip_text"` → `{prefix}:clip_text:{embed:模型名:img:norm0|1:...}`
  21 + - 逻辑键段含 `embed:`、`norm0/1`、模型名(多模态)、过长文本/URL 时用 `h:sha256:...` 摘要(见 `cache_keys.py` 注释)。
  22 +- **值格式**:BF16 压缩后的字节(`embeddings/bf16.py`),非 JSON。
  23 +- **TTL**:`infrastructure.redis.cache_expire_days`(默认 **720 天**,`REDIS_CACHE_EXPIRE_DAYS`)。写入用 `SETEX`;**命中时滑动续期**(`EXPIRE` 刷新为同一时长)。
  24 +- **Redis 客户端**:`decode_responses=False`(二进制)。
  25 +
  26 +**主要代码**:`embeddings/server.py`、`embeddings/text_encoder.py`、`embeddings/image_encoder.py`。
  27 +
  28 +---
  29 +
  30 +### 2. 翻译结果缓存(Translation)
  31 +
  32 +- **作用**:按「翻译模型 + 目标语言 + 原文」缓存译文;支持**模型质量分层探测**(高 tier 模型写入的缓存可被同 tier 或更高 tier 的请求命中,见 `translation/settings.py` 中 `translation_cache_probe_models`)。
  33 +- **Key 形态**:`trans:{model}:{target_lang}:{text前4字符}{sha256全文}`(`translation/cache.py` 的 `build_key`)。
  34 +- **值格式**:UTF-8 译文字符串。
  35 +- **TTL**:`services.translation.cache.ttl_seconds`(默认 **62208000 秒 = 720 天**)。若 `sliding_expiration: true`,命中时刷新 TTL。
  36 +- **能力级开关**:各 `capabilities.*.use_cache` 为 `false` 时该后端不落 Redis。
  37 +- **Redis 客户端**:`decode_responses=True`。
  38 +
  39 +**主要代码**:`translation/cache.py`、`translation/service.py`;翻译 HTTP 服务:`api/translator_app.py`(`get_translation_service()` 使用 `lru_cache` 单例,见下文进程内缓存)。
  40 +
  41 +---
  42 +
  43 +### 3. 商品内容理解 / Anchors 与语义分析缓存(Indexer)
  44 +
  45 +- **作用**:缓存 LLM 对商品标题等拼出的 **prompt 输入** 所做的分析结果(anchors、语义属性等),避免重复调用大模型。键与 `analysis_kind`、`prompt` 契约版本、`target_lang` 及输入摘要相关。
  46 +- **Key 形态**:`{anchor_cache_prefix}:{analysis_kind}:{prompt_contract_hash[:12]}:{target_lang}:{prompt_input[:4]}{md5}`(`indexer/product_enrich.py` 中 `_make_analysis_cache_key`)。
  47 +- **前缀**:`infrastructure.redis.anchor_cache_prefix`(默认 `product_anchors`,`REDIS_ANCHOR_CACHE_PREFIX`)。
  48 +- **值格式**:JSON 字符串(规范化后的分析结果)。
  49 +- **TTL**:`anchor_cache_expire_days`(默认 **30 天**),以秒写入 `SETEX`(**非滑动**,与向量/翻译不同)。
  50 +- **读逻辑**:无 TTL 刷新;仅校验内容是否「有意义」再返回。
  51 +
  52 +**主要代码**:`indexer/product_enrich.py`;与 HTTP 侧对齐说明见 `api/routes/indexer.py` 注释。
  53 +
  54 +---
  55 +
  56 +## 二、进程内缓存(非共享、随进程重启失效)
  57 +
  58 +| 名称 | 用途 | 范围/生命周期 |
  59 +|------|------|----------------|
  60 +| **`get_app_config()`** | 解析并缓存全局 `AppConfig` | `config/loader.py`:`@lru_cache(maxsize=1)`;`reload_app_config()` 可 `cache_clear()` |
  61 +| **`TranslationService` 单例** | 翻译服务进程内复用后端与 Redis 客户端 | `api/translator_app.py`:`get_translation_service()` |
  62 +| **`_nllb_tokenizer_code_by_normalized_key`** | NLLB tokenizer 语言码映射 | `translation/languages.py`:`@lru_cache(maxsize=1)` |
  63 +| **`QueryTextAnalysisCache`** | 单次查询解析内复用分词、tokenizer 结果 | `query/tokenization.py`,随 `QueryParser` 一次 parse |
  64 +| **`_SelectionContext`(SKU 意图)** | 归一化文本、分词、匹配布尔等小字典 | `search/sku_intent_selector.py`,单次选择流程 |
  65 +| **`incremental_service` transformer 缓存** | 按 `tenant_id` 缓存文档转换器 | `indexer/incremental_service.py`,**无界**、多租户进程长期存活时需注意内存 |
  66 +| **NLLB batch 内 `token_count_cache`** | 同一 batch 内避免重复计 token | `translation/backends/local_ctranslate2.py` |
  67 +| **CLIP 分词器 `@lru_cache`**(第三方) | 简单 tokenizer 缓存 | `third-party/clip-as-service/.../simple_tokenizer.py` |
  68 +
  69 +**说明**:`utils/cache.py` 中的 **`DictCache`**(文件 JSON:默认 `.cache/dict_cache.json`)已导出,但仓库内**无直接 `DictCache(` 调用**,视为可复用工具/预留,非当前主路径。
  70 +
  71 +---
  72 +
  73 +## 三、磁盘与模型相关「缓存」(非 Redis)
  74 +
  75 +| 名称 | 用途 | 配置/位置 |
  76 +|------|------|-----------|
  77 +| **Hugging Face / 本地模型目录** | 重排器、翻译本地模型等权重下载与缓存 | `services.rerank.backends.*.cache_dir` 等,常见默认 **`./model_cache`**(`config/config.yaml`) |
  78 +| **vLLM `enable_prefix_caching`** | 重排服务内 **Prefix KV 缓存**(加速同前缀批推理) | `services.rerank.backends.qwen3_vllm*`、`reranker/backends/qwen3_vllm*.py` |
  79 +| **运行时目录** | 重排服务状态/引擎文件 | `services.rerank.instances.*.runtime_dir`(如 `./.runtime/reranker/...`) |
  80 +
  81 +翻译能力里的 **`use_cache: true`**(如 NLLB、Marian)在多数后端指 **推理时的 KV cache(Transformer)**,与 Redis 译文缓存是不同层次;Redis 译文缓存仍由 `TranslationCache` 控制。
  82 +
  83 +---
  84 +
  85 +## 四、Elasticsearch 内部缓存
  86 +
  87 +索引设置中的 `refresh_interval` 等影响近实时可见性,但**不属于应用层键值缓存**。若需调优 ES 查询缓存、节点堆等,见运维文档与集群配置,此处不展开。
  88 +
  89 +---
  90 +
  91 +## 五、运维与巡检脚本(Redis)
  92 +
  93 +| 脚本 | 作用 |
  94 +|------|------|
  95 +| `scripts/redis/redis_cache_health_check.py` | 按 **embedding / translation / anchors** 三类前缀巡检:key 数量估算、TTL 采样、`IDLETIME` 等 |
  96 +| `scripts/redis/redis_cache_prefix_stats.py` | 按前缀统计 key 数量与 **MEMORY USAGE**(可多 DB) |
  97 +| `scripts/redis/redis_memory_heavy_keys.py` | 扫描占用内存最大的 key,辅助排查「统计与总内存不一致」 |
  98 +| `scripts/redis/monitor_eviction.py` | 实时监控 **eviction** 相关事件,用于容量与驱逐策略排查 |
  99 +
  100 +使用前需加载项目配置(如 `source activate.sh`)以保证 `REDIS_CONFIG` 与生产一致。脚本注释中给出了 **`redis-cli` 手工统计**示例(按前缀 `wc -l`、`MEMORY STATS` 等)。
  101 +
  102 +---
  103 +
  104 +## 六、总表(Redis 与各层缓存)
  105 +
  106 +| 缓存名称 | 业务模块 | 存储 | Key 前缀 / 命名模式 | 过期时间 | 过期策略 | 值摘要 | 配置键 / 环境变量 |
  107 +|----------|----------|------|---------------------|----------|----------|--------|-------------------|
  108 +| 文本向量 | 检索 / 索引 / Embedding 服务 | Redis db≈0 | `{embedding_cache_prefix}:*`(逻辑键以 `embed:norm…` 开头) | `cache_expire_days`(默认 720 天) | 写入 TTL + 命中滑动续期 | BF16 字节向量 | `infrastructure.redis.*`;`REDIS_EMBEDDING_CACHE_PREFIX`、`REDIS_CACHE_EXPIRE_DAYS` |
  109 +| 图像向量(CLIP 图) | 图搜 / 多模态 | 同上 | `{prefix}:image:*` | 同上 | 同上 | BF16 字节 | 同上 |
  110 +| CLIP 文本塔向量 | 图搜文本侧 | 同上 | `{prefix}:clip_text:*` | 同上 | 同上 | BF16 字节 | 同上 |
  111 +| 翻译译文 | 查询翻译、翻译服务 | 同上 | `trans:{model}:{lang}:*` | `services.translation.cache.ttl_seconds`(默认 720 天) | 可配置滑动(`sliding_expiration`) | UTF-8 字符串 | `services.translation.cache.*`;各能力 `use_cache` |
  112 +| 商品分析 / Anchors | 索引富化、LLM 内容理解 | 同上 | `{anchor_cache_prefix}:{kind}:{hash}:{lang}:*` | `anchor_cache_expire_days`(默认 30 天) | 固定 TTL,不滑动 | JSON 字符串 | `anchor_cache_prefix`、`anchor_cache_expire_days`;`REDIS_ANCHOR_*` |
  113 +| 应用配置 | 全栈 | 进程内存 | N/A(单例) | 进程生命周期 | `reload_app_config` 清除 | `AppConfig` 对象 | `config/loader.py` |
  114 +| 翻译服务实例 | 翻译 API | 进程内存 | N/A | 进程生命周期 | 单例 | `TranslationService` | `api/translator_app.py` |
  115 +| 查询分词缓存 | 查询解析 | 单次请求内 | N/A | 单次 parse | — | 分词与中间结果 | `query/tokenization.py` |
  116 +| SKU 意图辅助字典 | 搜索排序辅助 | 单次请求内 | N/A | 单次选择 | — | 小 dict | `search/sku_intent_selector.py` |
  117 +| 增量索引 Transformer | 索引管道 | 进程内存 | `tenant_id` 字符串键 | 长期(无界) | 无自动淘汰 | Transformer 元组 | `indexer/incremental_service.py` |
  118 +| 重排 / 翻译模型权重 | 推理服务 | 本地磁盘 | 目录路径 | 无自动删除(人工清理) | — | 模型文件 | `cache_dir: ./model_cache` 等 |
  119 +| vLLM Prefix 缓存 | 重排(Qwen3 等) | GPU/引擎内 | 引擎内部 | 引擎管理 | — | KV Cache | `enable_prefix_caching` |
  120 +| 文件 Dict 缓存(可选) | 通用 | `.cache/dict_cache.json` | 分类 + 自定义 key | 持久直至删除 | — | JSON 可序列化值 | `utils/cache.py`(当前无调用方) |
  121 +
  122 +---
  123 +
  124 +## 七、维护建议(简要)
  125 +
  126 +1. **容量**:三类 Redis 缓存(embedding / trans / anchors)可共用同一实例;大租户或图搜多时 **embedding** 与 **trans** 往往占主要内存,可用 `redis_cache_prefix_stats.py` 分前缀观察。
  127 +2. **键迁移**:变更 `embedding_cache_prefix`、CLIP `model_name` 或 prompt 契约会自然**隔离新键空间**;旧键依赖 TTL 或人工批量删除。
  128 +3. **一致性**:向量缓存对异常向量会 **delete key**(`RedisEmbeddingCache.get`);anchors 依赖 `cache_version` 与契约 hash 防止错误复用。
  129 +4. **监控**:除脚本外,Embedding HTTP 服务健康检查会报告各 lane 的 **`cache_enabled`**(`embeddings/server.py`)。
  130 +
  131 +---
  132 +
  133 +*文档随代码扫描生成;若新增 Redis 用途,请同步更新本文件与 `scripts/redis/redis_cache_health_check.py` 中的 `_load_known_cache_types()`。*
... ...
search/es_query_builder.py
... ... @@ -8,6 +8,7 @@ Simplified architecture:
8 8 - function_score wrapper for boosting fields
9 9 """
10 10  
  11 +from dataclasses import dataclass
11 12 from typing import Dict, Any, List, Optional, Tuple
12 13  
13 14 import numpy as np
... ... @@ -114,6 +115,171 @@ class ESQueryBuilder:
114 115 self.phrase_match_tie_breaker = float(phrase_match_tie_breaker)
115 116 self.phrase_match_boost = float(phrase_match_boost)
116 117  
  118 + @dataclass(frozen=True)
  119 + class KNNClausePlan:
  120 + field: str
  121 + boost: float
  122 + k: Optional[int] = None
  123 + num_candidates: Optional[int] = None
  124 + nested_path: Optional[str] = None
  125 +
  126 + @staticmethod
  127 + def _vector_to_list(vector: Any) -> List[float]:
  128 + if vector is None:
  129 + return []
  130 + if hasattr(vector, "tolist"):
  131 + values = vector.tolist()
  132 + else:
  133 + values = list(vector)
  134 + return [float(v) for v in values]
  135 +
  136 + @staticmethod
  137 + def _query_token_count(parsed_query: Optional[Any]) -> int:
  138 + if parsed_query is None:
  139 + return 0
  140 + query_tokens = getattr(parsed_query, "query_tokens", None) or []
  141 + return len(query_tokens)
  142 +
  143 + def get_text_knn_plan(self, parsed_query: Optional[Any] = None) -> Optional[KNNClausePlan]:
  144 + if not self.text_embedding_field:
  145 + return None
  146 + boost = self.knn_text_boost
  147 + final_knn_k = self.knn_text_k
  148 + final_knn_num_candidates = self.knn_text_num_candidates
  149 + if self._query_token_count(parsed_query) >= 5:
  150 + final_knn_k = self.knn_text_k_long
  151 + final_knn_num_candidates = self.knn_text_num_candidates_long
  152 + boost = self.knn_text_boost * 1.4
  153 + return self.KNNClausePlan(
  154 + field=str(self.text_embedding_field),
  155 + boost=float(boost),
  156 + k=int(final_knn_k),
  157 + num_candidates=int(final_knn_num_candidates),
  158 + )
  159 +
  160 + def get_image_knn_plan(self) -> Optional[KNNClausePlan]:
  161 + if not self.image_embedding_field:
  162 + return None
  163 + nested_path, _, _ = str(self.image_embedding_field).rpartition(".")
  164 + return self.KNNClausePlan(
  165 + field=str(self.image_embedding_field),
  166 + boost=float(self.knn_image_boost),
  167 + k=int(self.knn_image_k),
  168 + num_candidates=int(self.knn_image_num_candidates),
  169 + nested_path=nested_path or None,
  170 + )
  171 +
  172 + def build_text_knn_clause(
  173 + self,
  174 + query_vector: Any,
  175 + *,
  176 + parsed_query: Optional[Any] = None,
  177 + query_name: str = "knn_query",
  178 + ) -> Optional[Dict[str, Any]]:
  179 + plan = self.get_text_knn_plan(parsed_query)
  180 + if plan is None or query_vector is None:
  181 + return None
  182 + return {
  183 + "knn": {
  184 + "field": plan.field,
  185 + "query_vector": self._vector_to_list(query_vector),
  186 + "k": plan.k,
  187 + "num_candidates": plan.num_candidates,
  188 + "boost": plan.boost,
  189 + "_name": query_name,
  190 + }
  191 + }
  192 +
  193 + def build_image_knn_clause(
  194 + self,
  195 + image_query_vector: Any,
  196 + *,
  197 + query_name: str = "image_knn_query",
  198 + ) -> Optional[Dict[str, Any]]:
  199 + plan = self.get_image_knn_plan()
  200 + if plan is None or image_query_vector is None:
  201 + return None
  202 + image_knn_query = {
  203 + "field": plan.field,
  204 + "query_vector": self._vector_to_list(image_query_vector),
  205 + "k": plan.k,
  206 + "num_candidates": plan.num_candidates,
  207 + "boost": plan.boost,
  208 + }
  209 + if plan.nested_path:
  210 + return {
  211 + "nested": {
  212 + "path": plan.nested_path,
  213 + "_name": query_name,
  214 + "query": {"knn": image_knn_query},
  215 + "score_mode": "max",
  216 + }
  217 + }
  218 + return {
  219 + "knn": {
  220 + **image_knn_query,
  221 + "_name": query_name,
  222 + }
  223 + }
  224 +
  225 + def build_exact_text_knn_rescore_clause(
  226 + self,
  227 + query_vector: Any,
  228 + *,
  229 + parsed_query: Optional[Any] = None,
  230 + query_name: str = "exact_text_knn_query",
  231 + ) -> Optional[Dict[str, Any]]:
  232 + plan = self.get_text_knn_plan(parsed_query)
  233 + if plan is None or query_vector is None:
  234 + return None
  235 + return {
  236 + "script_score": {
  237 + "_name": query_name,
  238 + "query": {"exists": {"field": plan.field}},
  239 + "script": {
  240 + "source": (
  241 + f"((dotProduct(params.query_vector, '{plan.field}') + 1.0) / 2.0) * params.boost"
  242 + ),
  243 + "params": {
  244 + "query_vector": self._vector_to_list(query_vector),
  245 + "boost": float(plan.boost),
  246 + },
  247 + },
  248 + }
  249 + }
  250 +
  251 + def build_exact_image_knn_rescore_clause(
  252 + self,
  253 + image_query_vector: Any,
  254 + *,
  255 + query_name: str = "exact_image_knn_query",
  256 + ) -> Optional[Dict[str, Any]]:
  257 + plan = self.get_image_knn_plan()
  258 + if plan is None or image_query_vector is None:
  259 + return None
  260 + script_score_query = {
  261 + "query": {"exists": {"field": plan.field}},
  262 + "script": {
  263 + "source": (
  264 + f"((dotProduct(params.query_vector, '{plan.field}') + 1.0) / 2.0) * params.boost"
  265 + ),
  266 + "params": {
  267 + "query_vector": self._vector_to_list(image_query_vector),
  268 + "boost": float(plan.boost),
  269 + },
  270 + },
  271 + }
  272 + if plan.nested_path:
  273 + return {
  274 + "nested": {
  275 + "path": plan.nested_path,
  276 + "_name": query_name,
  277 + "score_mode": "max",
  278 + "query": {"script_score": script_score_query},
  279 + }
  280 + }
  281 + return {"script_score": {"_name": query_name, **script_score_query}}
  282 +
117 283 def _apply_source_filter(self, es_query: Dict[str, Any]) -> None:
118 284 """
119 285 Apply tri-state _source semantics:
... ... @@ -250,52 +416,21 @@ class ESQueryBuilder:
250 416 # 3. Add KNN search clauses alongside lexical clauses under the same bool.should
251 417 # Text KNN: k / num_candidates from config; long queries use *_long and higher boost
252 418 if has_embedding:
253   - text_knn_boost = self.knn_text_boost
254   - final_knn_k = self.knn_text_k
255   - final_knn_num_candidates = self.knn_text_num_candidates
256   - if parsed_query:
257   - query_tokens = getattr(parsed_query, 'query_tokens', None) or []
258   - token_count = len(query_tokens)
259   - if token_count >= 5:
260   - final_knn_k = self.knn_text_k_long
261   - final_knn_num_candidates = self.knn_text_num_candidates_long
262   - text_knn_boost = self.knn_text_boost * 1.4
263   - recall_clauses.append({
264   - "knn": {
265   - "field": self.text_embedding_field,
266   - "query_vector": query_vector.tolist(),
267   - "k": final_knn_k,
268   - "num_candidates": final_knn_num_candidates,
269   - "boost": text_knn_boost,
270   - "_name": "knn_query",
271   - }
272   - })
  419 + text_knn_clause = self.build_text_knn_clause(
  420 + query_vector,
  421 + parsed_query=parsed_query,
  422 + query_name="knn_query",
  423 + )
  424 + if text_knn_clause:
  425 + recall_clauses.append(text_knn_clause)
273 426  
274 427 if has_image_embedding:
275   - nested_path, _, _ = str(self.image_embedding_field).rpartition(".")
276   - image_knn_query = {
277   - "field": self.image_embedding_field,
278   - "query_vector": image_query_vector.tolist(),
279   - "k": self.knn_image_k,
280   - "num_candidates": self.knn_image_num_candidates,
281   - "boost": self.knn_image_boost,
282   - }
283   - if nested_path:
284   - recall_clauses.append({
285   - "nested": {
286   - "path": nested_path,
287   - "_name": "image_knn_query",
288   - "query": {"knn": image_knn_query},
289   - "score_mode": "max",
290   - }
291   - })
292   - else:
293   - recall_clauses.append({
294   - "knn": {
295   - **image_knn_query,
296   - "_name": "image_knn_query",
297   - }
298   - })
  428 + image_knn_clause = self.build_image_knn_clause(
  429 + image_query_vector,
  430 + query_name="image_knn_query",
  431 + )
  432 + if image_knn_clause:
  433 + recall_clauses.append(image_knn_clause)
299 434  
300 435 # 4. Build main query structure: filters and recall
301 436 if recall_clauses:
... ...
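Review note: the two exact-rescore builders above embed the same Painless expression, `((dotProduct(...) + 1.0) / 2.0) * params.boost`, and take their boost from the same plan objects as the ANN clauses. A minimal Python sketch of that arithmetic follows. The function names `exact_knn_score` and `long_query_boost` are hypothetical and exist only for illustration; the threshold of 5 tokens and the 1.4 factor are the literals from `get_text_knn_plan`.

```python
from math import isclose
from typing import Sequence


def exact_knn_score(query_vector: Sequence[float], doc_vector: Sequence[float], boost: float) -> float:
    """Mirror of the script_score source: map the dot product from [-1, 1] to [0, 1], then apply the boost."""
    dot = sum(q * d for q, d in zip(query_vector, doc_vector))
    return ((dot + 1.0) / 2.0) * boost


def long_query_boost(base_boost: float, token_count: int) -> float:
    """Hypothetical helper: queries with 5 or more tokens get the 1.4x text boost, as in get_text_knn_plan."""
    return base_boost * 1.4 if token_count >= 5 else base_boost


# Identical unit vectors give dot = 1.0, so the score equals the boost.
same_direction = exact_knn_score([1.0, 0.0], [1.0, 0.0], boost=2.0)
# Orthogonal unit vectors give dot = 0.0, so the score is half the boost.
orthogonal = exact_knn_score([1.0, 0.0], [0.0, 1.0], boost=2.0)
```

The `[0, 1]` range only holds if both vectors are unit-normalized; the diff does not show where normalization happens, so treat that as an assumption.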
search/rerank_client.py
... ... @@ -396,12 +396,50 @@ def _build_ltr_feature_block(
396 396 }
397 397  
398 398  
  399 +def _maybe_append_weighted_knn_terms(
  400 + *,
  401 + term_rows: List[Dict[str, Any]],
  402 + fusion: CoarseRankFusionConfig | RerankFusionConfig,
  403 + knn_components: Optional[Dict[str, Any]],
  404 +) -> None:
  405 + if not knn_components:
  406 + return
  407 +
  408 + weighted_text_knn_score = _to_score(knn_components.get("weighted_text_knn_score"))
  409 + weighted_image_knn_score = _to_score(knn_components.get("weighted_image_knn_score"))
  410 +
  411 + if float(getattr(fusion, "knn_text_exponent", 0.0)) != 0.0:
  412 + text_bias = float(getattr(fusion, "knn_text_bias", fusion.knn_bias))
  413 + term_rows.append(
  414 + {
  415 + "name": "weighted_text_knn_score",
  416 + "raw_score": weighted_text_knn_score,
  417 + "bias": text_bias,
  418 + "exponent": float(fusion.knn_text_exponent),
  419 + "factor": (max(weighted_text_knn_score, 0.0) + text_bias) ** float(fusion.knn_text_exponent),
  420 + }
  421 + )
  422 + if float(getattr(fusion, "knn_image_exponent", 0.0)) != 0.0:
  423 + image_bias = float(getattr(fusion, "knn_image_bias", fusion.knn_bias))
  424 + term_rows.append(
  425 + {
  426 + "name": "weighted_image_knn_score",
  427 + "raw_score": weighted_image_knn_score,
  428 + "bias": image_bias,
  429 + "exponent": float(fusion.knn_image_exponent),
  430 + "factor": (max(weighted_image_knn_score, 0.0) + image_bias)
  431 + ** float(fusion.knn_image_exponent),
  432 + }
  433 + )
  434 +
  435 +
399 436 def _compute_multiplicative_fusion(
400 437 *,
401 438 es_score: float,
402 439 text_score: float,
403 440 knn_score: float,
404 441 fusion: RerankFusionConfig,
  442 + knn_components: Optional[Dict[str, Any]] = None,
405 443 rerank_score: Optional[float] = None,
406 444 fine_score: Optional[float] = None,
407 445 style_boost: float = 1.0,
... ... @@ -427,6 +465,7 @@ def _compute_multiplicative_fusion(
427 465 _add_term("fine_score", fine_score, fusion.fine_bias, fusion.fine_exponent)
428 466 _add_term("text_score", text_score, fusion.text_bias, fusion.text_exponent)
429 467 _add_term("knn_score", knn_score, fusion.knn_bias, fusion.knn_exponent)
  468 + _maybe_append_weighted_knn_terms(term_rows=term_rows, fusion=fusion, knn_components=knn_components)
430 469  
431 470 fused = 1.0
432 471 factors: Dict[str, float] = {}
... ... @@ -450,12 +489,30 @@ def _multiply_coarse_fusion_factors(
450 489 es_score: float,
451 490 text_score: float,
452 491 knn_score: float,
  492 + knn_components: Dict[str, Any],
453 493 fusion: CoarseRankFusionConfig,
454   -) -> Tuple[float, float, float, float]:
  494 +) -> Tuple[float, float, float, float, float, float]:
455 495 es_factor = (max(es_score, 0.0) + fusion.es_bias) ** fusion.es_exponent
456 496 text_factor = (max(text_score, 0.0) + fusion.text_bias) ** fusion.text_exponent
457 497 knn_factor = (max(knn_score, 0.0) + fusion.knn_bias) ** fusion.knn_exponent
458   - return es_factor, text_factor, knn_factor, es_factor * text_factor * knn_factor
  498 + text_knn_bias = float(getattr(fusion, "knn_text_bias", fusion.knn_bias))
  499 + image_knn_bias = float(getattr(fusion, "knn_image_bias", fusion.knn_bias))
  500 + text_knn_factor = (
  501 + (max(_to_score(knn_components.get("weighted_text_knn_score")), 0.0) + text_knn_bias)
  502 + ** float(getattr(fusion, "knn_text_exponent", 0.0))
  503 + )
  504 + image_knn_factor = (
  505 + (max(_to_score(knn_components.get("weighted_image_knn_score")), 0.0) + image_knn_bias)
  506 + ** float(getattr(fusion, "knn_image_exponent", 0.0))
  507 + )
  508 + return (
  509 + es_factor,
  510 + text_factor,
  511 + knn_factor,
  512 + text_knn_factor,
  513 + image_knn_factor,
  514 + es_factor * text_factor * knn_factor * text_knn_factor * image_knn_factor,
  515 + )
459 516  
460 517  
461 518 def _has_selected_sku(hit: Dict[str, Any]) -> bool:
... ... @@ -481,10 +538,18 @@ def coarse_resort_hits(
481 538 knn_components = signal_bundle["knn_components"]
482 539 text_score = signal_bundle["text_score"]
483 540 knn_score = signal_bundle["knn_score"]
484   - es_factor, text_factor, knn_factor, coarse_score = _multiply_coarse_fusion_factors(
  541 + (
  542 + es_factor,
  543 + text_factor,
  544 + knn_factor,
  545 + text_knn_factor,
  546 + image_knn_factor,
  547 + coarse_score,
  548 + ) = _multiply_coarse_fusion_factors(
485 549 es_score=es_score,
486 550 text_score=text_score,
487 551 knn_score=knn_score,
  552 + knn_components=knn_components,
488 553 fusion=f,
489 554 )
490 555  
... ... @@ -535,6 +600,8 @@ def coarse_resort_hits(
535 600 "coarse_es_factor": es_factor,
536 601 "coarse_text_factor": text_factor,
537 602 "coarse_knn_factor": knn_factor,
  603 + "coarse_text_knn_factor": text_knn_factor,
  604 + "coarse_image_knn_factor": image_knn_factor,
538 605 "coarse_score": coarse_score,
539 606 "matched_queries": matched_queries,
540 607 "ltr_features": ltr_features,
... ... @@ -576,7 +643,7 @@ def fuse_scores_and_resort(
576 643 - _rerank_score: score returned by the rerank service
577 644 - _fused_score: fused score
578 645 - _text_score: text relevance score (prefers the base_query score from named queries)
579   - - _knn_score: KNN score (prefers the knn_query score from named queries)
  646 + - _knn_score: KNN score (prefers exact named queries; falls back to ANN named queries when missing)
580 647  
581 648 Args:
582 649 es_hits: list of ES hits (modified in place)
... ... @@ -612,6 +679,7 @@ def fuse_scores_and_resort(
612 679 text_score=text_score,
613 680 knn_score=knn_score,
614 681 fusion=f,
  682 + knn_components=knn_components,
615 683 style_boost=style_boost,
616 684 )
617 685 fused = fusion_result["score"]
... ... @@ -678,6 +746,8 @@ def fuse_scores_and_resort(
678 746 "es_factor": fusion_result["factors"].get("es_score"),
679 747 "text_factor": fusion_result["factors"].get("text_score"),
680 748 "knn_factor": fusion_result["factors"].get("knn_score"),
  749 + "text_knn_factor": fusion_result["factors"].get("weighted_text_knn_score"),
  750 + "image_knn_factor": fusion_result["factors"].get("weighted_image_knn_score"),
681 751 "style_intent_selected_sku": sku_selected,
682 752 "style_intent_selected_sku_boost": style_boost,
683 753 "matched_queries": signal_bundle["matched_queries"],
... ... @@ -810,6 +880,7 @@ def run_lightweight_rerank(
810 880 text_score=text_score,
811 881 knn_score=knn_score,
812 882 fusion=f,
  883 + knn_components=signal_bundle["knn_components"],
813 884 style_boost=style_boost,
814 885 )
815 886  
... ... @@ -846,6 +917,8 @@ def run_lightweight_rerank(
846 917 "es_factor": fusion_result["factors"].get("es_score"),
847 918 "text_factor": fusion_result["factors"].get("text_score"),
848 919 "knn_factor": fusion_result["factors"].get("knn_score"),
  920 + "text_knn_factor": fusion_result["factors"].get("weighted_text_knn_score"),
  921 + "image_knn_factor": fusion_result["factors"].get("weighted_image_knn_score"),
849 922 "style_intent_selected_sku": sku_selected,
850 923 "style_intent_selected_sku_boost": style_boost,
851 924 "ltr_features": ltr_features,
... ...
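Review note: every fusion term in this file has the shape `(max(score, 0.0) + bias) ** exponent`, and the docstring change says the KNN score prefers exact named queries over ANN ones. The sketch below illustrates both ideas under stated assumptions. `fusion_factor` and `pick_knn_score` are hypothetical names, and `matched_queries` is assumed to be a name-to-score dict as in the test fixtures; the real lookup lives in `_build_hit_signal_bundle`, which this diff does not show.

```python
from typing import Dict


def fusion_factor(raw_score: float, bias: float, exponent: float) -> float:
    """One multiplicative term: clamp negatives to zero, add the bias, raise to the exponent."""
    return (max(raw_score, 0.0) + bias) ** exponent


def pick_knn_score(matched_queries: Dict[str, float], exact_name: str, ann_name: str) -> float:
    """Hypothetical fallback: use the exact rescore score when present, otherwise the ANN score."""
    if exact_name in matched_queries:
        return float(matched_queries[exact_name])
    return float(matched_queries.get(ann_name, 0.0))


# An exponent of 0.0 yields a neutral factor of 1.0 regardless of the score.
neutral = fusion_factor(0.8, 0.1, 0.0)
# A hit with both scores uses the exact one; a hit outside the rescore window falls back to ANN.
with_exact = pick_knn_score({"knn_query": 0.4, "exact_text_knn_query": 0.45}, "exact_text_knn_query", "knn_query")
ann_only = pick_knn_score({"knn_query": 0.4}, "exact_text_knn_query", "knn_query")
```

Because a zero exponent gives a factor of 1.0, skipping the term (as `_maybe_append_weighted_knn_terms` does) and always multiplying it in (as `_multiply_coarse_fusion_factors` does) produce the same fused score; only the debug output differs.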
search/searcher.py
... ... @@ -242,67 +242,29 @@ class Searcher:
242 242 return configured
243 243 return int(self.config.rerank.rerank_window)
244 244  
245   - @staticmethod
246   - def _vector_to_list(vector: Any) -> List[float]:
247   - if vector is None:
248   - return []
249   - if hasattr(vector, "tolist"):
250   - values = vector.tolist()
251   - else:
252   - values = list(vector)
253   - return [float(v) for v in values]
254   -
255 245 def _build_exact_knn_rescore(
256 246 self,
257 247 *,
258 248 query_vector: Any,
259 249 image_query_vector: Any,
  250 + parsed_query: Optional[ParsedQuery] = None,
260 251 ) -> Optional[Dict[str, Any]]:
261 252 clauses: List[Dict[str, Any]] = []
262 253  
263   - if query_vector is not None and self.text_embedding_field:
264   - clauses.append(
265   - {
266   - "script_score": {
267   - "_name": "exact_text_knn_query",
268   - "query": {"exists": {"field": self.text_embedding_field}},
269   - "script": {
270   - # Keep exact score on the same [0, 1]-ish scale as KNN dot_product recall.
271   - "source": (
272   - f"(dotProduct(params.query_vector, '{self.text_embedding_field}') + 1.0) / 2.0"
273   - ),
274   - "params": {"query_vector": self._vector_to_list(query_vector)},
275   - },
276   - }
277   - }
278   - )
  254 + text_clause = self.query_builder.build_exact_text_knn_rescore_clause(
  255 + query_vector,
  256 + parsed_query=parsed_query,
  257 + query_name="exact_text_knn_query",
  258 + )
  259 + if text_clause:
  260 + clauses.append(text_clause)
279 261  
280   - if image_query_vector is not None and self.image_embedding_field:
281   - nested_path, _, _ = str(self.image_embedding_field).rpartition(".")
282   - if nested_path:
283   - clauses.append(
284   - {
285   - "nested": {
286   - "path": nested_path,
287   - "_name": "exact_image_knn_query",
288   - "score_mode": "max",
289   - "query": {
290   - "script_score": {
291   - "query": {"exists": {"field": self.image_embedding_field}},
292   - "script": {
293   - # Keep exact score on the same [0, 1]-ish scale as KNN dot_product recall.
294   - "source": (
295   - f"(dotProduct(params.query_vector, '{self.image_embedding_field}') + 1.0) / 2.0"
296   - ),
297   - "params": {
298   - "query_vector": self._vector_to_list(image_query_vector),
299   - },
300   - },
301   - }
302   - },
303   - }
304   - }
305   - )
  262 + image_clause = self.query_builder.build_exact_image_knn_rescore_clause(
  263 + image_query_vector,
  264 + query_name="exact_image_knn_query",
  265 + )
  266 + if image_clause:
  267 + clauses.append(image_clause)
306 268  
307 269 if not clauses:
308 270 return None
... ... @@ -330,12 +292,14 @@ class Searcher:
330 292 in_rank_window: bool,
331 293 query_vector: Any,
332 294 image_query_vector: Any,
  295 + parsed_query: Optional[ParsedQuery] = None,
333 296 ) -> None:
334 297 if not in_rank_window or not self.config.rerank.exact_knn_rescore_enabled:
335 298 return
336 299 rescore = self._build_exact_knn_rescore(
337 300 query_vector=query_vector,
338 301 image_query_vector=image_query_vector,
  302 + parsed_query=parsed_query,
339 303 )
340 304 if not rescore:
341 305 return
... ... @@ -689,6 +653,7 @@ class Searcher:
689 653 in_rank_window=in_rank_window,
690 654 query_vector=parsed_query.query_vector if enable_embedding else None,
691 655 image_query_vector=image_query_vector,
  656 + parsed_query=parsed_query,
692 657 )
693 658  
694 659 # Add facets for faceted search
... ...
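Review note: the part of `_build_exact_knn_rescore` that wraps the clauses is collapsed in this view. Based on the commit message (`score_mode=total`, `rescore_query_weight=0.0`, window from `exact_knn_rescore_window`) and on the test that iterates a `should` list, the resulting body probably resembles the sketch below. The function name `build_rescore_body` and the exact nesting are assumptions, not the author's code; the keys themselves are standard Elasticsearch rescore parameters.

```python
from typing import Any, Dict, List, Optional


def build_rescore_body(clauses: List[Dict[str, Any]], window_size: int) -> Optional[Dict[str, Any]]:
    """Hypothetical wrapper: attach named exact-score clauses without changing the ranking."""
    if not clauses:
        return None
    return {
        "rescore": {
            "window_size": int(window_size),
            "query": {
                "score_mode": "total",
                # Keep the first-pass score untouched ...
                "query_weight": 1.0,
                # ... and give the rescore query zero weight, so it only contributes named scores.
                "rescore_query_weight": 0.0,
                "rescore_query": {"bool": {"should": clauses}},
            },
        }
    }


body = build_rescore_body(
    [{"script_score": {"_name": "exact_text_knn_query"}}],  # placeholder clause, not a full query
    window_size=160,
)
```

With a weight of 0.0 the final `_score` equals the first-pass score, so hit order is preserved and the exact values are only visible through `matched_queries`.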
tests/test_es_query_builder.py
... ... @@ -208,3 +208,36 @@ def test_image_knn_clause_is_added_alongside_base_translation_and_text_knn():
208 208 assert image_knn["path"] == "image_embedding"
209 209 assert image_knn["score_mode"] == "max"
210 210 assert image_knn["query"]["knn"]["field"] == "image_embedding.vector"
  211 +
  212 +
  213 +def test_text_knn_plan_is_reused_for_ann_and_exact_rescore():
  214 + qb = _builder()
  215 + parsed_query = SimpleNamespace(query_tokens=["a", "b", "c", "d", "e"])
  216 +
  217 + ann_clause = qb.build_text_knn_clause(
  218 + np.array([0.1, 0.2, 0.3]),
  219 + parsed_query=parsed_query,
  220 + )
  221 + exact_clause = qb.build_exact_text_knn_rescore_clause(
  222 + np.array([0.1, 0.2, 0.3]),
  223 + parsed_query=parsed_query,
  224 + )
  225 +
  226 + assert ann_clause is not None
  227 + assert exact_clause is not None
  228 + assert ann_clause["knn"]["k"] == qb.knn_text_k_long
  229 + assert ann_clause["knn"]["num_candidates"] == qb.knn_text_num_candidates_long
  230 + assert ann_clause["knn"]["boost"] == qb.knn_text_boost * 1.4
  231 + assert exact_clause["script_score"]["script"]["params"]["boost"] == qb.knn_text_boost * 1.4
  232 +
  233 +
  234 +def test_image_knn_plan_is_reused_for_ann_and_exact_rescore():
  235 + qb = _builder()
  236 +
  237 + ann_clause = qb.build_image_knn_clause(np.array([0.4, 0.5, 0.6]))
  238 + exact_clause = qb.build_exact_image_knn_rescore_clause(np.array([0.4, 0.5, 0.6]))
  239 +
  240 + assert ann_clause is not None
  241 + assert exact_clause is not None
  242 + assert ann_clause["nested"]["query"]["knn"]["boost"] == qb.knn_image_boost
  243 + assert exact_clause["nested"]["query"]["script_score"]["script"]["params"]["boost"] == qb.knn_image_boost
... ...
tests/test_rerank_client.py
1 1 from math import isclose
2 2  
3   -from config.schema import RerankFusionConfig
4   -from search.rerank_client import fuse_scores_and_resort, run_lightweight_rerank
  3 +from config.schema import CoarseRankFusionConfig, RerankFusionConfig
  4 +from search.rerank_client import coarse_resort_hits, fuse_scores_and_resort, run_lightweight_rerank
5 5  
6 6  
7 7 def test_fuse_scores_and_resort_aggregates_text_components_and_keeps_rerank_primary():
... ... @@ -257,6 +257,96 @@ def test_fuse_scores_and_resort_applies_knn_dismax_weights_and_tie_breaker():
257 257 assert isclose(debug[0]["knn_support_score"], 0.5, rel_tol=1e-9)
258 258  
259 259  
  260 +def test_fuse_scores_and_resort_can_add_weighted_text_and_image_knn_factors():
  261 + hits = [
  262 + {
  263 + "_id": "a",
  264 + "_score": 1.0,
  265 + "matched_queries": {
  266 + "base_query": 2.0,
  267 + "knn_query": 0.4,
  268 + "image_knn_query": 0.5,
  269 + },
  270 + }
  271 + ]
  272 + fusion = RerankFusionConfig(
  273 + rerank_bias=0.0,
  274 + rerank_exponent=1.0,
  275 + text_bias=0.0,
  276 + text_exponent=1.0,
  277 + knn_text_weight=2.0,
  278 + knn_image_weight=1.0,
  279 + knn_tie_breaker=0.25,
  280 + knn_bias=0.1,
  281 + knn_exponent=1.0,
  282 + knn_text_exponent=2.0,
  283 + knn_image_exponent=3.0,
  284 + )
  285 +
  286 + debug = fuse_scores_and_resort(hits, [0.8], fusion=fusion, debug=True)
  287 +
  288 + weighted_text_knn = 0.8
  289 + weighted_image_knn = 0.5
  290 + expected_knn = weighted_text_knn + 0.25 * weighted_image_knn
  291 + expected_fused = (
  292 + 0.8
  293 + * 2.0
  294 + * (expected_knn + 0.1)
  295 + * ((weighted_text_knn + 0.1) ** 2.0)
  296 + * ((weighted_image_knn + 0.1) ** 3.0)
  297 + )
  298 +
  299 + assert isclose(hits[0]["_fused_score"], expected_fused, rel_tol=1e-9)
  300 + assert isclose(debug[0]["text_knn_factor"], (weighted_text_knn + 0.1) ** 2.0, rel_tol=1e-9)
  301 + assert isclose(debug[0]["image_knn_factor"], (weighted_image_knn + 0.1) ** 3.0, rel_tol=1e-9)
  302 + assert "weighted_text_knn_score=" in debug[0]["fusion_summary"]
  303 + assert "weighted_image_knn_score=" in debug[0]["fusion_summary"]
  304 +
  305 +
  306 +def test_coarse_resort_hits_can_add_weighted_text_and_image_knn_factors():
  307 + hits = [
  308 + {
  309 + "_id": "coarse-a",
  310 + "_score": 1.0,
  311 + "matched_queries": {
  312 + "base_query": 2.0,
  313 + "knn_query": 0.4,
  314 + "image_knn_query": 0.5,
  315 + },
  316 + }
  317 + ]
  318 + fusion = CoarseRankFusionConfig(
  319 + es_bias=0.0,
  320 + es_exponent=1.0,
  321 + text_bias=0.0,
  322 + text_exponent=1.0,
  323 + knn_text_weight=2.0,
  324 + knn_image_weight=1.0,
  325 + knn_tie_breaker=0.25,
  326 + knn_bias=0.1,
  327 + knn_exponent=1.0,
  328 + knn_text_exponent=2.0,
  329 + knn_image_exponent=3.0,
  330 + )
  331 +
  332 + debug = coarse_resort_hits(hits, fusion=fusion, debug=True)
  333 +
  334 + weighted_text_knn = 0.8
  335 + weighted_image_knn = 0.5
  336 + expected_knn = weighted_text_knn + 0.25 * weighted_image_knn
  337 + expected_coarse = (
  338 + 1.0
  339 + * 2.0
  340 + * (expected_knn + 0.1)
  341 + * ((weighted_text_knn + 0.1) ** 2.0)
  342 + * ((weighted_image_knn + 0.1) ** 3.0)
  343 + )
  344 +
  345 + assert isclose(hits[0]["_coarse_score"], expected_coarse, rel_tol=1e-9)
  346 + assert isclose(debug[0]["coarse_text_knn_factor"], (weighted_text_knn + 0.1) ** 2.0, rel_tol=1e-9)
  347 + assert isclose(debug[0]["coarse_image_knn_factor"], (weighted_image_knn + 0.1) ** 3.0, rel_tol=1e-9)
  348 +
  349 +
260 350 def test_run_lightweight_rerank_sorts_by_fused_stage_score(monkeypatch):
261 351 hits = [
262 352 {
... ...
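Review note: the expected values in the two new tests can be checked by hand. The sketch below reproduces the arithmetic of `test_fuse_scores_and_resort_can_add_weighted_text_and_image_knn_factors` using only the numbers in the fixture. The dis-max form (best weighted score plus `knn_tie_breaker` times the other) is inferred from the test's `expected_knn` line, not from the implementation.

```python
from math import isclose

# Fixture values from the test.
knn_query, image_knn_query = 0.4, 0.5
knn_text_weight, knn_image_weight, knn_tie_breaker = 2.0, 1.0, 0.25
knn_bias, knn_text_exponent, knn_image_exponent = 0.1, 2.0, 3.0
rerank_score, text_score = 0.8, 2.0

# Per-modality weighted scores.
weighted_text_knn = knn_query * knn_text_weight            # 0.4 * 2.0
weighted_image_knn = image_knn_query * knn_image_weight    # 0.5 * 1.0

# Combined KNN score: the larger weighted score plus the tie-breaker share of the smaller one.
best = max(weighted_text_knn, weighted_image_knn)
other = min(weighted_text_knn, weighted_image_knn)
expected_knn = best + knn_tie_breaker * other

# The two new factors fall back to knn_bias because no per-modality bias is configured.
text_knn_factor = (weighted_text_knn + knn_bias) ** knn_text_exponent
image_knn_factor = (weighted_image_knn + knn_bias) ** knn_image_exponent

expected_fused = rerank_score * text_score * (expected_knn + knn_bias) * text_knn_factor * image_knn_factor
```

The coarse test follows the same arithmetic with the ES score `1.0` in place of the rerank score `0.8`.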
tests/test_search_rerank_window.py
... ... @@ -1055,6 +1055,7 @@ def test_searcher_attaches_exact_knn_rescore_for_rank_window(monkeypatch):
1055 1055 translations={},
1056 1056 query_vector=np.array([0.1, 0.2, 0.3], dtype=np.float32),
1057 1057 image_query_vector=np.array([0.4, 0.5, 0.6], dtype=np.float32),
  1058 + query_tokens=["dress", "formal", "spring", "summer", "floral"],
1058 1059 )
1059 1060  
1060 1061 es_client = _FakeESClient(total_hits=5)
... ... @@ -1081,8 +1082,12 @@ def test_searcher_attaches_exact_knn_rescore_for_rank_window(monkeypatch):
1081 1082 es_index_name=base.es_index_name,
1082 1083 es_settings=base.es_settings,
1083 1084 )
1084   - searcher = _build_searcher(config, es_client)
1085   - searcher.query_parser = _VectorQueryParser()
  1085 + searcher = Searcher(
  1086 + es_client=es_client,
  1087 + config=config,
  1088 + query_parser=_VectorQueryParser(),
  1089 + image_encoder=SimpleNamespace(),
  1090 + )
1086 1091 context = create_request_context(reqid="exact-rescore", uid="u-exact")
1087 1092  
1088 1093 monkeypatch.setattr(
... ... @@ -1112,6 +1117,36 @@ def test_searcher_attaches_exact_knn_rescore_for_rank_window(monkeypatch):
1112 1117 elif "nested" in clause:
1113 1118 names.append(clause["nested"]["_name"])
1114 1119 assert names == ["exact_text_knn_query", "exact_image_knn_query"]
  1120 + recall_query = body["query"]
  1121 + if "bool" in recall_query and recall_query["bool"].get("must"):
  1122 + recall_query = recall_query["bool"]["must"][0]
  1123 + if "function_score" in recall_query:
  1124 + recall_query = recall_query["function_score"]["query"]
  1125 + recall_should = recall_query["bool"]["should"]
  1126 + text_knn_clause = next(
  1127 + clause["knn"]
  1128 + for clause in recall_should
  1129 + if clause.get("knn", {}).get("_name") == "knn_query"
  1130 + )
  1131 + image_knn_clause = next(
  1132 + clause["nested"]["query"]["knn"]
  1133 + for clause in recall_should
  1134 + if clause.get("nested", {}).get("_name") == "image_knn_query"
  1135 + )
  1136 + exact_text_clause = next(
  1137 + clause["script_score"]
  1138 + for clause in should
  1139 + if clause.get("script_score", {}).get("_name") == "exact_text_knn_query"
  1140 + )
  1141 + exact_image_clause = next(
  1142 + clause["nested"]["query"]["script_score"]
  1143 + for clause in should
  1144 + if clause.get("nested", {}).get("_name") == "exact_image_knn_query"
  1145 + )
  1146 + assert text_knn_clause["boost"] == 28.0
  1147 + assert exact_text_clause["script"]["params"]["boost"] == text_knn_clause["boost"]
  1148 + assert image_knn_clause["boost"] == 20.0
  1149 + assert exact_image_clause["script"]["params"]["boost"] == image_knn_clause["boost"]
1115 1150  
1116 1151  
1117 1152 def test_searcher_skips_exact_knn_rescore_outside_rank_window(monkeypatch):
... ...