Commit 47452e1dd6cd19314e6f867e4b5d346ddbc99651

Authored by tangwang
1 parent 317c5d2c

feat(search): support configurable exact vector rescoring (exact rescore) to fix missing ANN scores within the top N

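 Changes

1. **New configuration options** (`config/config.yaml`)
   - `exact_knn_rescore_enabled`: whether to enable exact vector rescoring; defaults to true
   - `exact_knn_rescore_window`: rescore window size; defaults to 160 (decoupled from `rerank_window`, so it can be configured independently)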

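2. **ES query layer changes** (`search/searcher.py`)
   - In the first ES search, inject a rescore phase for the documents within `window_size`, depending on configuration
   - The `rescore_query` contains two named `script_score` clauses:
     - `exact_text_knn_query`: computes the exact dot product against the text vector
     - `exact_image_knn_query`: computes the exact dot product against the image vector
   - Currently uses `score_mode=total` with `rescore_query_weight=0.0`, so the rescore **only supplies scores and does not change the ranking**; the exact scores appear only in `matched_queries`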

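3. **Unified vector score boost logic** (`search/es_query_builder.py`)
   - Add a `_get_knn_plan()` method that centralizes the boost rules for text and image KNN
   - For long queries (token count above a threshold), the text boost is multiplied by an additional factor of 1.4
   - Exact rescore and ANN recall **share the same boost rules**, which keeps their scores on the same scale
   - The existing ANN query construction logic is migrated to this unified entry point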

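4. **Score priority in the fusion stage** (`search/rerank_client.py`)
   - `_build_hit_signal_bundle()` now handles all vector score reads in one place
   - Read `exact_text_knn_query` / `exact_image_knn_query` from `matched_queries` first
   - If they are absent, fall back to the original `knn_query` / `image_knn_query` (ANN scores)
   - Covers the coarse_rank, fine_rank, and rerank stages, avoiding a separate patch in each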

5. **Test coverage**
   - `tests/test_es_query_builder.py`: verifies that ANN and exact share the boost rules
   - `tests/test_search_rerank_window.py`: verifies that the rescore window and named queries are injected correctly
   - `tests/test_rerank_client.py`: verifies the exact-first, ANN-fallback logic
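
The rescore phase described in item 2 can be sketched as a request-body builder. This is a minimal illustration, not the code in `search/searcher.py`: the function names, the `exists` inner query, and passing the boost through `params.boost` are assumptions, and nested-field handling for `image_embedding.vector` is not addressed.

```python
from typing import Any, Dict, List, Optional

# Painless expression from the commit message, with the boost applied in-script (assumption).
_SCRIPT_TEMPLATE = "((dotProduct(params.query_vector, '{field}') + 1.0) / 2.0) * params.boost"


def _named_script_score(name: str, field: str, vector: List[float], boost: float) -> Dict[str, Any]:
    """Build one named script_score clause; the `exists` inner query is an assumption."""
    return {
        "script_score": {
            "_name": name,
            "query": {"exists": {"field": field}},
            "script": {
                "source": _SCRIPT_TEMPLATE.format(field=field),
                "params": {"query_vector": vector, "boost": boost},
            },
        }
    }


def build_exact_rescore(
    window_size: int,
    text_vector: Optional[List[float]] = None,
    image_vector: Optional[List[float]] = None,
    text_boost: float = 4.0,
    image_boost: float = 4.0,
) -> Optional[Dict[str, Any]]:
    """Return an ES `rescore` section, or None when there is no query vector."""
    clauses = []
    if text_vector is not None:
        clauses.append(
            _named_script_score("exact_text_knn_query", "title_embedding", text_vector, text_boost)
        )
    if image_vector is not None:
        clauses.append(
            _named_script_score("exact_image_knn_query", "image_embedding.vector", image_vector, image_boost)
        )
    if not clauses:
        return None
    return {
        "window_size": window_size,
        "query": {
            "rescore_query": {"bool": {"should": clauses}},
            "query_weight": 1.0,
            # Weight 0.0: the rescore adds nothing to _score, so the order is unchanged.
            "rescore_query_weight": 0.0,
            "score_mode": "total",
        },
    }
```

With `score_mode=total` the final score is `query_weight * original + rescore_query_weight * rescore`, so a weight of 0.0 leaves the original ranking intact while the named clauses still report their scores.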

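 Technical details

- **Exact vector scoring script** (Painless)
  ```painless
  // Text: (dotProduct + 1.0) / 2.0
  (dotProduct(params.query_vector, 'title_embedding') + 1.0) / 2.0
  // Images use the same formula, with field 'image_embedding.vector'
  ```
  The result is multiplied by the unified boost (from the `knn_text_boost` / `knn_image_boost` settings and the long-query amplification factor).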

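- **How named queries are preserved**
  - The main query already sets `include_named_queries_score: true`
  - Scores from the named scripts in the rescore phase are merged into each hit's `matched_queries`
  - `_extract_named_score()` extracts them by name, exactly as the original ANN scores are accessed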

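- **Performance impact** (top 160, 6 real queries, average of 3 rounds after warm-up)
  - `elasticsearch_search_primary` latency: 124.71ms → 136.60ms (+11.89ms, +9.53%)
  - `total_search` is heavily affected by jitter in other components and is not used as the primary reference
  - This overhead is acceptable; no timeouts or resource bottlenecks were observed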
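
The exact-first, ANN-fallback read of `matched_queries` can be sketched as follows. The helper names `first_named_score` and `pick_knn_scores` are hypothetical and are not the repository's `_extract_named_score()` or `_build_hit_signal_bundle()`; the sketch assumes `matched_queries` is a name-to-score mapping, which is the shape Elasticsearch returns when `include_named_queries_score` is enabled.

```python
from typing import Any, Dict, Mapping, Optional, Sequence


def first_named_score(matched_queries: Any, names: Sequence[str]) -> Optional[float]:
    """Return the score of the first name found in matched_queries, else None."""
    if not isinstance(matched_queries, Mapping):
        # Without include_named_queries_score, ES returns only a list of names.
        return None
    for name in names:
        if name in matched_queries:
            return float(matched_queries[name])
    return None


def pick_knn_scores(hit: Mapping[str, Any]) -> Dict[str, Optional[float]]:
    """Prefer the exact rescore scores; fall back to the ANN scores when absent."""
    mq = hit.get("matched_queries")
    return {
        "text_knn": first_named_score(mq, ("exact_text_knn_query", "knn_query")),
        "image_knn": first_named_score(mq, ("exact_image_knn_query", "image_knn_query")),
    }
```

Ordering the candidate names by priority keeps the fallback rule in one place, so the coarse, fine, and rerank stages can share it.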

 Configuration example

```yaml
search:
  exact_knn_rescore_enabled: true
  exact_knn_rescore_window: 160
  knn_text_boost: 4.0
  knn_image_boost: 4.0
  long_query_token_threshold: 8
  long_query_text_boost_factor: 1.4
```
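
As a worked example of how these values combine, the sketch below mirrors the boost rule and the normalized dot product in plain Python. It is illustrative only: the function names are hypothetical, "exceeds the threshold" is read as strictly greater than, and the vectors are assumed to be unit-normalized so that the dot product lies in [-1, 1].

```python
from typing import Sequence


def effective_text_boost(
    token_count: int,
    knn_text_boost: float = 4.0,
    long_query_token_threshold: int = 8,
    long_query_text_boost_factor: float = 1.4,
) -> float:
    """Text KNN boost, amplified for long queries (strictly above the threshold)."""
    if token_count > long_query_token_threshold:
        return knn_text_boost * long_query_text_boost_factor
    return knn_text_boost


def exact_knn_score(query_vector: Sequence[float], doc_vector: Sequence[float], boost: float) -> float:
    """Pure-Python mirror of the Painless script: ((dot + 1) / 2) * boost."""
    dot = sum(q * d for q, d in zip(query_vector, doc_vector))
    return (dot + 1.0) / 2.0 * boost
```

Under these assumptions a short query scores in the range [0, 4.0] and a long query in [0, 5.6], with identical vectors reaching the upper bound and orthogonal vectors landing at half of it.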

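 Known issues and follow-up plan

- Tuning experiments on the current version show that, with exact rescore enabled, the primary metric for some queries (strong type constraints combined with many similar styles/colors) drops by about 0.031 relative to the baseline (exact=false): 0.6009 → 0.5697
- Root cause: exact rescore turns KNN from a sparse auxiliary signal into a dense ranking factor, which changes the ranking semantics of the coarse stage; adjusting the existing `knn_bias/exponent` alone cannot fully recover the loss
- Next iteration: **do not force exact scores in the coarse stage for now**, and prefer exact only in fine/rerank; or have coarse use an "ANN first, exact only fills in missing scores" strategy, then re-evaluate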
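
One possible reading of the two follow-up options, sketched as a per-stage score selector. Nothing here exists in the commit: `select_knn_score`, the `stage` values, and the `coarse_policy` switch are hypothetical names used only to make the proposed behaviour concrete.

```python
from typing import Mapping, Optional


def select_knn_score(
    matched_queries: Mapping[str, float],
    ann_name: str,
    exact_name: str,
    stage: str,
    coarse_policy: str = "ann_first",
) -> Optional[float]:
    """Pick the KNN score for one stage under the proposed follow-up policies."""
    ann = matched_queries.get(ann_name)
    exact = matched_queries.get(exact_name)
    if stage == "coarse":
        if coarse_policy == "ann_only":
            # Option 1: the coarse stage ignores exact scores entirely.
            chosen = ann
        else:
            # Option 2 ("ann_first"): exact only fills in when the ANN score is missing.
            chosen = ann if ann is not None else exact
    else:
        # fine / rerank keep the exact-first behaviour introduced by this commit.
        chosen = exact if exact is not None else ann
    return None if chosen is None else float(chosen)
```

Under `ann_only`, documents outside the ANN recall set still have no KNN score in the coarse stage, which keeps that signal sparse as it was in the baseline.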

 Related files

- `config/config.yaml`
- `search/searcher.py`
- `search/es_query_builder.py`
- `search/rerank_client.py`
- `tests/test_es_query_builder.py`
- `tests/test_search_rerank_window.py`
- `tests/test_rerank_client.py`
- `scripts/evaluation/exact_rescore_coarse_tuning_round2.json` (tuning experiment record)
config/config.yaml
1 -# Unified Configuration for Multi-Tenant Search Engine  
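2 -# Unified configuration file; all tenants share one set of configuration
3 -# Note: the index structure is defined by mappings/search_products.json; this file only configures search behavior
4 -#
5 -# Convention: the keys below are required; process environment variables can override the semantically equivalent items in infrastructure / runtime
6 -# (e.g. ES_HOST, API_PORT); when an environment variable is not set, the value in this file is used.
7 -
8 -# Process / bind addresses (the environment variables APP_ENV, RUNTIME_ENV, and ES_INDEX_NAMESPACE can override the first two)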
9 runtime: 1 runtime:
10 environment: prod 2 environment: prod
11 index_namespace: '' 3 index_namespace: ''
@@ -21,8 +13,6 @@ runtime: @@ -21,8 +13,6 @@ runtime:
21 translator_port: 6006 13 translator_port: 6006
22 reranker_host: 0.0.0.0 14 reranker_host: 0.0.0.0
23 reranker_port: 6007 15 reranker_port: 6007
24 -  
25 -# Infrastructure connections (sensitive items are read from environment variables first: ES_*, REDIS_*, DB_*, DASHSCOPE_API_KEY, DEEPL_AUTH_KEY)
26 infrastructure: 16 infrastructure:
27 elasticsearch: 17 elasticsearch:
28 host: http://localhost:9200 18 host: http://localhost:9200
@@ -49,23 +39,12 @@ infrastructure: @@ -49,23 +39,12 @@ infrastructure:
49 secrets: 39 secrets:
50 dashscope_api_key: null 40 dashscope_api_key: null
51 deepl_auth_key: null 41 deepl_auth_key: null
52 -  
53 -# Elasticsearch Index  
54 es_index_name: search_products 42 es_index_name: search_products
55 -  
56 -# Search domains / index list (may be an empty list; every field of each item must be given explicitly)
57 indexes: [] 43 indexes: []
58 -  
59 -# Config assets  
60 assets: 44 assets:
61 query_rewrite_dictionary_path: config/dictionaries/query_rewrite.dict 45 query_rewrite_dictionary_path: config/dictionaries/query_rewrite.dict
62 -  
63 -# Product content understanding (LLM enrich-content) configuration  
64 product_enrich: 46 product_enrich:
65 max_workers: 40 47 max_workers: 40
66 -  
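67 -# Offline / web relevance evaluation (scripts/evaluation, eval-web)
68 -# These defaults are used when the CLI does not pass arguments explicitly; when search_base_url is not configured it defaults to http://127.0.0.1:{runtime.api_port}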
69 search_evaluation: 48 search_evaluation:
70 artifact_root: artifacts/search_evaluation 49 artifact_root: artifacts/search_evaluation
71 queries_file: scripts/evaluation/queries/queries.txt 50 queries_file: scripts/evaluation/queries/queries.txt
@@ -98,23 +77,18 @@ search_evaluation: @@ -98,23 +77,18 @@ search_evaluation:
98 rebuild_irrelevant_stop_ratio: 0.799 77 rebuild_irrelevant_stop_ratio: 0.799
99 rebuild_irrel_low_combined_stop_ratio: 0.959 78 rebuild_irrel_low_combined_stop_ratio: 0.959
100 rebuild_irrelevant_stop_streak: 3 79 rebuild_irrelevant_stop_streak: 3
101 -  
102 -# ES Index Settings (basic settings)
103 es_settings: 80 es_settings:
104 number_of_shards: 1 81 number_of_shards: 1
105 number_of_replicas: 0 82 number_of_replicas: 0
106 refresh_interval: 30s 83 refresh_interval: 30s
107 84
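108 -# Field weight configuration (field boosts used at search time)
109 -# Configured uniformly by field base name; at query time .{lang} is appended dynamically for the actual search language.
110 -# To tune the weight for a single language, an explicit key can also be added (e.g. title.de: 3.2). 85 +# Configured uniformly by field base name; at query time .{lang} is appended dynamically for the actual search language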
111 field_boosts: 86 field_boosts:
112 title: 3.0 87 title: 3.0
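113 # qanchors and enriched_tags also appear in enriched_attributes.value, so their effective weight is their own weight plus the weight of enriched_attributes.value 88 # qanchors and enriched_tags also appear in enriched_attributes.value, so their effective weight is their own weight plus the weight of enriched_attributes.value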
114 qanchors: 1.0 89 qanchors: 1.0
115 enriched_tags: 1.0 90 enriched_tags: 1.0
116 enriched_attributes.value: 1.5 91 enriched_attributes.value: 1.5
117 - # enriched_taxonomy_attributes.value: 0.3  
118 category_name_text: 2.0 92 category_name_text: 2.0
119 category_path: 2.0 93 category_path: 2.0
120 keywords: 2.0 94 keywords: 2.0
@@ -126,38 +100,25 @@ field_boosts: @@ -126,38 +100,25 @@ field_boosts:
126 description: 1.0 100 description: 1.0
127 vendor: 1.0 101 vendor: 1.0
128 102
129 -# Query Configuration
130 query_config: 103 query_config:
131 - # Supported languages
132 supported_languages: 104 supported_languages:
133 - zh 105 - zh
134 - en 106 - en
135 default_language: en 107 default_language: en
136 -  
137 - # Feature switches (the translation switch is controlled by tenant_config)
138 enable_text_embedding: true 108 enable_text_embedding: true
139 enable_query_rewrite: true 109 enable_query_rewrite: true
140 110
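141 - # Query translation models (must match an entry in services.translation.capabilities)
142 - # Source language is within the tenant's index_languages: primary recall can hit the source-language fields; use the three items below.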
143 - zh_to_en_model: nllb-200-distilled-600m # "opus-mt-zh-en"  
144 - en_to_zh_model: nllb-200-distilled-600m # "opus-mt-en-zh"  
145 - default_translation_model: nllb-200-distilled-600m  
146 - # zh_to_en_model: deepl  
147 - # en_to_zh_model: deepl  
148 - # default_translation_model: deepl  
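149 - # Source language is not in index_languages: translation matters more for searchable text, so it can be specified separately (defaults to the previous group if omitted)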
150 - zh_to_en_model__source_not_in_index: nllb-200-distilled-600m  
151 - en_to_zh_model__source_not_in_index: nllb-200-distilled-600m  
152 - default_translation_model__source_not_in_index: nllb-200-distilled-600m  
153 - # zh_to_en_model__source_not_in_index: deepl  
154 - # en_to_zh_model__source_not_in_index: deepl  
155 - # default_translation_model__source_not_in_index: deepl 111 + zh_to_en_model: deepl # nllb-200-distilled-600m
  112 + en_to_zh_model: deepl
  113 + default_translation_model: deepl
  114 + # Translation quality matters more when the source language is not in index_languages, so it is configured separately
  115 + zh_to_en_model__source_not_in_index: deepl
  116 + en_to_zh_model__source_not_in_index: deepl
  117 + default_translation_model__source_not_in_index: deepl
156 118
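157 - # Query parsing stage: translation and query embedding run concurrently and share one wait budget (milliseconds).
158 - # Detected language already in the tenant's index_languages: shorter budget; not in the index languages: longer budget (translation matters more for recall).
159 - translation_embedding_wait_budget_ms_source_in_index: 300 # 80
160 - translation_embedding_wait_budget_ms_source_not_in_index: 400 # 200 119 + # Query parsing stage: translation and query embedding run concurrently and share one wait budget (milliseconds)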
  120 + translation_embedding_wait_budget_ms_source_in_index: 300
  121 + translation_embedding_wait_budget_ms_source_not_in_index: 400
161 style_intent: 122 style_intent:
162 enabled: true 123 enabled: true
163 selected_sku_boost: 1.2 124 selected_sku_boost: 1.2
@@ -184,11 +145,8 @@ query_config: @@ -184,11 +145,8 @@ query_config:
184 product_title_exclusion: 145 product_title_exclusion:
185 enabled: true 146 enabled: true
186 dictionary_path: config/dictionaries/product_title_exclusion.tsv 147 dictionary_path: config/dictionaries/product_title_exclusion.tsv
187 -  
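188 - # Dynamic multilingual search field configuration
189 - # multilingual_fields are expanded into the form title.{lang}/brief.{lang}/...;
190 - # shared_fields are fields without a language suffix.
191 search_fields: 148 search_fields:
  149 + # Configured uniformly by field base name; at query time .{lang} is appended dynamically for the actual search language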
192 multilingual_fields: 150 multilingual_fields:
193 - title 151 - title
194 - keywords 152 - keywords
@@ -205,13 +163,14 @@ query_config: @@ -205,13 +163,14 @@ query_config:
205 # - description 163 # - description
206 # - vendor 164 # - vendor
207 # shared_fields: fields without a language suffix; examples: tags, option1_values, option2_values, option3_values 165 # shared_fields: fields without a language suffix; examples: tags, option1_values, option2_values, option3_values
  166 +
208 shared_fields: null 167 shared_fields: null
209 core_multilingual_fields: 168 core_multilingual_fields:
210 - title 169 - title
211 - qanchors 170 - qanchors
212 - category_name_text 171 - category_name_text
213 172
214 - # Unified text recall strategy (main query + translated query) 173 + # Text recall (main query + translated query)
215 text_query_strategy: 174 text_query_strategy:
216 base_minimum_should_match: 60% 175 base_minimum_should_match: 60%
217 translation_minimum_should_match: 60% 176 translation_minimum_should_match: 60%
@@ -226,14 +185,10 @@ query_config: @@ -226,14 +185,10 @@ query_config:
226 title: 5.0 185 title: 5.0
227 qanchors: 4.0 186 qanchors: 4.0
228 phrase_match_boost: 3.0 187 phrase_match_boost: 3.0
229 -  
230 - # Embedding field names
231 text_embedding_field: title_embedding 188 text_embedding_field: title_embedding
232 image_embedding_field: image_embedding.vector 189 image_embedding_field: image_embedding.vector
233 190
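234 - # Returned field configuration (_source includes)
235 - # null returns all fields, [] returns no fields, and a list returns only the listed fields
236 - # The fields below are consistent with api/result_formatter.py (SpuResult population) and search/searcher.py (SKU sorting / main image replacement) 191 + # null returns all fields, [] returns no fields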
237 source_fields: 192 source_fields:
238 - spu_id 193 - spu_id
239 - handle 194 - handle
@@ -255,6 +210,7 @@ query_config: @@ -255,6 +210,7 @@ query_config:
255 # - enriched_tags 210 # - enriched_tags
256 # - enriched_attributes 211 # - enriched_attributes
257 # - # enriched_taxonomy_attributes.value 212 # - # enriched_taxonomy_attributes.value
  213 +
258 - min_price 214 - min_price
259 - compare_at_price 215 - compare_at_price
260 - image_url 216 - image_url
@@ -274,22 +230,17 @@ query_config: @@ -274,22 +230,17 @@ query_config:
274 # KNN: separate boost and recall (k / num_candidates) for text vectors and multimodal (image) vectors 230 # KNN: separate boost and recall (k / num_candidates) for text vectors and multimodal (image) vectors
275 knn_text_boost: 4 231 knn_text_boost: 4
276 knn_image_boost: 4 232 knn_image_boost: 4
277 -  
278 - # knn_text_num_candidates = k * 3.4  
279 knn_text_k: 160 233 knn_text_k: 160
280 - knn_text_num_candidates: 560 234 + knn_text_num_candidates: 560 # k * 3.4
281 knn_text_k_long: 400 235 knn_text_k_long: 400
282 knn_text_num_candidates_long: 1200 236 knn_text_num_candidates_long: 1200
283 knn_image_k: 400 237 knn_image_k: 400
284 knn_image_num_candidates: 1200 238 knn_image_num_candidates: 1200
285 239
286 -# Function Score configuration (ES-level scoring rules)
287 function_score: 240 function_score:
288 score_mode: sum 241 score_mode: sum
289 boost_mode: multiply 242 boost_mode: multiply
290 functions: [] 243 functions: []
291 -  
292 -# Coarse ranking configuration (fuses only ES text/vector signals; no model is called)
293 coarse_rank: 244 coarse_rank:
294 enabled: true 245 enabled: true
295 input_window: 480 246 input_window: 480
@@ -305,24 +256,20 @@ coarse_rank: @@ -305,24 +256,20 @@ coarse_rank:
305 knn_text_weight: 1.0 256 knn_text_weight: 1.0
306 knn_image_weight: 2.0 257 knn_image_weight: 2.0
307 knn_tie_breaker: 0.3 258 knn_tie_breaker: 0.3
308 - knn_bias: 0.6  
309 - knn_exponent: 0.4  
310 -  
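311 -# Fine ranking configuration (lightweight reranker)
312 -# When enabled=false the fine stage is still entered, but results pass through in order without calling the fine model service 259 + knn_bias: 0.0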
  260 + knn_exponent: 5.6
  261 + knn_text_exponent: 0.0
  262 + knn_image_exponent: 0.0
313 fine_rank: 263 fine_rank:
314 - enabled: false 264 + enabled: false # when false, results pass through in their existing order
315 input_window: 160 265 input_window: 160
316 output_window: 80 266 output_window: 80
317 timeout_sec: 10.0 267 timeout_sec: 10.0
318 rerank_query_template: '{query}' 268 rerank_query_template: '{query}'
319 rerank_doc_template: '{title}' 269 rerank_doc_template: '{title}'
320 service_profile: fine 270 service_profile: fine
321 -  
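322 -# Rerank configuration (provider/URL are in services.rerank)
323 -# When enabled=false the rerank stage is still entered, but results pass through in order without calling the final rerank service
324 rerank: 271 rerank:
325 - enabled: true 272 + enabled: false # when false, results pass through in their existing order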
326 rerank_window: 160 273 rerank_window: 160
327 exact_knn_rescore_enabled: true 274 exact_knn_rescore_enabled: true
328 exact_knn_rescore_window: 160 275 exact_knn_rescore_window: 160
@@ -332,7 +279,6 @@ rerank: @@ -332,7 +279,6 @@ rerank:
332 rerank_query_template: '{query}' 279 rerank_query_template: '{query}'
333 rerank_doc_template: '{title}' 280 rerank_doc_template: '{title}'
334 service_profile: default 281 service_profile: default
335 -  
336 # Multiplicative fusion: fused = Π (max(score,0) + bias) ** exponent (es / rerank / fine / text / knn) 282 # Multiplicative fusion: fused = Π (max(score,0) + bias) ** exponent (es / rerank / fine / text / knn)
337 # where knn_score first goes through one dis_max layer: 283 # where knn_score first goes through one dis_max layer:
338 # max(knn_text_weight * text_knn, knn_image_weight * image_knn) 284 # max(knn_text_weight * text_knn, knn_image_weight * image_knn)
@@ -345,30 +291,28 @@ rerank: @@ -345,30 +291,28 @@ rerank:
345 fine_bias: 0.1 291 fine_bias: 0.1
346 fine_exponent: 1.0 292 fine_exponent: 1.0
347 text_bias: 0.1 293 text_bias: 0.1
348 - text_exponent: 0.25  
349 # weight of base_query_trans_* relative to base_query (see the text dismax fusion in search/rerank_client) 294 # weight of base_query_trans_* relative to base_query (see the text dismax fusion in search/rerank_client)
  295 + text_exponent: 0.25
350 text_translation_weight: 0.8 296 text_translation_weight: 0.8
351 knn_text_weight: 1.0 297 knn_text_weight: 1.0
352 knn_image_weight: 2.0 298 knn_image_weight: 2.0
353 knn_tie_breaker: 0.3 299 knn_tie_breaker: 0.3
354 - knn_bias: 0.6  
355 - knn_exponent: 0.4 300 + knn_bias: 0.0
  301 + knn_exponent: 5.6
356 302
357 -# Extensible service/provider registry (single source of configuration)
358 services: 303 services:
359 translation: 304 translation:
360 service_url: http://127.0.0.1:6006 305 service_url: http://127.0.0.1:6006
361 - # default_model: nllb-200-distilled-600m  
362 default_model: nllb-200-distilled-600m 306 default_model: nllb-200-distilled-600m
363 default_scene: general 307 default_scene: general
364 timeout_sec: 10.0 308 timeout_sec: 10.0
365 cache: 309 cache:
366 ttl_seconds: 62208000 310 ttl_seconds: 62208000
367 sliding_expiration: true 311 sliding_expiration: true
368 - # When false, cache keys are exact-match per request model only (ignores model_quality_tiers for lookups).  
369 - enable_model_quality_tier_cache: true 312 + # When false, cache keys are exact-match per request model only (ignores model_quality_tiers for lookups)
370 # Higher tier = better quality. Multiple models may share one tier (same level). 313 # Higher tier = better quality. Multiple models may share one tier (same level).
371 # A request may reuse Redis keys from models with tier > A or tier == A (not from lower tiers). 314 # A request may reuse Redis keys from models with tier > A or tier == A (not from lower tiers).
  315 + enable_model_quality_tier_cache: true
372 model_quality_tiers: 316 model_quality_tiers:
373 deepl: 30 317 deepl: 30
374 qwen-mt: 30 318 qwen-mt: 30
@@ -462,13 +406,12 @@ services: @@ -462,13 +406,12 @@ services:
462 num_beams: 1 406 num_beams: 1
463 use_cache: true 407 use_cache: true
464 embedding: 408 embedding:
465 - provider: http # http 409 + provider: http
466 providers: 410 providers:
467 http: 411 http:
468 text_base_url: http://127.0.0.1:6005 412 text_base_url: http://127.0.0.1:6005
469 image_base_url: http://127.0.0.1:6008 413 image_base_url: http://127.0.0.1:6008
470 - # In-service text backend (read when the embedding process starts)
471 - backend: tei # tei | local_st 414 + backend: tei
472 backends: 415 backends:
473 tei: 416 tei:
474 base_url: http://127.0.0.1:8080 417 base_url: http://127.0.0.1:8080
@@ -508,8 +451,8 @@ services: @@ -508,8 +451,8 @@ services:
508 request: 451 request:
509 max_docs: 1000 452 max_docs: 1000
510 normalize: true 453 normalize: true
511 - default_instance: default  
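512 # Named instances: the same reranker code reads a different port / backend / runtime directory per instance name. 454 # Named instances: the same reranker code reads a different port / backend / runtime directory per instance name.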
  455 + default_instance: default
513 instances: 456 instances:
514 default: 457 default:
515 host: 0.0.0.0 458 host: 0.0.0.0
@@ -551,6 +494,7 @@ services: @@ -551,6 +494,7 @@ services:
551 enforce_eager: false 494 enforce_eager: false
552 infer_batch_size: 100 495 infer_batch_size: 100
553 sort_by_doc_length: true 496 sort_by_doc_length: true
  497 +
554 # standard=_format_instruction__standard (fixed yes/no system prompt); compact=_format_instruction (instruction used as the system prompt, with Instruct repeated inside the user message) 498 # standard=_format_instruction__standard (fixed yes/no system prompt); compact=_format_instruction (instruction used as the system prompt, with Instruct repeated inside the user message)
555 instruction_format: standard # compact standard 499 instruction_format: standard # compact standard
556 # instruction: "Given a query, score the product for relevance" 500 # instruction: "Given a query, score the product for relevance"
@@ -564,6 +508,7 @@ services: @@ -564,6 +508,7 @@ services:
564 # instruction: "Rank products by query with category & style match prioritized" 508 # instruction: "Rank products by query with category & style match prioritized"
565 # instruction: "Given a fashion shopping query, retrieve relevant products that answer the query" 509 # instruction: "Given a fashion shopping query, retrieve relevant products that answer the query"
566 instruction: rank products by given query 510 instruction: rank products by given query
  511 +
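567 # vLLM LLM.score() (cross-encoder scoring). Dedicated high-performance environment .venv-reranker-score (pinned to vllm 0.18): ./scripts/setup_reranker_venv.sh qwen3_vllm_score 512 # vLLM LLM.score() (cross-encoder scoring). Dedicated high-performance environment .venv-reranker-score (pinned to vllm 0.18): ./scripts/setup_reranker_venv.sh qwen3_vllm_score
568 # Can share the same model_name / HF cache with qwen3_vllm; the venv is separate so vLLM can be upgraded without affecting the generate backend. 513 # Can share the same model_name / HF cache with qwen3_vllm; the venv is separate so vLLM can be upgraded without affecting the generate backend.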
569 qwen3_vllm_score: 514 qwen3_vllm_score:
@@ -591,15 +536,10 @@ services: @@ -591,15 +536,10 @@ services:
591 qwen3_transformers: 536 qwen3_transformers:
592 model_name: Qwen/Qwen3-Reranker-0.6B 537 model_name: Qwen/Qwen3-Reranker-0.6B
593 instruction: rank products by given query 538 instruction: rank products by given query
594 - # instruction: "Score the product’s relevance to the given query"  
595 max_length: 8192 539 max_length: 8192
596 batch_size: 64 540 batch_size: 64
597 use_fp16: true 541 use_fp16: true
598 - # sdpa: does not need flash-attn by default; if flash_attn is installed, this can be changed to flash_attention_2
599 attn_implementation: sdpa 542 attn_implementation: sdpa
600 - # Packed Transformers backend: shared query prefix + custom position_ids/attention_mask.  
601 - # For 1 query + many short docs (for example 400 product titles), this usually reduces  
602 - # repeated prefix work and padding waste compared with pairwise batching.  
603 qwen3_transformers_packed: 543 qwen3_transformers_packed:
604 model_name: Qwen/Qwen3-Reranker-0.6B 544 model_name: Qwen/Qwen3-Reranker-0.6B
605 instruction: Rank products by query with category & style match prioritized 545 instruction: Rank products by query with category & style match prioritized
@@ -608,8 +548,6 @@ services: @@ -608,8 +548,6 @@ services:
608 max_docs_per_pack: 0 548 max_docs_per_pack: 0
609 use_fp16: true 549 use_fp16: true
610 sort_by_doc_length: true 550 sort_by_doc_length: true
611 - # Packed mode relies on a custom 4D attention mask. "eager" is the safest default.  
612 - # If your torch/transformers stack validates it, you can benchmark "sdpa".  
613 attn_implementation: eager 551 attn_implementation: eager
614 qwen3_gguf: 552 qwen3_gguf:
615 repo_id: DevQuasar/Qwen.Qwen3-Reranker-4B-GGUF 553 repo_id: DevQuasar/Qwen.Qwen3-Reranker-4B-GGUF
@@ -617,7 +555,6 @@ services: @@ -617,7 +555,6 @@ services:
617 cache_dir: ./model_cache 555 cache_dir: ./model_cache
618 local_dir: ./models/reranker/qwen3-reranker-4b-gguf 556 local_dir: ./models/reranker/qwen3-reranker-4b-gguf
619 instruction: Rank products by query with category & style match prioritized 557 instruction: Rank products by query with category & style match prioritized
620 - # T4 16GB / performance-first configuration: offload all layers; measured to be noticeably faster than the conservative configuration
621 n_ctx: 512 558 n_ctx: 512
622 n_batch: 512 559 n_batch: 512
623 n_ubatch: 512 560 n_ubatch: 512
@@ -640,8 +577,6 @@ services: @@ -640,8 +577,6 @@ services:
640 cache_dir: ./model_cache 577 cache_dir: ./model_cache
641 local_dir: ./models/reranker/qwen3-reranker-0.6b-q8_0-gguf 578 local_dir: ./models/reranker/qwen3-reranker-0.6b-q8_0-gguf
642 instruction: Rank products by query with category & style match prioritized 579 instruction: Rank products by query with category & style match prioritized
643 - # 0.6B GGUF / online rerank baseline:  
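644 - # Measured at about 265s for a single request with 400 titles, so it is better suited as a low-VRAM functional fallback than as the main low-latency online route.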
645 n_ctx: 256 580 n_ctx: 256
646 n_batch: 256 581 n_batch: 256
647 n_ubatch: 256 582 n_ubatch: 256
@@ -661,20 +596,15 @@ services: @@ -661,20 +596,15 @@ services:
661 verbose: false 596 verbose: false
662 dashscope_rerank: 597 dashscope_rerank:
663 model_name: qwen3-rerank 598 model_name: qwen3-rerank
664 - # Choose the endpoint by region:
665 - # China: https://dashscope.aliyuncs.com/compatible-api/v1/reranks
666 - # Singapore: https://dashscope-intl.aliyuncs.com/compatible-api/v1/reranks
667 - # United States: https://dashscope-us.aliyuncs.com/compatible-api/v1/reranks
668 endpoint: https://dashscope.aliyuncs.com/compatible-api/v1/reranks 599 endpoint: https://dashscope.aliyuncs.com/compatible-api/v1/reranks
669 api_key_env: RERANK_DASHSCOPE_API_KEY_CN 600 api_key_env: RERANK_DASHSCOPE_API_KEY_CN
670 timeout_sec: 10.0 601 timeout_sec: 10.0
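671 - top_n_cap: 0 # 0 means top_n = number of documents in the current request; >0 caps top_n
672 - batchsize: 64 # 0 disables; >0 enables concurrent small-batch scheduling (top_n/top_n_cap still apply, with global truncation after batching) 602 + top_n_cap: 0 # 0 means top_n = number of documents in the current request
  603 + batchsize: 64 # 0 disables; >0 enables concurrent small-batch scheduling (top_n/top_n_cap still apply, with global truncation after batching)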
673 instruct: Given a shopping query, rank product titles by relevance 604 instruct: Given a shopping query, rank product titles by relevance
674 max_retries: 2 605 max_retries: 2
675 retry_backoff_sec: 0.2 606 retry_backoff_sec: 0.2
676 607
677 -# SPU configuration (enabled, uses nested skus)
678 spu_config: 608 spu_config:
679 enabled: true 609 enabled: true
680 spu_field: spu_id 610 spu_field: spu_id
@@ -686,7 +616,6 @@ spu_config: @@ -686,7 +616,6 @@ spu_config:
686 - option2 616 - option2
687 - option3 617 - option3
688 618
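689 -# Tenant Configuration
690 # Each tenant can configure a primary language (primary_language) and index languages (index_languages: main market languages, selectable by the merchant) 619 # Each tenant can configure a primary language (primary_language) and index languages (index_languages: main market languages, selectable by the merchant)
691 # Default index_languages: [en, zh]; can be set to any subset of SOURCE_LANG_CODE_MAP.keys() 620 # Default index_languages: [en, zh]; can be set to any subset of SOURCE_LANG_CODE_MAP.keys()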
692 tenant_config: 621 tenant_config:
@@ -587,6 +587,14 @@ class AppConfigLoader: @@ -587,6 +587,14 @@ class AppConfigLoader:
587 knn_tie_breaker=float(coarse_fusion_raw.get("knn_tie_breaker", 0.0)), 587 knn_tie_breaker=float(coarse_fusion_raw.get("knn_tie_breaker", 0.0)),
588 knn_bias=float(coarse_fusion_raw.get("knn_bias", 0.6)), 588 knn_bias=float(coarse_fusion_raw.get("knn_bias", 0.6)),
589 knn_exponent=float(coarse_fusion_raw.get("knn_exponent", 0.2)), 589 knn_exponent=float(coarse_fusion_raw.get("knn_exponent", 0.2)),
  590 + knn_text_bias=float(
  591 + coarse_fusion_raw.get("knn_text_bias", coarse_fusion_raw.get("knn_bias", 0.6))
  592 + ),
  593 + knn_text_exponent=float(coarse_fusion_raw.get("knn_text_exponent", 0.0)),
  594 + knn_image_bias=float(
  595 + coarse_fusion_raw.get("knn_image_bias", coarse_fusion_raw.get("knn_bias", 0.6))
  596 + ),
  597 + knn_image_exponent=float(coarse_fusion_raw.get("knn_image_exponent", 0.0)),
590 text_translation_weight=float( 598 text_translation_weight=float(
591 coarse_fusion_raw.get("text_translation_weight", 0.8) 599 coarse_fusion_raw.get("text_translation_weight", 0.8)
592 ), 600 ),
@@ -636,6 +644,14 @@ class AppConfigLoader: @@ -636,6 +644,14 @@ class AppConfigLoader:
636 knn_tie_breaker=float(fusion_raw.get("knn_tie_breaker", 0.0)), 644 knn_tie_breaker=float(fusion_raw.get("knn_tie_breaker", 0.0)),
637 knn_bias=float(fusion_raw.get("knn_bias", 0.6)), 645 knn_bias=float(fusion_raw.get("knn_bias", 0.6)),
638 knn_exponent=float(fusion_raw.get("knn_exponent", 0.2)), 646 knn_exponent=float(fusion_raw.get("knn_exponent", 0.2)),
  647 + knn_text_bias=float(
  648 + fusion_raw.get("knn_text_bias", fusion_raw.get("knn_bias", 0.6))
  649 + ),
  650 + knn_text_exponent=float(fusion_raw.get("knn_text_exponent", 0.0)),
  651 + knn_image_bias=float(
  652 + fusion_raw.get("knn_image_bias", fusion_raw.get("knn_bias", 0.6))
  653 + ),
  654 + knn_image_exponent=float(fusion_raw.get("knn_image_exponent", 0.0)),
639 fine_bias=float(fusion_raw.get("fine_bias", 0.00001)), 655 fine_bias=float(fusion_raw.get("fine_bias", 0.00001)),
640 fine_exponent=float(fusion_raw.get("fine_exponent", 1.0)), 656 fine_exponent=float(fusion_raw.get("fine_exponent", 1.0)),
641 text_translation_weight=float( 657 text_translation_weight=float(
@@ -119,6 +119,18 @@ class RerankFusionConfig: @@ -119,6 +119,18 @@ class RerankFusionConfig:
119 knn_tie_breaker: float = 0.0 119 knn_tie_breaker: float = 0.0
120 knn_bias: float = 0.6 120 knn_bias: float = 0.6
121 knn_exponent: float = 0.2 121 knn_exponent: float = 0.2
  122 + #: Optional additive floor for the weighted text KNN term.
  123 + #: Falls back to knn_bias when omitted in config loading.
  124 + knn_text_bias: float = 0.6
  125 + #: Optional extra multiplicative term on weighted text KNN.
  126 + #: Uses knn_text_bias as the additive floor.
  127 + knn_text_exponent: float = 0.0
  128 + #: Optional additive floor for the weighted image KNN term.
  129 + #: Falls back to knn_bias when omitted in config loading.
  130 + knn_image_bias: float = 0.6
  131 + #: Optional extra multiplicative term on weighted image KNN.
  132 + #: Uses knn_image_bias as the additive floor.
  133 + knn_image_exponent: float = 0.0
122 fine_bias: float = 0.00001 134 fine_bias: float = 0.00001
123 fine_exponent: float = 1.0 135 fine_exponent: float = 1.0
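124 #: Weight of the translation clause's named query score relative to the original-text base_query (weighted, then fused with the original text via dismax) 136 #: Weight of the translation clause's named query score relative to the original-text base_query (weighted, then fused with the original text via dismax)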
@@ -143,6 +155,18 @@ class CoarseRankFusionConfig: @@ -143,6 +155,18 @@ class CoarseRankFusionConfig:
143 knn_tie_breaker: float = 0.0 155 knn_tie_breaker: float = 0.0
144 knn_bias: float = 0.6 156 knn_bias: float = 0.6
145 knn_exponent: float = 0.2 157 knn_exponent: float = 0.2
  158 + #: Optional additive floor for the weighted text KNN term.
  159 + #: Falls back to knn_bias when omitted in config loading.
  160 + knn_text_bias: float = 0.6
  161 + #: Optional extra multiplicative term on weighted text KNN.
  162 + #: Uses knn_text_bias as the additive floor.
  163 + knn_text_exponent: float = 0.0
  164 + #: Optional additive floor for the weighted image KNN term.
  165 + #: Falls back to knn_bias when omitted in config loading.
  166 + knn_image_bias: float = 0.6
  167 + #: Optional extra multiplicative term on weighted image KNN.
  168 + #: Uses knn_image_bias as the additive floor.
  169 + knn_image_exponent: float = 0.0
146 #: Weight of the translation clause's named query score relative to the original-text base_query (weighted, then fused with the original text via dismax)
147 text_translation_weight: float = 0.8 171 text_translation_weight: float = 0.8
148 172
docs/caches-inventory.md 0 → 100644
@@ -0,0 +1,133 @@ @@ -0,0 +1,133 @@
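  1 +# Cache inventory for this project
  2 +
  3 +This document catalogs the **business-related caches** in the repository: it describes their purpose, keys, and expiration policies, and summarizes the operations scripts. It is organized as "distributed (Redis) → in-process → disk/model → third-party".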
  4 +
  5 +---
  6 +
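  7 +## I. Centralized Redis caches (main production path)
  8 +
  9 +All caches below connect to **`infrastructure.redis`** by default (`config/config.yaml` and the `REDIS_*` environment variables), and **the database number is usually `db=0`** (scripts can override it via arguments). `snapshot_db` exists only in the configuration, for optional use in snapshot/operations scenarios; application code does not switch the business cache DB based on this field.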
  10 +
  11 +### 1. Text / image vector cache (Embedding)
  12 +
  13 +- **Purpose**: caches BGE/TEI text vectors, CN-CLIP image vectors, and CLIP text-tower vectors to avoid repeated inference.
  14 +- **Implementation**: `RedisEmbeddingCache` in `embeddings/redis_embedding_cache.py`; key construction is in `embeddings/cache_keys.py`.
  15 +- **Key form** (final Redis key = `prefix` + `optional namespace` + `logical key`):
  16 + - **Prefix**: `infrastructure.redis.embedding_cache_prefix` (default `embedding`; can be overridden with `REDIS_EMBEDDING_CACHE_PREFIX`).
  17 + - **Namespace**: in `embeddings/server.py` and the client, split into:
  18 + - Text: `namespace=""` → `{prefix}:{embed:norm0|1:...}`
  19 + - Image: `namespace="image"` → `{prefix}:image:{embed:model_name:txt:norm0|1:...}`
  20 + - CLIP text: `namespace="clip_text"` → `{prefix}:clip_text:{embed:model_name:img:norm0|1:...}`
  21 + - The logical key segment contains `embed:`, `norm0/1`, the model name (multimodal), and, for overly long text/URLs, an `h:sha256:...` digest (see the comments in `cache_keys.py`).
  22 +- **Value format**: BF16-compressed bytes (`embeddings/bf16.py`), not JSON.
  23 +- **TTL**: `infrastructure.redis.cache_expire_days` (default **720 days**, `REDIS_CACHE_EXPIRE_DAYS`). Writes use `SETEX`; **the TTL slides on hit** (`EXPIRE` refreshes it to the same duration).
  24 +- **Redis client**: `decode_responses=False` (binary).
  25 +
  26 +**Main code**: `embeddings/server.py`, `embeddings/text_encoder.py`, `embeddings/image_encoder.py`.
  27 +
  28 +---
  29 +
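  30 +### 2. Translation result cache (Translation)
  31 +
  32 +- **Purpose**: caches translations by "translation model + target language + source text"; supports **model quality tier probing** (a request can reuse cache entries written by models of the same or a higher tier, never a lower one; see `translation_cache_probe_models` in `translation/settings.py`).
  33 +- **Key form**: `trans:{model}:{target_lang}:{first 4 characters of text}{sha256 of full text}` (`build_key` in `translation/cache.py`).
  34 +- **Value format**: UTF-8 translated string.
  35 +- **TTL**: `services.translation.cache.ttl_seconds` (default **62208000 seconds = 720 days**). If `sliding_expiration: true`, the TTL is refreshed on hit.
  36 +- **Capability-level switch**: when a `capabilities.*.use_cache` is `false`, that backend does not write to Redis.
  37 +- **Redis client**: `decode_responses=True`.
  38 +
  39 +**Main code**: `translation/cache.py`, `translation/service.py`; translation HTTP service: `api/translator_app.py` (`get_translation_service()` is an `lru_cache` singleton; see the in-process caches below).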
  40 +
  41 +---
  42 +
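  43 +### 3. Product content understanding / anchors and semantic analysis cache (Indexer)
  44 +
  45 +- **Purpose**: caches the LLM analysis results (anchors, semantic attributes, etc.) for the **prompt input** assembled from product titles and similar fields, to avoid repeated LLM calls. The key depends on `analysis_kind`, the `prompt` contract version, `target_lang`, and a digest of the input.
  46 +- **Key form**: `{anchor_cache_prefix}:{analysis_kind}:{prompt_contract_hash[:12]}:{target_lang}:{prompt_input[:4]}{md5}` (`_make_analysis_cache_key` in `indexer/product_enrich.py`).
  47 +- **Prefix**: `infrastructure.redis.anchor_cache_prefix` (default `product_anchors`, `REDIS_ANCHOR_CACHE_PREFIX`).
  48 +- **Value format**: JSON string (the normalized analysis result).
  49 +- **TTL**: `anchor_cache_expire_days` (default **30 days**), written in seconds via `SETEX` (**not sliding**, unlike the vector/translation caches).
  50 +- **Read logic**: no TTL refresh; the content is only checked for being "meaningful" before it is returned.
  51 +
  52 +**Main code**: `indexer/product_enrich.py`; for notes on alignment with the HTTP side, see the comments in `api/routes/indexer.py`.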
  53 +
  54 +---
  55 +
  56 +## II. In-process caches (not shared; lost on process restart)
  57 +
  58 +| Name | Purpose | Scope / lifetime |
  59 +|------|------|----------------|
  60 +| **`get_app_config()`** | Parses and caches the global `AppConfig` | `config/loader.py`: `@lru_cache(maxsize=1)`; `reload_app_config()` can call `cache_clear()` |
  61 +| **`TranslationService` singleton** | Reuses backends and the Redis client within the translation service process | `api/translator_app.py`: `get_translation_service()` |
  62 +| **`_nllb_tokenizer_code_by_normalized_key`** | NLLB tokenizer language code mapping | `translation/languages.py`: `@lru_cache(maxsize=1)` |
  63 +| **`QueryTextAnalysisCache`** | Reuses tokenization and tokenizer results within a single query parse | `query/tokenization.py`, lives for one `QueryParser` parse |
  64 +| **`_SelectionContext` (SKU intent)** | Small dictionaries for normalized text, tokens, match booleans, etc. | `search/sku_intent_selector.py`, single selection flow |
  65 +| **`incremental_service` transformer cache** | Caches document transformers by `tenant_id` | `indexer/incremental_service.py`, **unbounded**; watch memory in long-lived multi-tenant processes |
  66 +| **`token_count_cache` within an NLLB batch** | Avoids recounting tokens within the same batch | `translation/backends/local_ctranslate2.py` |
  67 +| **CLIP tokenizer `@lru_cache`** (third-party) | Simple tokenizer cache | `third-party/clip-as-service/.../simple_tokenizer.py` |
  68 +
  69 +**Note**: **`DictCache`** in `utils/cache.py` (file-based JSON, default `.cache/dict_cache.json`) is exported, but the repository contains **no direct `DictCache(` calls**; treat it as a reusable/reserved utility, not part of the current main path.
  70 +
  71 +---
  72 +
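  73 +## III. Disk and model-related "caches" (non-Redis)
  74 +
  75 +| Name | Purpose | Config / location |
  76 +|------|------|-----------|
  77 +| **Hugging Face / local model directory** | Download and cache of weights for the reranker, local translation models, etc. | `services.rerank.backends.*.cache_dir` and similar; the common default is **`./model_cache`** (`config/config.yaml`) |
  78 +| **vLLM `enable_prefix_caching`** | **Prefix KV cache** inside the rerank service (speeds up batched inference over shared prefixes) | `services.rerank.backends.qwen3_vllm*`, `reranker/backends/qwen3_vllm*.py` |
  79 +| **Runtime directory** | Rerank service state / engine files | `services.rerank.instances.*.runtime_dir` (e.g. `./.runtime/reranker/...`) |
  80 +
  81 +In translation capabilities, **`use_cache: true`** (e.g. NLLB, Marian) refers, in most backends, to the **inference-time KV cache (Transformer)**, which is a different layer from the Redis translation cache; the Redis translation cache is still controlled by `TranslationCache`.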
  82 +
  83 +---
  84 +
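  85 +## IV. Elasticsearch internal caches
  86 +
  87 +Index settings such as `refresh_interval` affect near-real-time visibility but are **not an application-level key-value cache**. For tuning the ES query cache, node heap, and so on, see the operations docs and cluster configuration; they are not covered here.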
  88 +
  89 +---
  90 +
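  91 +## V. Operations and inspection scripts (Redis)
  92 +
  93 +| Script | Purpose |
  94 +|------|------|
  95 +| `scripts/redis/redis_cache_health_check.py` | Inspects the three prefix classes **embedding / translation / anchors**: estimated key counts, TTL sampling, `IDLETIME`, etc. |
  96 +| `scripts/redis/redis_cache_prefix_stats.py` | Counts keys and reports **MEMORY USAGE** per prefix (supports multiple DBs) |
  97 +| `scripts/redis/redis_memory_heavy_keys.py` | Scans for the keys using the most memory, to help investigate "stats do not match total memory" |
  98 +| `scripts/redis/monitor_eviction.py` | Monitors **eviction**-related events in real time, for capacity and eviction-policy troubleshooting |
  99 +
  100 +Load the project configuration first (e.g. `source activate.sh`) so that `REDIS_CONFIG` matches production. The script comments include **manual `redis-cli` statistics** examples (per-prefix `wc -l`, `MEMORY STATS`, etc.).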
  101 +
  102 +---
  103 +
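  104 +## VI. Summary table (Redis and the other cache layers)
  105 +
  106 +| Cache | Module | Storage | Key prefix / naming pattern | Expiry | Expiry policy | Value summary | Config key / environment variable |
  107 +|----------|----------|------|---------------------|----------|----------|--------|-------------------|
  108 +| Text embeddings | Search / indexing / Embedding service | Redis db≈0 | `{embedding_cache_prefix}:*` (logical keys start with `embed:norm…`) | `cache_expire_days` (default 720 days) | TTL on write + sliding renewal on hit | BF16 byte vector | `infrastructure.redis.*`; `REDIS_EMBEDDING_CACHE_PREFIX`, `REDIS_CACHE_EXPIRE_DAYS` |
  109 +| Image embeddings (CLIP image) | Image search / multimodal | Same as above | `{prefix}:image:*` | Same as above | Same as above | BF16 bytes | Same as above |
  110 +| CLIP text-tower embeddings | Text side of image search | Same as above | `{prefix}:clip_text:*` | Same as above | Same as above | BF16 bytes | Same as above |
  111 +| Translations | Query translation, translation service | Same as above | `trans:{model}:{lang}:*` | `services.translation.cache.ttl_seconds` (default 720 days) | Configurable sliding (`sliding_expiration`) | UTF-8 string | `services.translation.cache.*`; per-capability `use_cache` |
  112 +| Product analysis / Anchors | Index enrichment, LLM content understanding | Same as above | `{anchor_cache_prefix}:{kind}:{hash}:{lang}:*` | `anchor_cache_expire_days` (default 30 days) | Fixed TTL, non-sliding | JSON string | `anchor_cache_prefix`, `anchor_cache_expire_days`; `REDIS_ANCHOR_*` |
  113 +| Application config | Full stack | Process memory | N/A (singleton) | Process lifetime | Cleared by `reload_app_config` | `AppConfig` object | `config/loader.py` |
  114 +| Translation service instance | Translation API | Process memory | N/A | Process lifetime | Singleton | `TranslationService` | `api/translator_app.py` |
  115 +| Query tokenization cache | Query parsing | Within a single request | N/A | Single parse | - | Tokens and intermediate results | `query/tokenization.py` |
  116 +| SKU intent helper dicts | Search ranking helper | Within a single request | N/A | Single selection | - | Small dict | `search/sku_intent_selector.py` |
  117 +| Incremental-index transformer | Indexing pipeline | Process memory | `tenant_id` string key | Long-lived (unbounded) | No automatic eviction | Transformer tuple | `indexer/incremental_service.py` |
  118 +| Rerank / translation model weights | Inference services | Local disk | Directory path | No automatic deletion (manual cleanup) | - | Model files | `cache_dir: ./model_cache`, etc. |
  119 +| vLLM prefix cache | Rerank (Qwen3, etc.) | GPU / in-engine | Engine-internal | Engine-managed | - | KV cache | `enable_prefix_caching` |
  120 +| File dict cache (optional) | General | `.cache/dict_cache.json` | Category + custom key | Persists until deleted | - | JSON-serializable values | `utils/cache.py` (currently no callers) |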
  121 +
  122 +---
  123 +
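  124 +## VII. Maintenance recommendations (brief)
  125 +
  126 +1. **Capacity**: the three Redis cache classes (embedding / trans / anchors) can share one instance; with large tenants or heavy image search, **embedding** and **trans** usually account for most of the memory, which `redis_cache_prefix_stats.py` can break down by prefix.
  127 +2. **Key migration**: changing `embedding_cache_prefix`, the CLIP `model_name`, or the prompt contract naturally **isolates a new key space**; old keys go away via TTL or manual bulk deletion.
  128 +3. **Consistency**: the embedding cache **deletes the key** when it reads an abnormal vector (`RedisEmbeddingCache.get`); anchors rely on `cache_version` and the contract hash to prevent incorrect reuse.
  129 +4. **Monitoring**: besides the scripts, the Embedding HTTP service health check reports **`cache_enabled`** for each lane (`embeddings/server.py`).
  130 +
  131 +---
  132 +
  133 +*This document was generated from a code scan; if a new Redis use is added, update this file and `_load_known_cache_types()` in `scripts/redis/redis_cache_health_check.py` accordingly.*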
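The anchors cache key shape and the day-based TTLs documented above can be illustrated with a minimal sketch. The function names here are hypothetical, and the assumption that the trailing `{md5}` is the hex MD5 of the UTF-8 prompt input is a guess from the key template, not a copy of `_make_analysis_cache_key`.

```python
import hashlib


def make_analysis_cache_key(
    prefix: str,
    analysis_kind: str,
    prompt_contract_hash: str,
    target_lang: str,
    prompt_input: str,
) -> str:
    """Hypothetical re-creation of the documented anchors key template."""
    # Assumption: the digest covers the full prompt input, encoded as UTF-8.
    digest = hashlib.md5(prompt_input.encode("utf-8")).hexdigest()
    return (
        f"{prefix}:{analysis_kind}:{prompt_contract_hash[:12]}:"
        f"{target_lang}:{prompt_input[:4]}{digest}"
    )


def ttl_seconds(days: int) -> int:
    """Convert a day-based TTL setting into the seconds value passed to SETEX."""
    return int(days) * 24 * 60 * 60


key = make_analysis_cache_key(
    "product_anchors", "anchors", "0123456789abcdef", "en", "Red floral dress"
)
```

Because the contract hash and language are part of the key, changing either one yields a fresh key space, which is the isolation behaviour the maintenance notes rely on.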
search/es_query_builder.py
@@ -8,6 +8,7 @@ Simplified architecture: @@ -8,6 +8,7 @@ Simplified architecture:
8 - function_score wrapper for boosting fields 8 - function_score wrapper for boosting fields
9 """ 9 """
10 10
  11 +from dataclasses import dataclass
11 from typing import Dict, Any, List, Optional, Tuple 12 from typing import Dict, Any, List, Optional, Tuple
12 13
13 import numpy as np 14 import numpy as np
@@ -114,6 +115,171 @@ class ESQueryBuilder: @@ -114,6 +115,171 @@ class ESQueryBuilder:
114 self.phrase_match_tie_breaker = float(phrase_match_tie_breaker) 115 self.phrase_match_tie_breaker = float(phrase_match_tie_breaker)
115 self.phrase_match_boost = float(phrase_match_boost) 116 self.phrase_match_boost = float(phrase_match_boost)
116 117
  118 + @dataclass(frozen=True)
  119 + class KNNClausePlan:
  120 + field: str
  121 + boost: float
  122 + k: Optional[int] = None
  123 + num_candidates: Optional[int] = None
  124 + nested_path: Optional[str] = None
  125 +
  126 + @staticmethod
  127 + def _vector_to_list(vector: Any) -> List[float]:
  128 + if vector is None:
  129 + return []
  130 + if hasattr(vector, "tolist"):
  131 + values = vector.tolist()
  132 + else:
  133 + values = list(vector)
  134 + return [float(v) for v in values]
  135 +
  136 + @staticmethod
  137 + def _query_token_count(parsed_query: Optional[Any]) -> int:
  138 + if parsed_query is None:
  139 + return 0
  140 + query_tokens = getattr(parsed_query, "query_tokens", None) or []
  141 + return len(query_tokens)
  142 +
  143 + def get_text_knn_plan(self, parsed_query: Optional[Any] = None) -> Optional[KNNClausePlan]:
  144 + if not self.text_embedding_field:
  145 + return None
  146 + boost = self.knn_text_boost
  147 + final_knn_k = self.knn_text_k
  148 + final_knn_num_candidates = self.knn_text_num_candidates
  149 + if self._query_token_count(parsed_query) >= 5:
  150 + final_knn_k = self.knn_text_k_long
  151 + final_knn_num_candidates = self.knn_text_num_candidates_long
  152 + boost = self.knn_text_boost * 1.4
  153 + return self.KNNClausePlan(
  154 + field=str(self.text_embedding_field),
  155 + boost=float(boost),
  156 + k=int(final_knn_k),
  157 + num_candidates=int(final_knn_num_candidates),
  158 + )
  159 +
  160 + def get_image_knn_plan(self) -> Optional[KNNClausePlan]:
  161 + if not self.image_embedding_field:
  162 + return None
  163 + nested_path, _, _ = str(self.image_embedding_field).rpartition(".")
  164 + return self.KNNClausePlan(
  165 + field=str(self.image_embedding_field),
  166 + boost=float(self.knn_image_boost),
  167 + k=int(self.knn_image_k),
  168 + num_candidates=int(self.knn_image_num_candidates),
  169 + nested_path=nested_path or None,
  170 + )
  171 +
  172 + def build_text_knn_clause(
  173 + self,
  174 + query_vector: Any,
  175 + *,
  176 + parsed_query: Optional[Any] = None,
  177 + query_name: str = "knn_query",
  178 + ) -> Optional[Dict[str, Any]]:
  179 + plan = self.get_text_knn_plan(parsed_query)
  180 + if plan is None or query_vector is None:
  181 + return None
  182 + return {
  183 + "knn": {
  184 + "field": plan.field,
  185 + "query_vector": self._vector_to_list(query_vector),
  186 + "k": plan.k,
  187 + "num_candidates": plan.num_candidates,
  188 + "boost": plan.boost,
  189 + "_name": query_name,
  190 + }
  191 + }
  192 +
  193 + def build_image_knn_clause(
  194 + self,
  195 + image_query_vector: Any,
  196 + *,
  197 + query_name: str = "image_knn_query",
  198 + ) -> Optional[Dict[str, Any]]:
  199 + plan = self.get_image_knn_plan()
  200 + if plan is None or image_query_vector is None:
  201 + return None
  202 + image_knn_query = {
  203 + "field": plan.field,
  204 + "query_vector": self._vector_to_list(image_query_vector),
  205 + "k": plan.k,
  206 + "num_candidates": plan.num_candidates,
  207 + "boost": plan.boost,
  208 + }
  209 + if plan.nested_path:
  210 + return {
  211 + "nested": {
  212 + "path": plan.nested_path,
  213 + "_name": query_name,
  214 + "query": {"knn": image_knn_query},
  215 + "score_mode": "max",
  216 + }
  217 + }
  218 + return {
  219 + "knn": {
  220 + **image_knn_query,
  221 + "_name": query_name,
  222 + }
  223 + }
  224 +
  225 + def build_exact_text_knn_rescore_clause(
  226 + self,
  227 + query_vector: Any,
  228 + *,
  229 + parsed_query: Optional[Any] = None,
  230 + query_name: str = "exact_text_knn_query",
  231 + ) -> Optional[Dict[str, Any]]:
  232 + plan = self.get_text_knn_plan(parsed_query)
  233 + if plan is None or query_vector is None:
  234 + return None
  235 + return {
  236 + "script_score": {
  237 + "_name": query_name,
  238 + "query": {"exists": {"field": plan.field}},
  239 + "script": {
  240 + "source": (
  241 + f"((dotProduct(params.query_vector, '{plan.field}') + 1.0) / 2.0) * params.boost"
  242 + ),
  243 + "params": {
  244 + "query_vector": self._vector_to_list(query_vector),
  245 + "boost": float(plan.boost),
  246 + },
  247 + },
  248 + }
  249 + }
  250 +
  251 + def build_exact_image_knn_rescore_clause(
  252 + self,
  253 + image_query_vector: Any,
  254 + *,
  255 + query_name: str = "exact_image_knn_query",
  256 + ) -> Optional[Dict[str, Any]]:
  257 + plan = self.get_image_knn_plan()
  258 + if plan is None or image_query_vector is None:
  259 + return None
  260 + script_score_query = {
  261 + "query": {"exists": {"field": plan.field}},
  262 + "script": {
  263 + "source": (
  264 + f"((dotProduct(params.query_vector, '{plan.field}') + 1.0) / 2.0) * params.boost"
  265 + ),
  266 + "params": {
  267 + "query_vector": self._vector_to_list(image_query_vector),
  268 + "boost": float(plan.boost),
  269 + },
  270 + },
  271 + }
  272 + if plan.nested_path:
  273 + return {
  274 + "nested": {
  275 + "path": plan.nested_path,
  276 + "_name": query_name,
  277 + "score_mode": "max",
  278 + "query": {"script_score": script_score_query},
  279 + }
  280 + }
  281 + return {"script_score": {"_name": query_name, **script_score_query}}
  282 +
117 def _apply_source_filter(self, es_query: Dict[str, Any]) -> None: 283 def _apply_source_filter(self, es_query: Dict[str, Any]) -> None:
118 """ 284 """
119 Apply tri-state _source semantics: 285 Apply tri-state _source semantics:
@@ -250,52 +416,21 @@ class ESQueryBuilder: @@ -250,52 +416,21 @@ class ESQueryBuilder:
250 # 3. Add KNN search clauses alongside lexical clauses under the same bool.should 416 # 3. Add KNN search clauses alongside lexical clauses under the same bool.should
251 # Text KNN: k / num_candidates from config; long queries use *_long and higher boost 417 # Text KNN: k / num_candidates from config; long queries use *_long and higher boost
252 if has_embedding: 418 if has_embedding:
253 - text_knn_boost = self.knn_text_boost  
254 - final_knn_k = self.knn_text_k  
255 - final_knn_num_candidates = self.knn_text_num_candidates  
256 - if parsed_query:  
257 - query_tokens = getattr(parsed_query, 'query_tokens', None) or []  
258 - token_count = len(query_tokens)  
259 - if token_count >= 5:  
260 - final_knn_k = self.knn_text_k_long  
261 - final_knn_num_candidates = self.knn_text_num_candidates_long  
262 - text_knn_boost = self.knn_text_boost * 1.4  
263 - recall_clauses.append({  
264 - "knn": {  
265 - "field": self.text_embedding_field,  
266 - "query_vector": query_vector.tolist(),  
267 - "k": final_knn_k,  
268 - "num_candidates": final_knn_num_candidates,  
269 - "boost": text_knn_boost,  
270 - "_name": "knn_query",  
271 - }  
272 - }) 419 + text_knn_clause = self.build_text_knn_clause(
  420 + query_vector,
  421 + parsed_query=parsed_query,
  422 + query_name="knn_query",
  423 + )
  424 + if text_knn_clause:
  425 + recall_clauses.append(text_knn_clause)
273 426
274 if has_image_embedding: 427 if has_image_embedding:
275 - nested_path, _, _ = str(self.image_embedding_field).rpartition(".")  
276 - image_knn_query = {  
277 - "field": self.image_embedding_field,  
278 - "query_vector": image_query_vector.tolist(),  
279 - "k": self.knn_image_k,  
280 - "num_candidates": self.knn_image_num_candidates,  
281 - "boost": self.knn_image_boost,  
282 - }  
283 - if nested_path:  
284 - recall_clauses.append({  
285 - "nested": {  
286 - "path": nested_path,  
287 - "_name": "image_knn_query",  
288 - "query": {"knn": image_knn_query},  
289 - "score_mode": "max",  
290 - }  
291 - })  
292 - else:  
293 - recall_clauses.append({  
294 - "knn": {  
295 - **image_knn_query,  
296 - "_name": "image_knn_query",  
297 - }  
298 - }) 428 + image_knn_clause = self.build_image_knn_clause(
  429 + image_query_vector,
  430 + query_name="image_knn_query",
  431 + )
  432 + if image_knn_clause:
  433 + recall_clauses.append(image_knn_clause)
299 434
300 # 4. Build main query structure: filters and recall 435 # 4. Build main query structure: filters and recall
301 if recall_clauses: 436 if recall_clauses:
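The scoring rule shared by the ANN and exact-rescore builders in this file can be checked outside Elasticsearch with a small sketch. It mirrors the Painless expression `((dotProduct + 1.0) / 2.0) * params.boost` and the `>= 5` token rule from `get_text_knn_plan`; the function and constant names are illustrative, not part of the codebase.

```python
from typing import Sequence

LONG_QUERY_TOKEN_THRESHOLD = 5  # mirrors the `>= 5` check in get_text_knn_plan
LONG_QUERY_BOOST_FACTOR = 1.4   # mirrors `knn_text_boost * 1.4`


def text_knn_boost(base_boost: float, token_count: int) -> float:
    """Boost applied to the text KNN clause; long queries get the 1.4x factor."""
    if token_count >= LONG_QUERY_TOKEN_THRESHOLD:
        return base_boost * LONG_QUERY_BOOST_FACTOR
    return base_boost


def exact_knn_score(
    query_vector: Sequence[float],
    doc_vector: Sequence[float],
    boost: float,
) -> float:
    """Pure-Python equivalent of the script_score source used for exact rescoring."""
    if len(query_vector) != len(doc_vector):
        raise ValueError("query and document vectors must have the same length")
    dot = sum(q * d for q, d in zip(query_vector, doc_vector))
    # Maps a dot product in [-1, 1] (unit vectors) onto [0, 1], then applies the boost.
    return ((dot + 1.0) / 2.0) * boost
```

For unit-length vectors the unboosted score stays in [0, 1], which keeps the exact score on the same scale as the ANN `dot_product` score it replaces during fusion.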
search/rerank_client.py
@@ -396,12 +396,50 @@ def _build_ltr_feature_block( @@ -396,12 +396,50 @@ def _build_ltr_feature_block(
396 } 396 }
397 397
398 398
  399 +def _maybe_append_weighted_knn_terms(
  400 + *,
  401 + term_rows: List[Dict[str, Any]],
  402 + fusion: CoarseRankFusionConfig | RerankFusionConfig,
  403 + knn_components: Optional[Dict[str, Any]],
  404 +) -> None:
  405 + if not knn_components:
  406 + return
  407 +
  408 + weighted_text_knn_score = _to_score(knn_components.get("weighted_text_knn_score"))
  409 + weighted_image_knn_score = _to_score(knn_components.get("weighted_image_knn_score"))
  410 +
  411 + if float(getattr(fusion, "knn_text_exponent", 0.0)) != 0.0:
  412 + text_bias = float(getattr(fusion, "knn_text_bias", fusion.knn_bias))
  413 + term_rows.append(
  414 + {
  415 + "name": "weighted_text_knn_score",
  416 + "raw_score": weighted_text_knn_score,
  417 + "bias": text_bias,
  418 + "exponent": float(fusion.knn_text_exponent),
  419 + "factor": (max(weighted_text_knn_score, 0.0) + text_bias) ** float(fusion.knn_text_exponent),
  420 + }
  421 + )
  422 + if float(getattr(fusion, "knn_image_exponent", 0.0)) != 0.0:
  423 + image_bias = float(getattr(fusion, "knn_image_bias", fusion.knn_bias))
  424 + term_rows.append(
  425 + {
  426 + "name": "weighted_image_knn_score",
  427 + "raw_score": weighted_image_knn_score,
  428 + "bias": image_bias,
  429 + "exponent": float(fusion.knn_image_exponent),
  430 + "factor": (max(weighted_image_knn_score, 0.0) + image_bias)
  431 + ** float(fusion.knn_image_exponent),
  432 + }
  433 + )
  434 +
  435 +
399 def _compute_multiplicative_fusion( 436 def _compute_multiplicative_fusion(
400 *, 437 *,
401 es_score: float, 438 es_score: float,
402 text_score: float, 439 text_score: float,
403 knn_score: float, 440 knn_score: float,
404 fusion: RerankFusionConfig, 441 fusion: RerankFusionConfig,
  442 + knn_components: Optional[Dict[str, Any]] = None,
405 rerank_score: Optional[float] = None, 443 rerank_score: Optional[float] = None,
406 fine_score: Optional[float] = None, 444 fine_score: Optional[float] = None,
407 style_boost: float = 1.0, 445 style_boost: float = 1.0,
@@ -427,6 +465,7 @@ def _compute_multiplicative_fusion( @@ -427,6 +465,7 @@ def _compute_multiplicative_fusion(
427 _add_term("fine_score", fine_score, fusion.fine_bias, fusion.fine_exponent) 465 _add_term("fine_score", fine_score, fusion.fine_bias, fusion.fine_exponent)
428 _add_term("text_score", text_score, fusion.text_bias, fusion.text_exponent) 466 _add_term("text_score", text_score, fusion.text_bias, fusion.text_exponent)
429 _add_term("knn_score", knn_score, fusion.knn_bias, fusion.knn_exponent) 467 _add_term("knn_score", knn_score, fusion.knn_bias, fusion.knn_exponent)
  468 + _maybe_append_weighted_knn_terms(term_rows=term_rows, fusion=fusion, knn_components=knn_components)
430 469
431 fused = 1.0 470 fused = 1.0
432 factors: Dict[str, float] = {} 471 factors: Dict[str, float] = {}
@@ -450,12 +489,30 @@ def _multiply_coarse_fusion_factors( @@ -450,12 +489,30 @@ def _multiply_coarse_fusion_factors(
450 es_score: float, 489 es_score: float,
451 text_score: float, 490 text_score: float,
452 knn_score: float, 491 knn_score: float,
  492 + knn_components: Dict[str, Any],
453 fusion: CoarseRankFusionConfig, 493 fusion: CoarseRankFusionConfig,
454 -) -> Tuple[float, float, float, float]: 494 +) -> Tuple[float, float, float, float, float, float]:
455 es_factor = (max(es_score, 0.0) + fusion.es_bias) ** fusion.es_exponent 495 es_factor = (max(es_score, 0.0) + fusion.es_bias) ** fusion.es_exponent
456 text_factor = (max(text_score, 0.0) + fusion.text_bias) ** fusion.text_exponent 496 text_factor = (max(text_score, 0.0) + fusion.text_bias) ** fusion.text_exponent
457 knn_factor = (max(knn_score, 0.0) + fusion.knn_bias) ** fusion.knn_exponent 497 knn_factor = (max(knn_score, 0.0) + fusion.knn_bias) ** fusion.knn_exponent
458 - return es_factor, text_factor, knn_factor, es_factor * text_factor * knn_factor 498 + text_knn_bias = float(getattr(fusion, "knn_text_bias", fusion.knn_bias))
  499 + image_knn_bias = float(getattr(fusion, "knn_image_bias", fusion.knn_bias))
  500 + text_knn_factor = (
  501 + (max(_to_score(knn_components.get("weighted_text_knn_score")), 0.0) + text_knn_bias)
  502 + ** float(getattr(fusion, "knn_text_exponent", 0.0))
  503 + )
  504 + image_knn_factor = (
  505 + (max(_to_score(knn_components.get("weighted_image_knn_score")), 0.0) + image_knn_bias)
  506 + ** float(getattr(fusion, "knn_image_exponent", 0.0))
  507 + )
  508 + return (
  509 + es_factor,
  510 + text_factor,
  511 + knn_factor,
  512 + text_knn_factor,
  513 + image_knn_factor,
  514 + es_factor * text_factor * knn_factor * text_knn_factor * image_knn_factor,
  515 + )
459 516
460 517
461 def _has_selected_sku(hit: Dict[str, Any]) -> bool: 518 def _has_selected_sku(hit: Dict[str, Any]) -> bool:
@@ -481,10 +538,18 @@ def coarse_resort_hits( @@ -481,10 +538,18 @@ def coarse_resort_hits(
481 knn_components = signal_bundle["knn_components"] 538 knn_components = signal_bundle["knn_components"]
482 text_score = signal_bundle["text_score"] 539 text_score = signal_bundle["text_score"]
483 knn_score = signal_bundle["knn_score"] 540 knn_score = signal_bundle["knn_score"]
484 - es_factor, text_factor, knn_factor, coarse_score = _multiply_coarse_fusion_factors( 541 + (
  542 + es_factor,
  543 + text_factor,
  544 + knn_factor,
  545 + text_knn_factor,
  546 + image_knn_factor,
  547 + coarse_score,
  548 + ) = _multiply_coarse_fusion_factors(
485 es_score=es_score, 549 es_score=es_score,
486 text_score=text_score, 550 text_score=text_score,
487 knn_score=knn_score, 551 knn_score=knn_score,
  552 + knn_components=knn_components,
488 fusion=f, 553 fusion=f,
489 ) 554 )
490 555
@@ -535,6 +600,8 @@ def coarse_resort_hits( @@ -535,6 +600,8 @@ def coarse_resort_hits(
535 "coarse_es_factor": es_factor, 600 "coarse_es_factor": es_factor,
536 "coarse_text_factor": text_factor, 601 "coarse_text_factor": text_factor,
537 "coarse_knn_factor": knn_factor, 602 "coarse_knn_factor": knn_factor,
  603 + "coarse_text_knn_factor": text_knn_factor,
  604 + "coarse_image_knn_factor": image_knn_factor,
538 "coarse_score": coarse_score, 605 "coarse_score": coarse_score,
539 "matched_queries": matched_queries, 606 "matched_queries": matched_queries,
540 "ltr_features": ltr_features, 607 "ltr_features": ltr_features,
@@ -576,7 +643,7 @@ def fuse_scores_and_resort( @@ -576,7 +643,7 @@ def fuse_scores_and_resort(
576 - _rerank_score: score returned by the rerank service 643 - _rerank_score: score returned by the rerank service
577 - _fused_score: fused score 644 - _fused_score: fused score
578 - _text_score: text relevance score (prefers the base_query score from named queries) 645 - _text_score: text relevance score (prefers the base_query score from named queries)
579 - - _knn_score: KNN score (prefers the knn_query score from named queries) 646 + - _knn_score: KNN score (prefers exact named queries; falls back to ANN named queries when missing)
580 647
581 Args: 648 Args:
582 es_hits: list of ES hits (modified in place) 649 es_hits: list of ES hits (modified in place)
@@ -612,6 +679,7 @@ def fuse_scores_and_resort( @@ -612,6 +679,7 @@ def fuse_scores_and_resort(
612 text_score=text_score, 679 text_score=text_score,
613 knn_score=knn_score, 680 knn_score=knn_score,
614 fusion=f, 681 fusion=f,
  682 + knn_components=knn_components,
615 style_boost=style_boost, 683 style_boost=style_boost,
616 ) 684 )
617 fused = fusion_result["score"] 685 fused = fusion_result["score"]
@@ -678,6 +746,8 @@ def fuse_scores_and_resort( @@ -678,6 +746,8 @@ def fuse_scores_and_resort(
678 "es_factor": fusion_result["factors"].get("es_score"), 746 "es_factor": fusion_result["factors"].get("es_score"),
679 "text_factor": fusion_result["factors"].get("text_score"), 747 "text_factor": fusion_result["factors"].get("text_score"),
680 "knn_factor": fusion_result["factors"].get("knn_score"), 748 "knn_factor": fusion_result["factors"].get("knn_score"),
  749 + "text_knn_factor": fusion_result["factors"].get("weighted_text_knn_score"),
  750 + "image_knn_factor": fusion_result["factors"].get("weighted_image_knn_score"),
681 "style_intent_selected_sku": sku_selected, 751 "style_intent_selected_sku": sku_selected,
682 "style_intent_selected_sku_boost": style_boost, 752 "style_intent_selected_sku_boost": style_boost,
683 "matched_queries": signal_bundle["matched_queries"], 753 "matched_queries": signal_bundle["matched_queries"],
@@ -810,6 +880,7 @@ def run_lightweight_rerank( @@ -810,6 +880,7 @@ def run_lightweight_rerank(
810 text_score=text_score, 880 text_score=text_score,
811 knn_score=knn_score, 881 knn_score=knn_score,
812 fusion=f, 882 fusion=f,
  883 + knn_components=signal_bundle["knn_components"],
813 style_boost=style_boost, 884 style_boost=style_boost,
814 ) 885 )
815 886
@@ -846,6 +917,8 @@ def run_lightweight_rerank( @@ -846,6 +917,8 @@ def run_lightweight_rerank(
846 "es_factor": fusion_result["factors"].get("es_score"), 917 "es_factor": fusion_result["factors"].get("es_score"),
847 "text_factor": fusion_result["factors"].get("text_score"), 918 "text_factor": fusion_result["factors"].get("text_score"),
848 "knn_factor": fusion_result["factors"].get("knn_score"), 919 "knn_factor": fusion_result["factors"].get("knn_score"),
  920 + "text_knn_factor": fusion_result["factors"].get("weighted_text_knn_score"),
  921 + "image_knn_factor": fusion_result["factors"].get("weighted_image_knn_score"),
849 "style_intent_selected_sku": sku_selected, 922 "style_intent_selected_sku": sku_selected,
850 "style_intent_selected_sku_boost": style_boost, 923 "style_intent_selected_sku_boost": style_boost,
851 "ltr_features": ltr_features, 924 "ltr_features": ltr_features,
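The per-channel KNN terms added in this file all follow the same `(max(score, 0) + bias) ** exponent` shape, and a channel with a zero exponent contributes no term. A minimal sketch of that rule follows; it simplifies the real code by using a single `knn_bias` for both channels and by treating a missing score as `0.0`, which is an assumption about what `_to_score` does.

```python
from typing import Any, Dict, Optional


def fusion_factor(raw_score: float, bias: float, exponent: float) -> float:
    """One multiplicative fusion term: clamp at zero, add the bias, raise to the exponent."""
    return (max(raw_score, 0.0) + bias) ** exponent


def weighted_knn_factors(
    knn_components: Optional[Dict[str, Any]],
    *,
    knn_bias: float,
    knn_text_exponent: float = 0.0,
    knn_image_exponent: float = 0.0,
) -> Dict[str, float]:
    """Return only the factors whose exponent is non-zero, keyed by term name."""
    factors: Dict[str, float] = {}
    if not knn_components:
        return factors
    channels = (
        ("weighted_text_knn_score", knn_text_exponent),
        ("weighted_image_knn_score", knn_image_exponent),
    )
    for name, exponent in channels:
        if exponent != 0.0:
            # Assumption: a missing or None score is treated as 0.0.
            raw = float(knn_components.get(name) or 0.0)
            factors[name] = fusion_factor(raw, knn_bias, exponent)
    return factors


example = weighted_knn_factors(
    {"weighted_text_knn_score": 0.8, "weighted_image_knn_score": 0.5},
    knn_bias=0.1,
    knn_text_exponent=2.0,
    knn_image_exponent=3.0,
)
```

The example inputs match the values used in `tests/test_rerank_client.py`, so the two factors correspond to `(0.8 + 0.1) ** 2.0` and `(0.5 + 0.1) ** 3.0`.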
search/searcher.py
@@ -242,67 +242,29 @@ class Searcher: @@ -242,67 +242,29 @@ class Searcher:
242 return configured 242 return configured
243 return int(self.config.rerank.rerank_window) 243 return int(self.config.rerank.rerank_window)
244 244
245 - @staticmethod  
246 - def _vector_to_list(vector: Any) -> List[float]:  
247 - if vector is None:  
248 - return []  
249 - if hasattr(vector, "tolist"):  
250 - values = vector.tolist()  
251 - else:  
252 - values = list(vector)  
253 - return [float(v) for v in values]  
254 -  
255 def _build_exact_knn_rescore( 245 def _build_exact_knn_rescore(
256 self, 246 self,
257 *, 247 *,
258 query_vector: Any, 248 query_vector: Any,
259 image_query_vector: Any, 249 image_query_vector: Any,
  250 + parsed_query: Optional[ParsedQuery] = None,
260 ) -> Optional[Dict[str, Any]]: 251 ) -> Optional[Dict[str, Any]]:
261 clauses: List[Dict[str, Any]] = [] 252 clauses: List[Dict[str, Any]] = []
262 253
263 - if query_vector is not None and self.text_embedding_field:  
264 - clauses.append(  
265 - {  
266 - "script_score": {  
267 - "_name": "exact_text_knn_query",  
268 - "query": {"exists": {"field": self.text_embedding_field}},  
269 - "script": {  
270 - # Keep exact score on the same [0, 1]-ish scale as KNN dot_product recall.  
271 - "source": (  
272 - f"(dotProduct(params.query_vector, '{self.text_embedding_field}') + 1.0) / 2.0"  
273 - ),  
274 - "params": {"query_vector": self._vector_to_list(query_vector)},  
275 - },  
276 - }  
277 - }  
278 - ) 254 + text_clause = self.query_builder.build_exact_text_knn_rescore_clause(
  255 + query_vector,
  256 + parsed_query=parsed_query,
  257 + query_name="exact_text_knn_query",
  258 + )
  259 + if text_clause:
  260 + clauses.append(text_clause)
279 261
280 - if image_query_vector is not None and self.image_embedding_field:  
281 - nested_path, _, _ = str(self.image_embedding_field).rpartition(".")  
282 - if nested_path:  
283 - clauses.append(  
284 - {  
285 - "nested": {  
286 - "path": nested_path,  
287 - "_name": "exact_image_knn_query",  
288 - "score_mode": "max",  
289 - "query": {  
290 - "script_score": {  
291 - "query": {"exists": {"field": self.image_embedding_field}},  
292 - "script": {  
293 - # Keep exact score on the same [0, 1]-ish scale as KNN dot_product recall.  
294 - "source": (  
295 - f"(dotProduct(params.query_vector, '{self.image_embedding_field}') + 1.0) / 2.0"  
296 - ),  
297 - "params": {  
298 - "query_vector": self._vector_to_list(image_query_vector),  
299 - },  
300 - },  
301 - }  
302 - },  
303 - }  
304 - }  
305 - ) 262 + image_clause = self.query_builder.build_exact_image_knn_rescore_clause(
  263 + image_query_vector,
  264 + query_name="exact_image_knn_query",
  265 + )
  266 + if image_clause:
  267 + clauses.append(image_clause)
306 268
307 if not clauses: 269 if not clauses:
308 return None 270 return None
@@ -330,12 +292,14 @@ class Searcher: @@ -330,12 +292,14 @@ class Searcher:
330 in_rank_window: bool, 292 in_rank_window: bool,
331 query_vector: Any, 293 query_vector: Any,
332 image_query_vector: Any, 294 image_query_vector: Any,
  295 + parsed_query: Optional[ParsedQuery] = None,
333 ) -> None: 296 ) -> None:
334 if not in_rank_window or not self.config.rerank.exact_knn_rescore_enabled: 297 if not in_rank_window or not self.config.rerank.exact_knn_rescore_enabled:
335 return 298 return
336 rescore = self._build_exact_knn_rescore( 299 rescore = self._build_exact_knn_rescore(
337 query_vector=query_vector, 300 query_vector=query_vector,
338 image_query_vector=image_query_vector, 301 image_query_vector=image_query_vector,
  302 + parsed_query=parsed_query,
339 ) 303 )
340 if not rescore: 304 if not rescore:
341 return 305 return
@@ -689,6 +653,7 @@ class Searcher: @@ -689,6 +653,7 @@ class Searcher:
689 in_rank_window=in_rank_window, 653 in_rank_window=in_rank_window,
690 query_vector=parsed_query.query_vector if enable_embedding else None, 654 query_vector=parsed_query.query_vector if enable_embedding else None,
691 image_query_vector=image_query_vector, 655 image_query_vector=image_query_vector,
  656 + parsed_query=parsed_query,
692 ) 657 )
693 658
694 # Add facets for faceted search 659 # Add facets for faceted search
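The part of `_build_exact_knn_rescore` that wraps the clauses is not shown in this hunk. Going by the commit message (`score_mode=total`, `rescore_query_weight=0.0`, default window 160), the resulting body might look roughly like the sketch below. The `bool.should` wrapper and `query_weight=1.0` are assumptions, and the function name is illustrative.

```python
from typing import Any, Dict, List, Optional


def build_exact_knn_rescore_body(
    clauses: List[Dict[str, Any]],
    window_size: int = 160,
) -> Optional[Dict[str, Any]]:
    """Sketch of an Elasticsearch `rescore` section that adds named scores only."""
    if not clauses:
        return None
    return {
        "window_size": int(window_size),
        "query": {
            # final = query_weight * original + rescore_query_weight * rescore,
            # so a rescore weight of 0.0 leaves the original ordering untouched.
            "score_mode": "total",
            "query_weight": 1.0,
            "rescore_query_weight": 0.0,
            # Assumption: the named clauses are combined as optional should clauses.
            "rescore_query": {"bool": {"should": list(clauses)}},
        },
    }


body = build_exact_knn_rescore_body(
    [{"script_score": {"_name": "exact_text_knn_query"}}]
)
```

With this shape the exact scores surface only through `matched_queries`, which is what the fusion stage reads.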
tests/test_es_query_builder.py
@@ -208,3 +208,36 @@ def test_image_knn_clause_is_added_alongside_base_translation_and_text_knn(): @@ -208,3 +208,36 @@ def test_image_knn_clause_is_added_alongside_base_translation_and_text_knn():
208 assert image_knn["path"] == "image_embedding" 208 assert image_knn["path"] == "image_embedding"
209 assert image_knn["score_mode"] == "max" 209 assert image_knn["score_mode"] == "max"
210 assert image_knn["query"]["knn"]["field"] == "image_embedding.vector" 210 assert image_knn["query"]["knn"]["field"] == "image_embedding.vector"
  211 +
  212 +
  213 +def test_text_knn_plan_is_reused_for_ann_and_exact_rescore():
  214 + qb = _builder()
  215 + parsed_query = SimpleNamespace(query_tokens=["a", "b", "c", "d", "e"])
  216 +
  217 + ann_clause = qb.build_text_knn_clause(
  218 + np.array([0.1, 0.2, 0.3]),
  219 + parsed_query=parsed_query,
  220 + )
  221 + exact_clause = qb.build_exact_text_knn_rescore_clause(
  222 + np.array([0.1, 0.2, 0.3]),
  223 + parsed_query=parsed_query,
  224 + )
  225 +
  226 + assert ann_clause is not None
  227 + assert exact_clause is not None
  228 + assert ann_clause["knn"]["k"] == qb.knn_text_k_long
  229 + assert ann_clause["knn"]["num_candidates"] == qb.knn_text_num_candidates_long
  230 + assert ann_clause["knn"]["boost"] == qb.knn_text_boost * 1.4
  231 + assert exact_clause["script_score"]["script"]["params"]["boost"] == qb.knn_text_boost * 1.4
  232 +
  233 +
  234 +def test_image_knn_plan_is_reused_for_ann_and_exact_rescore():
  235 + qb = _builder()
  236 +
  237 + ann_clause = qb.build_image_knn_clause(np.array([0.4, 0.5, 0.6]))
  238 + exact_clause = qb.build_exact_image_knn_rescore_clause(np.array([0.4, 0.5, 0.6]))
  239 +
  240 + assert ann_clause is not None
  241 + assert exact_clause is not None
  242 + assert ann_clause["nested"]["query"]["knn"]["boost"] == qb.knn_image_boost
  243 + assert exact_clause["nested"]["query"]["script_score"]["script"]["params"]["boost"] == qb.knn_image_boost
tests/test_rerank_client.py
1 from math import isclose 1 from math import isclose
2 2
3 -from config.schema import RerankFusionConfig  
4 -from search.rerank_client import fuse_scores_and_resort, run_lightweight_rerank 3 +from config.schema import CoarseRankFusionConfig, RerankFusionConfig
  4 +from search.rerank_client import coarse_resort_hits, fuse_scores_and_resort, run_lightweight_rerank
5 5
6 6
7 def test_fuse_scores_and_resort_aggregates_text_components_and_keeps_rerank_primary(): 7 def test_fuse_scores_and_resort_aggregates_text_components_and_keeps_rerank_primary():
@@ -257,6 +257,96 @@ def test_fuse_scores_and_resort_applies_knn_dismax_weights_and_tie_breaker(): @@ -257,6 +257,96 @@ def test_fuse_scores_and_resort_applies_knn_dismax_weights_and_tie_breaker():
     assert isclose(debug[0]["knn_support_score"], 0.5, rel_tol=1e-9)
 
 
+def test_fuse_scores_and_resort_can_add_weighted_text_and_image_knn_factors():
+    hits = [
+        {
+            "_id": "a",
+            "_score": 1.0,
+            "matched_queries": {
+                "base_query": 2.0,
+                "knn_query": 0.4,
+                "image_knn_query": 0.5,
+            },
+        }
+    ]
+    fusion = RerankFusionConfig(
+        rerank_bias=0.0,
+        rerank_exponent=1.0,
+        text_bias=0.0,
+        text_exponent=1.0,
+        knn_text_weight=2.0,
+        knn_image_weight=1.0,
+        knn_tie_breaker=0.25,
+        knn_bias=0.1,
+        knn_exponent=1.0,
+        knn_text_exponent=2.0,
+        knn_image_exponent=3.0,
+    )
+
+    debug = fuse_scores_and_resort(hits, [0.8], fusion=fusion, debug=True)
+
+    weighted_text_knn = 0.8
+    weighted_image_knn = 0.5
+    expected_knn = weighted_text_knn + 0.25 * weighted_image_knn
+    expected_fused = (
+        0.8
+        * 2.0
+        * (expected_knn + 0.1)
+        * ((weighted_text_knn + 0.1) ** 2.0)
+        * ((weighted_image_knn + 0.1) ** 3.0)
+    )
+
+    assert isclose(hits[0]["_fused_score"], expected_fused, rel_tol=1e-9)
+    assert isclose(debug[0]["text_knn_factor"], (weighted_text_knn + 0.1) ** 2.0, rel_tol=1e-9)
+    assert isclose(debug[0]["image_knn_factor"], (weighted_image_knn + 0.1) ** 3.0, rel_tol=1e-9)
+    assert "weighted_text_knn_score=" in debug[0]["fusion_summary"]
+    assert "weighted_image_knn_score=" in debug[0]["fusion_summary"]
+
+
+def test_coarse_resort_hits_can_add_weighted_text_and_image_knn_factors():
+    hits = [
+        {
+            "_id": "coarse-a",
+            "_score": 1.0,
+            "matched_queries": {
+                "base_query": 2.0,
+                "knn_query": 0.4,
+                "image_knn_query": 0.5,
+            },
+        }
+    ]
+    fusion = CoarseRankFusionConfig(
+        es_bias=0.0,
+        es_exponent=1.0,
+        text_bias=0.0,
+        text_exponent=1.0,
+        knn_text_weight=2.0,
+        knn_image_weight=1.0,
+        knn_tie_breaker=0.25,
+        knn_bias=0.1,
+        knn_exponent=1.0,
+        knn_text_exponent=2.0,
+        knn_image_exponent=3.0,
+    )
+
+    debug = coarse_resort_hits(hits, fusion=fusion, debug=True)
+
+    weighted_text_knn = 0.8
+    weighted_image_knn = 0.5
+    expected_knn = weighted_text_knn + 0.25 * weighted_image_knn
+    expected_coarse = (
+        1.0
+        * 2.0
+        * (expected_knn + 0.1)
+        * ((weighted_text_knn + 0.1) ** 2.0)
+        * ((weighted_image_knn + 0.1) ** 3.0)
+    )
+
+    assert isclose(hits[0]["_coarse_score"], expected_coarse, rel_tol=1e-9)
+    assert isclose(debug[0]["coarse_text_knn_factor"], (weighted_text_knn + 0.1) ** 2.0, rel_tol=1e-9)
+    assert isclose(debug[0]["coarse_image_knn_factor"], (weighted_image_knn + 0.1) ** 3.0, rel_tol=1e-9)
+
+
 def test_run_lightweight_rerank_sorts_by_fused_stage_score(monkeypatch):
     hits = [
         {
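The arithmetic asserted by the two new tests above can be reproduced in isolation. The block below is a minimal sketch, not the code in `search/rerank_client.py`: the name `fused_score_sketch` and the `FIXTURE` dict are hypothetical, and the "larger weighted score plus tie-breaker times the smaller one" combination is an assumption inferred from `expected_knn = weighted_text_knn + 0.25 * weighted_image_knn`, where the text score happens to be the larger one. The stage and text factors are passed in as plain numbers because both tests set their bias to 0.0 and their exponent to 1.0.

```python
from math import isclose


def fused_score_sketch(
    stage_score: float,  # rerank score (fine stage) or ES score (coarse stage)
    text_score: float,   # named score of "base_query"
    text_knn: float,     # named score of the text KNN query
    image_knn: float,    # named score of the image KNN query
    *,
    knn_text_weight: float,
    knn_image_weight: float,
    knn_tie_breaker: float,
    knn_bias: float,
    knn_exponent: float,
    knn_text_exponent: float,
    knn_image_exponent: float,
) -> float:
    """Multiply the stage, text and KNN factors the way the tests expect."""
    weighted_text = text_knn * knn_text_weight
    weighted_image = image_knn * knn_image_weight

    # Assumption: dis-max style combination of the two weighted KNN scores.
    primary = max(weighted_text, weighted_image)
    support = min(weighted_text, weighted_image)
    combined_knn = primary + knn_tie_breaker * support

    return (
        stage_score
        * text_score
        * (combined_knn + knn_bias) ** knn_exponent
        * (weighted_text + knn_bias) ** knn_text_exponent
        * (weighted_image + knn_bias) ** knn_image_exponent
    )


# Values copied from the test fixtures, not project defaults.
FIXTURE = dict(
    knn_text_weight=2.0,
    knn_image_weight=1.0,
    knn_tie_breaker=0.25,
    knn_bias=0.1,
    knn_exponent=1.0,
    knn_text_exponent=2.0,
    knn_image_exponent=3.0,
)

fused = fused_score_sketch(0.8, 2.0, 0.4, 0.5, **FIXTURE)   # rerank score 0.8
coarse = fused_score_sketch(1.0, 2.0, 0.4, 0.5, **FIXTURE)  # ES score 1.0
```

The only difference between the two calls is the leading stage score, which matches the single differing term (`0.8` versus `1.0`) in `expected_fused` and `expected_coarse`.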
tests/test_search_rerank_window.py
@@ -1055,6 +1055,7 @@ def test_searcher_attaches_exact_knn_rescore_for_rank_window(monkeypatch):
                 translations={},
                 query_vector=np.array([0.1, 0.2, 0.3], dtype=np.float32),
                 image_query_vector=np.array([0.4, 0.5, 0.6], dtype=np.float32),
+                query_tokens=["dress", "formal", "spring", "summer", "floral"],
             )
 
     es_client = _FakeESClient(total_hits=5)
@@ -1081,8 +1082,12 @@ def test_searcher_attaches_exact_knn_rescore_for_rank_window(monkeypatch):
         es_index_name=base.es_index_name,
         es_settings=base.es_settings,
     )
-    searcher = _build_searcher(config, es_client)
-    searcher.query_parser = _VectorQueryParser()
+    searcher = Searcher(
+        es_client=es_client,
+        config=config,
+        query_parser=_VectorQueryParser(),
+        image_encoder=SimpleNamespace(),
+    )
     context = create_request_context(reqid="exact-rescore", uid="u-exact")
 
     monkeypatch.setattr(
@@ -1112,6 +1117,36 @@ def test_searcher_attaches_exact_knn_rescore_for_rank_window(monkeypatch):
         elif "nested" in clause:
             names.append(clause["nested"]["_name"])
     assert names == ["exact_text_knn_query", "exact_image_knn_query"]
+    recall_query = body["query"]
+    if "bool" in recall_query and recall_query["bool"].get("must"):
+        recall_query = recall_query["bool"]["must"][0]
+    if "function_score" in recall_query:
+        recall_query = recall_query["function_score"]["query"]
+    recall_should = recall_query["bool"]["should"]
+    text_knn_clause = next(
+        clause["knn"]
+        for clause in recall_should
+        if clause.get("knn", {}).get("_name") == "knn_query"
+    )
+    image_knn_clause = next(
+        clause["nested"]["query"]["knn"]
+        for clause in recall_should
+        if clause.get("nested", {}).get("_name") == "image_knn_query"
+    )
+    exact_text_clause = next(
+        clause["script_score"]
+        for clause in should
+        if clause.get("script_score", {}).get("_name") == "exact_text_knn_query"
+    )
+    exact_image_clause = next(
+        clause["nested"]["query"]["script_score"]
+        for clause in should
+        if clause.get("nested", {}).get("_name") == "exact_image_knn_query"
+    )
+    assert text_knn_clause["boost"] == 28.0
+    assert exact_text_clause["script"]["params"]["boost"] == text_knn_clause["boost"]
+    assert image_knn_clause["boost"] == 20.0
+    assert exact_image_clause["script"]["params"]["boost"] == image_knn_clause["boost"]
 
 
 def test_searcher_skips_exact_knn_rescore_outside_rank_window(monkeypatch):
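The boost parity asserted in the hunk above (text `28.0`, image `20.0`, identical on the ANN clause and the exact `script_score` clause) follows from both clauses reading a single shared plan. The sketch below illustrates that idea only and is not the project's `_get_knn_plan()`: `KnnPlan`, `get_knn_plan_sketch`, `build_text_clauses_sketch`, the base boosts of `20.0`, and the token threshold of `4` are all hypothetical. The base text boost is inferred from `28.0 / 1.4`, the `1.4` long-query factor comes from the commit message, and the clause bodies omit fields a real query would need (for example `k` and `num_candidates`).

```python
from dataclasses import dataclass
from math import isclose
from typing import Any, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class KnnPlan:
    """Boosts shared by the ANN recall clause and the exact rescore clause."""
    text_boost: float
    image_boost: float


def get_knn_plan_sketch(
    query_tokens: Sequence[str],
    *,
    base_text_boost: float = 20.0,        # assumption: inferred from 28.0 / 1.4
    base_image_boost: float = 20.0,       # matches the asserted image boost
    long_query_token_threshold: int = 4,  # assumption: real threshold not shown
    long_query_factor: float = 1.4,       # from the commit message
) -> KnnPlan:
    text_boost = base_text_boost
    if len(query_tokens) > long_query_token_threshold:
        # Long queries get an extra multiplier on the text boost only.
        text_boost *= long_query_factor
    return KnnPlan(text_boost=text_boost, image_boost=base_image_boost)


def build_text_clauses_sketch(
    plan: KnnPlan, query_vector: List[float]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Build the ANN clause and the exact clause from the same plan."""
    ann_clause = {
        "knn": {
            "field": "title_embedding",
            "query_vector": query_vector,
            "boost": plan.text_boost,
            "_name": "knn_query",
        }
    }
    exact_clause = {
        "script_score": {
            "_name": "exact_text_knn_query",
            "query": {"match_all": {}},  # assumption: inner query not shown
            "script": {
                "source": (
                    "((dotProduct(params.query_vector, 'title_embedding')"
                    " + 1.0) / 2.0) * params.boost"
                ),
                "params": {"query_vector": query_vector, "boost": plan.text_boost},
            },
        }
    }
    return ann_clause, exact_clause


tokens = ["dress", "formal", "spring", "summer", "floral"]
plan = get_knn_plan_sketch(tokens)
ann_clause, exact_clause = build_text_clauses_sketch(plan, [0.1, 0.2, 0.3])
```

Because both clauses take their boost from the same `KnnPlan` instance, a later change to the boost rule cannot make the ANN score and the exact score drift onto different scales, which is the property the test pins down.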