fix(suggestion): decouple SAT ES recall size from /suggest HTTP size param

Decouple the SAT ES recall size from the client-facing `size` and make it configurable (fixes inconsistent results from the suggest endpoint when the `size` parameter differs).

**Problem**

When `size=10` vs `size=40`, the SAT (search-as-you-type) ES `_search` used the client `size` as its own `size`, so the candidate pool differed between requests and the top-N results after merging with completion suggestions were inconsistent. The `size` parameter controlled three things at once: completion count, SAT ES `size`, and final truncation.

**Solution**

Introduce dedicated configurable bounds for SAT recall, decoupled from the client-facing `size` (final result count).

- Compute SAT ES request size as `min(max(client_size, sat_recall_min), sat_recall_cap)`.
- Completion still uses the raw client `size`.
- Final merge, sort, and truncation logic (`_finalize_suggestion_list(..., size)`) unchanged.

**Configuration**

- New dataclass `SuggestionConfig` in `config/schema.py` with fields:
  - `sat_recall_min: int = 40`
  - `sat_recall_cap: int = 100`
- Root `config.yaml` now supports `suggestion.sat_recall_min` / `suggestion.sat_recall_cap`.
- Tenant overrides: `tenant_config.default.suggestion` or `tenant_config.tenants.<id>.suggestion`; only the keys being overridden need to be specified. Merge order: root `SuggestionConfig` → default fragment → tenant fragment.
- Sanity check: if `sat_recall_cap < sat_recall_min`, both the loader and the runtime resolver raise `cap` to `min`; `sat_recall_min` itself is clamped to at least 1.

**Impact**

- Small `size` (e.g., 10) still gets a stable SAT candidate pool (at least `sat_recall_min`).
- Large `size` is capped to `sat_recall_cap`, bounding ES query cost.
- Final result count is still bounded by the HTTP `size` parameter.
- Backward compatible: with the defaults (`sat_recall_min=40`, `sat_recall_cap=100`), the SAT ES request size is unchanged for client `size` between 40 and 100.

**Code changes**

- `config/schema.py`: add `SuggestionConfig` and integrate into `AppConfig`.
- `config/loader.py`: build `SuggestionConfig` from the root `suggestion` block, applying the same sanity checks and defaults.
- `suggestion/service.py`:
  - `_resolve_suggestion_config_for_tenant()`: tenant-aware config merging.
  - `SuggestionService._suggest()`: compute `sat_es_size` using the new bounds.
- Tenant config example:

```yaml
tenant_config:
  tenants:
    '163':
      suggestion:
        sat_recall_cap: 80
```

**Tests**

- `pytest tests/test_suggestions.py`
- `pytest tests/ci/test_service_api_contracts.py`

All related tests pass.
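
As a rough illustration of the clamp described in the Solution section, the sketch below applies `min(max(client_size, sat_recall_min), sat_recall_cap)` with the default bounds. The name `sat_recall_size` and its keyword arguments are stand-ins chosen for this example, not the names used in the change.

```python
def sat_recall_size(client_size: int, recall_min: int = 40, recall_cap: int = 100) -> int:
    """Clamp the client size into [recall_min, recall_cap] for the SAT ES request."""
    requested = max(int(client_size), 0)  # treat negative sizes as 0
    return min(max(requested, recall_min), recall_cap)


# size=10 and size=40 now request the same SAT candidate pool (40),
# while size=200 is capped at 100.
for size in (10, 40, 50, 200):
    print(size, "->", sat_recall_size(size))
```

Under these defaults, sizes at or below `sat_recall_min` share one candidate pool; sizes between the two bounds still pass through unchanged.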

tangwang
1 parent 12a75c46
Showing 6 changed files with 100 additions and 3 deletions
config/config.yaml
config/loader.py
config/schema.py
docs/issues/issue-2026-04-16-bayes寻参-数据集扩增.md
suggestion/service.py
tests/test_suggestions.py
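
The override order stated in the commit message (root `SuggestionConfig` → `tenant_config.default.suggestion` → per-tenant `suggestion`) can be sketched with plain dictionaries. This is a simplified illustration rather than the code in the diff: `merge_suggestion_config`, `DEFAULTS`, and the argument names are hypothetical.

```python
from typing import Any, Dict, Optional

# Defaults taken from the commit message; later layers override earlier ones.
DEFAULTS = {"sat_recall_min": 40, "sat_recall_cap": 100}


def merge_suggestion_config(
    root: Dict[str, Any],
    default_fragment: Optional[Dict[str, Any]] = None,
    tenant_fragment: Optional[Dict[str, Any]] = None,
) -> Dict[str, int]:
    """Apply root, then default fragment, then tenant fragment; then sanity-check."""
    merged = dict(DEFAULTS)
    for layer in (root, default_fragment or {}, tenant_fragment or {}):
        for key in DEFAULTS:
            if layer.get(key) is not None:  # only keys that are set override
                merged[key] = int(layer[key])
    # Sanity checks: min is at least 1, cap is at least min.
    merged["sat_recall_min"] = max(merged["sat_recall_min"], 1)
    merged["sat_recall_cap"] = max(merged["sat_recall_cap"], merged["sat_recall_min"])
    return merged


cfg = merge_suggestion_config(
    root={"sat_recall_min": 40, "sat_recall_cap": 100},
    default_fragment={"sat_recall_min": 30},
    tenant_fragment={"sat_recall_cap": 80},
)
print(cfg)  # {'sat_recall_min': 30, 'sat_recall_cap': 80}
```

Each fragment only needs the keys it overrides, so a tenant that sets `sat_recall_cap` alone inherits `sat_recall_min` from the default fragment or the root.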
@@ -45,6 +45,9 @@ assets:
   query_rewrite_dictionary_path: config/dictionaries/query_rewrite.dict
 product_enrich:
   max_workers: 40
+suggestion:
+  sat_recall_min: 40
+  sat_recall_cap: 100
 search_evaluation:
   artifact_root: artifacts/search_evaluation
   queries_file: scripts/evaluation/queries/queries.txt
@@ -50,6 +50,7 @@ from config.schema import (
     SearchEvaluationDatasetConfig,
     SecretsConfig,
     ServicesConfig,
+    SuggestionConfig,
     SPUConfig,
     TenantCatalogConfig,
     TranslationServiceConfig,
@@ -256,6 +257,9 @@ class AppConfigLoader:
         rewrite_dictionary = _read_rewrite_dictionary(rewrite_path)
         search_config = self._build_search_config(raw, rewrite_dictionary)
+        suggestion_config = self._build_suggestion_config(
+            raw.get("suggestion") if isinstance(raw.get("suggestion"), dict) else {}
+        )
         services_config = self._build_services_config(raw.get("services") or {})
         tenants_config = self._build_tenants_config(raw.get("tenant_config") or {})
         runtime_config = self._build_runtime_config()
@@ -278,6 +282,7 @@ class AppConfigLoader:
             infrastructure=infrastructure_config,
             product_enrich=product_enrich_config,
             search=search_config,
+            suggestion=suggestion_config,
             services=services_config,
             tenants=tenants_config,
             assets=AssetsConfig(query_rewrite_dictionary_path=rewrite_path),
@@ -291,6 +296,7 @@ class AppConfigLoader:
             infrastructure=app_config.infrastructure,
             product_enrich=app_config.product_enrich,
             search=app_config.search,
+            suggestion=app_config.suggestion,
             services=app_config.services,
             tenants=app_config.tenants,
             assets=app_config.assets,
@@ -727,6 +733,17 @@ class AppConfigLoader:
             es_settings=dict(raw.get("es_settings") or {}),
         )
+    def _build_suggestion_config(self, raw: Dict[str, Any]) -> SuggestionConfig:
+        if not isinstance(raw, dict):
+            raw = {}
+        mn = int(raw.get("sat_recall_min", 40))
+        cap = int(raw.get("sat_recall_cap", 100))
+        if mn < 1:
+            mn = 1
+        if cap < mn:
+            cap = mn
+        return SuggestionConfig(sat_recall_min=mn, sat_recall_cap=cap)
+
     def _build_services_config(self, raw: Dict[str, Any]) -> ServicesConfig:
         if not isinstance(raw, dict):
             raise ConfigurationError("services must be a mapping")
@@ -213,6 +213,16 @@ class RerankConfig:
 @dataclass(frozen=True)
+class SuggestionConfig:
+    """Online suggestion API tuning (completion + SAT recall)."""
+
+    #: ES bool_prefix (search-as-you-type) request size lower bound after client ``size``.
+    sat_recall_min: int = 40
+    #: ES bool_prefix request size upper bound (clips ``max(client_size, sat_recall_min)``).
+    sat_recall_cap: int = 100
+
+
+@dataclass(frozen=True)
 class SearchConfig:
     """Search behavior configuration shared by backend and indexer."""
@@ -474,6 +484,7 @@ class AppConfig:
     infrastructure: InfrastructureConfig
     product_enrich: ProductEnrichConfig
     search: SearchConfig
+    suggestion: SuggestionConfig
     services: ServicesConfig
     tenants: TenantCatalogConfig
     assets: AssetsConfig
@@ -6,6 +6,8 @@ import logging
 import time
 from typing import Any, Dict, List, Optional
+from config.loader import get_app_config
+from config.schema import SuggestionConfig
 from config.tenant_config_loader import get_tenant_config_loader
 from query.tokenization import simple_tokenize_query
 from suggestion.builder import get_suggestion_alias_name
@@ -25,6 +27,37 @@ def _score_with_token_length_penalty(item: Dict[str, Any]) -> float:
     return base * _suggestion_length_factor(str(item.get("text") or ""))
+def _resolve_suggestion_config_for_tenant(tenant_id: str) -> SuggestionConfig:
+    """Merge root ``suggestion`` with ``tenant_config.default`` / per-tenant ``suggestion`` overrides."""
+    app = get_app_config()
+    base = app.suggestion
+    d = app.tenants.default if isinstance(app.tenants.default, dict) else {}
+    d_s = d.get("suggestion") if isinstance(d.get("suggestion"), dict) else {}
+    t = app.tenants.tenants.get(str(tenant_id), {}) or {}
+    t_s = t.get("suggestion") if isinstance(t.get("suggestion"), dict) else {}
+
+    def _i(key: str, fallback: int) -> int:
+        if key in t_s and t_s[key] is not None:
+            return int(t_s[key])
+        if key in d_s and d_s[key] is not None:
+            return int(d_s[key])
+        return fallback
+
+    mn = _i("sat_recall_min", base.sat_recall_min)
+    cap = _i("sat_recall_cap", base.sat_recall_cap)
+    if mn < 1:
+        mn = 1
+    if cap < mn:
+        cap = mn
+    return SuggestionConfig(sat_recall_min=mn, sat_recall_cap=cap)
+
+
+def _sat_es_size(client_size: int, cfg: SuggestionConfig) -> int:
+    """ES size for SAT (bool_prefix) recall: ``min(max(client_size, sat_recall_min), sat_recall_cap)``."""
+    c = max(int(client_size), 0)
+    return min(max(c, cfg.sat_recall_min), cfg.sat_recall_cap)
+
+
 class SuggestionService:
     def __init__(self, es_client: ESClient):
         self.es_client = es_client
@@ -123,6 +156,8 @@ class SuggestionService:
         start = time.time()
         query_text = str(query or "").strip()
         resolved_lang = self._resolve_language(tenant_id, language)
+        sugg_cfg = _resolve_suggestion_config_for_tenant(tenant_id)
+        sat_es_size = _sat_es_size(size, sugg_cfg)
         index_name = self._resolve_search_target(tenant_id)
         if not index_name:
             # On a fresh ES cluster the suggestion index might not be built yet.
@@ -243,7 +278,7 @@ class SuggestionService:
         es_resp = self.es_client.search(
             index_name=index_name,
             body=dsl,
-            size=size,
+            size=sat_es_size,
             from_=0,
             routing=str(tenant_id),
         )
@@ -269,12 +304,13 @@ class SuggestionService:
         took_ms = int((time.time() - start) * 1000)
         logger.info(
-            "suggest completion+sat-return | tenant=%s lang=%s q=%s completion=%d sat_hits=%d took_ms=%d completion_ms=%d sat_ms=%d",
+            "suggest completion+sat-return | tenant=%s lang=%s q=%s completion=%d sat_hits=%d sat_es_size=%d took_ms=%d completion_ms=%d sat_ms=%d",
             tenant_id,
             resolved_lang,
             query_text,
             len(completion_items),
             len(hits),
+            sat_es_size,
             took_ms,
             completion_ms,
             sat_ms,
@@ -10,7 +10,12 @@ from suggestion.builder import (
     get_suggestion_alias_name,
     get_suggestion_versioned_index_name,
 )
-from suggestion.service import SuggestionService
+from config.schema import SuggestionConfig
+from suggestion.service import (
+    SuggestionService,
+    _resolve_suggestion_config_for_tenant,
+    _sat_es_size,
+)
 pytestmark = [pytest.mark.suggestion, pytest.mark.regression]
@@ -303,6 +308,31 @@ def test_suggestion_service_basic_flow_uses_alias_and_routing():
     assert len(search_calls) >= 2
     assert any(x.get("routing") == "1" for x in search_calls)
     assert any(x.get("index") == alias_name for x in search_calls)
+    sat_calls = [x for x in search_calls if "suggest" not in (x.get("body") or {})]
+    assert sat_calls[-1]["size"] == 40
+
+
+def test_sat_es_size_clamped_by_suggestion_config():
+    cfg = SuggestionConfig(sat_recall_min=40, sat_recall_cap=100)
+    assert _sat_es_size(10, cfg) == 40
+    assert _sat_es_size(50, cfg) == 50
+    assert _sat_es_size(200, cfg) == 100
+
+
+def test_resolve_suggestion_config_merges_tenant_yaml(monkeypatch):
+    from types import SimpleNamespace
+
+    fake = SimpleNamespace(
+        suggestion=SuggestionConfig(sat_recall_min=40, sat_recall_cap=100),
+        tenants=SimpleNamespace(
+            default={"suggestion": {"sat_recall_min": 30}},
+            tenants={"99": {"suggestion": {"sat_recall_cap": 80}}},
+        ),
+    )
+    monkeypatch.setattr("suggestion.service.get_app_config", lambda: fake)
+    cfg = _resolve_suggestion_config_for_tenant("99")
+    assert cfg.sat_recall_min == 30
+    assert cfg.sat_recall_cap == 80
 def test_publish_alias_and_cleanup_old_versions(monkeypatch):