19 Mar, 2026

2 commits

  • The instability is very likely real overload, but `lsof -i :6005 | wc -l
    = 75` alone does not prove it. What does matter is the live shape of the
    service: it is a single `uvicorn` worker on port `6005`, and the code
    had one shared process handling both text and image requests, with image
    work serialized behind a single lock. Under bursty image traffic,
    requests could pile up and sit blocked with almost no useful tracing,
    which matches the “only blocking observed” symptom.
    
    now adds persistent log files, request IDs, per-request
    request/response/failure logs, text microbatch dispatch logs, health
    stats with active/rejected counts, and explicit overload admission
    control. New knobs are `TEXT_MAX_INFLIGHT`, `IMAGE_MAX_INFLIGHT`, and
    `EMBEDDING_OVERLOAD_STATUS_CODE`. Startup output now shows those limits
    and log paths in
    [scripts/start_embedding_service.sh](/data/saas-search/scripts/start_embedding_service.sh#L80).
    I also added focused tests in
    [tests/test_embedding_service_limits.py](/data/saas-search/tests/test_embedding_service_limits.py#L1).
    
    What this means operationally:
    - Text and image are still in one process, so this is not the final
      architecture.
    - But image spikes will now be rejected quickly once the image lane is
      full instead of sitting around and consuming the worker pool.
    - Logs will now show each request, each rejection, each microbatch
      dispatch, backend time, response time, and request ID.
    
    Verification:
    - Passed: `.venv/bin/python -m pytest -q
      tests/test_embedding_service_limits.py`
    - I also ran a wider test command, but 3 failures came from pre-existing
      drift in
    [tests/test_embedding_pipeline.py](/data/saas-search/tests/test_embedding_pipeline.py#L95),
    where the tests still monkeypatch `embeddings.text_encoder.redis.Redis`
    even though
    [embeddings/text_encoder.py](/data/saas-search/embeddings/text_encoder.py#L1)
    no longer imports `redis` that way.
    
    已把 CLIP_AS_SERVICE 的默认模型切到
    ViT-L-14,并把这套配置收口成可变更的统一入口了。现在默认值在
    embeddings/config.py (line 29) 的 CLIP_AS_SERVICE_MODEL_NAME,当前为
    CN-CLIP/ViT-L-14;scripts/start_cnclip_service.sh (line 37)
    会自动读取这个配置,不再把默认模型写死在脚本里,同时支持
    CNCLIP_MODEL_NAME 和 --model-name
    临时覆盖。scripts/start_embedding_service.sh (line 29) 和
    embeddings/server.py (line 425)
    也补了模型信息输出,方便排查实际连接的配置。
    
    文档也一起更新了,重点在 docs/CNCLIP_SERVICE说明文档.md (line 62) 和
    embeddings/README.md (line
    58):现在说明的是“以配置为准、可覆盖”的机制,而不是写死某个模型名;相关总结文档和内部说明也同步改成了配置驱动表述。
    tangwang
     
  • 中采用了最优T4配置:ct2_inter_threads=2、ct2_max_queued_batches=16、ct2_batch_type=examples。该设置使NLLB获得了显著更优的在线式性能,同时大致保持了大批次吞吐量不变。我没有将相同配置应用于两个Marian模型,因为聚焦式报告显示了复杂的权衡:opus-mt-zh-en
    在保守默认配置下更为均衡,而 opus-mt-en-zh 虽然获得了吞吐量提升,但在
    c=8 时尾延迟波动较大。
    我还将部署/配置经验记录在 /data/saas-search/translation/README.md
    中,并在 /data/saas-search/docs/TODO.txt
    中标记了优化结果。关键实践要点现已记录如下:使用CT2 +
    float16,保持单worker,将NLLB的 inter_threads 设为2、max_queued_batches
    设为16,在T4上避免使用
    inter_threads=4(因为这会损害高批次吞吐量),除非区分在线/离线配置,否则保持Marian模型的默认配置保守。
    tangwang
     

18 Mar, 2026

4 commits

  • Implemented CTranslate2 for the three local translation models and
    switched the existing local_nllb / local_marian factories over to it.
    The new runtime lives in local_ctranslate2.py, including HF->CT2
    auto-conversion, float16 compute type mapping, Marian direction
    handling, and NLLB target-prefix decoding. The service wiring is in
    service.py (line 113), and the three model configs now point at explicit
    ctranslate2-float16 dirs in config.yaml (line 133).
    
    I also updated the setup path so this is usable end-to-end:
    ctranslate2>=4.7.0 was added to requirements_translator_service.txt and
    requirements.txt, the download script now supports pre-conversion in
    download_translation_models.py (line 27), and the docs/config examples
    were refreshed in translation/README.md. I installed ctranslate2 into
    .venv-translator, pre-converted all three models, and the CT2 artifacts
    are now already on disk:
    
    models/translation/facebook/nllb-200-distilled-600M/ctranslate2-float16
    models/translation/Helsinki-NLP/opus-mt-zh-en/ctranslate2-float16
    models/translation/Helsinki-NLP/opus-mt-en-zh/ctranslate2-float16
    Verification was solid. python3 -m compileall passed, direct
    TranslationService smoke tests ran successfully in .venv-translator, and
    the focused NLLB benchmark on the local GPU showed a clear win:
    
    batch_size=16: HF 0.347s/batch, 46.1 items/s vs CT2 0.130s/batch, 123.0
    items/s
    batch_size=1: HF 0.396s/request vs CT2 0.126s/request
    One caveat: translation quality on some very short phrases, especially
    opus-mt-en-zh, still looks a bit rough in smoke tests, so I’d run your
    real quality set before fully cutting over. If you want, I can take the
    next step and update the benchmark script/report so you have a fresh
    full CT2 performance report for all three models.
    tangwang
     
  • tangwang
     
  • tangwang
     
  • tangwang
     

17 Mar, 2026

4 commits

  • tangwang
     
  • tangwang
     
  • 多个独立翻译能力”重构。现在业务侧不再把翻译当 provider
    选型,QueryParser 和 indexer 统一通过 6006 的 translator service client
    调用;真正的能力选择、启用开关、model + scene 路由,都收口到服务端和新的
    translation/ 目录里了。
    
    这次的核心改动在
    config/services_config.py、providers/translation.py、api/translator_app.py、config/config.yaml
    和新的 translation/service.py。配置从旧的
    services.translation.provider/providers 改成了 service_url +
    default_model + default_scene + capabilities,每个能力可独立
    enabled;服务端新增了统一的 backend 管理与懒加载,真实实现集中到
    translation/backends/qwen_mt.py、translation/backends/llm.py、translation/backends/deepl.py,旧的
    query/qwen_mt_translate.py、query/llm_translate.py、query/deepl_provider.py
    只保留兼容导出。接口上,/translate 现在标准支持 scene,context
    作为兼容别名继续可用,健康检查会返回默认模型、默认场景和已启用能力。
    tangwang
     
  • - Rename indexer/product_annotator.py to indexer/product_enrich.py and remove CSV-based CLI entrypoint, keeping only in-memory analyze_products API
    - Introduce dedicated product_enrich logging with separate verbose log file for full LLM requests/responses
    - Change indexer and /indexer/enrich-content API wiring to use indexer.product_enrich instead of indexer.product_annotator, updating tests and docs accordingly
    - Switch translate_prompts to share SUPPORTED_INDEX_LANGUAGES from tenant_config_loader and reuse that mapping for language code → display name
    - Remove hard SUPPORTED_LANGS constraint from LLM content-enrichment flow, driving languages directly from tenant/indexer configuration
    - Redesign LLM prompt generation to support multi-round, multi-language tables: first round in English, subsequent rounds translate the entire table (headers + cells) into target languages using English instructions
    tangwang
     

13 Mar, 2026

4 commits


12 Mar, 2026

5 commits


11 Mar, 2026

2 commits

  • tangwang
     
  • ./scripts/start_tei_service.sh
    START_TEI=0 ./scripts/service_ctl.sh restart embedding
    
    curl -sS -X POST "http://127.0.0.1:6005/embed/text" \
      -H "Content-Type: application/json" \
      -d '["芭比娃娃 儿童玩具", "纯棉T恤 短袖"]'
    tangwang
     

10 Mar, 2026

4 commits

  • tangwang
     
  • - 配置改为“字段基名 + 动态语言后缀”方案,已不再依赖旧 `indexes`。
    [config.yaml](/data/saas-search/config/config.yaml#L17)
    - `search_fields` / `text_query_strategy` 已进入强校验与解析流程。
    [config_loader.py](/data/saas-search/config/config_loader.py#L254)
    
    2. 查询语言计划与翻译等待策略
    - `QueryParser` 现在产出
      `query_text_by_lang`、`search_langs`、`source_in_index_languages`。
    [query_parser.py](/data/saas-search/query/query_parser.py#L41)
    - 你要求的两种翻译路径都在:
      - 源语言不在店铺 `index_languages`:`translate_multi_async` + 等待
        future
      - 源语言在 `index_languages`:`translate_multi(...,
        async_mode=True)`,尽量走缓存
    [query_parser.py](/data/saas-search/query/query_parser.py#L284)
    
    3. ES 查询统一文本策略(无 AST 分支)
    - 主召回按 `search_langs` 动态拼 `field.{lang}`,翻译语种做次权重
      `should`。
    [es_query_builder.py](/data/saas-search/search/es_query_builder.py#L454)
    - 布尔 AST 路径已删除,仅保留统一文本策略。
    [es_query_builder.py](/data/saas-search/search/es_query_builder.py#L185)
    
    4. LanguageDetector 优化
    - 从“拉丁字母默认英文”升级为:脚本优先 +
      拉丁语系打分(词典/变音/后缀)。
    [language_detector.py](/data/saas-search/query/language_detector.py#L68)
    
    5. 布尔能力清理(补充)
    - 已删除废弃模块:
    [boolean_parser.py](/data/saas-search/search/boolean_parser.py)
    - `search/__init__` 已无相关导出。
    [search/__init__.py](/data/saas-search/search/__init__.py)
    
    6. `indexes` 过时收口(补充)
    - 兼容函数改为基于动态字段生成,不再依赖 `config.indexes`。
    [utils.py](/data/saas-search/config/utils.py#L24)
    - Admin 配置接口改为返回动态字段配置,不再暴露 `num_indexes`。
    [admin.py](/data/saas-search/api/routes/admin.py#L52)
    
    7. suggest
    tangwang
     
  • tangwang
     
  • tangwang
     

09 Mar, 2026

3 commits


08 Mar, 2026

1 commit


07 Mar, 2026

1 commit


05 Feb, 2026

2 commits

  • tangwang
     
  • - API:新增请求参数 ai_search,开启时在窗口内走重排流程
    - 配置:RerankConfig 移除 enabled/expression/description,仅保留 rerank_window 及
      service_url/timeout_sec/weight_es/weight_ai;默认超时 15s
    - 重排流程:ai_search 且 from+size<=rerank_window 时,ES 取前 rerank_window 条,
      调用外部 /rerank 服务,融合 ES 与重排分数后按 from/size 分页;否则不重排
    - search/rerank_client:新增模块,封装 build_docs、call_rerank_service、
      fuse_scores_and_resort、run_rerank;超时单独捕获并简短日志
    - search/searcher:移除 RerankEngine,enable_rerank=ai_search,使用 config.rerank 参数
    - 删除 search/rerank_engine.py(本地表达式重排),统一为外部服务一种实现
    - 文档:搜索 API 对接指南补充 ai_search 与 relevance_score 说明
    - 测试:conftest 中 rerank 配置改为新结构
    
    Co-authored-by: Cursor <cursoragent@cursor.com>
    tangwang
     

27 Jan, 2026

1 commit

  • - config: 新增 SUPPORTED_INDEX_LANGUAGES(38 种语言)、DEFAULT_INDEX_LANGUAGES、
      normalize_index_languages、resolve_index_languages;get_tenant_config 统一注入 index_languages
    - config.yaml: 租户配置改用 index_languages,默认 [en,zh],保留 translate_to_* 兼容解析
    - query/translator: translate_for_indexing 改为接收 index_languages,返回多语言 Dict
    - query/query_parser: 翻译目标从 index_languages 解析,need_wait_translation 按 index_langs 判断
    - search/searcher: enable_translation 改为基于 index_languages 是否非空
    - indexer: document_transformer 按 index_languages 填多语言字段;indexing_utils 仅多语言时初始化翻译器
    - tests: 租户配置与索引测试改为断言 index_languages
    - README: 更新 TODO 说明已支持 index_languages
    tangwang
     

06 Jan, 2026

2 commits

  • tangwang
     
  • mappings/search_products.json:把原来的 title_zh/title_en/brief_zh/... 改成 按语言 key 的对象结构( /products/_doc/1 { "title": {"en":...} } )
    同时在这些字段下 预置了全部 analyzer 语言:
    arabic, armenian, basque, brazilian, bulgarian, catalan, chinese, cjk, czech, danish, dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, italian, norwegian, persian, portuguese, romanian, russian, spanish, swedish, turkish, thai
    
    实现为 type: object + properties,同时满足“按语言灌入”和“按语言 analyzer”。
    索引灌入(全量/增量/transformer)已同步改完
    indexer/document_transformer.py:输出从 title_zh/title_en/... 改为:
    title: {<primary_lang>: 原文, en?: 翻译, zh?: 翻译}
    brief/description/vendor 同理
    category_path/category_name_text 也改为语言对象(避免查询侧继续依赖旧字段)
    indexer/incremental_service.py:embedding 取值从 title_en/title_zh 改为从 title 对象里优先取 en,否则取 zh,否则取任一可用语言。
    查询侧与配置、API/文档已同步
    search/es_query_builder.py:查询字段统一改成点路径:title.zh / title.en / vendor.zh / vendor.zh.keyword / category_name_text.zh 等。
    config/config.yaml:field boosts / indexes 里的字段名同步为新点路径。
    API & formatter:
    api/result_formatter.py 已支持新结构(并保留对旧 *_zh/_en 的兼容兜底)。
    api/models.py、相关 docs/examples 里的 vendor_zh.keyword 等已更新为 vendor.zh.keyword。
    文档/脚本:docs/、README.md、scripts/ 里所有旧字段名引用已批量替换为新结构。
    tangwang
     

20 Dec, 2025

1 commit


18 Dec, 2025

3 commits

  • 新增:scripts/recreate_index.py
    功能:初始化 indexer 的 ES/DB 服务,然后调用 BulkIndexingService.bulk_index(…, recreate_index=True) 为指定 tenant_id 做「删除并重建索引 + 全量导入」。
    用法示例:
    cd /home/tw/SearchEngine# 使用默认 batch_size=500python scripts/recreate_index.py 162# 指定 batch_sizepython scripts/recreate_index.py 162 --batch-size 1000
    脚本依赖和 Indexer API 一样的环境变量:DB_HOST/DB_PORT/DB_DATABASE/DB_USERNAME/DB_PASSWORD、ES_HOST/ES_USERNAME/ES_PASSWORD。
    2. 清理与引用更新
    原来的 scripts/recreate_index.sh 已经删除。
    api/routes/indexer.py 里的说明改成引用 scripts/recreate_index.py。
    docs/搜索API对接指南.md 中的提示也从 .sh 改为:
    > python scripts/recreate_index.py <tenant_id> [--batch-size 500]
    tangwang
     
  • config/config_loader.py: 从 QueryConfig 类中删除 enable_translation 字段
    config/config.yaml: 删除 enable_translation: true 配置项
    config/config_loader.py: 从 to_dict() 方法中删除相关输出
    2. 索引阶段(离线)- 使用租户配置
    indexer/indexing_utils.py:
    根据 tenant_config.translate_to_en 和 translate_to_zh 决定是否初始化 translator
    只有任一方向开启时才创建 translator
    indexer/document_transformer.py:
    _fill_text_fields 从 tenant_config 读取 translate_to_en 和 translate_to_zh
    调用 translate_for_indexing 时传递这两个参数
    更新了文档注释
    3. 查询阶段(在线)- 使用租户配置
    query/query_parser.py:
    parse() 方法新增 tenant_id 参数
    根据租户配置决定翻译目标语言(translate_to_zh / translate_to_en)
    如果两个都是 false,跳过翻译阶段
    translator 属性不再依赖 enable_translation,总是可以初始化
    search/searcher.py:
    search() 方法中根据租户配置计算 enable_translation(用于日志和 metadata)
    调用 query_parser.parse() 时传递 tenant_id
    4. 翻译器方法更新
    query/translator.py:
    translate_for_indexing() 新增 translate_to_en 和 translate_to_zh 参数(默认 True 保持向后兼容)
    根据这两个参数决定翻译目标
    更新了文档注释
    tangwang
     
  • tangwang
     

07 Dec, 2025

1 commit

  • 主要功能:
    1. 增量数据获取服务
       - 新增 IncrementalIndexerService 提供单个SPU数据获取
       - 新增 /indexer/spu/{spu_id} API接口
       - 服务启动时预加载分类映射等公共数据
       - 提取 SPUDocumentTransformer 统一全量和增量转换逻辑
       - 支持根据租户配置进行语言处理和翻译
    
    3. 租户配置系统
       - 租户配置合并到统一配置文件 config/config.yaml
       - 支持每个租户独立配置主语言和翻译选项
       - 租户162配置为翻译关闭(用于测试)
    
    4. 翻译功能集成
       - 翻译提示词作为DeepL API的context参数传递
       - 支持中英文提示词配置
       - 索引场景:同步翻译,使用缓存
       - 查询场景:异步翻译,立即返回
    
    测试:
    - 新增 indexer/test_indexing.py 和 query/test_translation.py
    - 验证租户162翻译关闭功能
    - 验证全量和增量索引功能
    tangwang