ai-saas / saas-search

09 Apr, 2026

2 commits

32e9b30c scripts/ 根目录主要保留启动/编排入口，其他脚本归到了几个固定子目录： ... Browse File »

  - 数据转换放到 scripts/data_import/README.md
  - 诊断巡检放到 scripts/inspect/README.md
  - 运维辅助放到 scripts/ops/README.md
  - 前端辅助服务放到 scripts/frontend/frontend_server.py
  - 翻译模型下载放到 scripts/translation/download_translation_models.py
  - 临时图片补 embedding 脚本收敛成
    scripts/maintenance/embed_tenant_image_urls.py
  - Redis 监控脚本并入 redis/，现在是 scripts/redis/monitor_eviction.py

  同时我把真实调用链都改到了新位置：

  - scripts/start_frontend.sh
  - scripts/start_cnclip_service.sh
  - scripts/service_ctl.sh
  - scripts/setup_translator_venv.sh
  - scripts/README.md

  文档里涉及这些脚本的路径也同步修了，主要是 docs/QUICKSTART.md 和
translation/README.md。

2026-04-09 23:48:39 +0800

3abbc95a 重构(scripts): 整理scripts目录，按现架构分类并迁移性能/手动测试脚本 ... Browse File »

问题背景：
- scripts/
  目录下混有服务启动、数据转换、性能压测、临时脚本及历史备份目录
- 存在大量中间迭代遗留信息，不利于维护和新人理解
- 现行服务编排已稳定为 service_ctl up all 的集合：tei / cnclip /
  embedding / embedding-image / translator / reranker / backend /
indexer / frontend / eval-web，不再保留 reranker-fine 默认位

调整内容：
1. 根 scripts/ 收敛为运行、运维、环境、数据处理脚本，并新增
   scripts/README.md 说明文档
2. 性能/压测/调参脚本整体迁至 benchmarks/ 目录，同步更新
   benchmarks/README.md
3. 人工试跑脚本迁至 tests/manual/ 目录，同步更新 tests/manual/README.md
4. 删除明确过时内容：
   - scripts/indexer__old_2025_11/
   - scripts/start.sh
   - scripts/install_server_deps.sh
5. 同步修正以下文档中的路径及过时描述：
   - 根目录 README.md
   - 性能报告相关文档
   - reranker/translation 模块文档

技术细节：
- 性能测试不放常规 tests/
  的原因：这类脚本依赖真实服务、GPU、模型和环境噪声，不适合作为稳定回归门禁；benchmarks/
更贴合其定位
- tests/manual/ 仅存放需要人工启动依赖、手工观察结果的接口试跑脚本
- 所有迁移后的 Python 脚本已通过 py_compile 语法校验
- 所有迁移后的 Shell 脚本已通过 bash -n 语法校验

校验结果：
- py_compile: 通过
- bash -n: 通过

2026-04-09 23:36:06 +0800

02 Apr, 2026

1 commit

41345271 文档更新 Browse File »

tangwang
2026-04-02 19:46:27 +0800

31 Mar, 2026

1 commit

7b8d9e1a 评估框架的启动脚本 Browse File »

tangwang
2026-03-31 19:36:47 +0800

30 Mar, 2026

1 commit

36cf0ef9 es索引结果修改 Browse File »

tangwang
2026-03-30 16:20:24 +0800

21 Mar, 2026

1 commit

fb973d19 configs Browse File »

tangwang
2026-03-21 22:11:41 +0800

19 Mar, 2026

2 commits

5bac9649 文本 embedding 与图片 embedding 已拆分为两个独立进程 / 端口 Browse File »

tangwang
2026-03-19 13:54:05 +0800

4747e2f4 embedding performance ... Browse File »

The instability is very likely real overload, but `lsof -i :6005 | wc -l
= 75` alone does not prove it. What does matter is the live shape of the
service: it is a single `uvicorn` worker on port `6005`, and the code
had one shared process handling both text and image requests, with image
work serialized behind a single lock. Under bursty image traffic,
requests could pile up and sit blocked with almost no useful tracing,
which matches the “only blocking observed” symptom.

now adds persistent log files, request IDs, per-request
request/response/failure logs, text microbatch dispatch logs, health
stats with active/rejected counts, and explicit overload admission
control. New knobs are `TEXT_MAX_INFLIGHT`, `IMAGE_MAX_INFLIGHT`, and
`EMBEDDING_OVERLOAD_STATUS_CODE`. Startup output now shows those limits
and log paths in
[scripts/start_embedding_service.sh](/data/saas-search/scripts/start_embedding_service.sh#L80).
I also added focused tests in
[tests/test_embedding_service_limits.py](/data/saas-search/tests/test_embedding_service_limits.py#L1).

What this means operationally:
- Text and image are still in one process, so this is not the final
  architecture.
- But image spikes will now be rejected quickly once the image lane is
  full instead of sitting around and consuming the worker pool.
- Logs will now show each request, each rejection, each microbatch
  dispatch, backend time, response time, and request ID.

Verification:
- Passed: `.venv/bin/python -m pytest -q
  tests/test_embedding_service_limits.py`
- I also ran a wider test command, but 3 failures came from pre-existing
  drift in
[tests/test_embedding_pipeline.py](/data/saas-search/tests/test_embedding_pipeline.py#L95),
where the tests still monkeypatch `embeddings.text_encoder.redis.Redis`
even though
[embeddings/text_encoder.py](/data/saas-search/embeddings/text_encoder.py#L1)
no longer imports `redis` that way.

已把 CLIP_AS_SERVICE 的默认模型切到
ViT-L-14，并把这套配置收口成可变更的统一入口了。现在默认值在
embeddings/config.py (line 29) 的 CLIP_AS_SERVICE_MODEL_NAME，当前为
CN-CLIP/ViT-L-14；scripts/start_cnclip_service.sh (line 37)
会自动读取这个配置，不再把默认模型写死在脚本里，同时支持
CNCLIP_MODEL_NAME 和 --model-name
临时覆盖。scripts/start_embedding_service.sh (line 29) 和
embeddings/server.py (line 425)
也补了模型信息输出，方便排查实际连接的配置。

文档也一起更新了，重点在 docs/CNCLIP_SERVICE说明文档.md (line 62) 和
embeddings/README.md (line
58)：现在说明的是“以配置为准、可覆盖”的机制，而不是写死某个模型名；相关总结文档和内部说明也同步改成了配置驱动表述。

2026-03-19 12:27:05 +0800

18 Mar, 2026

1 commit

cd4ce66d trans logs Browse File »

tangwang
2026-03-18 20:32:37 +0800

17 Mar, 2026

2 commits

0fd2f875 translate Browse File »

tangwang
2026-03-17 19:21:34 +0800

6f7840cf refactor: rename product annotator to enrich and expand multilingual prompts ... Browse File »

- Rename indexer/product_annotator.py to indexer/product_enrich.py and remove CSV-based CLI entrypoint, keeping only in-memory analyze_products API
- Introduce dedicated product_enrich logging with separate verbose log file for full LLM requests/responses
- Change indexer and /indexer/enrich-content API wiring to use indexer.product_enrich instead of indexer.product_annotator, updating tests and docs accordingly
- Switch translate_prompts to share SUPPORTED_INDEX_LANGUAGES from tenant_config_loader and reuse that mapping for language code → display name
- Remove hard SUPPORTED_LANGS constraint from LLM content-enrichment flow, driving languages directly from tenant/indexer configuration
- Redesign LLM prompt generation to support multi-round, multi-language tables: first round in English, subsequent rounds translate the entire table (headers + cells) into target languages using English instructions

2026-03-17 11:26:03 +0800

16 Mar, 2026

1 commit

137455af doc Browse File »

tangwang
2026-03-16 22:47:34 +0800

14 Mar, 2026

1 commit

208e079a TODO Browse File »

tangwang
2026-03-14 21:37:51 +0800