ai-saas / saas-search

19 Mar, 2026

2 commits

af03fdef Tidy up embedding module code

tangwang
2026-03-19 14:24:35 +0800

- Text and image embedding are now split into separate
  services/processes, while still keeping a single replica as requested.
The split lives in
[embeddings/server.py](/data/saas-search/embeddings/server.py#L112),
[config/services_config.py](/data/saas-search/config/services_config.py#L68),
[providers/embedding.py](/data/saas-search/providers/embedding.py#L27),
and the start scripts
[scripts/start_embedding_service.sh](/data/saas-search/scripts/start_embedding_service.sh#L36),
[scripts/start_embedding_text_service.sh](/data/saas-search/scripts/start_embedding_text_service.sh),
[scripts/start_embedding_image_service.sh](/data/saas-search/scripts/start_embedding_image_service.sh).
- Independent admission control is in place now: text and image have
  separate inflight limits, and image can be kept much stricter than
text. The request handling, reject path, `/health`, and `/ready` are in
[embeddings/server.py](/data/saas-search/embeddings/server.py#L613),
[embeddings/server.py](/data/saas-search/embeddings/server.py#L786), and
[embeddings/server.py](/data/saas-search/embeddings/server.py#L1028).
- I checked the Redis embedding cache. It did exist, but there was a
  real flaw: cache keys did not distinguish `normalize=true` from
`normalize=false`. I fixed that in
[embeddings/cache_keys.py](/data/saas-search/embeddings/cache_keys.py#L6),
and both text and image now use the same normalize-aware keying. I also
added service-side BF16 cache hits that short-circuit before the model
lane, so repeated requests no longer get throttled behind image
inference.
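
The normalize-aware keying described above could look roughly like the sketch below. It is not the contents of `embeddings/cache_keys.py`; the function name `build_cache_key`, the `emb:` prefix, and the key layout are assumptions made for illustration. The only point it demonstrates is that the `normalize` flag is part of the key, so normalized and raw vectors can never be served from the same entry.

```python
import hashlib


def build_cache_key(kind: str, model: str, payload: str, normalize: bool) -> str:
    """Build a Redis key for an embedding (hypothetical layout).

    kind      -- "text" or "image"
    model     -- model identifier, so different models never share entries
    payload   -- the text itself, or the image URL/bytes digest
    normalize -- whether the cached vector is L2-normalized
    """
    # Hash the payload so arbitrarily long inputs produce fixed-size keys.
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    # Encode the flag explicitly; omitting it was the flaw described above.
    norm_flag = "1" if normalize else "0"
    return f"emb:{kind}:{model}:norm={norm_flag}:{digest}"
```

With this shape, the same input requested with `normalize=true` and `normalize=false` maps to two distinct keys, while repeated identical requests map to the same one.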

**What This Means**
- Image pressure no longer blocks text, because they are on different
  ports/processes.
- Repeated text/image requests now return from Redis without consuming
  model capacity.
- Over-capacity requests are rejected quickly instead of sitting
  blocked.
- I did not add a load balancer or multi-replica HA, per your GPU
  constraint. I also did not build Grafana/Prometheus dashboards in this
pass, but `/health` now exposes the metrics needed to wire them.
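
As a minimal sketch of the behaviour listed above (cache hits bypass the model lane, over-capacity requests fail fast), assuming a non-blocking semaphore per service: `InflightLimiter`, `handle`, the dict standing in for Redis, and the 429 status code are all illustrative assumptions, not the code in `embeddings/server.py`.

```python
import threading


class InflightLimiter:
    """Caps concurrent model calls; never blocks the caller."""

    def __init__(self, max_inflight: int) -> None:
        self._sem = threading.BoundedSemaphore(max_inflight)

    def try_acquire(self) -> bool:
        # blocking=False returns immediately instead of queueing the request.
        return self._sem.acquire(blocking=False)

    def release(self) -> None:
        self._sem.release()


def handle(limiter, cache, key, run_model):
    """Return (status, vector). `cache` is a dict standing in for Redis."""
    cached = cache.get(key)
    if cached is not None:
        # Cache hit: answered before the limiter, so it uses no model capacity.
        return 200, cached
    if not limiter.try_acquire():
        # Over capacity: reject right away (429 is an assumed status code).
        return 429, None
    try:
        vector = run_model()
        cache[key] = vector
        return 200, vector
    finally:
        limiter.release()
```

Running one limiter per process, for example a small `max_inflight` for image and a larger one for text, would give the independent limits described in the commit.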

**Validation**
- Tests passed: `.venv/bin/python -m pytest -q
  tests/test_embedding_pipeline.py
tests/test_embedding_service_limits.py` -> `10 passed`
- Stress test tool updates are in
  [scripts/perf_api_benchmark.py](/data/saas-search/scripts/perf_api_benchmark.py#L155)
- Fresh benchmark on split text service `6105`: 535 requests / 3s, 100%
  success, `174.56 rps`, avg `88.48 ms`
- Fresh benchmark on split image service `6108`: 1213 requests / 3s,
  100% success, `403.32 rps`, avg `9.64 ms`
- Live health after the run showed cache hits and non-zero cache-hit
  latency accounting:
  - text `avg_latency_ms=4.251`
  - image `avg_latency_ms=1.462`

2026-03-19 13:21:01 +0800

13 Mar, 2026

2 commits

d4cadc13 Translation refactor

tangwang
2026-03-13 20:28:08 +0800
77ab67ad Update test cases

tangwang
2026-03-13 12:39:40 +0800

09 Mar, 2026

2 commits

07cf5a93 START_EMBEDDING=1 START_TRANSLATOR=1 START_RERANKER=1 START_TEI=1 ...
```
CNCLIP_DEVICE=cuda TEI_USE_GPU=1 ./scripts/service_ctl.sh start
Search backend + indexer + test frontend + 4 microservices all run end to end
```
tangwang
2026-03-09 23:29:07 +0800
ed948666 tidy

tangwang
2026-03-09 17:04:00 +0800

08 Mar, 2026

1 commit

701ae503 docs

tangwang
2026-03-08 14:30:07 +0800

07 Mar, 2026

1 commit

42e3aea6 tidy

tangwang
2026-03-07 19:44:25 +0800

02 Dec, 2025

2 commits

33839b37 Option values participate in search: ...

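1. Added a config option, searchable_option_dimensions, which selects which of the child SKUs' option1_value, option2_value, and option3_value take part in retrieval (they are indexed, and at query time the corresponding fields are added to the search fields). The format is a list containing one or more of the three.

2. The index @mappings/search_products.json needs three new fields, option1_values, option2_values, and option3_values, and the data ingestion module (MySQL -> ES) must be updated accordingly. Each field is the list obtained by extracting and deduplicating the child SKUs' option1_value, option2_value, and option3_value respectively.
Only dimensions listed in searchable_option_dimensions are indexed. For example, with searchable_option_dimensions = ['option1'], only the option1 values are extracted, deduplicated, and indexed as a list; the other two fields stay empty.

3. At query time, the index fields corresponding to searchable_option_dimensions are added to the multi_match fields with a weight of 0.5 (all field weights are managed together in one place).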
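
To illustrate item 3, here is a hedged sketch of how the configured dimensions might be appended to the `multi_match` field list using Elasticsearch's `field^boost` syntax. The helper name `build_multi_match`, the constant `OPTION_FIELD_BOOST`, and the base field `title^2` in the usage note are assumptions for illustration; the project's real query builder is not shown in this log.

```python
from typing import Dict, List

# Weight for option-value fields, taken from the 0.5 stated in the commit.
OPTION_FIELD_BOOST = 0.5


def build_multi_match(
    query: str,
    base_fields: List[str],
    searchable_option_dimensions: List[str],
) -> Dict:
    """Return a multi_match clause that includes the configured option fields."""
    # "option1" -> "option1_values^0.5", and so on for each configured dimension.
    option_fields = [
        f"{dim}_values^{OPTION_FIELD_BOOST}" for dim in searchable_option_dimensions
    ]
    return {"multi_match": {"query": query, "fields": base_fields + option_fields}}
```

For example, `build_multi_match("red", ["title^2"], ["option1"])` searches `title^2` and `option1_values^0.5` only; `option2_values` and `option3_values` are left out because they are not configured.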

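1. Config file changes (config/config.yaml)
✅ Added the searchable_option_dimensions option under spu_config, with default value ['option1', 'option2', 'option3']
✅ Added 3 new field definitions: option1_values, option2_values, option3_values, of type KEYWORD with weight 0.5
✅ Added these 3 fields to the fields list of the default index domain so that they take part in search
2. ES index mapping changes (mappings/search_products.json)
✅ Added 3 new fields: option1_values, option2_values, option3_values, of type keyword
3. Config loader changes (config/config_loader.py)
✅ Added the searchable_option_dimensions field to the SPUConfig class
✅ Updated the config parsing logic to read searchable_option_dimensions
✅ Updated the config-to-dict conversion logic
4. Data ingestion changes (indexer/spu_transformer.py)
✅ Loads the config at initialization to obtain searchable_option_dimensions
✅ Added logic to the _transform_spu_to_doc method:
Extracts the option1, option2, and option3 values from all child SKUs
Deduplicates them and stores them in option1_values, option2_values, option3_values
Uses the config to decide which fields are actually populated (unconfigured fields are written as empty arrays)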
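
The extraction and deduplication step in item 4 can be sketched as follows. This is an illustration under assumptions, not the body of `_transform_spu_to_doc`: the helper name `collect_option_values` and the representation of child SKUs as plain dicts are hypothetical.

```python
from typing import Dict, Iterable, List

ALL_DIMENSIONS = ("option1", "option2", "option3")


def collect_option_values(
    skus: Iterable[dict],
    searchable_option_dimensions: List[str],
) -> Dict[str, List[str]]:
    """Build option{N}_values lists for one SPU from its child SKUs."""
    skus = list(skus)  # allow generators; we iterate once per dimension
    doc_fields: Dict[str, List[str]] = {}
    for dim in ALL_DIMENSIONS:
        values: List[str] = []
        if dim in searchable_option_dimensions:
            seen = set()
            for sku in skus:
                value = sku.get(f"{dim}_value")
                # Skip missing/empty values and keep first-seen order.
                if value and value not in seen:
                    seen.add(value)
                    values.append(value)
        # Unconfigured dimensions are still written, as empty arrays.
        doc_fields[f"{dim}_values"] = values
    return doc_fields
```

Keeping first-seen order rather than using a bare `set` makes the indexed document deterministic, which is convenient when comparing reindex runs.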


2025-12-02 18:35:50 +0800

9f96d6f3 Skip semantic search for short queries ...
```
Optimize query config / ranking config
```
tangwang
2025-12-02 13:38:31 +0800

13 Nov, 2025

1 commit

9cb7528e Search over Shoplazza (店匠) data: mock data -> MySQL, MySQL -> ES

tangwang
2025-11-13 15:13:26 +0800

12 Nov, 2025

1 commit

a00c3672 feat: Make Function Score configurable - built on native ES capabilities ...

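Core changes:
1. Configurable scoring rules
   - Added the FunctionScoreConfig and RerankConfig config classes
   - Supports three native ES functions: filter_weight, field_value_factor, decay
   - Removed the hard-coded scoring logic from the code

2. Config model definitions
   - FunctionScoreConfig: score_mode, boost_mode, functions
   - RerankConfig: enabled, expression (currently disabled)
   - Added to CustomerConfig

3. Query builder changes
   - MultiLanguageQueryBuilder.init now takes a reference to function_score_config
   - _build_score_functions builds the ES functions dynamically from config
   - Supports the configured score_mode and boost_mode

4. Config file examples
   - Added a complete function_score config example
   - Includes detailed comments for the 3 function types
   - Provides config templates for common scenarios

5. Native ES capabilities supported
   - Filter+Weight: boost on condition match
   - Field Value Factor: score derived from a field value
     * supported modifiers: none, log, log1p, log2p, ln, ln1p, ln2p, square, sqrt, reciprocal
   - Decay Functions: decay functions
     * supported: gauss, exp, linear

Config examples:
- Boost for products new within 7 days (weight: 1.3)
- Boost for products new within 30 days (weight: 1.15)
- Boost for products with video (weight: 1.05)
- Sales factor (field_value_factor + log1p)
- Time decay (gauss decay)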
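
The mapping from config entries to an ES `function_score` query can be sketched as below. This is an assumption-laden illustration rather than the project's `_build_score_functions`: the config shape (`type`, `function`, and so on), the field names `create_time` and `sales`, and the `sum`/`multiply` modes are all hypothetical. Only the emitted query structure follows the Elasticsearch function_score DSL.

```python
from typing import Dict, List

# Hypothetical config, mirroring three of the examples listed above.
EXAMPLE_CONFIG = {
    "score_mode": "sum",
    "boost_mode": "multiply",
    "functions": [
        {"type": "filter_weight", "weight": 1.3,
         "filter": {"range": {"create_time": {"gte": "now-7d"}}}},
        {"type": "field_value_factor", "field": "sales", "modifier": "log1p"},
        {"type": "decay", "function": "gauss", "field": "create_time",
         "origin": "now", "scale": "30d", "decay": 0.5},
    ],
}


def build_function_score(base_query: Dict, config: Dict) -> Dict:
    """Wrap base_query in a function_score query built from config."""
    functions: List[Dict] = []
    for fn in config["functions"]:
        if fn["type"] == "filter_weight":
            # Documents matching the filter get the given weight.
            functions.append({"filter": fn["filter"], "weight": fn["weight"]})
        elif fn["type"] == "field_value_factor":
            functions.append({"field_value_factor": {
                "field": fn["field"],
                "modifier": fn.get("modifier", "none"),
                "factor": fn.get("factor", 1.0),
                "missing": fn.get("missing", 1),
            }})
        elif fn["type"] == "decay":
            # fn["function"] is one of "gauss", "exp", "linear".
            functions.append({fn["function"]: {fn["field"]: {
                "origin": fn["origin"],
                "scale": fn["scale"],
                "decay": fn.get("decay", 0.5),
            }}})
        else:
            raise ValueError(f"unsupported function type: {fn['type']}")
    return {"function_score": {
        "query": base_query,
        "functions": functions,
        "score_mode": config.get("score_mode", "multiply"),
        "boost_mode": config.get("boost_mode", "multiply"),
    }}
```

Calling `build_function_score({"match_all": {}}, EXAMPLE_CONFIG)` yields a query body with three entries under `functions`, one per configured rule.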

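Advantages:
✓ Configurable - customers can tune it themselves without code changes
✓ Built on native ES - best performance, full functionality
✓ Flexible and easy to use - YAML format, with examples and comments
✓ Single convention - function_score is required, which simplifies the design

Reference: https://www.elastic.co/docs/reference/query-languages/query-dsl/query-dsl-function-score-query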

2025-11-12 13:39:14 +0800

08 Nov, 2025

1 commit

be52af70 first commit

tangwang
2025-11-08 00:07:09 +0800