09 Apr, 2026
1 commit
-
Background:
- The scripts/ directory mixed service startup, data conversion, performance/load-testing, throwaway scripts, and historical backup directories
- It carried a lot of leftover material from intermediate iterations, which hurt maintenance and onboarding
- Service orchestration has stabilized as the `service_ctl up all` set: tei / cnclip / embedding / embedding-image / translator / reranker / backend / indexer / frontend / eval-web; the default reranker-fine slot is no longer kept

Changes:
1. The root scripts/ directory now holds only runtime, operations, environment, and data-processing scripts, with a new scripts/README.md
2. Performance/load-testing/tuning scripts moved wholesale to benchmarks/, with benchmarks/README.md updated in sync
3. Manual trial-run scripts moved to tests/manual/, with tests/manual/README.md updated in sync
4. Clearly obsolete content deleted:
   - scripts/indexer__old_2025_11/
   - scripts/start.sh
   - scripts/install_server_deps.sh
5. Paths and stale descriptions fixed in the following docs:
   - Root README.md
   - Performance-report docs
   - reranker/translation module docs

Technical details:
- Performance tests do not go under the regular tests/: they depend on live services, GPUs, models, and environment noise, so they are unsuitable as a stable regression gate; benchmarks/ matches their purpose better
- tests/manual/ holds only interface trial-run scripts that need manually started dependencies and human inspection of results
- All migrated Python scripts passed py_compile syntax checks
- All migrated shell scripts passed bash -n syntax checks (a sketch of running both sweeps follows below)

Verification:
- py_compile: passed
- bash -n: passed
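For reference, a minimal sketch of how such a syntax sweep might be scripted, assuming only the stdlib `py_compile` module and `bash -n`; the helper name and the exact roots are illustrative, not part of the commit:

```python
# Hypothetical post-migration syntax sweep; the directory names come from
# the commit, everything else is illustrative.
import py_compile
import subprocess
from pathlib import Path

def check_tree(root: str) -> list[str]:
    """Return the files under `root` that fail a syntax check."""
    failures = []
    for path in Path(root).rglob("*"):
        if path.suffix == ".py":
            try:
                # Parse-only check, equivalent to `python -m py_compile`
                py_compile.compile(str(path), doraise=True)
            except py_compile.PyCompileError as exc:
                failures.append(f"{path}: {exc.msg}")
        elif path.suffix == ".sh":
            # bash -n parses the script without executing it
            result = subprocess.run(["bash", "-n", str(path)],
                                    capture_output=True, text=True)
            if result.returncode != 0:
                failures.append(f"{path}: {result.stderr.strip()}")
    return failures

if __name__ == "__main__":
    for root in ("scripts", "benchmarks", "tests/manual"):
        for line in check_tree(root):
            print(line)
```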
02 Apr, 2026
1 commit
31 Mar, 2026
1 commit
19 Mar, 2026
2 commits
-
The instability is very likely real overload, but `lsof -i :6005 | wc -l = 75` alone does not prove it. What does matter is the live shape of the service: it is a single `uvicorn` worker on port `6005`, and the code had one shared process handling both text and image requests, with image work serialized behind a single lock. Under bursty image traffic, requests could pile up and sit blocked with almost no useful tracing, which matches the "only blocking observed" symptom.

The service now adds persistent log files, request IDs, per-request request/response/failure logs, text microbatch dispatch logs, health stats with active/rejected counts, and explicit overload admission control. New knobs are `TEXT_MAX_INFLIGHT`, `IMAGE_MAX_INFLIGHT`, and `EMBEDDING_OVERLOAD_STATUS_CODE`. Startup output now shows those limits and log paths in [scripts/start_embedding_service.sh](/data/saas-search/scripts/start_embedding_service.sh#L80). I also added focused tests in [tests/test_embedding_service_limits.py](/data/saas-search/tests/test_embedding_service_limits.py#L1).

What this means operationally:
- Text and image are still in one process, so this is not the final architecture.
- But image spikes will now be rejected quickly once the image lane is full, instead of sitting around and consuming the worker pool (a sketch of this admission-control pattern follows below).
- Logs will now show each request, each rejection, each microbatch dispatch, backend time, response time, and request ID.

Verification:
- Passed: `.venv/bin/python -m pytest -q tests/test_embedding_service_limits.py`
- I also ran a wider test command, but 3 failures came from pre-existing drift in [tests/test_embedding_pipeline.py](/data/saas-search/tests/test_embedding_pipeline.py#L95), where the tests still monkeypatch `embeddings.text_encoder.redis.Redis` even though [embeddings/text_encoder.py](/data/saas-search/embeddings/text_encoder.py#L1) no longer imports `redis` that way.

The CLIP_AS_SERVICE default model has been switched to ViT-L-14, and this configuration is now funneled through a single changeable entry point. The default lives in `CLIP_AS_SERVICE_MODEL_NAME` in embeddings/config.py (line 29), currently `CN-CLIP/ViT-L-14`; scripts/start_cnclip_service.sh (line 37) reads this config automatically instead of hard-coding a default model in the script, and supports temporary overrides via `CNCLIP_MODEL_NAME` and `--model-name`. scripts/start_embedding_service.sh (line 29) and embeddings/server.py (line 425) also gained model-info output, making it easier to see which configuration is actually connected.

The docs were updated alongside, mainly docs/CNCLIP_SERVICE说明文档.md (line 62) and embeddings/README.md (line 58): they now describe a "config is the source of truth, overridable" mechanism rather than hard-coding a model name; the related summary docs and internal notes were likewise reworded to be configuration-driven.
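What that admission control can look like, as a minimal sketch assuming an asyncio/FastAPI-style single-worker service; the three environment variable names are the commit's, everything else (the lane class, endpoints, encoder stub) is illustrative:

```python
import asyncio
import os

from fastapi import FastAPI, HTTPException

# Knob names from the commit; defaults here are illustrative.
TEXT_MAX_INFLIGHT = int(os.getenv("TEXT_MAX_INFLIGHT", "32"))
IMAGE_MAX_INFLIGHT = int(os.getenv("IMAGE_MAX_INFLIGHT", "4"))
OVERLOAD_STATUS = int(os.getenv("EMBEDDING_OVERLOAD_STATUS_CODE", "429"))

app = FastAPI()

class Lane:
    """Tracks in-flight requests for one lane; rejects instead of queueing."""
    def __init__(self, limit: int) -> None:
        self.limit, self.active, self.rejected = limit, 0, 0

    def try_enter(self) -> bool:
        # Safe without a lock: there is no await between the check and the
        # increment, so the event loop cannot interleave another request here.
        if self.active >= self.limit:
            self.rejected += 1
            return False
        self.active += 1
        return True

text_lane = Lane(TEXT_MAX_INFLIGHT)
image_lane = Lane(IMAGE_MAX_INFLIGHT)

async def encode_image(payload: dict) -> list[float]:
    await asyncio.sleep(0.1)  # stands in for the real (lock-serialized) model call
    return [0.0]

@app.post("/embed/image")
async def embed_image(payload: dict):
    if not image_lane.try_enter():
        # Fail fast: a full image lane no longer ties up the worker pool.
        raise HTTPException(status_code=OVERLOAD_STATUS, detail="image lane full")
    try:
        return {"vector": await encode_image(payload)}
    finally:
        image_lane.active -= 1

@app.get("/health")
async def health():
    # Exposes active/rejected counts per lane, as the health stats do.
    return {"text": vars(text_lane), "image": vars(image_lane)}
```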
-
Batching is now "segment first, then batch for inference", no longer chunking by the original input count first. In other words, if 100 requests expand to 150 segments after sentence splitting, then with batch_size=64 inference runs as three batches of 64 + 64 + 22, and after inference the results are merged back per the original segmentation plan and returned as 100 items. This change lives in local_seq2seq.py (line 241) and local_ctranslate2.py (line 391); a sketch of the plan/merge shape follows below.

Logging also gained the two layers of key information you asked for:
- Segmentation summary log: `Translation segmentation summary` prints the input count, non-empty count, number of inputs that were split, total segment count, current batch_size, and per-input segment-count statistics; see local_seq2seq.py (line 216) and local_ctranslate2.py (line 366).
- Per-batch inference log: `Translation inference batch` prints the batch index, total batch count, segment count in the batch, length statistics, and a preview of the first item. CTranslate2 additionally prints `Translation model batch detail` with token lengths and max_decoding_length; see local_ctranslate2.py (line 294).

I also added tests covering "batching after segmentation" and "logs contain the segmentation summary and per-batch inference logs", in test_translation_local_backends.py (line 358).
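A minimal sketch of the "segment first, then batch" flow under assumed names; the real implementation is in local_seq2seq.py / local_ctranslate2.py:

```python
from typing import Callable

def translate_batched(
    texts: list[str],
    split: Callable[[str], list[str]],        # sentence splitter (assumed)
    infer: Callable[[list[str]], list[str]],  # model call on one batch (assumed)
    batch_size: int = 64,
) -> list[str]:
    # 1. Segmentation plan: remember how many segments each input produced.
    plan = [split(t) for t in texts]               # e.g. 100 inputs
    segments = [s for segs in plan for s in segs]  # e.g. 150 segments

    # 2. Batch over *segments*, not original inputs:
    #    150 segments at batch_size=64 -> batches of 64 + 64 + 22.
    outputs: list[str] = []
    for start in range(0, len(segments), batch_size):
        outputs.extend(infer(segments[start:start + batch_size]))

    # 3. Merge back per the plan so the caller gets one result per input.
    results, cursor = [], 0
    for segs in plan:
        results.append(" ".join(outputs[cursor:cursor + len(segs)]))
        cursor += len(segs)
    return results  # same length as `texts`
```

With 100 inputs expanding to 150 segments and batch_size=64, `infer` is called three times (64, 64, 22) and the return value still has length 100.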
13 Mar, 2026
1 commit
10 Mar, 2026
1 commit
08 Mar, 2026
1 commit
06 Mar, 2026
1 commit
06 Jan, 2026
1 commit
-
mappings/search_products.json: changed the former title_zh/title_en/brief_zh/... fields to per-language-key object structures (e.g. /products/_doc/1 { "title": {"en": ...} }), and pre-seeded these fields with the full set of analyzer languages: arabic, armenian, basque, brazilian, bulgarian, catalan, chinese, cjk, czech, danish, dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, italian, norwegian, persian, portuguese, romanian, russian, spanish, swedish, turkish, thai. Implemented as type: object + properties, which satisfies both "ingest per language" and "per-language analyzer"; a sketch of the shape follows below.

Index ingestion (full/incremental/transformer) updated in sync:
- indexer/document_transformer.py: output changed from title_zh/title_en/... to title: {<primary_lang>: original text, en?: translation, zh?: translation}; brief/description/vendor likewise. category_path/category_name_text also became language objects (so the query side stops depending on the old fields).
- indexer/incremental_service.py: the embedding source value changed from title_en/title_zh to reading from the title object, preferring en, then zh, then any available language.

Query side, config, API, and docs updated in sync:
- search/es_query_builder.py: query fields unified to dotted paths: title.zh / title.en / vendor.zh / vendor.zh.keyword / category_name_text.zh, etc.
- config/config.yaml: field names under field boosts / indexes updated to the new dotted paths.
- API & formatter: api/result_formatter.py supports the new structure (keeping a compatibility fallback for the old *_zh/_en fields); api/models.py and the related docs/examples now use vendor.zh.keyword instead of vendor_zh.keyword.
- Docs/scripts: all old field-name references in docs/, README.md, and scripts/ were batch-replaced with the new structure.
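A hypothetical excerpt of the resulting mapping shape and the embedding-source preference, written as Python dicts for illustration; the real mapping lives in mappings/search_products.json and only three of the pre-seeded analyzer languages are shown:

```python
# Per-language object field: type "object" + one analyzed text sub-field per
# language key, so each language can be ingested and analyzed independently.
TITLE_MAPPING = {
    "title": {
        "type": "object",
        "properties": {
            "en": {"type": "text", "analyzer": "english"},
            "zh": {"type": "text", "analyzer": "chinese"},
            "fr": {"type": "text", "analyzer": "french"},
            # ... one sub-field for each analyzer language listed above
        },
    }
}

def pick_embedding_text(title: dict[str, str]) -> str | None:
    """Preference order used by incremental_service: en, then zh, then any."""
    for lang in ("en", "zh"):
        if title.get(lang):
            return title[lang]
    return next(iter(title.values()), None)
```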
19 Dec, 2025
1 commit
-
1. Removed the IndexingPipeline class
   - File: indexer/bulk_indexer.py
   - Removed: the IndexingPipeline class (lines 201-259)
   - Removed: the no-longer-needed load_mapping import
2. Removed the old code in main.py
   - Removed: the cmd_ingest() function (entire function)
   - Removed: the ingest subcommand definition
   - Removed: the handling of the ingest command in main()
   - Removed: the no-longer-needed pandas import
   - Updated: the docstring, dropping the ingest command description
3. Removed the old data-import script
   - Removed: data/customer1/ingest_customer1.py (it depended on the deprecated DataTransformer and IndexingPipeline)
05 Dec, 2025
2 commits
-
2. queries
-
queries products
28 Nov, 2025
1 commit
-
Script: scripts/csv_to_excel_multi_variant.py

Main behavior:
- Single-variant products (type S), 30%:
  - Product attribute is S
  - option1/option2/option3 left empty
  - Carries all product information (title, description, price, stock, etc.)
- Multi-variant products (type M+P), 70%:
  - M row (product body):
    - Product attribute is M
    - Carries the product body information (title, description, SEO, category, etc.)
    - option1="color", option2="size", option3="material"
    - Price, stock, SKU, and other variant-level fields left empty
  - P rows (child variants):
    - Product attribute is P
    - Product title matches the M row
    - option1/2/3 filled with concrete values (the Cartesian product of color, size, and material)
    - Each SKU has its own price, stock, SKU code, etc.

Multi-variant generation rules (a sketch follows below):
- Color: 2-10 values picked at random from color1 through color30
- Size: 4-8 values picked at random from 1-30
- Material: extracted from the last whitespace-separated token of the product title (special characters stripped)
- Cartesian product: P rows are generated for every combination (e.g. 3 colors × 5 sizes × 1 material = 15 SKUs)
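A hypothetical sketch of that M+P expansion; the real logic lives in scripts/csv_to_excel_multi_variant.py, and every helper and field name here is illustrative:

```python
import itertools
import random
import re

def variant_rows(title: str) -> list[dict]:
    """Build one M row plus the Cartesian-product P rows for a product."""
    colors = random.sample([f"color{i}" for i in range(1, 31)],
                           k=random.randint(2, 10))
    sizes = random.sample([str(i) for i in range(1, 31)],
                          k=random.randint(4, 8))
    # Material: last whitespace-separated token of the title, special chars stripped.
    material = re.sub(r"[^\w]", "", title.split()[-1])

    rows = [{"type": "M", "title": title,
             "option1": "color", "option2": "size", "option3": "material"}]
    # One P row per (color, size, material) combination, each with its own SKU.
    for n, (c, s, m) in enumerate(itertools.product(colors, sizes, [material])):
        rows.append({"type": "P", "title": title,
                     "option1": c, "option2": s, "option3": m,
                     "sku": f"SKU-{n:04d}",
                     "price": round(random.uniform(5, 200), 2),
                     "stock": random.randint(0, 500)})
    return rows

# e.g. 3 colors x 5 sizes x 1 material -> 1 M row + 15 P rows
```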
17 Nov, 2025
1 commit
13 Nov, 2025
1 commit
12 Nov, 2025
1 commit
11 Nov, 2025
1 commit
-
## 🎯 Major Features
- Request context management system for complete request visibility
- Structured JSON logging with automatic daily rotation
- Performance monitoring with detailed stage timing breakdowns
- Query analysis result storage and intermediate result tracking
- Error and warning collection with context correlation

## 🔧 Technical Improvements
- **Context Management**: Request-level context with reqid/uid correlation (sketched below)
- **Performance Monitoring**: Automatic timing for all search pipeline stages
- **Structured Logging**: JSON format logs with request context injection
- **Query Enhancement**: Complete query analysis tracking and storage
- **Error Handling**: Enhanced error tracking with context information

## 🐛 Bug Fixes
- Fixed DeepL API endpoint (paid vs free API confusion)
- Fixed vector generation (GPU memory cleanup)
- Fixed logger parameter passing format (reqid/uid handling)
- Fixed translation and embedding functionality

## 🌟 API Improvements
- Simplified API interface (8→5 parameters, 37.5% reduction)
- Made internal functionality transparent to users
- Added performance info to API responses
- Enhanced request correlation and tracking

## 📁 New Infrastructure
- Comprehensive test suite (unit, integration, API tests)
- CI/CD pipeline with automated quality checks
- Performance monitoring and testing tools
- Documentation and example usage guides

## 🔒 Security & Reliability
- Thread-safe context management for concurrent requests
- Automatic log rotation and structured output
- Error isolation with detailed context information
- Complete request lifecycle tracking

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
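As an illustration only, a minimal sketch of what thread-safe request context with reqid/uid correlation and JSON log injection can look like, assuming Python's `contextvars`; none of these names are the release's actual API:

```python
import contextvars
import json
import logging
import uuid

# contextvars keeps the value isolated per thread and per asyncio task,
# which is what makes the context safe under concurrent requests.
_request_ctx: contextvars.ContextVar[dict] = contextvars.ContextVar(
    "request_ctx", default={}
)

def begin_request(uid: str) -> dict:
    """Open a per-request context carrying a fresh reqid and the caller uid."""
    ctx = {"reqid": uuid.uuid4().hex, "uid": uid}
    _request_ctx.set(ctx)
    return ctx

class ContextJsonFormatter(logging.Formatter):
    """Emit JSON log lines with the current request context injected."""
    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "msg": record.getMessage()}
        payload.update(_request_ctx.get())  # adds reqid/uid when present
        return json.dumps(payload, ensure_ascii=False)

# Usage: begin_request(uid="u123"); every log line then carries reqid/uid.
```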
10 Nov, 2025
2 commits
08 Nov, 2025
1 commit