10 Mar, 2026

1 commit


08 Mar, 2026

1 commit


06 Mar, 2026

1 commit


06 Jan, 2026

1 commit

  • mappings/search_products.json: replaced the old title_zh/title_en/brief_zh/... fields with an object keyed by language ( /products/_doc/1 { "title": {"en":...} } ).
    All analyzer languages are pre-declared under these fields:
    arabic, armenian, basque, brazilian, bulgarian, catalan, chinese, cjk, czech, danish, dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, italian, norwegian, persian, portuguese, romanian, russian, spanish, swedish, turkish, thai
    
    Implemented as type: object + properties, which supports both per-language ingestion and per-language analyzers.
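A minimal sketch of how such a "type: object + properties" mapping could be generated, one analyzed text sub-field per language. The helper name `language_object_field`, the language keys, and the key-to-analyzer pairing in `LANGS` are assumptions for illustration; the actual contents of mappings/search_products.json are not shown in this commit.

```python
import json


def language_object_field(lang_to_analyzer, with_keyword=False):
    """Build an object mapping with one analyzed text sub-field per language."""
    properties = {}
    for lang, analyzer in lang_to_analyzer.items():
        sub_field = {"type": "text", "analyzer": analyzer}
        if with_keyword:
            # Exact-match sub-field, addressed as e.g. vendor.zh.keyword
            sub_field["fields"] = {"keyword": {"type": "keyword"}}
        properties[lang] = sub_field
    return {"type": "object", "properties": properties}


# Assumed language-key -> analyzer pairs; the commit lists about 30 analyzers,
# and whether each one is available depends on the cluster.
LANGS = {"en": "english", "zh": "chinese", "fr": "french"}

mapping = {
    "properties": {
        "title": language_object_field(LANGS),
        "vendor": language_object_field(LANGS, with_keyword=True),
    }
}

if __name__ == "__main__":
    print(json.dumps(mapping, indent=2))
```

Generating the sub-fields from one table keeps every language field consistent, which is one way to get both per-language ingestion and per-language analysis from a single definition.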
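    Index ingestion (full / incremental / transformer) has been updated to match.
    indexer/document_transformer.py: output changed from title_zh/title_en/... to:
    title: {<primary_lang>: original text, en?: translation, zh?: translation}
    brief/description/vendor follow the same pattern.
    category_path/category_name_text are also language objects now (so the query side no longer depends on the old fields).
    indexer/incremental_service.py: the embedding input is no longer read from title_en/title_zh; it is taken from the title object, preferring en, then zh, then any available language.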
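The embedding fallback order described above (en, then zh, then any available language) could look roughly like this. The function name `pick_embedding_text` is hypothetical, and treating blank strings as missing is an assumption; the code in indexer/incremental_service.py is not shown in this commit.

```python
def pick_embedding_text(title):
    """Return the text to embed from a language-keyed title object, or None."""
    if not isinstance(title, dict):
        return None

    # Preferred languages first, in the order given in the commit message.
    for lang in ("en", "zh"):
        text = title.get(lang)
        if isinstance(text, str) and text.strip():
            return text

    # Otherwise fall back to the first non-empty value in any language.
    for text in title.values():
        if isinstance(text, str) and text.strip():
            return text

    return None
```

For example, `pick_embedding_text({"de": "Hemd", "zh": "衬衫"})` returns the zh text because no en entry exists.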
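    Query side, config, and API/docs have been updated to match.
    search/es_query_builder.py: query fields now consistently use dot paths: title.zh / title.en / vendor.zh / vendor.zh.keyword / category_name_text.zh, etc.
    config/config.yaml: field names in field boosts / indexes updated to the new dot paths.
    API & formatter:
    api/result_formatter.py supports the new structure (and keeps a compatibility fallback for the old *_zh/_en fields).
    api/models.py and the related docs/examples: vendor_zh.keyword and similar references updated to vendor.zh.keyword.
    Docs/scripts: all references to the old field names in docs/, README.md, and scripts/ have been bulk-replaced with the new structure.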
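A sketch of the compatibility fallback mentioned for the formatter: read the new language object first, then fall back to the legacy flat fields. The function name `get_localized` and the fallback language order are assumptions, not the actual api/result_formatter.py code.

```python
def get_localized(source, field, lang, fallback_langs=("en", "zh")):
    """Read a localized value from an ES _source dict, new or legacy layout."""
    candidates = (lang, *fallback_langs)

    # New layout: {"title": {"en": "...", "zh": "..."}}
    value = source.get(field)
    if isinstance(value, dict):
        for candidate in candidates:
            if value.get(candidate):
                return value[candidate]

    # Legacy layout: {"title_en": "...", "title_zh": "..."}
    for candidate in candidates:
        legacy = source.get(f"{field}_{candidate}")
        if legacy:
            return legacy

    return None
```

Checking the new layout first means documents indexed before the migration still render, while re-indexed documents never touch the legacy path.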
    tangwang
     

19 Dec, 2025

1 commit

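  • 1. Remove the IndexingPipeline class
    File: indexer/bulk_indexer.py
    Removed: the IndexingPipeline class (lines 201-259)
    Removed: the load_mapping import, which is no longer needed
    2. Remove legacy code from main.py
    Removed: the cmd_ingest() function (the entire function)
    Removed: the ingest subcommand definition
    Removed: handling of the ingest command in main()
    Removed: the pandas import, which is no longer needed
    Updated: the docstring, dropping the ingest command description
    3. Remove the old data import script
    Removed: data/customer1/ingest_customer1.py (it depended on the deprecated DataTransformer and IndexingPipeline)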
    tangwang
     

05 Dec, 2025

2 commits


28 Nov, 2025

1 commit

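  • Script: scripts/csv_to_excel_multi_variant.py
    
    Main features:
    Single-variant products (type S) - 30%
    Product attribute is S
    option1/option2/option3 are left empty
    Contains all product information (title, description, price, stock, etc.)
    Multi-variant products (type M+P) - 70%
    M row (product master):
    Product attribute is M
    Holds the master information (title, description, SEO, category, etc.)
    option1="color", option2="size", option3="material"
    Variant-level fields such as price, stock, and SKU are left empty
    P rows (variants):
    Product attribute is P
    Product title matches the M row
    option1/2/3 hold concrete values (the Cartesian product of color, size, and material)
    Each SKU has its own price, stock, SKU code, etc.
    Multi-variant generation rules:
    Color: 2-10 values chosen at random from color1 to color30
    Size: 4-8 values chosen at random from 1-30
    Material: taken from the last whitespace-separated token of the product title (special characters stripped)
    Cartesian product: a P row is generated for every combination (e.g. 3 colors × 5 sizes × 1 material = 15 SKUs)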
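The multi-variant generation rules above can be sketched as follows. The row keys (`attribute`, `title`, `option1`...) and function names are hypothetical, and "special characters" is assumed to mean anything outside `\w`; the real column layout of scripts/csv_to_excel_multi_variant.py is not shown in this commit.

```python
import itertools
import random
import re


def material_from_title(title):
    """Last whitespace-separated token of the title, non-word characters removed."""
    tokens = title.split()
    last = tokens[-1] if tokens else ""
    return re.sub(r"[^\w]", "", last)


def build_variant_rows(title, rng):
    """Return one M row plus one P row per color x size x material combination."""
    colors = rng.sample([f"color{i}" for i in range(1, 31)], rng.randint(2, 10))
    sizes = rng.sample([str(i) for i in range(1, 31)], rng.randint(4, 8))
    materials = [material_from_title(title)]

    # M row: option columns carry the option names, not values.
    m_row = {"attribute": "M", "title": title,
             "option1": "color", "option2": "size", "option3": "material"}

    # P rows: the Cartesian product of the chosen option values.
    p_rows = [
        {"attribute": "P", "title": title,
         "option1": color, "option2": size, "option3": material}
        for color, size, material in itertools.product(colors, sizes, materials)
    ]
    return m_row, p_rows


if __name__ == "__main__":
    m_row, p_rows = build_variant_rows("Classic Shirt Cotton!", random.Random(0))
    print(m_row)
    print(len(p_rows), "P rows")
```

Passing in a seeded `random.Random` keeps the generated test data reproducible between runs.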
    tangwang
     

17 Nov, 2025

1 commit


13 Nov, 2025

1 commit


12 Nov, 2025

1 commit


11 Nov, 2025

1 commit

  • ## 🎯 Major Features
    - Request context management system for complete request visibility
    - Structured JSON logging with automatic daily rotation
    - Performance monitoring with detailed stage timing breakdowns
    - Query analysis result storage and intermediate result tracking
    - Error and warning collection with context correlation
    
    ## 🔧 Technical Improvements
    - **Context Management**: Request-level context with reqid/uid correlation
    - **Performance Monitoring**: Automatic timing for all search pipeline stages
    - **Structured Logging**: JSON format logs with request context injection
    - **Query Enhancement**: Complete query analysis tracking and storage
    - **Error Handling**: Enhanced error tracking with context information
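As an illustration of the items above, here is a minimal sketch of request-level context with reqid/uid correlation, per-stage timing, and JSON log output. The class and function names (`RequestContext`, `get_context`, `set_context`) and the log fields are assumptions; the project's actual implementation is not shown in this commit. The sketch uses `contextvars` from the standard library, so each thread or async task sees only its own context.

```python
import contextvars
import json
import time
import uuid
from contextlib import contextmanager

# Holds the context of the request currently being handled (None outside a request).
_current = contextvars.ContextVar("request_context", default=None)


class RequestContext:
    def __init__(self, reqid=None, uid=None):
        self.reqid = reqid or uuid.uuid4().hex[:8]
        self.uid = uid or "anonymous"
        self.stage_ms = {}   # stage name -> elapsed milliseconds
        self.warnings = []   # warnings collected during the request

    @contextmanager
    def stage(self, name):
        """Time one pipeline stage, recording the duration even if it raises."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.stage_ms[name] = round((time.perf_counter() - start) * 1000, 3)

    def to_log_line(self, message):
        """Serialize a log record with the request context injected."""
        return json.dumps({
            "reqid": self.reqid,
            "uid": self.uid,
            "message": message,
            "stage_ms": self.stage_ms,
            "warnings": self.warnings,
        })


def set_context(ctx):
    return _current.set(ctx)


def get_context():
    return _current.get()


# Usage: bind a context, time a stage, emit one JSON line, then unbind.
ctx = RequestContext(reqid="r-1", uid="u-42")
token = set_context(ctx)
with get_context().stage("query_parsing"):
    pass  # the real pipeline stage would run here
log_line = ctx.to_log_line("search done")
_current.reset(token)
```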
    
    ## 🐛 Bug Fixes
    - Fixed DeepL API endpoint (paid vs free API confusion)
    - Fixed vector generation (GPU memory cleanup)
    - Fixed logger parameter passing format (reqid/uid handling)
    - Fixed translation and embedding functionality
    
    ## 🌟 API Improvements
    - Simplified API interface (8→5 parameters, 37.5% reduction)
    - Made internal functionality transparent to users
    - Added performance info to API responses
    - Enhanced request correlation and tracking
    
    ## 📁 New Infrastructure
    - Comprehensive test suite (unit, integration, API tests)
    - CI/CD pipeline with automated quality checks
    - Performance monitoring and testing tools
    - Documentation and example usage guides
    
    ## 🔒 Security & Reliability
    - Thread-safe context management for concurrent requests
    - Automatic log rotation and structured output
    - Error isolation with detailed context information
    - Complete request lifecycle tracking
    
    🤖 Generated with Claude Code
    
    Co-Authored-By: Claude <noreply@anthropic.com>
    tangwang
     

10 Nov, 2025

2 commits


08 Nov, 2025

1 commit