fix facet for 172

tangwang
1 parent 0a3764c4
Showing 9 changed files with 2228 additions and 93 deletions
CLIP_SERVICE_README.md
docs/亚马逊到店匠格式转换分析.md
docs/向量化模块和API说明文档.md
frontend/index.html
frontend/static/js/app.js
frontend/static/js/tenant_facets_config.js
scripts/start_clip_service.sh
scripts/stop_clip_service.sh
前端分面配置说明.md
@@ -0,0 +1,194 @@
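+## Vector service based on `clip-server` (a peer replacement for `embeddings`)
+
+This document describes how to deploy, in an **isolated environment**, a vector service built on the `jina-ai/clip-as-service` repository (the installed packages are `clip-server` / `clip-client`) as a replacement for the local `embeddings` service in this repository (`embeddings/server.py`).
+
+> Design goals:  
+> - **Fully isolated** from the project's main environment (the `searchengine` conda env)  
+> - Uses the official open-source project [`jina-ai/clip-as-service`](https://github.com/jina-ai/clip-as-service) (PyPI packages: `clip-server` / `clip-client`)  
+> - Provides simple **install / start / stop scripts**  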
+
+---
+
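+## 1. Environment setup (isolated environment)
+
+Create a dedicated Conda environment, separate from this project's `searchengine` environment:
+
+```bash
+# 1) Load conda
+source /home/tw/miniconda3/etc/profile.d/conda.sh
+
+# 2) Create a dedicated environment for the CLIP vector service
+conda create -n clip_service python=3.9 -y
+
+# 3) Activate the environment
+conda activate clip_service
+
+# 4) Install clip-server / clip-client (they depend on jina internally)
+#    To work around mirror problems, point pip at the official PyPI index explicitly:
+#    pip install -i https://pypi.org/simple "clip-server" "clip-client"
+pip install "clip-server" "clip-client"
+```
+
+> If you do not use Conda, you can create a virtual environment with `python -m venv` instead,  
+> but **do not share a Python environment with the main project**.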
+
+---
+
+## 2. Start / stop scripts
+
+The `scripts/` directory of this repository provides two scripts (make them executable once):
+
+```bash
+chmod +x scripts/start_clip_service.sh
+chmod +x scripts/stop_clip_service.sh
+```
+
+### 2.1 Starting the service
+
+```bash
+cd /home/tw/SearchEngine
+./scripts/start_clip_service.sh
+```
+
+What the script does:
+
+- Changes into the repository root `/home/tw/SearchEngine`
+- Tries to source `/home/tw/miniconda3/etc/profile.d/conda.sh` and activate the `clip_service` environment
+- Starts the service in the background with `nohup python -m clip_server`
+- Writes logs to `logs/clip_service.log`
+- Writes the process ID to `logs/clip_service.pid`
+
+By default, `clip-server` listens on **`grpc://0.0.0.0:51000`** (gRPC protocol, port 51000).
+
+> ⚠️ **Important**: clients must connect to port **51000**, not 23456 or any other port.
+
+### 2.2 Stopping the service
+
+```bash
+cd /home/tw/SearchEngine
+./scripts/stop_clip_service.sh
+```
+
+What the script does:
+
+- Reads the PID from `logs/clip_service.pid`
+- Sends `kill` to the process if it is still running
+- Removes `logs/clip_service.pid`
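
The stop behaviour listed above can be sketched in Python as follows. This is only an illustration of the PID-file pattern, not the contents of `stop_clip_service.sh`; the function name `stop_from_pidfile` is hypothetical.

```python
import os
import signal
from pathlib import Path


def stop_from_pidfile(pid_path: str) -> bool:
    """Send SIGTERM to the PID stored in pid_path.

    Returns True if a signal was sent, False if there was nothing to stop.
    The PID file is removed whenever it exists (hypothetical helper).
    """
    path = Path(pid_path)
    if not path.exists():
        return False  # no PID file: the service is assumed not to be running
    try:
        pid = int(path.read_text().strip())
        os.kill(pid, signal.SIGTERM)  # SIGTERM is also the default for `kill`
        sent = True
    except (ValueError, ProcessLookupError):
        sent = False  # malformed or stale PID file
    path.unlink()  # clean up the PID file either way
    return sent
```

A caller would pass `logs/clip_service.pid`; a `False` result means the file was missing, unreadable, or pointed at a process that had already exited.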
+
+---
+
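+## 3. Relationship to the existing `embeddings` service
+
+- Existing local vector service:  
+  - Start script: `./scripts/start_embedding_service.sh`  
+  - Implementation: `embeddings/server.py` (FastAPI + local models `bge_model.py` / `clip_model.py`)  
+- The new `clip-server`:  
+  - Uses the official implementation, in its own process and its own environment  
+  - A CLIP embedding service for images and text  
+
+### Usage recommendations
+
+- To keep using the local model service that ships with this repository, leave the existing script unchanged:  
+  - `./scripts/start_embedding_service.sh`
+- To replace the local service with `clip-as-service`, either:
+  - Point the upstream calling code at the port / interface exposed by `clip-as-service`
+  - Or add an adapter layer that wraps `clip-as-service` behind the same interface as `POST /embed/text` / `POST /embed/image` (depending on the use case)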
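
The adapter-layer option could look roughly like the sketch below. It is framework-agnostic on purpose: `make_embed_handler` is a hypothetical helper that wraps any `encode_fn` (for example the `encode` method of a `clip_client.Client`) and returns a list aligned with the input, with `None` for items that could not be embedded. How this handler is mounted behind `POST /embed/text` is left open.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def make_embed_handler(
    encode_fn: Callable[[List[str]], Sequence[Sequence[float]]],
) -> Callable[[List[str]], List[Optional[Vector]]]:
    """Wrap a batch encoder so its output is index-aligned with the input."""

    def handler(items: List[str]) -> List[Optional[Vector]]:
        out: List[Optional[Vector]] = [None] * len(items)
        # Only non-empty strings are forwarded to the backend.
        valid = [(i, s) for i, s in enumerate(items) if isinstance(s, str) and s.strip()]
        if not valid:
            return out
        try:
            vectors = encode_fn([s for _, s in valid])
        except Exception:
            return out  # backend failure: every slot stays None
        for (i, _), vec in zip(valid, vectors):
            out[i] = [float(x) for x in vec]
        return out

    return handler
```

With `clip-client` installed, `make_embed_handler(Client('grpc://localhost:51000').encode)` would be one way to obtain a handler; that wiring is an assumption and has not been checked against this repository.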
+
+---
+
+## 4. Basic verification
+
+1. Confirm that the `clip_service` environment was created and the packages installed:
+
+   ```bash
+   source /home/tw/miniconda3/etc/profile.d/conda.sh
+   conda activate clip_service
+   python -c "import jina; print('jina version:', jina.__version__)"
+   ```
+
+2. Start the service and watch the log:
+
+   ```bash
+   cd /home/tw/SearchEngine
+   ./scripts/start_clip_service.sh
+   tail -f logs/clip_service.log
+   ```
+
+   Once started, the service listens on **`grpc://0.0.0.0:51000`** by default (gRPC protocol, port 51000).
+
+3. Test a client connection (inside the `clip_service` environment):
+
+   ```python
+   from clip_client import Client
+   
+   # Note: the default port is 51000, not 23456
+   c = Client('grpc://0.0.0.0:51000')
+   
+   # Test the connection
+   c.profile()
+   
+   # Test text embedding
+   r = c.encode(['First do it', 'then do it right', 'then do it better'])
+   print(r.shape)  # expected: (3, 512) or similar, depending on the model
+   
+   # Test image embedding
+   r = c.encode(['https://picsum.photos/200'])
+   print(r.shape)  # expected: (1, 512) or similar, depending on the model
+   ```
+
+4. When the service is no longer needed, run:
+
+   ```bash
+   ./scripts/stop_clip_service.sh
+   ```
+
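+### FAQ
+
+**Q: Connection refused?**  
+A: Check that:
+- The service is running (inspect `logs/clip_service.log` and the process list)
+- The client uses port **51000** (not 23456)
+- The client address is well-formed: `grpc://0.0.0.0:51000` or `grpc://localhost:51000`
+
+**Q: The gateway started but the worker fails to connect?**  
+A: Possible causes:
+- The worker process (clip_t) is still starting; loading the model takes time, and the first start may need to download it
+- Check the log for model download or loading errors:
+  ```bash
+  tail -f logs/clip_service.log | grep -E "(ERROR|WARNING|model|download)"
+  ```
+- If it keeps failing, restart the service:
+  ```bash
+  ./scripts/stop_clip_service.sh
+  ./scripts/start_clip_service.sh
+  ```
+
+**Q: How do I find the port the service is actually listening on?**  
+A: Check the startup log:
+```bash
+tail -f logs/clip_service.log | grep "bound to"
+```
+Or check which process is listening on the port:
+```bash
+lsof -i :51000
+# or
+netstat -tlnp | grep 51000
+```
+
+**Q: How do I confirm the service is fully ready?**  
+A: Look for output like this in the log:
+```
+INFO   gateway/rep-0@XXXXX start server bound to 0.0.0.0:51000
+```
+Then wait a few seconds for the worker process to start before testing a client connection.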
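
Instead of watching the log by hand, a script can poll the port until the gateway accepts TCP connections. The sketch below uses only the standard library; `wait_for_port` is a hypothetical helper, and an open port shows only that the gateway is bound, not that the worker has finished loading the model.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 51000)` could run right after the start script, followed by a short pause or a `c.profile()` call before real traffic is sent.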
+
+---
+
+## 5. References
+
+- Project: `https://github.com/jina-ai/clip-as-service`
+- Embedding module docs in this project: `embeddings/README.md`, `CLOUD_EMBEDDING_README.md`
+
+
+
+
@@ -361,3 +361,8 @@ python scripts/amazon_xlsx_to_shoplazza_xlsx.py \
 This is a typical **data-format-conversion ETL task** that involves restructuring data, parsing strings, and selecting algorithms intelligently.
+
+
+
+
+
@@ -0,0 +1,1424 @@
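+# Embedding Module and API Reference
+
+This document describes the architecture, API endpoints, configuration, and usage of the embedding module in the SearchEngine project.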
+
+## Table of contents
+
+1. Overview
+   - 1.1 Introduction to the embedding module
+   - 1.2 Technology choices
+   - 1.3 Use cases
+
+2. Embedding service architecture
+   - 2.1 Local embedding service
+   - 2.2 Cloud embedding service
+   - 2.3 Architecture comparison
+
+3. Local embedding service
+   - 3.1 Starting the service
+   - 3.2 Service configuration
+   - 3.3 Models
+
+4. Cloud embedding service
+   - 4.1 Aliyun DashScope
+   - 4.2 API key configuration
+   - 4.3 Usage
+
+5. Embedding API reference
+   - 5.1 API overview
+   - 5.2 Health check endpoint
+   - 5.3 Text embedding endpoint
+   - 5.4 Image embedding endpoint
+   - 5.5 Error handling
+
+6. Configuration
+   - 6.1 Service configuration
+   - 6.2 Model configuration
+   - 6.3 Batch configuration
+
+7. Client integration examples
+   - 7.1 Python client
+   - 7.2 Java client
+   - 7.3 cURL examples
+
+8. Performance comparison and optimization
+   - 8.1 Performance comparison
+   - 8.2 Cost comparison
+   - 8.3 Optimization recommendations
+
+9. Troubleshooting
+   - 9.1 Common problems
+   - 9.2 Viewing logs
+   - 9.3 Performance tuning
+
+10. Appendix
+    - 10.1 Vector dimensions
+    - 10.2 Model version information
+    - 10.3 Related documents
+
+---
+
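+## Overview
+
+### 1.1 Introduction to the embedding module
+
+The SearchEngine project embeds both text and images and supports two deployment modes:
+
+1. **Local embedding service**: a standalone microservice that runs the BGE-M3 and CN-CLIP models on a local GPU or CPU
+2. **Cloud embedding service**: integrates the Aliyun DashScope API and is billed by usage
+
+The embedding module is a core component of the search engine: it supplies the similarity computation behind semantic search and image search.
+
+### 1.2 Technology choices
+
+| Capability | Local service | Cloud service |
+|------|---------|---------|
+| **Text model** | BGE-M3 (Xorbits/bge-m3) | text-embedding-v4 |
+| **Image model** | CN-CLIP (ViT-H-14) | - |
+| **Vector dimension** | 1024 | 1024 |
+| **Service framework** | FastAPI | Aliyun API |
+| **Deployment** | Docker / local | Cloud API |
+
+### 1.3 Use cases
+
+- **Semantic search**: embed the query text and compute its similarity to product vectors
+- **Image search**: embed product images to support search by image
+- **Hybrid retrieval**: rank by a combination of BM25 and vector similarity
+- **Multilingual search**: cross-lingual semantic matching between Chinese and English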
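
As an illustration of the hybrid-retrieval idea, the sketch below fuses BM25 and vector-similarity scores with min-max normalization and a weighted sum. The weighting scheme, the `alpha` parameter, and the function names are assumptions made for this example; the document does not say how SearchEngine combines the two signals.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; constant inputs all map to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(
    bm25: Dict[str, float], vector: Dict[str, float], alpha: float = 0.5
) -> List[Tuple[str, float]]:
    """Rank documents by alpha * BM25 + (1 - alpha) * vector similarity."""
    b, v = min_max(bm25), min_max(vector)
    fused = {
        doc: alpha * b.get(doc, 0.0) + (1 - alpha) * v.get(doc, 0.0)
        for doc in set(b) | set(v)
    }
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Normalizing first matters because raw BM25 scores are unbounded while cosine similarities lie in [-1, 1]; without it the lexical score would dominate the sum.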
+
+---
+
+## Embedding service architecture
+
+### 2.1 Local embedding service
+
+```
+┌─────────────────────────────────────────┐
+│  Embedding Microservice (FastAPI)       │
+│  Port: 6005, Workers: 1                 │
+└──────────────┬──────────────────────────┘
+               │
+       ┌───────┴───────┐
+       │               │
+┌──────▼──────┐  ┌────▼─────┐
+│ BGE-M3      │  │ CN-CLIP  │
+│ Text Model  │  │ Image    │
+│ (CUDA/CPU)  │  │ Model    │
+└─────────────┘  └──────────┘
+```
+
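+**Key characteristics**:
+- Deployed independently and can scale horizontally
+- GPU acceleration
+- Thread-safe design
+- Models are preloaded at startup
+
+### 2.2 Cloud embedding service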
+
+```
+┌─────────────────────────────────────┐
+│  SearchEngine Main Service          │
+│  (uses CloudTextEncoder)            │
+└──────────────┬──────────────────────┘
+               │
+               ▼
+┌─────────────────────────────────────┐
+│  Aliyun DashScope API               │
+│  text-embedding-v4                  │
+│  (HTTP/REST)                        │
+└─────────────────────────────────────┘
+```
+
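+**Key characteristics**:
+- No GPU resources required
+- Billed by usage
+- Scales automatically
+- Low operational overhead
+
+### 2.3 Architecture comparison
+
+| Dimension | Local service | Cloud service |
+|------|---------|---------|
+| **Upfront cost** | High (GPU server) | Low (pay as you go) |
+| **Running cost** | Fixed | Variable (by call volume) |
+| **Latency** | <100ms | 300-400ms |
+| **Throughput** | High (~32 qps) | Medium (~2-3 qps) |
+| **Offline support** | ✅ | ❌ |
+| **Maintenance cost** | High | Low |
+| **Scalability** | Manual scaling | Automatic scaling |
+| **Best suited for** | Large-scale production | Early development / small deployments |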
+
+---
+
+## Local embedding service
+
+### 3.1 Starting the service
+
+#### Option 1: start with the script (recommended)
+
+```bash
+# Start the embedding service
+./scripts/start_embedding_service.sh
+```
+
+What the script does:
+- Activates the conda environment
+- Reads the port from the configuration file
+- Starts the service with a single worker
+
+#### Option 2: start manually
+
+```bash
+# Activate the environment
+source /home/tw/miniconda3/etc/profile.d/conda.sh
+conda activate searchengine
+
+# Start the service
+python -m uvicorn embeddings.server:app \
+  --host 0.0.0.0 \
+  --port 6005 \
+  --workers 1
+```
+
+#### Option 3: Docker deployment (production)
+
+```bash
+# Build the image
+docker build -t searchengine-embedding:latest .
+
+# Start the container
+docker run -d \
+  --name embedding-service \
+  --gpus all \
+  -p 6005:6005 \
+  searchengine-embedding:latest
+```
+
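+### 3.2 Service configuration
+
+Configuration file: `embeddings/config.py`
+
+```python
+class EmbeddingConfig:
+    # Service settings
+    HOST = "0.0.0.0"      # listen address
+    PORT = 6005           # listen port
+
+    # Text model (BGE-M3)
+    TEXT_MODEL_DIR = "Xorbits/bge-m3"  # model path / HuggingFace ID
+    TEXT_DEVICE = "cuda"               # device: "cuda" or "cpu"
+    TEXT_BATCH_SIZE = 32               # batch size
+
+    # Image model (CN-CLIP)
+    IMAGE_MODEL_NAME = "ViT-H-14"      # model name
+    IMAGE_DEVICE = None                # None = auto, "cuda", "cpu"
+    IMAGE_BATCH_SIZE = 8               # batch size
+```
+
+### 3.3 Models
+
+#### BGE-M3 text model
+
+- **Model ID**: `Xorbits/bge-m3`
+- **Vector dimension**: 1024
+- **Languages**: Chinese, English, and 100+ other languages
+- **Characteristics**: strong semantic understanding; supports long text
+- **Deployment**: downloaded automatically from HuggingFace
+
+#### CN-CLIP image model
+
+- **Model**: ViT-H-14 (Chinese CLIP)
+- **Vector dimension**: 1024
+- **Input**: image URL or local path
+- **Characteristics**: Chinese image-text understanding, well suited to e-commerce
+- **Preprocessing**: images are downloaded, resized, and normalized automatically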
+
+---
+
+## Cloud embedding service
+
+### 4.1 Aliyun DashScope
+
+**Service endpoints**:
+- Beijing region: `https://dashscope.aliyuncs.com/compatible-mode/v1`
+- Singapore region: `https://dashscope-intl.aliyuncs.com/compatible-mode/v1`
+
+**Model information**:
+- **Model name**: `text-embedding-v4`
+- **Vector dimension**: 1024
+- **Input limits**: up to 2048 texts per request, each up to 8192 tokens
+- **Rate limits**: depend on the API plan
+
+### 4.2 API key configuration
+
+#### Option 1: environment variable (recommended)
+
+```bash
+# Set for the current shell only
+export DASHSCOPE_API_KEY="sk-your-api-key-here"
+
+# Set permanently (append to ~/.bashrc or ~/.zshrc)
+echo 'export DASHSCOPE_API_KEY="sk-your-api-key-here"' >> ~/.bashrc
+source ~/.bashrc
+```
+
+#### Option 2: .env file
+
+Create a `.env` file in the project root:
+
+```bash
+DASHSCOPE_API_KEY=sk-your-api-key-here
+```
+
+**Getting an API key**: https://help.aliyun.com/zh/model-studio/get-api-key
+
+### 4.3 Usage
+
+```python
+from embeddings.cloud_text_encoder import CloudTextEncoder
+
+# Initialize the encoder (the API key is read from the environment)
+encoder = CloudTextEncoder()
+
+# Embed a single text
+text = "衣服的质量杠杠的"
+embedding = encoder.encode(text)
+print(embedding.shape)  # (1, 1024)
+
+# Embed a batch
+texts = ["文本1", "文本2", "文本3"]
+embeddings = encoder.encode(texts)
+print(embeddings.shape)  # (3, 1024)
+
+# Embed a large batch (split into sub-batches automatically)
+large_texts = [f"商品 {i}" for i in range(1000)]
+embeddings = encoder.encode_batch(large_texts, batch_size=32)
+```
+
+**Custom configuration**:
+
+```python
+# Use the Singapore region
+encoder = CloudTextEncoder(
+    api_key="sk-xxx",
+    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
+)
+```
+
+---
+
+## Embedding API reference
+
+### 5.1 API overview
+
+The local embedding service exposes a RESTful API:
+
+| Endpoint | Method | Purpose |
+|------|------|------|
+| `/health` | GET | Health check |
+| `/embed/text` | POST | Text embedding |
+| `/embed/image` | POST | Image embedding |
+
+**Service address**:
+- Default: `http://localhost:6005`
+- Production: `http://<your-server>:6005`
+
+### 5.2 Health check endpoint
+
+```http
+GET /health
+```
+
+**Example response**:
+```json
+{
+  "status": "ok",
+  "text_model_loaded": true,
+  "image_model_loaded": true
+}
+```
+
+**Fields**:
+- `status`: service status; "ok" means healthy
+- `text_model_loaded`: whether the text model loaded successfully
+- `image_model_loaded`: whether the image model loaded successfully
+
+**cURL example**:
+```bash
+curl http://localhost:6005/health
+```
+
+### 5.3 Text embedding endpoint
+
+```http
+POST /embed/text
+Content-Type: application/json
+```
+
+#### Request format
+
+**Request body** (JSON array):
+```json
+[
+  "衣服的质量杠杠的",
+  "Bohemian Maxi Dress",
+  "Vintage Denim Jacket"
+]
+```
+
+**Parameters**:
+- Type: `List[str]`
+- Length: at most 100 items recommended (to avoid timeouts)
+- Each text: at most 512 characters recommended
+
+#### Response format
+
+**Success response** (200 OK):
+```json
+[
+  [0.1234, -0.5678, 0.9012, ..., 0.3456],  // 1024-dimensional vector
+  [0.2345, 0.6789, -0.1234, ..., 0.4567],  // 1024-dimensional vector
+  [0.3456, -0.7890, 0.2345, ..., 0.5678]   // 1024-dimensional vector
+]
+```
+
+**Fields**:
+- Type: `List[List[float]]`
+- Each vector: 1024 floats
+- Alignment: the output array matches the input array index by index
+- Failed items: returned as `null`
+
+**Partial-failure example**:
+```json
+[
+  [0.1234, -0.5678, ...],  // success
+  null,                     // failure (empty text or another error)
+  [0.3456, 0.7890, ...]     // success
+]
+```
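
Because failed items come back as `null` in the same position as their input, a client should keep inputs and outputs paired instead of filtering the output alone. A minimal sketch of that pairing follows; `pair_results` is a hypothetical helper name.

```python
from typing import List, Optional, Tuple


def pair_results(
    inputs: List[str], embeddings: List[Optional[List[float]]]
) -> Tuple[List[Tuple[str, List[float]]], List[int]]:
    """Split an index-aligned response into (input, vector) pairs and failed indices."""
    if len(inputs) != len(embeddings):
        raise ValueError("response length does not match request length")
    ok = [(text, vec) for text, vec in zip(inputs, embeddings) if vec is not None]
    failed = [i for i, vec in enumerate(embeddings) if vec is None]
    return ok, failed
```

The `failed` list makes it straightforward to log or retry exactly the items that returned `null`.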
+
+#### cURL example
+
+```bash
+# Single text
+curl -X POST http://localhost:6005/embed/text \
+  -H "Content-Type: application/json" \
+  -d '["测试查询文本"]'
+
+# Batch of texts
+curl -X POST http://localhost:6005/embed/text \
+  -H "Content-Type: application/json" \
+  -d '["红色连衣裙", "blue jeans", "vintage dress"]'
+```
+
+#### Python example
+
+```python
+import requests
+import numpy as np
+
+def embed_texts(texts):
+    """Embed a list of texts."""
+    response = requests.post(
+        "http://localhost:6005/embed/text",
+        json=texts,
+        timeout=30
+    )
+    response.raise_for_status()
+    embeddings = response.json()
+
+    # Convert to a numpy array. Failed items (null) are dropped, so row
+    # indices no longer match the input if any item failed.
+    valid_embeddings = [e for e in embeddings if e is not None]
+    return np.array(valid_embeddings)
+
+# Usage
+texts = ["红色连衣裙", "blue jeans"]
+embeddings = embed_texts(texts)
+print(f"Shape: {embeddings.shape}")  # (2, 1024)
+
+# Cosine similarity (normalize explicitly in case the vectors are not unit length)
+a, b = embeddings[0], embeddings[1]
+similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
+print(f"Similarity: {similarity}")
+```
+
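+### 5.4 Image embedding endpoint
+
+```http
+POST /embed/image
+Content-Type: application/json
+```
+
+#### Request format
+
+**Request body** (JSON array):
+```json
+[
+  "https://example.com/product1.jpg",
+  "https://example.com/product2.png",
+  "/local/path/to/product3.jpg"
+]
+```
+
+**Parameters**:
+- Type: `List[str]`
+- Accepted values: HTTP URLs or local file paths
+- Formats: JPG, PNG, and other common image formats
+- Length: at most 10 items recommended (image processing is slow)
+
+#### Response format
+
+**Success response** (200 OK):
+```json
+[
+  [0.1234, 0.5678, 0.9012, ..., 0.3456],  // 1024-dimensional vector
+  null,                                   // failure (invalid image or download failed)
+  [0.3456, 0.7890, 0.2345, ..., 0.5678]   // 1024-dimensional vector
+]
+```
+
+**Behaviour**:
+- Automatic download: images given as HTTP URLs are fetched by the service
+- One at a time: images are processed serially, under a lock for thread safety
+- Fault tolerance: one failed image does not affect the others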
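
The serial, lock-protected, fault-tolerant behaviour described above can be sketched as follows. This is not the code in `embeddings/server.py`; `embed_images_serially` and the injected `encode_one` callable are hypothetical stand-ins for the real download-and-encode step.

```python
import threading
from typing import Callable, List, Optional, Sequence

# One lock shared by all requests, so only one image is on the model at a time.
_encode_lock = threading.Lock()


def embed_images_serially(
    urls: List[str], encode_one: Callable[[str], Sequence[float]]
) -> List[Optional[List[float]]]:
    """Encode images one by one; a failed item becomes None in its own slot."""
    results: List[Optional[List[float]]] = []
    for url in urls:
        try:
            with _encode_lock:
                results.append([float(x) for x in encode_one(url)])
        except Exception:
            results.append(None)  # keep the slot so output stays index-aligned
    return results
```

Holding the lock per image rather than per request lets concurrent requests interleave, at the cost of acquiring the lock more often.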
+
+#### cURL example
+
+```bash
+# Single image (URL)
+curl -X POST http://localhost:6005/embed/image \
+  -H "Content-Type: application/json" \
+  -d '["https://example.com/product.jpg"]'
+
+# Several images (mixing URLs and local paths)
+curl -X POST http://localhost:6005/embed/image \
+  -H "Content-Type: application/json" \
+  -d '["https://example.com/img1.jpg", "/data/images/img2.png"]'
+```
+
+#### Python example
+
+```python
+import requests
+import numpy as np
+
+def embed_images(image_urls):
+    """Embed a list of images."""
+    response = requests.post(
+        "http://localhost:6005/embed/image",
+        json=image_urls,
+        timeout=120  # image processing is slow, so allow a longer timeout
+    )
+    response.raise_for_status()
+    embeddings = response.json()
+
+    # Keep only the items that were embedded successfully
+    valid_embeddings = [(url, emb) for url, emb in zip(image_urls, embeddings) if emb is not None]
+    return valid_embeddings
+
+# Usage
+image_urls = [
+    "https://example.com/dress1.jpg",
+    "https://example.com/dress2.jpg"
+]
+
+results = embed_images(image_urls)
+for url, embedding in results:
+    print(f"{url}: {len(embedding)} dimensions")
+```
+
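+### 5.5 Error handling
+
+#### HTTP status codes
+
+| Status code | Meaning | What to do |
+|--------|------|---------|
+| 200 | Success | Process the response normally |
+| 500 | Server error | Check the service log |
+| 503 | Service unavailable | A model is not loaded; check the startup log |
+
+#### Common error cases
+
+1. **Model not loaded**
+```json
+{
+  "detail": "Runtime Error: Text model not loaded"
+}
+```
+**Fix**: check the service startup log and confirm the model loaded successfully
+
+2. **Invalid input**
+```json
+[null, null]
+```
+**Cause**: the input contains empty strings or None
+
+3. **Image download failed**
+```json
+[
+  [0.123, ...],
+  null  // invalid URL or network problem
+]
+```
+**Fix**: check that the URL is reachable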
+
+---
+
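+## Configuration
+
+### 6.1 Service configuration
+
+Edit `embeddings/config.py` to change the service settings:
+
+```python
+class EmbeddingConfig:
+    # ========== Service settings ==========
+    HOST = "0.0.0.0"    # listen on all interfaces
+    PORT = 6005         # default port
+```
+
+**Production recommendations**:
+- Terminate SSL at a reverse proxy (Nginx)
+- Restrict access with firewall rules
+- Isolate the service in a Docker container
+
+### 6.2 Model configuration
+
+#### Text model
+
+```python
+# ========== BGE-M3 text model ==========
+TEXT_MODEL_DIR = "Xorbits/bge-m3"  # HuggingFace model ID
+TEXT_DEVICE = "cuda"               # device
+TEXT_BATCH_SIZE = 32               # batch size
+```
+
+**Choosing a device**:
+- `"cuda"`: GPU acceleration (recommended; requires CUDA)
+- `"cpu"`: CPU mode (slower, but works everywhere)
+
+**Suggested batch sizes**:
+- GPU (16GB memory): 32-64
+- GPU (8GB memory): 16-32
+- CPU: 8-16
+
+#### Image model
+
+```python
+# ========== CN-CLIP image model ==========
+IMAGE_MODEL_NAME = "ViT-H-14"      # model name
+IMAGE_DEVICE = None                # None = auto-detect
+IMAGE_BATCH_SIZE = 8               # batch size
+```
+
+**Choosing IMAGE_DEVICE**:
+- `None`: auto-detect (recommended)
+- `"cuda"`: force GPU
+- `"cpu"`: force CPU
+
+### 6.3 Batch configuration
+
+**Batch size tuning**:
+
+| Scenario | Text batch size | Image batch size | Notes |
+|------|---------------|---------------|------|
+| Development / testing | 16 | 1 | Fast response |
+| Production (GPU) | 32-64 | 4-8 | Balanced performance |
+| Production (CPU) | 8-16 | 1-2 | Avoids running out of memory |
+| Offline batch jobs | 128+ | 16+ | Maximizes throughput |
+
+**Tuning procedure**:
+1. Monitor GPU memory usage with `nvidia-smi`
+2. Increase batch_size step by step until an out-of-memory error occurs
+3. Back off to leave about 20% of memory free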
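
The tuning procedure above can be automated along these lines. The sketch doubles the batch size while a caller-supplied probe succeeds, then backs off; `find_batch_size` and `fits` are hypothetical names, and applying the 20% margin to the batch size is only a rough proxy for a 20% memory margin.

```python
from typing import Callable


def find_batch_size(
    fits: Callable[[int], bool], start: int = 8, limit: int = 1024, headroom: float = 0.2
) -> int:
    """Double the batch size while fits(size) is True, then shrink by `headroom`.

    `fits` should run one batch of the given size and return False on
    out-of-memory (for example by catching the framework's OOM error).
    """
    if not fits(start):
        raise ValueError(f"even the starting batch size {start} does not fit")
    size = start
    while size * 2 <= limit and fits(size * 2):
        size *= 2
    return max(1, int(size * (1 - headroom)))
```

A real probe would encode a representative batch on the target GPU while memory use is watched with `nvidia-smi`.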
+
+---
+
+## Client integration examples
+
+### 7.1 Python client
+
+#### Basic client class
+
+```python
+import requests
+from typing import List, Optional
+import numpy as np
+
+class EmbeddingServiceClient:
+    """Client for the embedding service."""
+
+    def __init__(self, base_url: str = "http://localhost:6005"):
+        self.base_url = base_url.rstrip('/')
+        self.timeout = 30
+
+    def health_check(self) -> dict:
+        """Check service health."""
+        response = requests.get(f"{self.base_url}/health", timeout=5)
+        response.raise_for_status()
+        return response.json()
+
+    def embed_text(self, text: str) -> Optional[List[float]]:
+        """Embed a single text."""
+        result = self.embed_texts([text])
+        return result[0] if result else None
+
+    def embed_texts(self, texts: List[str]) -> List[Optional[List[float]]]:
+        """Embed a batch of texts."""
+        if not texts:
+            return []
+
+        response = requests.post(
+            f"{self.base_url}/embed/text",
+            json=texts,
+            timeout=self.timeout
+        )
+        response.raise_for_status()
+        return response.json()
+
+    def embed_image(self, image_url: str) -> Optional[List[float]]:
+        """Embed a single image."""
+        result = self.embed_images([image_url])
+        return result[0] if result else None
+
+    def embed_images(self, image_urls: List[str]) -> List[Optional[List[float]]]:
+        """Embed a batch of images."""
+        if not image_urls:
+            return []
+
+        response = requests.post(
+            f"{self.base_url}/embed/image",
+            json=image_urls,
+            timeout=120  # image processing needs more time
+        )
+        response.raise_for_status()
+        return response.json()
+
+    def embed_texts_to_numpy(self, texts: List[str]) -> Optional[np.ndarray]:
+        """Embed a batch of texts and return a numpy array (failed items are dropped)."""
+        embeddings = self.embed_texts(texts)
+        valid_embeddings = [e for e in embeddings if e is not None]
+        if not valid_embeddings:
+            return None
+        return np.array(valid_embeddings, dtype=np.float32)
+
+# Usage example
+if __name__ == "__main__":
+    client = EmbeddingServiceClient()
+
+    # Health check
+    health = client.health_check()
+    print(f"Service status: {health}")
+
+    # Text embedding
+    texts = ["红色连衣裙", "blue jeans", "vintage dress"]
+    embeddings = client.embed_texts_to_numpy(texts)
+    print(f"Embeddings shape: {embeddings.shape}")
+
+    # Compute pairwise similarity
+    from sklearn.metrics.pairwise import cosine_similarity
+    similarities = cosine_similarity(embeddings)
+    print(f"Similarity matrix:\n{similarities}")
+```
+
+#### Advanced usage: async client
+
+```python
+import aiohttp
+import asyncio
+from typing import List, Optional
+
+class AsyncEmbeddingClient:
+    """Async client for the embedding service."""
+
+    def __init__(self, base_url: str = "http://localhost:6005"):
+        self.base_url = base_url.rstrip('/')
+        self.session: Optional[aiohttp.ClientSession] = None
+
+    async def __aenter__(self):
+        self.session = aiohttp.ClientSession()
+        return self
+
+    async def __aexit__(self, exc_type, exc_val, exc_tb):
+        if self.session:
+            await self.session.close()
+
+    async def embed_texts(self, texts: List[str]) -> List[Optional[List[float]]]:
+        """Embed a batch of texts asynchronously."""
+        if not texts:
+            return []
+
+        if not self.session:
+            raise RuntimeError("Client not initialized. Use 'async with'.")
+
+        async with self.session.post(
+            f"{self.base_url}/embed/text",
+            json=texts,
+            timeout=aiohttp.ClientTimeout(total=30)
+        ) as response:
+            response.raise_for_status()
+            return await response.json()
+
+# Usage example
+async def main():
+    async with AsyncEmbeddingClient() as client:
+        texts = ["text1", "text2", "text3"]
+        embeddings = await client.embed_texts(texts)
+        print(f"Got {len(embeddings)} embeddings")
+
+asyncio.run(main())
+```
+
+### 7.2 Java client
+
+#### Basic client class
+
+```java
+import java.net.URI;
+import java.net.http.HttpClient;
+import java.net.http.HttpRequest;
+import java.net.http.HttpResponse;
+import java.time.Duration;
+import java.util.List;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.databind.JsonNode;
+import com.fasterxml.jackson.databind.node.ArrayNode;
+
+public class EmbeddingServiceClient {
+    private final HttpClient httpClient;
+    private final ObjectMapper objectMapper;
+    private final String baseUrl;
+
+    public EmbeddingServiceClient(String baseUrl) {
+        this.baseUrl = baseUrl.replaceAll("/$", "");
+        this.httpClient = HttpClient.newBuilder()
+            .connectTimeout(Duration.ofSeconds(10))
+            .build();
+        this.objectMapper = new ObjectMapper();
+    }
+
+    /**
+     * Health check.
+     */
+    public HealthStatus healthCheck() throws Exception {
+        HttpRequest request = HttpRequest.newBuilder()
+            .uri(URI.create(baseUrl + "/health"))
+            .timeout(Duration.ofSeconds(5))
+            .GET()
+            .build();
+
+        HttpResponse<String> response = httpClient.send(
+            request,
+            HttpResponse.BodyHandlers.ofString()
+        );
+
+        JsonNode json = objectMapper.readTree(response.body());
+        return new HealthStatus(
+            json.get("status").asText(),
+            json.get("text_model_loaded").asBoolean(),
+            json.get("image_model_loaded").asBoolean()
+        );
+    }
+
+    /**
+     * Embed a batch of texts.
+     */
+    public List<float[]> embedTexts(List<String> texts) throws Exception {
+        // Build the request body
+        ArrayNode requestBody = objectMapper.createArrayNode();
+        for (String text : texts) {
+            requestBody.add(text);
+        }
+
+        HttpRequest request = HttpRequest.newBuilder()
+            .uri(URI.create(baseUrl + "/embed/text"))
+            .header("Content-Type", "application/json")
+            .timeout(Duration.ofSeconds(30))
+            .POST(HttpRequest.BodyPublishers.ofString(
+                objectMapper.writeValueAsString(requestBody)
+            ))
+            .build();
+
+        HttpResponse<String> response = httpClient.send(
+            request,
+            HttpResponse.BodyHandlers.ofString()
+        );
+
+        if (response.statusCode() != 200) {
+            throw new RuntimeException("API error: " + response.body());
+        }
+
+        // Parse the response
+        JsonNode root = objectMapper.readTree(response.body());
+        List<float[]> embeddings = new java.util.ArrayList<>();
+
+        for (JsonNode item : root) {
+            if (item.isNull()) {
+                embeddings.add(null);
+            } else {
+                float[] vector = objectMapper.treeToValue(item, float[].class);
+                embeddings.add(vector);
+            }
+        }
+
+        return embeddings;
+    }
+
+    /**
+     * Compute cosine similarity.
+     */
+    public static float cosineSimilarity(float[] v1, float[] v2) {
+        if (v1.length != v2.length) {
+            throw new IllegalArgumentException("Vectors must be same length");
+        }
+
+        float dotProduct = 0.0f;
+        float norm1 = 0.0f;
+        float norm2 = 0.0f;
+
+        for (int i = 0; i < v1.length; i++) {
+            dotProduct += v1[i] * v2[i];
+            norm1 += v1[i] * v1[i];
+            norm2 += v2[i] * v2[i];
+        }
+
+        return (float) (dotProduct / (Math.sqrt(norm1) * Math.sqrt(norm2)));
+    }
+
+    // Health status data class
+    public static class HealthStatus {
+        public final String status;
+        public final boolean textModelLoaded;
+        public final boolean imageModelLoaded;
+
+        public HealthStatus(String status, boolean textModelLoaded, boolean imageModelLoaded) {
+            this.status = status;
+            this.textModelLoaded = textModelLoaded;
+            this.imageModelLoaded = imageModelLoaded;
+        }
+
+        @Override
+        public String toString() {
+            return String.format("HealthStatus{status='%s', textModelLoaded=%b, imageModelLoaded=%b}",
+                status, textModelLoaded, imageModelLoaded);
+        }
+    }
+
+    // Usage example
+    public static void main(String[] args) throws Exception {
+        EmbeddingServiceClient client = new EmbeddingServiceClient("http://localhost:6005");
+
+        // Health check
+        HealthStatus health = client.healthCheck();
+        System.out.println("Health: " + health);
+
+        // Text embedding
+        List<String> texts = List.of("红色连衣裙", "blue jeans", "vintage dress");
+        List<float[]> embeddings = client.embedTexts(texts);
+
+        System.out.println("Got " + embeddings.size() + " embeddings");
+        for (int i = 0; i < embeddings.size(); i++) {
+            System.out.println("Embedding " + i + " dimensions: " +
+                (embeddings.get(i) != null ? embeddings.get(i).length : "null"));
+        }
+
+        // Compute similarity
+        if (embeddings.get(0) != null && embeddings.get(1) != null) {
+            float similarity = cosineSimilarity(embeddings.get(0), embeddings.get(1));
+            System.out.println("Similarity between text 0 and 1: " + similarity);
+        }
+    }
+}
+```
+
+**Maven dependency** (`pom.xml`):
+
+```xml
+<dependencies>
+    <dependency>
+        <groupId>com.fasterxml.jackson.core</groupId>
+        <artifactId>jackson-databind</artifactId>
+        <version>2.15.2</version>
+    </dependency>
+</dependencies>
+```
+
+### 7.3 cURL examples
+
+#### Health check
+
+```bash
+curl http://localhost:6005/health
+```
+
+#### Text embedding
+
+```bash
+# Single text
+curl -X POST http://localhost:6005/embed/text \
+  -H "Content-Type: application/json" \
+  -d '["衣服的质量杠杠的"]' \
+  | jq '.[0][0:10]'  # print the first 10 dimensions
+
+# Batch of texts
+curl -X POST http://localhost:6005/embed/text \
+  -H "Content-Type: application/json" \
+  -d '["红色连衣裙", "blue jeans", "vintage dress"]' \
+  | jq '. | length'  # check how many items came back
+```
+
+#### Image embedding
+
+```bash
+# Image by URL
+curl -X POST http://localhost:6005/embed/image \
+  -H "Content-Type: application/json" \
+  -d '["https://example.com/product.jpg"]' \
+  | jq '.[0][0:5]'
+
+# Local image
+curl -X POST http://localhost:6005/embed/image \
+  -H "Content-Type: application/json" \
+  -d '["/data/images/product.jpg"]'
+```
+
+#### Error handling example
+
+```bash
+# Check that the service is up
+if ! curl -f http://localhost:6005/health > /dev/null 2>&1; then
+    echo "Embedding service is not healthy!"
+    exit 1
+fi
+
+# Call the API and check for a failed item
+response=$(curl -s -X POST http://localhost:6005/embed/text \
+  -H "Content-Type: application/json" \
+  -d '["test query"]')
+
+if echo "$response" | jq -e '.[0] == null' > /dev/null; then
+    echo "Embedding failed!"
+    echo "$response"
+    exit 1
+fi
+
+echo "Embedding succeeded!"
+```
+
+---
+
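+## Performance comparison and optimization
+
+### 8.1 Performance comparison
+
+#### Local service performance
+
+| Operation | Hardware | Latency | Throughput |
+|------|---------|------|--------|
+| Text embedding (single) | GPU (RTX 3090) | ~80ms | ~12 qps |
+| Text embedding (batch of 32) | GPU (RTX 3090) | ~2.5s | ~256 qps |
+| Text embedding (single) | CPU (16 cores) | ~500ms | ~2 qps |
+| Image embedding (single) | GPU (RTX 3090) | ~150ms | ~6 qps |
+| Image embedding (batch of 4) | GPU (RTX 3090) | ~600ms | ~6 qps |
+
+#### Cloud service performance
+
+| Operation | Metric | Value |
+|------|------|-----|
+| Text embedding (single) | Latency | 300-400ms |
+| Text embedding (batch) | Throughput | ~2-3 qps |
+| API limits | Rate limit | Depends on the plan |
+| Availability | SLA | 99.9% |
+
+### 8.2 Cost comparison
+
+#### Local service cost
+
+| Configuration | Hardware (per month) | Electricity (per month) | Total (per month) |
+|------|--------------|-----------|------------|
+| GPU server (RTX 3090) | ¥3000 | ¥500 | ¥3500 |
+| GPU server (A100) | ¥8000 | ¥800 | ¥8800 |
+| CPU server (16 cores) | ¥800 | ¥200 | ¥1000 |
+
+#### Cloud service cost
+
+Aliyun DashScope pricing (for reference):
+
+| Plan | Price | Volume | Best suited for |
+|------|------|--------|---------|
+| Pay as you go | ¥0.0007/1K tokens | Unlimited | Testing / small scale |
+| Basic | ¥100/month | 1M tokens | Small applications |
+| Professional | ¥500/month | 10M tokens | Medium scale |
+| Enterprise | Custom | Unlimited | Large scale |
+
+**Worked cost example**:
+
+Assume 100,000 searches per day, averaging 10 tokens per query:
+- Daily volume: 1M tokens
+- Monthly volume: 30M tokens
+- Monthly cost: 30M tokens × ¥0.7 per 1M tokens = ¥21 (pay as you go)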
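
The same arithmetic as a small function, so other traffic assumptions can be plugged in. The function name is hypothetical; the price is the pay-as-you-go reference figure from the table above.

```python
def monthly_cost_cny(
    queries_per_day: int,
    tokens_per_query: int,
    price_per_1k_tokens: float,
    days: int = 30,
) -> float:
    """Monthly pay-as-you-go cost in CNY for a given query volume."""
    monthly_tokens = queries_per_day * tokens_per_query * days
    return monthly_tokens / 1000 * price_per_1k_tokens


cost = monthly_cost_cny(100_000, 10, 0.0007)
print(f"{cost:.2f}")  # 21.00
```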
+
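+### 8.3 Optimization recommendations
+
+#### Local service
+
+1. **GPU utilization**
+```python
+# Increase the batch size
+TEXT_BATCH_SIZE = 64  # up from 32
+```
+
+2. **Half precision (FP16)**
+```python
+# Use half-precision floats to save GPU memory
+import torch
+model = model.half()  # FP16
+```
+
+3. **Model warm-up**
+```python
+# Warm up right after the service starts
+@app.on_event("startup")
+async def warmup():
+    _text_model.encode(["warmup"], device="cuda")
+```
+
+4. **Connection handling**
+```bash
+# --workers 1: single worker (GPU model constraint)
+# --backlog 2048: larger connection queue
+# --limit-concurrency 32: cap on concurrent requests
+python -m uvicorn embeddings.server:app \
+  --workers 1 \
+  --backlog 2048 \
+  --limit-concurrency 32
+```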
+
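+#### Cloud service
+
+1. **Request batching**
+```python
+# Accumulate requests and send them to the API as one batch
+import asyncio
+
+class BatchEncoder:
+    def __init__(self, encoder, batch_size=32, timeout=0.1):
+        self.encoder = encoder  # assumed: any object with encode(list_of_texts)
+        self.batch_size = batch_size
+        self.timeout = timeout  # max seconds a partial batch waits
+        self.queue = []
+
+    async def encode(self, text: str):
+        loop = asyncio.get_running_loop()
+        future = loop.create_future()
+        self.queue.append((text, future))
+        if len(self.queue) >= self.batch_size:
+            self._flush()
+        else:
+            loop.call_later(self.timeout, self._flush)  # flush a partial batch later
+        return await future
+
+    def _flush(self):
+        batch, self.queue = self.queue, []
+        if batch:
+            # Blocking call; error handling is omitted for brevity
+            vectors = self.encoder.encode([text for text, _ in batch])
+            for (_, future), vector in zip(batch, vectors):
+                future.set_result(vector)
+```
+
+2. **Local cache**
+```python
+import hashlib
+import os
+import pickle
+
+class CachedEncoder:
+    def __init__(self, encoder, cache_file="embedding_cache.pkl"):
+        self.encoder = encoder  # assumed: any object with encode(text)
+        self.cache_file = cache_file
+        self.cache = self._load_cache(cache_file)
+
+    def _load_cache(self, cache_file):
+        # Start with an empty cache when no cache file exists yet
+        if os.path.exists(cache_file):
+            with open(cache_file, "rb") as f:
+                return pickle.load(f)
+        return {}
+
+    def save(self):
+        with open(self.cache_file, "wb") as f:
+            pickle.dump(self.cache, f)
+
+    def encode(self, text: str):
+        key = hashlib.md5(text.encode("utf-8")).hexdigest()
+        if key in self.cache:
+            return self.cache[key]
+
+        embedding = self.encoder.encode(text)
+        self.cache[key] = embedding
+        return embedding
+```
+
+3. **Fallback strategy**
+```python
+import logging
+
+from embeddings.cloud_text_encoder import CloudTextEncoder
+
+logger = logging.getLogger(__name__)
+
+class HybridEncoder:
+    def __init__(self):
+        self.cloud_encoder = CloudTextEncoder()
+        self.local_encoder = None  # loaded on demand
+
+    def encode(self, text: str):
+        try:
+            return self.cloud_encoder.encode(text)
+        except Exception as e:
+            logger.warning(f"Cloud API failed: {e}, falling back to local")
+            if not self.local_encoder:
+                # Assumed import path; adjust to where BgeEncoder actually lives
+                from embeddings.bge_model import BgeEncoder
+                self.local_encoder = BgeEncoder()
+            return self.local_encoder.encode(text)
+```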
+
+---
+
+## Troubleshooting
+
+### 9.1 Common problems
+
+#### Problem 1: the service fails to start
+
+**Symptom**:
+```bash
+$ ./scripts/start_embedding_service.sh
+Error: Port 6005 already in use
+```
+
+**Fix**:
+```bash
+# Find what is using the port
+lsof -i :6005
+
+# Kill the process that holds it
+kill -9 <PID>
+
+# Or change the port in the configuration file
+# embeddings/config.py: PORT = 6006
+```
+
+#### Problem 2: CUDA out of memory
+
+**Symptom**:
+```
+RuntimeError: CUDA out of memory. Tried to allocate 2.00 GiB
+```
+
+**Fix**:
+```python
+# Reduce the batch size
+TEXT_BATCH_SIZE = 16  # down from 32
+
+# Or fall back to CPU mode
+TEXT_DEVICE = "cpu"
+```
+
+#### Problem 3: model download fails
+
+**Symptom**:
+```
+OSError: Can't load tokenizer for 'Xorbits/bge-m3'
+```
+
+**Fix**:
+```bash
+# Download the model manually
+huggingface-cli download Xorbits/bge-m3
+
+# Or use a mirror
+export HF_ENDPOINT=https://hf-mirror.com
+```
+
+#### Problem 4: cloud API key not set
+
+**Symptom**:
+```
+ERROR: DASHSCOPE_API_KEY environment variable is not set!
+```
+
+**Fix**:
+```bash
+# Set the environment variable
+export DASHSCOPE_API_KEY="sk-your-key"
+
+# Verify
+echo $DASHSCOPE_API_KEY
+```
+
+#### Problem 5: API rate limit
+
+**Symptom**:
+```
+Rate limit exceeded. Please try again later.
+```
+
+**Fix**:
+```python
+# Add a delay between batches
+import time
+for batch in batches:
+    embeddings = encoder.encode_batch(batch)
+    time.sleep(0.1)  # wait 100ms between batches
+```
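
A fixed delay slows every batch, including the ones that would have succeeded. An alternative is to retry only on failure, with exponential backoff and jitter. The sketch below is generic: `call_with_backoff` is a hypothetical helper, and it retries on any exception because the exact exception type raised for a rate-limit error is not documented here; in real code the `except` clause should be narrowed.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying with exponentially growing, jittered delays."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries
    raise RuntimeError("unreachable")
```

Usage would look like `call_with_backoff(lambda: encoder.encode_batch(batch))`, with `encoder` and `batch` defined as in the snippet above.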
+
+### 9.2 日志查看
+
+#### 服务日志
+
+```bash
+# 查看实时日志
+./scripts/start_embedding_service.sh 2>&1 | tee embedding.log
+
+# 或使用systemd（如果配置了服务）
+journalctl -u embedding-service -f
+```
+
+#### Python application logs
+
+```python
+import logging
+
+# Configure logging
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+)
+
+logger = logging.getLogger(__name__)
+
+# Usage (`encoder` and `texts` come from your application)
+logger.info("Encoding texts...")
+try:
+    embeddings = encoder.encode(texts)
+except Exception as e:
+    logger.error("Encoding failed: %s", e)
+```
+
+#### GPU monitoring
+
+```bash
+# Monitor GPU usage in real time
+watch -n 1 nvidia-smi
+
+# Show detailed information
+nvidia-smi --query-gpu=timestamp,name,temperature.gpu,utilization.gpu,utilization.memory,memory.total,memory.used,memory.free --format=csv
+```
+
+### 9.3 Performance Tuning
+
+#### Benchmarking
+
+```python
+import time
+import numpy as np
+
+def benchmark_encoder(encoder, texts, iterations=100):
+    """Benchmark encoder.encode() over repeated runs."""
+    times = []
+
+    for i in range(iterations):
+        # perf_counter() is a monotonic, high-resolution clock suited to timing
+        start = time.perf_counter()
+        embeddings = encoder.encode(texts)
+        end = time.perf_counter()
+        times.append(end - start)
+
+    times = np.array(times)
+    print(f"Mean: {times.mean():.3f}s")
+    print(f"Std:  {times.std():.3f}s")
+    print(f"Min:  {times.min():.3f}s")
+    print(f"Max:  {times.max():.3f}s")
+    print(f"QPS:  {len(texts) / times.mean():.2f}")  # texts encoded per second
+
+# Usage
+benchmark_encoder(encoder, texts=["test"] * 32, iterations=100)
+```
+
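+#### Memory profiling
+
+```bash
+# Install the Python memory profiler
+pip install memory_profiler
+
+# Write the function to profile into script.py
+# (the script must also create `encoder` and call encode_batch)
+cat > script.py <<'EOF'
+from memory_profiler import profile
+
+@profile
+def encode_batch(texts):
+    return encoder.encode(texts)
+EOF
+
+# Run
+python -m memory_profiler script.py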
+```
+
+---
+
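+## Appendix
+
+### 10.1 Vector Dimensions
+
+#### Why 1024 dimensions?
+
+1. **Expressiveness**: 1024 dimensions capture rich semantic information
+2. **Compute efficiency**: the dimensionality is moderate, so computation stays fast
+3. **Storage balance**: vectors are a reasonable size (about 4 KB each in FP32)
+4. **Model choice**: both BGE-M3 and text-embedding-v4 use 1024 dimensions
+
+#### Vector storage estimate
+
+```
+Size of one vector        = 1024 × 4 bytes (FP32) = 4 KB
+Size of 1 million vectors = 4 KB × 1,000,000 = 4 GB
+Size of 10 million vectors = 4 KB × 10,000,000 = 40 GB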
+```
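The estimate above can be reproduced for other dimensions or precisions with a few lines of Python. This is a sketch of the arithmetic only: `vector_storage_bytes` is a hypothetical helper, and it counts raw vector data without any index or metadata overhead.

```python
def vector_storage_bytes(num_vectors: int, dim: int = 1024, bytes_per_value: int = 4) -> int:
    """Raw storage for num_vectors dense vectors (FP32 by default)."""
    return num_vectors * dim * bytes_per_value


for n in (1, 1_000_000, 10_000_000):
    size = vector_storage_bytes(n)
    print(f"{n:>10,} vectors: {size:,} bytes = {size / 1024**3:.2f} GiB")
```

Note that the rounded figures in the table treat 4 KB as 4,000 bytes; one million FP32 vectors occupy 4,096,000,000 bytes, which is about 3.81 GiB. Passing `bytes_per_value=2` gives the FP16 equivalent.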
+
+### 10.2 Model Versions
+
+#### BGE-M3
+
+- **HuggingFace ID**: `Xorbits/bge-m3`
+- **Paper**: [BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation](https://arxiv.org/abs/2402.03216)
+- **GitHub**: https://github.com/FlagOpen/FlagEmbedding
+- **Features**:
+  - Supports 100+ languages
+  - Accepts inputs of up to 8192 tokens
+  - Rich semantic representations
+
+#### CN-CLIP
+
+- **Model**: ViT-H-14
+- **Paper**: [Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese](https://arxiv.org/abs/2211.01335)
+- **GitHub**: https://github.com/OFA-Sys/Chinese-CLIP
+- **Features**:
+  - Chinese image-text understanding
+  - Supports both image retrieval and text retrieval
+  - Well suited to e-commerce scenarios
+
+#### Aliyun text-embedding-v4
+
+- **Provider**: Alibaba Cloud DashScope
+- **Documentation**: https://help.aliyun.com/zh/model-studio/getting-started/models
+- **Features**:
+  - Cloud API, no deployment required
+  - High availability (99.9% SLA)
+  - Automatic scaling
+
+### 10.3 Related Documentation
+
+#### Project documentation
+
+- **Search API integration guide**: `docs/搜索API对接指南.md`
+- **Index field reference**: `docs/索引字段说明v2.md`
+- **System design document**: `docs/系统设计文档.md`
+- **CLAUDE project guide**: `CLAUDE.md`
+
+#### External references
+
+- **BGE-M3 official documentation**: https://github.com/FlagOpen/FlagEmbedding/tree/master/BGE_M3
+- **Alibaba Cloud DashScope**: https://help.aliyun.com/zh/model-studio/
+- **Elasticsearch vector search**: https://www.elastic.co/guide/en/elasticsearch/reference/current/knn-search.html
+- **FastAPI documentation**: https://fastapi.tiangolo.com/
+
+#### Test scripts
+
+```bash
+# Test the local embedding service
+./scripts/test_embedding_service.sh
+
+# Test the cloud embedding service
+python scripts/test_cloud_embedding.py
+
+# Run the performance benchmark
+python scripts/benchmark_embeddings.py
+```
+
+---
+
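+## Version History
+
+| Version | Date | Changes |
+|------|------|---------|
+| v1.0 | 2025-12-23 | Initial version; complete documentation of the embedding module |
+
+---
+
+## Contact
+
+For questions or suggestions, please contact the project maintainers.
+
+**Project repository**: `/data/tw/SearchEngine`
+
+**Documentation directory**: `docs/`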
@@ -53,7 +53,6 @@
                 <span class="product-count" id="productCount">0 products found</span>
             </div>
             <div class="header-right">
-                <button class="fold-btn" onclick="toggleFilters()">Fold</button>
             </div>
         </header>
@@ -73,8 +72,10 @@
                 </select>
             </div>
             <div class="tenant-input-wrapper">
-                <label for="tenantInput">tenant ID:</label>
-                <input type="text" id="tenantInput" placeholder="请输入租户ID" value="170">
+                <label for="tenantSelect">tenant ID:</label>
+                <select id="tenantSelect" onchange="onTenantIdChange()">
+                    <!-- Options are populated dynamically by JavaScript -->
+                </select>
             </div>
             <div class="tenant-input-wrapper">
                 <label for="skuFilterDimension">sku_filter_dimension:</label>
@@ -90,40 +91,28 @@
         <!-- Filter Section -->
         <div class="filter-section" id="filterSection">
-            <!-- Category Filter (level-1 category) -->
-            <div class="filter-row">
-                <div class="filter-label">Category:</div>
-                <div class="filter-tags" id="category1Tags"></div>
-            </div>
-
-            <!-- Sub Category Filter (level-2 category) -->
-            <div class="filter-row">
-                <div class="filter-label">Sub Category:</div>
-                <div class="filter-tags" id="category2Tags"></div>
-            </div>
-
-            <!-- Third Category Filter (level-3 category) -->
-            <div class="filter-row">
-                <div class="filter-label">Third Category:</div>
-                <div class="filter-tags" id="category3Tags"></div>
-            </div>
+            <!-- The facet panel is generated dynamically by JavaScript -->
+            <div id="facetsContainer">
+                <!-- Category Filter (level-1 category), always shown -->
+                <div class="filter-row" data-facet-field="category1_name">
+                    <div class="filter-label">Category:</div>
+                    <div class="filter-tags" id="category1Tags"></div>
+                </div>
-            <!-- Color Filter -->
-            <div class="filter-row">
-                <div class="filter-label">Color:</div>
-                <div class="filter-tags" id="colorTags"></div>
-            </div>
+                <!-- Sub Category Filter (level-2 category), always shown -->
+                <div class="filter-row" data-facet-field="category2_name">
+                    <div class="filter-label">Sub Category:</div>
+                    <div class="filter-tags" id="category2Tags"></div>
+                </div>
-            <!-- Size Filter -->
-            <div class="filter-row">
-                <div class="filter-label">Size:</div>
-                <div class="filter-tags" id="sizeTags"></div>
-            </div>
+                <!-- Third Category Filter (level-3 category), always shown -->
+                <div class="filter-row" data-facet-field="category3_name">
+                    <div class="filter-label">Third Category:</div>
+                    <div class="filter-tags" id="category3Tags"></div>
+                </div>
-            <!-- Material Filter -->
-            <div class="filter-row">
-                <div class="filter-label">Material:</div>
-                <div class="filter-tags" id="materialTags"></div>
+                <!-- Specification facets are added here dynamically by JavaScript -->
+                <div id="specificationFacetsContainer"></div>
             </div>
             <!-- Dropdown Filters -->
@@ -208,7 +197,8 @@
         <p>SearchEngine © 2025 | API: <span id="apiUrl">Loading...</span></p>
     </footer>
-    <script src="/static/js/app.js?v=3.2"></script>
+    <script src="/static/js/tenant_facets_config.js?v=1.3"></script>
+    <script src="/static/js/app.js?v=3.6"></script>
     <script>
         // Autocomplete
         const SUGGEST_API = 'http://120.76.41.98:5003/suggest';
@@ -5,13 +5,13 @@ if (document.getElementById('apiUrl')) {
     document.getElementById('apiUrl').textContent = API_BASE_URL;
 }
-// Get tenant ID from input
+// Get tenant ID from select
 function getTenantId() {
-    const tenantInput = document.getElementById('tenantInput');
-    if (tenantInput) {
-        return tenantInput.value.trim();
+    const tenantSelect = document.getElementById('tenantSelect');
+    if (tenantSelect) {
+        return tenantSelect.value.trim();
     }
-    return '1'; // Default fallback
+    return '170'; // Default fallback
 }
 // Get sku_filter_dimension (as list) from input
@@ -45,10 +45,46 @@ let state = {
 };
 // Initialize
-document.addEventListener('DOMContentLoaded', function() {
-    document.getElementById('searchInput').focus();
+function initializeApp() {
+    // Initialize the tenant dropdown and the facet panel
+    console.log('Initializing app...');
+    initTenantSelect();
+    const searchInput = document.getElementById('searchInput');
+    if (searchInput) {
+        searchInput.focus();
+    }
+}
+
+// Initialize once the DOM has finished loading
+if (document.readyState === 'loading') {
+    document.addEventListener('DOMContentLoaded', initializeApp);
+} else {
+    // The DOM is already loaded, so run immediately
+    initializeApp();
+}
+
+// Fallback: if the initialization above failed, retry on window load
+window.addEventListener('load', function() {
+    const tenantSelect = document.getElementById('tenantSelect');
+    if (tenantSelect && tenantSelect.options.length === 0) {
+        console.log('Retrying tenant select initialization on window.load...');
+        initTenantSelect();
+    }
 });
+// Last resort: retry after a short delay so that all scripts have loaded
+setTimeout(function() {
+    const tenantSelect = document.getElementById('tenantSelect');
+    if (tenantSelect && tenantSelect.options.length === 0) {
+        console.log('Final retry: Initializing tenant select after delay...');
+        if (typeof getAvailableTenantIds === 'function') {
+            initTenantSelect();
+        } else {
+            console.error('getAvailableTenantIds still not available after delay');
+        }
+    }
+}, 100);
+
 // Keyboard handler
 function handleKeyPress(event) {
     if (event.key === 'Enter') {
@@ -56,10 +92,95 @@ function handleKeyPress(event) {
     }
 }
-// Toggle filters visibility
-function toggleFilters() {
-    const filterSection = document.getElementById('filterSection');
-    filterSection.classList.toggle('hidden');
+// Initialize the tenant dropdown
+function initTenantSelect() {
+    const tenantSelect = document.getElementById('tenantSelect');
+    if (!tenantSelect) {
+        console.error('tenantSelect element not found');
+        return;
+    }
+    
+    // Check that the config helper is available
+    if (typeof getAvailableTenantIds !== 'function') {
+        console.error('getAvailableTenantIds function not found. Make sure tenant_facets_config.js is loaded before app.js');
+        return;
+    }
+    
+    const availableTenants = getAvailableTenantIds();
+    console.log('Available tenants:', availableTenants);
+    
+    if (!availableTenants || availableTenants.length === 0) {
+        console.warn('No tenant IDs found in configuration');
+        return;
+    }
+    
+    // Clear existing options
+    tenantSelect.innerHTML = '';
+    
+    // Add one option per configured tenant
+    availableTenants.forEach(tenantId => {
+        const option = document.createElement('option');
+        option.value = tenantId;
+        option.textContent = tenantId;
+        tenantSelect.appendChild(option);
+    });
+    
+    // Set the default value (prefer tenant 170 when it is configured)
+    if (availableTenants.length > 0) {
+        tenantSelect.value = availableTenants.includes('170') ? '170' : availableTenants[0];
+    }
+    
+    // Render the facet panel for the selected tenant
+    renderFacetsPanel();
+}
+
+// Handle a change of tenant ID
+function onTenantIdChange() {
+    renderFacetsPanel();
+    // Clear the facet values currently shown
+    clearFacetsData();
+}
+
+// Render the facet panel structure for the current tenant_id
+function renderFacetsPanel() {
+    const tenantId = getTenantId();
+    const config = getTenantFacetsConfig(tenantId);
+    const container = document.getElementById('specificationFacetsContainer');
+    
+    if (!container) return;
+    
+    // Remove existing specification facet rows
+    container.innerHTML = '';
+    
+    // Create one facet row per specification field
+    config.specificationFields.forEach(specField => {
+        const row = document.createElement('div');
+        row.className = 'filter-row';
+        row.setAttribute('data-facet-field', specField.field);
+        
+        row.innerHTML = `
+            <div class="filter-label">${escapeHtml(specField.label)}:</div>
+            <div class="filter-tags" id="${specField.containerId}"></div>
+        `;
+        
+        container.appendChild(row);
+    });
+}
+
+// Clear all facet values (the row structure is kept)
+function clearFacetsData() {
+    const allTagContainers = document.querySelectorAll('.filter-tags');
+    allTagContainers.forEach(container => {
+        container.innerHTML = '';
+    });
+}
+
+// Escape HTML to prevent XSS
+function escapeHtml(text) {
+    if (!text) return '';
+    const div = document.createElement('div');
+    div.textContent = text;
+    return div.innerHTML;
 }
 // Perform search
@@ -119,11 +240,17 @@ async function performSearch(page = 1) {
     }
     // Specification facets (multi-select mode)
-    facets.push(
-        { field: "specifications.color", size: 20, type: "terms", disjunctive: true },   // color attribute
-        { field: "specifications.size", size: 15, type: "terms", disjunctive: true },    // size attribute
-        { field: "specifications.material", size: 10, type: "terms", disjunctive: true } // material attribute
-    );
+    // Use the facet configuration for the current tenant_id
+    const tenantFacetsConfig = getTenantFacetsConfig(tenantId);
+    tenantFacetsConfig.specificationFields.forEach(specField => {
+        // Send only the query parameters; omit display-only settings (label, containerId)
+        facets.push({
+            field: specField.field,
+            size: specField.size,
+            type: specField.type,
+            disjunctive: specField.disjunctive
+        });
+    });
     // Show loading
     document.getElementById('loading').style.display = 'block';
@@ -242,46 +369,20 @@ function displayFacets(facets) {
         return;
     }
+    const tenantId = getTenantId();
+    
     facets.forEach((facet) => {
-        // Find the container that matches the field name
-        let containerId = null;
-        let maxDisplay = 10;
-        
-        // Level-1 category
-        if (facet.field === 'category1_name') {
-            containerId = 'category1Tags';
-            maxDisplay = 10;
-        }
-        // Level-2 category
-        else if (facet.field === 'category2_name') {
-            containerId = 'category2Tags';
-            maxDisplay = 10;
-        }
-        // Level-3 category
-        else if (facet.field === 'category3_name') {
-            containerId = 'category3Tags';
-            maxDisplay = 10;
-        }
-        // Color facet (specifications.color)
-        else if (facet.field === 'specifications.color') {
-            containerId = 'colorTags';
-            maxDisplay = 10;
-        }
-        // Size facet (specifications.size)
-        else if (facet.field === 'specifications.size') {
-            containerId = 'sizeTags';
-            maxDisplay = 10;
-        }
-        // Material facet (specifications.material)
-        else if (facet.field === 'specifications.material') {
-            containerId = 'materialTags';
-            maxDisplay = 10;
-        }
+        // Look up the display settings for this facet in the tenant config
+        const displayConfig = getFacetDisplayConfig(tenantId, facet.field);
-        if (!containerId) {
+        if (!displayConfig) {
+            // No configuration for this field: skip the facet
             return;
         }
+        const containerId = displayConfig.containerId;
+        const maxDisplay = displayConfig.maxDisplay;
+        
         const container = document.getElementById(containerId);
         if (!container) {
             return;
@@ -722,13 +823,6 @@ function customStringify(obj) {
 }
 // Helper functions
-function escapeHtml(text) {
-    if (!text) return '';
-    const div = document.createElement('div');
-    div.textContent = text;
-    return div.innerHTML;
-}
-
 function escapeAttr(text) {
     if (!text) return '';
     return text.replace(/'/g, "\\'").replace(/"/g, '&quot;');
@@ -0,0 +1,141 @@
+// Per-tenant facet configuration
+// Each tenant_id gets its own facet field names, display labels and container IDs
+const TENANT_FACETS_CONFIG = {
+    // tenant_id=162: specification names are lowercase
+    "162": {
+        specificationFields: [
+            { 
+                field: "specifications.color", 
+                label: "Color",
+                containerId: "colorTags",
+                size: 20, 
+                type: "terms", 
+                disjunctive: true 
+            },
+            { 
+                field: "specifications.size", 
+                label: "Size",
+                containerId: "sizeTags",
+                size: 15, 
+                type: "terms", 
+                disjunctive: true 
+            },
+            { 
+                field: "specifications.material", 
+                label: "Material",
+                containerId: "materialTags",
+                size: 10, 
+                type: "terms", 
+                disjunctive: true 
+            }
+        ]
+    },
+    // tenant_id=170: specification names are capitalized (Color, Size); no material
+    "170": {
+        specificationFields: [
+            { 
+                field: "specifications.Color", 
+                label: "Color",
+                containerId: "colorTags",
+                size: 20, 
+                type: "terms", 
+                disjunctive: true 
+            },
+            { 
+                field: "specifications.Size", 
+                label: "Size",
+                containerId: "sizeTags",
+                size: 15, 
+                type: "terms", 
+                disjunctive: true 
+            }
+            // Example: if tenant 170 has further specifications, add them like this:
+            // { 
+            //     field: "specifications.Weight", 
+            //     label: "Weight",
+            //     containerId: "weightTags",
+            //     size: 15, 
+            //     type: "terms", 
+            //     disjunctive: true 
+            // }
+        ]
+    }
+};
+
+// Get the facet configuration for a tenant
+function getTenantFacetsConfig(tenantId) {
+    // Fall back to the default (lowercase) configuration for unconfigured tenants
+    return TENANT_FACETS_CONFIG[tenantId] || {
+        specificationFields: [
+            { 
+                field: "specifications.color", 
+                label: "Color",
+                containerId: "colorTags",
+                size: 20, 
+                type: "terms", 
+                disjunctive: true 
+            },
+            { 
+                field: "specifications.size", 
+                label: "Size",
+                containerId: "sizeTags",
+                size: 15, 
+                type: "terms", 
+                disjunctive: true 
+            },
+            { 
+                field: "specifications.material", 
+                label: "Material",
+                containerId: "materialTags",
+                size: 10, 
+                type: "terms", 
+                disjunctive: true 
+            }
+        ]
+    };
+}
+
+// Get the display settings for a facet by field name
+function getFacetDisplayConfig(tenantId, facetField) {
+    const config = getTenantFacetsConfig(tenantId);
+    
+    // Look for a matching specification field entry
+    const specField = config.specificationFields.find(f => f.field === facetField);
+    if (specField) {
+        return {
+            containerId: specField.containerId,
+            label: specField.label,
+            maxDisplay: 10
+        };
+    }
+    
+    // Fixed settings for the category fields
+    const categoryConfigs = {
+        'category1_name': { containerId: 'category1Tags', label: 'Category', maxDisplay: 10 },
+        'category2_name': { containerId: 'category2Tags', label: 'Sub Category', maxDisplay: 10 },
+        'category3_name': { containerId: 'category3Tags', label: 'Third Category', maxDisplay: 10 }
+    };
+    
+    return categoryConfigs[facetField] || null;
+}
+
+// Get the list of all configured tenant_id values
+function getAvailableTenantIds() {
+    try {
+        if (typeof TENANT_FACETS_CONFIG === 'undefined') {
+            console.error('TENANT_FACETS_CONFIG is not defined');
+            return [];
+        }
+        if (!TENANT_FACETS_CONFIG || typeof TENANT_FACETS_CONFIG !== 'object') {
+            console.error('TENANT_FACETS_CONFIG is invalid:', typeof TENANT_FACETS_CONFIG);
+            return [];
+        }
+        const keys = Object.keys(TENANT_FACETS_CONFIG);
+        console.log('TENANT_FACETS_CONFIG keys:', keys, 'Config:', TENANT_FACETS_CONFIG);
+        return keys;
+    } catch (e) {
+        console.error('Error in getAvailableTenantIds:', e);
+        return [];
+    }
+}
+
@@ -0,0 +1,63 @@
+#!/bin/bash
+#
+# Start CLIP vector service (clip-server) in an independent environment.
+#
+# This service is designed to be a drop-in alternative to the local
+# `embeddings` service, but runs in its own Python environment and depends
+# on `jina` via `clip-server`.
+#
+set -e
+
+cd "$(dirname "$0")/.."
+
+LOG_DIR="/home/tw/SearchEngine/logs"
+mkdir -p "${LOG_DIR}"
+PID_FILE="${LOG_DIR}/clip_service.pid"
+LOG_FILE="${LOG_DIR}/clip_service.log"
+
+echo "========================================"
+echo "Starting CLIP vector service (clip-server)"
+echo "========================================"
+
+# Load conda and activate dedicated environment, if available
+if [ -f "/home/tw/miniconda3/etc/profile.d/conda.sh" ]; then
+  # shellcheck disable=SC1091
+  source /home/tw/miniconda3/etc/profile.d/conda.sh
+  conda activate clip_service || {
+    echo "Failed to activate conda env 'clip_service'. Please create it first." >&2
+    echo "See CLIP_SERVICE_README.md for setup instructions." >&2
+    exit 1
+  }
+else
+  echo "Warning: /home/tw/miniconda3/etc/profile.d/conda.sh not found." >&2
+  echo "Please activate the 'clip_service' environment manually before running this script." >&2
+fi
+
+if [ -f "${PID_FILE}" ]; then
+  EXISTING_PID="$(cat "${PID_FILE}")"
+  if ps -p "${EXISTING_PID}" > /dev/null 2>&1; then
+    echo "clip-server already appears to be running with PID ${EXISTING_PID}."
+    echo "If this is incorrect, remove ${PID_FILE} and try again."
+    exit 0
+  else
+    echo "Stale PID file found at ${PID_FILE}, removing..."
+    rm -f "${PID_FILE}"
+  fi
+fi
+
+echo "Log file: ${LOG_FILE}"
+echo "PID file: ${PID_FILE}"
+echo
+echo "Starting clip-server in background..."
+
+nohup python -m clip_server > "${LOG_FILE}" 2>&1 &
+SERVICE_PID=$!
+echo "${SERVICE_PID}" > "${PID_FILE}"
+
+echo "clip-server started with PID ${SERVICE_PID}."
+echo "You can check logs with:"
+echo "  tail -f ${LOG_FILE}"
+
+
+
+
@@ -0,0 +1,48 @@
+#!/bin/bash
+#
+# Stop CLIP vector service (clip-as-service) started by start_clip_service.sh
+#
+set -e
+
+LOG_DIR="/home/tw/SearchEngine/logs"
+PID_FILE="${LOG_DIR}/clip_service.pid"
+
+echo "========================================"
+echo "Stopping CLIP vector service (clip-as-service)"
+echo "========================================"
+
+if [ ! -f "${PID_FILE}" ]; then
+  echo "No PID file found at ${PID_FILE}."
+  echo "clip-as-service may not be running (or was not started via start_clip_service.sh)."
+  exit 0
+fi
+
+PID="$(cat "${PID_FILE}")"
+
+if [ -z "${PID}" ]; then
+  echo "PID file exists but is empty. Removing it."
+  rm -f "${PID_FILE}"
+  exit 0
+fi
+
+if ps -p "${PID}" > /dev/null 2>&1; then
+  echo "Sending SIGTERM to clip-as-service (PID ${PID})..."
+  kill "${PID}" || true
+  sleep 1
+
+  if ps -p "${PID}" > /dev/null 2>&1; then
+    echo "Process still alive, sending SIGKILL..."
+    kill -9 "${PID}" || true
+  fi
+
+  echo "clip-as-service (PID ${PID}) has been stopped."
+else
+  echo "No process with PID ${PID} found. Assuming it's already stopped."
+fi
+
+rm -f "${PID_FILE}"
+echo "PID file removed: ${PID_FILE}"
+
+
+
+
@@ -0,0 +1,176 @@
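+# Frontend Facet Configuration
+
+## Problem
+
+Facets come back empty for tenant_id=170, for two reasons:
+1. The `category1_name` field is None in the data (a data issue)
+2. The data stores `specifications.name` values with an initial capital (e.g. "Color", "Size"), while the frontend queries in lowercase ("color", "size"), so the ES term query matches nothing
+
+## Solution
+
+Facets are configured on the frontend, with a separate set of facet fields per `tenant_id`. Each entry specifies:
+- **Field name** (field): the actual field name in ES, e.g. `specifications.Color`
+- **Display label** (label): the name shown in the UI, e.g. "Color", "Size"
+- **Container ID** (containerId): the ID of the HTML container that displays the facet, e.g. `colorTags`
+- **Query parameters**: size, type, disjunctive, etc.
+
+## Configuration File
+
+Location: `frontend/static/js/tenant_facets_config.js`
+
+### Structure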
+
+```javascript
+const TENANT_FACETS_CONFIG = {
+    "<tenant_id>": {
+        specificationFields: [
+            { 
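+                field: "specifications.<FieldName>", // ES field name (must match the stored data exactly, including case)
+                label: "<Display label>",             // name shown in the UI
+                containerId: "<containerId>",         // HTML container ID
+                size: 20,                             // number of facet values to return
+                type: "terms",                        // facet type
+                disjunctive: true                     // whether multi-select is supported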
+            }
+        ]
+    }
+};
+```
+
+### Example Configurations
+
+#### tenant_id=162 (lowercase names)
+```javascript
+"162": {
+    specificationFields: [
+        { 
+            field: "specifications.color", 
+            label: "Color",
+            containerId: "colorTags",
+            size: 20, 
+            type: "terms", 
+            disjunctive: true 
+        },
+        { 
+            field: "specifications.size", 
+            label: "Size",
+            containerId: "sizeTags",
+            size: 15, 
+            type: "terms", 
+            disjunctive: true 
+        },
+        { 
+            field: "specifications.material", 
+            label: "Material",
+            containerId: "materialTags",
+            size: 10, 
+            type: "terms", 
+            disjunctive: true 
+        }
+    ]
+}
+```
+
+#### tenant_id=170 (capitalized names, no material)
+```javascript
+"170": {
+    specificationFields: [
+        { 
+            field: "specifications.Color",    // note: capitalized
+            label: "Color",
+            containerId: "colorTags",
+            size: 20, 
+            type: "terms", 
+            disjunctive: true 
+        },
+        { 
+            field: "specifications.Size",     // note: capitalized
+            label: "Size",
+            containerId: "sizeTags",
+            size: 15, 
+            type: "terms", 
+            disjunctive: true 
+        }
+        // Note: tenant 170 has no material facet
+    ]
+}
+```
+
+#### Example: adding a new tenant (with other specification fields such as weight and package type)
+```javascript
+"<new_tenant_id>": {
+    specificationFields: [
+        { 
+            field: "specifications.Weight",        // weight
+            label: "Weight",
+            containerId: "weightTags",              // must exist in the page; renderFacetsPanel() in app.js creates it from this config
+            size: 15, 
+            type: "terms", 
+            disjunctive: true 
+        },
+        { 
+            field: "specifications.PackageType",   // package type
+            label: "Package Type",
+            containerId: "packageTags",            // must exist in the page; renderFacetsPanel() in app.js creates it from this config
+            size: 10, 
+            type: "terms", 
+            disjunctive: true 
+        }
+    ]
+}
+```
+
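+## Adding a New Tenant Configuration
+
+1. **Determine the actual field names in the ES data**
+   - Check the actual values of `specifications.name` in ES (mind the case)
+   - For example, `"Color"` and `"color"` are different fields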
+
+2. **Add an entry to the configuration file**
+   ```javascript
+   "<new_tenant_id>": {
+       specificationFields: [
+           { 
+               field: "specifications.<ActualFieldName>",
+               label: "<Display label>",
+               containerId: "<containerId>",
+               size: 20, 
+               type: "terms", 
+               disjunctive: true 
+           }
+       ]
+   }
+   ```
+
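+3. **Add a container to the HTML** (only if a new container is needed)
+   `renderFacetsPanel()` in `app.js` generates a row with the configured `containerId` for every entry in `specificationFields`, so this step should only be needed for a facet rendered outside `specificationFacetsContainer`. In that case, add the following to the Filter Section of `frontend/index.html`:
+   ```html
+   <div class="filter-row">
+       <div class="filter-label">Display label:</div>
+       <div class="filter-tags" id="containerId"></div>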
+   </div>
+   ```
+
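+## Code Changes
+
+### 1. Configuration file (`tenant_facets_config.js`)
+- Added the `label` and `containerId` fields
+- Added `getFacetDisplayConfig()`, which looks up the display configuration by field name
+
+### 2. Frontend code (`app.js`)
+- `performSearch()`: builds the facet query parameters from the configuration file
+- `displayFacets()`: uses the configuration to match each facet field to its container
+
+### 3. HTML (`index.html`)
+- Loads the configuration file `tenant_facets_config.js` before `app.js`
+
+## Notes
+
+1. **Field names must match exactly**: `field` must match the `specifications.name` value stored in ES exactly, including case
+2. **The container ID must exist**: `containerId` must exist in the HTML, otherwise the facet cannot be displayed
+3. **No backend changes are needed**: the backend queries with the field names passed in by the frontend
+4. **Facet values are loaded dynamically**: they appear only after a search returns, as required
+
+## Data Issue
+
+For tenant_id=170, `category1_name` is None for every document in ES, so that category facet comes back empty. This has to be fixed at indexing time by making sure `category1_name` is parsed and populated correctly.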