Commit 4d824a771eb23ed68b1889e01da8e40151f3b226

Authored by tangwang
1 parent fb68a0ef

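All tenants share one unified configuration. The tenant ID exists only at the request level; the service level has no tenant-specific configuration.

Create the unified configuration file config/config.yaml (migrated from the base configuration, with customer_name removed).

Create the script suite:
scripts for start, stop, restart, writing mock data into MySQL, and ingesting data from MySQL into ES
restart.sh
run.sh: internally calls the scripts that start the backend and frontend
scripts/mock_data.sh: mock data -> MySQL
scripts/ingest.sh: MySQL -> ES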
.cursor/plans/所有租户共用一套统一配置.tenantID只在请求层级.服务层级没有tenantID相关的独立配置.md 0 → 100644
... ... @@ -0,0 +1,342 @@
  1 +<!-- d9d0ef58-7a33-4ef6-8e3a-714b4552fd20 56c8cc4b-4eeb-4b77-9986-f4fa349e96b9 -->
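  2 +# Multi-Tenant Architecture Refactoring Plan
  3 +
  4 +## Overview
  5 +
  6 +Convert the search service from per-tenant startup to a true multi-tenant architecture:
  7 +
  8 +- The service starts without a tenant ID; all tenants share one configuration
  9 +- Delete the customer1 configuration, drop the base level, and consolidate into config.yaml
  10 +- Unify the script interface: start, stop, restart, data ingestion
  11 +- Unify the data ingestion flow; ES holds a single index
  12 +- The frontend accepts a tenant ID in an input to the left of the search box
  13 +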
  14 +## Phase 1: Configuration File Restructuring
  15 +
  16 +### 1.1 Create the unified configuration file
  17 +
  18 +**File**: `config/config.yaml` (NEW)
  19 +
  20 +- Move `config/schema/base/config.yaml` to `config/config.yaml`
  21 +- Delete the `customer_name` field (no longer needed)
  22 +- Delete the `customer_id` logic
  23 +- Fix the index name as `search_products`
  24 +- Make sure the `tenant_id` field is present (required)
  25 +
  26 +### 1.2 Delete the customer1 configuration
  27 +
  28 +**Files to delete**:
  29 +
  30 +- `config/schema/customer1/config.yaml`
  31 +- The `config/schema/customer1/` directory (if empty)
  32 +
  33 +### 1.3 Update ConfigLoader
  34 +
  35 +**File**: `config/config_loader.py`
  36 +
  37 +Modify the `load_customer_config()` method:
  38 +
  39 +- Remove the `customer_id` parameter
  40 +- Rename it to `load_config()`
  41 +- Load `config/config.yaml` directly
  42 +- Remove the lookup of `config/schema/{customer_id}/config.yaml`
  43 +- Remove validation of the `customer_id` field
  44 +- Update the `CustomerConfig` class: remove the `customer_id` field
  45 +
  46 +### 1.4 Update configuration validation
  47 +
  48 +**File**: `config/config_loader.py`
  49 +
  50 +Modify the `validate_config()` method:
  51 +
  52 +- Ensure the `tenant_id` field exists and is marked required
  53 +- Remove validation of `customer_id`
  54 +
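  55 +## Phase 2: Service Startup Changes
  56 +
  57 +### 2.1 Update API application initialization
  58 +
  59 +**File**: `api/app.py`
  60 +
  61 +Modify the `init_service()` function:
  62 +
  63 +- Remove the `customer_id` parameter
  64 +- Load the unified configuration (`config/config.yaml`) directly
  65 +- Remove the dependency on the `CUSTOMER_ID` environment variable
  66 +- Update log output (no longer prints customer_id)
  67 +
  68 +Modify the `startup_event()` function:
  69 +
  70 +- Stop reading the `CUSTOMER_ID` environment variable
  71 +- Call `init_service()` directly, without passing a customer argument
  72 +
  73 +### 2.2 Update main.py
  74 +
  75 +**File**: `main.py`
  76 +
  77 +Modify the `cmd_serve()` function:
  78 +
  79 +- Remove the `--customer` argument
  80 +- Stop setting the `CUSTOMER_ID` environment variable
  81 +- Update the help text
  82 +
  83 +### 2.3 Update startup scripts
  84 +
  85 +**File**: `scripts/start_backend.sh`
  86 +
  87 +Changes:
  88 +
  89 +- Remove the `CUSTOMER_ID` environment variable
  90 +- Remove the `--customer` argument
  91 +- Simplify the startup command
  92 +
  93 +**File**: `scripts/start_servers.py`
  94 +
  95 +Modify the `start_api_server()` function:
  96 +
  97 +- Remove the `customer` parameter
  98 +- Stop setting the `CUSTOMER_ID` environment variable
  99 +- Simplify the startup command
  100 +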
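  101 +## Phase 3: Unified Script Suite
  102 +
  103 +### 3.1 Create the unified start script
  104 +
  105 +**File**: `scripts/start.sh` (NEW)
  106 +
  107 +Responsibilities:
  108 +
  109 +- Start the backend service (calls `scripts/start_backend.sh`)
  110 +- Start the frontend service (calls `scripts/start_frontend.sh`)
  111 +- Wait for the services to become ready
  112 +- Print service status and access URLs
  113 +
  114 +### 3.2 Create the unified stop script
  115 +
  116 +**File**: `scripts/stop.sh` (exists, needs updating)
  117 +
  118 +Responsibilities:
  119 +
  120 +- Stop the backend service (port 6002)
  121 +- Stop the frontend service (port 6003)
  122 +- Clean up PID files
  123 +- Print stop status
  124 +
  125 +### 3.3 Create the unified restart script
  126 +
  127 +**File**: `scripts/restart.sh` (exists, needs updating)
  128 +
  129 +Responsibilities:
  130 +
  131 +- Call `scripts/stop.sh` to stop the services
  132 +- Wait until the services have fully stopped
  133 +- Call `scripts/start.sh` to start the services
  134 +
  135 +### 3.4 Create the data ingestion script
  136 +
  137 +**File**: `scripts/ingest.sh` (exists, needs updating)
  138 +
  139 +Responsibilities:
  140 +
  141 +- Read data from MySQL
  142 +- Transform the data format (handle the base and customer1 sources uniformly)
  143 +- Ingest into the ES index `search_products`
  144 +- Support filtering data by a given tenant ID
  145 +- Handle field mapping automatically: missing fields are randomly generated, extra fields are ignored
  146 +
  147 +### 3.5 Create the mock data script
  148 +
  149 +**File**: `scripts/mock_data.sh` (NEW)
  150 +
  151 +Responsibilities:
  152 +
  153 +- Generate test data into MySQL
  154 +- Support specifying a tenant ID
  155 +- Support specifying the data volume
  156 +- Call `scripts/generate_test_data.py` and `scripts/import_test_data.py`
  157 +
  158 +### 3.6 Update root-level scripts
  159 +
  160 +**File**: `run.sh` (exists, needs updating)
  161 +
  162 +Responsibilities:
  163 +
  164 +- Call `scripts/start.sh` to start the services
  165 +
  166 +**File**: `restart.sh` (exists, needs updating)
  167 +
  168 +Responsibilities:
  169 +
  170 +- Call `scripts/restart.sh` to restart the services
  171 +
  172 +**File**: `setup.sh` (exists, needs updating)
  173 +
  174 +Responsibilities:
  175 +
  176 +- Set up the environment
  177 +- Check dependencies
  178 +- Contains no service startup logic
  179 +
  180 +**File**: `test_all.sh` (exists, needs updating)
  181 +
  182 +Responsibilities:
  183 +
  184 +- Run the full test flow
  185 +- Covers data ingestion, service startup, and API tests
  186 +
  187 +### 3.7 Remove obsolete scripts
  188 +
  189 +**Files to delete**:
  190 +
  191 +- `scripts/demo_base.sh`
  192 +- `scripts/stop_base.sh`
  193 +- `scripts/start_test_environment.sh`
  194 +- `scripts/stop_test_environment.sh`
  195 +- Other scripts that are no longer needed
  196 +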
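  197 +## Phase 4: Unified Data Ingestion
  198 +
  199 +### 4.1 Update the ingestion script
  200 +
  201 +**File**: `scripts/ingest_shoplazza.py`
  202 +
  203 +Changes:
  204 +
  205 +- Remove the `--config` argument (no longer needed)
  206 +- Load the unified configuration (`config/config.yaml`) directly
  207 +- Handle all data sources uniformly (no distinction between base and customer1)
  208 +- Support a `--tenant-id` argument to filter data
  209 +- Field-mapping logic:
  210 +  - If a field is in the configuration but missing from the data source, generate it randomly
  211 +  - If a field is in the data source but not in the configuration, ignore it
  212 +- Make sure the `tenant_id` field is set correctly
  213 +
  214 +### 4.2 Update the data transformer
  215 +
  216 +**File**: `indexer/spu_transformer.py`
  217 +
  218 +Changes:
  219 +
  220 +- Remove the dependency on `customer_id` from the configuration
  221 +- Handle all data sources uniformly
  222 +- Make sure field mapping is correct (missing fields randomly generated, extra fields ignored)
  223 +
  224 +### 4.3 Unify test data generation
  225 +
  226 +**File**: `scripts/generate_test_data.py`
  227 +
  228 +Changes:
  229 +
  230 +- Generate test data that matches the unified index structure
  231 +- Support specifying a tenant ID
  232 +- Make sure generated data contains all required fields
  233 +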
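  234 +## Phase 5: Frontend Changes
  235 +
  236 +### 5.1 Update the frontend HTML
  237 +
  238 +**File**: `frontend/index.html`
  239 +
  240 +Changes:
  241 +
  242 +- Add a tenant ID input to the left of the search box
  243 +- Add a tenant ID label
  244 +- Update the layout styles
  245 +
  246 +### 5.2 Update the frontend JavaScript
  247 +
  248 +**File**: `frontend/static/js/app_base.js`
  249 +
  250 +Changes:
  251 +
  252 +- Remove the hard-coded `TENANT_ID = '1'`
  253 +- Read the tenant ID from the input
  254 +- Send the tenant ID with search requests (via the `X-Tenant-ID` header)
  255 +- Validate the tenant ID (must not be empty)
  256 +- Update the UI display
  257 +
  258 +### 5.3 Update the frontend CSS
  259 +
  260 +**File**: `frontend/static/css/style.css`
  261 +
  262 +Changes:
  263 +
  264 +- Add styles for the tenant ID input
  265 +- Update the search bar layout (to fit the tenant ID input)
  266 +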
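  267 +## Phase 6: Documentation and Test Updates
  268 +
  269 +### 6.1 Update the README
  270 +
  271 +**File**: `README.md`
  272 +
  273 +Changes:
  274 +
  275 +- Update the startup instructions (a tenant ID is no longer specified)
  276 +- Update the configuration notes (unified configuration file)
  277 +- Update the script usage notes
  278 +
  279 +### 6.2 Update the API documentation
  280 +
  281 +**File**: `API_DOCUMENTATION.md`
  282 +
  283 +Changes:
  284 +
  285 +- Update the tenant ID notes (it must be supplied with the request)
  286 +- Update the configuration notes (unified configuration)
  287 +
  288 +### 6.3 Update the test script
  289 +
  290 +**File**: `test_all.sh`
  291 +
  292 +Changes:
  293 +
  294 +- Update the test flow (a tenant ID is no longer specified)
  295 +- Update the ingestion test (unified data source)
  296 +- Update the API tests (include the tenant ID parameter)
  297 +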
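  298 +## Phase 7: Cleanup and Verification
  299 +
  300 +### 7.1 Remove obsolete code
  301 +
  302 +- Delete all references to `customer_id`
  303 +- Delete all references to the `customer1` configuration
  304 +- Delete all references to the `base` configuration level
  305 +- Remove scripts that are no longer used
  306 +
  307 +### 7.2 Verify functionality
  308 +
  309 +- Verify service startup (no tenant ID specified)
  310 +- Verify configuration loading (unified configuration)
  311 +- Verify data ingestion (unified data source)
  312 +- Verify search (tenant ID supplied with the request)
  313 +- Verify the frontend (tenant ID input)
  314 +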
  315 +## Key File Checklist
  316 +
  317 +### Files to modify:
  318 +
  319 +1. `config/config_loader.py` - remove customer_id logic
  320 +2. `config/config.yaml` - unified configuration file (moved from base)
  321 +3. `api/app.py` - remove the customer_id parameter
  322 +4. `main.py` - remove the customer argument
  323 +5. `scripts/start_backend.sh` - remove CUSTOMER_ID
  324 +6. `scripts/start_servers.py` - remove the customer parameter
  325 +7. `scripts/ingest_shoplazza.py` - unified data ingestion
  326 +8. `frontend/index.html` - add the tenant ID input
  327 +9. `frontend/static/js/app_base.js` - read the tenant ID
  328 +10. `run.sh`, `restart.sh`, `setup.sh`, `test_all.sh` - update scripts
  329 +
  330 +### Files to delete:
  331 +
  332 +1. `config/schema/customer1/config.yaml`
  333 +2. The `config/schema/customer1/` directory
  334 +3. `scripts/demo_base.sh`
  335 +4. `scripts/stop_base.sh`
  336 +5. Other obsolete scripts
  337 +
  338 +### Files to create:
  339 +
  340 +1. `config/config.yaml` - unified configuration file
  341 +2. `scripts/start.sh` - unified start script
  342 +3. `scripts/mock_data.sh` - mock data script
0 343 \ No newline at end of file
... ...
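The field-mapping rule in Phase 4.1 of the plan (configured fields missing from the source are randomly generated, source columns not in the configuration are ignored, and `tenant_id` is always set) can be sketched roughly as below. This is an illustrative sketch only: `random_value` and `map_row` are hypothetical names, and the real logic in `scripts/ingest_shoplazza.py` and `indexer/spu_transformer.py` is not part of this diff.

```python
import random
import string
from typing import Any, Dict, List


def random_value(field_type: str) -> Any:
    """Placeholder for a field the source row lacks (hypothetical helper)."""
    if field_type == "FLOAT":
        return round(random.uniform(1.0, 100.0), 2)
    # KEYWORD, TEXT and any other type get a short random string.
    return "".join(random.choices(string.ascii_lowercase, k=8))


def map_row(row: Dict[str, Any], fields: List[Dict[str, str]], tenant_id: str) -> Dict[str, Any]:
    """Build one ES document from a source row using the configured field list."""
    doc: Dict[str, Any] = {}
    for field in fields:
        name = field["name"]
        if name in row:
            doc[name] = row[name]
        else:
            doc[name] = random_value(field["type"])
    # Source columns that are not listed in `fields` are never copied,
    # which is how "extra fields are ignored" falls out of the loop above.
    doc["tenant_id"] = tenant_id  # always set explicitly, never randomised
    return doc
```

Driving the loop from the configured field list rather than from the source row keeps every document aligned with the single `search_products` mapping, whichever data source the row came from.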
api/app.py
... ... @@ -51,28 +51,27 @@ _searcher: Optional[Searcher] = None
51 51 _query_parser: Optional[QueryParser] = None
52 52  
53 53  
54   -def init_service(customer_id: str = "customer1", es_host: str = "http://localhost:9200"):
  54 +def init_service(es_host: str = "http://localhost:9200"):
55 55 """
56   - Initialize search service with configuration.
  56 + Initialize search service with unified configuration.
57 57  
58 58 Args:
59   - customer_id: Customer configuration ID
60 59 es_host: Elasticsearch host URL
61 60 """
62 61 global _config, _es_client, _searcher, _query_parser
63 62  
64   - print(f"Initializing search service for customer: {customer_id}")
  63 + print("Initializing search service (multi-tenant)")
65 64  
66   - # Load configuration
67   - config_loader = ConfigLoader("config/schema")
68   - _config = config_loader.load_customer_config(customer_id)
  65 + # Load unified configuration
  66 + config_loader = ConfigLoader("config/config.yaml")
  67 + _config = config_loader.load_config()
69 68  
70 69 # Validate configuration
71 70 errors = config_loader.validate_config(_config)
72 71 if errors:
73 72 raise ValueError(f"Configuration validation failed: {errors}")
74 73  
75   - print(f"Configuration loaded: {_config.customer_name}")
  74 + print(f"Configuration loaded: {_config.es_index_name}")
76 75  
77 76 # Get ES credentials from environment variables or .env file
78 77 es_username = os.getenv('ES_USERNAME')
... ... @@ -113,7 +112,7 @@ def init_service(customer_id: str = "customer1", es_host: str = "http://localhos
113 112  
114 113  
115 114 def get_config() -> CustomerConfig:
116   - """Get customer configuration."""
  115 + """Get search engine configuration."""
117 116 if _config is None:
118 117 raise RuntimeError("Service not initialized")
119 118 return _config
... ... @@ -184,15 +183,13 @@ app.add_middleware(
184 183 @app.on_event("startup")
185 184 async def startup_event():
186 185 """Initialize service on startup."""
187   - customer_id = os.getenv("CUSTOMER_ID", "customer1")
188 186 es_host = os.getenv("ES_HOST", "http://localhost:9200")
189 187  
190   - logger.info(f"Starting E-Commerce Search API")
191   - logger.info(f"Customer ID: {customer_id}")
  188 + logger.info("Starting E-Commerce Search API (Multi-Tenant)")
192 189 logger.info(f"Elasticsearch Host: {es_host}")
193 190  
194 191 try:
195   - init_service(customer_id=customer_id, es_host=es_host)
  192 + init_service(es_host=es_host)
196 193 logger.info("Service initialized successfully")
197 194 except Exception as e:
198 195 logger.error(f"Failed to initialize service: {e}")
... ... @@ -310,16 +307,14 @@ else:
310 307 if __name__ == "__main__":
311 308 import uvicorn
312 309  
313   - parser = argparse.ArgumentParser(description='Start search API service')
  310 + parser = argparse.ArgumentParser(description='Start search API service (multi-tenant)')
314 311 parser.add_argument('--host', default='0.0.0.0', help='Host to bind to')
315 312 parser.add_argument('--port', type=int, default=6002, help='Port to bind to')
316   - parser.add_argument('--customer', default='customer1', help='Customer ID')
317 313 parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host')
318 314 parser.add_argument('--reload', action='store_true', help='Enable auto-reload')
319 315 args = parser.parse_args()
320 316  
321   - # Set environment variables
322   - os.environ['CUSTOMER_ID'] = args.customer
  317 + # Set environment variable
323 318 os.environ['ES_HOST'] = args.es_host
324 319  
325 320 # Run server
... ...
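With `customer_id` gone from service startup, tenant isolation has to happen per request. The search route is not included in this diff, so the following is only a sketch of one way it could work, assuming the backend reads the `X-Tenant-ID` header that the frontend sends and filters on the `tenant_id` KEYWORD field from `config/config.yaml`. The function names `tenant_id_from_headers` and `build_tenant_query` are hypothetical, and the single `match` on `title` stands in for whatever query the real `Searcher` builds.

```python
from typing import Any, Dict, Mapping


def tenant_id_from_headers(headers: Mapping[str, str]) -> str:
    """Return the tenant ID from request headers, or raise if missing or blank."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {key.lower(): value for key, value in headers.items()}
    tenant_id = lowered.get("x-tenant-id", "").strip()
    if not tenant_id:
        raise ValueError("X-Tenant-ID header is required")
    return tenant_id


def build_tenant_query(tenant_id: str, text: str) -> Dict[str, Any]:
    """Wrap a text query in a bool query restricted to one tenant."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"title": text}}],
                # A term filter on the KEYWORD field restricts hits to one
                # tenant without affecting relevance scores.
                "filter": [{"term": {"tenant_id": tenant_id}}],
            }
        }
    }
```

Putting the tenant restriction in the `filter` clause rather than `must` keeps it out of scoring, which matters when all tenants share one index.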
api/models.py
... ... @@ -250,7 +250,6 @@ class HealthResponse(BaseModel):
250 250 """Health check response model."""
251 251 status: str = Field(..., description="Service status")
252 252 elasticsearch: str = Field(..., description="Elasticsearch status")
253   - customer_id: str = Field(..., description="Customer configuration ID")
254 253  
255 254  
256 255 class ErrorResponse(BaseModel):
... ...
api/routes/admin.py
... ... @@ -28,15 +28,13 @@ async def health_check():
28 28  
29 29 return HealthResponse(
30 30 status="healthy" if es_status == "connected" else "unhealthy",
31   - elasticsearch=es_status,
32   - customer_id=config.customer_id
  31 + elasticsearch=es_status
33 32 )
34 33  
35 34 except Exception as e:
36 35 return HealthResponse(
37 36 status="unhealthy",
38   - elasticsearch="error",
39   - customer_id="unknown"
  37 + elasticsearch="error"
40 38 )
41 39  
42 40  
... ... @@ -51,8 +49,6 @@ async def get_configuration():
51 49 config = get_config()
52 50  
53 51 return {
54   - "customer_id": config.customer_id,
55   - "customer_name": config.customer_name,
56 52 "es_index_name": config.es_index_name,
57 53 "num_fields": len(config.fields),
58 54 "num_indexes": len(config.indexes),
... ...
config/config.yaml 0 → 100644
... ... @@ -0,0 +1,269 @@
  1 +# Unified Configuration for Multi-Tenant Search Engine
  2 +# Unified configuration file; all tenants share one index configuration
  3 +# Note: this file contains no MySQL settings, only ES search settings
  4 +
  5 +# Elasticsearch Index
  6 +es_index_name: "search_products"
  7 +
  8 +# ES Index Settings
  9 +es_settings:
  10 + number_of_shards: 1
  11 + number_of_replicas: 0
  12 + refresh_interval: "30s"
  13 +
  14 +# Field Definitions (SPU level; only fields that help search)
  15 +fields:
  16 + # Tenant isolation field (required)
  17 + - name: "tenant_id"
  18 + type: "KEYWORD"
  19 + required: true
  20 + index: true
  21 + store: true
  22 +
  23 + # Product identifier fields
  24 + - name: "product_id"
  25 + type: "KEYWORD"
  26 + required: true
  27 + index: true
  28 + store: true
  29 +
  30 + - name: "handle"
  31 + type: "KEYWORD"
  32 + index: true
  33 + store: true
  34 +
  35 + # Text search fields
  36 + - name: "title"
  37 + type: "TEXT"
  38 + analyzer: "chinese_ecommerce"
  39 + boost: 3.0
  40 + index: true
  41 + store: true
  42 +
  43 + - name: "brief"
  44 + type: "TEXT"
  45 + analyzer: "chinese_ecommerce"
  46 + boost: 1.5
  47 + index: true
  48 + store: true
  49 +
  50 + - name: "description"
  51 + type: "TEXT"
  52 + analyzer: "chinese_ecommerce"
  53 + boost: 1.0
  54 + index: true
  55 + store: true
  56 +
  57 + # SEO fields (improve relevance)
  58 + - name: "seo_title"
  59 + type: "TEXT"
  60 + analyzer: "chinese_ecommerce"
  61 + boost: 2.0
  62 + index: true
  63 + store: true
  64 +
  65 + - name: "seo_description"
  66 + type: "TEXT"
  67 + analyzer: "chinese_ecommerce"
  68 + boost: 1.5
  69 + index: true
  70 + store: true
  71 +
  72 + - name: "seo_keywords"
  73 + type: "TEXT"
  74 + analyzer: "chinese_ecommerce"
  75 + boost: 2.0
  76 + index: true
  77 + store: true
  78 +
  79 + # Category and tag fields (indexed as both TEXT and KEYWORD)
  80 + - name: "vendor"
  81 + type: "TEXT"
  82 + analyzer: "chinese_ecommerce"
  83 + boost: 1.5
  84 + index: true
  85 + store: true
  86 +
  87 + - name: "vendor_keyword"
  88 + type: "KEYWORD"
  89 + index: true
  90 + store: false
  91 +
  92 + - name: "product_type"
  93 + type: "TEXT"
  94 + analyzer: "chinese_ecommerce"
  95 + boost: 1.5
  96 + index: true
  97 + store: true
  98 +
  99 + - name: "product_type_keyword"
  100 + type: "KEYWORD"
  101 + index: true
  102 + store: false
  103 +
  104 + - name: "tags"
  105 + type: "TEXT"
  106 + analyzer: "chinese_ecommerce"
  107 + boost: 1.0
  108 + index: true
  109 + store: true
  110 +
  111 + - name: "tags_keyword"
  112 + type: "KEYWORD"
  113 + index: true
  114 + store: false
  115 +
  116 + - name: "category"
  117 + type: "TEXT"
  118 + analyzer: "chinese_ecommerce"
  119 + boost: 1.5
  120 + index: true
  121 + store: true
  122 +
  123 + - name: "category_keyword"
  124 + type: "KEYWORD"
  125 + index: true
  126 + store: false
  127 +
  128 + # Price fields (flattened)
  129 + - name: "min_price"
  130 + type: "FLOAT"
  131 + index: true
  132 + store: true
  133 +
  134 + - name: "max_price"
  135 + type: "FLOAT"
  136 + index: true
  137 + store: true
  138 +
  139 + - name: "compare_at_price"
  140 + type: "FLOAT"
  141 + index: true
  142 + store: true
  143 +
  144 + # Image field (display only, not searchable)
  145 + - name: "image_url"
  146 + type: "KEYWORD"
  147 + index: false
  148 + store: true
  149 +
  150 + # Nested variants field
  151 + - name: "variants"
  152 + type: "JSON"
  153 + nested: true
  154 + nested_properties:
  155 + variant_id:
  156 + type: "keyword"
  157 + index: true
  158 + store: true
  159 + title:
  160 + type: "text"
  161 + analyzer: "chinese_ecommerce"
  162 + index: true
  163 + store: true
  164 + price:
  165 + type: "float"
  166 + index: true
  167 + store: true
  168 + compare_at_price:
  169 + type: "float"
  170 + index: true
  171 + store: true
  172 + sku:
  173 + type: "keyword"
  174 + index: true
  175 + store: true
  176 + stock:
  177 + type: "long"
  178 + index: true
  179 + store: true
  180 + options:
  181 + type: "object"
  182 + enabled: true
  183 +
  184 +# Index Structure (Query Domains)
  185 +indexes:
  186 + - name: "default"
  187 + label: "默认索引"
  188 + fields:
  189 + - "title"
  190 + - "brief"
  191 + - "description"
  192 + - "seo_title"
  193 + - "seo_description"
  194 + - "seo_keywords"
  195 + - "vendor"
  196 + - "product_type"
  197 + - "tags"
  198 + - "category"
  199 + analyzer: "chinese_ecommerce"
  200 + boost: 1.0
  201 +
  202 + - name: "title"
  203 + label: "标题索引"
  204 + fields:
  205 + - "title"
  206 + - "seo_title"
  207 + analyzer: "chinese_ecommerce"
  208 + boost: 2.0
  209 +
  210 + - name: "vendor"
  211 + label: "品牌索引"
  212 + fields:
  213 + - "vendor"
  214 + analyzer: "chinese_ecommerce"
  215 + boost: 1.5
  216 +
  217 + - name: "category"
  218 + label: "类目索引"
  219 + fields:
  220 + - "category"
  221 + analyzer: "chinese_ecommerce"
  222 + boost: 1.5
  223 +
  224 + - name: "tags"
  225 + label: "标签索引"
  226 + fields:
  227 + - "tags"
  228 + - "seo_keywords"
  229 + analyzer: "chinese_ecommerce"
  230 + boost: 1.0
  231 +
  232 +# Query Configuration
  233 +query_config:
  234 + supported_languages:
  235 + - "zh"
  236 + - "en"
  237 + default_language: "zh"
  238 + enable_translation: true
  239 + enable_text_embedding: true
  240 + enable_query_rewrite: true
  241 +
  242 + # Translation API (DeepL)
  243 + translation_service: "deepl"
  244 + translation_api_key: null # Set via environment variable
  245 +
  246 +# Ranking Configuration
  247 +ranking:
  248 + expression: "bm25() + 0.2*text_embedding_relevance()"
  249 + description: "BM25 text relevance combined with semantic embedding similarity"
  250 +
  251 +# Function Score configuration (ES-level scoring rules)
  252 +function_score:
  253 + score_mode: "sum"
  254 + boost_mode: "multiply"
  255 +
  256 + functions: []
  257 +
  258 +# Rerank configuration (local reranking, currently disabled)
  259 +rerank:
  260 + enabled: false
  261 + expression: ""
  262 + description: "Local reranking (disabled, use ES function_score instead)"
  263 +
  264 +# SPU configuration (enabled, uses nested variants)
  265 +spu_config:
  266 + enabled: true
  267 + spu_field: "product_id"
  268 + inner_hits_size: 10
  269 +
... ...
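Plan section 1.4 asks that validation ensure `tenant_id` exists and is required. The body of `validate_config()` is not shown in this diff, so the sketch below is an assumption about what such a check might look like on the parsed YAML dictionary. `check_config` is a hypothetical name, and the extra check that every index references a defined field is an addition for illustration, not something the plan specifies.

```python
from typing import Any, Dict, List


def check_config(config_data: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a parsed config dict (empty if none)."""
    errors: List[str] = []
    fields = {f.get("name"): f for f in config_data.get("fields", [])}

    tenant = fields.get("tenant_id")
    if tenant is None:
        errors.append("field 'tenant_id' is missing")
    elif not tenant.get("required", False):
        errors.append("field 'tenant_id' must set required: true")

    # Every field listed under an index (query domain) must be defined above.
    for index in config_data.get("indexes", []):
        for name in index.get("fields", []):
            if name not in fields:
                errors.append(
                    f"index '{index.get('name')}' references unknown field '{name}'"
                )
    return errors
```

Returning a list of messages instead of raising on the first problem matches how `init_service()` consumes `validate_config()`: it collects `errors` and raises once if the list is non-empty.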
config/config_loader.py
... ... @@ -86,10 +86,7 @@ class RerankConfig:
86 86  
87 87 @dataclass
88 88 class CustomerConfig:
89   - """Complete configuration for a customer."""
90   - customer_id: str
91   - customer_name: str
92   -
  89 + """Complete configuration for search engine (multi-tenant)."""
93 90 # Field definitions
94 91 fields: List[FieldConfig]
95 92  
... ... @@ -122,22 +119,20 @@ class ConfigurationError(Exception):
122 119  
123 120  
124 121 class ConfigLoader:
125   - """Loads and validates customer configurations from YAML files."""
  122 + """Loads and validates unified search engine configuration from YAML file."""
126 123  
127   - def __init__(self, config_dir: str = "config/schema"):
128   - self.config_dir = Path(config_dir)
  124 + def __init__(self, config_file: str = "config/config.yaml"):
  125 + self.config_file = Path(config_file)
129 126  
130   - def _load_rewrite_dictionary(self, customer_id: str) -> Dict[str, str]:
  127 + def _load_rewrite_dictionary(self) -> Dict[str, str]:
131 128 """
132 129 Load query rewrite dictionary from external file.
133 130  
134   - Args:
135   - customer_id: Customer identifier
136   -
137 131 Returns:
138 132 Dictionary mapping query terms to rewritten queries
139 133 """
140   - dict_file = self.config_dir / customer_id / "query_rewrite.dict"
  134 + # Try config/query_rewrite.dict first
  135 + dict_file = self.config_file.parent / "query_rewrite.dict"
141 136  
142 137 if not dict_file.exists():
143 138 # Dictionary file is optional, return empty dict if not found
... ... @@ -166,16 +161,9 @@ class ConfigLoader:
166 161  
167 162 return rewrite_dict
168 163  
169   - def load_customer_config(self, customer_id: str) -> CustomerConfig:
  164 + def load_config(self) -> CustomerConfig:
170 165 """
171   - Load customer configuration from YAML file.
172   -
173   - Supports two directory structures:
174   - 1. New structure: config/schema/{customer_id}/config.yaml
175   - 2. Old structure: config/schema/{customer_id}_config.yaml (for backward compatibility)
176   -
177   - Args:
178   - customer_id: Customer identifier (used to find config file)
  166 + Load unified configuration from YAML file.
179 167  
180 168 Returns:
181 169 CustomerConfig object
... ... @@ -183,25 +171,18 @@ class ConfigLoader:
183 171 Raises:
184 172 ConfigurationError: If config file not found or invalid
185 173 """
186   - # Try new directory structure first
187   - config_file = self.config_dir / customer_id / "config.yaml"
188   -
189   - # Fall back to old structure if new one doesn't exist
190   - if not config_file.exists():
191   - config_file = self.config_dir / f"{customer_id}_config.yaml"
192   -
193   - if not config_file.exists():
194   - raise ConfigurationError(f"Configuration file not found: {config_file}")
  174 + if not self.config_file.exists():
  175 + raise ConfigurationError(f"Configuration file not found: {self.config_file}")
195 176  
196 177 try:
197   - with open(config_file, 'r', encoding='utf-8') as f:
  178 + with open(self.config_file, 'r', encoding='utf-8') as f:
198 179 config_data = yaml.safe_load(f)
199 180 except yaml.YAMLError as e:
200   - raise ConfigurationError(f"Invalid YAML in {config_file}: {e}")
  181 + raise ConfigurationError(f"Invalid YAML in {self.config_file}: {e}")
201 182  
202   - return self._parse_config(config_data, customer_id)
  183 + return self._parse_config(config_data)
203 184  
204   - def _parse_config(self, config_data: Dict[str, Any], customer_id: str) -> CustomerConfig:
  185 + def _parse_config(self, config_data: Dict[str, Any]) -> CustomerConfig:
205 186 """Parse configuration dictionary into CustomerConfig object."""
206 187  
207 188 # Parse fields
... ... @@ -218,7 +199,7 @@ class ConfigLoader:
218 199 query_config_data = config_data.get("query_config", {})
219 200  
220 201 # Load rewrite dictionary from external file instead of config
221   - rewrite_dictionary = self._load_rewrite_dictionary(customer_id)
  202 + rewrite_dictionary = self._load_rewrite_dictionary()
222 203  
223 204 query_config = QueryConfig(
224 205 supported_languages=query_config_data.get("supported_languages", ["zh", "en"]),
... ... @@ -263,8 +244,6 @@ class ConfigLoader:
263 244 )
264 245  
265 246 return CustomerConfig(
266   - customer_id=customer_id,
267   - customer_name=config_data.get("customer_name", customer_id),
268 247 fields=fields,
269 248 indexes=indexes,
270 249 query_config=query_config,
... ... @@ -272,7 +251,7 @@ class ConfigLoader:
272 251 function_score=function_score,
273 252 rerank=rerank,
274 253 spu_config=spu_config,
275   - es_index_name=config_data.get("es_index_name", f"search_{customer_id}"),
  254 + es_index_name=config_data.get("es_index_name", "search_products"),
276 255 es_settings=config_data.get("es_settings", {})
277 256 )
278 257  
... ... @@ -430,23 +409,21 @@ class ConfigLoader:
430 409  
431 410 def save_config(self, config: CustomerConfig, output_path: Optional[str] = None) -> None:
432 411 """
433   - Save customer configuration to YAML file.
  412 + Save configuration to YAML file.
434 413  
435 414 Note: rewrite_dictionary is saved separately to query_rewrite.dict file
436 415  
437 416 Args:
438 417 config: Configuration to save
439   - output_path: Optional output path (defaults to new directory structure)
  418 + output_path: Optional output path (defaults to config/config.yaml)
440 419 """
441 420 if output_path is None:
442   - # Use new directory structure by default
443   - customer_dir = self.config_dir / config.customer_id
444   - customer_dir.mkdir(parents=True, exist_ok=True)
445   - output_path = customer_dir / "config.yaml"
  421 + output_path = self.config_file
  422 + else:
  423 + output_path = Path(output_path)
446 424  
447 425 # Convert config back to dictionary format
448 426 config_dict = {
449   - "customer_name": config.customer_name,
450 427 "es_index_name": config.es_index_name,
451 428 "es_settings": config.es_settings,
452 429 "fields": [self._field_to_dict(field) for field in config.fields],
... ... @@ -482,23 +459,22 @@ class ConfigLoader:
482 459 }
483 460 }
484 461  
  462 + output_path.parent.mkdir(parents=True, exist_ok=True)
485 463 with open(output_path, 'w', encoding='utf-8') as f:
486 464 yaml.dump(config_dict, f, default_flow_style=False, allow_unicode=True)
487 465  
488 466 # Save rewrite dictionary to separate file
489   - self._save_rewrite_dictionary(config.customer_id, config.query_config.rewrite_dictionary)
  467 + self._save_rewrite_dictionary(config.query_config.rewrite_dictionary)
490 468  
491   - def _save_rewrite_dictionary(self, customer_id: str, rewrite_dict: Dict[str, str]) -> None:
  469 + def _save_rewrite_dictionary(self, rewrite_dict: Dict[str, str]) -> None:
492 470 """
493 471 Save rewrite dictionary to external file.
494 472  
495 473 Args:
496   - customer_id: Customer identifier
497 474 rewrite_dict: Dictionary to save
498 475 """
499   - customer_dir = self.config_dir / customer_id
500   - customer_dir.mkdir(parents=True, exist_ok=True)
501   - dict_file = customer_dir / "query_rewrite.dict"
  476 + dict_file = self.config_file.parent / "query_rewrite.dict"
  477 + dict_file.parent.mkdir(parents=True, exist_ok=True)
502 478  
503 479 with open(dict_file, 'w', encoding='utf-8') as f:
504 480 for key, value in rewrite_dict.items():
... ...
config/query_rewrite.dict 0 → 100644
... ... @@ -0,0 +1,4 @@
  1 +芭比 brand:芭比 OR name:芭比娃娃
  2 +玩具 category:玩具
  3 +消防 category:消防 OR name:消防
  4 +
... ...
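After this change the rewrite dictionary sits next to the configuration file (`config/query_rewrite.dict`), one entry per line. The parsing code is elided from the diff, so the separator is not visible here. The sketch below assumes a tab between the term and its rewrite, and falls back to the first run of whitespace when no tab is present. `parse_rewrite_dictionary` is a hypothetical name, not the loader's actual method.

```python
from typing import Dict


def parse_rewrite_dictionary(text: str) -> Dict[str, str]:
    """Parse 'term<sep>rewritten query' lines into a dict, skipping blank lines."""
    rewrite_dict: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        # Assumed format: tab-separated; otherwise split on the first whitespace run.
        parts = line.split("\t", 1) if "\t" in line else line.split(None, 1)
        if len(parts) != 2:
            continue  # a term with no rewrite is ignored
        term, rewritten = parts[0].strip(), parts[1].strip()
        if term and rewritten:
            rewrite_dict[term] = rewritten
    return rewrite_dict
```

Splitting only once matters because the rewritten query itself contains spaces, as in `brand:芭比 OR name:芭比娃娃`.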
frontend/index.html
... ... @@ -21,6 +21,10 @@
21 21  
22 22 <!-- Search Bar -->
23 23 <div class="search-bar">
  24 + <div class="tenant-input-wrapper">
  25 + <label for="tenantInput">租户ID:</label>
  26 + <input type="text" id="tenantInput" placeholder="请输入租户ID" value="1">
  27 + </div>
24 28 <input type="text" id="searchInput" placeholder="输入搜索关键词... (支持中文、英文、俄文)"
25 29 onkeypress="handleKeyPress(event)">
26 30 <button onclick="performSearch()" class="search-btn">搜索</button>
... ...
frontend/static/css/style.css
... ... @@ -69,6 +69,32 @@ body {
69 69 padding: 20px 30px;
70 70 background: white;
71 71 border-bottom: 1px solid #e0e0e0;
  72 + align-items: center;
  73 +}
  74 +
  75 +.tenant-input-wrapper {
  76 + display: flex;
  77 + align-items: center;
  78 + gap: 8px;
  79 +}
  80 +
  81 +.tenant-input-wrapper label {
  82 + font-size: 14px;
  83 + color: #666;
  84 + white-space: nowrap;
  85 +}
  86 +
  87 +#tenantInput {
  88 + width: 120px;
  89 + padding: 10px 15px;
  90 + font-size: 14px;
  91 + border: 1px solid #ddd;
  92 + border-radius: 4px;
  93 + outline: none;
  94 +}
  95 +
  96 +#tenantInput:focus {
  97 + border-color: #e74c3c;
72 98 }
73 99  
74 100 #searchInput {
... ...
frontend/static/js/app.js
1   -// SearchEngine Frontend - Modern UI
  1 +// SearchEngine Frontend - Modern UI (Multi-Tenant)
2 2  
3   -const API_BASE_URL = 'http://120.76.41.98:6002';
  3 +const API_BASE_URL = 'http://localhost:6002';
4 4 document.getElementById('apiUrl').textContent = API_BASE_URL;
5 5  
  6 +// Get tenant ID from input
  7 +function getTenantId() {
  8 + const tenantInput = document.getElementById('tenantInput');
  9 + if (tenantInput) {
  10 + return tenantInput.value.trim();
  11 + }
  12 + return '1'; // Default fallback
  13 +}
  14 +
6 15 // State Management
7 16 let state = {
8 17 query: '',
... ... @@ -42,12 +51,18 @@ function toggleFilters() {
42 51 // Perform search
43 52 async function performSearch(page = 1) {
44 53 const query = document.getElementById('searchInput').value.trim();
  54 + const tenantId = getTenantId();
45 55  
46 56 if (!query) {
47 57 alert('Please enter search keywords');
48 58 return;
49 59 }
50 60  
  61 + if (!tenantId) {
  62 + alert('Please enter tenant ID');
  63 + return;
  64 + }
  65 +
51 66 state.query = query;
52 67 state.currentPage = page;
53 68 state.pageSize = parseInt(document.getElementById('resultSize').value);
... ... @@ -57,22 +72,22 @@ async function performSearch(page = 1) {
57 72 // Define facets (simplified configuration)
58 73 const facets = [
59 74 {
60   - "field": "categoryName_keyword",
  75 + "field": "category_keyword",
61 76 "size": 15,
62 77 "type": "terms"
63 78 },
64 79 {
65   - "field": "brandName_keyword",
  80 + "field": "vendor_keyword",
66 81 "size": 15,
67 82 "type": "terms"
68 83 },
69 84 {
70   - "field": "supplierName_keyword",
  85 + "field": "tags_keyword",
71 86 "size": 10,
72 87 "type": "terms"
73 88 },
74 89 {
75   - "field": "price",
  90 + "field": "min_price",
76 91 "type": "range",
77 92 "ranges": [
78 93 {"key": "0-50", "to": 50},
... ... @@ -92,6 +107,7 @@ async function performSearch(page = 1) {
92 107 method: 'POST',
93 108 headers: {
94 109 'Content-Type': 'application/json',
  110 + 'X-Tenant-ID': tenantId,
95 111 },
96 112 body: JSON.stringify({
97 113 query: query,
... ... @@ -140,7 +156,7 @@ async function performSearch(page = 1) {
140 156 function displayResults(data) {
141 157 const grid = document.getElementById('productGrid');
142 158  
143   - if (!data.hits || data.hits.length === 0) {
  159 + if (!data.results || data.results.length === 0) {
144 160 grid.innerHTML = `
145 161 <div class="no-results" style="grid-column: 1 / -1;">
146 162 <h3>No Results Found</h3>
... ... @@ -152,16 +168,20 @@ function displayResults(data) {
152 168  
153 169 let html = '';
154 170  
155   - data.hits.forEach((hit) => {
156   - const source = hit._source;
157   - const score = hit._custom_score || hit._score;
  171 + data.results.forEach((result) => {
  172 + const product = result;
  173 + const title = product.title || product.name || 'N/A';
  174 + const price = product.min_price || product.price || 'N/A';
  175 + const imageUrl = product.image_url || product.imageUrl || '';
  176 + const category = product.category || product.categoryName || '';
  177 + const vendor = product.vendor || product.brandName || '';
158 178  
159 179 html += `
160 180 <div class="product-card">
161 181 <div class="product-image-wrapper">
162   - ${source.imageUrl ? `
163   - <img src="${escapeHtml(source.imageUrl)}"
164   - alt="${escapeHtml(source.name)}"
  182 + ${imageUrl ? `
  183 + <img src="${escapeHtml(imageUrl)}"
  184 + alt="${escapeHtml(title)}"
165 185 class="product-image"
166 186 onerror="this.src='data:image/svg+xml,%3Csvg xmlns=%22http://www.w3.org/2000/svg%22 width=%22100%22 height=%22100%22%3E%3Crect fill=%22%23f0f0f0%22 width=%22100%22 height=%22100%22/%3E%3Ctext x=%2250%25%22 y=%2250%25%22 font-size=%2214%22 text-anchor=%22middle%22 dy=%22.3em%22 fill=%22%23999%22%3ENo Image%3C/text%3E%3C/svg%3E'">
167 187 ` : `
... ... @@ -170,31 +190,17 @@ function displayResults(data) {
170 190 </div>
171 191  
172 192 <div class="product-price">
173   - ${source.price ? `${source.price} ₽` : 'N/A'}
174   - </div>
175   -
176   - <div class="product-moq">
177   - MOQ ${source.moq || 1} Box
178   - </div>
179   -
180   - <div class="product-quantity">
181   - ${source.quantity || 'N/A'} pcs / Box
  193 + ${price !== 'N/A' ? `¥${price}` : 'N/A'}
182 194 </div>
183 195  
184 196 <div class="product-title">
185   - ${escapeHtml(source.name || source.enSpuName || 'N/A')}
  197 + ${escapeHtml(title)}
186 198 </div>
187 199  
188 200 <div class="product-meta">
189   - ${source.categoryName ? escapeHtml(source.categoryName) : ''}
190   - ${source.brandName ? ' | ' + escapeHtml(source.brandName) : ''}
  201 + ${category ? escapeHtml(category) : ''}
  202 + ${vendor ? ' | ' + escapeHtml(vendor) : ''}
191 203 </div>
192   -
193   - ${source.create_time ? `
194   - <div class="product-time">
195   - Listed: ${formatDate(source.create_time)}
196   - </div>
197   - ` : ''}
198 204 </div>
199 205 `;
200 206 });
... ... @@ -211,13 +217,13 @@ function displayFacets(facets) {
211 217 let containerId = null;
212 218 let maxDisplay = 10;
213 219  
214   - if (facet.field === 'categoryName_keyword') {
  220 + if (facet.field === 'category_keyword') {
215 221 containerId = 'categoryTags';
216 222 maxDisplay = 10;
217   - } else if (facet.field === 'brandName_keyword') {
  223 + } else if (facet.field === 'vendor_keyword') {
218 224 containerId = 'brandTags';
219 225 maxDisplay = 10;
220   - } else if (facet.field === 'supplierName_keyword') {
  226 + } else if (facet.field === 'tags_keyword') {
221 227 containerId = 'supplierTags';
222 228 maxDisplay = 8;
223 229 }
... ... @@ -269,7 +275,7 @@ function toggleFilter(field, value) {
269 275 // Handle price filter (refactored version - uses rangeFilters)
270 276 function handlePriceFilter(value) {
271 277 if (!value) {
272   - delete state.rangeFilters.price;
  278 + delete state.rangeFilters.min_price;
273 279 } else {
274 280 const priceRanges = {
275 281 '0-50': { lt: 50 },
... ... @@ -279,7 +285,7 @@ function handlePriceFilter(value) {
279 285 };
280 286  
281 287 if (priceRanges[value]) {
282   - state.rangeFilters.price = priceRanges[value];
  288 + state.rangeFilters.min_price = priceRanges[value];
283 289 }
284 290 }
285 291  
... ...
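The field fallbacks that `displayResults` now applies (`title || name`, `min_price || price`, and so on) amount to a small normalization step. A minimal sketch follows, written in Python for illustration only since the frontend itself is JavaScript; `first_present` and `normalize_product` are hypothetical names that do not exist in the repository.

```python
from typing import Any, Mapping


def first_present(product: Mapping[str, Any], *keys: str, default: Any = "N/A") -> Any:
    """Return the first truthy value among `keys`, mirroring JS `a || b || default`."""
    for key in keys:
        value = product.get(key)
        if value:
            return value
    return default


def normalize_product(product: Mapping[str, Any]) -> dict:
    """Map new-style and legacy field names onto one display record (hypothetical helper)."""
    return {
        "title": first_present(product, "title", "name"),
        "price": first_present(product, "min_price", "price"),
        "image_url": first_present(product, "image_url", "imageUrl", default=""),
        "category": first_present(product, "category", "categoryName", default=""),
        "vendor": first_present(product, "vendor", "brandName", default=""),
    }
```

As in the JavaScript version, a price of `0` is falsy and falls through to the next key or to `'N/A'`; whether that is desirable depends on the data.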
frontend/static/js/app_base.js
1   -// SearchEngine Frontend - Modern UI
  1 +// SearchEngine Frontend - Modern UI (Multi-Tenant)
2 2  
3   -const TENANT_ID = '1';
4 3 const API_BASE_URL = 'http://localhost:6002';
5 4 document.getElementById('apiUrl').textContent = API_BASE_URL;
6 5  
  6 +// Get tenant ID from input
  7 +function getTenantId() {
  8 + const tenantInput = document.getElementById('tenantInput');
  9 + if (tenantInput) {
  10 + return tenantInput.value.trim();
  11 + }
  12 + return '1'; // Default fallback
  13 +}
  14 +
7 15 // State Management
8 16 let state = {
9 17 query: '',
... ... @@ -43,12 +51,18 @@ function toggleFilters() {
43 51 // Perform search
44 52 async function performSearch(page = 1) {
45 53 const query = document.getElementById('searchInput').value.trim();
  54 + const tenantId = getTenantId();
46 55  
47 56 if (!query) {
48 57 alert('Please enter search keywords');
49 58 return;
50 59 }
51 60  
  61 + if (!tenantId) {
  62 + alert('Please enter tenant ID');
  63 + return;
  64 + }
  65 +
52 66 state.query = query;
53 67 state.currentPage = page;
54 68 state.pageSize = parseInt(document.getElementById('resultSize').value);
... ... @@ -93,7 +107,7 @@ async function performSearch(page = 1) {
93 107 method: 'POST',
94 108 headers: {
95 109 'Content-Type': 'application/json',
96   - 'X-Tenant-ID': TENANT_ID,
  110 + 'X-Tenant-ID': tenantId,
97 111 },
98 112 body: JSON.stringify({
99 113 query: query,
... ...
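With `TENANT_ID` no longer hard-coded, the tenant travels with each request in the `X-Tenant-ID` header. The following is a rough client-side sketch of the same idea in Python. The base URL and the `query` body key come from the diff above; the `/search/` path and the `build_search_request` helper are assumptions made purely for illustration, because the fetch URL is not visible in this hunk.

```python
import json
from urllib.request import Request

API_BASE_URL = "http://localhost:6002"  # taken from the diff above
SEARCH_PATH = "/search/"  # assumption: the actual endpoint path is not shown in this hunk


def build_search_request(query: str, tenant_id: str) -> Request:
    """Build (but do not send) a search request scoped to one tenant."""
    tenant_id = tenant_id.strip()
    if not tenant_id:
        # Mirrors the frontend check that refuses to search without a tenant ID.
        raise ValueError("tenant ID is required")
    body = json.dumps({"query": query}).encode("utf-8")
    return Request(
        API_BASE_URL + SEARCH_PATH,
        data=body,
        headers={"Content-Type": "application/json", "X-Tenant-ID": tenant_id},
        method="POST",
    )
```

The request is only constructed here; sending it would be a separate `urllib.request.urlopen(request)` call against a running backend.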
frontend/unified.html 0 → 100644
... ... @@ -0,0 +1,138 @@
  1 +<!DOCTYPE html>
  2 +<html lang="zh-CN">
  3 +<head>
  4 + <meta charset="UTF-8">
  5 + <meta name="viewport" content="width=device-width, initial-scale=1.0">
  6 + <title>统一搜索界面 - Unified Search</title>
  7 + <link rel="stylesheet" href="/static/css/style.css">
  8 + <style>
  9 + .tenant-selector {
  10 + display: flex;
  11 + align-items: center;
  12 + gap: 10px;
  13 + margin-bottom: 10px;
  14 + padding: 10px;
  15 + background: #f5f5f5;
  16 + border-radius: 4px;
  17 + }
  18 + .tenant-selector label {
  19 + font-weight: bold;
  20 + color: #333;
  21 + }
  22 + .tenant-selector select {
  23 + padding: 6px 12px;
  24 + border: 1px solid #ddd;
  25 + border-radius: 4px;
  26 + font-size: 14px;
  27 + background: white;
  28 + cursor: pointer;
  29 + }
  30 + .tenant-selector select:hover {
  31 + border-color: #007bff;
  32 + }
  33 + .tenant-info {
  34 + font-size: 12px;
  35 + color: #666;
  36 + margin-left: auto;
  37 + }
  38 + </style>
  39 +</head>
  40 +<body>
  41 + <div class="page-container">
  42 + <!-- Header -->
  43 + <header class="top-header">
  44 + <div class="header-left">
  45 + <span class="logo">Unified Search</span>
  46 + <span class="product-count" id="productCount">0 products found</span>
  47 + </div>
  48 + <div class="header-right">
  49 + <button class="fold-btn" onclick="toggleFilters()">Fold</button>
  50 + </div>
  51 + </header>
  52 +
  53 + <!-- Tenant Selector -->
  54 + <div class="tenant-selector">
  55 + <label for="tenantSelect">选择租户:</label>
  56 + <select id="tenantSelect" onchange="switchTenant()">
  57 + <option value="customer1">Customer1 (旧配置)</option>
  58 + <option value="base:1" selected>Base - Tenant 1 (店匠通用)</option>
  59 + <option value="base:2">Base - Tenant 2 (店匠通用)</option>
  60 + </select>
  61 + <span class="tenant-info" id="tenantInfo">当前: Base - Tenant 1</span>
  62 + </div>
  63 +
  64 + <!-- Search Bar -->
  65 + <div class="search-bar">
  66 + <input type="text" id="searchInput" placeholder="输入搜索关键词... (支持中文、英文)"
  67 + onkeypress="handleKeyPress(event)">
  68 + <button onclick="performSearch()" class="search-btn">搜索</button>
  69 + </div>
  70 +
  71 + <!-- Filter Section -->
  72 + <div class="filter-section" id="filterSection">
  73 + <!-- Category Filter -->
  74 + <div class="filter-row">
  75 + <div class="filter-label">Categories:</div>
  76 + <div class="filter-tags" id="categoryTags"></div>
  77 + </div>
  78 +
  79 + <!-- Vendor/Brand Filter -->
  80 + <div class="filter-row">
  81 + <div class="filter-label" id="vendorLabel">Vendor:</div>
  82 + <div class="filter-tags" id="brandTags"></div>
  83 + </div>
  84 +
  85 + <!-- Tags/Supplier Filter -->
  86 + <div class="filter-row">
  87 + <div class="filter-label" id="tagsLabel">Tags:</div>
  88 + <div class="filter-tags" id="supplierTags"></div>
  89 + </div>
  90 +
  91 + <!-- Price Range Filter -->
  92 + <div class="filter-row">
  93 + <div class="filter-label">Price Range:</div>
  94 + <div class="filter-tags" id="priceTags"></div>
  95 + </div>
  96 +
  97 + <!-- Result Size -->
  98 + <div class="filter-row">
  99 + <div class="filter-label">Results per page:</div>
  100 + <select id="resultSize" onchange="performSearch()">
  101 + <option value="10">10</option>
  102 + <option value="20" selected>20</option>
  103 + <option value="50">50</option>
  104 + <option value="100">100</option>
  105 + </select>
  106 + </div>
  107 + </div>
  108 +
  109 + <!-- Results Section -->
  110 + <div class="results-section">
  111 + <div id="loading" style="display: none; text-align: center; padding: 20px;">
  112 + <p>Searching...</p>
  113 + </div>
  114 +
  115 + <div id="error" style="display: none; color: red; padding: 20px; text-align: center;"></div>
  116 +
  117 + <div id="welcome" style="text-align: center; padding: 40px; color: #666;">
  118 + <h2>Welcome to Unified Search</h2>
  119 + <p>Select a tenant and enter keywords to search for products</p>
  120 + </div>
  121 +
  122 + <div id="productGrid" class="product-grid"></div>
  123 +
  124 + <!-- Pagination -->
  125 + <div id="pagination" class="pagination"></div>
  126 + </div>
  127 +
  128 + <!-- Debug Section -->
  129 + <div class="debug-section" id="debugSection" style="display: none;">
  130 + <button onclick="toggleDebug()" class="debug-toggle">Toggle Debug Info</button>
  131 + <div id="debugInfo" style="display: none;"></div>
  132 + </div>
  133 + </div>
  134 +
  135 + <script src="/static/js/app_unified.js"></script>
  136 +</body>
  137 +</html>
  138 +
... ...
... ... @@ -27,11 +27,11 @@ from search import Searcher
27 27  
28 28 def cmd_ingest(args):
29 29 """Run data ingestion."""
30   - print(f"Starting ingestion for customer: {args.customer}")
  30 + print("Starting data ingestion")
31 31  
32 32 # Load config
33   - config_loader = ConfigLoader("config/schema")
34   - config = config_loader.load_customer_config(args.customer)
  33 + config_loader = ConfigLoader("config/config.yaml")
  34 + config = config_loader.load_config()
35 35  
36 36 # Initialize ES
37 37 es_client = ESClient(hosts=[args.es_host])
... ... @@ -65,11 +65,9 @@ def cmd_ingest(args):
65 65  
66 66 def cmd_serve(args):
67 67 """Start API service."""
68   - os.environ['CUSTOMER_ID'] = args.customer
69 68 os.environ['ES_HOST'] = args.es_host
70 69  
71   - print(f"Starting API service...")
72   - print(f" Customer: {args.customer}")
  70 + print("Starting API service (multi-tenant)...")
73 71 print(f" Host: {args.host}:{args.port}")
74 72 print(f" Elasticsearch: {args.es_host}")
75 73  
... ... @@ -84,8 +82,8 @@ def cmd_serve(args):
84 82 def cmd_search(args):
85 83 """Test search from command line."""
86 84 # Load config
87   - config_loader = ConfigLoader("config/schema")
88   - config = config_loader.load_customer_config(args.customer)
  85 + config_loader = ConfigLoader("config/config.yaml")
  86 + config = config_loader.load_config()
89 87  
90 88 # Initialize ES and searcher
91 89 es_client = ESClient(hosts=[args.es_host])
... ... @@ -93,15 +91,16 @@ def cmd_search(args):
93 91 print(f"ERROR: Cannot connect to Elasticsearch at {args.es_host}")
94 92 return 1
95 93  
96   - searcher = Searcher(config, es_client)
  94 + from query import QueryParser
  95 + query_parser = QueryParser(config)
  96 + searcher = Searcher(config, es_client, query_parser)
97 97  
98 98 # Execute search
99   - print(f"Searching for: '{args.query}'")
  99 + print(f"Searching for: '{args.query}' (tenant: {args.tenant_id})")
100 100 result = searcher.search(
101 101 query=args.query,
102   - size=args.size,
103   - enable_translation=not args.no_translation,
104   - enable_embedding=not args.no_embedding
  102 + tenant_id=args.tenant_id,
  103 + size=args.size
105 104 )
106 105  
107 106 # Display results
... ... @@ -136,7 +135,6 @@ def main():
136 135 # Ingest command
137 136 ingest_parser = subparsers.add_parser('ingest', help='Ingest data into Elasticsearch')
138 137 ingest_parser.add_argument('csv_file', help='Path to CSV data file')
139   - ingest_parser.add_argument('--customer', default='customer1', help='Customer ID')
140 138 ingest_parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host')
141 139 ingest_parser.add_argument('--limit', type=int, help='Limit number of documents')
142 140 ingest_parser.add_argument('--batch-size', type=int, default=100, help='Batch size')
... ... @@ -144,8 +142,7 @@ def main():
144 142 ingest_parser.add_argument('--skip-embeddings', action='store_true', help='Skip embeddings')
145 143  
146 144 # Serve command
147   - serve_parser = subparsers.add_parser('serve', help='Start API service')
148   - serve_parser.add_argument('--customer', default='customer1', help='Customer ID')
  145 + serve_parser = subparsers.add_parser('serve', help='Start API service (multi-tenant)')
149 146 serve_parser.add_argument('--host', default='0.0.0.0', help='Host to bind to')
150 147 serve_parser.add_argument('--port', type=int, default=6002, help='Port to bind to')
151 148 serve_parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host')
... ... @@ -154,7 +151,7 @@ def main():
154 151 # Search command
155 152 search_parser = subparsers.add_parser('search', help='Test search from command line')
156 153 search_parser.add_argument('query', help='Search query')
157   - search_parser.add_argument('--customer', default='customer1', help='Customer ID')
  154 + search_parser.add_argument('--tenant-id', required=True, help='Tenant ID (required)')
158 155 search_parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host')
159 156 search_parser.add_argument('--size', type=int, default=10, help='Number of results')
160 157 search_parser.add_argument('--no-translation', action='store_true', help='Disable translation')
... ...
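After this change the `search` subcommand no longer takes `--customer` and instead requires `--tenant-id`. A stripped-down sketch of that argument layout is shown below; `build_parser` is a hypothetical helper name, and only the flags visible in the diff are reproduced.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical reduction of the CLI: only the `search` subcommand is modeled."""
    parser = argparse.ArgumentParser(prog="main.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    search = subparsers.add_parser("search", help="Test search from command line")
    search.add_argument("query", help="Search query")
    # The tenant is supplied per invocation, never read from service-level config.
    search.add_argument("--tenant-id", required=True, help="Tenant ID (required)")
    search.add_argument("--es-host", default="http://localhost:9200", help="Elasticsearch host")
    search.add_argument("--size", type=int, default=10, help="Number of results")
    return parser
```

A call such as `build_parser().parse_args(["search", "shoes", "--tenant-id", "2"])` yields `tenant_id == "2"`, while omitting `--tenant-id` makes argparse exit with a usage error.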
restart.sh
... ... @@ -34,8 +34,8 @@ sleep 3
34 34  
35 35 # Step 2: Start all services
36 36 echo -e "\n${YELLOW}Step 2/2: 重新启动服务${NC}"
37   -if [ -f "./run.sh" ]; then
38   - ./run.sh
  37 +if [ -f "./scripts/start.sh" ]; then
  38 + ./scripts/start.sh
39 39 if [ $? -eq 0 ]; then
40 40 echo -e "${GREEN}========================================${NC}"
41 41 echo -e "${GREEN}服务重启完成!${NC}"
... ...
... ... @@ -17,95 +17,5 @@ echo -e "${GREEN}========================================${NC}"
17 17 # Create logs directory if it doesn't exist
18 18 mkdir -p logs
19 19  
20   -# Step 1: Start backend in background
21   -echo -e "\n${YELLOW}Step 1/2: 启动后端服务${NC}"
22   -echo -e "${YELLOW}后端服务将在后台运行...${NC}"
23   -
24   -nohup ./scripts/start_backend.sh > logs/backend.log 2>&1 &
25   -BACKEND_PID=$!
26   -echo $BACKEND_PID > logs/backend.pid
27   -echo -e "${GREEN}后端服务已启动 (PID: $BACKEND_PID)${NC}"
28   -echo -e "${GREEN}日志文件: logs/backend.log${NC}"
29   -
30   -# Wait for backend to start
31   -echo -e "${YELLOW}等待后端服务启动...${NC}"
32   -MAX_RETRIES=30
33   -RETRY_COUNT=0
34   -BACKEND_READY=false
35   -
36   -while [ $RETRY_COUNT -lt $MAX_RETRIES ]; do
37   - sleep 2
38   - if curl -s http://localhost:6002/ > /dev/null 2>&1; then
39   - BACKEND_READY=true
40   - break
41   - fi
42   - RETRY_COUNT=$((RETRY_COUNT + 1))
43   - echo -e "${YELLOW} 等待中... ($RETRY_COUNT/$MAX_RETRIES)${NC}"
44   -done
45   -
46   -# Check if backend is running
47   -if [ "$BACKEND_READY" = true ]; then
48   - echo -e "${GREEN}✓ 后端服务运行正常${NC}"
49   - # Try health check
50   - if curl -s http://localhost:6002/admin/health > /dev/null 2>&1; then
51   - echo -e "${GREEN}✓ 健康检查通过${NC}"
52   - else
53   - echo -e "${YELLOW}⚠ 健康检查未通过,但服务已启动${NC}"
54   - fi
55   -else
56   - echo -e "${RED}✗ 后端服务启动失败,请检查日志: logs/backend.log${NC}"
57   - echo -e "${YELLOW}提示: 后端服务可能需要更多时间启动,或者检查端口是否被占用${NC}"
58   - exit 1
59   -fi
60   -
61   -# Step 2: Start frontend in background
62   -echo -e "\n${YELLOW}Step 2/2: 启动前端服务${NC}"
63   -echo -e "${YELLOW}前端服务将在后台运行...${NC}"
64   -
65   -nohup ./scripts/start_frontend.sh > logs/frontend.log 2>&1 &
66   -FRONTEND_PID=$!
67   -echo $FRONTEND_PID > logs/frontend.pid
68   -echo -e "${GREEN}前端服务已启动 (PID: $FRONTEND_PID)${NC}"
69   -echo -e "${GREEN}日志文件: logs/frontend.log${NC}"
70   -
71   -# Wait for frontend to start
72   -echo -e "${YELLOW}等待前端服务启动...${NC}"
73   -MAX_RETRIES=15
74   -RETRY_COUNT=0
75   -FRONTEND_READY=false
76   -
77   -while [ $RETRY_COUNT -lt $MAX_RETRIES ]; do
78   - sleep 2
79   - if curl -s http://localhost:6003/ > /dev/null 2>&1; then
80   - FRONTEND_READY=true
81   - break
82   - fi
83   - RETRY_COUNT=$((RETRY_COUNT + 1))
84   - echo -e "${YELLOW} 等待中... ($RETRY_COUNT/$MAX_RETRIES)${NC}"
85   -done
86   -
87   -# Check if frontend is running
88   -if [ "$FRONTEND_READY" = true ]; then
89   - echo -e "${GREEN}✓ 前端服务运行正常${NC}"
90   -else
91   - echo -e "${YELLOW}⚠ 前端服务可能还在启动中,请稍后访问${NC}"
92   -fi
93   -
94   -echo -e "${GREEN}========================================${NC}"
95   -echo -e "${GREEN}所有服务启动完成!${NC}"
96   -echo -e "${GREEN}========================================${NC}"
97   -echo ""
98   -echo -e "访问地址:"
99   -echo -e " ${GREEN}前端界面: http://localhost:6003${NC}"
100   -echo -e " ${GREEN}后端API: http://localhost:6002${NC}"
101   -echo -e " ${GREEN}API文档: http://localhost:6002/docs${NC}"
102   -echo ""
103   -echo -e "日志文件:"
104   -echo -e " 后端: logs/backend.log"
105   -echo -e " 前端: logs/frontend.log"
106   -echo ""
107   -echo -e "停止服务:"
108   -echo -e " 所有服务: ./stop.sh"
109   -echo -e " 单独停止后端: kill \$(cat logs/backend.pid)"
110   -echo -e " 单独停止前端: kill \$(cat logs/frontend.pid)"
111   -echo ""
112 20 \ No newline at end of file
  21 +# Call unified start script
  22 +./scripts/start.sh
113 23 \ No newline at end of file
... ...
scripts/demo_base.sh
... ... @@ -178,7 +178,7 @@ echo -e "${GREEN}演示环境启动完成!${NC}"
178 178 echo -e "${GREEN}========================================${NC}"
179 179 echo ""
180 180 echo -e "访问地址:"
181   -echo -e " ${GREEN}前端界面: http://localhost:$FRONTEND_PORT/base${NC}"
  181 +echo -e " ${GREEN}前端界面: http://localhost:$FRONTEND_PORT/base${NC} (或 http://localhost:$FRONTEND_PORT/base.html)"
182 182 echo -e " ${GREEN}后端API: http://localhost:$API_PORT${NC}"
183 183 echo -e " ${GREEN}API文档: http://localhost:$API_PORT/docs${NC}"
184 184 echo ""
... ...
scripts/frontend_server.py
... ... @@ -47,13 +47,18 @@ class MyHTTPRequestHandler(http.server.SimpleHTTPRequestHandler, RateLimitingMix
47 47  
48 48 def do_GET(self):
49 49 """Handle GET requests with support for base.html."""
50   - # Route /base to base.html
51   - if self.path == '/base' or self.path == '/base/':
52   - self.path = '/base.html'
  50 + # Parse path (handle query strings)
  51 + path = self.path.split('?')[0] # Remove query string if present
  52 +
  53 + # Route /base to base.html (handle both with and without trailing slash)
  54 + if path == '/base' or path == '/base/':
  55 + self.path = '/base.html' + ('?' + self.path.split('?', 1)[1] if '?' in self.path else '')
53 56 # Route / to index.html (default)
54   - elif self.path == '/':
55   - self.path = '/index.html'
56   - return super().do_GET()
  57 + elif path == '/' or path == '':
  58 + self.path = '/index.html' + ('?' + self.path.split('?', 1)[1] if '?' in self.path else '')
  59 +
  60 + # Call parent do_GET with modified path
  61 + super().do_GET()
57 62  
58 63 def setup(self):
59 64 """Setup with error handling."""
... ... @@ -125,6 +130,18 @@ class ThreadedTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
125 130 daemon_threads = True
126 131  
127 132 if __name__ == '__main__':
  133 + # Check if port is already in use
  134 + import socket
  135 + sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  136 + try:
  137 + sock.bind(("", PORT))
  138 + sock.close()
  139 + except OSError:
  140 + print(f"ERROR: Port {PORT} is already in use.")
  141 + print("Please stop the existing server or use a different port.")
  142 + print(f"To stop existing server: kill $(lsof -t -i:{PORT})")
  143 + sys.exit(1)
  144 +
128 145 # Create threaded server for better concurrency
129 146 with ThreadedTCPServer(("", PORT), MyHTTPRequestHandler) as httpd:
130 147 print(f"Frontend server started at http://localhost:{PORT}")
... ...
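The route rewriting in `do_GET` has to split off the query string, map the bare path, and then re-attach the query with its `?` separator. One way to express that with the standard library is sketched below; `ROUTES` and `rewrite_path` are hypothetical names and are not part of `scripts/frontend_server.py`.

```python
from urllib.parse import urlsplit

# Hypothetical routing table mirroring the two rewrites in do_GET.
ROUTES = {
    "/base": "/base.html",
    "/base/": "/base.html",
    "/": "/index.html",
    "": "/index.html",
}


def rewrite_path(raw_path: str) -> str:
    """Map a request path to its target file, preserving any query string."""
    parts = urlsplit(raw_path)
    target = ROUTES.get(parts.path)
    if target is None:
        return raw_path  # not a routed path: leave untouched
    # Re-attach the query together with its '?' separator.
    return target + ("?" + parts.query if parts.query else "")
```

Inside a handler, this would be used as `self.path = rewrite_path(self.path)` before delegating to the parent `do_GET`.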
scripts/ingest.sh
1 1 #!/bin/bash
2 2  
3   -# Data Ingestion Script for Customer1
4   -
5   -set -e
  3 +# Unified data ingestion script for SearchEngine
  4 +# Ingests data from MySQL to Elasticsearch
6 5  
7 6 cd "$(dirname "$0")/.."
8 7 source /home/tw/miniconda3/etc/profile.d/conda.sh
... ... @@ -10,41 +9,75 @@ conda activate searchengine
10 9  
11 10 GREEN='\033[0;32m'
12 11 YELLOW='\033[1;33m'
  12 +RED='\033[0;31m'
13 13 NC='\033[0m'
14 14  
15 15 echo -e "${GREEN}========================================${NC}"
16   -echo -e "${GREEN}Customer1 Data Ingestion${NC}"
  16 +echo -e "${GREEN}数据灌入脚本${NC}"
17 17 echo -e "${GREEN}========================================${NC}"
18 18  
19   -# Default values
20   -LIMIT=${1:-1000}
21   -SKIP_EMBEDDINGS=${2:-false}
  19 +# Load config from .env file if it exists
  20 +if [ -f .env ]; then
  21 + set -a
  22 + source .env
  23 + set +a
  24 +fi
  25 +
  26 +# Parameters
  27 +TENANT_ID=${1:-"1"}
  28 +DB_HOST=${DB_HOST:-"120.79.247.228"}
  29 +DB_PORT=${DB_PORT:-"3316"}
  30 +DB_DATABASE=${DB_DATABASE:-"saas"}
  31 +DB_USERNAME=${DB_USERNAME:-"saas"}
  32 +DB_PASSWORD=${DB_PASSWORD:-"P89cZHS5d7dFyc9R"}
  33 +ES_HOST=${ES_HOST:-"http://localhost:9200"}
  34 +BATCH_SIZE=${BATCH_SIZE:-500}
  35 +RECREATE=${RECREATE:-false}
22 36  
23 37 echo -e "\n${YELLOW}Configuration:${NC}"
24   -echo " Limit: $LIMIT documents"
25   -echo " Skip embeddings: $SKIP_EMBEDDINGS"
  38 +echo " Tenant ID: $TENANT_ID"
  39 +echo " MySQL: $DB_HOST:$DB_PORT/$DB_DATABASE"
  40 +echo " Elasticsearch: $ES_HOST"
  41 +echo " Batch Size: $BATCH_SIZE"
  42 +echo " Recreate Index: $RECREATE"
26 43  
27   -CSV_FILE="data/customer1/goods_with_pic.5years_congku.csv.shuf.1w"
  44 +# Validate parameters
  45 +if [ -z "$TENANT_ID" ]; then
  46 + echo -e "${RED}ERROR: Tenant ID is required${NC}"
  47 + echo "Usage: [BATCH_SIZE=<n>] [RECREATE=true] $0 <tenant_id>"
  48 + exit 1
  49 +fi
28 50  
29   -if [ ! -f "$CSV_FILE" ]; then
30   - echo "Error: CSV file not found: $CSV_FILE"
  51 +if [ -z "$DB_PASSWORD" ]; then
  52 + echo -e "${RED}ERROR: DB_PASSWORD未设置,请检查.env文件或环境变量${NC}"
31 53 exit 1
32 54 fi
33 55  
34 56 # Build command
35   -CMD="python data/customer1/ingest_customer1.py \
36   - --csv $CSV_FILE \
37   - --limit $LIMIT \
38   - --recreate-index \
39   - --batch-size 100"
40   -
41   -if [ "$SKIP_EMBEDDINGS" = "true" ]; then
42   - CMD="$CMD --skip-embeddings"
  57 +CMD="python scripts/ingest_shoplazza.py \
  58 + --db-host $DB_HOST \
  59 + --db-port $DB_PORT \
  60 + --db-database $DB_DATABASE \
  61 + --db-username $DB_USERNAME \
  62 + --db-password $DB_PASSWORD \
  63 + --tenant-id $TENANT_ID \
  64 + --es-host $ES_HOST \
  65 + --batch-size $BATCH_SIZE"
  66 +
  67 +if [ "$RECREATE" = "true" ] || [ "$RECREATE" = "1" ]; then
  68 + CMD="$CMD --recreate"
43 69 fi
44 70  
45   -echo -e "\n${YELLOW}Starting ingestion...${NC}"
  71 +echo -e "\n${YELLOW}Starting data ingestion...${NC}"
46 72 eval $CMD
47 73  
48   -echo -e "\n${GREEN}========================================${NC}"
49   -echo -e "${GREEN}Ingestion Complete!${NC}"
50   -echo -e "${GREEN}========================================${NC}"
  74 +if [ $? -eq 0 ]; then
  75 + echo -e "\n${GREEN}========================================${NC}"
  76 + echo -e "${GREEN}数据灌入完成!${NC}"
  77 + echo -e "${GREEN}========================================${NC}"
  78 +else
  79 + echo -e "\n${RED}========================================${NC}"
  80 + echo -e "${RED}数据灌入失败!${NC}"
  81 + echo -e "${RED}========================================${NC}"
  82 + exit 1
  83 +fi
... ...
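`scripts/ingest.sh` assembles the ingestion command as one string and runs it through `eval`, which is sensitive to spaces and shell metacharacters in values such as the password. A possible alternative, sketched here in Python, is to build an argument list and pass it to `subprocess.run` without a shell. The flag names are the ones shown in the diff; `build_ingest_argv` is a hypothetical helper, and unlike the script it deliberately has no built-in defaults for the database settings.

```python
import sys
from typing import List, Mapping

REQUIRED_DB_KEYS = ("DB_HOST", "DB_PORT", "DB_DATABASE", "DB_USERNAME", "DB_PASSWORD")


def build_ingest_argv(tenant_id: str, env: Mapping[str, str]) -> List[str]:
    """Build the argv for scripts/ingest_shoplazza.py from environment-style settings."""
    if not tenant_id:
        raise ValueError("tenant ID is required")
    missing = [key for key in REQUIRED_DB_KEYS if not env.get(key)]
    if missing:
        # Assumption: fail fast instead of falling back to built-in credentials.
        raise ValueError("missing settings: " + ", ".join(missing))
    argv = [
        sys.executable, "scripts/ingest_shoplazza.py",
        "--db-host", env["DB_HOST"],
        "--db-port", str(env["DB_PORT"]),
        "--db-database", env["DB_DATABASE"],
        "--db-username", env["DB_USERNAME"],
        "--db-password", env["DB_PASSWORD"],
        "--tenant-id", tenant_id,
        "--es-host", env.get("ES_HOST", "http://localhost:9200"),
        "--batch-size", str(env.get("BATCH_SIZE", "500")),
    ]
    if env.get("RECREATE", "false") in ("true", "1"):
        argv.append("--recreate")
    return argv
```

The resulting list can be handed to `subprocess.run(argv, check=True)`, so each value reaches the Python script as a single argument regardless of its content.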
scripts/ingest_shoplazza.py
... ... @@ -33,7 +33,6 @@ def main():
33 33  
34 34 # Tenant and index
35 35 parser.add_argument('--tenant-id', required=True, help='Tenant ID (required)')
36   - parser.add_argument('--config', default='base', help='Configuration ID (default: base)')
37 36 parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host')
38 37  
39 38 # Options
... ... @@ -44,11 +43,11 @@ def main():
44 43  
45 44 print(f"Starting Shoplazza data ingestion for tenant: {args.tenant_id}")
46 45  
47   - # Load configuration
48   - config_loader = ConfigLoader("config/schema")
  46 + # Load unified configuration
  47 + config_loader = ConfigLoader("config/config.yaml")
49 48 try:
50   - config = config_loader.load_customer_config(args.config)
51   - print(f"Loaded configuration: {config.customer_name}")
  49 + config = config_loader.load_config()
  50 + print(f"Loaded configuration: {config.es_index_name}")
52 51 except Exception as e:
53 52 print(f"ERROR: Failed to load configuration: {e}")
54 53 return 1
... ...
scripts/mock_data.sh 0 → 100755
... ... @@ -0,0 +1,88 @@
  1 +#!/bin/bash
  2 +
  3 +# Mock data script for SearchEngine
  4 +# Generates test data and imports to MySQL
  5 +
  6 +cd "$(dirname "$0")/.."
  7 +source /home/tw/miniconda3/etc/profile.d/conda.sh
  8 +conda activate searchengine
  9 +
  10 +GREEN='\033[0;32m'
  11 +YELLOW='\033[1;33m'
  12 +RED='\033[0;31m'
  13 +NC='\033[0m'
  14 +
  15 +echo -e "${GREEN}========================================${NC}"
  16 +echo -e "${GREEN}Mock Data Script${NC}"
  17 +echo -e "${GREEN}========================================${NC}"
  18 +
  19 +# Load config from .env file if it exists
  20 +if [ -f .env ]; then
  21 + set -a
  22 + source .env
  23 + set +a
  24 +fi
  25 +
  26 +# Parameters
  27 +TENANT_ID=${1:-"1"}
  28 +NUM_SPUS=${2:-100}
  29 +DB_HOST=${DB_HOST:-"120.79.247.228"}
  30 +DB_PORT=${DB_PORT:-"3316"}
  31 +DB_DATABASE=${DB_DATABASE:-"saas"}
  32 +DB_USERNAME=${DB_USERNAME:-"saas"}
  33 +DB_PASSWORD=${DB_PASSWORD:-"P89cZHS5d7dFyc9R"}
  34 +SQL_FILE="test_data.sql"
  35 +
  36 +echo -e "\n${YELLOW}Configuration:${NC}"
  37 +echo " Tenant ID: $TENANT_ID"
  38 +echo " Number of SPUs: $NUM_SPUS"
  39 +echo " MySQL: $DB_HOST:$DB_PORT/$DB_DATABASE"
  40 +echo " SQL File: $SQL_FILE"
  41 +
  42 +# Step 1: Generate test data
  43 +echo -e "\n${YELLOW}Step 1/2: 生成测试数据${NC}"
  44 +python scripts/generate_test_data.py \
  45 + --num-spus $NUM_SPUS \
  46 + --tenant-id "$TENANT_ID" \
  47 + --start-spu-id 1 \
  48 + --start-sku-id 1 \
  49 + --output "$SQL_FILE"
  50 +
  51 +if [ $? -ne 0 ]; then
  52 + echo -e "${RED}✗ 生成测试数据失败${NC}"
  53 + exit 1
  54 +fi
  55 +
  56 +echo -e "${GREEN}✓ 测试数据已生成: $SQL_FILE${NC}"
  57 +
  58 +# Step 2: Import test data to MySQL
  59 +echo -e "\n${YELLOW}Step 2/2: 导入测试数据到MySQL${NC}"
  60 +if [ -z "$DB_PASSWORD" ]; then
  61 + echo -e "${RED}ERROR: DB_PASSWORD未设置,请检查.env文件或环境变量${NC}"
  62 + exit 1
  63 +fi
  64 +
  65 +python scripts/import_test_data.py \
  66 + --db-host "$DB_HOST" \
  67 + --db-port "$DB_PORT" \
  68 + --db-database "$DB_DATABASE" \
  69 + --db-username "$DB_USERNAME" \
  70 + --db-password "$DB_PASSWORD" \
  71 + --sql-file "$SQL_FILE" \
  72 + --tenant-id "$TENANT_ID"
  73 +
  74 +if [ $? -ne 0 ]; then
  75 + echo -e "${RED}✗ 导入测试数据失败${NC}"
  76 + exit 1
  77 +fi
  78 +
  79 +echo -e "${GREEN}✓ 测试数据已导入MySQL${NC}"
  80 +
  81 +echo -e "\n${GREEN}========================================${NC}"
  82 +echo -e "${GREEN}Mock数据完成!${NC}"
  83 +echo -e "${GREEN}========================================${NC}"
  84 +echo ""
  85 +echo -e "下一步:"
  86 + echo -e " ${YELLOW}./scripts/ingest.sh $TENANT_ID${NC} - 从MySQL灌入数据到ES"
  87 +echo ""
  88 +
... ...
scripts/start.sh 0 → 100755
... ... @@ -0,0 +1,106 @@
  1 +#!/bin/bash
  2 +
  3 +# Unified startup script for SearchEngine services
  4 +# This script starts both frontend and backend services
  5 +
  6 +cd "$(dirname "$0")/.."
  7 +
  8 +GREEN='\033[0;32m'
  9 +YELLOW='\033[1;33m'
  10 +RED='\033[0;31m'
  11 +NC='\033[0m'
  12 +
  13 +echo -e "${GREEN}========================================${NC}"
  14 +echo -e "${GREEN}SearchEngine服务启动脚本${NC}"
  15 +echo -e "${GREEN}========================================${NC}"
  16 +
  17 +# Create logs directory if it doesn't exist
  18 +mkdir -p logs
  19 +
  20 +# Step 1: Start backend in background
  21 +echo -e "\n${YELLOW}Step 1/2: 启动后端服务${NC}"
  22 +echo -e "${YELLOW}后端服务将在后台运行...${NC}"
  23 +
  24 +nohup ./scripts/start_backend.sh > logs/backend.log 2>&1 &
  25 +BACKEND_PID=$!
  26 +echo $BACKEND_PID > logs/backend.pid
  27 +echo -e "${GREEN}后端服务已启动 (PID: $BACKEND_PID)${NC}"
  28 +echo -e "${GREEN}日志文件: logs/backend.log${NC}"
  29 +
  30 +# Wait for backend to start
  31 +echo -e "${YELLOW}等待后端服务启动...${NC}"
  32 +MAX_RETRIES=30
  33 +RETRY_COUNT=0
  34 +BACKEND_READY=false
  35 +
  36 +while [ $RETRY_COUNT -lt $MAX_RETRIES ]; do
  37 + sleep 2
  38 + if curl -s http://localhost:6002/health > /dev/null 2>&1; then
  39 + BACKEND_READY=true
  40 + break
  41 + fi
  42 + RETRY_COUNT=$((RETRY_COUNT + 1))
  43 + echo -e "${YELLOW} 等待中... ($RETRY_COUNT/$MAX_RETRIES)${NC}"
  44 +done
  45 +
  46 +# Check if backend is running
  47 +if [ "$BACKEND_READY" = true ]; then
  48 + echo -e "${GREEN}✓ 后端服务运行正常${NC}"
  49 +else
  50 + echo -e "${RED}✗ 后端服务启动失败,请检查日志: logs/backend.log${NC}"
  51 + echo -e "${YELLOW}提示: 后端服务可能需要更多时间启动,或者检查端口是否被占用${NC}"
  52 + exit 1
  53 +fi
  54 +
  55 +# Step 2: Start frontend in background
  56 +echo -e "\n${YELLOW}Step 2/2: 启动前端服务${NC}"
  57 +echo -e "${YELLOW}前端服务将在后台运行...${NC}"
  58 +
  59 +nohup ./scripts/start_frontend.sh > logs/frontend.log 2>&1 &
  60 +FRONTEND_PID=$!
  61 +echo $FRONTEND_PID > logs/frontend.pid
  62 +echo -e "${GREEN}前端服务已启动 (PID: $FRONTEND_PID)${NC}"
  63 +echo -e "${GREEN}日志文件: logs/frontend.log${NC}"
  64 +
  65 +# Wait for frontend to start
  66 +echo -e "${YELLOW}等待前端服务启动...${NC}"
  67 +MAX_RETRIES=15
  68 +RETRY_COUNT=0
  69 +FRONTEND_READY=false
  70 +
  71 +while [ $RETRY_COUNT -lt $MAX_RETRIES ]; do
  72 + sleep 2
  73 + if curl -s http://localhost:6003/ > /dev/null 2>&1; then
  74 + FRONTEND_READY=true
  75 + break
  76 + fi
  77 + RETRY_COUNT=$((RETRY_COUNT + 1))
  78 + echo -e "${YELLOW} 等待中... ($RETRY_COUNT/$MAX_RETRIES)${NC}"
  79 +done
  80 +
  81 +# Check if frontend is running
  82 +if [ "$FRONTEND_READY" = true ]; then
  83 + echo -e "${GREEN}✓ 前端服务运行正常${NC}"
  84 +else
  85 + echo -e "${YELLOW}⚠ 前端服务可能还在启动中,请稍后访问${NC}"
  86 +fi
  87 +
  88 +echo -e "${GREEN}========================================${NC}"
  89 +echo -e "${GREEN}所有服务启动完成!${NC}"
  90 +echo -e "${GREEN}========================================${NC}"
  91 +echo ""
  92 +echo -e "访问地址:"
  93 +echo -e " ${GREEN}前端界面: http://localhost:6003${NC}"
  94 +echo -e " ${GREEN}后端API: http://localhost:6002${NC}"
  95 +echo -e " ${GREEN}API文档: http://localhost:6002/docs${NC}"
  96 +echo ""
  97 +echo -e "日志文件:"
  98 +echo -e " 后端: logs/backend.log"
  99 +echo -e " 前端: logs/frontend.log"
  100 +echo ""
  101 +echo -e "停止服务:"
  102 +echo -e " 所有服务: ./scripts/stop.sh"
  103 +echo -e " 单独停止后端: kill \$(cat logs/backend.pid)"
  104 +echo -e " 单独停止前端: kill \$(cat logs/frontend.pid)"
  105 +echo ""
  106 +
... ...
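`scripts/start.sh` waits for each service with the same pattern: sleep, probe, and give up after a fixed number of attempts. The pattern can be sketched generically as below. `wait_until_ready` and `http_probe` are hypothetical helpers; the `curl -s` check in the script accepts any HTTP response, whereas this probe only accepts success and redirect statuses, which is an assumption of the sketch.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    max_retries: int = 30,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` up to `max_retries` times, sleeping `delay` seconds before each try."""
    for _ in range(max_retries):
        sleep(delay)
        if probe():
            return True
    return False


def http_probe(url: str, timeout: float = 2.0) -> Callable[[], bool]:
    """Return a probe that reports True when `url` answers with a 2xx/3xx status."""
    def probe() -> bool:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return 200 <= response.status < 400
        except (urllib.error.URLError, OSError):
            return False
    return probe
```

For the backend this would look like `wait_until_ready(http_probe("http://localhost:6002/health"), max_retries=30)`, matching the retry budget used in the script.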
scripts/start_backend.sh
... ... @@ -24,16 +24,14 @@ if [ -f .env ]; then
24 24 fi
25 25  
26 26 echo -e "\n${YELLOW}Configuration:${NC}"
27   -echo " Customer: ${CUSTOMER_ID:-customer1}"
28 27 echo " API Host: ${API_HOST:-0.0.0.0}"
29 28 echo " API Port: ${API_PORT:-6002}"
30 29 echo " ES Host: ${ES_HOST:-http://localhost:9200}"
31 30 echo " ES Username: ${ES_USERNAME:-not set}"
32 31  
33   -echo -e "\n${YELLOW}Starting service...${NC}"
  32 +echo -e "\n${YELLOW}Starting service (multi-tenant)...${NC}"
34 33  
35 34 # Export environment variables for the Python process
36   -export CUSTOMER_ID=${CUSTOMER_ID:-customer1}
37 35 export API_HOST=${API_HOST:-0.0.0.0}
38 36 export API_PORT=${API_PORT:-6002}
39 37 export ES_HOST=${ES_HOST:-http://localhost:9200}
... ... @@ -43,6 +41,5 @@ export ES_PASSWORD=${ES_PASSWORD:-}
43 41 python -m api.app \
44 42 --host $API_HOST \
45 43 --port $API_PORT \
46   - --customer $CUSTOMER_ID \
47 44 --es-host $ES_HOST
48 45  
... ...
scripts/start_servers.py
... ... @@ -9,6 +9,7 @@ import signal
9 9 import time
10 10 import subprocess
11 11 import logging
  12 +import argparse
12 13 from typing import Dict, List, Optional
13 14 import multiprocessing
14 15 import threading
... ... @@ -65,12 +66,11 @@ class ServerManager:
65 66 logger.error(f"Failed to start frontend server: {e}")
66 67 return False
67 68  
68   - def start_api_server(self, customer: str = "customer1", es_host: str = "http://localhost:9200") -> bool:
  69 + def start_api_server(self, es_host: str = "http://localhost:9200") -> bool:
69 70 """Start the API server."""
70 71 try:
71 72 cmd = [
72 73 sys.executable, 'main.py', 'serve',
73   - '--customer', customer,
74 74 '--es-host', es_host,
75 75 '--host', '0.0.0.0',
76 76 '--port', '6002'
... ... @@ -78,7 +78,6 @@ class ServerManager:
78 78  
79 79 env = os.environ.copy()
80 80 env['PYTHONUNBUFFERED'] = '1'
81   - env['CUSTOMER_ID'] = customer
82 81 env['ES_HOST'] = es_host
83 82  
84 83 process = subprocess.Popen(
... ... @@ -179,14 +178,12 @@ def main():
179 178 """Main function to start all servers."""
180 179 global manager
181 180  
182   - parser = argparse.ArgumentParser(description='Start SearchEngine servers')
183   - parser.add_argument('--customer', default='customer1', help='Customer ID')
  181 + parser = argparse.ArgumentParser(description='Start SearchEngine servers (multi-tenant)')
184 182 parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host')
185 183 parser.add_argument('--check-dependencies', action='store_true', help='Check dependencies before starting')
186 184 args = parser.parse_args()
187 185  
188   - logger.info("Starting SearchEngine servers...")
189   - logger.info(f"Customer: {args.customer}")
  186 + logger.info("Starting SearchEngine servers (multi-tenant)...")
190 187 logger.info(f"Elasticsearch: {args.es_host}")
191 188  
192 189 # Check dependencies if requested
... ... @@ -209,7 +206,7 @@ def main():
209 206  
210 207 try:
211 208 # Start servers
212   - if not manager.start_api_server(args.customer, args.es_host):
  209 + if not manager.start_api_server(args.es_host):
213 210 logger.error("Failed to start API server")
214 211 sys.exit(1)
215 212  
... ...
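To illustrate the `start_api_server` change above: the server command no longer carries any tenant-specific flag, so the tenant is resolved per request rather than at startup. The sketch below is not the repository's code. `build_api_command` is a hypothetical helper that only mirrors the argument list visible in this diff.

```python
import sys
from typing import List


def build_api_command(es_host: str = "http://localhost:9200") -> List[str]:
    """Build the argv list for the API server (hypothetical helper)."""
    # Mirrors the list in the diff: there is no '--customer' flag any more.
    return [
        sys.executable, "main.py", "serve",
        "--es-host", es_host,
        "--host", "0.0.0.0",
        "--port", "6002",
    ]


if __name__ == "__main__":
    # Print the command that would be handed to subprocess.Popen.
    print(" ".join(build_api_command()))
```

Keeping the argument list in one small function makes it easy to assert in a test that no customer-related flag creeps back in.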
test_all.sh
... ... @@ -43,8 +43,8 @@ try:
43 43 es_config = get_es_config()
44 44 es_client = ESClient(hosts=[es_config['host']], username=es_config.get('username'), password=es_config.get('password'))
45 45  
46   - config_loader = ConfigLoader('config/schema')
47   - config = config_loader.load_customer_config('customer1')
  46 + config_loader = ConfigLoader('config/config.yaml')
  47 + config = config_loader.load_config()
48 48  
49 49 if es_client.index_exists(config.es_index_name):
50 50 doc_count = es_client.count(config.es_index_name)
... ...
tests/conftest.py
... ... @@ -15,7 +15,8 @@ from unittest.mock import Mock, MagicMock
15 15 project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
16 16 sys.path.insert(0, project_root)
17 17  
18   -from config import CustomerConfig, QueryConfig, IndexConfig, FieldConfig, SPUConfig, RankingConfig
  18 +from config import CustomerConfig, QueryConfig, IndexConfig, FieldConfig, SPUConfig, RankingConfig, FunctionScoreConfig, RerankConfig
  19 +from config.field_types import FieldType, AnalyzerType
19 20 from utils.es_client import ESClient
20 21 from search import Searcher
21 22 from query import QueryParser
... ... @@ -39,7 +40,9 @@ def sample_index_config() -> IndexConfig:
39 40 """Sample index configuration."""
40 41 return IndexConfig(
41 42 name="default",
42   - match_fields=["name", "brand_name", "tags"],
  43 + label="默认索引",
  44 + fields=["name", "brand_name", "tags"],
  45 + analyzer=AnalyzerType.CHINESE_ECOMMERCE,
43 46 language_field_mapping={
44 47 "zh": ["name", "brand_name"],
45 48 "en": ["name_en", "brand_name_en"]
... ... @@ -64,23 +67,29 @@ def sample_customer_config(sample_index_config) -> CustomerConfig:
64 67 )
65 68  
66 69 ranking_config = RankingConfig(
67   - expression="static_bm25() + text_embedding_relevance() * 0.2"
  70 + expression="static_bm25() + text_embedding_relevance() * 0.2",
  71 + description="Test ranking"
68 72 )
69 73  
  74 + function_score_config = FunctionScoreConfig()
  75 + rerank_config = RerankConfig()
  76 +
70 77 return CustomerConfig(
71   - customer_id="test_customer",
72 78 es_index_name="test_products",
73   - query=query_config,
  79 + fields=[
  80 + FieldConfig(name="tenant_id", field_type=FieldType.KEYWORD, required=True),
  81 + FieldConfig(name="name", field_type=FieldType.TEXT, analyzer=AnalyzerType.CHINESE_ECOMMERCE),
  82 + FieldConfig(name="brand_name", field_type=FieldType.TEXT, analyzer=AnalyzerType.CHINESE_ECOMMERCE),
  83 + FieldConfig(name="tags", field_type=FieldType.TEXT, analyzer=AnalyzerType.CHINESE_ECOMMERCE),
  84 + FieldConfig(name="price", field_type=FieldType.DOUBLE),
  85 + FieldConfig(name="category_id", field_type=FieldType.INT),
  86 + ],
74 87 indexes=[sample_index_config],
75   - spu=spu_config,
  88 + query_config=query_config,
76 89 ranking=ranking_config,
77   - fields=[
78   - FieldConfig(name="name", type="TEXT", analyzer="ansj"),
79   - FieldConfig(name="brand_name", type="TEXT", analyzer="ansj"),
80   - FieldConfig(name="tags", type="TEXT", analyzer="ansj"),
81   - FieldConfig(name="price", type="DOUBLE"),
82   - FieldConfig(name="category_id", type="INT"),
83   - ]
  90 + function_score=function_score_config,
  91 + rerank=rerank_config,
  92 + spu_config=spu_config
84 93 )
85 94  
86 95  
... ... @@ -165,31 +174,48 @@ def temp_config_file() -> Generator[str, None, None]:
165 174 import yaml
166 175  
167 176 config_data = {
168   - "customer_id": "test_customer",
169 177 "es_index_name": "test_products",
170   - "query": {
  178 + "query_config": {
171 179 "enable_query_rewrite": True,
172 180 "enable_translation": True,
173 181 "enable_text_embedding": True,
174 182 "supported_languages": ["zh", "en"]
175 183 },
  184 + "fields": [
  185 + {"name": "tenant_id", "type": "KEYWORD", "required": True},
  186 + {"name": "name", "type": "TEXT", "analyzer": "ansj"},
  187 + {"name": "brand_name", "type": "TEXT", "analyzer": "ansj"}
  188 + ],
176 189 "indexes": [
177 190 {
178 191 "name": "default",
179   - "match_fields": ["name", "brand_name"],
  192 + "label": "默认索引",
  193 + "fields": ["name", "brand_name"],
  194 + "analyzer": "ansj",
180 195 "language_field_mapping": {
181 196 "zh": ["name", "brand_name"],
182 197 "en": ["name_en", "brand_name_en"]
183 198 }
184 199 }
185 200 ],
186   - "spu": {
  201 + "spu_config": {
187 202 "enabled": True,
188 203 "spu_field": "spu_id",
189 204 "inner_hits_size": 3
190 205 },
191 206 "ranking": {
192   - "expression": "static_bm25() + text_embedding_relevance() * 0.2"
  207 + "expression": "static_bm25() + text_embedding_relevance() * 0.2",
  208 + "description": "Test ranking"
  209 + },
  210 + "function_score": {
  211 + "score_mode": "sum",
  212 + "boost_mode": "multiply",
  213 + "functions": []
  214 + },
  215 + "rerank": {
  216 + "enabled": False,
  217 + "expression": "",
  218 + "description": ""
193 219 }
194 220 }
195 221  
... ... @@ -209,7 +235,6 @@ def mock_env_variables(monkeypatch):
209 235 monkeypatch.setenv("ES_HOST", "http://localhost:9200")
210 236 monkeypatch.setenv("ES_USERNAME", "elastic")
211 237 monkeypatch.setenv("ES_PASSWORD", "changeme")
212   - monkeypatch.setenv("CUSTOMER_ID", "test_customer")
213 238  
214 239  
215 240 # Marker configuration
... ...
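The fixture changes above follow the plan's two config rules: `customer_id` is gone, and `tenant_id` must exist as a required field. A minimal sketch of that check on a plain dict follows. `check_unified_config` is a hypothetical helper for illustration and is not the project's `validate_config()`.

```python
from typing import Any, Dict, List


def check_unified_config(config_data: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the config passes."""
    problems: List[str] = []
    # Rule 1: the per-customer key must not appear in the unified config.
    if "customer_id" in config_data:
        problems.append("customer_id should no longer appear in the config")
    # Rule 2: tenant_id must be declared as a field and marked required.
    fields = {f.get("name"): f for f in config_data.get("fields", [])}
    tenant = fields.get("tenant_id")
    if tenant is None:
        problems.append("missing required field: tenant_id")
    elif not tenant.get("required", False):
        problems.append("tenant_id must be marked required")
    return problems


if __name__ == "__main__":
    sample = {
        "es_index_name": "test_products",
        "fields": [
            {"name": "tenant_id", "type": "KEYWORD", "required": True},
            {"name": "name", "type": "TEXT", "analyzer": "ansj"},
        ],
    }
    print(check_unified_config(sample))
```

Returning a list of messages instead of raising on the first failure lets a test report every violation at once.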