Commit 4d824a771eb23ed68b1889e01da8e40151f3b226

Authored by tangwang
1 parent fb68a0ef

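All tenants share a single unified configuration. The tenant ID exists only at the request level; the service level has no tenant-specific configuration.

Create the unified configuration file config/config.yaml (migrated from the base configuration, with customer_name removed).

Create the script suite
Scripts for start, stop, restart, mocking data into MySQL, and ingesting data from MySQL into ES:
restart.sh
run.sh (internally calls the backend and frontend start scripts)
scripts/mock_data.sh  mock data -> MySQL
scripts/ingest.sh  MySQL -> ES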
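
A rough illustration of what "tenant ID only at the request level" can mean on the backend: every search request carries its own tenant ID, and the query sent to the shared index is always filtered on it. The helper name `build_tenant_query`, the `multi_match` clause on `title`, and the error handling below are assumptions made for this sketch, not code from this commit; only the `tenant_id` field name comes from `config/config.yaml`.

```python
from typing import Any, Dict


def build_tenant_query(tenant_id: str, user_query: str) -> Dict[str, Any]:
    """Wrap a text query in a bool query filtered by tenant (hypothetical helper)."""
    tenant_id = tenant_id.strip()
    if not tenant_id:
        # No service-level default exists, so a request without a tenant is rejected.
        raise ValueError("tenant_id is required on every request")
    return {
        "query": {
            "bool": {
                # Scoring clause: an illustrative match on the title field only.
                "must": [{"multi_match": {"query": user_query, "fields": ["title"]}}],
                # Non-scoring clause: restricts hits to one tenant's documents.
                "filter": [{"term": {"tenant_id": tenant_id}}],
            }
        }
    }
```

Putting the tenant restriction in `filter` rather than `must` keeps it out of relevance scoring, which is the usual choice for isolation clauses.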
.cursor/plans/所有租户共用一套统一配置.tenantID只在请求层级.服务层级没有tenantID相关的独立配置.md 0 → 100644
@@ -0,0 +1,342 @@ @@ -0,0 +1,342 @@
  1 +<!-- d9d0ef58-7a33-4ef6-8e3a-714b4552fd20 56c8cc4b-4eeb-4b77-9986-f4fa349e96b9 -->
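  2 +# Multi-Tenant Architecture Refactoring Plan
  3 +
  4 +## Overview
  5 +
  6 +Refactor the search service from per-tenant startup into a true multi-tenant architecture:
  7 +
  8 +- The service starts without a tenant ID; all tenants share one configuration
  9 +- Delete the customer1 configuration, drop the base level, and consolidate into config.yaml
  10 +- Unify the script interface: start, stop, restart, data ingestion
  11 +- Unify the data ingestion flow; ES holds a single index
  12 +- The frontend accepts a tenant ID in an input to the left of the search box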
  13 +
  14 +## Phase 1: Configuration File Restructuring
  15 +
  16 +### 1.1 Create the unified configuration file
  17 +
  18 +**File**: `config/config.yaml` (NEW)
  19 +
  20 +- Move `config/schema/base/config.yaml` to `config/config.yaml`
  21 +- Delete the `customer_name` field (no longer needed)
  22 +- Delete the `customer_id` logic
  23 +- Fix the index name as `search_products`
  24 +- Make sure the `tenant_id` field is present (required)
  25 +
  26 +### 1.2 Delete the customer1 configuration
  27 +
  28 +**Files to delete**:
  29 +
  30 +- `config/schema/customer1/config.yaml`
  31 +- `config/schema/customer1/` directory (if empty)
  32 +
  33 +### 1.3 Update ConfigLoader
  34 +
  35 +**File**: `config/config_loader.py`
  36 +
  37 +Modify the `load_customer_config()` method:
  38 +
  39 +- Remove the `customer_id` parameter
  40 +- Rename it to `load_config()`
  41 +- Load `config/config.yaml` directly
  42 +- Remove the lookup of `config/schema/{customer_id}/config.yaml`
  43 +- Remove `customer_id` field validation
  44 +- Update the `CustomerConfig` class: remove the `customer_id` field
  45 +
  46 +### 1.4 Update configuration validation
  47 +
  48 +**File**: `config/config_loader.py`
  49 +
  50 +Modify the `validate_config()` method:
  51 +
  52 +- Make sure the `tenant_id` field exists and is marked required
  53 +- Remove validation of `customer_id`
  54 +
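  55 +## Phase 2: Service Startup Changes
  56 +
  57 +### 2.1 Update API application initialization
  58 +
  59 +**File**: `api/app.py`
  60 +
  61 +Modify the `init_service()` function:
  62 +
  63 +- Remove the `customer_id` parameter
  64 +- Load the unified configuration directly (`config/config.yaml`)
  65 +- Remove the dependency on the `CUSTOMER_ID` environment variable
  66 +- Update log output (no longer shows customer_id)
  67 +
  68 +Modify the `startup_event()` function:
  69 +
  70 +- Stop reading the `CUSTOMER_ID` environment variable
  71 +- Call `init_service()` directly without passing arguments
  72 +
  73 +### 2.2 Update main.py
  74 +
  75 +**File**: `main.py`
  76 +
  77 +Modify the `cmd_serve()` function:
  78 +
  79 +- Remove the `--customer` argument
  80 +- Stop setting the `CUSTOMER_ID` environment variable
  81 +- Update the help text
  82 +
  83 +### 2.3 Update the start scripts
  84 +
  85 +**File**: `scripts/start_backend.sh`
  86 +
  87 +Changes:
  88 +
  89 +- Remove the `CUSTOMER_ID` environment variable
  90 +- Remove the `--customer` argument
  91 +- Simplify the start command
  92 +
  93 +**File**: `scripts/start_servers.py`
  94 +
  95 +Modify the `start_api_server()` function:
  96 +
  97 +- Remove the `customer` parameter
  98 +- Stop setting the `CUSTOMER_ID` environment variable
  99 +- Simplify the start command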
  100 +
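  101 +## Phase 3: Unified Script Suite
  102 +
  103 +### 3.1 Create the unified start script
  104 +
  105 +**File**: `scripts/start.sh` (NEW)
  106 +
  107 +Responsibilities:
  108 +
  109 +- Start the backend service (calls `scripts/start_backend.sh`)
  110 +- Start the frontend service (calls `scripts/start_frontend.sh`)
  111 +- Wait for the services to become ready
  112 +- Show service status and access URLs
  113 +
  114 +### 3.2 Create the unified stop script
  115 +
  116 +**File**: `scripts/stop.sh` (exists, needs update)
  117 +
  118 +Responsibilities:
  119 +
  120 +- Stop the backend service (port 6002)
  121 +- Stop the frontend service (port 6003)
  122 +- Clean up PID files
  123 +- Show the stop status
  124 +
  125 +### 3.3 Create the unified restart script
  126 +
  127 +**File**: `scripts/restart.sh` (exists, needs update)
  128 +
  129 +Responsibilities:
  130 +
  131 +- Call `scripts/stop.sh` to stop the services
  132 +- Wait until the services have fully stopped
  133 +- Call `scripts/start.sh` to start the services
  134 +
  135 +### 3.4 Create the data ingestion script
  136 +
  137 +**File**: `scripts/ingest.sh` (exists, needs update)
  138 +
  139 +Responsibilities:
  140 +
  141 +- Read data from MySQL
  142 +- Transform the data format (handle the base and customer1 data sources uniformly)
  143 +- Ingest into the ES index `search_products`
  144 +- Support filtering data by a given tenant ID
  145 +- Handle field mapping automatically: missing fields are randomly generated, extra fields are ignored
  146 +
  147 +### 3.5 Create the mock data script
  148 +
  149 +**File**: `scripts/mock_data.sh` (NEW)
  150 +
  151 +Responsibilities:
  152 +
  153 +- Generate test data into MySQL
  154 +- Support specifying a tenant ID
  155 +- Support specifying the data volume
  156 +- Call `scripts/generate_test_data.py` and `scripts/import_test_data.py`
  157 +
  158 +### 3.6 Update root-level scripts
  159 +
  160 +**File**: `run.sh` (exists, needs update)
  161 +
  162 +Responsibilities:
  163 +
  164 +- Call `scripts/start.sh` to start the services
  165 +
  166 +**File**: `restart.sh` (exists, needs update)
  167 +
  168 +Responsibilities:
  169 +
  170 +- Call `scripts/restart.sh` to restart the services
  171 +
  172 +**File**: `setup.sh` (exists, needs update)
  173 +
  174 +Responsibilities:
  175 +
  176 +- Set up the environment
  177 +- Check dependencies
  178 +- Contains no service startup logic
  179 +
  180 +**File**: `test_all.sh` (exists, needs update)
  181 +
  182 +Responsibilities:
  183 +
  184 +- Run the full test flow
  185 +- Covers data ingestion, service startup, and API tests
  186 +
  187 +### 3.7 Remove obsolete scripts
  188 +
  189 +**Files to delete**:
  190 +
  191 +- `scripts/demo_base.sh`
  192 +- `scripts/stop_base.sh`
  193 +- `scripts/start_test_environment.sh`
  194 +- `scripts/stop_test_environment.sh`
  195 +- Other scripts that are no longer needed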
  196 +
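  197 +## Phase 4: Unified Data Ingestion
  198 +
  199 +### 4.1 Update the data ingestion script
  200 +
  201 +**File**: `scripts/ingest_shoplazza.py`
  202 +
  203 +Changes:
  204 +
  205 +- Remove the `--config` argument (no longer needed)
  206 +- Load the unified configuration directly (`config/config.yaml`)
  207 +- Handle all data sources uniformly (no distinction between base and customer1)
  208 +- Support a `--tenant-id` argument for filtering data
  209 +- Field mapping logic:
  210 +  - If a field is in the configuration but missing from the data source, generate it randomly
  211 +  - If a field is in the data source but not in the configuration, ignore it
  212 +- Make sure the `tenant_id` field is set correctly
  213 +
  214 +### 4.2 Update the data transformer
  215 +
  216 +**File**: `indexer/spu_transformer.py`
  217 +
  218 +Changes:
  219 +
  220 +- Remove the dependency on `customer_id` in the configuration
  221 +- Handle all data sources uniformly
  222 +- Make sure field mapping is correct (missing fields randomly generated, extra fields ignored)
  223 +
  224 +### 4.3 Unify test data generation
  225 +
  226 +**File**: `scripts/generate_test_data.py`
  227 +
  228 +Changes:
  229 +
  230 +- Support generating test data that matches the unified index structure
  231 +- Support specifying a tenant ID
  232 +- Make sure generated data contains all required fields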
  233 +
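  234 +## Phase 5: Frontend Changes
  235 +
  236 +### 5.1 Update the frontend HTML
  237 +
  238 +**File**: `frontend/index.html`
  239 +
  240 +Changes:
  241 +
  242 +- Add a tenant ID input to the left of the search box
  243 +- Add a tenant ID label
  244 +- Update the layout styles
  245 +
  246 +### 5.2 Update the frontend JavaScript
  247 +
  248 +**File**: `frontend/static/js/app_base.js`
  249 +
  250 +Changes:
  251 +
  252 +- Remove the hard-coded `TENANT_ID = '1'`
  253 +- Read the tenant ID from the input
  254 +- Send the tenant ID with each search request (via the `X-Tenant-ID` header)
  255 +- Add tenant ID validation (must not be empty)
  256 +- Update the UI display
  257 +
  258 +### 5.3 Update the frontend CSS
  259 +
  260 +**File**: `frontend/static/css/style.css`
  261 +
  262 +Changes:
  263 +
  264 +- Add styles for the tenant ID input
  265 +- Update the search bar layout (to accommodate the tenant ID input)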
  266 +
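  267 +## Phase 6: Update Documentation and Tests
  268 +
  269 +### 6.1 Update the README
  270 +
  271 +**File**: `README.md`
  272 +
  273 +Changes:
  274 +
  275 +- Update the startup instructions (a tenant ID is no longer required)
  276 +- Update the configuration notes (unified configuration file)
  277 +- Update the script usage notes
  278 +
  279 +### 6.2 Update the API documentation
  280 +
  281 +**File**: `API_DOCUMENTATION.md`
  282 +
  283 +Changes:
  284 +
  285 +- Update the tenant ID notes (it must be supplied with the request)
  286 +- Update the configuration notes (unified configuration)
  287 +
  288 +### 6.3 Update the test script
  289 +
  290 +**File**: `test_all.sh`
  291 +
  292 +Changes:
  293 +
  294 +- Update the test flow (a tenant ID is no longer required at startup)
  295 +- Update the data ingestion test (unified data source)
  296 +- Update the API tests (include the tenant ID parameter)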
  297 +
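  298 +## Phase 7: Cleanup and Verification
  299 +
  300 +### 7.1 Remove obsolete code
  301 +
  302 +- Delete all references to `customer_id`
  303 +- Delete all references to the `customer1` configuration
  304 +- Delete all references to the `base` configuration level
  305 +- Clean up scripts that are no longer used
  306 +
  307 +### 7.2 Verify functionality
  308 +
  309 +- Verify service startup (without a tenant ID)
  310 +- Verify configuration loading (unified configuration)
  311 +- Verify data ingestion (unified data source)
  312 +- Verify search (tenant ID supplied with the request)
  313 +- Verify the frontend (tenant ID input)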
  314 +
  315 +## Key File List
  316 +
  317 +### Files to modify:
  318 +
  319 +1. `config/config_loader.py` - remove the customer_id logic
  320 +2. `config/config.yaml` - unified configuration file (moved from base)
  321 +3. `api/app.py` - remove the customer_id parameter
  322 +4. `main.py` - remove the customer argument
  323 +5. `scripts/start_backend.sh` - remove CUSTOMER_ID
  324 +6. `scripts/start_servers.py` - remove the customer parameter
  325 +7. `scripts/ingest_shoplazza.py` - unified data ingestion
  326 +8. `frontend/index.html` - add the tenant ID input
  327 +9. `frontend/static/js/app_base.js` - read the tenant ID
  328 +10. `run.sh`, `restart.sh`, `setup.sh`, `test_all.sh` - update the scripts
  329 +
  330 +### Files to delete:
  331 +
  332 +1. `config/schema/customer1/config.yaml`
  333 +2. `config/schema/customer1/` directory
  334 +3. `scripts/demo_base.sh`
  335 +4. `scripts/stop_base.sh`
  336 +5. Other obsolete scripts
  337 +
  338 +### Files to create:
  339 +
  340 +1. `config/config.yaml` - unified configuration file
  341 +2. `scripts/start.sh` - unified start script
  342 +3. `scripts/mock_data.sh` - mock data script
0 \ No newline at end of file 343 \ No newline at end of file
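
The field-mapping rule in section 4.1 of the plan (configured fields missing from the source are randomly generated, source fields absent from the configuration are ignored, and `tenant_id` is always set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the logic in `scripts/ingest_shoplazza.py` or `indexer/spu_transformer.py`, which this diff does not show. The function name `map_row_to_doc`, the value ranges, and the handling of only `KEYWORD`, `TEXT`, and `FLOAT` types are hypothetical.

```python
import random
from typing import Any, Dict, List, Optional


def map_row_to_doc(
    row: Dict[str, Any],
    fields: List[Dict[str, Any]],
    tenant_id: str,
    rng: Optional[random.Random] = None,
) -> Dict[str, Any]:
    """Project a source row onto the configured fields (hypothetical helper)."""
    rng = rng or random.Random()
    doc: Dict[str, Any] = {}
    for spec in fields:
        name, ftype = spec["name"], spec["type"]
        if row.get(name) is not None:
            # Field exists in both the configuration and the source: copy it.
            doc[name] = row[name]
        elif ftype == "FLOAT":
            # Configured but missing in the source: fill with a random value.
            doc[name] = round(rng.uniform(1.0, 100.0), 2)
        elif ftype in ("KEYWORD", "TEXT"):
            doc[name] = f"{name}_{rng.randint(0, 9999)}"
        # Other types (for example nested JSON) are left unset in this sketch.
    # Source keys not named in `fields` are never copied, so extras are dropped.
    doc["tenant_id"] = tenant_id  # always taken from the caller, never random
    return doc
```

Passing a seeded `random.Random` makes the generated filler values reproducible, which helps when the same mock data has to be ingested more than once.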
@@ -51,28 +51,27 @@ _searcher: Optional[Searcher] = None @@ -51,28 +51,27 @@ _searcher: Optional[Searcher] = None
51 _query_parser: Optional[QueryParser] = None 51 _query_parser: Optional[QueryParser] = None
52 52
53 53
54 -def init_service(customer_id: str = "customer1", es_host: str = "http://localhost:9200"): 54 +def init_service(es_host: str = "http://localhost:9200"):
55 """ 55 """
56 - Initialize search service with configuration. 56 + Initialize search service with unified configuration.
57 57
58 Args: 58 Args:
59 - customer_id: Customer configuration ID  
60 es_host: Elasticsearch host URL 59 es_host: Elasticsearch host URL
61 """ 60 """
62 global _config, _es_client, _searcher, _query_parser 61 global _config, _es_client, _searcher, _query_parser
63 62
64 - print(f"Initializing search service for customer: {customer_id}") 63 + print("Initializing search service (multi-tenant)")
65 64
66 - # Load configuration  
67 - config_loader = ConfigLoader("config/schema")  
68 - _config = config_loader.load_customer_config(customer_id) 65 + # Load unified configuration
  66 + config_loader = ConfigLoader("config/config.yaml")
  67 + _config = config_loader.load_config()
69 68
70 # Validate configuration 69 # Validate configuration
71 errors = config_loader.validate_config(_config) 70 errors = config_loader.validate_config(_config)
72 if errors: 71 if errors:
73 raise ValueError(f"Configuration validation failed: {errors}") 72 raise ValueError(f"Configuration validation failed: {errors}")
74 73
75 - print(f"Configuration loaded: {_config.customer_name}") 74 + print(f"Configuration loaded: {_config.es_index_name}")
76 75
77 # Get ES credentials from environment variables or .env file 76 # Get ES credentials from environment variables or .env file
78 es_username = os.getenv('ES_USERNAME') 77 es_username = os.getenv('ES_USERNAME')
@@ -113,7 +112,7 @@ def init_service(customer_id: str = "customer1", es_host: str = "http://localhos @@ -113,7 +112,7 @@ def init_service(customer_id: str = "customer1", es_host: str = "http://localhos
113 112
114 113
115 def get_config() -> CustomerConfig: 114 def get_config() -> CustomerConfig:
116 - """Get customer configuration.""" 115 + """Get search engine configuration."""
117 if _config is None: 116 if _config is None:
118 raise RuntimeError("Service not initialized") 117 raise RuntimeError("Service not initialized")
119 return _config 118 return _config
@@ -184,15 +183,13 @@ app.add_middleware( @@ -184,15 +183,13 @@ app.add_middleware(
184 @app.on_event("startup") 183 @app.on_event("startup")
185 async def startup_event(): 184 async def startup_event():
186 """Initialize service on startup.""" 185 """Initialize service on startup."""
187 - customer_id = os.getenv("CUSTOMER_ID", "customer1")  
188 es_host = os.getenv("ES_HOST", "http://localhost:9200") 186 es_host = os.getenv("ES_HOST", "http://localhost:9200")
189 187
190 - logger.info(f"Starting E-Commerce Search API")  
191 - logger.info(f"Customer ID: {customer_id}") 188 + logger.info("Starting E-Commerce Search API (Multi-Tenant)")
192 logger.info(f"Elasticsearch Host: {es_host}") 189 logger.info(f"Elasticsearch Host: {es_host}")
193 190
194 try: 191 try:
195 - init_service(customer_id=customer_id, es_host=es_host) 192 + init_service(es_host=es_host)
196 logger.info("Service initialized successfully") 193 logger.info("Service initialized successfully")
197 except Exception as e: 194 except Exception as e:
198 logger.error(f"Failed to initialize service: {e}") 195 logger.error(f"Failed to initialize service: {e}")
@@ -310,16 +307,14 @@ else: @@ -310,16 +307,14 @@ else:
310 if __name__ == "__main__": 307 if __name__ == "__main__":
311 import uvicorn 308 import uvicorn
312 309
313 - parser = argparse.ArgumentParser(description='Start search API service') 310 + parser = argparse.ArgumentParser(description='Start search API service (multi-tenant)')
314 parser.add_argument('--host', default='0.0.0.0', help='Host to bind to') 311 parser.add_argument('--host', default='0.0.0.0', help='Host to bind to')
315 parser.add_argument('--port', type=int, default=6002, help='Port to bind to') 312 parser.add_argument('--port', type=int, default=6002, help='Port to bind to')
316 - parser.add_argument('--customer', default='customer1', help='Customer ID')  
317 parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host') 313 parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host')
318 parser.add_argument('--reload', action='store_true', help='Enable auto-reload') 314 parser.add_argument('--reload', action='store_true', help='Enable auto-reload')
319 args = parser.parse_args() 315 args = parser.parse_args()
320 316
321 - # Set environment variables  
322 - os.environ['CUSTOMER_ID'] = args.customer 317 + # Set environment variable
323 os.environ['ES_HOST'] = args.es_host 318 os.environ['ES_HOST'] = args.es_host
324 319
325 # Run server 320 # Run server
@@ -250,7 +250,6 @@ class HealthResponse(BaseModel): @@ -250,7 +250,6 @@ class HealthResponse(BaseModel):
250 """Health check response model.""" 250 """Health check response model."""
251 status: str = Field(..., description="Service status") 251 status: str = Field(..., description="Service status")
252 elasticsearch: str = Field(..., description="Elasticsearch status") 252 elasticsearch: str = Field(..., description="Elasticsearch status")
253 - customer_id: str = Field(..., description="Customer configuration ID")  
254 253
255 254
256 class ErrorResponse(BaseModel): 255 class ErrorResponse(BaseModel):
api/routes/admin.py
@@ -28,15 +28,13 @@ async def health_check(): @@ -28,15 +28,13 @@ async def health_check():
28 28
29 return HealthResponse( 29 return HealthResponse(
30 status="healthy" if es_status == "connected" else "unhealthy", 30 status="healthy" if es_status == "connected" else "unhealthy",
31 - elasticsearch=es_status,  
32 - customer_id=config.customer_id 31 + elasticsearch=es_status
33 ) 32 )
34 33
35 except Exception as e: 34 except Exception as e:
36 return HealthResponse( 35 return HealthResponse(
37 status="unhealthy", 36 status="unhealthy",
38 - elasticsearch="error",  
39 - customer_id="unknown" 37 + elasticsearch="error"
40 ) 38 )
41 39
42 40
@@ -51,8 +49,6 @@ async def get_configuration(): @@ -51,8 +49,6 @@ async def get_configuration():
51 config = get_config() 49 config = get_config()
52 50
53 return { 51 return {
54 - "customer_id": config.customer_id,  
55 - "customer_name": config.customer_name,  
56 "es_index_name": config.es_index_name, 52 "es_index_name": config.es_index_name,
57 "num_fields": len(config.fields), 53 "num_fields": len(config.fields),
58 "num_indexes": len(config.indexes), 54 "num_indexes": len(config.indexes),
config/config.yaml 0 → 100644
@@ -0,0 +1,269 @@ @@ -0,0 +1,269 @@
  1 +# Unified Configuration for Multi-Tenant Search Engine
  2 +# Unified configuration file; all tenants share one set of index settings
  3 +# Note: this file contains no MySQL settings, only ES search settings
  4 +
  5 +# Elasticsearch Index
  6 +es_index_name: "search_products"
  7 +
  8 +# ES Index Settings
  9 +es_settings:
  10 + number_of_shards: 1
  11 + number_of_replicas: 0
  12 + refresh_interval: "30s"
  13 +
  14 +# Field Definitions (SPU level; only fields useful for search)
  15 +fields:
  16 + # Tenant isolation field (required)
  17 + - name: "tenant_id"
  18 + type: "KEYWORD"
  19 + required: true
  20 + index: true
  21 + store: true
  22 +
  23 + # Product identifier fields
  24 + - name: "product_id"
  25 + type: "KEYWORD"
  26 + required: true
  27 + index: true
  28 + store: true
  29 +
  30 + - name: "handle"
  31 + type: "KEYWORD"
  32 + index: true
  33 + store: true
  34 +
  35 + # Text search fields
  36 + - name: "title"
  37 + type: "TEXT"
  38 + analyzer: "chinese_ecommerce"
  39 + boost: 3.0
  40 + index: true
  41 + store: true
  42 +
  43 + - name: "brief"
  44 + type: "TEXT"
  45 + analyzer: "chinese_ecommerce"
  46 + boost: 1.5
  47 + index: true
  48 + store: true
  49 +
  50 + - name: "description"
  51 + type: "TEXT"
  52 + analyzer: "chinese_ecommerce"
  53 + boost: 1.0
  54 + index: true
  55 + store: true
  56 +
  57 + # SEO fields (improve relevance)
  58 + - name: "seo_title"
  59 + type: "TEXT"
  60 + analyzer: "chinese_ecommerce"
  61 + boost: 2.0
  62 + index: true
  63 + store: true
  64 +
  65 + - name: "seo_description"
  66 + type: "TEXT"
  67 + analyzer: "chinese_ecommerce"
  68 + boost: 1.5
  69 + index: true
  70 + store: true
  71 +
  72 + - name: "seo_keywords"
  73 + type: "TEXT"
  74 + analyzer: "chinese_ecommerce"
  75 + boost: 2.0
  76 + index: true
  77 + store: true
  78 +
  79 + # Category and tag fields (indexed as both TEXT and KEYWORD)
  80 + - name: "vendor"
  81 + type: "TEXT"
  82 + analyzer: "chinese_ecommerce"
  83 + boost: 1.5
  84 + index: true
  85 + store: true
  86 +
  87 + - name: "vendor_keyword"
  88 + type: "KEYWORD"
  89 + index: true
  90 + store: false
  91 +
  92 + - name: "product_type"
  93 + type: "TEXT"
  94 + analyzer: "chinese_ecommerce"
  95 + boost: 1.5
  96 + index: true
  97 + store: true
  98 +
  99 + - name: "product_type_keyword"
  100 + type: "KEYWORD"
  101 + index: true
  102 + store: false
  103 +
  104 + - name: "tags"
  105 + type: "TEXT"
  106 + analyzer: "chinese_ecommerce"
  107 + boost: 1.0
  108 + index: true
  109 + store: true
  110 +
  111 + - name: "tags_keyword"
  112 + type: "KEYWORD"
  113 + index: true
  114 + store: false
  115 +
  116 + - name: "category"
  117 + type: "TEXT"
  118 + analyzer: "chinese_ecommerce"
  119 + boost: 1.5
  120 + index: true
  121 + store: true
  122 +
  123 + - name: "category_keyword"
  124 + type: "KEYWORD"
  125 + index: true
  126 + store: false
  127 +
  128 + # Price fields (flattened)
  129 + - name: "min_price"
  130 + type: "FLOAT"
  131 + index: true
  132 + store: true
  133 +
  134 + - name: "max_price"
  135 + type: "FLOAT"
  136 + index: true
  137 + store: true
  138 +
  139 + - name: "compare_at_price"
  140 + type: "FLOAT"
  141 + index: true
  142 + store: true
  143 +
  144 + # Image field (for display only, not searched)
  145 + - name: "image_url"
  146 + type: "KEYWORD"
  147 + index: false
  148 + store: true
  149 +
  150 + # Nested variants field
  151 + - name: "variants"
  152 + type: "JSON"
  153 + nested: true
  154 + nested_properties:
  155 + variant_id:
  156 + type: "keyword"
  157 + index: true
  158 + store: true
  159 + title:
  160 + type: "text"
  161 + analyzer: "chinese_ecommerce"
  162 + index: true
  163 + store: true
  164 + price:
  165 + type: "float"
  166 + index: true
  167 + store: true
  168 + compare_at_price:
  169 + type: "float"
  170 + index: true
  171 + store: true
  172 + sku:
  173 + type: "keyword"
  174 + index: true
  175 + store: true
  176 + stock:
  177 + type: "long"
  178 + index: true
  179 + store: true
  180 + options:
  181 + type: "object"
  182 + enabled: true
  183 +
  184 +# Index Structure (Query Domains)
  185 +indexes:
  186 + - name: "default"
  187 + label: "默认索引"
  188 + fields:
  189 + - "title"
  190 + - "brief"
  191 + - "description"
  192 + - "seo_title"
  193 + - "seo_description"
  194 + - "seo_keywords"
  195 + - "vendor"
  196 + - "product_type"
  197 + - "tags"
  198 + - "category"
  199 + analyzer: "chinese_ecommerce"
  200 + boost: 1.0
  201 +
  202 + - name: "title"
  203 + label: "标题索引"
  204 + fields:
  205 + - "title"
  206 + - "seo_title"
  207 + analyzer: "chinese_ecommerce"
  208 + boost: 2.0
  209 +
  210 + - name: "vendor"
  211 + label: "品牌索引"
  212 + fields:
  213 + - "vendor"
  214 + analyzer: "chinese_ecommerce"
  215 + boost: 1.5
  216 +
  217 + - name: "category"
  218 + label: "类目索引"
  219 + fields:
  220 + - "category"
  221 + analyzer: "chinese_ecommerce"
  222 + boost: 1.5
  223 +
  224 + - name: "tags"
  225 + label: "标签索引"
  226 + fields:
  227 + - "tags"
  228 + - "seo_keywords"
  229 + analyzer: "chinese_ecommerce"
  230 + boost: 1.0
  231 +
  232 +# Query Configuration
  233 +query_config:
  234 + supported_languages:
  235 + - "zh"
  236 + - "en"
  237 + default_language: "zh"
  238 + enable_translation: true
  239 + enable_text_embedding: true
  240 + enable_query_rewrite: true
  241 +
  242 + # Translation API (DeepL)
  243 + translation_service: "deepl"
  244 + translation_api_key: null # Set via environment variable
  245 +
  246 +# Ranking Configuration
  247 +ranking:
  248 + expression: "bm25() + 0.2*text_embedding_relevance()"
  249 + description: "BM25 text relevance combined with semantic embedding similarity"
  250 +
  251 +# Function score configuration (ES-level scoring rules)
  252 +function_score:
  253 + score_mode: "sum"
  254 + boost_mode: "multiply"
  255 +
  256 + functions: []
  257 +
  258 +# Rerank configuration (local reranking, currently disabled)
  259 +rerank:
  260 + enabled: false
  261 + expression: ""
  262 + description: "Local reranking (disabled, use ES function_score instead)"
  263 +
  264 +# SPU configuration (enabled, uses nested variants)
  265 +spu_config:
  266 + enabled: true
  267 + spu_field: "product_id"
  268 + inner_hits_size: 10
  269 +
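
The plan asks `validate_config()` to ensure that the `tenant_id` field exists and is required, but the corresponding hunk is not part of this diff. A minimal sketch of such a check, operating on the dictionary that `yaml.safe_load` would return for the file above, might look like this. The function name `check_tenant_field` and the error messages are assumptions; the `KEYWORD` type check reflects the definition in `config/config.yaml` rather than any rule stated in the plan.

```python
from typing import Any, Dict, List


def check_tenant_field(config_data: Dict[str, Any]) -> List[str]:
    """Return validation errors for the tenant isolation field (hypothetical check)."""
    errors: List[str] = []
    # Index the field definitions by name for direct lookup.
    fields = {f.get("name"): f for f in config_data.get("fields", [])}
    tenant = fields.get("tenant_id")
    if tenant is None:
        errors.append("missing field definition: tenant_id")
        return errors
    if tenant.get("type") != "KEYWORD":
        # Exact-match filtering needs a non-analyzed field.
        errors.append("tenant_id should be of type KEYWORD")
    if not tenant.get("required", False):
        errors.append("tenant_id must be marked required: true")
    return errors
```

Returning a list of messages instead of raising matches how `init_service()` consumes `validate_config()` in this commit: it raises `ValueError` only if the returned list is non-empty.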
config/config_loader.py
@@ -86,10 +86,7 @@ class RerankConfig: @@ -86,10 +86,7 @@ class RerankConfig:
86 86
87 @dataclass 87 @dataclass
88 class CustomerConfig: 88 class CustomerConfig:
89 - """Complete configuration for a customer."""  
90 - customer_id: str  
91 - customer_name: str  
92 - 89 + """Complete configuration for search engine (multi-tenant)."""
93 # Field definitions 90 # Field definitions
94 fields: List[FieldConfig] 91 fields: List[FieldConfig]
95 92
@@ -122,22 +119,20 @@ class ConfigurationError(Exception): @@ -122,22 +119,20 @@ class ConfigurationError(Exception):
122 119
123 120
124 class ConfigLoader: 121 class ConfigLoader:
125 - """Loads and validates customer configurations from YAML files.""" 122 + """Loads and validates unified search engine configuration from YAML file."""
126 123
127 - def __init__(self, config_dir: str = "config/schema"):  
128 - self.config_dir = Path(config_dir) 124 + def __init__(self, config_file: str = "config/config.yaml"):
  125 + self.config_file = Path(config_file)
129 126
130 - def _load_rewrite_dictionary(self, customer_id: str) -> Dict[str, str]: 127 + def _load_rewrite_dictionary(self) -> Dict[str, str]:
131 """ 128 """
132 Load query rewrite dictionary from external file. 129 Load query rewrite dictionary from external file.
133 130
134 - Args:  
135 - customer_id: Customer identifier  
136 -  
137 Returns: 131 Returns:
138 Dictionary mapping query terms to rewritten queries 132 Dictionary mapping query terms to rewritten queries
139 """ 133 """
140 - dict_file = self.config_dir / customer_id / "query_rewrite.dict" 134 + # Try config/query_rewrite.dict first
  135 + dict_file = self.config_file.parent / "query_rewrite.dict"
141 136
142 if not dict_file.exists(): 137 if not dict_file.exists():
143 # Dictionary file is optional, return empty dict if not found 138 # Dictionary file is optional, return empty dict if not found
@@ -166,16 +161,9 @@ class ConfigLoader: @@ -166,16 +161,9 @@ class ConfigLoader:
166 161
167 return rewrite_dict 162 return rewrite_dict
168 163
169 - def load_customer_config(self, customer_id: str) -> CustomerConfig: 164 + def load_config(self) -> CustomerConfig:
170 """ 165 """
171 - Load customer configuration from YAML file.  
172 -  
173 - Supports two directory structures:  
174 - 1. New structure: config/schema/{customer_id}/config.yaml  
175 - 2. Old structure: config/schema/{customer_id}_config.yaml (for backward compatibility)  
176 -  
177 - Args:  
178 - customer_id: Customer identifier (used to find config file) 166 + Load unified configuration from YAML file.
179 167
180 Returns: 168 Returns:
181 CustomerConfig object 169 CustomerConfig object
@@ -183,25 +171,18 @@ class ConfigLoader: @@ -183,25 +171,18 @@ class ConfigLoader:
183 Raises: 171 Raises:
184 ConfigurationError: If config file not found or invalid 172 ConfigurationError: If config file not found or invalid
185 """ 173 """
186 - # Try new directory structure first  
187 - config_file = self.config_dir / customer_id / "config.yaml"  
188 -  
189 - # Fall back to old structure if new one doesn't exist  
190 - if not config_file.exists():  
191 - config_file = self.config_dir / f"{customer_id}_config.yaml"  
192 -  
193 - if not config_file.exists():  
194 - raise ConfigurationError(f"Configuration file not found: {config_file}") 174 + if not self.config_file.exists():
  175 + raise ConfigurationError(f"Configuration file not found: {self.config_file}")
195 176
196 try: 177 try:
197 - with open(config_file, 'r', encoding='utf-8') as f: 178 + with open(self.config_file, 'r', encoding='utf-8') as f:
198 config_data = yaml.safe_load(f) 179 config_data = yaml.safe_load(f)
199 except yaml.YAMLError as e: 180 except yaml.YAMLError as e:
200 - raise ConfigurationError(f"Invalid YAML in {config_file}: {e}") 181 + raise ConfigurationError(f"Invalid YAML in {self.config_file}: {e}")
201 182
202 - return self._parse_config(config_data, customer_id) 183 + return self._parse_config(config_data)
203 184
204 - def _parse_config(self, config_data: Dict[str, Any], customer_id: str) -> CustomerConfig: 185 + def _parse_config(self, config_data: Dict[str, Any]) -> CustomerConfig:
205 """Parse configuration dictionary into CustomerConfig object.""" 186 """Parse configuration dictionary into CustomerConfig object."""
206 187
207 # Parse fields 188 # Parse fields
@@ -218,7 +199,7 @@ class ConfigLoader: @@ -218,7 +199,7 @@ class ConfigLoader:
218 query_config_data = config_data.get("query_config", {}) 199 query_config_data = config_data.get("query_config", {})
219 200
220 # Load rewrite dictionary from external file instead of config 201 # Load rewrite dictionary from external file instead of config
221 - rewrite_dictionary = self._load_rewrite_dictionary(customer_id) 202 + rewrite_dictionary = self._load_rewrite_dictionary()
222 203
223 query_config = QueryConfig( 204 query_config = QueryConfig(
224 supported_languages=query_config_data.get("supported_languages", ["zh", "en"]), 205 supported_languages=query_config_data.get("supported_languages", ["zh", "en"]),
@@ -263,8 +244,6 @@ class ConfigLoader: @@ -263,8 +244,6 @@ class ConfigLoader:
263 ) 244 )
264 245
265 return CustomerConfig( 246 return CustomerConfig(
266 - customer_id=customer_id,  
267 - customer_name=config_data.get("customer_name", customer_id),  
268 fields=fields, 247 fields=fields,
269 indexes=indexes, 248 indexes=indexes,
270 query_config=query_config, 249 query_config=query_config,
@@ -272,7 +251,7 @@ class ConfigLoader: @@ -272,7 +251,7 @@ class ConfigLoader:
272 function_score=function_score, 251 function_score=function_score,
273 rerank=rerank, 252 rerank=rerank,
274 spu_config=spu_config, 253 spu_config=spu_config,
275 - es_index_name=config_data.get("es_index_name", f"search_{customer_id}"), 254 + es_index_name=config_data.get("es_index_name", "search_products"),
276 es_settings=config_data.get("es_settings", {}) 255 es_settings=config_data.get("es_settings", {})
277 ) 256 )
278 257
@@ -430,23 +409,21 @@ class ConfigLoader: @@ -430,23 +409,21 @@ class ConfigLoader:
430 409
431 def save_config(self, config: CustomerConfig, output_path: Optional[str] = None) -> None: 410 def save_config(self, config: CustomerConfig, output_path: Optional[str] = None) -> None:
432 """ 411 """
433 - Save customer configuration to YAML file. 412 + Save configuration to YAML file.
434 413
435 Note: rewrite_dictionary is saved separately to query_rewrite.dict file 414 Note: rewrite_dictionary is saved separately to query_rewrite.dict file
436 415
437 Args: 416 Args:
438 config: Configuration to save 417 config: Configuration to save
439 - output_path: Optional output path (defaults to new directory structure) 418 + output_path: Optional output path (defaults to config/config.yaml)
440 """ 419 """
441 if output_path is None: 420 if output_path is None:
442 - # Use new directory structure by default  
443 - customer_dir = self.config_dir / config.customer_id  
444 - customer_dir.mkdir(parents=True, exist_ok=True)  
445 - output_path = customer_dir / "config.yaml" 421 + output_path = self.config_file
  422 + else:
  423 + output_path = Path(output_path)
446 424
447 # Convert config back to dictionary format 425 # Convert config back to dictionary format
448 config_dict = { 426 config_dict = {
449 - "customer_name": config.customer_name,  
450 "es_index_name": config.es_index_name, 427 "es_index_name": config.es_index_name,
451 "es_settings": config.es_settings, 428 "es_settings": config.es_settings,
452 "fields": [self._field_to_dict(field) for field in config.fields], 429 "fields": [self._field_to_dict(field) for field in config.fields],
@@ -482,23 +459,22 @@ class ConfigLoader: @@ -482,23 +459,22 @@ class ConfigLoader:
482 } 459 }
483 } 460 }
484 461
  462 + output_path.parent.mkdir(parents=True, exist_ok=True)
485 with open(output_path, 'w', encoding='utf-8') as f: 463 with open(output_path, 'w', encoding='utf-8') as f:
486 yaml.dump(config_dict, f, default_flow_style=False, allow_unicode=True) 464 yaml.dump(config_dict, f, default_flow_style=False, allow_unicode=True)
487 465
488 # Save rewrite dictionary to separate file 466 # Save rewrite dictionary to separate file
489 - self._save_rewrite_dictionary(config.customer_id, config.query_config.rewrite_dictionary) 467 + self._save_rewrite_dictionary(config.query_config.rewrite_dictionary)
490 468
491 - def _save_rewrite_dictionary(self, customer_id: str, rewrite_dict: Dict[str, str]) -> None: 469 + def _save_rewrite_dictionary(self, rewrite_dict: Dict[str, str]) -> None:
492 """ 470 """
493 Save rewrite dictionary to external file. 471 Save rewrite dictionary to external file.
494 472
495 Args: 473 Args:
496 - customer_id: Customer identifier  
497 rewrite_dict: Dictionary to save 474 rewrite_dict: Dictionary to save
498 """ 475 """
499 - customer_dir = self.config_dir / customer_id  
500 - customer_dir.mkdir(parents=True, exist_ok=True)  
501 - dict_file = customer_dir / "query_rewrite.dict" 476 + dict_file = self.config_file.parent / "query_rewrite.dict"
  477 + dict_file.parent.mkdir(parents=True, exist_ok=True)
502 478
503 with open(dict_file, 'w', encoding='utf-8') as f: 479 with open(dict_file, 'w', encoding='utf-8') as f:
504 for key, value in rewrite_dict.items(): 480 for key, value in rewrite_dict.items():
config/query_rewrite.dict 0 → 100644
@@ -0,0 +1,4 @@ @@ -0,0 +1,4 @@
  1 +芭比 brand:芭比 OR name:芭比娃娃
  2 +玩具 category:玩具
  3 +消防 category:消防 OR name:消防
  4 +
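
Each entry in `config/query_rewrite.dict` appears to be a query term followed by its rewritten form. The parsing code in `_load_rewrite_dictionary()` falls outside the hunks shown here, so the sketch below is only one plausible reading of the format. It assumes the term and the rewrite are separated by the first run of whitespace (likely a tab in the original file) and that blank lines and `#` comment lines are skipped. The name `parse_rewrite_lines` is hypothetical.

```python
from typing import Dict, Iterable


def parse_rewrite_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'term<whitespace>rewrite' lines into a dict (assumed format)."""
    rewrite: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        parts = line.split(None, 1)  # split on the first whitespace run only
        if len(parts) != 2:
            continue  # a term without a rewrite is ignored
        term, rewritten = parts
        rewrite[term] = rewritten.strip()
    return rewrite
```

Splitting only once matters because the rewritten query itself contains spaces, as in `brand:芭比 OR name:芭比娃娃`.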
frontend/index.html
@@ -21,6 +21,10 @@ @@ -21,6 +21,10 @@
21 21
22 <!-- Search Bar --> 22 <!-- Search Bar -->
23 <div class="search-bar"> 23 <div class="search-bar">
  24 + <div class="tenant-input-wrapper">
  25 + <label for="tenantInput">租户ID:</label>
  26 + <input type="text" id="tenantInput" placeholder="请输入租户ID" value="1">
  27 + </div>
24 <input type="text" id="searchInput" placeholder="输入搜索关键词... (支持中文、英文、俄文)" 28 <input type="text" id="searchInput" placeholder="输入搜索关键词... (支持中文、英文、俄文)"
25 onkeypress="handleKeyPress(event)"> 29 onkeypress="handleKeyPress(event)">
26 <button onclick="performSearch()" class="search-btn">搜索</button> 30 <button onclick="performSearch()" class="search-btn">搜索</button>
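
With the tenant ID now entered in the UI, every search request has to carry it. The plan names the `X-Tenant-ID` header as the transport. For readers testing the API outside the browser, here is a hedged Python sketch of building such a request with the standard library. The `/search/` path and the JSON body shape are assumptions, since the API routes are not part of this diff; the request is constructed but not sent.

```python
import json
import urllib.request


def build_search_request(base_url: str, tenant_id: str, query: str) -> urllib.request.Request:
    """Build a POST search request carrying the tenant ID (hypothetical endpoint)."""
    tenant_id = tenant_id.strip()
    if not tenant_id:
        # Mirrors the frontend rule that the tenant ID must not be empty.
        raise ValueError("tenant ID must not be empty")
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/search/",  # assumed path
        data=body,
        headers={"Content-Type": "application/json", "X-Tenant-ID": tenant_id},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it to a running backend, for example one listening on `http://localhost:6002`.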
frontend/static/css/style.css
@@ -69,6 +69,32 @@ body { @@ -69,6 +69,32 @@ body {
69 padding: 20px 30px; 69 padding: 20px 30px;
70 background: white; 70 background: white;
71 border-bottom: 1px solid #e0e0e0; 71 border-bottom: 1px solid #e0e0e0;
  72 + align-items: center;
  73 +}
  74 +
  75 +.tenant-input-wrapper {
  76 + display: flex;
  77 + align-items: center;
  78 + gap: 8px;
  79 +}
  80 +
  81 +.tenant-input-wrapper label {
  82 + font-size: 14px;
  83 + color: #666;
  84 + white-space: nowrap;
  85 +}
  86 +
  87 +#tenantInput {
  88 + width: 120px;
  89 + padding: 10px 15px;
  90 + font-size: 14px;
  91 + border: 1px solid #ddd;
  92 + border-radius: 4px;
  93 + outline: none;
  94 +}
  95 +
  96 +#tenantInput:focus {
  97 + border-color: #e74c3c;
72 } 98 }
73 99
74 #searchInput { 100 #searchInput {
frontend/static/js/app.js
1 -// SearchEngine Frontend - Modern UI 1 +// SearchEngine Frontend - Modern UI (Multi-Tenant)
2 2
3 -const API_BASE_URL = 'http://120.76.41.98:6002'; 3 +const API_BASE_URL = 'http://localhost:6002';
4 document.getElementById('apiUrl').textContent = API_BASE_URL; 4 document.getElementById('apiUrl').textContent = API_BASE_URL;
5 5
  6 +// Get tenant ID from input
  7 +function getTenantId() {
  8 + const tenantInput = document.getElementById('tenantInput');
  9 + if (tenantInput) {
  10 + return tenantInput.value.trim();
  11 + }
  12 + return '1'; // Default fallback
  13 +}
  14 +
6 // State Management 15 // State Management
7 let state = { 16 let state = {
8 query: '', 17 query: '',
@@ -42,12 +51,18 @@ function toggleFilters() { @@ -42,12 +51,18 @@ function toggleFilters() {
42 // Perform search 51 // Perform search
43 async function performSearch(page = 1) { 52 async function performSearch(page = 1) {
44 const query = document.getElementById('searchInput').value.trim(); 53 const query = document.getElementById('searchInput').value.trim();
  54 + const tenantId = getTenantId();
45 55
46 if (!query) { 56 if (!query) {
47 alert('Please enter search keywords'); 57 alert('Please enter search keywords');
48 return; 58 return;
49 } 59 }
50 60
  61 + if (!tenantId) {
  62 + alert('Please enter tenant ID');
  63 + return;
  64 + }
  65 +
51 state.query = query; 66 state.query = query;
52 state.currentPage = page; 67 state.currentPage = page;
53 state.pageSize = parseInt(document.getElementById('resultSize').value); 68 state.pageSize = parseInt(document.getElementById('resultSize').value);
@@ -57,22 +72,22 @@ async function performSearch(page = 1) { @@ -57,22 +72,22 @@ async function performSearch(page = 1) {
57 // Define facets (simplified config) 72 // Define facets (simplified config)
58 const facets = [ 73 const facets = [
59 { 74 {
60 - "field": "categoryName_keyword", 75 + "field": "category_keyword",
61 "size": 15, 76 "size": 15,
62 "type": "terms" 77 "type": "terms"
63 }, 78 },
64 { 79 {
65 - "field": "brandName_keyword", 80 + "field": "vendor_keyword",
66 "size": 15, 81 "size": 15,
67 "type": "terms" 82 "type": "terms"
68 }, 83 },
69 { 84 {
70 - "field": "supplierName_keyword", 85 + "field": "tags_keyword",
71 "size": 10, 86 "size": 10,
72 "type": "terms" 87 "type": "terms"
73 }, 88 },
74 { 89 {
75 - "field": "price", 90 + "field": "min_price",
76 "type": "range", 91 "type": "range",
77 "ranges": [ 92 "ranges": [
78 {"key": "0-50", "to": 50}, 93 {"key": "0-50", "to": 50},
@@ -92,6 +107,7 @@ async function performSearch(page = 1) { @@ -92,6 +107,7 @@ async function performSearch(page = 1) {
92 method: 'POST', 107 method: 'POST',
93 headers: { 108 headers: {
94 'Content-Type': 'application/json', 109 'Content-Type': 'application/json',
  110 + 'X-Tenant-ID': tenantId,
95 }, 111 },
96 body: JSON.stringify({ 112 body: JSON.stringify({
97 query: query, 113 query: query,
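For checking the per-request tenant header outside the browser, the same request can be sketched in Python. This is an illustration under assumptions: the endpoint path `/search/` is a placeholder (the URL is not visible in this hunk), the body carries only `query`, and `build_search_request` is a hypothetical name.

```python
import json
import urllib.request

API_BASE_URL = "http://localhost:6002"  # same default as the frontend constant


def build_search_request(query: str, tenant_id: str,
                         path: str = "/search/") -> urllib.request.Request:
    """Build a POST request that carries the tenant in the X-Tenant-ID header."""
    tenant_id = tenant_id.strip()
    if not tenant_id:
        # Mirrors the frontend check: refuse to search without a tenant ID.
        raise ValueError("tenant ID is required")
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        API_BASE_URL + path,  # `path` is an assumed endpoint, not taken from the diff
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-Tenant-ID": tenant_id},
    )


request = build_search_request("玩具", " 1 ")
```

Passing `request` to `urllib.request.urlopen` would send it; the sketch stops before any network call.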
@@ -140,7 +156,7 @@ async function performSearch(page = 1) { @@ -140,7 +156,7 @@ async function performSearch(page = 1) {
140 function displayResults(data) { 156 function displayResults(data) {
141 const grid = document.getElementById('productGrid'); 157 const grid = document.getElementById('productGrid');
142 158
143 - if (!data.hits || data.hits.length === 0) { 159 + if (!data.results || data.results.length === 0) {
144 grid.innerHTML = ` 160 grid.innerHTML = `
145 <div class="no-results" style="grid-column: 1 / -1;"> 161 <div class="no-results" style="grid-column: 1 / -1;">
146 <h3>No Results Found</h3> 162 <h3>No Results Found</h3>
@@ -152,16 +168,20 @@ function displayResults(data) { @@ -152,16 +168,20 @@ function displayResults(data) {
152 168
153 let html = ''; 169 let html = '';
154 170
155 - data.hits.forEach((hit) => {  
156 - const source = hit._source;  
157 - const score = hit._custom_score || hit._score; 171 + data.results.forEach((result) => {
  172 + const product = result;
  173 + const title = product.title || product.name || 'N/A';
  174 + const price = product.min_price || product.price || 'N/A';
  175 + const imageUrl = product.image_url || product.imageUrl || '';
  176 + const category = product.category || product.categoryName || '';
  177 + const vendor = product.vendor || product.brandName || '';
158 178
159 html += ` 179 html += `
160 <div class="product-card"> 180 <div class="product-card">
161 <div class="product-image-wrapper"> 181 <div class="product-image-wrapper">
162 - ${source.imageUrl ? `  
163 - <img src="${escapeHtml(source.imageUrl)}"  
164 - alt="${escapeHtml(source.name)}" 182 + ${imageUrl ? `
  183 + <img src="${escapeHtml(imageUrl)}"
  184 + alt="${escapeHtml(title)}"
165 class="product-image" 185 class="product-image"
166 onerror="this.src='data:image/svg+xml,%3Csvg xmlns=%22http://www.w3.org/2000/svg%22 width=%22100%22 height=%22100%22%3E%3Crect fill=%22%23f0f0f0%22 width=%22100%22 height=%22100%22/%3E%3Ctext x=%2250%25%22 y=%2250%25%22 font-size=%2214%22 text-anchor=%22middle%22 dy=%22.3em%22 fill=%22%23999%22%3ENo Image%3C/text%3E%3C/svg%3E'"> 186 onerror="this.src='data:image/svg+xml,%3Csvg xmlns=%22http://www.w3.org/2000/svg%22 width=%22100%22 height=%22100%22%3E%3Crect fill=%22%23f0f0f0%22 width=%22100%22 height=%22100%22/%3E%3Ctext x=%2250%25%22 y=%2250%25%22 font-size=%2214%22 text-anchor=%22middle%22 dy=%22.3em%22 fill=%22%23999%22%3ENo Image%3C/text%3E%3C/svg%3E'">
167 ` : ` 187 ` : `
@@ -170,31 +190,17 @@ function displayResults(data) { @@ -170,31 +190,17 @@ function displayResults(data) {
170 </div> 190 </div>
171 191
172 <div class="product-price"> 192 <div class="product-price">
173 - ${source.price ? `${source.price} ₽` : 'N/A'}  
174 - </div>  
175 -  
176 - <div class="product-moq">  
177 - MOQ ${source.moq || 1} Box  
178 - </div>  
179 -  
180 - <div class="product-quantity">  
181 - ${source.quantity || 'N/A'} pcs / Box 193 + ${price !== 'N/A' ? `¥${price}` : 'N/A'}
182 </div> 194 </div>
183 195
184 <div class="product-title"> 196 <div class="product-title">
185 - ${escapeHtml(source.name || source.enSpuName || 'N/A')} 197 + ${escapeHtml(title)}
186 </div> 198 </div>
187 199
188 <div class="product-meta"> 200 <div class="product-meta">
189 - ${source.categoryName ? escapeHtml(source.categoryName) : ''}  
190 - ${source.brandName ? ' | ' + escapeHtml(source.brandName) : ''} 201 + ${category ? escapeHtml(category) : ''}
  202 + ${vendor ? ' | ' + escapeHtml(vendor) : ''}
191 </div> 203 </div>
192 -  
193 - ${source.create_time ? `  
194 - <div class="product-time">  
195 - Listed: ${formatDate(source.create_time)}  
196 - </div>  
197 - ` : ''}  
198 </div> 204 </div>
199 `; 205 `;
200 }); 206 });
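The field fallbacks in `displayResults` (new field name first, legacy name second) can be sketched in Python as follows. `normalize_product` is a hypothetical helper, and Python truthiness only approximates JavaScript's `||`: both skip missing values, empty strings, and `0`.

```python
from typing import Any, Dict


def normalize_product(result: Dict[str, Any]) -> Dict[str, Any]:
    """Pick the first non-empty value among new and legacy field names."""

    def first(*keys: str, default: Any = "") -> Any:
        for key in keys:
            value = result.get(key)
            if value:  # falsy values (None, "", 0) fall through to the next key
                return value
        return default

    return {
        "title": first("title", "name", default="N/A"),
        "price": first("min_price", "price", default="N/A"),
        "image_url": first("image_url", "imageUrl"),
        "category": first("category", "categoryName"),
        "vendor": first("vendor", "brandName"),
    }


legacy = normalize_product({"name": "芭比娃娃", "min_price": 0, "price": 12.5, "brandName": "芭比"})
```

Note that a `min_price` of `0` falls through to `price`, which is the same behavior the `||` chain has in the JavaScript.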
@@ -211,13 +217,13 @@ function displayFacets(facets) { @@ -211,13 +217,13 @@ function displayFacets(facets) {
211 let containerId = null; 217 let containerId = null;
212 let maxDisplay = 10; 218 let maxDisplay = 10;
213 219
214 - if (facet.field === 'categoryName_keyword') { 220 + if (facet.field === 'category_keyword') {
215 containerId = 'categoryTags'; 221 containerId = 'categoryTags';
216 maxDisplay = 10; 222 maxDisplay = 10;
217 - } else if (facet.field === 'brandName_keyword') { 223 + } else if (facet.field === 'vendor_keyword') {
218 containerId = 'brandTags'; 224 containerId = 'brandTags';
219 maxDisplay = 10; 225 maxDisplay = 10;
220 - } else if (facet.field === 'supplierName_keyword') { 226 + } else if (facet.field === 'tags_keyword') {
221 containerId = 'supplierTags'; 227 containerId = 'supplierTags';
222 maxDisplay = 8; 228 maxDisplay = 8;
223 } 229 }
@@ -269,7 +275,7 @@ function toggleFilter(field, value) { @@ -269,7 +275,7 @@ function toggleFilter(field, value) {
269 // Handle price filter (refactored: uses rangeFilters) 275 // Handle price filter (refactored: uses rangeFilters)
270 function handlePriceFilter(value) { 276 function handlePriceFilter(value) {
271 if (!value) { 277 if (!value) {
272 - delete state.rangeFilters.price; 278 + delete state.rangeFilters.min_price;
273 } else { 279 } else {
274 const priceRanges = { 280 const priceRanges = {
275 '0-50': { lt: 50 }, 281 '0-50': { lt: 50 },
@@ -279,7 +285,7 @@ function handlePriceFilter(value) { @@ -279,7 +285,7 @@ function handlePriceFilter(value) {
279 }; 285 };
280 286
281 if (priceRanges[value]) { 287 if (priceRanges[value]) {
282 - state.rangeFilters.price = priceRanges[value]; 288 + state.rangeFilters.min_price = priceRanges[value];
283 } 289 }
284 } 290 }
285 291
frontend/static/js/app_base.js
1 -// SearchEngine Frontend - Modern UI 1 +// SearchEngine Frontend - Modern UI (Multi-Tenant)
2 2
3 -const TENANT_ID = '1';  
4 const API_BASE_URL = 'http://localhost:6002'; 3 const API_BASE_URL = 'http://localhost:6002';
5 document.getElementById('apiUrl').textContent = API_BASE_URL; 4 document.getElementById('apiUrl').textContent = API_BASE_URL;
6 5
  6 +// Get tenant ID from input
  7 +function getTenantId() {
  8 + const tenantInput = document.getElementById('tenantInput');
  9 + if (tenantInput) {
  10 + return tenantInput.value.trim();
  11 + }
  12 + return '1'; // Default fallback
  13 +}
  14 +
7 // State Management 15 // State Management
8 let state = { 16 let state = {
9 query: '', 17 query: '',
@@ -43,12 +51,18 @@ function toggleFilters() { @@ -43,12 +51,18 @@ function toggleFilters() {
43 // Perform search 51 // Perform search
44 async function performSearch(page = 1) { 52 async function performSearch(page = 1) {
45 const query = document.getElementById('searchInput').value.trim(); 53 const query = document.getElementById('searchInput').value.trim();
  54 + const tenantId = getTenantId();
46 55
47 if (!query) { 56 if (!query) {
48 alert('Please enter search keywords'); 57 alert('Please enter search keywords');
49 return; 58 return;
50 } 59 }
51 60
  61 + if (!tenantId) {
  62 + alert('Please enter tenant ID');
  63 + return;
  64 + }
  65 +
52 state.query = query; 66 state.query = query;
53 state.currentPage = page; 67 state.currentPage = page;
54 state.pageSize = parseInt(document.getElementById('resultSize').value); 68 state.pageSize = parseInt(document.getElementById('resultSize').value);
@@ -93,7 +107,7 @@ async function performSearch(page = 1) { @@ -93,7 +107,7 @@ async function performSearch(page = 1) {
93 method: 'POST', 107 method: 'POST',
94 headers: { 108 headers: {
95 'Content-Type': 'application/json', 109 'Content-Type': 'application/json',
96 - 'X-Tenant-ID': TENANT_ID, 110 + 'X-Tenant-ID': tenantId,
97 }, 111 },
98 body: JSON.stringify({ 112 body: JSON.stringify({
99 query: query, 113 query: query,
frontend/unified.html 0 → 100644
@@ -0,0 +1,138 @@ @@ -0,0 +1,138 @@
  1 +<!DOCTYPE html>
  2 +<html lang="zh-CN">
  3 +<head>
  4 + <meta charset="UTF-8">
  5 + <meta name="viewport" content="width=device-width, initial-scale=1.0">
  6 + <title>Unified Search Interface</title>
  7 + <link rel="stylesheet" href="/static/css/style.css">
  8 + <style>
  9 + .tenant-selector {
  10 + display: flex;
  11 + align-items: center;
  12 + gap: 10px;
  13 + margin-bottom: 10px;
  14 + padding: 10px;
  15 + background: #f5f5f5;
  16 + border-radius: 4px;
  17 + }
  18 + .tenant-selector label {
  19 + font-weight: bold;
  20 + color: #333;
  21 + }
  22 + .tenant-selector select {
  23 + padding: 6px 12px;
  24 + border: 1px solid #ddd;
  25 + border-radius: 4px;
  26 + font-size: 14px;
  27 + background: white;
  28 + cursor: pointer;
  29 + }
  30 + .tenant-selector select:hover {
  31 + border-color: #007bff;
  32 + }
  33 + .tenant-info {
  34 + font-size: 12px;
  35 + color: #666;
  36 + margin-left: auto;
  37 + }
  38 + </style>
  39 +</head>
  40 +<body>
  41 + <div class="page-container">
  42 + <!-- Header -->
  43 + <header class="top-header">
  44 + <div class="header-left">
  45 + <span class="logo">Unified Search</span>
  46 + <span class="product-count" id="productCount">0 products found</span>
  47 + </div>
  48 + <div class="header-right">
  49 + <button class="fold-btn" onclick="toggleFilters()">Fold</button>
  50 + </div>
  51 + </header>
  52 +
  53 + <!-- Tenant Selector -->
  54 + <div class="tenant-selector">
  55 + <label for="tenantSelect">Select tenant:</label>
  56 + <select id="tenantSelect" onchange="switchTenant()">
  57 + <option value="customer1">Customer1 (legacy config)</option>
  58 + <option value="base:1" selected>Base - Tenant 1 (Shoplazza generic)</option>
  59 + <option value="base:2">Base - Tenant 2 (Shoplazza generic)</option>
  60 + </select>
  61 + <span class="tenant-info" id="tenantInfo">Current: Base - Tenant 1</span>
  62 + </div>
  63 +
  64 + <!-- Search Bar -->
  65 + <div class="search-bar">
  66 + <input type="text" id="searchInput" placeholder="Enter search keywords... (Chinese and English supported)"
  67 + onkeypress="handleKeyPress(event)">
  68 + <button onclick="performSearch()" class="search-btn">Search</button>
  69 + </div>
  70 +
  71 + <!-- Filter Section -->
  72 + <div class="filter-section" id="filterSection">
  73 + <!-- Category Filter -->
  74 + <div class="filter-row">
  75 + <div class="filter-label">Categories:</div>
  76 + <div class="filter-tags" id="categoryTags"></div>
  77 + </div>
  78 +
  79 + <!-- Vendor/Brand Filter -->
  80 + <div class="filter-row">
  81 + <div class="filter-label" id="vendorLabel">Vendor:</div>
  82 + <div class="filter-tags" id="brandTags"></div>
  83 + </div>
  84 +
  85 + <!-- Tags/Supplier Filter -->
  86 + <div class="filter-row">
  87 + <div class="filter-label" id="tagsLabel">Tags:</div>
  88 + <div class="filter-tags" id="supplierTags"></div>
  89 + </div>
  90 +
  91 + <!-- Price Range Filter -->
  92 + <div class="filter-row">
  93 + <div class="filter-label">Price Range:</div>
  94 + <div class="filter-tags" id="priceTags"></div>
  95 + </div>
  96 +
  97 + <!-- Result Size -->
  98 + <div class="filter-row">
  99 + <div class="filter-label">Results per page:</div>
  100 + <select id="resultSize" onchange="performSearch()">
  101 + <option value="10">10</option>
  102 + <option value="20" selected>20</option>
  103 + <option value="50">50</option>
  104 + <option value="100">100</option>
  105 + </select>
  106 + </div>
  107 + </div>
  108 +
  109 + <!-- Results Section -->
  110 + <div class="results-section">
  111 + <div id="loading" style="display: none; text-align: center; padding: 20px;">
  112 + <p>Searching...</p>
  113 + </div>
  114 +
  115 + <div id="error" style="display: none; color: red; padding: 20px; text-align: center;"></div>
  116 +
  117 + <div id="welcome" style="text-align: center; padding: 40px; color: #666;">
  118 + <h2>Welcome to Unified Search</h2>
  119 + <p>Select a tenant and enter keywords to search for products</p>
  120 + </div>
  121 +
  122 + <div id="productGrid" class="product-grid"></div>
  123 +
  124 + <!-- Pagination -->
  125 + <div id="pagination" class="pagination"></div>
  126 + </div>
  127 +
  128 + <!-- Debug Section -->
  129 + <div class="debug-section" id="debugSection" style="display: none;">
  130 + <button onclick="toggleDebug()" class="debug-toggle">Toggle Debug Info</button>
  131 + <div id="debugInfo" style="display: none;"></div>
  132 + </div>
  133 + </div>
  134 +
  135 + <script src="/static/js/app_unified.js"></script>
  136 +</body>
  137 +</html>
  138 +
@@ -27,11 +27,11 @@ from search import Searcher @@ -27,11 +27,11 @@ from search import Searcher
27 27
28 def cmd_ingest(args): 28 def cmd_ingest(args):
29 """Run data ingestion.""" 29 """Run data ingestion."""
30 - print(f"Starting ingestion for customer: {args.customer}") 30 + print("Starting data ingestion")
31 31
32 # Load config 32 # Load config
33 - config_loader = ConfigLoader("config/schema")  
34 - config = config_loader.load_customer_config(args.customer) 33 + config_loader = ConfigLoader("config/config.yaml")
  34 + config = config_loader.load_config()
35 35
36 # Initialize ES 36 # Initialize ES
37 es_client = ESClient(hosts=[args.es_host]) 37 es_client = ESClient(hosts=[args.es_host])
@@ -65,11 +65,9 @@ def cmd_ingest(args): @@ -65,11 +65,9 @@ def cmd_ingest(args):
65 65
66 def cmd_serve(args): 66 def cmd_serve(args):
67 """Start API service.""" 67 """Start API service."""
68 - os.environ['CUSTOMER_ID'] = args.customer  
69 os.environ['ES_HOST'] = args.es_host 68 os.environ['ES_HOST'] = args.es_host
70 69
71 - print(f"Starting API service...")  
72 - print(f" Customer: {args.customer}") 70 + print("Starting API service (multi-tenant)...")
73 print(f" Host: {args.host}:{args.port}") 71 print(f" Host: {args.host}:{args.port}")
74 print(f" Elasticsearch: {args.es_host}") 72 print(f" Elasticsearch: {args.es_host}")
75 73
@@ -84,8 +82,8 @@ def cmd_serve(args): @@ -84,8 +82,8 @@ def cmd_serve(args):
84 def cmd_search(args): 82 def cmd_search(args):
85 """Test search from command line.""" 83 """Test search from command line."""
86 # Load config 84 # Load config
87 - config_loader = ConfigLoader("config/schema")  
88 - config = config_loader.load_customer_config(args.customer) 85 + config_loader = ConfigLoader("config/config.yaml")
  86 + config = config_loader.load_config()
89 87
90 # Initialize ES and searcher 88 # Initialize ES and searcher
91 es_client = ESClient(hosts=[args.es_host]) 89 es_client = ESClient(hosts=[args.es_host])
@@ -93,15 +91,16 @@ def cmd_search(args): @@ -93,15 +91,16 @@ def cmd_search(args):
93 print(f"ERROR: Cannot connect to Elasticsearch at {args.es_host}") 91 print(f"ERROR: Cannot connect to Elasticsearch at {args.es_host}")
94 return 1 92 return 1
95 93
96 - searcher = Searcher(config, es_client) 94 + from query import QueryParser
  95 + query_parser = QueryParser(config)
  96 + searcher = Searcher(config, es_client, query_parser)
97 97
98 # Execute search 98 # Execute search
99 - print(f"Searching for: '{args.query}'") 99 + print(f"Searching for: '{args.query}' (tenant: {args.tenant_id})")
100 result = searcher.search( 100 result = searcher.search(
101 query=args.query, 101 query=args.query,
102 - size=args.size,  
103 - enable_translation=not args.no_translation,  
104 - enable_embedding=not args.no_embedding 102 + tenant_id=args.tenant_id,
  103 + size=args.size
105 ) 104 )
106 105
107 # Display results 106 # Display results
@@ -136,7 +135,6 @@ def main(): @@ -136,7 +135,6 @@ def main():
136 # Ingest command 135 # Ingest command
137 ingest_parser = subparsers.add_parser('ingest', help='Ingest data into Elasticsearch') 136 ingest_parser = subparsers.add_parser('ingest', help='Ingest data into Elasticsearch')
138 ingest_parser.add_argument('csv_file', help='Path to CSV data file') 137 ingest_parser.add_argument('csv_file', help='Path to CSV data file')
139 - ingest_parser.add_argument('--customer', default='customer1', help='Customer ID')  
140 ingest_parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host') 138 ingest_parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host')
141 ingest_parser.add_argument('--limit', type=int, help='Limit number of documents') 139 ingest_parser.add_argument('--limit', type=int, help='Limit number of documents')
142 ingest_parser.add_argument('--batch-size', type=int, default=100, help='Batch size') 140 ingest_parser.add_argument('--batch-size', type=int, default=100, help='Batch size')
@@ -144,8 +142,7 @@ def main(): @@ -144,8 +142,7 @@ def main():
144 ingest_parser.add_argument('--skip-embeddings', action='store_true', help='Skip embeddings') 142 ingest_parser.add_argument('--skip-embeddings', action='store_true', help='Skip embeddings')
145 143
146 # Serve command 144 # Serve command
147 - serve_parser = subparsers.add_parser('serve', help='Start API service')  
148 - serve_parser.add_argument('--customer', default='customer1', help='Customer ID') 145 + serve_parser = subparsers.add_parser('serve', help='Start API service (multi-tenant)')
149 serve_parser.add_argument('--host', default='0.0.0.0', help='Host to bind to') 146 serve_parser.add_argument('--host', default='0.0.0.0', help='Host to bind to')
150 serve_parser.add_argument('--port', type=int, default=6002, help='Port to bind to') 147 serve_parser.add_argument('--port', type=int, default=6002, help='Port to bind to')
151 serve_parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host') 148 serve_parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host')
@@ -154,7 +151,7 @@ def main(): @@ -154,7 +151,7 @@ def main():
154 # Search command 151 # Search command
155 search_parser = subparsers.add_parser('search', help='Test search from command line') 152 search_parser = subparsers.add_parser('search', help='Test search from command line')
156 search_parser.add_argument('query', help='Search query') 153 search_parser.add_argument('query', help='Search query')
157 - search_parser.add_argument('--customer', default='customer1', help='Customer ID') 154 + search_parser.add_argument('--tenant-id', required=True, help='Tenant ID (required)')
158 search_parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host') 155 search_parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host')
159 search_parser.add_argument('--size', type=int, default=10, help='Number of results') 156 search_parser.add_argument('--size', type=int, default=10, help='Number of results')
160 search_parser.add_argument('--no-translation', action='store_true', help='Disable translation') 157 search_parser.add_argument('--no-translation', action='store_true', help='Disable translation')
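A minimal, self-contained sketch of the `search` subcommand after this change, showing that `--tenant-id` is now mandatory. Only the arguments visible in the diff are reproduced; the `build_parser` wrapper is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with only the `search` subcommand (illustrative subset)."""
    parser = argparse.ArgumentParser(description="search CLI sketch")
    subparsers = parser.add_subparsers(dest="command", required=True)

    search = subparsers.add_parser("search", help="Test search from command line")
    search.add_argument("query", help="Search query")
    # The tenant is supplied per invocation instead of being fixed at service level.
    search.add_argument("--tenant-id", required=True, help="Tenant ID (required)")
    search.add_argument("--es-host", default="http://localhost:9200", help="Elasticsearch host")
    search.add_argument("--size", type=int, default=10, help="Number of results")
    return parser


args = build_parser().parse_args(["search", "玩具", "--tenant-id", "1"])
```

Omitting `--tenant-id` makes `argparse` exit with a usage error before any search code runs.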
@@ -34,8 +34,8 @@ sleep 3 @@ -34,8 +34,8 @@ sleep 3
34 34
35 # Step 2: Start all services 35 # Step 2: Start all services
36 echo -e "\n${YELLOW}Step 2/2: Restarting services${NC}" 36 echo -e "\n${YELLOW}Step 2/2: Restarting services${NC}"
37 -if [ -f "./run.sh" ]; then  
38 - ./run.sh 37 +if [ -f "./scripts/start.sh" ]; then
  38 + ./scripts/start.sh
39 if [ $? -eq 0 ]; then 39 if [ $? -eq 0 ]; then
40 echo -e "${GREEN}========================================${NC}" 40 echo -e "${GREEN}========================================${NC}"
41 echo -e "${GREEN}Service restart complete!${NC}" 41 echo -e "${GREEN}Service restart complete!${NC}"
@@ -17,95 +17,5 @@ echo -e "${GREEN}========================================${NC}" @@ -17,95 +17,5 @@ echo -e "${GREEN}========================================${NC}"
17 # Create logs directory if it doesn't exist 17 # Create logs directory if it doesn't exist
18 mkdir -p logs 18 mkdir -p logs
19 19
20 -# Step 1: Start backend in background  
21 -echo -e "\n${YELLOW}Step 1/2: Starting backend service${NC}"  
22 -echo -e "${YELLOW}The backend service will run in the background...${NC}"  
23 -  
24 -nohup ./scripts/start_backend.sh > logs/backend.log 2>&1 &  
25 -BACKEND_PID=$!  
26 -echo $BACKEND_PID > logs/backend.pid  
27 -echo -e "${GREEN}Backend service started (PID: $BACKEND_PID)${NC}"  
28 -echo -e "${GREEN}Log file: logs/backend.log${NC}"  
29 -  
30 -# Wait for backend to start  
31 -echo -e "${YELLOW}Waiting for the backend service to start...${NC}"  
32 -MAX_RETRIES=30  
33 -RETRY_COUNT=0  
34 -BACKEND_READY=false  
35 -  
36 -while [ $RETRY_COUNT -lt $MAX_RETRIES ]; do  
37 - sleep 2  
38 - if curl -s http://localhost:6002/ > /dev/null 2>&1; then  
39 - BACKEND_READY=true  
40 - break  
41 - fi  
42 - RETRY_COUNT=$((RETRY_COUNT + 1))  
43 - echo -e "${YELLOW} Waiting... ($RETRY_COUNT/$MAX_RETRIES)${NC}"  
44 -done  
45 -  
46 -# Check if backend is running  
47 -if [ "$BACKEND_READY" = true ]; then  
48 - echo -e "${GREEN}✓ Backend service is running${NC}"  
49 - # Try health check  
50 - if curl -s http://localhost:6002/admin/health > /dev/null 2>&1; then  
51 - echo -e "${GREEN}✓ Health check passed${NC}"  
52 - else  
53 - echo -e "${YELLOW}⚠ Health check failed, but the service has started${NC}"  
54 - fi  
55 -else  
56 - echo -e "${RED}✗ Backend service failed to start; check the log: logs/backend.log${NC}"  
57 - echo -e "${YELLOW}Hint: the backend service may need more time to start, or the port may already be in use${NC}"  
58 - exit 1  
59 -fi  
60 -  
61 -# Step 2: Start frontend in background  
62 -echo -e "\n${YELLOW}Step 2/2: Starting frontend service${NC}"  
63 -echo -e "${YELLOW}The frontend service will run in the background...${NC}"  
64 -  
65 -nohup ./scripts/start_frontend.sh > logs/frontend.log 2>&1 &  
66 -FRONTEND_PID=$!  
67 -echo $FRONTEND_PID > logs/frontend.pid  
68 -echo -e "${GREEN}Frontend service started (PID: $FRONTEND_PID)${NC}"  
69 -echo -e "${GREEN}Log file: logs/frontend.log${NC}"  
70 -  
71 -# Wait for frontend to start  
72 -echo -e "${YELLOW}Waiting for the frontend service to start...${NC}"  
73 -MAX_RETRIES=15  
74 -RETRY_COUNT=0  
75 -FRONTEND_READY=false  
76 -  
77 -while [ $RETRY_COUNT -lt $MAX_RETRIES ]; do  
78 - sleep 2  
79 - if curl -s http://localhost:6003/ > /dev/null 2>&1; then  
80 - FRONTEND_READY=true  
81 - break  
82 - fi  
83 - RETRY_COUNT=$((RETRY_COUNT + 1))  
84 - echo -e "${YELLOW} Waiting... ($RETRY_COUNT/$MAX_RETRIES)${NC}"  
85 -done  
86 -  
87 -# Check if frontend is running  
88 -if [ "$FRONTEND_READY" = true ]; then  
89 - echo -e "${GREEN}✓ Frontend service is running${NC}"  
90 -else  
91 - echo -e "${YELLOW}⚠ The frontend service may still be starting; try again shortly${NC}"  
92 -fi  
93 -  
94 -echo -e "${GREEN}========================================${NC}"  
95 -echo -e "${GREEN}All services started!${NC}"  
96 -echo -e "${GREEN}========================================${NC}"  
97 -echo ""  
98 -echo -e "Access URLs:"  
99 -echo -e " ${GREEN}Frontend UI: http://localhost:6003${NC}"  
100 -echo -e " ${GREEN}Backend API: http://localhost:6002${NC}"  
101 -echo -e " ${GREEN}API docs: http://localhost:6002/docs${NC}"  
102 -echo ""  
103 -echo -e "Log files:"  
104 -echo -e " Backend: logs/backend.log"  
105 -echo -e " Frontend: logs/frontend.log"  
106 -echo ""  
107 -echo -e "Stop services:"  
108 -echo -e " All services: ./stop.sh"  
109 -echo -e " Backend only: kill \$(cat logs/backend.pid)"  
110 -echo -e " Frontend only: kill \$(cat logs/frontend.pid)"  
111 -echo ""  
112 \ No newline at end of file 20 \ No newline at end of file
  21 +# Call unified start script
  22 +./scripts/start.sh
113 \ No newline at end of file 23 \ No newline at end of file
scripts/demo_base.sh
@@ -178,7 +178,7 @@ echo -e "${GREEN}Demo environment started!${NC}" @@ -178,7 +178,7 @@ echo -e "${GREEN}Demo environment started!${NC}"
178 echo -e "${GREEN}========================================${NC}" 178 echo -e "${GREEN}========================================${NC}"
179 echo "" 179 echo ""
180 echo -e "Access URLs:" 180 echo -e "Access URLs:"
181 -echo -e " ${GREEN}Frontend UI: http://localhost:$FRONTEND_PORT/base${NC}" 181 +echo -e " ${GREEN}Frontend UI: http://localhost:$FRONTEND_PORT/base${NC} (or http://localhost:$FRONTEND_PORT/base.html)"
182 echo -e " ${GREEN}Backend API: http://localhost:$API_PORT${NC}" 182 echo -e " ${GREEN}Backend API: http://localhost:$API_PORT${NC}"
183 echo -e " ${GREEN}API docs: http://localhost:$API_PORT/docs${NC}" 183 echo -e " ${GREEN}API docs: http://localhost:$API_PORT/docs${NC}"
184 echo "" 184 echo ""
scripts/frontend_server.py
@@ -47,13 +47,18 @@ class MyHTTPRequestHandler(http.server.SimpleHTTPRequestHandler, RateLimitingMix @@ -47,13 +47,18 @@ class MyHTTPRequestHandler(http.server.SimpleHTTPRequestHandler, RateLimitingMix
47 47
48 def do_GET(self): 48 def do_GET(self):
49 """Handle GET requests with support for base.html.""" 49 """Handle GET requests with support for base.html."""
50 - # Route /base to base.html  
51 - if self.path == '/base' or self.path == '/base/':  
52 - self.path = '/base.html' 50 + # Parse path (handle query strings)
  51 + path = self.path.split('?')[0] # Remove query string if present
  52 +
  53 + # Route /base to base.html (handle both with and without trailing slash)
  54 + if path == '/base' or path == '/base/':
  55 + self.path = '/base.html' + ('?' + self.path.split('?', 1)[1] if '?' in self.path else '')
53 # Route / to index.html (default) 56 # Route / to index.html (default)
54 - elif self.path == '/':  
55 - self.path = '/index.html'  
56 - return super().do_GET() 57 + elif path == '/' or path == '':
  58 + self.path = '/index.html' + ('?' + self.path.split('?', 1)[1] if '?' in self.path else '')
  59 +
  60 + # Call parent do_GET with modified path
  61 + super().do_GET()
57 62
58 def setup(self): 63 def setup(self):
59 """Setup with error handling.""" 64 """Setup with error handling."""
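The routing in this hunk matches on the path without its query string and then rewrites `self.path`. A sketch of the same idea as a pure function, using `urllib.parse.urlsplit` so the query string is carried over with its `?` separator. The `ROUTES` table and `rewrite_path` name are hypothetical, not part of `frontend_server.py`.

```python
from urllib.parse import urlsplit

# Request paths that map to a static HTML file (illustrative table).
ROUTES = {
    "/base": "/base.html",
    "/base/": "/base.html",
    "/": "/index.html",
    "": "/index.html",
}


def rewrite_path(raw_path: str) -> str:
    """Map a routed request path to its HTML file, preserving any query string."""
    parts = urlsplit(raw_path)
    if parts.path not in ROUTES:
        return raw_path  # static assets and other files pass through untouched
    target = ROUTES[parts.path]
    return target + ("?" + parts.query if parts.query else "")


rewritten = rewrite_path("/base?debug=1")
```

Keeping the rewrite in a function with no handler state makes the query-string handling easy to test without starting a server.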
@@ -125,6 +130,18 @@ class ThreadedTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer): @@ -125,6 +130,18 @@ class ThreadedTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
125 daemon_threads = True 130 daemon_threads = True
126 131
127 if __name__ == '__main__': 132 if __name__ == '__main__':
  133 + # Check if port is already in use
  134 + import socket
  135 + sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  136 + try:
  137 + sock.bind(("", PORT))
  138 + sock.close()
  139 + except OSError:
  140 + print(f"ERROR: Port {PORT} is already in use.")
  141 + print("Please stop the existing server or use a different port.")
  142 + print(f"To stop existing server: kill $(lsof -t -i:{PORT})")
  143 + sys.exit(1)
  144 +
128 # Create threaded server for better concurrency 145 # Create threaded server for better concurrency
129 with ThreadedTCPServer(("", PORT), MyHTTPRequestHandler) as httpd: 146 with ThreadedTCPServer(("", PORT), MyHTTPRequestHandler) as httpd:
130 print(f"Frontend server started at http://localhost:{PORT}") 147 print(f"Frontend server started at http://localhost:{PORT}")
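The port pre-check added above can also be written as a small helper that always releases the probe socket, including on failure. This is a sketch; `port_is_free` is a hypothetical name, and the check is inherently racy because another process could take the port between the probe and the real bind.

```python
import socket


def port_is_free(port: int, host: str = "") -> bool:
    """Return True if a TCP socket can currently bind to (host, port)."""
    # The context manager closes the probe socket on both success and failure.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


# Hold a port open to demonstrate the negative case.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
holder.listen(1)
busy_port = holder.getsockname()[1]
busy_result = port_is_free(busy_port, "127.0.0.1")
holder.close()
```

The server would still need to handle `OSError` from its own bind, since the probe only reduces the chance of a confusing startup failure.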
1 #!/bin/bash 1 #!/bin/bash
2 2
3 -# Data Ingestion Script for Customer1  
4 -  
5 -set -e 3 +# Unified data ingestion script for SearchEngine
  4 +# Ingests data from MySQL to Elasticsearch
6 5
7 cd "$(dirname "$0")/.." 6 cd "$(dirname "$0")/.."
8 source /home/tw/miniconda3/etc/profile.d/conda.sh 7 source /home/tw/miniconda3/etc/profile.d/conda.sh
@@ -10,41 +9,75 @@ conda activate searchengine @@ -10,41 +9,75 @@ conda activate searchengine
10 9
11 GREEN='\033[0;32m' 10 GREEN='\033[0;32m'
12 YELLOW='\033[1;33m' 11 YELLOW='\033[1;33m'
  12 +RED='\033[0;31m'
13 NC='\033[0m' 13 NC='\033[0m'
14 14
15 echo -e "${GREEN}========================================${NC}" 15 echo -e "${GREEN}========================================${NC}"
16 -echo -e "${GREEN}Customer1 Data Ingestion${NC}" 16 +echo -e "${GREEN}Data Ingestion Script${NC}"
17 echo -e "${GREEN}========================================${NC}" 17 echo -e "${GREEN}========================================${NC}"
18 18
19 -# Default values  
20 -LIMIT=${1:-1000}  
21 -SKIP_EMBEDDINGS=${2:-false} 19 +# Load config from .env file if it exists
  20 +if [ -f .env ]; then
  21 + set -a
  22 + source .env
  23 + set +a
  24 +fi
  25 +
  26 +# Parameters
  27 +TENANT_ID=${1:-"1"}
  28 +DB_HOST=${DB_HOST:-"120.79.247.228"}
  29 +DB_PORT=${DB_PORT:-"3316"}
  30 +DB_DATABASE=${DB_DATABASE:-"saas"}
  31 +DB_USERNAME=${DB_USERNAME:-"saas"}
  32 +DB_PASSWORD=${DB_PASSWORD:-""}
  33 +ES_HOST=${ES_HOST:-"http://localhost:9200"}
  34 +BATCH_SIZE=${BATCH_SIZE:-500}
  35 +RECREATE=${RECREATE:-false}
22 36
23 echo -e "\n${YELLOW}Configuration:${NC}" 37 echo -e "\n${YELLOW}Configuration:${NC}"
24 -echo " Limit: $LIMIT documents"  
25 -echo " Skip embeddings: $SKIP_EMBEDDINGS" 38 +echo " Tenant ID: $TENANT_ID"
  39 +echo " MySQL: $DB_HOST:$DB_PORT/$DB_DATABASE"
  40 +echo " Elasticsearch: $ES_HOST"
  41 +echo " Batch Size: $BATCH_SIZE"
  42 +echo " Recreate Index: $RECREATE"
26 43
27 -CSV_FILE="data/customer1/goods_with_pic.5years_congku.csv.shuf.1w" 44 +# Validate parameters
  45 +if [ -z "$TENANT_ID" ]; then
  46 + echo -e "${RED}ERROR: Tenant ID is required${NC}"
  47 + echo "Usage: [BATCH_SIZE=<n>] [RECREATE=true] $0 <tenant_id>"
  48 + exit 1
  49 +fi
28 50
29 -if [ ! -f "$CSV_FILE" ]; then  
30 - echo "Error: CSV file not found: $CSV_FILE" 51 +if [ -z "$DB_PASSWORD" ]; then
  52 + echo -e "${RED}ERROR: DB_PASSWORD is not set; check the .env file or environment variables${NC}"
31 exit 1 53 exit 1
32 fi 54 fi
33 55
34 # Build command 56 # Build command
35 -CMD="python data/customer1/ingest_customer1.py \  
36 - --csv $CSV_FILE \  
37 - --limit $LIMIT \  
38 - --recreate-index \  
39 - --batch-size 100"  
40 -  
41 -if [ "$SKIP_EMBEDDINGS" = "true" ]; then  
42 - CMD="$CMD --skip-embeddings" 57 +CMD="python scripts/ingest_shoplazza.py \
  58 + --db-host $DB_HOST \
  59 + --db-port $DB_PORT \
  60 + --db-database $DB_DATABASE \
  61 + --db-username $DB_USERNAME \
  62 + --db-password $DB_PASSWORD \
  63 + --tenant-id $TENANT_ID \
  64 + --es-host $ES_HOST \
  65 + --batch-size $BATCH_SIZE"
  66 +
  67 +if [ "$RECREATE" = "true" ] || [ "$RECREATE" = "1" ]; then
  68 + CMD="$CMD --recreate"
43 fi 69 fi
44 70
45 -echo -e "\n${YELLOW}Starting ingestion...${NC}" 71 +echo -e "\n${YELLOW}Starting data ingestion...${NC}"
46 eval $CMD 72 eval $CMD
47 73
48 -echo -e "\n${GREEN}========================================${NC}"  
49 -echo -e "${GREEN}Ingestion Complete!${NC}"  
50 -echo -e "${GREEN}========================================${NC}" 74 +if [ $? -eq 0 ]; then
  75 + echo -e "\n${GREEN}========================================${NC}"
  76 + echo -e "${GREEN}Data ingestion complete!${NC}"
  77 + echo -e "${GREEN}========================================${NC}"
  78 +else
  79 + echo -e "\n${RED}========================================${NC}"
  80 + echo -e "${RED}Data ingestion failed!${NC}"
  81 + echo -e "${RED}========================================${NC}"
  82 + exit 1
  83 +fi
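The script above assembles the command as one string and runs it through `eval`, so a password or host containing spaces or shell metacharacters would be re-parsed by the shell. The following is a minimal sketch, not the project's code, of the same invocation built as an argument list, which avoids that re-parsing. `build_ingest_cmd` and the `settings` dict are hypothetical names introduced for illustration; the flags mirror the ones passed to `scripts/ingest_shoplazza.py` above.

```python
from typing import Dict, List


def build_ingest_cmd(settings: Dict[str, str], tenant_id: str,
                     batch_size: int = 500, recreate: bool = False) -> List[str]:
    """Build the ingest command as an argv list (hypothetical helper)."""
    # Same required-parameter checks as the shell script.
    if not tenant_id:
        raise ValueError("Tenant ID is required")
    if not settings.get("DB_PASSWORD"):
        raise ValueError("DB_PASSWORD is not set")

    # Each value is its own list element, so no shell quoting is involved.
    cmd = [
        "python", "scripts/ingest_shoplazza.py",
        "--db-host", settings["DB_HOST"],
        "--db-port", settings["DB_PORT"],
        "--db-database", settings["DB_DATABASE"],
        "--db-username", settings["DB_USERNAME"],
        "--db-password", settings["DB_PASSWORD"],
        "--tenant-id", tenant_id,
        "--es-host", settings["ES_HOST"],
        "--batch-size", str(batch_size),
    ]
    if recreate:
        cmd.append("--recreate")
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`, which raises on a non-zero exit status and so covers the role of the `$?` check in the script.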
scripts/ingest_shoplazza.py
@@ -33,7 +33,6 @@ def main(): @@ -33,7 +33,6 @@ def main():
33 33
34 # Tenant and index 34 # Tenant and index
35 parser.add_argument('--tenant-id', required=True, help='Tenant ID (required)') 35 parser.add_argument('--tenant-id', required=True, help='Tenant ID (required)')
36 - parser.add_argument('--config', default='base', help='Configuration ID (default: base)')  
37 parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host') 36 parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host')
38 37
39 # Options 38 # Options
@@ -44,11 +43,11 @@ def main(): @@ -44,11 +43,11 @@ def main():
44 43
45 print(f"Starting Shoplazza data ingestion for tenant: {args.tenant_id}") 44 print(f"Starting Shoplazza data ingestion for tenant: {args.tenant_id}")
46 45
47 - # Load configuration  
48 - config_loader = ConfigLoader("config/schema") 46 + # Load unified configuration
  47 + config_loader = ConfigLoader("config/config.yaml")
49 try: 48 try:
50 - config = config_loader.load_customer_config(args.config)  
51 - print(f"Loaded configuration: {config.customer_name}") 49 + config = config_loader.load_config()
  50 + print(f"Loaded configuration: {config.es_index_name}")
52 except Exception as e: 51 except Exception as e:
53 print(f"ERROR: Failed to load configuration: {e}") 52 print(f"ERROR: Failed to load configuration: {e}")
54 return 1 53 return 1
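With a single `config/config.yaml` shared by all tenants, the plan calls for the `tenant_id` field to be present and required. Below is a minimal sketch of such a check on an already-parsed config dict. It is not the real `ConfigLoader.validate_config()`; `validate_unified_config` is a hypothetical name, and the dict shape (`es_index_name`, `fields` entries with `name` and `required`) is assumed from the test fixture in this commit.

```python
from typing import Any, Dict, List


def validate_unified_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of validation errors; an empty list means the config passed."""
    errors: List[str] = []

    if not config.get("es_index_name"):
        errors.append("es_index_name is required")

    # Look for the tenant_id entry among the declared fields.
    fields = config.get("fields") or []
    tenant_field = next((f for f in fields if f.get("name") == "tenant_id"), None)
    if tenant_field is None:
        errors.append("fields must include tenant_id")
    elif not tenant_field.get("required", False):
        errors.append("tenant_id must be marked required")

    return errors
```

Returning the errors as a list, rather than raising on the first one, lets a caller report every problem in the config file at once.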
scripts/mock_data.sh 0 → 100755
@@ -0,0 +1,88 @@ @@ -0,0 +1,88 @@
  1 +#!/bin/bash
  2 +
  3 +# Mock data script for SearchEngine
  4 +# Generates test data and imports to MySQL
  5 +
  6 +cd "$(dirname "$0")/.."
  7 +source /home/tw/miniconda3/etc/profile.d/conda.sh
  8 +conda activate searchengine
  9 +
  10 +GREEN='\033[0;32m'
  11 +YELLOW='\033[1;33m'
  12 +RED='\033[0;31m'
  13 +NC='\033[0m'
  14 +
  15 +echo -e "${GREEN}========================================${NC}"
  16 +echo -e "${GREEN}Mock Data Script${NC}"
  17 +echo -e "${GREEN}========================================${NC}"
  18 +
  19 +# Load config from .env file if it exists
  20 +if [ -f .env ]; then
  21 + set -a
  22 + source .env
  23 + set +a
  24 +fi
  25 +
  26 +# Parameters
  27 +TENANT_ID=${1:-"1"}
  28 +NUM_SPUS=${2:-100}
  29 +DB_HOST=${DB_HOST:-"120.79.247.228"}
  30 +DB_PORT=${DB_PORT:-"3316"}
  31 +DB_DATABASE=${DB_DATABASE:-"saas"}
  32 +DB_USERNAME=${DB_USERNAME:-"saas"}
  33 +DB_PASSWORD=${DB_PASSWORD:-}
  34 +SQL_FILE="test_data.sql"
  35 +
  36 +echo -e "\n${YELLOW}Configuration:${NC}"
  37 +echo " Tenant ID: $TENANT_ID"
  38 +echo " Number of SPUs: $NUM_SPUS"
  39 +echo " MySQL: $DB_HOST:$DB_PORT/$DB_DATABASE"
  40 +echo " SQL File: $SQL_FILE"
  41 +
  42 +# Step 1: Generate test data
  43 +echo -e "\n${YELLOW}Step 1/2: Generating test data${NC}"
  44 +python scripts/generate_test_data.py \
  45 + --num-spus $NUM_SPUS \
  46 + --tenant-id "$TENANT_ID" \
  47 + --start-spu-id 1 \
  48 + --start-sku-id 1 \
  49 + --output "$SQL_FILE"
  50 +
  51 +if [ $? -ne 0 ]; then
  52 + echo -e "${RED}✗ Failed to generate test data${NC}"
  53 + exit 1
  54 +fi
  55 +
  56 +echo -e "${GREEN}✓ Test data generated: $SQL_FILE${NC}"
  57 +
  58 +# Step 2: Import test data to MySQL
  59 +echo -e "\n${YELLOW}Step 2/2: Importing test data into MySQL${NC}"
  60 +if [ -z "$DB_PASSWORD" ]; then
  61 + echo -e "${RED}ERROR: DB_PASSWORD is not set; check the .env file or environment variables${NC}"
  62 + exit 1
  63 +fi
  64 +
  65 +python scripts/import_test_data.py \
  66 + --db-host "$DB_HOST" \
  67 + --db-port "$DB_PORT" \
  68 + --db-database "$DB_DATABASE" \
  69 + --db-username "$DB_USERNAME" \
  70 + --db-password "$DB_PASSWORD" \
  71 + --sql-file "$SQL_FILE" \
  72 + --tenant-id "$TENANT_ID"
  73 +
  74 +if [ $? -ne 0 ]; then
  75 + echo -e "${RED}✗ Failed to import test data${NC}"
  76 + exit 1
  77 +fi
  78 +
  79 +echo -e "${GREEN}✓ Test data imported into MySQL${NC}"
  80 +
  81 +echo -e "\n${GREEN}========================================${NC}"
  82 +echo -e "${GREEN}Mock data complete!${NC}"
  83 +echo -e "${GREEN}========================================${NC}"
  84 +echo ""
  85 +echo -e "Next step:"
  86 +echo -e " ${YELLOW}./scripts/ingest.sh $TENANT_ID${NC} - ingest data from MySQL into ES"
  87 +echo ""
  88 +
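The parameter block in this script resolves connection settings from the environment with `${VAR:-default}` fallbacks and later checks that `DB_PASSWORD` is non-empty. A minimal sketch of that resolution follows, in which host, port, database, and username fall back to defaults but the password must be supplied explicitly, so the "not set" check can actually fire. `resolve_db_settings` and the values in `DEFAULTS` are hypothetical placeholders, not the project's real settings.

```python
import os
from typing import Dict, Mapping, Optional

# Placeholder defaults for illustration only (assumed values).
DEFAULTS = {
    "DB_HOST": "localhost",
    "DB_PORT": "3306",
    "DB_DATABASE": "saas",
    "DB_USERNAME": "saas",
}


def resolve_db_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Resolve DB settings; unset or empty values use defaults, except the password."""
    env = os.environ if env is None else env

    # `env.get(k) or default` mirrors the shell's ${VAR:-default} semantics.
    settings = {key: env.get(key) or default for key, default in DEFAULTS.items()}

    # The password deliberately has no fallback.
    password = env.get("DB_PASSWORD", "")
    if not password:
        raise RuntimeError(
            "DB_PASSWORD is not set; check the .env file or environment variables"
        )
    settings["DB_PASSWORD"] = password
    return settings
```

Keeping the secret out of the defaults means a missing `.env` file fails loudly instead of silently connecting with a built-in credential.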
scripts/start.sh 0 → 100755
@@ -0,0 +1,106 @@ @@ -0,0 +1,106 @@
  1 +#!/bin/bash
  2 +
  3 +# Unified startup script for SearchEngine services
  4 +# This script starts both frontend and backend services
  5 +
  6 +cd "$(dirname "$0")/.."
  7 +
  8 +GREEN='\033[0;32m'
  9 +YELLOW='\033[1;33m'
  10 +RED='\033[0;31m'
  11 +NC='\033[0m'
  12 +
  13 +echo -e "${GREEN}========================================${NC}"
  14 +echo -e "${GREEN}SearchEngine service startup script${NC}"
  15 +echo -e "${GREEN}========================================${NC}"
  16 +
  17 +# Create logs directory if it doesn't exist
  18 +mkdir -p logs
  19 +
  20 +# Step 1: Start backend in background
  21 +echo -e "\n${YELLOW}Step 1/2: Starting backend service${NC}"
  22 +echo -e "${YELLOW}The backend service will run in the background...${NC}"
  23 +
  24 +nohup ./scripts/start_backend.sh > logs/backend.log 2>&1 &
  25 +BACKEND_PID=$!
  26 +echo $BACKEND_PID > logs/backend.pid
  27 +echo -e "${GREEN}Backend service started (PID: $BACKEND_PID)${NC}"
  28 +echo -e "${GREEN}Log file: logs/backend.log${NC}"
  29 +
  30 +# Wait for backend to start
  31 +echo -e "${YELLOW}Waiting for the backend service to start...${NC}"
  32 +MAX_RETRIES=30
  33 +RETRY_COUNT=0
  34 +BACKEND_READY=false
  35 +
  36 +while [ $RETRY_COUNT -lt $MAX_RETRIES ]; do
  37 + sleep 2
  38 + if curl -s http://localhost:6002/health > /dev/null 2>&1; then
  39 + BACKEND_READY=true
  40 + break
  41 + fi
  42 + RETRY_COUNT=$((RETRY_COUNT + 1))
  43 + echo -e "${YELLOW} Waiting... ($RETRY_COUNT/$MAX_RETRIES)${NC}"
  44 +done
  45 +
  46 +# Check if backend is running
  47 +if [ "$BACKEND_READY" = true ]; then
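  48 + echo -e "${GREEN}✓ Backend service is running${NC}"
  49 +else
  50 + echo -e "${RED}✗ Backend service failed to start; check the log: logs/backend.log${NC}"
  51 + echo -e "${YELLOW}Hint: the backend service may need more time to start, or the port may already be in use${NC}"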
  52 + exit 1
  53 +fi
  54 +
  55 +# Step 2: Start frontend in background
  56 +echo -e "\n${YELLOW}Step 2/2: Starting frontend service${NC}"
  57 +echo -e "${YELLOW}The frontend service will run in the background...${NC}"
  58 +
  59 +nohup ./scripts/start_frontend.sh > logs/frontend.log 2>&1 &
  60 +FRONTEND_PID=$!
  61 +echo $FRONTEND_PID > logs/frontend.pid
  62 +echo -e "${GREEN}Frontend service started (PID: $FRONTEND_PID)${NC}"
  63 +echo -e "${GREEN}Log file: logs/frontend.log${NC}"
  64 +
  65 +# Wait for frontend to start
  66 +echo -e "${YELLOW}Waiting for the frontend service to start...${NC}"
  67 +MAX_RETRIES=15
  68 +RETRY_COUNT=0
  69 +FRONTEND_READY=false
  70 +
  71 +while [ $RETRY_COUNT -lt $MAX_RETRIES ]; do
  72 + sleep 2
  73 + if curl -s http://localhost:6003/ > /dev/null 2>&1; then
  74 + FRONTEND_READY=true
  75 + break
  76 + fi
  77 + RETRY_COUNT=$((RETRY_COUNT + 1))
  78 + echo -e "${YELLOW} Waiting... ($RETRY_COUNT/$MAX_RETRIES)${NC}"
  79 +done
  80 +
  81 +# Check if frontend is running
  82 +if [ "$FRONTEND_READY" = true ]; then
  83 + echo -e "${GREEN}✓ Frontend service is running${NC}"
  84 +else
  85 + echo -e "${YELLOW}⚠ The frontend service may still be starting; try again shortly${NC}"
  86 +fi
  87 +
  88 +echo -e "${GREEN}========================================${NC}"
  89 +echo -e "${GREEN}All services started!${NC}"
  90 +echo -e "${GREEN}========================================${NC}"
  91 +echo ""
  92 +echo -e "Access URLs:"
  93 +echo -e " ${GREEN}Frontend UI: http://localhost:6003${NC}"
  94 +echo -e " ${GREEN}Backend API: http://localhost:6002${NC}"
  95 +echo -e " ${GREEN}API docs: http://localhost:6002/docs${NC}"
  96 +echo ""
  97 +echo -e "Log files:"
  98 +echo -e " Backend: logs/backend.log"
  99 +echo -e " Frontend: logs/frontend.log"
  100 +echo ""
  101 +echo -e "Stop services:"
  102 +echo -e " All services: ./scripts/stop.sh"
  103 +echo -e " Backend only: kill \$(cat logs/backend.pid)"
  104 +echo -e " Frontend only: kill \$(cat logs/frontend.pid)"
  105 +echo ""
  106 +
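Both wait loops in this script follow the same pattern: sleep, probe a URL, and give up after a fixed number of retries. A minimal sketch of that pattern as a reusable function is shown here. `wait_until_ready` and `http_ok` are hypothetical names; note that `http_ok` is stricter than the script's `curl -s`, which exits successfully for any HTTP response, including error statuses.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers without an HTTP error status."""
    try:
        # urlopen raises HTTPError (a URLError subclass) for 4xx/5xx responses.
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(check: Callable[[], bool], max_retries: int = 30,
                     delay: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Sleep, then probe; repeat up to max_retries times, as the script's loop does."""
    for _ in range(max_retries):
        sleep(delay)
        if check():
            return True
    return False
```

For the backend this would be called as `wait_until_ready(lambda: http_ok("http://localhost:6002/health"))`; the frontend check would pass `max_retries=15` and the port 6003 URL. The `sleep` parameter exists only so the loop can be exercised without real delays.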
scripts/start_backend.sh
@@ -24,16 +24,14 @@ if [ -f .env ]; then @@ -24,16 +24,14 @@ if [ -f .env ]; then
24 fi 24 fi
25 25
26 echo -e "\n${YELLOW}Configuration:${NC}" 26 echo -e "\n${YELLOW}Configuration:${NC}"
27 -echo " Customer: ${CUSTOMER_ID:-customer1}"  
28 echo " API Host: ${API_HOST:-0.0.0.0}" 27 echo " API Host: ${API_HOST:-0.0.0.0}"
29 echo " API Port: ${API_PORT:-6002}" 28 echo " API Port: ${API_PORT:-6002}"
30 echo " ES Host: ${ES_HOST:-http://localhost:9200}" 29 echo " ES Host: ${ES_HOST:-http://localhost:9200}"
31 echo " ES Username: ${ES_USERNAME:-not set}" 30 echo " ES Username: ${ES_USERNAME:-not set}"
32 31
33 -echo -e "\n${YELLOW}Starting service...${NC}" 32 +echo -e "\n${YELLOW}Starting service (multi-tenant)...${NC}"
34 33
35 # Export environment variables for the Python process 34 # Export environment variables for the Python process
36 -export CUSTOMER_ID=${CUSTOMER_ID:-customer1}  
37 export API_HOST=${API_HOST:-0.0.0.0} 35 export API_HOST=${API_HOST:-0.0.0.0}
38 export API_PORT=${API_PORT:-6002} 36 export API_PORT=${API_PORT:-6002}
39 export ES_HOST=${ES_HOST:-http://localhost:9200} 37 export ES_HOST=${ES_HOST:-http://localhost:9200}
@@ -43,6 +41,5 @@ export ES_PASSWORD=${ES_PASSWORD:-} @@ -43,6 +41,5 @@ export ES_PASSWORD=${ES_PASSWORD:-}
43 python -m api.app \ 41 python -m api.app \
44 --host $API_HOST \ 42 --host $API_HOST \
45 --port $API_PORT \ 43 --port $API_PORT \
46 - --customer $CUSTOMER_ID \  
47 --es-host $ES_HOST 44 --es-host $ES_HOST
48 45
scripts/start_servers.py
@@ -9,6 +9,7 @@ import signal @@ -9,6 +9,7 @@ import signal
9 import time 9 import time
10 import subprocess 10 import subprocess
11 import logging 11 import logging
  12 +import argparse
12 from typing import Dict, List, Optional 13 from typing import Dict, List, Optional
13 import multiprocessing 14 import multiprocessing
14 import threading 15 import threading
@@ -65,12 +66,11 @@ class ServerManager: @@ -65,12 +66,11 @@ class ServerManager:
65 logger.error(f"Failed to start frontend server: {e}") 66 logger.error(f"Failed to start frontend server: {e}")
66 return False 67 return False
67 68
68 - def start_api_server(self, customer: str = "customer1", es_host: str = "http://localhost:9200") -> bool: 69 + def start_api_server(self, es_host: str = "http://localhost:9200") -> bool:
69 """Start the API server.""" 70 """Start the API server."""
70 try: 71 try:
71 cmd = [ 72 cmd = [
72 sys.executable, 'main.py', 'serve', 73 sys.executable, 'main.py', 'serve',
73 - '--customer', customer,  
74 '--es-host', es_host, 74 '--es-host', es_host,
75 '--host', '0.0.0.0', 75 '--host', '0.0.0.0',
76 '--port', '6002' 76 '--port', '6002'
@@ -78,7 +78,6 @@ class ServerManager: @@ -78,7 +78,6 @@ class ServerManager:
78 78
79 env = os.environ.copy() 79 env = os.environ.copy()
80 env['PYTHONUNBUFFERED'] = '1' 80 env['PYTHONUNBUFFERED'] = '1'
81 - env['CUSTOMER_ID'] = customer  
82 env['ES_HOST'] = es_host 81 env['ES_HOST'] = es_host
83 82
84 process = subprocess.Popen( 83 process = subprocess.Popen(
@@ -179,14 +178,12 @@ def main(): @@ -179,14 +178,12 @@ def main():
179 """Main function to start all servers.""" 178 """Main function to start all servers."""
180 global manager 179 global manager
181 180
182 - parser = argparse.ArgumentParser(description='Start SearchEngine servers')  
183 - parser.add_argument('--customer', default='customer1', help='Customer ID') 181 + parser = argparse.ArgumentParser(description='Start SearchEngine servers (multi-tenant)')
184 parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host') 182 parser.add_argument('--es-host', default='http://localhost:9200', help='Elasticsearch host')
185 parser.add_argument('--check-dependencies', action='store_true', help='Check dependencies before starting') 183 parser.add_argument('--check-dependencies', action='store_true', help='Check dependencies before starting')
186 args = parser.parse_args() 184 args = parser.parse_args()
187 185
188 - logger.info("Starting SearchEngine servers...")  
189 - logger.info(f"Customer: {args.customer}") 186 + logger.info("Starting SearchEngine servers (multi-tenant)...")
190 logger.info(f"Elasticsearch: {args.es_host}") 187 logger.info(f"Elasticsearch: {args.es_host}")
191 188
192 # Check dependencies if requested 189 # Check dependencies if requested
@@ -209,7 +206,7 @@ def main(): @@ -209,7 +206,7 @@ def main():
209 206
210 try: 207 try:
211 # Start servers 208 # Start servers
212 - if not manager.start_api_server(args.customer, args.es_host): 209 + if not manager.start_api_server(args.es_host):
213 logger.error("Failed to start API server") 210 logger.error("Failed to start API server")
214 sys.exit(1) 211 sys.exit(1)
215 212
@@ -43,8 +43,8 @@ try: @@ -43,8 +43,8 @@ try:
43 es_config = get_es_config() 43 es_config = get_es_config()
44 es_client = ESClient(hosts=[es_config['host']], username=es_config.get('username'), password=es_config.get('password')) 44 es_client = ESClient(hosts=[es_config['host']], username=es_config.get('username'), password=es_config.get('password'))
45 45
46 - config_loader = ConfigLoader('config/schema')  
47 - config = config_loader.load_customer_config('customer1') 46 + config_loader = ConfigLoader('config/config.yaml')
  47 + config = config_loader.load_config()
48 48
49 if es_client.index_exists(config.es_index_name): 49 if es_client.index_exists(config.es_index_name):
50 doc_count = es_client.count(config.es_index_name) 50 doc_count = es_client.count(config.es_index_name)
@@ -15,7 +15,8 @@ from unittest.mock import Mock, MagicMock @@ -15,7 +15,8 @@ from unittest.mock import Mock, MagicMock
15 project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) 15 project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
16 sys.path.insert(0, project_root) 16 sys.path.insert(0, project_root)
17 17
18 -from config import CustomerConfig, QueryConfig, IndexConfig, FieldConfig, SPUConfig, RankingConfig 18 +from config import CustomerConfig, QueryConfig, IndexConfig, FieldConfig, SPUConfig, RankingConfig, FunctionScoreConfig, RerankConfig
  19 +from config.field_types import FieldType, AnalyzerType
19 from utils.es_client import ESClient 20 from utils.es_client import ESClient
20 from search import Searcher 21 from search import Searcher
21 from query import QueryParser 22 from query import QueryParser
@@ -39,7 +40,9 @@ def sample_index_config() -> IndexConfig: @@ -39,7 +40,9 @@ def sample_index_config() -> IndexConfig:
39 """Sample index config""" 40 """Sample index config"""
40 return IndexConfig( 41 return IndexConfig(
41 name="default", 42 name="default",
42 - match_fields=["name", "brand_name", "tags"], 43 + label="Default index",
  44 + fields=["name", "brand_name", "tags"],
  45 + analyzer=AnalyzerType.CHINESE_ECOMMERCE,
43 language_field_mapping={ 46 language_field_mapping={
44 "zh": ["name", "brand_name"], 47 "zh": ["name", "brand_name"],
45 "en": ["name_en", "brand_name_en"] 48 "en": ["name_en", "brand_name_en"]
@@ -64,23 +67,29 @@ def sample_customer_config(sample_index_config) -> CustomerConfig: @@ -64,23 +67,29 @@ def sample_customer_config(sample_index_config) -> CustomerConfig:
64 ) 67 )
65 68
66 ranking_config = RankingConfig( 69 ranking_config = RankingConfig(
67 - expression="static_bm25() + text_embedding_relevance() * 0.2" 70 + expression="static_bm25() + text_embedding_relevance() * 0.2",
  71 + description="Test ranking"
68 ) 72 )
69 73
  74 + function_score_config = FunctionScoreConfig()
  75 + rerank_config = RerankConfig()
  76 +
70 return CustomerConfig( 77 return CustomerConfig(
71 - customer_id="test_customer",  
72 es_index_name="test_products", 78 es_index_name="test_products",
73 - query=query_config, 79 + fields=[
  80 + FieldConfig(name="tenant_id", field_type=FieldType.KEYWORD, required=True),
  81 + FieldConfig(name="name", field_type=FieldType.TEXT, analyzer=AnalyzerType.CHINESE_ECOMMERCE),
  82 + FieldConfig(name="brand_name", field_type=FieldType.TEXT, analyzer=AnalyzerType.CHINESE_ECOMMERCE),
  83 + FieldConfig(name="tags", field_type=FieldType.TEXT, analyzer=AnalyzerType.CHINESE_ECOMMERCE),
  84 + FieldConfig(name="price", field_type=FieldType.DOUBLE),
  85 + FieldConfig(name="category_id", field_type=FieldType.INT),
  86 + ],
74 indexes=[sample_index_config], 87 indexes=[sample_index_config],
75 - spu=spu_config, 88 + query_config=query_config,
76 ranking=ranking_config, 89 ranking=ranking_config,
77 - fields=[  
78 - FieldConfig(name="name", type="TEXT", analyzer="ansj"),  
79 - FieldConfig(name="brand_name", type="TEXT", analyzer="ansj"),  
80 - FieldConfig(name="tags", type="TEXT", analyzer="ansj"),  
81 - FieldConfig(name="price", type="DOUBLE"),  
82 - FieldConfig(name="category_id", type="INT"),  
83 - ] 90 + function_score=function_score_config,
  91 + rerank=rerank_config,
  92 + spu_config=spu_config
84 ) 93 )
85 94
86 95
@@ -165,31 +174,48 @@ def temp_config_file() -> Generator[str, None, None]: @@ -165,31 +174,48 @@ def temp_config_file() -> Generator[str, None, None]:
165 import yaml 174 import yaml
166 175
167 config_data = { 176 config_data = {
168 - "customer_id": "test_customer",  
169 "es_index_name": "test_products", 177 "es_index_name": "test_products",
170 - "query": { 178 + "query_config": {
171 "enable_query_rewrite": True, 179 "enable_query_rewrite": True,
172 "enable_translation": True, 180 "enable_translation": True,
173 "enable_text_embedding": True, 181 "enable_text_embedding": True,
174 "supported_languages": ["zh", "en"] 182 "supported_languages": ["zh", "en"]
175 }, 183 },
  184 + "fields": [
  185 + {"name": "tenant_id", "type": "KEYWORD", "required": True},
  186 + {"name": "name", "type": "TEXT", "analyzer": "ansj"},
  187 + {"name": "brand_name", "type": "TEXT", "analyzer": "ansj"}
  188 + ],
176 "indexes": [ 189 "indexes": [
177 { 190 {
178 "name": "default", 191 "name": "default",
179 - "match_fields": ["name", "brand_name"], 192 + "label": "Default index",
  193 + "fields": ["name", "brand_name"],
  194 + "analyzer": "ansj",
180 "language_field_mapping": { 195 "language_field_mapping": {
181 "zh": ["name", "brand_name"], 196 "zh": ["name", "brand_name"],
182 "en": ["name_en", "brand_name_en"] 197 "en": ["name_en", "brand_name_en"]
183 } 198 }
184 } 199 }
185 ], 200 ],
186 - "spu": { 201 + "spu_config": {
187 "enabled": True, 202 "enabled": True,
188 "spu_field": "spu_id", 203 "spu_field": "spu_id",
189 "inner_hits_size": 3 204 "inner_hits_size": 3
190 }, 205 },
191 "ranking": { 206 "ranking": {
192 - "expression": "static_bm25() + text_embedding_relevance() * 0.2" 207 + "expression": "static_bm25() + text_embedding_relevance() * 0.2",
  208 + "description": "Test ranking"
  209 + },
  210 + "function_score": {
  211 + "score_mode": "sum",
  212 + "boost_mode": "multiply",
  213 + "functions": []
  214 + },
  215 + "rerank": {
  216 + "enabled": False,
  217 + "expression": "",
  218 + "description": ""
193 } 219 }
194 } 220 }
195 221
@@ -209,7 +235,6 @@ def mock_env_variables(monkeypatch): @@ -209,7 +235,6 @@ def mock_env_variables(monkeypatch):
209 monkeypatch.setenv("ES_HOST", "http://localhost:9200") 235 monkeypatch.setenv("ES_HOST", "http://localhost:9200")
210 monkeypatch.setenv("ES_USERNAME", "elastic") 236 monkeypatch.setenv("ES_USERNAME", "elastic")
211 monkeypatch.setenv("ES_PASSWORD", "changeme") 237 monkeypatch.setenv("ES_PASSWORD", "changeme")
212 - monkeypatch.setenv("CUSTOMER_ID", "test_customer")  
213 238
214 239
215 # Marker configuration 240 # Marker configuration