API optimization

tangwang
1 parent 13377199
Showing 8 changed files with 1177 additions and 0 deletions
.cursor/plans/将数据pipeline相关配置从索引配置中剥离.md
.cursor/plans/spu-index-b5a93a00.plan.md -> .cursor/plans/所有tenant按同一份所有_返回接口优化.md
api/SearchEngine.code-workspace
config/schema/base/config.yaml
SHOPLAZZA_INTEGRATION_GUIDE.md -> docs/店匠相关资料/SHOPLAZZA_INTEGRATION_GUIDE.md
docs/店匠相关资料/店匠官方参考文档.md
docs/店匠相关资料/搜索web后端调用python搜索接口.md
docs/店匠相关资料/记录tenant和token-获取商品信息.md
@@ -0,0 +1,469 @@
+<!-- b5a93a00-49d7-4266-8dbf-3d3f708334ed c9ba91cf-2b58-440d-86d1-35b805e5d3cf -->
+# Configuration and Pipeline Separation Refactoring
+
+## Overview
+
+Implement clean separation between **Search Configuration** (customer-facing, ES/search focused) and **Data Pipeline** (internal ETL, script-controlled). Configuration files will only contain search engine settings, while data source and transformation logic will be controlled entirely by script parameters.
+
+## Phase 1: Configuration File Cleanup
+
+### 1.1 Clean BASE Configuration
+
+**File**: [`config/schema/base/config.yaml`](config/schema/base/config.yaml)
+
+**Remove** (data pipeline concerns):
+
+- `mysql_config` section
+- `main_table` field
+- `sku_table` field  
+- `extension_table` field
+- `source_table` in field definitions
+- `source_column` in field definitions
+
+**Keep** (search configuration):
+
+- `customer_name`
+- `es_index_name`
+- `es_settings`
+- `fields` (simplified, no source mapping)
+- `indexes` (search domains)
+- `query_config`
+- `function_score`
+- `rerank`
+- `spu_config`
+- `tenant_config` (as template)
+- `default_facets`
+
+**Simplify field definitions**:
+
+```yaml
+fields:
+  - name: "title"
+    type: "TEXT"
+    analyzer: "chinese_ecommerce"
+    boost: 3.0
+    index: true
+    store: true
+    # NO source_table, NO source_column
+```
+
+### 1.2 Update Legacy Configuration
+
+**File**: [`config/schema/customer1_legacy/config.yaml`](config/schema/customer1_legacy/config.yaml)
+
+Apply same cleanup as BASE config, marking it as legacy in comments.
+
+## Phase 2: Transformer Architecture Refactoring
+
+### 2.1 Create Base Transformer Class
+
+**File**: [`indexer/base_transformer.py`](indexer/base_transformer.py) (NEW)
+
+Create abstract base class with shared logic:
+
+- `__init__` with config, encoders, cache
+- `_convert_value()` - type conversion (shared)
+- `_generate_text_embeddings()` - text embedding (shared)
+- `_generate_image_embeddings()` - image embedding (shared)
+- `_inject_tenant_id()` - tenant_id injection (shared)
+- `@abstractmethod transform()` - to be implemented by subclasses
+
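A minimal sketch of what such a base class could look like, assuming the shared helpers listed above. The field type names other than `TEXT` and the `tenant_id` constructor argument are illustrative assumptions, and the embedding helpers are omitted for brevity.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class BaseDataTransformer(ABC):
    """Shared logic for transformers; subclasses implement transform()."""

    def __init__(self, config: Any, text_encoder: Any = None,
                 image_encoder: Any = None, tenant_id: Optional[str] = None):
        self.config = config
        self.text_encoder = text_encoder
        self.image_encoder = image_encoder
        self.tenant_id = tenant_id  # assumed to be passed in by the ingestion script

    def _convert_value(self, value: Any, field_type: str) -> Any:
        """Convert a raw source value to the type declared for the field."""
        if value is None:
            return None
        if field_type in ("INT", "LONG"):  # hypothetical type names
            return int(value)
        if field_type in ("FLOAT", "DOUBLE"):  # hypothetical type names
            return float(value)
        return str(value)

    def _inject_tenant_id(self, doc: Dict[str, Any]) -> Dict[str, Any]:
        """Return a copy of the document tagged with the runtime tenant_id."""
        return {**doc, "tenant_id": self.tenant_id}

    @abstractmethod
    def transform(self, *args: Any, **kwargs: Any) -> List[Dict[str, Any]]:
        """Turn source rows into ES documents."""
```

Because `transform()` is abstract, instantiating `BaseDataTransformer` directly raises `TypeError`, which keeps callers on the concrete SKU or SPU subclasses.
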
+### 2.2 Refactor DataTransformer
+
+**File**: [`indexer/data_transformer.py`](indexer/data_transformer.py)
+
+Changes:
+
+- Inherit from `BaseDataTransformer`
+- Remove dependency on `source_table`, `source_column` from config
+- Accept field mapping as parameter (from script)
+- Implement `transform(df, field_mapping)` method
+
+### 2.3 Refactor SPUDataTransformer
+
+**File**: [`indexer/spu_data_transformer.py`](indexer/spu_data_transformer.py)
+
+Changes:
+
+- Inherit from `BaseDataTransformer`
+- Remove dependency on config's table names
+- Accept field mapping as parameter
+- Implement `transform(spu_df, sku_df, spu_field_mapping, sku_field_mapping)` method
+
+### 2.4 Create Transformer Factory
+
+**File**: [`indexer/transformer_factory.py`](indexer/transformer_factory.py) (NEW)
+
+Factory to create appropriate transformer based on parameters:
+
+```python
+class TransformerFactory:
+    @staticmethod
+    def create(
+        transformer_type: str,  # 'sku' or 'spu'
+        config: CustomerConfig,
+        text_encoder=None,
+        image_encoder=None
+    ) -> BaseDataTransformer:
+        if transformer_type == 'spu':
+            return SPUDataTransformer(config, text_encoder, image_encoder)
+        elif transformer_type == 'sku':
+            return DataTransformer(config, text_encoder, image_encoder)
+        else:
+            raise ValueError(f"Unknown transformer type: {transformer_type}")
+```
+
+### 2.5 Update Package Exports
+
+**File**: [`indexer/__init__.py`](indexer/__init__.py)
+
+Export new structure:
+
+```python
+from .base_transformer import BaseDataTransformer
+from .data_transformer import DataTransformer
+from .spu_data_transformer import SPUDataTransformer
+from .transformer_factory import TransformerFactory
+
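+__all__ = [
+    'BaseDataTransformer',
+    'DataTransformer',
+    'SPUDataTransformer',
+    'TransformerFactory',  # Recommended for new code
+    # Existing exports: keep their current import lines in this file,
+    # otherwise `from indexer import *` fails on these names
+    'BulkIndexer',
+    'IndexingPipeline',
+]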
+```
+
+## Phase 3: Script Refactoring
+
+### 3.1 Create Unified Ingestion Script
+
+**File**: [`scripts/ingest_universal.py`](scripts/ingest_universal.py) (NEW)
+
+Universal ingestion script with full parameter control:
+
+**Parameters**:
+
+```bash
+# Search configuration (pure)
+--config base                      # Which search config to use
+
+# Runtime parameters
+--tenant-id shop_12345            # REQUIRED tenant identifier
+--es-host http://localhost:9200
+--es-username elastic
+--es-password xxx
+
+# Data source parameters (pipeline concern)
+--data-source mysql               # mysql, csv, api, etc.
+--mysql-host 120.79.247.228
+--mysql-port 3316
+--mysql-database saas
+--mysql-username saas
+--mysql-password xxx
+
+# Transformer parameters (pipeline concern)
+--transformer spu                 # spu or sku
+--spu-table shoplazza_product_spu
+--sku-table shoplazza_product_sku
+--shop-id 1                       # Filter by shop_id
+
+# Field mapping (optional, uses defaults if not provided)
+--field-mapping mapping.json
+
+# Processing parameters
+--batch-size 100
+--limit 1000
+--skip-embeddings
+--recreate-index
+```
+
+**Logic**:
+
+1. Load search config (clean, no data source info)
+2. Set tenant_id from parameter
+3. Connect to data source based on `--data-source` parameter
+4. Load data from tables specified by parameters
+5. Create transformer based on `--transformer` parameter
+6. Apply field mapping (default or custom)
+7. Transform and index
+
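The parameter handling for this script could be sketched with `argparse` as follows. Only a subset of the flags listed above is shown, and every default value and the `choices` restrictions are assumptions made for illustration, not part of the plan.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build the CLI parser; defaults and choices here are illustrative."""
    parser = argparse.ArgumentParser(description="Universal ingestion script (sketch)")
    # Search configuration
    parser.add_argument("--config", default="base")
    # Runtime parameters
    parser.add_argument("--tenant-id", required=True)
    parser.add_argument("--es-host", default="http://localhost:9200")
    # Data source parameters (pipeline concern)
    parser.add_argument("--data-source", choices=["mysql", "csv", "api"], default="mysql")
    parser.add_argument("--mysql-host")
    parser.add_argument("--mysql-port", type=int)
    # Transformer parameters (pipeline concern)
    parser.add_argument("--transformer", choices=["spu", "sku"], default="spu")
    parser.add_argument("--spu-table")
    parser.add_argument("--sku-table")
    parser.add_argument("--field-mapping")
    # Processing parameters
    parser.add_argument("--batch-size", type=int, default=100)
    parser.add_argument("--limit", type=int)
    parser.add_argument("--skip-embeddings", action="store_true")
    parser.add_argument("--recreate-index", action="store_true")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv (or sys.argv when argv is None)."""
    return build_parser().parse_args(argv)
```

Marking `--tenant-id` as required makes the script fail fast instead of indexing documents without a tenant.
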
+### 3.2 Update BASE Ingestion Script
+
+**File**: [`scripts/ingest_base.py`](scripts/ingest_base.py)
+
+Update to use script parameters instead of config values:
+
+- Remove dependency on `config.mysql_config`
+- Remove dependency on `config.main_table`, `config.sku_table`
+- Get all data source info from command-line arguments
+- Use TransformerFactory
+
+### 3.3 Create Field Mapping Helper
+
+**File**: [`scripts/field_mapping_generator.py`](scripts/field_mapping_generator.py) (NEW)
+
+Helper script to generate default field mappings:
+
+```bash
+# Generate default mapping for Shoplazza SPU schema
+python scripts/field_mapping_generator.py \
+  --source shoplazza \
+  --level spu \
+  --output mappings/shoplazza_spu.json
+```
+
+Output example:
+
+```json
+{
+  "spu_fields": {
+    "id": "id",
+    "title": "title",
+    "description": "description",
+    ...
+  },
+  "sku_fields": {
+    "id": "id",
+    "price": "price",
+    "sku": "sku",
+    ...
+  }
+}
+```
+
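One way a transformer might apply such a mapping to a single source row is sketched below. The direction of the mapping (keys are target search fields, values are source columns) is an assumption, since the example output uses identical names on both sides, and `apply_field_mapping` is a hypothetical helper name.

```python
from typing import Any, Dict


def apply_field_mapping(row: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Build a document from a source row.

    Keys of `mapping` are treated as target fields and values as source
    columns (assumed direction). Columns missing from the row are skipped.
    """
    return {target: row[source] for target, source in mapping.items() if source in row}
```

With this shape, switching data sources only requires a different mapping file; the search config and the transformer code stay unchanged.
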
+## Phase 4: Configuration Loader Updates
+
+### 4.1 Simplify ConfigLoader
+
+**File**: [`config/config_loader.py`](config/config_loader.py)
+
+Changes:
+
+- Remove parsing of `mysql_config`
+- Remove parsing of `main_table`, `sku_table`, `extension_table`
+- Remove validation of source_table/source_column in fields
+- Simplify field parsing (no source mapping)
+- Keep validation of ES/search related config
+
+### 4.2 Update CustomerConfig Model
+
+**File**: [`config/__init__.py`](config/__init__.py) or wherever CustomerConfig is defined
+
+Remove attributes:
+
+- `mysql_config`
+- `main_table`
+- `sku_table`
+- `extension_table`
+
+Add attributes:
+
+- `tenant_id` (runtime, default None)
+
+Simplify FieldConfig:
+
+- Remove `source_table`
+- Remove `source_column`
+
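After these changes the models could look roughly like the dataclasses below. This is a sketch under assumptions: the real `CustomerConfig` may not be a dataclass, and only a few of the retained attributes are shown.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class FieldConfig:
    """Search-side field definition; no source_table or source_column."""
    name: str
    type: str
    analyzer: Optional[str] = None
    boost: float = 1.0
    index: bool = True
    store: bool = False


@dataclass
class CustomerConfig:
    """Search configuration only; data source settings live in script parameters."""
    customer_name: str
    es_index_name: str
    fields: List[FieldConfig] = field(default_factory=list)
    es_settings: Dict[str, Any] = field(default_factory=dict)
    tenant_id: Optional[str] = None  # set at runtime, not read from the config file
```
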
+## Phase 5: Documentation Updates
+
+### 5.1 Create Pipeline Guide
+
+**File**: [`docs/DATA_PIPELINE_GUIDE.md`](docs/DATA_PIPELINE_GUIDE.md) (NEW)
+
+Document:
+
+- Separation of concerns (config vs pipeline)
+- How to use `ingest_universal.py`
+- Default field mappings for common sources
+- Custom field mapping examples
+- Transformer selection guide
+
+### 5.2 Update BASE Config Guide
+
+**File**: [`docs/BASE_CONFIG_GUIDE.md`](docs/BASE_CONFIG_GUIDE.md)
+
+Update to reflect:
+
+- Config only contains search settings
+- No data source configuration
+- How tenant_id is injected at runtime
+- Examples of using same config with different data sources
+
+### 5.3 Update API Documentation
+
+**File**: [`API_DOCUMENTATION.md`](API_DOCUMENTATION.md)
+
+No changes needed (API layer doesn't know about data pipeline).
+
+### 5.4 Update Design Documentation  
+
+**File**: [`设计文档.md`](设计文档.md)
+
+Add section on configuration architecture:
+
+- Clear separation between search config and pipeline
+- Benefits of this approach
+- How to extend for new data sources
+
+## Phase 6: Create Default Field Mappings
+
+### 6.1 Shoplazza SPU Mapping
+
+**File**: [`mappings/shoplazza_spu.json`](mappings/shoplazza_spu.json) (NEW)
+
+Default field mapping for Shoplazza SPU/SKU tables to BASE config fields.
+
+### 6.2 Shoplazza SKU Mapping (Legacy)
+
+**File**: [`mappings/shoplazza_sku_legacy.json`](mappings/shoplazza_sku_legacy.json) (NEW)
+
+Default field mapping for legacy SKU-level indexing.
+
+### 6.3 CSV Template Mapping
+
+**File**: [`mappings/csv_template.json`](mappings/csv_template.json) (NEW)
+
+Example mapping for CSV data sources.
+
+## Phase 7: Testing & Validation
+
+### 7.1 Test Script with Different Sources
+
+Test `ingest_universal.py` with:
+
+1. MySQL Shoplazza tables (SPU level)
+2. MySQL Shoplazza tables (SKU level, legacy)
+3. CSV files (if time permits)
+
+### 7.2 Verify Configuration Portability
+
+Test same BASE config with:
+
+- Different data sources
+- Different field mappings
+- Different transformers
+
+### 7.3 Update Test Scripts
+
+**File**: [`scripts/test_base.sh`](scripts/test_base.sh)
+
+Update to use new script parameters.
+
+## Phase 8: Migration & Cleanup
+
+### 8.1 Create Migration Guide
+
+**File**: [`docs/CONFIG_MIGRATION_GUIDE.md`](docs/CONFIG_MIGRATION_GUIDE.md) (NEW)
+
+Guide for migrating from old config format to new:
+
+- What changed
+- How to update existing configs
+- How to update ingestion scripts
+- Breaking changes
+
+### 8.2 Update Example Configs
+
+Update all example configurations to new format.
+
+### 8.3 Mark Old Scripts as Deprecated
+
+Add deprecation warnings to scripts that still use old config format.
+
+## Key Design Principles
+
+### 1. Separation of Concerns
+
+**Search Configuration** (customer-facing):
+
+- What fields exist in ES
+- How fields are analyzed/indexed
+- Search strategies and ranking
+- Facets and aggregations
+- Query processing rules
+
+**Data Pipeline** (internal):
+
+- Where data comes from
+- How to connect to data sources
+- Which tables/files to read
+- How to transform data
+- Field mapping logic
+
+### 2. Configuration Portability
+
+Same search config can be used with:
+
+- Different data sources (MySQL, CSV, API)
+- Different schemas (with appropriate mapping)
+- Different transformation strategies
+
+### 3. Flexibility
+
+Pipeline decisions (transformer, data source, field mapping) made at runtime, not in config.
+
+## Migration Path
+
+### For Existing Users
+
+1. Update config files (remove data source settings)
+2. Update ingestion commands (add new parameters)
+3. Optionally create field mapping files for convenience
+
+### For New Users
+
+1. Copy BASE config (already clean)
+2. Run `ingest_universal.py` with appropriate parameters
+3. Provide custom field mapping if needed
+
+## Success Criteria
+
+- [ ] BASE config contains ZERO data source information
+- [ ] Same config works with MySQL and CSV sources
+- [ ] Pipeline fully controlled by script parameters
+- [ ] Transformers work with external field mapping
+- [ ] Documentation clearly separates concerns
+- [ ] Tests validate portability
+- [ ] Migration guide provided
+
+## Estimated Effort
+
+- Configuration cleanup: 2 hours
+- Transformer refactoring: 4-5 hours
+- Script refactoring: 3-4 hours
+- Config loader updates: 2 hours
+- Documentation: 2-3 hours
+- Testing & validation: 2-3 hours
+- **Total: 15-19 hours**
+
+## Benefits
+
+✅ **Clean separation of concerns**
+
+✅ **Configuration reusability across data sources**
+
+✅ **Customer doesn't need to understand ETL**
+
+✅ **Easier to add new data sources**
+
+✅ **More flexible pipeline control**
+
+✅ **Reduced configuration complexity**
+
+### To-dos
+
+- [ ] Clean BASE and legacy configs: remove mysql_config, table names, source_table/source_column from fields
+- [ ] Create BaseDataTransformer abstract class with shared logic (type conversion, embeddings, tenant_id)
+- [ ] Refactor DataTransformer and SPUDataTransformer to inherit from base, accept field mapping as parameter
+- [ ] Create TransformerFactory for creating transformers based on type parameter
+- [ ] Create ingest_universal.py with full parameter control for data source, transformer, field mapping
+- [ ] Update scripts/ingest_base.py to use parameters instead of config for data source
+- [ ] Create field_mapping_generator.py and default mapping files (shoplazza_spu.json, etc.)
+- [ ] Simplify ConfigLoader to only parse search config, remove data source parsing
+- [ ] Create DATA_PIPELINE_GUIDE.md documenting pipeline approach and config separation
+- [ ] Update BASE_CONFIG_GUIDE.md to reflect config-only-search-settings approach
+- [ ] Create CONFIG_MIGRATION_GUIDE.md for migrating from old to new config format
+- [ ] Test same config with different data sources and validate portability
 \ No newline at end of file
@@ -0,0 +1,8 @@
+{
+	"folders": [
+		{
+			"path": ".."
+		}
+	],
+	"settings": {}
+}
 \ No newline at end of file
@@ -0,0 +1,13 @@
+### 13.1 Official documentation
+
+- [Shoplazza developer documentation](https://www.shoplazza.dev/reference/overview-29)
+- [Shoplazza OAuth documentation](https://www.shoplazza.dev/v2024.07/reference/authentication)
+- [Shoplazza API reference](https://www.shoplazza.dev/v2024.07/reference/overview)
+- [Shoplazza Webhook documentation](https://www.shoplazza.dev/v2024.07/reference/webhooks)
+
+### 13.2 Technology stack documentation
+
+- [OAuth 2.0 RFC 6749](https://tools.ietf.org/html/rfc6749)
+- [Elasticsearch official documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html)
+- [Liquid template language](https://shopify.github.io/liquid/)
+- [FastAPI documentation](https://fastapi.tiangolo.com/)
@@ -0,0 +1,261 @@
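+Good question. Here is how the search app's call chain relates to OAuth.
+
+## Call chain overview
+
+### 1. **What OAuth is for**
+
+OAuth is **not used for storefront search calls**. Its main role is:
+
+```mermaid
+graph LR
+    A[Merchant installs app] --> B[OAuth authorization]
+    B --> C[Obtain access token]
+    C --> D[Backend pulls product data]
+    D --> E[Build ES index]
+    E --> F[Search service ready]
+```
+
+**What the OAuth token is used for:**
+- ✅ Your backend calling the Shoplazza Admin API (pulling product, order, and customer data)
+- ✅ Registering webhooks (to receive data-change notifications)
+- ❌ **Not used** when a shopper searches from the storefront
+
+### 2. **The actual call chain for storefront search**
+
+When a shopper searches for products in a store:
+
+```
+Shopper's browser → search box component (Liquid/JS) → your search API → Elasticsearch → results returned
+```
+
+**Key points:**
+- The storefront JavaScript calls your public search API **directly**
+- No OAuth token is required
+- Each request must carry a store identifier (`tenant` or `storeId` in the examples that follow) so the API knows which store is being searched
+
+### 3. **Two options for the search endpoint**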
+
+## Detailed answer
+
+### 📍 **Option A: direct calls from the storefront (recommended for public search)**
+
+**Flow:**
+
+```javascript
+// In the storefront page (runs in the shopper's browser)
+const response = await fetch('https://your-domain.com/api/search/products', {
+  method: 'POST',
+  headers: {
+    'Content-Type': 'application/json'
+  },
+  body: JSON.stringify({
+    query: "bluetooth headphones",
+    tenant: "tenant_47167113-1",  // store identifier
+    size: 24,
+    filters: {},
+    facets: ['product_type', 'vendor']
+  })
+});
+```
+
+**Your search API needs to:**
+
+1. **Allow cross-origin access (CORS)**:
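+```python
+# Python FastAPI example
+from fastapi import FastAPI
+from fastapi.middleware.cors import CORSMiddleware
+
+app = FastAPI()
+
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],  # or restrict to an allowlist of Shoplazza store domains
+    # Public search sends no cookies; credentials should not be combined with "*"
+    allow_credentials=False,
+    allow_methods=["POST"],
+    allow_headers=["*"],
+)
+```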
+
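+2. **Isolate data by store** (identified by the `tenant` field of the request):
+```python
+@app.post("/api/search/products")
+async def search(request: SearchRequest):
+    # Derive tenant_id from the tenant parameter
+    tenant_id = extract_tenant_id(request.tenant)
+
+    # Use the tenant's dedicated index
+    index_name = f"shoplazza_products_{tenant_id}"
+
+    # Build the ES query body; build_es_query is a hypothetical helper, not shown here
+    query = build_es_query(request)
+
+    # Run the search
+    results = es_client.search(index=index_name, body=query)
+    return results
+```
+
+3. **No OAuth token authentication is required** (search is a public query)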
+
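The handler above calls `extract_tenant_id` without showing it. Because the endpoint is unauthenticated and the tenant value ends up inside an index name, that helper is a natural place to validate input. The following is a sketch only: the `tenant_` prefix comes from the examples in this document, while the allowed character set and the `index_name_for` helper are assumptions.

```python
import re

# Assumed format: "tenant_" followed by lowercase letters, digits, or hyphens.
# Elasticsearch index names must be lowercase, and rejecting "*" and ","
# prevents a request from addressing other tenants' indices.
_TENANT_RE = re.compile(r"tenant_([a-z0-9][a-z0-9-]{0,63})")


def extract_tenant_id(tenant: str) -> str:
    """Return the tenant ID from a value such as 'tenant_47167113-1'."""
    match = _TENANT_RE.fullmatch(tenant)
    if match is None:
        raise ValueError(f"Invalid tenant value: {tenant!r}")
    return match.group(1)


def index_name_for(tenant: str) -> str:
    """Build the tenant-specific index name used by the search handler."""
    return f"shoplazza_products_{extract_tenant_id(tenant)}"
```

In a FastAPI handler the `ValueError` would typically be turned into an HTTP 400 response.
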
+---
+
+### 📍 **Option B: relay through the Java backend (more secure)**
+
+**Flow:**
+
+```
+Shopper's browser → Java backend (/api/search/products?storeId=xxx) → Python search service → ES
+```
+
+**Java backend code:**
+
+```java
+@RestController
+@RequestMapping("/api/search")
+public class SearchController {
+    
+    @PostMapping("/products")
+    public ResponseEntity<SearchResponse> search(
+            @RequestParam String storeId,  // store ID from the URL query parameter
+            @RequestBody SearchRequest request) {
+        
+        // 1. Validate the store ID (optionally check a domain allowlist)
+        ShopConfig shop = shopConfigMapper.selectByStoreId(storeId);
+        if (shop == null) {
+            return ResponseEntity.notFound().build();
+        }
+        
+        // 2. Add the tenant isolation parameter
+        request.setTenant("tenant_" + shop.getTenantId());
+        
+        // 3. Call the Python search service
+        SearchResponse response = restTemplate.postForObject(
+            "http://localhost:6002/search/",
+            request,
+            SearchResponse.class
+        );
+        
+        // 4. Record the search in the log
+        searchLogService.logSearch(shop.getId(), request.getQuery(), response.getTotal());
+        
+        return ResponseEntity.ok(response);
+    }
+}
+```
+
+**Storefront call (with `storeId`):**
+
+```javascript
+const response = await fetch(
+  `https://your-domain.com/api/search/products?storeId=${config.storeId}`,
+  {
+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({
+      query: "bluetooth headphones",
+      size: 24,
+      filters: {},
+      facets: ['product_type', 'vendor']
+    })
+  }
+);
+```
+
+---
+
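+## 🔐 Where OAuth fits in the overall system
+
+```mermaid
+graph TB
+    subgraph "1. Merchant installation phase (uses OAuth)"
+        A[Merchant installs app] --> B[OAuth authorization]
+        B --> C[Obtain access token]
+        C --> D[Store token in database]
+    end
+    
+    subgraph "2. Data preparation phase (uses the OAuth token)"
+        D --> E[Scheduled job starts]
+        E --> F[Call Shoplazza API with token]
+        F --> G[Pull product/order data]
+        G --> H[Build ES index]
+    end
+    
+    subgraph "3. Shopper search phase (no OAuth needed)"
+        I[Shopper visits store] --> J[Enters search term]
+        J --> K[Storefront JS calls search API directly]
+        K --> L[Search ES index]
+        L --> M[Return results]
+    end
+    
+    H -. after index is built .-> L
+```
+
+**Key takeaways:**
+- **OAuth token** = your backend ↔ Shoplazza Admin API (used to pull data)
+- **Storefront search** = shopper's browser ↔ your search API (no OAuth needed)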
+
+---
+
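+## ✅ What you need to do
+
+### 1. **Search API design**
+
+Your Python search service should already accept requests like this:
+
+```http
+POST http://your-domain:6002/search/
+Content-Type: application/json
+
+{
+  "query": "bluetooth headphones",
+  "tenant": "tenant_1",
+  "size": 20,
+  "filters": {},
+  "facets": ["product_type", "vendor"]
+}
+```
+
+The `tenant` field is essential: it isolates each store's data from the others.
+
+### 2. **CORS configuration** (if the storefront calls the API directly)
+
+Add the following to the Python FastAPI service: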
+
+```python
+from fastapi.middleware.cors import CORSMiddleware
+
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["https://your-domain.com"],
+    # allow_origins entries are matched exactly, so wildcard subdomains need a regex
+    allow_origin_regex=r"https://.*\.myshoplaza\.com",  # Shoplazza store domains
+    allow_methods=["POST", "GET"],
+    allow_headers=["*"],
+)
+```
+
+### 3. **Passing the store identifier**
+
+Read the store domain in the storefront Liquid template:
+
+```liquid
+<script>
+window.AI_SEARCH_CONFIG = {
+  storeId: "{{ shop.domain }}",  // injected automatically by Shoplazza
+  apiEndpoint: "https://your-domain.com/api/search/products"
+};
+</script>
+```
+
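+### 4. **Security considerations**
+
+- ✅ Tenant isolation: query a different index depending on the `tenant` parameter
+- ✅ Domain allowlist: only accept calls from Shoplazza store domains
+- ✅ Rate limiting: guard against abusive requests
+- ❌ **No need** to verify an OAuth token on every search
+
+---
+
+## 🎯 Summary
+
+1. **OAuth is only used for back-office data synchronization**, not for storefront search
+2. **Storefront search calls your public API directly**, with data isolated by the `storeId` parameter
+3. **Your search endpoint can be unauthenticated**, but tenant isolation must be solid
+4. **Recommended setup**: storefront → your Java backend → Python search service (easier logging, analytics, and security control)
+
+Any other questions?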
 \ No newline at end of file
@@ -0,0 +1,426 @@
+Below is a detailed explanation of each task, with reference to the relevant code:
+
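+## 📋 Detailed task breakdown
+
+Each development task is explained in turn:
+
+---
+
+## 1️⃣ **OAuth migration - follow the Go implementation**
+
+**Goal:** port the existing Go OAuth implementation to the Java project
+
+**Work items:**
+```java
+// Java endpoints to implement:
+GET  /oauth/install?shop={shop_domain}     // handle the app install request
+GET  /oauth/callback?code=xxx&shop=xxx     // handle the OAuth authorization callback
+POST /partner/oauth/token                  // exchange for / refresh the access token
+```
+
+**What the reference Go code does:**
+- Generates the authorization URL and redirects
+- Handles the authorization callback
+- Exchanges the code for an access token
+- Parses the token response and stores it
+
+---
+
+## 2️⃣ **AccessToken storage - save to shoplazza_shop_config**
+
+**Goal:** after OAuth succeeds, save the token information to the database
+
+**Diagram:**
+```
+Shoplazza platform              Search SaaS platform
+------------------              --------------------
+[App marketplace]
+    ↓
+[Merchant installs app] ------→ OAuth authorization flow
+    ↓                               ↓
+[Merchant grants access] -----→ [Item 2] Create tenant + store token
+                                    ↓
+                                system_tenant (new)
+                                shoplazza_shop_config (new)
+                                Store AccessToken and RefreshToken
+                                    ↓
+                                [Item 3] Refresh token on a schedule
+```
+
+Token acquisition and usage flow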
+
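+```mermaid
+sequenceDiagram
+    participant Merchant
+    participant Shoplazza
+    participant Backend as Your backend
+    participant Database
+    
+    Note over Merchant,Backend: 1. OAuth authorization phase
+    Merchant->>Shoplazza: Install app
+    Shoplazza->>Backend: Redirect for authorization
+    Merchant->>Shoplazza: Grant authorization
+    Shoplazza->>Backend: Callback with code
+    Backend->>Shoplazza: Exchange code for token
+    Shoplazza->>Backend: Return access token
+    Backend->>Database: Store in shoplazza_shop_config
+    
+    Note over Backend,Database: 2. Webhook registration phase
+    Backend->>Database: Read access token
+    Backend->>Shoplazza: Register webhook (with access token)
+    Shoplazza->>Backend: Webhook registered
+```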
+
+**Core logic:**
+```java
+@Transactional
+public void handleOAuthCallback(String storeId, String storeName, TokenResponse tokenResponse) {
+    // storeId and storeName identify the authorizing store; the caller is assumed to resolve them
+    // 1. Create the tenant if it does not exist yet
+    Tenant tenant = tenantMapper.selectByStoreId(storeId);
+    if (tenant == null) {
+        tenant = new Tenant();
+        tenant.setName(storeName);
+        tenantMapper.insert(tenant);  // 👈 create the new tenant
+    }
+    
+    // 2. Create or update the store configuration
+    ShopConfig shop = shopConfigMapper.selectByStoreId(storeId);
+    if (shop == null) {
+        shop = new ShopConfig();
+        shop.setTenantId(tenant.getId());
+        shop.setStoreId(storeId);
+        shop.setStoreName(storeName);
+    }
+    
+    // 3. Save the token information
+    shop.setAccessToken(tokenResponse.getAccessToken());      // 👈 stored
+    shop.setRefreshToken(tokenResponse.getRefreshToken());    // 👈 stored
+    shop.setTokenExpiresAt(tokenResponse.getExpiresAt());     // 👈 stored
+    shop.setLocale(tokenResponse.getLocale());
+    shop.setStatus("active");
+    
+    shopConfigMapper.insertOrUpdate(shop);
+}
+```
+
+**Table:** `shoplazza_shop_config` (already designed in Chapter 4 of the document)
+
+### 📊 Token table relationships
+
+```sql
+-- Data stored in the shoplazza_shop_config table
++----------+----------------+----------------------------------------+
+| store_id | store_name     | access_token                           |
++----------+----------------+----------------------------------------+
+| 2286274  | 47167113-1     | V2WDYgkTvrN68QCESZ9eHb3EjpR6EB...     |  👈 saved during OAuth
++----------+----------------+----------------------------------------+
+                                     ↓
+                              read when registering webhooks
+```
+
+### 🔐 Two uses of the token
+
+**The access token serves two main purposes in your system:**
+
+1. **Pulling data** - calling the Shoplazza Admin API
+   - Products: `GET /openapi/2022-01/products`
+   - Orders: `GET /openapi/2022-01/orders`
+   - Customers: `GET /openapi/2022-01/customers`
+
+2. **Registering webhooks** - so that Shoplazza pushes data changes to you
+   - Register: `POST /openapi/2022-01/webhooks` (token required)
+   - Receive: Shoplazza pushes to your `/webhook/shoplazza/{storeId}` endpoint (no token required)
+
+### ⚠️ Notes
+
+```java
+// Make sure the token is valid before registering webhooks
+public void registerWebhooks(Long shopConfigId) {
+    ShopConfig shop = shopConfigMapper.selectById(shopConfigId);
+    
+    // Check whether the token has expired
+    if (shop.getTokenExpiresAt().before(new Date())) {
+        // Token has expired: refresh it first
+        tokenService.refreshToken(shop);
+        shop = shopConfigMapper.selectById(shopConfigId);  // re-read the updated row
+    }
+    
+    // Register the webhooks with a valid token
+    String accessToken = shop.getAccessToken();
+    // ... registration logic
+}
+```
+
+---
+
+## 3️⃣ **RefreshToken implementation - scheduled job that must handle multiple stores**
+
+**Goal:** automatically refresh access tokens that are about to expire
+
+**Implementation:**
+
+```java
+@Scheduled(cron = "0 0 2 * * ?")  // runs every day at 02:00
+public void refreshExpiringTokens() {
+    // 1. Find every store whose token expires within 7 days
+    DateTime sevenDaysLater = DateTime.now().plusDays(7);
+    List<ShopConfig> shops = shopConfigMapper.selectExpiringTokens(sevenDaysLater);
+    
+    // 2. Refresh the token of each store in turn
+    for (ShopConfig shop : shops) {
+        try {
+            TokenResponse newToken = oauthClient.refreshToken(
+                shop.getRefreshToken(),
+                clientId,
+                clientSecret
+            );
+            
+            // 3. Update the token in the database
+            shop.setAccessToken(newToken.getAccessToken());
+            shop.setRefreshToken(newToken.getRefreshToken());
+            shop.setTokenExpiresAt(newToken.getExpiresAt());
+            shopConfigMapper.updateById(shop);
+            
+            log.info("Token refreshed for shop: {}", shop.getStoreName());
+        } catch (Exception e) {
+            log.error("Failed to refresh token for shop: {}", shop.getStoreName(), e);
+            // Send an alert notification
+        }
+    }
+}
+```
+
+**Key points:**
+- ✅ Processes multiple stores in one batch
+- ✅ Refreshes 7 days ahead of expiry (to avoid lapses)
+- ✅ Exception handling and alerting
+
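The two properties that matter in this job, the 7-day look-ahead and per-store error isolation, can be illustrated independently of Spring. The sketch below uses plain dictionaries and a caller-supplied `refresh_one` function; both are stand-ins for the real mapper and OAuth client, not part of the original design.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, Dict, Iterable, List


def select_expiring(shops: Iterable[Dict[str, Any]], now: datetime,
                    days_ahead: int = 7) -> List[Dict[str, Any]]:
    """Return shops whose token expires at or before now + days_ahead."""
    cutoff = now + timedelta(days=days_ahead)
    return [shop for shop in shops if shop["token_expires_at"] <= cutoff]


def refresh_all(shops: Iterable[Dict[str, Any]],
                refresh_one: Callable[[Dict[str, Any]], None]) -> List[str]:
    """Refresh each shop independently and return the names that failed."""
    failed = []
    for shop in shops:
        try:
            refresh_one(shop)
        except Exception:  # one failing store must not stop the batch
            failed.append(shop["store_name"])
    return failed
```

The list returned by `refresh_all` is what an alerting hook would consume.
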
+---
+
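+## 4️⃣ **Optimizing bulk product retrieval - verify paginated queries**
+
+**Goal:** harden product data synchronization so that pagination is handled correctly
+
+**Current problem:** the code may fetch only the first page and never iterate over the remaining pages
+
+**What to verify and improve:**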
+
+```java
+// Thread.sleep throws a checked exception, so the method declares it
+public void syncProducts(Long shopConfigId) throws InterruptedException {
+    ShopConfig shop = shopConfigMapper.selectById(shopConfigId);
+    
+    int page = 1;
+    int limit = 50;
+    boolean hasMore = true;
+    
+    while (hasMore) {  // 👈 key point: loop until there is no more data
+        // Call the Shoplazza API
+        String url = String.format(
+            "https://%s/openapi/2022-01/products?page=%d&limit=%d",
+            shop.getStoreDomain(), page, limit
+        );
+        
+        ProductListResponse response = apiClient.get(url, shop.getAccessToken());
+        
+        // Check whether there is more data
+        if (response.getProducts() == null || response.getProducts().isEmpty()) {
+            hasMore = false;  // 👈 no data left, leave the loop
+            break;
+        }
+        
+        // Save the products on the current page
+        for (ProductDto product : response.getProducts()) {
+            saveProduct(shop.getTenantId(), shop.getStoreId(), product);
+        }
+        
+        page++;  // 👈 next page
+        Thread.sleep(100);  // avoid hitting the rate limit
+    }
+}
+```
+
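+**Verification checklist:**
+- ✅ Pagination parameters are passed correctly
+- ✅ The loop terminates correctly
+- ✅ Empty pages are handled
+- ✅ Rate limiting is respected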
+
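The same loop can be checked without a live API by injecting the page fetcher. This is a sketch: `fetch_page` is a hypothetical callable standing in for the HTTP call, and stopping early on a short page assumes the API fills every page except the last, which should be confirmed against the real endpoint.

```python
from typing import Any, Callable, Dict, Iterator, List


def iter_all_pages(fetch_page: Callable[[int, int], List[Dict[str, Any]]],
                   limit: int = 50,
                   max_pages: int = 10_000) -> Iterator[Dict[str, Any]]:
    """Yield every item across pages, starting at page 1.

    Stops on an empty page, on a page shorter than `limit` (assumed to be
    the last one), or after `max_pages` as a guard against endless loops.
    """
    for page in range(1, max_pages + 1):
        items = fetch_page(page, limit)
        if not items:
            return
        yield from items
        if len(items) < limit:
            return
```

Since products, customers, and orders all share the `page` and `limit` parameters, one generator like this could back all three sync jobs.
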
+---
+
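+## 5️⃣ **Optimizing bulk customer retrieval - verify paginated queries**
+
+**Goal:** harden customer data synchronization, along the same lines as product sync
+
+**Implementation logic:**
+```java
+public void syncCustomers(Long shopConfigId) {
+    // Same as syncProducts: iterate over every page
+    String url = "https://{shop}/openapi/2022-01/customers?page={page}&limit=50";
+    
+    // Loop until all pages are fetched
+    // Save to the shoplazza_customer and shoplazza_customer_address tables
+}
+```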
+
+---
+
+## 6️⃣ **Optimizing bulk order retrieval - verify paginated queries**
+
+**Goal:** harden order data synchronization
+
+**Implementation logic:**
+```java
+public void syncOrders(Long shopConfigId) {
+    String url = "https://{shop}/openapi/2022-01/orders?page={page}&limit=50";
+    
+    // Save to the shoplazza_order and shoplazza_order_item tables
+}
+```
+
+---
+
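+## 7️⃣ **Bulk store information retrieval - new implementation, requires a new database table**
+
+**Goal:** pull the store's detailed configuration
+
+**API call:**
+```bash
+GET /openapi/2022-01/shop
+```
+
+**Possible response fields:**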
+```json
+{
+  "id": "2286274",
+  "name": "47167113-1",
+  "domain": "47167113-1.myshoplaza.com",
+  "email": "shop@example.com",
+  "currency": "USD",
+  "timezone": "Asia/Shanghai",
+  "locale": "zh-CN",
+  "address": {...},
+  "phone": "+86 123456789"
+}
+```
+
+**Table to design:**
+```sql
+CREATE TABLE `shoplazza_shop_info` (
+  `id` BIGINT NOT NULL AUTO_INCREMENT,
+  `store_id` VARCHAR(64) NOT NULL,
+  `shop_name` VARCHAR(255),
+  `domain` VARCHAR(255),
+  `email` VARCHAR(255),
+  `currency` VARCHAR(16),
+  `timezone` VARCHAR(64),
+  `locale` VARCHAR(16),
+  `phone` VARCHAR(64),
+  `address` JSON,  -- full address information
+  `plan_name` VARCHAR(64),  -- plan name
+  `created_at` DATETIME,
+  `updated_at` DATETIME,
+  PRIMARY KEY (`id`),
+  UNIQUE KEY `uk_store_id` (`store_id`)
+) COMMENT='Store detail information';
+```
+
+---
+
+## 8️⃣ **Registering store webhooks - new implementation, requires security verification**
+
+**Goal:** register webhooks for every store to receive real-time data-change notifications
+
+**Steps:**
+
+### A. Register webhooks (called proactively by the backend)
+
+```java
+@Service
+public class WebhookService {
+    
+    private static final List<String> WEBHOOK_TOPICS = Arrays.asList(
+        "products/create", "products/update", "products/delete",
+        "orders/create", "orders/updated", "customers/create"
+    );
+    
+    public void registerWebhooks(Long shopConfigId) {
+        ShopConfig shop = shopConfigMapper.selectById(shopConfigId);
+        String webhookUrl = "https://your-domain.com/webhook/shoplazza/" + shop.getStoreId();
+        
+        for (String topic : WEBHOOK_TOPICS) {
+            // Register the topic through the Shoplazza API
+            apiClient.post(
+                "https://" + shop.getStoreDomain() + "/openapi/2022-01/webhooks",
+                shop.getAccessToken(),
+                Map.of("address", webhookUrl, "topic", topic)
+            );
+        }
+    }
+}
+```
+
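+### B. Receive webhooks (pushed by Shoplazza)
+
+```java
+// Requires: javax.crypto.Mac, javax.crypto.spec.SecretKeySpec, java.security.GeneralSecurityException,
+// java.security.MessageDigest, java.util.Base64, java.nio.charset.StandardCharsets
+@RestController
+@RequestMapping("/webhook/shoplazza")
+public class WebhookController {
+    
+    @PostMapping("/{storeId}")
+    public ResponseEntity<String> handleWebhook(
+            @PathVariable String storeId,
+            @RequestHeader("X-Shoplazza-Hmac-Sha256") String signature,  // 👈 security check
+            @RequestHeader("X-Shoplazza-Topic") String topic,
+            @RequestBody String payload) {
+        
+        // 1. Verify the signature
+        if (!verifySignature(payload, signature, clientSecret)) {
+            return ResponseEntity.status(401).body("Invalid signature");
+        }
+        
+        // 2. Process the event asynchronously
+        webhookService.processAsync(storeId, topic, payload);
+        
+        // 3. Return 200 immediately (Shoplazza requires a response within 3 seconds)
+        return ResponseEntity.ok("OK");
+    }
+    
+    // HMAC-SHA256 signature verification
+    private boolean verifySignature(String payload, String signature, String secret) {
+        try {
+            Mac mac = Mac.getInstance("HmacSHA256");
+            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
+            byte[] hash = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
+            byte[] computed = Base64.getEncoder().encode(hash);
+            // Constant-time comparison avoids leaking how many leading bytes matched
+            return MessageDigest.isEqual(computed, signature.getBytes(StandardCharsets.UTF_8));
+        } catch (GeneralSecurityException e) {
+            return false;  // treat any crypto failure as an invalid signature
+        }
+    }
+}
+```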
+
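+**Key security points:**
+- ✅ Verify the signature with HMAC-SHA256
+- ✅ Use the app's Client Secret as the signing key
+- ✅ Respond within 3 seconds
+- ✅ Process events asynchronously to avoid timeouts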
+
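For the Python side of the stack, the same check can be sketched with the standard library. It assumes, as the Java example does, that the header carries the Base64-encoded HMAC-SHA256 of the raw request body keyed with the app's Client Secret; that encoding should be confirmed against the Shoplazza webhook documentation.

```python
import base64
import hashlib
import hmac


def verify_signature(payload: bytes, signature: str, secret: str) -> bool:
    """Return True if `signature` is the Base64 HMAC-SHA256 of `payload`."""
    digest = hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).digest()
    expected = base64.b64encode(digest)
    # compare_digest runs in constant time; comparing bytes also tolerates
    # non-ASCII header values, which would raise TypeError as str
    return hmac.compare_digest(expected, signature.encode("utf-8"))
```

The payload must be the raw request bytes: re-serializing parsed JSON can change whitespace or key order and invalidate the signature.
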
+---
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+