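# Search API Integration Guide 05: Indexing Endpoints (Indexer)

This part covers every endpoint related to data synchronization and index building (Chapter 5 of the original document). It is intended for integrating an `external indexer` with the `Indexer service`.

## Indexing Endpoints

This section matches the indexing services implemented in `api/routes/indexer.py` and covers the following endpoints: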

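| Endpoint | Method | Path | Description |
|------|------|------|------|
| Full reindex | POST | `/indexer/reindex` | Imports all SPUs of the given tenant into ES (does not delete the existing index) |
| Incremental index | POST | `/indexer/index` | Indexes or deletes by SPU ID list; supports both auto-detected and explicit deletion |
| Query documents | POST | `/indexer/documents` | Fetches ES document data for a list of SPU IDs; does not write to ES |
| Build ES docs (production) | POST | `/indexer/build-docs` | The upstream supplies MySQL row data; returns ES-ready docs without writing to ES |
| Build ES docs (testing) | POST | `/indexer/build-docs-from-db` | This service queries the database and builds the docs; for testing/debugging only |
| Content enrichment fields | POST | `/indexer/enrich-content` | Batch-generates `qanchors`, `enriched_attributes`, `enriched_tags`, and `enriched_taxonomy_attributes` from product content (used by the microservice-composition mode) |
| Indexing health check | GET | `/indexer/health` | Checks the status of the indexing service and its database connection |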

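### 5.0 Three Ways to Support an External Indexer

This service offers three integration modes for **external indexer programs** (such as a Java indexing system). Choose whichever fits:

| Mode | Description | When to use |
|------|------|----------|
| **1) Doc-building endpoints** | Call `POST /indexer/build-docs` or `POST /indexer/build-docs-from-db`. This service builds complete ES documents from MySQL row data (multilingual fields, vectors, specifications, and so on) and **does not write to ES**; the caller writes them. | You want ES-ready docs in one call while keeping control over when ES is written and which index name is used. |
| **2) Microservice composition** | Call the **translation**, **vectorization**, and **content enrichment** endpoints separately; the indexer program assembles the doc and writes it to ES itself. Translation and vectorization are standalone microservices (see Section 7); content enrichment is the Indexer-service endpoint `POST /indexer/enrich-content`. | You need flexible orchestration, or want to decouple slow steps such as LLM calls and vectorization from the main pipeline (for example, backfilling qanchors/tags asynchronously). |
| **3) This service writes ES directly** | Call full indexing `POST /indexer/reindex` or incremental indexing `POST /indexer/index` (with a list of SPU IDs). This service pulls the data from MySQL and writes it straight to ES. | Self-managed operations, joint debugging, or cases where Java does not need to write ES. |

- In **Mode 1** and **Mode 2**, ES is always written by the external indexer (or Java), so responsibilities stay clearly separated.
- In **Mode 3**, this service reads the database, builds the docs, and writes ES.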

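### 5.1 Creating an Index for a Tenant

Creating an index for a tenant takes two steps:

1. **Create the index structure** (optional; only needed when updating the mapping or creating the index for the first time in a new environment)
   - Use the script to create the ES index structure (based on `mappings/search_products.json`)
   - If the index already exists, the script asks for confirmation (confirming deletes the existing data)

2. **Import data** (required)
   - Use the full indexing endpoint `/indexer/reindex` to import data

**Creating the index structure (with multi-environment namespace support)**:

```bash
# Using the UAT environment as an example:
# 1. Prepare the .env for UAT (containing the UAT ES_HOST/DB_HOST, etc.)
# 2. Set the environment prefix (this can also be configured directly in .env):
export RUNTIME_ENV=uat
export ES_INDEX_NAMESPACE=uat_

# 3. Create the index structure for tenant_id=170
./scripts/create_tenant_index.sh 170
```

The script automatically loads the ES configuration from the `.env` file in the project root and creates the index according to `ES_INDEX_NAMESPACE`:

- prod (ES_INDEX_NAMESPACE is empty): `search_products_tenant_170`
- UAT (ES_INDEX_NAMESPACE=uat_): `uat_search_products_tenant_170`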
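
The naming rule above can be written as a one-line helper, which is handy when an external indexer writes ES itself and has to target the right index. This is a minimal sketch: the function name `tenant_index_name` is hypothetical, and the pattern is inferred only from the two example names listed above.

```python
def tenant_index_name(tenant_id: str, namespace: str = "") -> str:
    """Compose <namespace>search_products_tenant_<tenant_id>.

    `namespace` mirrors ES_INDEX_NAMESPACE (empty for prod, "uat_" for UAT).
    Assumption: the pattern is inferred from the documented example names.
    """
    return f"{namespace}search_products_tenant_{tenant_id}"


print(tenant_index_name("170"))
print(tenant_index_name("170", "uat_"))
```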

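**Notes**:
- ⚠️ If the index already exists, the script asks for confirmation; confirming deletes the existing data
- After creating the index, you **must** call `/indexer/reindex` to import data
- If you only need to refresh data and not change the index structure, call `/indexer/reindex` directly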

---

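### 5.2 Full Indexing Endpoint

- **Endpoint**: `POST /indexer/reindex`
- **Description**: Full indexing. Imports all SPU data of the given tenant into the ES index (the existing index is not deleted). **Recommended only for self-testing and operations**; in production, it is better to let an upstream system such as Java control scheduling and ES writes.

#### Request Parameters

```json
{
  "tenant_id": "162",
  "batch_size": 500
}
```

| Parameter | Type | Required | Default | Description |
|------|------|------|--------|------|
| `tenant_id` | string | Y | - | Tenant ID |
| `batch_size` | integer | N | 500 | Batch size for bulk import |

#### Response Format

**Success response (200 OK)** (example; the actual `index_name` carries the tenant and environment prefix):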

```json
{
  "success": true,
  "total": 1000,
  "indexed": 1000,
  "failed": 0,
  "elapsed_time": 12.34,
  "index_name": "search_products_tenant_162",
  "tenant_id": "162"
}
```

**Error responses**:
- `400 Bad Request`: invalid parameters
- `503 Service Unavailable`: service not initialized

#### Request Example

**Full indexing (the existing index is not deleted)**:

```bash
curl -X POST "http://localhost:6004/indexer/reindex" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "162",
    "batch_size": 500
  }'
```

**Viewing logs**:

```bash
# Follow the API log (includes indexing operation logs)
tail -f logs/api.log

# Or follow all log files
tail -f logs/*.log
```

> ⚠️ **Important**: To **create the index structure**, see Section 5.1 and use `./scripts/create_tenant_index.sh <tenant_id>`. After creating it, call `/indexer/reindex` to import data.

**Viewing index logs**:

All key information about indexing operations is written to `logs/indexer.log` (JSON format), including:
- Request start and end times
- Tenant ID, SPU ID, and operation type
- Processing status of each SPU
- ES bulk write results
- Success/failure statistics and detailed error messages

```bash
# Follow the index log in real time (covers all full and incremental indexing operations)
tail -f logs/indexer.log

# Query with grep (the simple way)
# Full indexing logs
grep "\"index_type\":\"bulk\"" logs/indexer.log | tail -100

# Incremental indexing logs
grep "\"index_type\":\"incremental\"" logs/indexer.log | tail -100

# Logs for a specific tenant
grep "\"tenant_id\":\"162\"" logs/indexer.log | tail -100

# Query with jq (recommended; more precise JSON queries)
# Install jq: sudo apt-get install jq, or brew install jq
# -c prints one record per line, so tail -100 returns the last 100 records

# Full indexing logs
jq -c 'select(.index_type == "bulk")' logs/indexer.log | tail -100

# Incremental indexing logs
jq -c 'select(.index_type == "incremental")' logs/indexer.log | tail -100

# Logs for a specific tenant
jq -c 'select(.tenant_id == "162")' logs/indexer.log | tail -100

# Failed indexing operations
jq 'select(.operation == "request_complete" and .failed_count > 0)' logs/indexer.log

# Processing logs for a specific SPU
jq 'select(.spu_id == "123")' logs/indexer.log

# Statistics for recent index requests
jq 'select(.operation == "request_complete") | {timestamp, index_type, tenant_id, total_count, success_count, failed_count, elapsed_time}' logs/indexer.log
```
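
For monitoring scripts that cannot rely on `jq`, the failed-request query can also be done in Python. The sketch below assumes `logs/indexer.log` holds one JSON object per line (which the `grep`/`jq` examples imply) and uses only the `operation` and `failed_count` fields that appear in those queries; the function name `failed_requests` is hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator


def failed_requests(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield request_complete records whose failed_count is positive."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written or non-JSON lines
        if not isinstance(record, dict):
            continue
        if record.get("operation") == "request_complete" and record.get("failed_count", 0) > 0:
            yield record


# Usage with in-memory sample lines; pass an open file object in practice.
sample_lines = [
    '{"operation": "request_complete", "tenant_id": "162", "failed_count": 2}',
    '{"operation": "request_complete", "tenant_id": "162", "failed_count": 0}',
    'not json',
]
for record in failed_requests(sample_lines):
    print(record["tenant_id"], record["failed_count"])
```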

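### 5.3 Incremental Indexing Endpoint

- **Endpoint**: `POST /indexer/index`
- **Description**: Incremental indexing. Indexes the SPUs in the given SPU ID list and writes the data directly to ES; use it to update specific products. **Recommended only as an internal/debugging entry point**; for production integration, use `/indexer/build-docs` instead and let the upstream write ES.

**Deletion behavior**:
- SPUs in `spu_ids`: if the database has `deleted=1`, the SPU is automatically removed from ES and its response status is `deleted`
- SPUs in `delete_spu_ids`: deleted directly; the response status is `deleted`, `not_found`, or `failed`

#### Request Parameters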

```json
{
  "tenant_id": "162",
  "spu_ids": ["123", "456", "789"],
  "delete_spu_ids": ["100", "101", "102"]
}
```

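| Parameter | Type | Required | Description |
|------|------|------|------|
| `tenant_id` | string | Y | Tenant ID |
| `spu_ids` | array[string] | N | List of SPU IDs to index (1-100 entries). If empty, only deletions are performed |
| `delete_spu_ids` | array[string] | N | Optional list of SPU IDs to delete explicitly (1-100 entries). These SPUs are removed from ES regardless of their database state |

**Notes**:
- `spu_ids` and `delete_spu_ids` must not both be empty
- Each list accepts at most 100 SPU IDs
- If an SPU is in `spu_ids` and the database has `deleted=1`, it is automatically removed from ES (auto-detected deletion)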
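
A caller with more than 100 IDs has to split them across several requests. The sketch below shows one way to do that while respecting the two constraints above (at most 100 IDs per list, and never both lists empty). The helper names `chunked` and `incremental_payloads` are hypothetical; only the payload field names come from this document.

```python
from typing import Dict, List, Sequence

MAX_IDS_PER_LIST = 100  # per-list limit stated in the notes above


def chunked(ids: Sequence[str], size: int) -> List[List[str]]:
    """Split ids into consecutive chunks of at most `size` elements."""
    return [list(ids[i:i + size]) for i in range(0, len(ids), size)]


def incremental_payloads(tenant_id: str,
                         spu_ids: Sequence[str],
                         delete_spu_ids: Sequence[str] = ()) -> List[Dict]:
    """Build request bodies for POST /indexer/index, one per chunk."""
    index_chunks = chunked(spu_ids, MAX_IDS_PER_LIST)
    delete_chunks = chunked(delete_spu_ids, MAX_IDS_PER_LIST)
    if not index_chunks and not delete_chunks:
        raise ValueError("spu_ids and delete_spu_ids must not both be empty")
    payloads = []
    for i in range(max(len(index_chunks), len(delete_chunks))):
        payload = {
            "tenant_id": tenant_id,
            "spu_ids": index_chunks[i] if i < len(index_chunks) else [],
        }
        if i < len(delete_chunks):
            payload["delete_spu_ids"] = delete_chunks[i]
        payloads.append(payload)
    return payloads


payloads = incremental_payloads("162", [str(n) for n in range(250)], ["100", "101"])
print(len(payloads), [len(p["spu_ids"]) for p in payloads])
```

Every generated payload carries at least one non-empty list, so none of them violates the "not both empty" rule.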

#### Response Format

```json
{
  "spu_ids": [
    {
      "spu_id": "123",
      "status": "indexed"
    },
    {
      "spu_id": "456",
      "status": "deleted"
    },
    {
      "spu_id": "789",
      "status": "failed",
      "msg": "SPU not found (unexpected)"
    }
  ],
  "delete_spu_ids": [
    {
      "spu_id": "100",
      "status": "deleted"
    },
    {
      "spu_id": "101",
      "status": "not_found"
    },
    {
      "spu_id": "102",
      "status": "failed",
      "msg": "Failed to delete from ES: Connection timeout"
    }
  ],
  "total": 6,
  "success_count": 4,
  "failed_count": 2,
  "elapsed_time": 1.23,
  "index_name": "search_products_tenant_162",
  "tenant_id": "162"
}
```

| Field | Type | Description |
|------|------|------|
| `spu_ids` | array | Results for `spu_ids`; each element contains `spu_id` and `status` |
| `spu_ids[].status` | string | Status: `indexed`, `deleted` (auto-detected deletion), or `failed` |
| `spu_ids[].msg` | string | Failure reason, present when status is `failed` (optional) |
| `delete_spu_ids` | array | Results for `delete_spu_ids`; each element contains `spu_id` and `status` |
| `delete_spu_ids[].status` | string | Status: `deleted`, `not_found` (not present in ES), or `failed` |
| `delete_spu_ids[].msg` | string | Failure reason, present when status is `failed` (optional) |
| `total` | integer | Total number processed (count of spu_ids + count of delete_spu_ids) |
| `success_count` | integer | Number of successes (indexed + deleted + not_found) |
| `failed_count` | integer | Number of failures |
| `elapsed_time` | float | Elapsed time (seconds) |
| `index_name` | string | Index name |
| `tenant_id` | string | Tenant ID |

**Status reference**:
- Statuses for `spu_ids`:
  - `indexed`: the SPU was indexed into ES successfully
  - `deleted`: the SPU is marked deleted=1 in the database and has been removed from ES (auto-detected)
  - `failed`: processing failed; the `msg` field explains why
- Statuses for `delete_spu_ids`:
  - `deleted`: the SPU was removed from ES successfully
  - `not_found`: the SPU does not exist in ES (still counted as success; it may already have been deleted)
  - `failed`: deletion failed; the `msg` field explains why
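
The status rules above translate directly into a small response check on the caller side. The following is a minimal sketch, assuming the response shape shown earlier in this section; the function name `summarize_index_response` is hypothetical. It counts `indexed`, `deleted`, and `not_found` as success and collects the failures for retry or alerting.

```python
from typing import Dict, List, Tuple

# Statuses counted as success, per the success_count definition above.
SUCCESS_STATUSES = {"indexed", "deleted", "not_found"}


def summarize_index_response(response: Dict) -> Tuple[int, List[Tuple[str, str]]]:
    """Return (success_count, [(spu_id, msg), ...] for failed entries)."""
    entries = list(response.get("spu_ids", [])) + list(response.get("delete_spu_ids", []))
    success_count = sum(1 for e in entries if e.get("status") in SUCCESS_STATUSES)
    failures = [(e["spu_id"], e.get("msg", "")) for e in entries if e.get("status") == "failed"]
    return success_count, failures


response = {
    "spu_ids": [
        {"spu_id": "123", "status": "indexed"},
        {"spu_id": "456", "status": "deleted"},
        {"spu_id": "789", "status": "failed", "msg": "SPU not found (unexpected)"},
    ],
    "delete_spu_ids": [
        {"spu_id": "100", "status": "deleted"},
        {"spu_id": "101", "status": "not_found"},
        {"spu_id": "102", "status": "failed", "msg": "Failed to delete from ES: Connection timeout"},
    ],
}
success_count, failures = summarize_index_response(response)
print(success_count, [spu_id for spu_id, _ in failures])
```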

#### Request Examples

**Example 1: Regular incremental indexing (auto-detected deletion)**:

```bash
curl -X POST "http://localhost:6004/indexer/index" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "162",
    "spu_ids": ["123", "456", "789"]
  }'
```

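Note: if SPU 456 has `deleted=1` in the database, it is automatically removed from ES, and its entry in the response `spu_ids` list has status `deleted`.

**Example 2: Explicit deletion (batch delete)**: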

```bash
curl -X POST "http://localhost:6004/indexer/index" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "162",
    "spu_ids": ["123", "456"],
    "delete_spu_ids": ["100", "101", "102"]
  }'
```

Note: SPUs 100, 101, and 102 are deleted explicitly, regardless of their database state.

**Example 3: Delete only (no indexing)**:

```bash
curl -X POST "http://localhost:6004/indexer/index" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "162",
    "spu_ids": [],
    "delete_spu_ids": ["100", "101"]
  }'
```

Note: only deletions are performed; nothing is indexed.

**Example 4: Mixed operation (index + delete)**:

```bash
curl -X POST "http://localhost:6004/indexer/index" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "162",
    "spu_ids": ["123", "456", "789"],
    "delete_spu_ids": ["100", "101"]
  }'
```

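Note: indexing and deletion are performed in the same request.

#### Logging

All key information about incremental indexing operations is written to `logs/indexer.log` (JSON format), including:
- Request start and end times
- Processing status of each SPU (fetch, transform, index, delete)
- ES bulk write results
- Success/failure statistics
- Detailed error messages

For how to query these logs, see the index log query examples in Section 5.2.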

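### 5.4 Document Query Endpoint

- **Endpoint**: `POST /indexer/documents`
- **Description**: Returns the ES document data for a list of SPU IDs (**does not write to ES**). Use it to inspect, debug, or verify SPU data.

#### Request Parameters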

```json
{
  "tenant_id": "162",
  "spu_ids": ["123", "456", "789"]
}
```

| Parameter | Type | Required | Description |
|------|------|------|------|
| `tenant_id` | string | Y | Tenant ID |
| `spu_ids` | array[string] | Y | List of SPU IDs (1-100 entries) |

#### Response Format

```json
{
  "success": [
    {
      "spu_id": "123",
      "document": {
        "tenant_id": "162",
        "spu_id": "123",
        "title": {
          "zh": "商品标题"
        },
        ...
      }
    },
    {
      "spu_id": "456",
      "document": {...}
    }
  ],
  "failed": [
    {
      "spu_id": "789",
      "error": "SPU not found or deleted"
    }
  ],
  "total": 3,
  "success_count": 2,
  "failed_count": 1
}
```

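| Field | Type | Description |
|------|------|------|
| `success` | array | SPUs retrieved successfully; each element contains `spu_id` and `document` (the complete ES document data) |
| `failed` | array | SPUs that failed; each element contains `spu_id` and `error` (failure reason) |
| `total` | integer | Total number of SPUs |
| `success_count` | integer | Number of successes |
| `failed_count` | integer | Number of failures |

#### Request Examples

**Single SPU query**: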

```bash
curl -X POST "http://localhost:6004/indexer/documents" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "162",
    "spu_ids": ["123"]
  }'
```

**Batch SPU query**:

```bash
curl -X POST "http://localhost:6004/indexer/documents" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "162",
    "spu_ids": ["123", "456", "789"]
  }'
```

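#### Difference from `/indexer/index`

| Endpoint | Function | Writes to ES | Returns |
|------|------|-----------|----------|
| `/indexer/documents` | Query SPU document data | No | Complete ES document data |
| `/indexer/index` | Incremental indexing | Yes | Success/failure lists and statistics |

**Use cases**:
- `/indexer/documents`: inspect, debug, or verify SPU data without modifying the ES index
- `/indexer/index`: perform actual incremental indexing, syncing updated SPU data to ES

### 5.5 Indexing Health Check Endpoint

- **Endpoint**: `GET /indexer/health`
- **Description**: Checks the health of the indexing service (matches `indexer_health_check` in `api/routes/indexer.py`)

#### Response Format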

```json
{
  "status": "available",
  "database": "connected",
  "preloaded_data": {
    "category_mappings": 150
  }
}
```

| Field | Type | Description |
|------|------|------|
| `status` | string | `available` (service available), `unavailable` (not initialized), or `error` (exception) |
| `database` | string | Database connection status, e.g. `connected` or `disconnected: ...` |
| `preloaded_data.category_mappings` | integer | Number of category mappings loaded |

#### Request Example

```bash
curl -X GET "http://localhost:6004/indexer/health"
```
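
Before starting a batch job, a caller may want to gate on this endpoint. The sketch below interprets the response body only; how it is fetched is left to the caller. Treating anything other than `status == "available"` together with `database == "connected"` as "not ready" is an assumption of this example, and the function name `is_indexer_ready` is hypothetical.

```python
from typing import Dict


def is_indexer_ready(health: Dict) -> bool:
    """True only when the service is available and the database is connected.

    Assumption: a `database` value such as "disconnected: ..." or a missing
    field is treated as not ready.
    """
    return health.get("status") == "available" and health.get("database") == "connected"


print(is_indexer_ready({
    "status": "available",
    "database": "connected",
    "preloaded_data": {"category_mappings": 150},
}))
```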

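### 5.6 Document Build Endpoint (Recommended for Production Integration)

#### 5.6.1 `POST /indexer/build-docs`

- **Description**:  
  Builds ES documents (docs) from **MySQL row data** supplied by the caller (usually the Java indexing program) and **does not write to ES**.  
  This service owns "how to build the doc" (multilingual fields, translation, vectors, specification aggregation, and so on); the caller owns "when to schedule and how to write ES".

#### Request Parameters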

```json
{
  "tenant_id": "170",
  "items": [
    {
      "spu": { "id": 223167, "tenant_id": 170, "title": "..." },
      "skus": [
        { "id": 3988393, "spu_id": 223167, "price": 25.99, "compare_at_price": 25.99 }
      ],
      "options": []
    }
  ]
}
```

| Parameter | Type | Required | Description |
|------|------|------|------|
| `tenant_id` | string | Y | Tenant ID |
| `items` | array | Y | SPUs to build docs for (each item contains `spu`, `skus`, and `options`); **at most 200 per request** |

> The `spu` / `skus` / `options` fields should use the row fields queried from `shoplazza_product_spu` / `shoplazza_product_sku` / `shoplazza_product_option` directly.

#### Request Example (Full curl)

> For the complete request body, see `build_sample_request()` in `scripts/test_build_docs_api.py`.

```bash
# Single-SPU example (with spu, skus, and options)
curl -X POST "http://localhost:6004/indexer/build-docs" \
  -H "Content-Type: application/json" \
  -d '{
  "tenant_id": "162",
  "items": [
    {
      "spu": {
        "id": 10001,
        "title": "测试T恤 纯棉短袖",
        "brief": "舒适纯棉，多色可选",
        "description": "这是一款适合日常穿着的纯棉T恤，透气吸汗。",
        "vendor": "测试品牌",
        "category": "服装/上衣/T恤",
        "category_id": 100,
        "category_level": 2,
        "category_path": "服装/上衣/T恤",
        "fake_sales": 1280,
        "image_src": "https://oss.essa.cn/98532128-cf8e-456c-9e30-6f2a5ea0c19f.jpg",
        "enriched_tags": ["T恤", "纯棉"],
        "create_time": "2024-01-01T00:00:00Z",
        "update_time": "2024-01-01T00:00:00Z"
      },
      "skus": [
        {
          "id": 20001,
          "spu_id": 10001,
          "price": 99.0,
          "compare_at_price": 129.0,
          "sku": "SKU-TSHIRT-001",
          "inventory_quantity": 50,
          "option1": "黑色",
          "option2": "M",
          "option3": null
        },
        {
          "id": 20002,
          "spu_id": 10001,
          "price": 99.0,
          "compare_at_price": 129.0,
          "sku": "SKU-TSHIRT-002",
          "inventory_quantity": 30,
          "option1": "白色",
          "option2": "L",
          "option3": null
        }
      ],
      "options": [
        {"id": 1, "position": 1, "name": "颜色"},
        {"id": 2, "position": 2, "name": "尺码"}
      ]
    }
  ]
}'
```

In production, replace `localhost:6004` with the actual Indexer address, e.g. `http://43.166.252.75:6004`.

#### Response Example (Excerpt)

```json
{
  "tenant_id": "170",
  "docs": [
    {
      "tenant_id": "170",
      "spu_id": "223167",
      "title": { "en": "...", "zh": "..." },
      "enriched_tags": ["Floerns", "Clothing", "Shoes & Jewelry"],
      "skus": [
        {
          "sku_id": "3988393",
          "price": 25.99,
          "compare_at_price": 25.99,
          "stock": 100
        }
      ],
      "min_price": 25.99,
      "max_price": 25.99,
      "compare_at_price": 25.99,
      "total_inventory": 100,
      "title_embedding": [/* 1024-dimensional vector */]
      // remaining fields follow mappings/search_products.json
    }
  ],
  "total": 1,
  "success_count": 1,
  "failed_count": 0,
  "failed": []
}
```

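| Field | Type | Description |
|------|------|------|
| `tenant_id` | string | Tenant ID |
| `docs` | array | ES documents built successfully; they follow `mappings/search_products.json` |
| `total` | integer | Total number of items in the request |
| `success_count` | integer | Number built successfully |
| `failed_count` | integer | Number of failures |
| `failed` | array | Failed items; each contains `spu_id` and `error` |

#### Usage Recommendations

- **Recommended production flow**:
  1. Java decides, based on business logic, which SPUs need processing (full or incremental);
  2. Java queries the SPU/SKU/Option rows from MySQL and assembles them into `items`;
  3. Java calls `/indexer/build-docs` to obtain ES-ready `docs`;
  4. Java writes them to `search_products_tenant_{tenant_id}` with its own ES client.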
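
Steps 3 and 4 of this flow can be sketched as follows (shown in Python for brevity; a Java caller would follow the same shape). The sketch makes several assumptions that are not stated in this document: the HTTP call is injected as a `post(path, body)` callable, the helper names are hypothetical, and `spu_id` is used as the ES `_id`. Only the 200-item limit, the request/response field names, and the standard ES bulk NDJSON layout (one action line followed by one source line) are taken as given.

```python
import json
from typing import Callable, Dict, List, Tuple

MAX_ITEMS_PER_REQUEST = 200  # per-request limit of /indexer/build-docs


def build_docs_in_batches(tenant_id: str,
                          items: List[Dict],
                          post: Callable[[str, Dict], Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Call /indexer/build-docs in chunks of at most 200 items.

    `post` is a caller-supplied function that sends the JSON body and
    returns the decoded JSON response (assumption of this sketch).
    """
    docs: List[Dict] = []
    failed: List[Dict] = []
    for start in range(0, len(items), MAX_ITEMS_PER_REQUEST):
        body = {"tenant_id": tenant_id, "items": items[start:start + MAX_ITEMS_PER_REQUEST]}
        response = post("/indexer/build-docs", body)
        docs.extend(response.get("docs", []))
        failed.extend(response.get("failed", []))
    return docs, failed


def to_bulk_ndjson(index_name: str, docs: List[Dict]) -> str:
    """Render docs as an ES bulk request body (action line + source line)."""
    lines = []
    for doc in docs:
        # Assumption: spu_id doubles as the ES document _id.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc["spu_id"]}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)
```

Injecting `post` keeps the batching logic independent of the HTTP client and makes it easy to test with a stub; entries collected in `failed` can be logged or retried separately.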

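### 5.7 Document Build Endpoint (Testing / Self-Testing)

#### 5.7.1 `POST /indexer/build-docs-from-db`

- **Description**:  
  For testing/debugging only: the caller provides just `tenant_id` and `spu_ids`; the indexer service queries SPU/SKU/Option from MySQL internally, runs the same document-building logic as `/indexer/build-docs`, and returns ES-ready docs. **In production, use `/indexer/build-docs`, with the upstream querying the database and writing ES.**

#### Request Parameters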

```json
{
  "tenant_id": "170",
  "spu_ids": ["223167", "223168"]
}
```

| Parameter | Type | Required | Description |
|------|------|------|------|
| `tenant_id` | string | Y | Tenant ID |
| `spu_ids` | array[string] | Y | List of SPU IDs; **at most 200 per request** |

#### Response Format

Same as `/indexer/build-docs`: `tenant_id`, `docs`, `total`, `success_count`, `failed_count`, `failed`.

#### Request Example

```bash
curl -X POST "http://127.0.0.1:6004/indexer/build-docs-from-db" \
  -H "Content-Type: application/json" \
  -d '{"tenant_id": "170", "spu_ids": ["223167"]}'
```

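The response has the same structure as `/indexer/build-docs` and can be used directly to compare against the actual ES documents or to debug field-mapping issues.

### 5.8 Content Enrichment Endpoint

- **Endpoint**: `POST /indexer/enrich-content`
- **Description**: Batch-generates **qanchors** (anchor text), **enriched_attributes** (generic semantic attributes), **enriched_tags** (fine-grained tags), and **enriched_taxonomy_attributes** (structured taxonomy attributes) from product content, for external indexers that assemble docs themselves in the microservice-composition mode.
  - Products are passed in `items[]` as content fields (required/optional fields are listed in the table below).
  - The endpoint exposes only product-content inputs; language selection, analysis dimensions, and the final field structure are decided inside `indexer.product_enrich`.
  - The returned result currently matches the `search_products` mapping.
  - Each request runs in a thread pool so that it does not block other endpoints.

Supported values of `category_taxonomy_profile`: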
- `apparel`
- `3c`
- `bags`
- `pet_supplies`
- `electronics`
- `outdoor`
- `home_appliances`
- `home_living`
- `wigs`
- `beauty`
- `accessories`
- `toys`
- `shoes`
- `sports`
- `others`

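Notes:
- For every profile, `enriched_taxonomy_attributes.value` is returned uniformly with `zh` + `en`.
- When `/indexer/enrich-content` is called externally, the `category_taxonomy_profile` in the request takes effect.
- When the Indexer builds ES documents internally, the taxonomy profile is currently fixed to `apparel`; a TODO in the code marks that it should later be replaced with the tenant's actual industry read from the database.

#### Request Parameters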

```json
{
  "tenant_id": "170",
  "enrichment_scopes": ["generic", "category_taxonomy"],
  "category_taxonomy_profile": "apparel",
  "items": [
    {
      "spu_id": "223167",
      "title": "纯棉短袖T恤 夏季男装",
      "brief": "夏季透气纯棉短袖，舒适亲肤",
      "description": "100%棉，圆领版型，适合日常通勤与休闲穿搭。",
      "image_url": "https://example.com/images/223167.jpg"
    },
    {
      "spu_id": "223168",
      "title": "12PCS Dolls with Bottles",
      "image_url": "https://example.com/images/223168.jpg"
    }
  ]
}
```

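| Parameter | Type | Required | Default | Description |
|------|------|------|--------|------|
| `tenant_id` | string | Y | - | Tenant ID. Currently used only for logging and has no functional effect |
| `enrichment_scopes` | array[string] | N | `["generic", "category_taxonomy"]` | Enrichment scopes to run. `generic` produces `qanchors`/`enriched_tags`/`enriched_attributes`; `category_taxonomy` produces `enriched_taxonomy_attributes` |
| `category_taxonomy_profile` | string | N | `apparel` | Category taxonomy profile. Supported: `apparel`, `3c`, `bags`, `pet_supplies`, `electronics`, `outdoor`, `home_appliances`, `home_living`, `wigs`, `beauty`, `accessories`, `toys`, `shoes`, `sports`, `others` |
| `items` | array | Y | - | Items to analyze; **at most 50 per request** |

Fields of `items[]`:

| Field | Type | Required | Description |
|------|------|------|------|
| `spu_id` | string | Y | SPU ID, used to match results back to items and for logging; it does not influence the generated content |
| `title` | string | Y | Product title |
| `image_url` | string | N | URL of the main product image; currently passed through only and not part of the prompt or the cache key. It may later be used for image/multimodal content understanding |
| `brief` | string | N | Product brief/short description; currently part of the prompt and the cache key |
| `description` | string | N | Product details/long description; currently part of the prompt and the cache key |

Caching:

- The content cache is partitioned by **enrichment scope + taxonomy profile**; `generic` and, for example, `category_taxonomy:apparel` use separate cache namespaces, so they do not pollute each other and can evolve independently.
- The cache key is composed of `analysis_kind + target_lang + prompt/schema version fingerprint + hash of the prompt input text`; for category taxonomy, the profile is part of the schema identifier and the version fingerprint.
- The fields that actually feed the prompt input are currently `title`, `brief`, and `description`; a change to any of them produces a new cache key.
- The `prompt/schema version fingerprint` is derived from the system prompt, shared instruction, localized table headers, result fields, user instruction template, and similar information, so old cache entries become unreachable as soon as the prompt or the output contract changes.
- `tenant_id` and `spu_id` are used only for request attribution and result matching; they are not part of the cache key.
- As a result, the cache can be hit across requests as long as both the input content and the prompt contract are unchanged; a change on either side produces a new cache key.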
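
To make the composition above concrete, the sketch below builds a key from the four listed parts. It is illustrative only and is not the actual key format used by `indexer.product_enrich`: the function name, the `:` separator, the use of SHA-256, and the 16-character truncation are all assumptions of this example.

```python
import hashlib
import json


def enrich_cache_key(analysis_kind: str,
                     target_lang: str,
                     contract_fingerprint: str,
                     title: str,
                     brief: str = "",
                     description: str = "") -> str:
    """Compose analysis_kind + target_lang + contract fingerprint + input hash.

    Only title/brief/description feed the hash; tenant_id and spu_id are
    deliberately not parameters, mirroring the rule that they never enter
    the key. Hash algorithm and layout are assumptions of this sketch.
    """
    # JSON-encode the list so field boundaries stay unambiguous.
    prompt_input = json.dumps([title, brief, description], ensure_ascii=False)
    input_hash = hashlib.sha256(prompt_input.encode("utf-8")).hexdigest()[:16]
    return f"{analysis_kind}:{target_lang}:{contract_fingerprint}:{input_hash}"


key_a = enrich_cache_key("generic", "zh", "v1", "纯棉短袖T恤 夏季男装")
key_b = enrich_cache_key("category_taxonomy:apparel", "zh", "v1", "纯棉短袖T恤 夏季男装")
print(key_a != key_b)
```

Two SPUs with identical `title`, `brief`, and `description` map to the same key under this scheme, which is what allows cache hits across requests and tenants.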

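Languages:

- The endpoint accepts no language-control parameters.
- Which languages and which semantic dimensions are returned is decided by the internal logic of `indexer.product_enrich`.
- To stay aligned with the `search_products` mapping, both the generic enrichment fields and the taxonomy fields currently return only the core index languages `zh` and `en`.

Batching recommendations:
- **Full indexing**: it is strongly recommended to accumulate **20 SPUs/docs** into a batch whenever possible and send one request per batch.
- **Incremental indexing**: set a time window that suits your freshness requirement (for example, **5 minutes**) and accumulate up to **20** items within it; send a request as soon as 20 items are reached or the window expires.
- More than 20 items are allowed (up to the per-request limit of 50); the service splits them into smaller batches internally and processes them one by one. Fewer than 20 are also allowed, but cost and latency per item go up, especially when each request carries a single doc.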
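
The "20 items or window expiry" rule for incremental updates can be sketched as a small accumulator. This is one possible caller-side design, not part of the service: the class name `EnrichBatcher` and its methods are hypothetical, and the clock is injectable so the behavior can be tested without waiting.

```python
import time
from typing import Callable, Dict, List, Optional


class EnrichBatcher:
    """Accumulate items; release a batch at batch_size or when the window expires."""

    def __init__(self,
                 batch_size: int = 20,
                 window_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.batch_size = batch_size
        self.window_seconds = window_seconds
        self._clock = clock
        self._items: List[Dict] = []
        self._window_start: Optional[float] = None

    def add(self, item: Dict) -> Optional[List[Dict]]:
        """Queue an item; return a full batch once batch_size is reached, else None."""
        if not self._items:
            self._window_start = self._clock()  # window opens with the first item
        self._items.append(item)
        if len(self._items) >= self.batch_size:
            return self._drain()
        return None

    def poll(self) -> Optional[List[Dict]]:
        """Return pending items if the window has expired, else None."""
        if self._items and self._clock() - self._window_start >= self.window_seconds:
            return self._drain()
        return None

    def _drain(self) -> List[Dict]:
        batch, self._items = self._items, []
        self._window_start = None
        return batch
```

A caller would invoke `add()` for each changed SPU and `poll()` from a periodic timer, sending whatever batch is returned as the `items` of one `/indexer/enrich-content` request.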

#### Response Format

```json
{
  "tenant_id": "170",
  "enrichment_scopes": ["generic", "category_taxonomy"],
  "category_taxonomy_profile": "apparel",
  "total": 2,
  "results": [
    {
      "spu_id": "223167",
      "qanchors": {
        "zh": ["短袖T恤", "纯棉", "男装", "夏季"],
        "en": ["cotton t-shirt", "short sleeve", "men", "summer"]
      },
      "enriched_tags": {
        "zh": ["纯棉", "短袖", "男装"],
        "en": ["cotton", "short sleeve", "men"]
      },
      "enriched_attributes": [
        { "name": "enriched_tags", "value": { "zh": "纯棉" } },
        { "name": "usage_scene", "value": { "zh": "日常" } },
        { "name": "enriched_tags", "value": { "en": "cotton" } }
      ],
      "enriched_taxonomy_attributes": [
        { "name": "Product Type", "value": { "zh": ["T恤"], "en": ["t-shirt"] } },
        { "name": "Target Gender", "value": { "zh": ["男"], "en": ["men"] } },
        { "name": "Season", "value": { "zh": ["夏季"], "en": ["summer"] } }
      ]
    },
    {
      "spu_id": "223168",
      "qanchors": {
        "en": ["dolls", "toys", "12pcs"]
      },
      "enriched_tags": {
        "en": ["dolls", "toys"]
      },
      "enriched_attributes": [],
      "enriched_taxonomy_attributes": []
    }
  ]
}
```

| Field | Type | Description |
|------|------|------|
| `enrichment_scopes` | array | Enrichment scopes actually executed |
| `category_taxonomy_profile` | string | Category taxonomy profile actually used |
| `results` | array | Corresponds one-to-one with the request `items`; each entry contains `spu_id`, `qanchors`, `enriched_attributes`, `enriched_tags`, and `enriched_taxonomy_attributes` |
| `results[].qanchors` | object | Same structure as the ES `qanchors` field: arrays of phrases keyed by language |
| `results[].enriched_tags` | object | Same structure as the ES `enriched_tags` field: arrays of tags keyed by language |
| `results[].enriched_attributes` | array | Same structure as the ES `enriched_attributes` nested field; each entry is `{ "name", "value": { "zh"?: "...", "en"?: "..." } }` |
| `results[].enriched_taxonomy_attributes` | array | Same structure as the ES `enriched_taxonomy_attributes` nested field; each entry is typically `{ "name", "value": { "zh"?: [...], "en"?: [...] } }` |
| `results[].error` | string | Error message, returned when processing of that item failed (for example, an LLM exception) |

**Error responses**:
- `400`: `items` is empty or has more than 50 entries
- `503`: `DASHSCOPE_API_KEY` is not configured, so the content enrichment service is unavailable
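
In the microservice-composition mode, the caller merges `results[]` into the docs it is assembling. The sketch below shows one way to do that, assuming the caller keeps its docs in a dict keyed by `spu_id`; the function name `merge_enrichment` is hypothetical. Entries carrying `error` are skipped and reported so they can be retried later.

```python
from typing import Dict, List

ENRICH_FIELDS = (
    "qanchors",
    "enriched_tags",
    "enriched_attributes",
    "enriched_taxonomy_attributes",
)


def merge_enrichment(docs_by_spu: Dict[str, Dict], results: List[Dict]) -> Dict[str, str]:
    """Copy enrichment fields into matching docs; return {spu_id: error} for failures."""
    failed: Dict[str, str] = {}
    for result in results:
        spu_id = result["spu_id"]
        if result.get("error"):
            failed[spu_id] = result["error"]
            continue
        doc = docs_by_spu.get(spu_id)
        if doc is None:
            continue  # no local doc for this SPU; nothing to merge
        for field in ENRICH_FIELDS:
            if field in result:
                doc[field] = result[field]
    return failed


docs = {"223167": {"spu_id": "223167"}, "223168": {"spu_id": "223168"}}
results = [
    {"spu_id": "223167", "qanchors": {"zh": ["短袖T恤"]}, "enriched_tags": {"zh": ["纯棉"]}},
    {"spu_id": "223168", "error": "LLM timeout"},
]
print(merge_enrichment(docs, results))
```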

#### Request Example

```bash
curl -X POST "http://localhost:6004/indexer/enrich-content" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "163",
    "enrichment_scopes": ["generic", "category_taxonomy"],
    "category_taxonomy_profile": "apparel",
    "items": [
      {
        "spu_id": "223167",
        "title": "纯棉短袖T恤 夏季男装",
        "brief": "夏季透气纯棉短袖，舒适亲肤",
        "description": "100%棉，圆领版型，适合日常通勤与休闲穿搭。",
        "image_url": "https://example.com/images/223167.jpg"
      }
    ]
  }'
```

---