diff --git a/docs/搜索API对接指南.md b/docs/搜索API对接指南.md
index 342c9b8..f0ddca5 100644
--- a/docs/搜索API对接指南.md
+++ b/docs/搜索API对接指南.md
@@ -136,7 +136,7 @@ curl -X POST "http://120.76.41.98:6002/search/" \
 
 #### 1. Exact-match filters (filters)
 
-Used for exact matching or multi-value matching (OR logic).
+Used for exact matching or multi-value matching. For ordinary fields, an array means OR logic (match any one of the values); for the specifications field, values are grouped by dimension (see below).
 
 **Format**:
 ```json
@@ -181,7 +181,7 @@ curl -X POST "http://120.76.41.98:6002/search/" \
 ```
 Returns products whose specification name is "color" and whose value is "white".
 
-**Multiple specification filters (OR logic)**:
+**Multiple specification filters (grouped by dimension)**:
 ```json
 {
   "filters": {
@@ -192,7 +192,26 @@ curl -X POST "http://120.76.41.98:6002/search/" \
   }
 }
 ```
-Returns products that match any one of the specifications (color=white **OR** size=256GB).
+Returns products that match all of the specifications (color=white **AND** size=256GB).
+
+**Multiple values for the same dimension (OR logic)**:
+```json
+{
+  "filters": {
+    "specifications": [
+      {"name": "size", "value": "3"},
+      {"name": "size", "value": "4"},
+      {"name": "size", "value": "5"},
+      {"name": "color", "value": "green"}
+    ]
+  }
+}
+```
+Returns products that match (size=3 **OR** size=4 **OR** size=5) **AND** color=green.
+
+**Filter logic**:
+- **Different dimensions** (different `name` values) are combined with **AND** (intersection)
+- Multiple values within the **same dimension** (same `name`) are combined with **OR** (union)
 
 **Common filter fields**:
 - `category_name`: category name
@@ -707,9 +726,9 @@ curl -X POST "http://120.76.41.98:6002/search/" \
 }
 ```
 
-### Scenario 7: Multiple specification filters (OR logic)
+### Scenario 7: Multiple specification filters (AND across dimensions, OR within a dimension)
 
-**Requirement**: Search for "手机" (mobile phone) and keep products whose color is "white" or whose size is "256GB"
+**Requirement**: Search for "手机" (mobile phone) and keep products whose color is "white" and whose size is "256GB"
 
 ```json
 {
@@ -725,6 +744,24 @@ curl -X POST "http://120.76.41.98:6002/search/" \
 }
 ```
 
+**Requirement**: Search for "手机" (mobile phone) and keep products whose size is "3", "4", or "5" and whose color is "green"
+
+```json
+{
+  "query": "手机",
+  "size": 20,
+  "language": "zh",
+  "filters": {
+    "specifications": [
+      {"name": "size", "value": "3"},
+      {"name": "size", "value": "4"},
+      {"name": "size", "value": "5"},
+      {"name": "color", "value": "green"}
+    ]
+  }
+}
+```
+
 ### Scenario 8: Specification facet search
 
 **Requirement**: Search for "手机" (mobile phone) and get facet statistics for all specifications
diff --git a/docs/搜索API速查表.md b/docs/搜索API速查表.md
index 40e6529..0c313c2 100644
--- a/docs/搜索API速查表.md
+++ b/docs/搜索API速查表.md
@@ -41,7 +41,7 @@ POST /search/
 }
 ```
 
-**Multiple specifications (OR)**:
+**Multiple specifications (grouped by dimension)**:
 ```bash
 {
   "filters": {
@@ -52,6 +52,7 @@ POST /search/
   }
 }
 ```
+Note: different dimensions (different name) are combined with AND; multiple values within the same dimension (same name) are combined with OR.
 
 ---
 
diff --git a/docs/系统设计文档.md b/docs/系统设计文档.md
index 7286d9a..2dff803 100644
--- a/docs/系统设计文档.md
+++ b/docs/系统设计文档.md
@@ -480,7 +480,8 @@ laptop AND (gaming OR professional) ANDNOT cheap
   - Range filter: `{"min_price": {"gte": 50, "lte": 200}}`
 - **Specifications nested filtering**:
   - Single specification: `{"specifications": {"name": "color", "value": "white"}}`
-  - Multiple specifications (OR): `{"specifications": [{"name": "color", "value": "white"}, {"name": "size", "value": "256GB"}]}`
+  - Multiple specifications: `{"specifications": [{"name": "color", "value": "white"}, {"name": "size", "value": "256GB"}]}`
+  - Filter logic: different dimensions (different name) are combined with AND; multiple values within the same dimension (same name) are combined with OR
   - Implemented with the ES `nested` query
 - **text_recall**: text relevance recall
   - Searches the Chinese and English fields together (`title_zh/en`, `brief_zh/en`, `description_zh/en`, `vendor_zh/en`, `category_path_zh/en`, `category_name_zh/en`, `tags`)
@@ -593,7 +594,7 @@ ranking:
 - ✅ Semantic search (KNN retrieval)
 - ✅ Relevance ranking (BM25 + vector similarity)
 - ✅ Result aggregation (faceted search)
-- ✅ Specifications nested filtering (single and multiple specifications, OR logic)
+- ✅ Specifications nested filtering (single and multiple specifications, grouped by dimension: AND across dimensions, OR within a dimension)
 - ✅ Specifications nested facets (all specification names and selected specification names)
 - ✅ SKU filtering (filter by dimension, implemented in the application layer)
 
@@ -784,7 +785,7 @@ class RangeFilter(BaseModel):
 }
 ```
 
-**Multiple specification filters (OR logic)**:
+**Multiple specification filters (grouped by dimension)**:
 ```json
 {
   "specifications": [
@@ -799,7 +800,7 @@ class RangeFilter(BaseModel):
 2. Searcher layer: passes the `filters` dict through unchanged
 3. ES Query Builder: detects the `specifications` key and builds an ES `nested` query
    - Single specification: builds a single `nested` query
-   - Multiple specifications: builds multiple `nested` queries combined with `should` (OR logic)
+   - Multiple specifications: groups by name dimension; values within the same dimension are combined with `should` (OR logic), and different dimensions are combined with `must` (AND logic)
 4. Output: ES nested query (`nested.path=specifications` + `bool.must=[term(name), term(value)]`)
 
 #### 8.3.4 Response facets data flow
diff --git a/docs/索引字段说明v2.md b/docs/索引字段说明v2.md
index 04d9331..a019379 100644
--- a/docs/索引字段说明v2.md
+++ b/docs/索引字段说明v2.md
@@ -132,7 +132,7 @@
 }
 ```
 
-**Multiple specification filters (OR logic)**:
+**Multiple specification filters (grouped by dimension)**:
 ```json
 {
   "query": "手机",
@@ -144,21 +144,57 @@
   }
 }
 ```
+Note: different dimensions (different name) are combined with AND; multiple values within the same dimension (same name) are combined with OR.
+
+**Example: multiple values for the same dimension (OR)**:
+```json
+{
+  "query": "手机",
+  "filters": {
+    "specifications": [
+      {"name": "size", "value": "3"},
+      {"name": "size", "value": "4"},
+      {"name": "size", "value": "5"},
+      {"name": "color", "value": "green"}
+    ]
+  }
+}
+```
+Generated query: (size=3 OR size=4 OR size=5) AND color=green
 
 **ES query structure** (generated automatically by the backend):
 ```json
 {
-  "nested": {
-    "path": "specifications",
-    "query": {
-      "bool": {
-        "must": [
-          { "term": { "specifications.name": "color" } },
-          { "term": { "specifications.value": "white" } }
-        ]
+  "filter": [
+    {
+      "nested": {
+        "path": "specifications",
+        "query": {
+          "bool": {
+            "should": [
+              {"bool": {"must": [{"term": {"specifications.name": "size"}}, {"term": {"specifications.value": "3"}}]}},
+              {"bool": {"must": [{"term": {"specifications.name": "size"}}, {"term": {"specifications.value": "4"}}]}},
+              {"bool": {"must": [{"term": {"specifications.name": "size"}}, {"term": {"specifications.value": "5"}}]}}
+            ],
+            "minimum_should_match": 1
+          }
+        }
+      }
+    },
+    {
+      "nested": {
+        "path": "specifications",
+        "query": {
+          "bool": {
+            "must": [
+              {"term": {"specifications.name": "color"}},
+              {"term": {"specifications.value": "green"}}
+            ]
+          }
+        }
       }
     }
-  }
+  ]
 }
 ```
 
diff --git a/search/es_query_builder.py b/search/es_query_builder.py
index 7e7a793..a78218a 100644
--- a/search/es_query_builder.py
+++ b/search/es_query_builder.py
@@ -373,33 +373,58 @@ class ESQueryBuilder:
                             }
                         })
                 elif isinstance(value, list):
-                    # Multiple specification filters (OR logic): [{"name": "color", "value": "green"}, ...]
-                    should_clauses = []
+                    # Multiple specification filters: group by name; OR within a dimension, AND across dimensions
+                    # Example: [{"name": "size", "value": "3"}, {"name": "size", "value": "4"}, {"name": "color", "value": "green"}]
+                    # Should produce: (size=3 OR size=4) AND color=green
+                    from collections import defaultdict
+                    specs_by_name = defaultdict(list)
                     for spec in value:
                         if isinstance(spec, dict):
                             name = spec.get("name")
                             spec_value = spec.get("value")
                             if name and spec_value:
-                                should_clauses.append({
-                                    "nested": {
-                                        "path": "specifications",
-                                        "query": {
-                                            "bool": {
-                                                "must": [
-                                                    {"term": {"specifications.name": name}},
-                                                    {"term": {"specifications.value": spec_value}}
-                                                ]
-                                            }
+                                specs_by_name[name].append(spec_value)
+
+                    # Build one filter clause per name dimension
+                    for name, values in specs_by_name.items():
+                        if len(values) == 1:
+                            # Single value: emit the term query directly
+                            filter_clauses.append({
+                                "nested": {
+                                    "path": "specifications",
+                                    "query": {
+                                        "bool": {
+                                            "must": [
+                                                {"term": {"specifications.name": name}},
+                                                {"term": {"specifications.value": values[0]}}
+                                            ]
                                         }
                                     }
+                                }
+                            })
+                        else:
+                            # Multiple values: join them with should (OR)
+                            should_clauses = []
+                            for spec_value in values:
+                                should_clauses.append({
+                                    "bool": {
+                                        "must": [
+                                            {"term": {"specifications.name": name}},
+                                            {"term": {"specifications.value": spec_value}}
+                                        ]
+                                    }
                                 })
-                    if should_clauses:
-                        filter_clauses.append({
-                            "bool": {
-                                "should": should_clauses,
-                                "minimum_should_match": 1
-                            }
-                        })
+                            filter_clauses.append({
+                                "nested": {
+                                    "path": "specifications",
+                                    "query": {
+                                        "bool": {
+                                            "should": should_clauses,
+                                            "minimum_should_match": 1
+                                        }
+                                    }
+                                }
+                            })
                 continue
 
             # Ordinary field filters
-- 
libgit2 0.21.2
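
The grouping rule this patch introduces (OR within one `name`, AND across different names) can be tried outside the class with a small standalone function. This is a minimal sketch under stated assumptions, not the project's code: `build_specification_filters` is a hypothetical name, and the function simply mirrors the two branches of the patched `elif isinstance(value, list)` block.

```python
from collections import defaultdict


def build_specification_filters(specs):
    """Return one ES nested clause per specification name (hypothetical helper)."""
    # Group values by name; insertion order of names is preserved.
    values_by_name = defaultdict(list)
    for spec in specs:
        if isinstance(spec, dict):
            name = spec.get("name")
            value = spec.get("value")
            if name and value:
                values_by_name[name].append(value)

    clauses = []
    for name, values in values_by_name.items():
        # One name/value pair per alternative; both terms must hit the same nested doc.
        alternatives = [
            {
                "bool": {
                    "must": [
                        {"term": {"specifications.name": name}},
                        {"term": {"specifications.value": v}},
                    ]
                }
            }
            for v in values
        ]
        if len(alternatives) == 1:
            query = alternatives[0]
        else:
            # Same dimension, several values: any one of them may match (OR).
            query = {"bool": {"should": alternatives, "minimum_should_match": 1}}
        clauses.append({"nested": {"path": "specifications", "query": query}})
    return clauses


clauses = build_specification_filters([
    {"name": "size", "value": "3"},
    {"name": "size", "value": "4"},
    {"name": "size", "value": "5"},
    {"name": "color", "value": "green"},
])
```

Each returned clause would be appended to the query's `filter` list, where every entry has to match, which gives the AND across dimensions.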