Commit fca871fb17bf0366c5e2c324a76c4629c32dd729
1 parent 36cf0ef9
Index field changes
Showing 4 changed files with 1959 additions and 72 deletions
mappings/README.md
| ... | ... | @@ -2,32 +2,280 @@ |
| 2 | 2 | |
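| 3 | 3 | ## Overview |
| 4 | 4 | |
| 5 | -All tenants share the same ES mapping structure. The mapping is a hand-written JSON file and does not need to be generated from config.yaml. | |
| 5 | +All tenants share the same Elasticsearch mapping structure. | |
| 6 | 6 | |
| 7 | -## Mapping files | |
| 7 | +This directory maintains the `search_products` index definition as a declarative Python spec plus field templates, with the final JSON as a generated artifact: | |
| 8 | 8 | |
| 9 | -- `search_products.json`: the complete ES index configuration, including settings and mappings | |
| 9 | +- `generate_search_products_mapping.py`: the single source of truth for generation; contains the field templates, language list, analyzer configuration, and recursive rendering logic | |
| 10 | +- `search_products.json`: the complete ES index configuration generated by the script, including `settings` and `mappings` | |
| 11 | +- `search_suggestions.json`: the search suggestions index configuration | |
| 10 | 12 | |
| 11 | -## Usage | |
| 13 | +By default, change the spec in the generator script rather than editing `search_products.json` by hand. | |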
| 12 | 14 | |
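| 13 | -### Creating the index | |
| 15 | +## Field abstraction | |
| 16 | + | |
| 17 | +The script abstracts four text templates from the business semantics: | |
| 18 | + | |
| 19 | +- `all_language_text`: all-language field without a `keyword` subfield | |
| 20 | +- `all_language_text_with_keyword`: all-language field; every supported language gets a `keyword` subfield | |
| 21 | +- `core_language_text`: core index language field without a `keyword` subfield | |
| 22 | +- `core_language_text_with_keyword`: core index language field; each core language gets a `keyword` subfield | |
| 23 | + | |
| 24 | +The "core index languages" are not the only languages the system supports; they are the languages in which every shop and every product must produce index content. They are currently fixed to: | |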
| 25 | + | |
| 26 | +- `zh` | |
| 27 | +- `en` | |
| 28 | + | |
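| 29 | +"All-language" means the mapping reserves additional language slots for the product's original language. At ingestion time a field does not need every language filled in; the only requirements are: | |
| 30 | + | |
| 31 | +- Core index language fields must populate `zh` and `en` | |
| 32 | +- All-language fields must populate `zh` and `en` | |
| 33 | +- If the product's original language is supported, the matching original-language field should also be populated, for example `ru` | |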
| 34 | + | |
| 35 | +The current fields fall roughly into these groups: | |
| 36 | + | |
| 37 | +- All-language fields: `title`, `keywords`, `brief`, `description`, `vendor`, `category_path`, `category_name_text`, `specifications.value` | |
| 38 | +- Core index language fields: `qanchors`, `tags`, `option1_values`, `option2_values`, `option3_values`, `enriched_attributes.value` | |
| 39 | +- Composite nested fields: `image_embedding`, `specifications`, `enriched_attributes`, `skus` | |
| 40 | +- Other scalar fields: `tenant_id`, `spu_id`, prices, inventory, categories, and so on | |
| 41 | + | |
| 42 | +Basic constraints built into the generation rules: | |
| 43 | + | |
| 44 | +- Chinese fields use `index_ik` and additionally set `search_analyzer: query_ik` | |
| 45 | +- Non-Chinese languages use their respective built-in Elasticsearch analyzers | |
| 46 | +- Templates with `with_keyword` add a `.keyword` subfield for each of their languages | |
| 47 | +- `settings.analysis`, `normalizer`, and `similarity` are also part of the generated output; maintaining only `mappings.properties` is not enough | |
| 48 | + | |
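| 49 | +## Index ingestion guide | |
| 50 | + | |
| 51 | +### Basic principles | |
| 52 | + | |
| 53 | +1. Every product must produce content in the core index languages, `zh` and `en`. | |
| 54 | +2. All-language fields must have `zh` and `en` and should also keep the product's original-language version where possible. | |
| 55 | +3. If the original language is `zh` or `en`, write the original text to that field and fill the other core language by translation. | |
| 56 | +4. If the original language is a supported non-core language such as `ru`, write both the original-language field and the `zh`/`en` translations. | |
| 57 | +5. If a value is empty, do not write placeholder content; decide upstream, after cleaning, whether to skip the field. | |
| 58 | + | |
| 59 | +### Core index language fields | |
| 60 | + | |
| 61 | +These fields guarantee that every product can be retrieved in at least Chinese and English. Whatever the original language, produce both a `zh` and an `en` value through translation or normalization. | |
| 62 | + | |
| 63 | +Typical fields: | |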
| 64 | + | |
| 65 | +- `qanchors` | |
| 66 | +- `tags` | |
| 67 | +- `option1_values` | |
| 68 | +- `option2_values` | |
| 69 | +- `option3_values` | |
| 70 | +- `enriched_attributes.value` | |
| 71 | + | |
| 72 | +Taking `category_path` and `option*_values` as examples, the ingested core-language result should contain at least: | |
| 73 | + | |
| 74 | +- `category_path.zh` | |
| 75 | +- `category_path.en` | |
| 76 | +- `option1_values.zh` | |
| 77 | +- `option1_values.en` | |
| 78 | +- `option2_values.zh` | |
| 79 | +- `option2_values.en` | |
| 80 | +- `option3_values.zh` | |
| 81 | +- `option3_values.en` | |
| 82 | + | |
| 83 | +Example: the product's original language is Russian and the original `option1_values` is `красный, синий` | |
| 84 | + | |
| 85 | +```json | |
| 86 | +{ | |
| 87 | + "option1_values": { | |
| 88 | + "zh": "红色, 蓝色", | |
| 89 | + "en": "red, blue" | |
| 90 | + } | |
| 91 | +} | |
| 92 | +``` | |
| 93 | + | |
| 94 | +Example: the product's original language is Russian and the category path is `Одежда > Женская одежда > Куртки` | |
| 95 | + | |
| 96 | +```json | |
| 97 | +{ | |
| 98 | + "category_path": { | |
| 99 | + "zh": "服饰 > 女装 > 夹克", | |
| 100 | + "en": "Apparel > Women's Clothing > Jackets", | |
| 101 | + "ru": "Одежда > Женская одежда > Куртки" | |
| 102 | + } | |
| 103 | +} | |
| 104 | +``` | |
| 105 | + | |
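| 106 | +Note: `category_path` is an all-language field in the mapping, but the ingestion rules still require `zh` and `en`. | |
| 107 | + | |
| 108 | +### All-language fields | |
| 109 | + | |
| 110 | +These fields must make both core index languages (`zh` and `en`) available and should also keep the product's original language where possible, which enables recall in the original language and more natural retrieval. | |
| 111 | + | |
| 112 | +Typical fields: | |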
| 113 | + | |
| 114 | +- `title` | |
| 115 | +- `keywords` | |
| 116 | +- `brief` | |
| 117 | +- `description` | |
| 118 | +- `vendor` | |
| 119 | +- `category_path` | |
| 120 | +- `category_name_text` | |
| 121 | +- `specifications.value` | |
| 122 | + | |
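| 123 | +Ingestion rules: | |
| 124 | + | |
| 125 | +1. Determine the product's original language, for example `ru` | |
| 126 | +2. Write the original text to the matching language field, for example `title.ru` | |
| 127 | +3. Translate the original text into `zh` and `en` | |
| 128 | +4. Write the translations to `title.zh` and `title.en` | |
| 129 | + | |
| 130 | +Example: the product's original language is Russian and the title is `Женская зимняя куртка` | |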
| 131 | + | |
| 132 | +```json | |
| 133 | +{ | |
| 134 | + "title": { | |
| 135 | + "zh": "女士冬季夹克", | |
| 136 | + "en": "Women's winter jacket", | |
| 137 | + "ru": "Женская зимняя куртка" | |
| 138 | + } | |
| 139 | +} | |
| 140 | +``` | |
| 141 | + | |
| 142 | +Example: the product's original language is Russian and the category name is `Женские куртки` | |
| 143 | + | |
| 144 | +```json | |
| 145 | +{ | |
| 146 | + "category_name_text": { | |
| 147 | + "zh": "女式夹克", | |
| 148 | + "en": "Women's jackets", | |
| 149 | + "ru": "Женские куртки" | |
| 150 | + } | |
| 151 | +} | |
| 152 | +``` | |
| 153 | + | |
| 154 | +Example: the specification value `specifications.value` | |
| 155 | + | |
| 156 | +```json | |
| 157 | +{ | |
| 158 | + "specifications": [ | |
| 159 | + { | |
| 160 | + "sku_id": "sku-red-s", | |
| 161 | + "name": "color", | |
| 162 | + "value": { | |
| 163 | + "zh": "红色", | |
| 164 | + "en": "red", | |
| 165 | + "ru": "красный" | |
| 166 | + } | |
| 167 | + } | |
| 168 | + ] | |
| 169 | +} | |
| 170 | +``` | |
| 171 | + | |
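| 172 | +### When the original language is Chinese or English | |
| 173 | + | |
| 174 | +If the original language is already one of the core index languages, no third language field is needed. | |
| 175 | + | |
| 176 | +Example: the original language is Chinese | |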
| 177 | + | |
| 178 | +```json | |
| 179 | +{ | |
| 180 | + "title": { | |
| 181 | + "zh": "女士冬季夹克", | |
| 182 | + "en": "Women's winter jacket" | |
| 183 | + }, | |
| 184 | + "option1_values": { | |
| 185 | + "zh": "红色, 蓝色", | |
| 186 | + "en": "red, blue" | |
| 187 | + } | |
| 188 | +} | |
| 189 | +``` | |
| 190 | + | |
| 191 | +Example: the original language is English | |
| 192 | + | |
| 193 | +```json | |
| 194 | +{ | |
| 195 | + "title": { | |
| 196 | + "zh": "女士冬季夹克", | |
| 197 | + "en": "Women's winter jacket" | |
| 198 | + }, | |
| 199 | + "vendor": { | |
| 200 | + "zh": "北境服饰", | |
| 201 | + "en": "Northern Apparel" | |
| 202 | + } | |
| 203 | +} | |
| 204 | +``` | |
| 205 | + | |
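| 206 | +### Ingestion by field type | |
| 207 | + | |
| 208 | +The rules can be understood and implemented as follows: | |
| 209 | + | |
| 210 | +- Scalar fields: write the value directly, for example `tenant_id`, `spu_id`, `min_price` | |
| 211 | +- Core index language fields: produce only `zh` and `en` | |
| 212 | +- All-language fields: produce `zh` and `en`, then add one more field for the original language | |
| 213 | +- Nested fields: apply the same rules inside each element, for example `specifications[].value` | |
| 214 | + | |
| 215 | +### Recommended ingestion flow | |
| 216 | + | |
| 217 | +1. Detect the product's original language | |
| 218 | +2. Extract the original title, description, category, specifications, attributes, option values, and other fields | |
| 219 | +3. Produce the `zh` and `en` core index language content | |
| 220 | +4. For all-language fields, if the original language is supported, also write the original-language field | |
| 221 | +5. Assemble the final ES document and write it to the index | |
| 222 | + | |
| 223 | +## Generating the mapping | |
| 224 | + | |
| 225 | +Run from the repository root: | |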
| 226 | + | |
| 227 | +```bash | |
| 228 | +source activate.sh | |
| 229 | +python mappings/generate_search_products_mapping.py > mappings/search_products.json | |
| 230 | +``` | |
| 231 | + | |
| 232 | +To view the output without overwriting the file: | |
| 233 | + | |
| 234 | +```bash | |
| 235 | +source activate.sh | |
| 236 | +python mappings/generate_search_products_mapping.py | |
| 237 | +``` | |
| 238 | + | |
| 239 | +To generate into a temporary file first: | |
| 240 | + | |
| 241 | +```bash | |
| 242 | +source activate.sh | |
| 243 | +python mappings/generate_search_products_mapping.py > mappings/search_products.generated.json | |
| 244 | +``` | |
| 245 | + | |
| 246 | +## Validating the mapping | |
| 247 | + | |
| 248 | +Check whether the current `search_products.json` exactly matches the generation rules: | |
| 249 | + | |
| 250 | +```bash | |
| 251 | +source activate.sh | |
| 252 | +python mappings/generate_search_products_mapping.py --check mappings/search_products.json | |
| 253 | +``` | |
| 254 | + | |
| 255 | +## Creating the index | |
| 14 | 256 | |
| 15 | 257 | ```python |
| 16 | 258 | from indexer.mapping_generator import load_mapping, create_index_if_not_exists |
| 17 | 259 | from utils.es_client import ESClient |
| 18 | 260 | |
| 19 | 261 | es_client = ESClient(hosts=["http://localhost:9200"]) |
| 20 | -mapping = load_mapping() # load from mappings/search_products.json | |
| 262 | +mapping = load_mapping() | |
| 21 | 263 | create_index_if_not_exists(es_client, "search_products", mapping) |
| 22 | 264 | ``` |
| 23 | 265 | |
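| 24 | -### Modifying the mapping | |
| 266 | +## Modifying the mapping | |
| 267 | + | |
| 268 | +Recommended workflow: | |
| 269 | + | |
| 270 | +1. Edit `mappings/generate_search_products_mapping.py` | |
| 271 | +2. Regenerate `mappings/search_products.json` | |
| 272 | +3. Use `--check` or a diff to confirm the change is as expected | |
| 273 | +4. Recreate the index and reimport the data | |
| 25 | 274 | |
| 26 | -Edit `mappings/search_products.json` directly, then recreate the index. | |
| 275 | +Note: Elasticsearch does not support changing the mapping type of an existing field in place; only new fields can be added. To change a field type: | |
| 27 | 276 | |
| 28 | -Note: ES does not support changing the mapping type of an existing field; only new fields can be added. To change a field type: | |
| 29 | 277 | 1. Delete the old index |
| 30 | -2. Create the index with the new mapping | |
| 278 | +2. Create the index with the new mapping | |
| 31 | 279 | 3. Reimport the data |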
| 32 | 280 | |
| 33 | 281 | ## Field reference | ... | ... |
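The ingestion rules in the README diff above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not code from the commit: `build_language_field`, its parameters, and the shortened `SUPPORTED_LANGUAGES` set are hypothetical, and the `zh`/`en` translations are assumed to be produced upstream and passed in.

```python
from __future__ import annotations

# Core index languages, as fixed by the README.
CORE_LANGUAGES = ("zh", "en")

# Shortened for illustration; the generator script defines the full list.
SUPPORTED_LANGUAGES = {"zh", "en", "ru", "fr", "de", "es"}


def build_language_field(
    source_lang: str,
    source_text: str,
    translations: dict[str, str],
    *,
    all_language: bool,
) -> dict[str, str]:
    """Build the per-language object for one text field (hypothetical helper)."""
    # Empty values are skipped rather than filled with placeholder content.
    if not source_text.strip():
        return {}

    field: dict[str, str] = {}
    for lang in CORE_LANGUAGES:
        # Use the original text when it is already in this core language.
        text = source_text if lang == source_lang else translations.get(lang, "")
        if not text.strip():
            raise ValueError(f"missing required {lang!r} content")
        field[lang] = text

    # All-language fields also keep a supported non-core original language.
    if all_language and source_lang in SUPPORTED_LANGUAGES and source_lang not in field:
        field[source_lang] = source_text
    return field


title = build_language_field(
    "ru",
    "Женская зимняя куртка",
    {"zh": "女士冬季夹克", "en": "Women's winter jacket"},
    all_language=True,
)
option1_values = build_language_field(
    "ru",
    "красный, синий",
    {"zh": "红色, 蓝色", "en": "red, blue"},
    all_language=False,
)
```

Here `title` carries `zh`, `en`, and `ru`, while `option1_values` carries only `zh` and `en`, matching the README examples. Returning an empty dict for an empty value is one possible way to signal "skip this field"; the README leaves that decision to upstream cleaning.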
mappings/generate_search_products_mapping.py
| ... | ... | @@ -0,0 +1,355 @@ |
| 1 | +#!/usr/bin/env python3 | |
| 2 | +from __future__ import annotations | |
| 3 | + | |
| 4 | +import argparse | |
| 5 | +import json | |
| 6 | +from pathlib import Path | |
| 7 | +from typing import Any | |
| 8 | + | |
| 9 | +ALL_LANGUAGE_CODES = [ | |
| 10 | + "zh", | |
| 11 | + "en", | |
| 12 | + "ar", | |
| 13 | + "hy", | |
| 14 | + "eu", | |
| 15 | + "pt_br", | |
| 16 | + "bg", | |
| 17 | + "ca", | |
| 18 | + "cjk", | |
| 19 | + "cs", | |
| 20 | + "da", | |
| 21 | + "nl", | |
| 22 | + "fi", | |
| 23 | + "fr", | |
| 24 | + "gl", | |
| 25 | + "de", | |
| 26 | + "el", | |
| 27 | + "hi", | |
| 28 | + "hu", | |
| 29 | + "id", | |
| 30 | + "it", | |
| 31 | + "no", | |
| 32 | + "fa", | |
| 33 | + "pt", | |
| 34 | + "ro", | |
| 35 | + "ru", | |
| 36 | + "es", | |
| 37 | + "sv", | |
| 38 | + "tr", | |
| 39 | + "th", | |
| 40 | +] | |
| 41 | + | |
| 42 | +CORE_INDEX_LANGUAGES = ["zh", "en"] | |
| 43 | + | |
| 44 | +LANGUAGE_GROUPS = { | |
| 45 | + "all": ALL_LANGUAGE_CODES, | |
| 46 | + "core": CORE_INDEX_LANGUAGES, | |
| 47 | +} | |
| 48 | + | |
| 49 | +ANALYZERS = { | |
| 50 | + "zh": "index_ik", | |
| 51 | + "en": "english", | |
| 52 | + "ar": "arabic", | |
| 53 | + "hy": "armenian", | |
| 54 | + "eu": "basque", | |
| 55 | + "pt_br": "brazilian", | |
| 56 | + "bg": "bulgarian", | |
| 57 | + "ca": "catalan", | |
| 58 | + "cjk": "cjk", | |
| 59 | + "cs": "czech", | |
| 60 | + "da": "danish", | |
| 61 | + "nl": "dutch", | |
| 62 | + "fi": "finnish", | |
| 63 | + "fr": "french", | |
| 64 | + "gl": "galician", | |
| 65 | + "de": "german", | |
| 66 | + "el": "greek", | |
| 67 | + "hi": "hindi", | |
| 68 | + "hu": "hungarian", | |
| 69 | + "id": "indonesian", | |
| 70 | + "it": "italian", | |
| 71 | + "no": "norwegian", | |
| 72 | + "fa": "persian", | |
| 73 | + "pt": "portuguese", | |
| 74 | + "ro": "romanian", | |
| 75 | + "ru": "russian", | |
| 76 | + "es": "spanish", | |
| 77 | + "sv": "swedish", | |
| 78 | + "tr": "turkish", | |
| 79 | + "th": "thai", | |
| 80 | +} | |
| 81 | + | |
| 82 | +SETTINGS = { | |
| 83 | + "number_of_shards": 1, | |
| 84 | + "number_of_replicas": 0, | |
| 85 | + "refresh_interval": "30s", | |
| 86 | + "analysis": { | |
| 87 | + "analyzer": { | |
| 88 | + "index_ik": { | |
| 89 | + "type": "custom", | |
| 90 | + "tokenizer": "ik_max_word", | |
| 91 | + "filter": ["lowercase", "asciifolding"], | |
| 92 | + }, | |
| 93 | + "query_ik": { | |
| 94 | + "type": "custom", | |
| 95 | + "tokenizer": "ik_smart", | |
| 96 | + "filter": ["lowercase", "asciifolding"], | |
| 97 | + }, | |
| 98 | + }, | |
| 99 | + "normalizer": { | |
| 100 | + "lowercase": { | |
| 101 | + "type": "custom", | |
| 102 | + "filter": ["lowercase"], | |
| 103 | + } | |
| 104 | + }, | |
| 105 | + }, | |
| 106 | + "similarity": { | |
| 107 | + "default": { | |
| 108 | + "type": "BM25", | |
| 109 | + "b": 0.0, | |
| 110 | + "k1": 0.0, | |
| 111 | + } | |
| 112 | + }, | |
| 113 | +} | |
| 114 | + | |
| 115 | +TEXT_FIELD_TEMPLATES = { | |
| 116 | + "all_language_text": { | |
| 117 | + "language_group": "all", | |
| 118 | + "with_keyword": False, | |
| 119 | + }, | |
| 120 | + "all_language_text_with_keyword": { | |
| 121 | + "language_group": "all", | |
| 122 | + "with_keyword": True, | |
| 123 | + }, | |
| 124 | + "core_language_text": { | |
| 125 | + "language_group": "core", | |
| 126 | + "with_keyword": False, | |
| 127 | + }, | |
| 128 | + "core_language_text_with_keyword": { | |
| 129 | + "language_group": "core", | |
| 130 | + "with_keyword": True, | |
| 131 | + }, | |
| 132 | +} | |
| 133 | + | |
| 134 | + | |
| 135 | +def scalar_field(name: str, field_type: str, **extra: Any) -> dict[str, Any]: | |
| 136 | + spec = { | |
| 137 | + "name": name, | |
| 138 | + "kind": "scalar", | |
| 139 | + "type": field_type, | |
| 140 | + } | |
| 141 | + if extra: | |
| 142 | + spec["extra"] = extra | |
| 143 | + return spec | |
| 144 | + | |
| 145 | + | |
| 146 | +def text_field(name: str, template: str) -> dict[str, Any]: | |
| 147 | + return { | |
| 148 | + "name": name, | |
| 149 | + "kind": "text", | |
| 150 | + "template": template, | |
| 151 | + } | |
| 152 | + | |
| 153 | + | |
| 154 | +def nested_field(name: str, *fields: dict[str, Any]) -> dict[str, Any]: | |
| 155 | + return { | |
| 156 | + "name": name, | |
| 157 | + "kind": "nested", | |
| 158 | + "fields": list(fields), | |
| 159 | + } | |
| 160 | + | |
| 161 | +TEXT_EMBEDDING_SIZE = 1024 | |
| 162 | +IMAGE_EMBEDDING_SIZE = 768 | |
| 163 | + | |
| 164 | +FIELD_SPECS = [ | |
| 165 | + scalar_field("tenant_id", "keyword"), | |
| 166 | + scalar_field("spu_id", "keyword"), | |
| 167 | + scalar_field("create_time", "date"), | |
| 168 | + scalar_field("update_time", "date"), | |
| 169 | + text_field("title", "all_language_text"), | |
| 170 | + text_field("keywords", "all_language_text_with_keyword"), | |
| 171 | + text_field("brief", "all_language_text"), | |
| 172 | + text_field("description", "all_language_text"), | |
| 173 | + text_field("vendor", "all_language_text_with_keyword"), | |
| 174 | + scalar_field("image_url", "keyword", index=False), | |
| 175 | + scalar_field( | |
| 176 | + "title_embedding", | |
| 177 | + "dense_vector", | |
| 178 | + dims=TEXT_EMBEDDING_SIZE, | |
| 179 | + index=True, | |
| 180 | + similarity="dot_product", | |
| 181 | + element_type="bfloat16", | |
| 182 | + ), | |
| 183 | + nested_field( | |
| 184 | + "image_embedding", | |
| 185 | + scalar_field( | |
| 186 | + "vector", | |
| 187 | + "dense_vector", | |
| 188 | + dims=IMAGE_EMBEDDING_SIZE, | |
| 189 | + index=True, | |
| 190 | + similarity="dot_product", | |
| 191 | + element_type="bfloat16", | |
| 192 | + ), | |
| 193 | + scalar_field("url", "text"), | |
| 194 | + ), | |
| 195 | + text_field("category_path", "all_language_text_with_keyword"), | |
| 196 | + text_field("category_name_text", "all_language_text_with_keyword"), | |
| 197 | + text_field("qanchors", "core_language_text"), | |
| 198 | + text_field("tags", "core_language_text_with_keyword"), | |
| 199 | + scalar_field("category_id", "keyword"), | |
| 200 | + scalar_field("category_name", "keyword"), | |
| 201 | + scalar_field("category_level", "integer"), | |
| 202 | + scalar_field("category1_name", "keyword"), | |
| 203 | + scalar_field("category2_name", "keyword"), | |
| 204 | + scalar_field("category3_name", "keyword"), | |
| 205 | + nested_field( | |
| 206 | + "specifications", | |
| 207 | + scalar_field("sku_id", "keyword"), | |
| 208 | + scalar_field("name", "keyword"), | |
| 209 | + scalar_field("value_keyword", "keyword"), | |
| 210 | + text_field("value_text", "core_language_text_with_keyword"), | |
| 211 | + ), | |
| 212 | + nested_field( | |
| 213 | + "enriched_attributes", | |
| 214 | + scalar_field("name", "keyword"), | |
| 215 | + text_field("value", "core_language_text_with_keyword"), | |
| 216 | + ), | |
| 217 | + scalar_field("option1_name", "keyword"), | |
| 218 | + scalar_field("option2_name", "keyword"), | |
| 219 | + scalar_field("option3_name", "keyword"), | |
| 220 | + text_field("option1_values", "core_language_text_with_keyword"), | |
| 221 | + text_field("option2_values", "core_language_text_with_keyword"), | |
| 222 | + text_field("option3_values", "core_language_text_with_keyword"), | |
| 223 | + scalar_field("min_price", "float"), | |
| 224 | + scalar_field("max_price", "float"), | |
| 225 | + scalar_field("compare_at_price", "float"), | |
| 226 | + scalar_field("sku_prices", "float"), | |
| 227 | + scalar_field("sku_weights", "long"), | |
| 228 | + scalar_field("sku_weight_units", "keyword"), | |
| 229 | + scalar_field("total_inventory", "long"), | |
| 230 | + scalar_field("sales", "long"), | |
| 231 | + nested_field( | |
| 232 | + "skus", | |
| 233 | + scalar_field("sku_id", "keyword"), | |
| 234 | + scalar_field("price", "float"), | |
| 235 | + scalar_field("compare_at_price", "float"), | |
| 236 | + scalar_field("sku_code", "keyword"), | |
| 237 | + scalar_field("stock", "long"), | |
| 238 | + scalar_field("weight", "float"), | |
| 239 | + scalar_field("weight_unit", "keyword"), | |
| 240 | + scalar_field("option1_value", "keyword"), | |
| 241 | + scalar_field("option2_value", "keyword"), | |
| 242 | + scalar_field("option3_value", "keyword"), | |
| 243 | + scalar_field("image_src", "keyword", index=False), | |
| 244 | + ), | |
| 245 | +] | |
| 246 | + | |
| 247 | + | |
| 248 | +def build_keyword_fields() -> dict[str, Any]: | |
| 249 | + return { | |
| 250 | + "keyword": { | |
| 251 | + "type": "keyword", | |
| 252 | + "normalizer": "lowercase", | |
| 253 | + } | |
| 254 | + } | |
| 255 | + | |
| 256 | + | |
| 257 | +def build_text_field(language: str, *, add_keyword: bool) -> dict[str, Any]: | |
| 258 | + field = { | |
| 259 | + "type": "text", | |
| 260 | + "analyzer": ANALYZERS[language], | |
| 261 | + } | |
| 262 | + if language == "zh": | |
| 263 | + field["search_analyzer"] = "query_ik" | |
| 264 | + if add_keyword: | |
| 265 | + field["fields"] = build_keyword_fields() | |
| 266 | + return field | |
| 267 | + | |
| 268 | + | |
| 269 | +def render_field(spec: dict[str, Any]) -> dict[str, Any]: | |
| 270 | + kind = spec["kind"] | |
| 271 | + | |
| 272 | + if kind == "scalar": | |
| 273 | + rendered = {"type": spec["type"]} | |
| 274 | + rendered.update(spec.get("extra", {})) | |
| 275 | + return rendered | |
| 276 | + | |
| 277 | + if kind == "text": | |
| 278 | + template = TEXT_FIELD_TEMPLATES[spec["template"]] | |
| 279 | + languages = LANGUAGE_GROUPS[template["language_group"]] | |
| 280 | + properties = {} | |
| 281 | + for language in languages: | |
| 282 | + properties[language] = build_text_field( | |
| 283 | + language, | |
| 284 | + add_keyword=template["with_keyword"], | |
| 285 | + ) | |
| 286 | + return { | |
| 287 | + "type": "object", | |
| 288 | + "properties": properties, | |
| 289 | + } | |
| 290 | + | |
| 291 | + if kind == "nested": | |
| 292 | + properties = {} | |
| 293 | + for child in spec["fields"]: | |
| 294 | + properties[child["name"]] = render_field(child) | |
| 295 | + return { | |
| 296 | + "type": "nested", | |
| 297 | + "properties": properties, | |
| 298 | + } | |
| 299 | + | |
| 300 | + raise ValueError(f"Unknown field kind: {kind}") | |
| 301 | + | |
| 302 | + | |
| 303 | +def build_mapping() -> dict[str, Any]: | |
| 304 | + properties = {} | |
| 305 | + for spec in FIELD_SPECS: | |
| 306 | + properties[spec["name"]] = render_field(spec) | |
| 307 | + | |
| 308 | + return { | |
| 309 | + "settings": SETTINGS, | |
| 310 | + "mappings": { | |
| 311 | + "properties": properties, | |
| 312 | + }, | |
| 313 | + } | |
| 314 | + | |
| 315 | + | |
| 316 | +def render_mapping() -> str: | |
| 317 | + return json.dumps(build_mapping(), indent=2, ensure_ascii=False) | |
| 318 | + | |
| 319 | + | |
| 320 | +def main() -> int: | |
| 321 | + parser = argparse.ArgumentParser( | |
| 322 | + description="Generate mappings/search_products.json from a compact Python spec.", | |
| 323 | + ) | |
| 324 | + parser.add_argument( | |
| 325 | + "-o", | |
| 326 | + "--output", | |
| 327 | + type=Path, | |
| 328 | + help="Write the generated mapping to this file. Defaults to stdout.", | |
| 329 | + ) | |
| 330 | + parser.add_argument( | |
| 331 | + "--check", | |
| 332 | + type=Path, | |
| 333 | + help="Fail if the generated output does not exactly match this file.", | |
| 334 | + ) | |
| 335 | + args = parser.parse_args() | |
| 336 | + | |
| 337 | + rendered = render_mapping() | |
| 338 | + | |
| 339 | + if args.check is not None: | |
| 340 | + existing = args.check.read_text(encoding="utf-8") | |
| 341 | + if existing != rendered: | |
| 342 | + print(f"Generated mapping does not match {args.check}") | |
| 343 | + return 1 | |
| 344 | + print(f"Generated mapping matches {args.check}") | |
| 345 | + | |
| 346 | + if args.output is not None: | |
| 347 | + args.output.write_text(rendered, encoding="utf-8") | |
| 348 | + elif args.check is None: | |
| 349 | + print(rendered, end="") | |
| 350 | + | |
| 351 | + return 0 | |
| 352 | + | |
| 353 | + | |
| 354 | +if __name__ == "__main__": | |
| 355 | + raise SystemExit(main()) | ... | ... |
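One way to review the effect of a template change, such as the `.keyword` subfields this commit adds, is to list every path in the generated mapping that carries a `keyword` multi-field. The sketch below assumes the `properties`/`fields` layout that `render_field` produces; `iter_keyword_paths` and `sample_properties` are hypothetical names used only for illustration.

```python
from __future__ import annotations

from typing import Any, Iterator


def iter_keyword_paths(properties: dict[str, Any], prefix: str = "") -> Iterator[str]:
    """Yield dotted paths of every `keyword` multi-field under `properties`."""
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if "keyword" in spec.get("fields", {}):
            yield f"{path}.keyword"
        # Object and nested fields keep their children under "properties".
        yield from iter_keyword_paths(spec.get("properties", {}), prefix=f"{path}.")


# A tiny hand-written sample in the shape of the generated mapping.
keyword_subfield = {"keyword": {"type": "keyword", "normalizer": "lowercase"}}
sample_properties = {
    "spu_id": {"type": "keyword"},
    "title": {
        "type": "object",
        "properties": {
            "zh": {"type": "text", "analyzer": "index_ik", "search_analyzer": "query_ik"},
            "en": {"type": "text", "analyzer": "english"},
        },
    },
    "tags": {
        "type": "object",
        "properties": {
            "zh": {
                "type": "text",
                "analyzer": "index_ik",
                "search_analyzer": "query_ik",
                "fields": keyword_subfield,
            },
            "en": {"type": "text", "analyzer": "english", "fields": keyword_subfield},
        },
    },
}

keyword_paths = sorted(iter_keyword_paths(sample_properties))
print(keyword_paths)  # ['tags.en.keyword', 'tags.zh.keyword']
```

To run this against the real artifact, load `mappings/search_products.json` with `json.load` and pass `mapping["mappings"]["properties"]` instead of the sample.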
mappings/search_products.json
| ... | ... | @@ -185,7 +185,13 @@ |
| 185 | 185 | "zh": { |
| 186 | 186 | "type": "text", |
| 187 | 187 | "analyzer": "index_ik", |
| 188 | - "search_analyzer": "query_ik" | |
| 188 | + "search_analyzer": "query_ik", | |
| 189 | + "fields": { | |
| 190 | + "keyword": { | |
| 191 | + "type": "keyword", | |
| 192 | + "normalizer": "lowercase" | |
| 193 | + } | |
| 194 | + } | |
| 189 | 195 | }, |
| 190 | 196 | "en": { |
| 191 | 197 | "type": "text", |
| ... | ... | @@ -737,7 +743,13 @@ |
| 737 | 743 | "zh": { |
| 738 | 744 | "type": "text", |
| 739 | 745 | "analyzer": "index_ik", |
| 740 | - "search_analyzer": "query_ik" | |
| 746 | + "search_analyzer": "query_ik", | |
| 747 | + "fields": { | |
| 748 | + "keyword": { | |
| 749 | + "type": "keyword", | |
| 750 | + "normalizer": "lowercase" | |
| 751 | + } | |
| 752 | + } | |
| 741 | 753 | }, |
| 742 | 754 | "en": { |
| 743 | 755 | "type": "text", |
| ... | ... | @@ -1063,123 +1075,303 @@ |
| 1063 | 1075 | "zh": { |
| 1064 | 1076 | "type": "text", |
| 1065 | 1077 | "analyzer": "index_ik", |
| 1066 | - "search_analyzer": "query_ik" | |
| 1078 | + "search_analyzer": "query_ik", | |
| 1079 | + "fields": { | |
| 1080 | + "keyword": { | |
| 1081 | + "type": "keyword", | |
| 1082 | + "normalizer": "lowercase" | |
| 1083 | + } | |
| 1084 | + } | |
| 1067 | 1085 | }, |
| 1068 | 1086 | "en": { |
| 1069 | 1087 | "type": "text", |
| 1070 | - "analyzer": "english" | |
| 1088 | + "analyzer": "english", | |
| 1089 | + "fields": { | |
| 1090 | + "keyword": { | |
| 1091 | + "type": "keyword", | |
| 1092 | + "normalizer": "lowercase" | |
| 1093 | + } | |
| 1094 | + } | |
| 1071 | 1095 | }, |
| 1072 | 1096 | "ar": { |
| 1073 | 1097 | "type": "text", |
| 1074 | - "analyzer": "arabic" | |
| 1098 | + "analyzer": "arabic", | |
| 1099 | + "fields": { | |
| 1100 | + "keyword": { | |
| 1101 | + "type": "keyword", | |
| 1102 | + "normalizer": "lowercase" | |
| 1103 | + } | |
| 1104 | + } | |
| 1075 | 1105 | }, |
| 1076 | 1106 | "hy": { |
| 1077 | 1107 | "type": "text", |
| 1078 | - "analyzer": "armenian" | |
| 1108 | + "analyzer": "armenian", | |
| 1109 | + "fields": { | |
| 1110 | + "keyword": { | |
| 1111 | + "type": "keyword", | |
| 1112 | + "normalizer": "lowercase" | |
| 1113 | + } | |
| 1114 | + } | |
| 1079 | 1115 | }, |
| 1080 | 1116 | "eu": { |
| 1081 | 1117 | "type": "text", |
| 1082 | - "analyzer": "basque" | |
| 1118 | + "analyzer": "basque", | |
| 1119 | + "fields": { | |
| 1120 | + "keyword": { | |
| 1121 | + "type": "keyword", | |
| 1122 | + "normalizer": "lowercase" | |
| 1123 | + } | |
| 1124 | + } | |
| 1083 | 1125 | }, |
| 1084 | 1126 | "pt_br": { |
| 1085 | 1127 | "type": "text", |
| 1086 | - "analyzer": "brazilian" | |
| 1128 | + "analyzer": "brazilian", | |
| 1129 | + "fields": { | |
| 1130 | + "keyword": { | |
| 1131 | + "type": "keyword", | |
| 1132 | + "normalizer": "lowercase" | |
| 1133 | + } | |
| 1134 | + } | |
| 1087 | 1135 | }, |
| 1088 | 1136 | "bg": { |
| 1089 | 1137 | "type": "text", |
| 1090 | - "analyzer": "bulgarian" | |
| 1138 | + "analyzer": "bulgarian", | |
| 1139 | + "fields": { | |
| 1140 | + "keyword": { | |
| 1141 | + "type": "keyword", | |
| 1142 | + "normalizer": "lowercase" | |
| 1143 | + } | |
| 1144 | + } | |
| 1091 | 1145 | }, |
| 1092 | 1146 | "ca": { |
| 1093 | 1147 | "type": "text", |
| 1094 | - "analyzer": "catalan" | |
| 1148 | + "analyzer": "catalan", | |
| 1149 | + "fields": { | |
| 1150 | + "keyword": { | |
| 1151 | + "type": "keyword", | |
| 1152 | + "normalizer": "lowercase" | |
| 1153 | + } | |
| 1154 | + } | |
| 1095 | 1155 | }, |
| 1096 | 1156 | "cjk": { |
| 1097 | 1157 | "type": "text", |
| 1098 | - "analyzer": "cjk" | |
| 1158 | + "analyzer": "cjk", | |
| 1159 | + "fields": { | |
| 1160 | + "keyword": { | |
| 1161 | + "type": "keyword", | |
| 1162 | + "normalizer": "lowercase" | |
| 1163 | + } | |
| 1164 | + } | |
| 1099 | 1165 | }, |
| 1100 | 1166 | "cs": { |
| 1101 | 1167 | "type": "text", |
| 1102 | - "analyzer": "czech" | |
| 1168 | + "analyzer": "czech", | |
| 1169 | + "fields": { | |
| 1170 | + "keyword": { | |
| 1171 | + "type": "keyword", | |
| 1172 | + "normalizer": "lowercase" | |
| 1173 | + } | |
| 1174 | + } | |
| 1103 | 1175 | }, |
| 1104 | 1176 | "da": { |
| 1105 | 1177 | "type": "text", |
| 1106 | - "analyzer": "danish" | |
| 1178 | + "analyzer": "danish", | |
| 1179 | + "fields": { | |
| 1180 | + "keyword": { | |
| 1181 | + "type": "keyword", | |
| 1182 | + "normalizer": "lowercase" | |
| 1183 | + } | |
| 1184 | + } | |
| 1107 | 1185 | }, |
| 1108 | 1186 | "nl": { |
| 1109 | 1187 | "type": "text", |
| 1110 | - "analyzer": "dutch" | |
| 1188 | + "analyzer": "dutch", | |
| 1189 | + "fields": { | |
| 1190 | + "keyword": { | |
| 1191 | + "type": "keyword", | |
| 1192 | + "normalizer": "lowercase" | |
| 1193 | + } | |
| 1194 | + } | |
| 1111 | 1195 | }, |
| 1112 | 1196 | "fi": { |
| 1113 | 1197 | "type": "text", |
| 1114 | - "analyzer": "finnish" | |
| 1198 | + "analyzer": "finnish", | |
| 1199 | + "fields": { | |
| 1200 | + "keyword": { | |
| 1201 | + "type": "keyword", | |
| 1202 | + "normalizer": "lowercase" | |
| 1203 | + } | |
| 1204 | + } | |
| 1115 | 1205 | }, |
| 1116 | 1206 | "fr": { |
| 1117 | 1207 | "type": "text", |
| 1118 | - "analyzer": "french" | |
| 1208 | + "analyzer": "french", | |
| 1209 | + "fields": { | |
| 1210 | + "keyword": { | |
| 1211 | + "type": "keyword", | |
| 1212 | + "normalizer": "lowercase" | |
| 1213 | + } | |
| 1214 | + } | |
| 1119 | 1215 | }, |
| 1120 | 1216 | "gl": { |
| 1121 | 1217 | "type": "text", |
| 1122 | - "analyzer": "galician" | |
| 1218 | + "analyzer": "galician", | |
| 1219 | + "fields": { | |
| 1220 | + "keyword": { | |
| 1221 | + "type": "keyword", | |
| 1222 | + "normalizer": "lowercase" | |
| 1223 | + } | |
| 1224 | + } | |
| 1123 | 1225 | }, |
| 1124 | 1226 | "de": { |
| 1125 | 1227 | "type": "text", |
| 1126 | - "analyzer": "german" | |
| 1228 | + "analyzer": "german", | |
| 1229 | + "fields": { | |
| 1230 | + "keyword": { | |
| 1231 | + "type": "keyword", | |
| 1232 | + "normalizer": "lowercase" | |
| 1233 | + } | |
| 1234 | + } | |
| 1127 | 1235 | }, |
| 1128 | 1236 | "el": { |
| 1129 | 1237 | "type": "text", |
| 1130 | - "analyzer": "greek" | |
| 1238 | + "analyzer": "greek", | |
| 1239 | + "fields": { | |
| 1240 | + "keyword": { | |
| 1241 | + "type": "keyword", | |
| 1242 | + "normalizer": "lowercase" | |
| 1243 | + } | |
| 1244 | + } | |
| 1131 | 1245 | }, |
| 1132 | 1246 | "hi": { |
| 1133 | 1247 | "type": "text", |
| 1134 | - "analyzer": "hindi" | |
| 1248 | + "analyzer": "hindi", | |
| 1249 | + "fields": { | |
| 1250 | + "keyword": { | |
| 1251 | + "type": "keyword", | |
| 1252 | + "normalizer": "lowercase" | |
| 1253 | + } | |
| 1254 | + } | |
| 1135 | 1255 | }, |
| 1136 | 1256 | "hu": { |
| 1137 | 1257 | "type": "text", |
| 1138 | - "analyzer": "hungarian" | |
| 1258 | + "analyzer": "hungarian", | |
| 1259 | + "fields": { | |
| 1260 | + "keyword": { | |
| 1261 | + "type": "keyword", | |
| 1262 | + "normalizer": "lowercase" | |
| 1263 | + } | |
| 1264 | + } | |
| 1139 | 1265 | }, |
| 1140 | 1266 | "id": { |
| 1141 | 1267 | "type": "text", |
| 1142 | - "analyzer": "indonesian" | |
| 1268 | + "analyzer": "indonesian", | |
| 1269 | + "fields": { | |
| 1270 | + "keyword": { | |
| 1271 | + "type": "keyword", | |
| 1272 | + "normalizer": "lowercase" | |
| 1273 | + } | |
| 1274 | + } | |
| 1143 | 1275 | }, |
| 1144 | 1276 | "it": { |
| 1145 | 1277 | "type": "text", |
| 1146 | - "analyzer": "italian" | |
| 1278 | + "analyzer": "italian", | |
| 1279 | + "fields": { | |
| 1280 | + "keyword": { | |
| 1281 | + "type": "keyword", | |
| 1282 | + "normalizer": "lowercase" | |
| 1283 | + } | |
| 1284 | + } | |
| 1147 | 1285 | }, |
| 1148 | 1286 | "no": { |
| 1149 | 1287 | "type": "text", |
| 1150 | - "analyzer": "norwegian" | |
| 1288 | + "analyzer": "norwegian", | |
| 1289 | + "fields": { | |
| 1290 | + "keyword": { | |
| 1291 | + "type": "keyword", | |
| 1292 | + "normalizer": "lowercase" | |
| 1293 | + } | |
| 1294 | + } | |
| 1151 | 1295 | }, |
| 1152 | 1296 | "fa": { |
| 1153 | 1297 | "type": "text", |
| 1154 | - "analyzer": "persian" | |
| 1298 | + "analyzer": "persian", | |
| 1299 | + "fields": { | |
| 1300 | + "keyword": { | |
| 1301 | + "type": "keyword", | |
| 1302 | + "normalizer": "lowercase" | |
| 1303 | + } | |
| 1304 | + } | |
| 1155 | 1305 | }, |
| 1156 | 1306 | "pt": { |
| 1157 | 1307 | "type": "text", |
| 1158 | - "analyzer": "portuguese" | |
| 1308 | + "analyzer": "portuguese", | |
| 1309 | + "fields": { | |
| 1310 | + "keyword": { | |
| 1311 | + "type": "keyword", | |
| 1312 | + "normalizer": "lowercase" | |
| 1313 | + } | |
| 1314 | + } | |
| 1159 | 1315 | }, |
| 1160 | 1316 | "ro": { |
| 1161 | 1317 | "type": "text", |
| 1162 | - "analyzer": "romanian" | |
| 1318 | + "analyzer": "romanian", | |
| 1319 | + "fields": { | |
| 1320 | + "keyword": { | |
| 1321 | + "type": "keyword", | |
| 1322 | + "normalizer": "lowercase" | |
| 1323 | + } | |
| 1324 | + } | |
| 1163 | 1325 | }, |
| 1164 | 1326 | "ru": { |
| 1165 | 1327 | "type": "text", |
| 1166 | - "analyzer": "russian" | |
| 1328 | + "analyzer": "russian", | |
| 1329 | + "fields": { | |
| 1330 | + "keyword": { | |
| 1331 | + "type": "keyword", | |
| 1332 | + "normalizer": "lowercase" | |
| 1333 | + } | |
| 1334 | + } | |
| 1167 | 1335 | }, |
| 1168 | 1336 | "es": { |
| 1169 | 1337 | "type": "text", |
| 1170 | - "analyzer": "spanish" | |
| 1338 | + "analyzer": "spanish", | |
| 1339 | + "fields": { | |
| 1340 | + "keyword": { | |
| 1341 | + "type": "keyword", | |
| 1342 | + "normalizer": "lowercase" | |
| 1343 | + } | |
| 1344 | + } | |
| 1171 | 1345 | }, |
| 1172 | 1346 | "sv": { |
| 1173 | 1347 | "type": "text", |
| 1174 | - "analyzer": "swedish" | |
| 1348 | + "analyzer": "swedish", | |
| 1349 | + "fields": { | |
| 1350 | + "keyword": { | |
| 1351 | + "type": "keyword", | |
| 1352 | + "normalizer": "lowercase" | |
| 1353 | + } | |
| 1354 | + } | |
| 1175 | 1355 | }, |
| 1176 | 1356 | "tr": { |
| 1177 | 1357 | "type": "text", |
| 1178 | - "analyzer": "turkish" | |
| 1358 | + "analyzer": "turkish", | |
| 1359 | + "fields": { | |
| 1360 | + "keyword": { | |
| 1361 | + "type": "keyword", | |
| 1362 | + "normalizer": "lowercase" | |
| 1363 | + } | |
| 1364 | + } | |
| 1179 | 1365 | }, |
| 1180 | 1366 | "th": { |
| 1181 | 1367 | "type": "text", |
| 1182 | - "analyzer": "thai" | |
| 1368 | + "analyzer": "thai", | |
| 1369 | + "fields": { | |
| 1370 | + "keyword": { | |
| 1371 | + "type": "keyword", | |
| 1372 | + "normalizer": "lowercase" | |
| 1373 | + } | |
| 1374 | + } | |
| 1183 | 1375 | } |
| 1184 | 1376 | } |
| 1185 | 1377 | }, |
| ... | ... | @@ -1189,123 +1381,303 @@ |
| 1189 | 1381 | "zh": { |
| 1190 | 1382 | "type": "text", |
| 1191 | 1383 | "analyzer": "index_ik", |
| 1192 | - "search_analyzer": "query_ik" | |
| 1384 | + "search_analyzer": "query_ik", | |
| 1385 | + "fields": { | |
| 1386 | + "keyword": { | |
| 1387 | + "type": "keyword", | |
| 1388 | + "normalizer": "lowercase" | |
| 1389 | + } | |
| 1390 | + } | |
| 1193 | 1391 | }, |
| 1194 | 1392 | "en": { |
| 1195 | 1393 | "type": "text", |
| 1196 | - "analyzer": "english" | |
| 1394 | + "analyzer": "english", | |
| 1395 | + "fields": { | |
| 1396 | + "keyword": { | |
| 1397 | + "type": "keyword", | |
| 1398 | + "normalizer": "lowercase" | |
| 1399 | + } | |
| 1400 | + } | |
| 1197 | 1401 | }, |
| 1198 | 1402 | "ar": { |
| 1199 | 1403 | "type": "text", |
| 1200 | - "analyzer": "arabic" | |
| 1404 | + "analyzer": "arabic", | |
| 1405 | + "fields": { | |
| 1406 | + "keyword": { | |
| 1407 | + "type": "keyword", | |
| 1408 | + "normalizer": "lowercase" | |
| 1409 | + } | |
| 1410 | + } | |
| 1201 | 1411 | }, |
| 1202 | 1412 | "hy": { |
| 1203 | 1413 | "type": "text", |
| 1204 | - "analyzer": "armenian" | |
| 1414 | + "analyzer": "armenian", | |
| 1415 | + "fields": { | |
| 1416 | + "keyword": { | |
| 1417 | + "type": "keyword", | |
| 1418 | + "normalizer": "lowercase" | |
| 1419 | + } | |
| 1420 | + } | |
| 1205 | 1421 | }, |
| 1206 | 1422 | "eu": { |
| 1207 | 1423 | "type": "text", |
| 1208 | - "analyzer": "basque" | |
| 1424 | + "analyzer": "basque", | |
| 1425 | + "fields": { | |
| 1426 | + "keyword": { | |
| 1427 | + "type": "keyword", | |
| 1428 | + "normalizer": "lowercase" | |
| 1429 | + } | |
| 1430 | + } | |
| 1209 | 1431 | }, |
| 1210 | 1432 | "pt_br": { |
| 1211 | 1433 | "type": "text", |
| 1212 | - "analyzer": "brazilian" | |
| 1434 | + "analyzer": "brazilian", | |
| 1435 | + "fields": { | |
| 1436 | + "keyword": { | |
| 1437 | + "type": "keyword", | |
| 1438 | + "normalizer": "lowercase" | |
| 1439 | + } | |
| 1440 | + } | |
| 1213 | 1441 | }, |
| 1214 | 1442 | "bg": { |
| 1215 | 1443 | "type": "text", |
| 1216 | - "analyzer": "bulgarian" | |
| 1444 | + "analyzer": "bulgarian", | |
| 1445 | + "fields": { | |
| 1446 | + "keyword": { | |
| 1447 | + "type": "keyword", | |
| 1448 | + "normalizer": "lowercase" | |
| 1449 | + } | |
| 1450 | + } | |
| 1217 | 1451 | }, |
| 1218 | 1452 | "ca": { |
| 1219 | 1453 | "type": "text", |
| 1220 | - "analyzer": "catalan" | |
| 1454 | + "analyzer": "catalan", | |
| 1455 | + "fields": { | |
| 1456 | + "keyword": { | |
| 1457 | + "type": "keyword", | |
| 1458 | + "normalizer": "lowercase" | |
| 1459 | + } | |
| 1460 | + } | |
| 1221 | 1461 | }, |
| 1222 | 1462 | "cjk": { |
| 1223 | 1463 | "type": "text", |
| 1224 | - "analyzer": "cjk" | |
| 1464 | + "analyzer": "cjk", | |
| 1465 | + "fields": { | |
| 1466 | + "keyword": { | |
| 1467 | + "type": "keyword", | |
| 1468 | + "normalizer": "lowercase" | |
| 1469 | + } | |
| 1470 | + } | |
| 1225 | 1471 | }, |
| 1226 | 1472 | "cs": { |
| 1227 | 1473 | "type": "text", |
| 1228 | - "analyzer": "czech" | |
| 1474 | + "analyzer": "czech", | |
| 1475 | + "fields": { | |
| 1476 | + "keyword": { | |
| 1477 | + "type": "keyword", | |
| 1478 | + "normalizer": "lowercase" | |
| 1479 | + } | |
| 1480 | + } | |
| 1229 | 1481 | }, |
| 1230 | 1482 | "da": { |
| 1231 | 1483 | "type": "text", |
| 1232 | - "analyzer": "danish" | |
| 1484 | + "analyzer": "danish", | |
| 1485 | + "fields": { | |
| 1486 | + "keyword": { | |
| 1487 | + "type": "keyword", | |
| 1488 | + "normalizer": "lowercase" | |
| 1489 | + } | |
| 1490 | + } | |
| 1233 | 1491 | }, |
| 1234 | 1492 | "nl": { |
| 1235 | 1493 | "type": "text", |
| 1236 | - "analyzer": "dutch" | |
| 1494 | + "analyzer": "dutch", | |
| 1495 | + "fields": { | |
| 1496 | + "keyword": { | |
| 1497 | + "type": "keyword", | |
| 1498 | + "normalizer": "lowercase" | |
| 1499 | + } | |
| 1500 | + } | |
| 1237 | 1501 | }, |
| 1238 | 1502 | "fi": { |
| 1239 | 1503 | "type": "text", |
| 1240 | - "analyzer": "finnish" | |
| 1504 | + "analyzer": "finnish", | |
| 1505 | + "fields": { | |
| 1506 | + "keyword": { | |
| 1507 | + "type": "keyword", | |
| 1508 | + "normalizer": "lowercase" | |
| 1509 | + } | |
| 1510 | + } | |
| 1241 | 1511 | }, |
| 1242 | 1512 | "fr": { |
| 1243 | 1513 | "type": "text", |
| 1244 | - "analyzer": "french" | |
| 1514 | + "analyzer": "french", | |
| 1515 | + "fields": { | |
| 1516 | + "keyword": { | |
| 1517 | + "type": "keyword", | |
| 1518 | + "normalizer": "lowercase" | |
| 1519 | + } | |
| 1520 | + } | |
| 1245 | 1521 | }, |
| 1246 | 1522 | "gl": { |
| 1247 | 1523 | "type": "text", |
| 1248 | - "analyzer": "galician" | |
| 1524 | + "analyzer": "galician", | |
| 1525 | + "fields": { | |
| 1526 | + "keyword": { | |
| 1527 | + "type": "keyword", | |
| 1528 | + "normalizer": "lowercase" | |
| 1529 | + } | |
| 1530 | + } | |
| 1249 | 1531 | }, |
| 1250 | 1532 | "de": { |
| 1251 | 1533 | "type": "text", |
| 1252 | - "analyzer": "german" | |
| 1534 | + "analyzer": "german", | |
| 1535 | + "fields": { | |
| 1536 | + "keyword": { | |
| 1537 | + "type": "keyword", | |
| 1538 | + "normalizer": "lowercase" | |
| 1539 | + } | |
| 1540 | + } | |
| 1253 | 1541 | }, |
| 1254 | 1542 | "el": { |
| 1255 | 1543 | "type": "text", |
| 1256 | - "analyzer": "greek" | |
| 1544 | + "analyzer": "greek", | |
| 1545 | + "fields": { | |
| 1546 | + "keyword": { | |
| 1547 | + "type": "keyword", | |
| 1548 | + "normalizer": "lowercase" | |
| 1549 | + } | |
| 1550 | + } | |
| 1257 | 1551 | }, |
| 1258 | 1552 | "hi": { |
| 1259 | 1553 | "type": "text", |
| 1260 | - "analyzer": "hindi" | |
| 1554 | + "analyzer": "hindi", | |
| 1555 | + "fields": { | |
| 1556 | + "keyword": { | |
| 1557 | + "type": "keyword", | |
| 1558 | + "normalizer": "lowercase" | |
| 1559 | + } | |
| 1560 | + } | |
| 1261 | 1561 | }, |
| 1262 | 1562 | "hu": { |
| 1263 | 1563 | "type": "text", |
| 1264 | - "analyzer": "hungarian" | |
| 1564 | + "analyzer": "hungarian", | |
| 1565 | + "fields": { | |
| 1566 | + "keyword": { | |
| 1567 | + "type": "keyword", | |
| 1568 | + "normalizer": "lowercase" | |
| 1569 | + } | |
| 1570 | + } | |
| 1265 | 1571 | }, |
| 1266 | 1572 | "id": { |
| 1267 | 1573 | "type": "text", |
| 1268 | - "analyzer": "indonesian" | |
| 1574 | + "analyzer": "indonesian", | |
| 1575 | + "fields": { | |
| 1576 | + "keyword": { | |
| 1577 | + "type": "keyword", | |
| 1578 | + "normalizer": "lowercase" | |
| 1579 | + } | |
| 1580 | + } | |
| 1269 | 1581 | }, |
| 1270 | 1582 | "it": { |
| 1271 | 1583 | "type": "text", |
| 1272 | - "analyzer": "italian" | |
| 1584 | + "analyzer": "italian", | |
| 1585 | + "fields": { | |
| 1586 | + "keyword": { | |
| 1587 | + "type": "keyword", | |
| 1588 | + "normalizer": "lowercase" | |
| 1589 | + } | |
| 1590 | + } | |
| 1273 | 1591 | }, |
| 1274 | 1592 | "no": { |
| 1275 | 1593 | "type": "text", |
| 1276 | - "analyzer": "norwegian" | |
| 1594 | + "analyzer": "norwegian", | |
| 1595 | + "fields": { | |
| 1596 | + "keyword": { | |
| 1597 | + "type": "keyword", | |
| 1598 | + "normalizer": "lowercase" | |
| 1599 | + } | |
| 1600 | + } | |
| 1277 | 1601 | }, |
| 1278 | 1602 | "fa": { |
| 1279 | 1603 | "type": "text", |
| 1280 | - "analyzer": "persian" | |
| 1604 | + "analyzer": "persian", | |
| 1605 | + "fields": { | |
| 1606 | + "keyword": { | |
| 1607 | + "type": "keyword", | |
| 1608 | + "normalizer": "lowercase" | |
| 1609 | + } | |
| 1610 | + } | |
| 1281 | 1611 | }, |
| 1282 | 1612 | "pt": { |
| 1283 | 1613 | "type": "text", |
| 1284 | - "analyzer": "portuguese" | |
| 1614 | + "analyzer": "portuguese", | |
| 1615 | + "fields": { | |
| 1616 | + "keyword": { | |
| 1617 | + "type": "keyword", | |
| 1618 | + "normalizer": "lowercase" | |
| 1619 | + } | |
| 1620 | + } | |
| 1285 | 1621 | }, |
| 1286 | 1622 | "ro": { |
| 1287 | 1623 | "type": "text", |
| 1288 | - "analyzer": "romanian" | |
| 1624 | + "analyzer": "romanian", | |
| 1625 | + "fields": { | |
| 1626 | + "keyword": { | |
| 1627 | + "type": "keyword", | |
| 1628 | + "normalizer": "lowercase" | |
| 1629 | + } | |
| 1630 | + } | |
| 1289 | 1631 | }, |
| 1290 | 1632 | "ru": { |
| 1291 | 1633 | "type": "text", |
| 1292 | - "analyzer": "russian" | |
| 1634 | + "analyzer": "russian", | |
| 1635 | + "fields": { | |
| 1636 | + "keyword": { | |
| 1637 | + "type": "keyword", | |
| 1638 | + "normalizer": "lowercase" | |
| 1639 | + } | |
| 1640 | + } | |
| 1293 | 1641 | }, |
| 1294 | 1642 | "es": { |
| 1295 | 1643 | "type": "text", |
| 1296 | - "analyzer": "spanish" | |
| 1644 | + "analyzer": "spanish", | |
| 1645 | + "fields": { | |
| 1646 | + "keyword": { | |
| 1647 | + "type": "keyword", | |
| 1648 | + "normalizer": "lowercase" | |
| 1649 | + } | |
| 1650 | + } | |
| 1297 | 1651 | }, |
| 1298 | 1652 | "sv": { |
| 1299 | 1653 | "type": "text", |
| 1300 | - "analyzer": "swedish" | |
| 1654 | + "analyzer": "swedish", | |
| 1655 | + "fields": { | |
| 1656 | + "keyword": { | |
| 1657 | + "type": "keyword", | |
| 1658 | + "normalizer": "lowercase" | |
| 1659 | + } | |
| 1660 | + } | |
| 1301 | 1661 | }, |
| 1302 | 1662 | "tr": { |
| 1303 | 1663 | "type": "text", |
| 1304 | - "analyzer": "turkish" | |
| 1664 | + "analyzer": "turkish", | |
| 1665 | + "fields": { | |
| 1666 | + "keyword": { | |
| 1667 | + "type": "keyword", | |
| 1668 | + "normalizer": "lowercase" | |
| 1669 | + } | |
| 1670 | + } | |
| 1305 | 1671 | }, |
| 1306 | 1672 | "th": { |
| 1307 | 1673 | "type": "text", |
| 1308 | - "analyzer": "thai" | |
| 1674 | + "analyzer": "thai", | |
| 1675 | + "fields": { | |
| 1676 | + "keyword": { | |
| 1677 | + "type": "keyword", | |
| 1678 | + "normalizer": "lowercase" | |
| 1679 | + } | |
| 1680 | + } | |
| 1309 | 1681 | } |
| 1310 | 1682 | } |
| 1311 | 1683 | }, |
| ... | ... | @@ -1377,6 +1749,9 @@ |
| 1377 | 1749 | "type": "keyword" |
| 1378 | 1750 | }, |
| 1379 | 1751 | "value": { |
| 1752 | + "type": "keyword" | |
| 1753 | + }, | |
| 1754 | + "value_text": { | |
| 1380 | 1755 | "type": "object", |
| 1381 | 1756 | "properties": { |
| 1382 | 1757 | "zh": { |
| ... | ... | @@ -1399,6 +1774,286 @@ |
| 1399 | 1774 | "normalizer": "lowercase" |
| 1400 | 1775 | } |
| 1401 | 1776 | } |
| 1777 | + }, | |
| 1778 | + "ar": { | |
| 1779 | + "type": "text", | |
| 1780 | + "analyzer": "arabic", | |
| 1781 | + "fields": { | |
| 1782 | + "keyword": { | |
| 1783 | + "type": "keyword", | |
| 1784 | + "normalizer": "lowercase" | |
| 1785 | + } | |
| 1786 | + } | |
| 1787 | + }, | |
| 1788 | + "hy": { | |
| 1789 | + "type": "text", | |
| 1790 | + "analyzer": "armenian", | |
| 1791 | + "fields": { | |
| 1792 | + "keyword": { | |
| 1793 | + "type": "keyword", | |
| 1794 | + "normalizer": "lowercase" | |
| 1795 | + } | |
| 1796 | + } | |
| 1797 | + }, | |
| 1798 | + "eu": { | |
| 1799 | + "type": "text", | |
| 1800 | + "analyzer": "basque", | |
| 1801 | + "fields": { | |
| 1802 | + "keyword": { | |
| 1803 | + "type": "keyword", | |
| 1804 | + "normalizer": "lowercase" | |
| 1805 | + } | |
| 1806 | + } | |
| 1807 | + }, | |
| 1808 | + "pt_br": { | |
| 1809 | + "type": "text", | |
| 1810 | + "analyzer": "brazilian", | |
| 1811 | + "fields": { | |
| 1812 | + "keyword": { | |
| 1813 | + "type": "keyword", | |
| 1814 | + "normalizer": "lowercase" | |
| 1815 | + } | |
| 1816 | + } | |
| 1817 | + }, | |
| 1818 | + "bg": { | |
| 1819 | + "type": "text", | |
| 1820 | + "analyzer": "bulgarian", | |
| 1821 | + "fields": { | |
| 1822 | + "keyword": { | |
| 1823 | + "type": "keyword", | |
| 1824 | + "normalizer": "lowercase" | |
| 1825 | + } | |
| 1826 | + } | |
| 1827 | + }, | |
| 1828 | + "ca": { | |
| 1829 | + "type": "text", | |
| 1830 | + "analyzer": "catalan", | |
| 1831 | + "fields": { | |
| 1832 | + "keyword": { | |
| 1833 | + "type": "keyword", | |
| 1834 | + "normalizer": "lowercase" | |
| 1835 | + } | |
| 1836 | + } | |
| 1837 | + }, | |
| 1838 | + "cjk": { | |
| 1839 | + "type": "text", | |
| 1840 | + "analyzer": "cjk", | |
| 1841 | + "fields": { | |
| 1842 | + "keyword": { | |
| 1843 | + "type": "keyword", | |
| 1844 | + "normalizer": "lowercase" | |
| 1845 | + } | |
| 1846 | + } | |
| 1847 | + }, | |
| 1848 | + "cs": { | |
| 1849 | + "type": "text", | |
| 1850 | + "analyzer": "czech", | |
| 1851 | + "fields": { | |
| 1852 | + "keyword": { | |
| 1853 | + "type": "keyword", | |
| 1854 | + "normalizer": "lowercase" | |
| 1855 | + } | |
| 1856 | + } | |
| 1857 | + }, | |
| 1858 | + "da": { | |
| 1859 | + "type": "text", | |
| 1860 | + "analyzer": "danish", | |
| 1861 | + "fields": { | |
| 1862 | + "keyword": { | |
| 1863 | + "type": "keyword", | |
| 1864 | + "normalizer": "lowercase" | |
| 1865 | + } | |
| 1866 | + } | |
| 1867 | + }, | |
| 1868 | + "nl": { | |
| 1869 | + "type": "text", | |
| 1870 | + "analyzer": "dutch", | |
| 1871 | + "fields": { | |
| 1872 | + "keyword": { | |
| 1873 | + "type": "keyword", | |
| 1874 | + "normalizer": "lowercase" | |
| 1875 | + } | |
| 1876 | + } | |
| 1877 | + }, | |
| 1878 | + "fi": { | |
| 1879 | + "type": "text", | |
| 1880 | + "analyzer": "finnish", | |
| 1881 | + "fields": { | |
| 1882 | + "keyword": { | |
| 1883 | + "type": "keyword", | |
| 1884 | + "normalizer": "lowercase" | |
| 1885 | + } | |
| 1886 | + } | |
| 1887 | + }, | |
| 1888 | + "fr": { | |
| 1889 | + "type": "text", | |
| 1890 | + "analyzer": "french", | |
| 1891 | + "fields": { | |
| 1892 | + "keyword": { | |
| 1893 | + "type": "keyword", | |
| 1894 | + "normalizer": "lowercase" | |
| 1895 | + } | |
| 1896 | + } | |
| 1897 | + }, | |
| 1898 | + "gl": { | |
| 1899 | + "type": "text", | |
| 1900 | + "analyzer": "galician", | |
| 1901 | + "fields": { | |
| 1902 | + "keyword": { | |
| 1903 | + "type": "keyword", | |
| 1904 | + "normalizer": "lowercase" | |
| 1905 | + } | |
| 1906 | + } | |
| 1907 | + }, | |
| 1908 | + "de": { | |
| 1909 | + "type": "text", | |
| 1910 | + "analyzer": "german", | |
| 1911 | + "fields": { | |
| 1912 | + "keyword": { | |
| 1913 | + "type": "keyword", | |
| 1914 | + "normalizer": "lowercase" | |
| 1915 | + } | |
| 1916 | + } | |
| 1917 | + }, | |
| 1918 | + "el": { | |
| 1919 | + "type": "text", | |
| 1920 | + "analyzer": "greek", | |
| 1921 | + "fields": { | |
| 1922 | + "keyword": { | |
| 1923 | + "type": "keyword", | |
| 1924 | + "normalizer": "lowercase" | |
| 1925 | + } | |
| 1926 | + } | |
| 1927 | + }, | |
| 1928 | + "hi": { | |
| 1929 | + "type": "text", | |
| 1930 | + "analyzer": "hindi", | |
| 1931 | + "fields": { | |
| 1932 | + "keyword": { | |
| 1933 | + "type": "keyword", | |
| 1934 | + "normalizer": "lowercase" | |
| 1935 | + } | |
| 1936 | + } | |
| 1937 | + }, | |
| 1938 | + "hu": { | |
| 1939 | + "type": "text", | |
| 1940 | + "analyzer": "hungarian", | |
| 1941 | + "fields": { | |
| 1942 | + "keyword": { | |
| 1943 | + "type": "keyword", | |
| 1944 | + "normalizer": "lowercase" | |
| 1945 | + } | |
| 1946 | + } | |
| 1947 | + }, | |
| 1948 | + "id": { | |
| 1949 | + "type": "text", | |
| 1950 | + "analyzer": "indonesian", | |
| 1951 | + "fields": { | |
| 1952 | + "keyword": { | |
| 1953 | + "type": "keyword", | |
| 1954 | + "normalizer": "lowercase" | |
| 1955 | + } | |
| 1956 | + } | |
| 1957 | + }, | |
| 1958 | + "it": { | |
| 1959 | + "type": "text", | |
| 1960 | + "analyzer": "italian", | |
| 1961 | + "fields": { | |
| 1962 | + "keyword": { | |
| 1963 | + "type": "keyword", | |
| 1964 | + "normalizer": "lowercase" | |
| 1965 | + } | |
| 1966 | + } | |
| 1967 | + }, | |
| 1968 | + "no": { | |
| 1969 | + "type": "text", | |
| 1970 | + "analyzer": "norwegian", | |
| 1971 | + "fields": { | |
| 1972 | + "keyword": { | |
| 1973 | + "type": "keyword", | |
| 1974 | + "normalizer": "lowercase" | |
| 1975 | + } | |
| 1976 | + } | |
| 1977 | + }, | |
| 1978 | + "fa": { | |
| 1979 | + "type": "text", | |
| 1980 | + "analyzer": "persian", | |
| 1981 | + "fields": { | |
| 1982 | + "keyword": { | |
| 1983 | + "type": "keyword", | |
| 1984 | + "normalizer": "lowercase" | |
| 1985 | + } | |
| 1986 | + } | |
| 1987 | + }, | |
| 1988 | + "pt": { | |
| 1989 | + "type": "text", | |
| 1990 | + "analyzer": "portuguese", | |
| 1991 | + "fields": { | |
| 1992 | + "keyword": { | |
| 1993 | + "type": "keyword", | |
| 1994 | + "normalizer": "lowercase" | |
| 1995 | + } | |
| 1996 | + } | |
| 1997 | + }, | |
| 1998 | + "ro": { | |
| 1999 | + "type": "text", | |
| 2000 | + "analyzer": "romanian", | |
| 2001 | + "fields": { | |
| 2002 | + "keyword": { | |
| 2003 | + "type": "keyword", | |
| 2004 | + "normalizer": "lowercase" | |
| 2005 | + } | |
| 2006 | + } | |
| 2007 | + }, | |
| 2008 | + "ru": { | |
| 2009 | + "type": "text", | |
| 2010 | + "analyzer": "russian", | |
| 2011 | + "fields": { | |
| 2012 | + "keyword": { | |
| 2013 | + "type": "keyword", | |
| 2014 | + "normalizer": "lowercase" | |
| 2015 | + } | |
| 2016 | + } | |
| 2017 | + }, | |
| 2018 | + "es": { | |
| 2019 | + "type": "text", | |
| 2020 | + "analyzer": "spanish", | |
| 2021 | + "fields": { | |
| 2022 | + "keyword": { | |
| 2023 | + "type": "keyword", | |
| 2024 | + "normalizer": "lowercase" | |
| 2025 | + } | |
| 2026 | + } | |
| 2027 | + }, | |
| 2028 | + "sv": { | |
| 2029 | + "type": "text", | |
| 2030 | + "analyzer": "swedish", | |
| 2031 | + "fields": { | |
| 2032 | + "keyword": { | |
| 2033 | + "type": "keyword", | |
| 2034 | + "normalizer": "lowercase" | |
| 2035 | + } | |
| 2036 | + } | |
| 2037 | + }, | |
| 2038 | + "tr": { | |
| 2039 | + "type": "text", | |
| 2040 | + "analyzer": "turkish", | |
| 2041 | + "fields": { | |
| 2042 | + "keyword": { | |
| 2043 | + "type": "keyword", | |
| 2044 | + "normalizer": "lowercase" | |
| 2045 | + } | |
| 2046 | + } | |
| 2047 | + }, | |
| 2048 | + "th": { | |
| 2049 | + "type": "text", | |
| 2050 | + "analyzer": "thai", | |
| 2051 | + "fields": { | |
| 2052 | + "keyword": { | |
| 2053 | + "type": "keyword", | |
| 2054 | + "normalizer": "lowercase" | |
| 2055 | + } | |
| 2056 | + } | |
| 1402 | 2057 | } |
| 1403 | 2058 | } |
| 1404 | 2059 | } | ... | ... |
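
The per-language blocks added in the hunks above all follow one pattern: a `text` field with a language analyzer, plus an optional `keyword` sub-field that uses the `lowercase` normalizer. The sketch below shows one way such blocks could be produced from a template. It is an illustration only: the names `ANALYZERS`, `text_field`, and `language_object` are hypothetical, the language table is a small subset of the codes seen in the diff, and the real `generate_search_products_mapping.py` is not shown here and may be structured differently.

```python
import json

# Subset of the language code -> analyzer pairs visible in the diff.
# (Hypothetical table; the real generator may hold the full list elsewhere.)
ANALYZERS = {
    "en": "english",
    "ar": "arabic",
    "ru": "russian",
    "es": "spanish",
    "th": "thai",
}


def text_field(lang, with_keyword):
    """Build the mapping for one language slot."""
    if lang == "zh":
        # Chinese uses the custom IK analyzers defined in the index settings.
        field = {
            "type": "text",
            "analyzer": "index_ik",
            "search_analyzer": "query_ik",
        }
    else:
        field = {"type": "text", "analyzer": ANALYZERS[lang]}
    if with_keyword:
        # Same sub-field shape as the lines added in this commit.
        field["fields"] = {
            "keyword": {"type": "keyword", "normalizer": "lowercase"}
        }
    return field


def language_object(langs, with_keyword):
    """Build an object field with one text sub-field per language."""
    return {
        "type": "object",
        "properties": {
            lang: text_field(lang, with_keyword) for lang in langs
        },
    }


if __name__ == "__main__":
    # Example: a core-language field (zh + en) with keyword sub-fields.
    print(json.dumps(language_object(["zh", "en"], True), indent=2))
```

Generating the blocks from one template keeps every language slot consistent, which is hard to guarantee when roughly thirty near-identical JSON fragments are edited by hand.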
| ... | ... | @@ -0,0 +1,629 @@ |
| 1 | +{ | |
| 2 | + "settings": { | |
| 3 | + "number_of_shards": 1, | |
| 4 | + "number_of_replicas": 0, | |
| 5 | + "refresh_interval": "30s", | |
| 6 | + "analysis": { | |
| 7 | + "analyzer": { | |
| 8 | + "index_ik": { | |
| 9 | + "type": "custom", | |
| 10 | + "tokenizer": "ik_max_word", | |
| 11 | + "filter": [ | |
| 12 | + "lowercase", | |
| 13 | + "asciifolding" | |
| 14 | + ] | |
| 15 | + }, | |
| 16 | + "query_ik": { | |
| 17 | + "type": "custom", | |
| 18 | + "tokenizer": "ik_smart", | |
| 19 | + "filter": [ | |
| 20 | + "lowercase", | |
| 21 | + "asciifolding" | |
| 22 | + ] | |
| 23 | + } | |
| 24 | + }, | |
| 25 | + "normalizer": { | |
| 26 | + "lowercase": { | |
| 27 | + "type": "custom", | |
| 28 | + "filter": [ | |
| 29 | + "lowercase" | |
| 30 | + ] | |
| 31 | + } | |
| 32 | + } | |
| 33 | + }, | |
| 34 | + "similarity": { | |
| 35 | + "default": { | |
| 36 | + "type": "BM25", | |
| 37 | + "b": 0.0, | |
| 38 | + "k1": 0.0 | |
| 39 | + } | |
| 40 | + } | |
| 41 | + }, | |
| 42 | + "mappings": { | |
| 43 | + "properties": { | |
| 44 | + "tenant_id": { | |
| 45 | + "type": "keyword" | |
| 46 | + }, | |
| 47 | + "spu_id": { | |
| 48 | + "type": "keyword" | |
| 49 | + }, | |
| 50 | + "create_time": { | |
| 51 | + "type": "date" | |
| 52 | + }, | |
| 53 | + "update_time": { | |
| 54 | + "type": "date" | |
| 55 | + }, | |
| 56 | + "title": { | |
| 57 | + "type": "object", | |
| 58 | + "properties": { | |
| 59 | + "zh": { | |
| 60 | + "type": "text", | |
| 61 | + "analyzer": "index_ik", | |
| 62 | + "search_analyzer": "query_ik" | |
| 63 | + }, | |
| 64 | + "en": { | |
| 65 | + "type": "text", | |
| 66 | + "analyzer": "english" | |
| 67 | + }, | |
| 68 | + "ar": { | |
| 69 | + "type": "text", | |
| 70 | + "analyzer": "arabic" | |
| 71 | + }, | |
| 72 | + "hy": { | |
| 73 | + "type": "text", | |
| 74 | + "analyzer": "armenian" | |
| 75 | + }, | |
| 76 | + "eu": { | |
| 77 | + "type": "text", | |
| 78 | + "analyzer": "basque" | |
| 79 | + }, | |
| 80 | + "pt_br": { | |
| 81 | + "type": "text", | |
| 82 | + "analyzer": "brazilian" | |
| 83 | + }, | |
| 84 | + "bg": { | |
| 85 | + "type": "text", | |
| 86 | + "analyzer": "bulgarian" | |
| 87 | + }, | |
| 88 | + "ca": { | |
| 89 | + "type": "text", | |
| 90 | + "analyzer": "catalan" | |
| 91 | + }, | |
| 92 | + "cjk": { | |
| 93 | + "type": "text", | |
| 94 | + "analyzer": "cjk" | |
| 95 | + }, | |
| 96 | + "cs": { | |
| 97 | + "type": "text", | |
| 98 | + "analyzer": "czech" | |
| 99 | + }, | |
| 100 | + "da": { | |
| 101 | + "type": "text", | |
| 102 | + "analyzer": "danish" | |
| 103 | + }, | |
| 104 | + "nl": { | |
| 105 | + "type": "text", | |
| 106 | + "analyzer": "dutch" | |
| 107 | + }, | |
| 108 | + "fi": { | |
| 109 | + "type": "text", | |
| 110 | + "analyzer": "finnish" | |
| 111 | + }, | |
| 112 | + "fr": { | |
| 113 | + "type": "text", | |
| 114 | + "analyzer": "french" | |
| 115 | + }, | |
| 116 | + "gl": { | |
| 117 | + "type": "text", | |
| 118 | + "analyzer": "galician" | |
| 119 | + }, | |
| 120 | + "de": { | |
| 121 | + "type": "text", | |
| 122 | + "analyzer": "german" | |
| 123 | + }, | |
| 124 | + "el": { | |
| 125 | + "type": "text", | |
| 126 | + "analyzer": "greek" | |
| 127 | + }, | |
| 128 | + "hi": { | |
| 129 | + "type": "text", | |
| 130 | + "analyzer": "hindi" | |
| 131 | + }, | |
| 132 | + "hu": { | |
| 133 | + "type": "text", | |
| 134 | + "analyzer": "hungarian" | |
| 135 | + }, | |
| 136 | + "id": { | |
| 137 | + "type": "text", | |
| 138 | + "analyzer": "indonesian" | |
| 139 | + }, | |
| 140 | + "it": { | |
| 141 | + "type": "text", | |
| 142 | + "analyzer": "italian" | |
| 143 | + }, | |
| 144 | + "no": { | |
| 145 | + "type": "text", | |
| 146 | + "analyzer": "norwegian" | |
| 147 | + }, | |
| 148 | + "fa": { | |
| 149 | + "type": "text", | |
| 150 | + "analyzer": "persian" | |
| 151 | + }, | |
| 152 | + "pt": { | |
| 153 | + "type": "text", | |
| 154 | + "analyzer": "portuguese" | |
| 155 | + }, | |
| 156 | + "ro": { | |
| 157 | + "type": "text", | |
| 158 | + "analyzer": "romanian" | |
| 159 | + }, | |
| 160 | + "ru": { | |
| 161 | + "type": "text", | |
| 162 | + "analyzer": "russian" | |
| 163 | + }, | |
| 164 | + "es": { | |
| 165 | + "type": "text", | |
| 166 | + "analyzer": "spanish" | |
| 167 | + }, | |
| 168 | + "sv": { | |
| 169 | + "type": "text", | |
| 170 | + "analyzer": "swedish" | |
| 171 | + }, | |
| 172 | + "tr": { | |
| 173 | + "type": "text", | |
| 174 | + "analyzer": "turkish" | |
| 175 | + }, | |
| 176 | + "th": { | |
| 177 | + "type": "text", | |
| 178 | + "analyzer": "thai" | |
| 179 | + } | |
| 180 | + } | |
| 181 | + }, | |
| 182 | + "keywords": { | |
| 183 | + "type": "object", | |
| 184 | + "properties": { | |
| 185 | + "zh": { | |
| 186 | + "type": "text", | |
| 187 | + "analyzer": "index_ik", | |
| 188 | + "search_analyzer": "query_ik" | |
| 189 | + }, | |
| 190 | + "en": { | |
| 191 | + "type": "text", | |
| 192 | + "analyzer": "english", | |
| 193 | + "fields": { | |
| 194 | + "keyword": { | |
| 195 | + "type": "keyword", | |
| 196 | + "normalizer": "lowercase" | |
| 197 | + } | |
| 198 | + } | |
| 199 | + }, | |
| 200 | + "ar": { | |
| 201 | + "type": "text", | |
| 202 | + "analyzer": "arabic", | |
| 203 | + "fields": { | |
| 204 | + "keyword": { | |
| 205 | + "type": "keyword", | |
| 206 | + "normalizer": "lowercase" | |
| 207 | + } | |
| 208 | + } | |
| 209 | + }, | |
| 210 | +... | |
| 211 | + } | |
| 212 | + }, | |
| 213 | + "brief": { | |
| 214 | + "type": "object", | |
| 215 | + "properties": { | |
| 216 | + "zh": { | |
| 217 | + "type": "text", | |
| 218 | + "analyzer": "index_ik", | |
| 219 | + "search_analyzer": "query_ik" | |
| 220 | + }, | |
| 221 | + "en": { | |
| 222 | + "type": "text", | |
| 223 | + "analyzer": "english" | |
| 224 | + }, | |
| 225 | + "ar": { | |
| 226 | + "type": "text", | |
| 227 | + "analyzer": "arabic" | |
| 228 | + }, | |
| 229 | + ... | |
| 230 | + } | |
| 231 | + }, | |
| 232 | + "description": { | |
| 233 | + "type": "object", | |
| 234 | + "properties": { | |
| 235 | + "zh": { | |
| 236 | + "type": "text", | |
| 237 | + "analyzer": "index_ik", | |
| 238 | + "search_analyzer": "query_ik" | |
| 239 | + }, | |
| 240 | + "en": { | |
| 241 | + "type": "text", | |
| 242 | + "analyzer": "english" | |
| 243 | + }, | |
| 244 | + "ar": { | |
| 245 | + "type": "text", | |
| 246 | + "analyzer": "arabic" | |
| 247 | + }, | |
| 248 | + ... | |
| 249 | + } | |
| 250 | + }, | |
| 251 | + "vendor": { | |
| 252 | + "type": "object", | |
| 253 | + "properties": { | |
| 254 | + "zh": { | |
| 255 | + "type": "text", | |
| 256 | + "analyzer": "index_ik", | |
| 257 | + "search_analyzer": "query_ik" | |
| 258 | + }, | |
| 259 | + "en": { | |
| 260 | + "type": "text", | |
| 261 | + "analyzer": "english", | |
| 262 | + "fields": { | |
| 263 | + "keyword": { | |
| 264 | + "type": "keyword", | |
| 265 | + "normalizer": "lowercase" | |
| 266 | + } | |
| 267 | + } | |
| 268 | + }, | |
| 269 | + "ar": { | |
| 270 | + "type": "text", | |
| 271 | + "analyzer": "arabic", | |
| 272 | + "fields": { | |
| 273 | + "keyword": { | |
| 274 | + "type": "keyword", | |
| 275 | + "normalizer": "lowercase" | |
| 276 | + } | |
| 277 | + } | |
| 278 | + }, | |
| 279 | + ... | |
| 280 | + } | |
| 281 | + }, | |
| 282 | + "image_url": { | |
| 283 | + "type": "keyword", | |
| 284 | + "index": false | |
| 285 | + }, | |
| 286 | + "title_embedding": { | |
| 287 | + "type": "dense_vector", | |
| 288 | + "dims": 1024, | |
| 289 | + "index": true, | |
| 290 | + "similarity": "dot_product", | |
| 291 | + "element_type": "bfloat16" | |
| 292 | + }, | |
| 293 | + "image_embedding": { | |
| 294 | + "type": "nested", | |
| 295 | + "properties": { | |
| 296 | + "vector": { | |
| 297 | + "type": "dense_vector", | |
| 298 | + "dims": 768, | |
| 299 | + "index": true, | |
| 300 | + "similarity": "dot_product", | |
| 301 | + "element_type": "bfloat16" | |
| 302 | + }, | |
| 303 | + "url": { | |
| 304 | + "type": "text" | |
| 305 | + } | |
| 306 | + } | |
| 307 | + }, | |
| 308 | + "category_path": { | |
| 309 | + "type": "object", | |
| 310 | + "properties": { | |
| 311 | + "zh": { | |
| 312 | + "type": "text", | |
| 313 | + "analyzer": "index_ik", | |
| 314 | + "search_analyzer": "query_ik" | |
| 315 | + }, | |
| 316 | + "en": { | |
| 317 | + "type": "text", | |
| 318 | + "analyzer": "english" | |
| 319 | + }, | |
| 320 | + "ar": { | |
| 321 | + "type": "text", | |
| 322 | + "analyzer": "arabic" | |
| 323 | + }, | |
| 324 | + ... | |
| 325 | + } | |
| 326 | + } | |
| 327 | + }, | |
| 328 | + "category_name_text": { | |
| 329 | + "type": "object", | |
| 330 | + "properties": { | |
| 331 | + "zh": { | |
| 332 | + "type": "text", | |
| 333 | + "analyzer": "index_ik", | |
| 334 | + "search_analyzer": "query_ik" | |
| 335 | + }, | |
| 336 | + "en": { | |
| 337 | + "type": "text", | |
| 338 | + "analyzer": "english" | |
| 339 | + }, | |
| 340 | + "ar": { | |
| 341 | + "type": "text", | |
| 342 | + "analyzer": "arabic" | |
| 343 | + }, | |
| 344 | + ... | |
| 345 | + | |
| 346 | + } | |
| 347 | + }, | |
| 348 | + "qanchors": { | |
| 349 | + "type": "object", | |
| 350 | + "properties": { | |
| 351 | + "zh": { | |
| 352 | + "type": "text", | |
| 353 | + "analyzer": "index_ik", | |
| 354 | + "search_analyzer": "query_ik" | |
| 355 | + }, | |
| 356 | + "en": { | |
| 357 | + "type": "text", | |
| 358 | + "analyzer": "english" | |
| 359 | + } | |
| 360 | + } | |
| 361 | + }, | |
| 362 | + "tags": { | |
| 363 | + "type": "object", | |
| 364 | + "properties": { | |
| 365 | + "zh": { | |
| 366 | + "type": "text", | |
| 367 | + "analyzer": "index_ik", | |
| 368 | + "search_analyzer": "query_ik", | |
| 369 | + "fields": { | |
| 370 | + "keyword": { | |
| 371 | + "type": "keyword", | |
| 372 | + "normalizer": "lowercase" | |
| 373 | + } | |
| 374 | + } | |
| 375 | + }, | |
| 376 | + "en": { | |
| 377 | + "type": "text", | |
| 378 | + "analyzer": "english", | |
| 379 | + "fields": { | |
| 380 | + "keyword": { | |
| 381 | + "type": "keyword", | |
| 382 | + "normalizer": "lowercase" | |
| 383 | + } | |
| 384 | + } | |
| 385 | + } | |
| 386 | + } | |
| 387 | + }, | |
| 388 | + "category_id": { | |
| 389 | + "type": "keyword" | |
| 390 | + }, | |
| 391 | + "category_name": { | |
| 392 | + "type": "keyword" | |
| 393 | + }, | |
| 394 | + "category_level": { | |
| 395 | + "type": "integer" | |
| 396 | + }, | |
| 397 | + "category1_name": { | |
| 398 | + "type": "keyword" | |
| 399 | + }, | |
| 400 | + "category2_name": { | |
| 401 | + "type": "keyword" | |
| 402 | + }, | |
| 403 | + "category3_name": { | |
| 404 | + "type": "keyword" | |
| 405 | + }, | |
| 406 | + "specifications": { | |
| 407 | + "type": "nested", | |
| 408 | + "properties": { | |
| 409 | + "sku_id": { | |
| 410 | + "type": "keyword" | |
| 411 | + }, | |
| 412 | + "name": { | |
| 413 | + "type": "keyword" | |
| 414 | + }, | |
| 415 | + "value": { | |
| 416 | + "type": "object", | |
| 417 | + "properties": { | |
| 418 | + "zh": { | |
| 419 | + "type": "text", | |
| 420 | + "analyzer": "index_ik", | |
| 421 | + "search_analyzer": "query_ik", | |
| 422 | + "fields": { | |
| 423 | + "keyword": { | |
| 424 | + "type": "keyword", | |
| 425 | + "normalizer": "lowercase" | |
| 426 | + } | |
| 427 | + } | |
| 428 | + }, | |
| 429 | + "en": { | |
| 430 | + "type": "text", | |
| 431 | + "analyzer": "english", | |
| 432 | + "fields": { | |
| 433 | + "keyword": { | |
| 434 | + "type": "keyword", | |
| 435 | + "normalizer": "lowercase" | |
| 436 | + } | |
| 437 | + } | |
| 438 | + } | |
| 439 | + } | |
| 440 | + } | |
| 441 | + } | |
| 442 | + }, | |
| 443 | + "enriched_attributes": { | |
| 444 | + "type": "nested", | |
| 445 | + "properties": { | |
| 446 | + "name": { | |
| 447 | + "type": "keyword" | |
| 448 | + }, | |
| 449 | + "value": { | |
| 450 | + "type": "object", | |
| 451 | + "properties": { | |
| 452 | + "zh": { | |
| 453 | + "type": "text", | |
| 454 | + "analyzer": "index_ik", | |
| 455 | + "search_analyzer": "query_ik", | |
| 456 | + "fields": { | |
| 457 | + "keyword": { | |
| 458 | + "type": "keyword", | |
| 459 | + "normalizer": "lowercase" | |
| 460 | + } | |
| 461 | + } | |
| 462 | + }, | |
| 463 | + "en": { | |
| 464 | + "type": "text", | |
| 465 | + "analyzer": "english", | |
| 466 | + "fields": { | |
| 467 | + "keyword": { | |
| 468 | + "type": "keyword", | |
| 469 | + "normalizer": "lowercase" | |
| 470 | + } | |
| 471 | + } | |
| 472 | + } | |
| 473 | + } | |
| 474 | + } | |
| 475 | + } | |
| 476 | + }, | |
| 477 | + "option1_name": { | |
| 478 | + "type": "keyword" | |
| 479 | + }, | |
| 480 | + "option2_name": { | |
| 481 | + "type": "keyword" | |
| 482 | + }, | |
| 483 | + "option3_name": { | |
| 484 | + "type": "keyword" | |
| 485 | + }, | |
| 486 | + "option1_values": { | |
| 487 | + "type": "object", | |
| 488 | + "properties": { | |
| 489 | + "zh": { | |
| 490 | + "type": "text", | |
| 491 | + "analyzer": "index_ik", | |
| 492 | + "search_analyzer": "query_ik", | |
| 493 | + "fields": { | |
| 494 | + "keyword": { | |
| 495 | + "type": "keyword", | |
| 496 | + "normalizer": "lowercase" | |
| 497 | + } | |
| 498 | + } | |
| 499 | + }, | |
| 500 | + "en": { | |
| 501 | + "type": "text", | |
| 502 | + "analyzer": "english", | |
| 503 | + "fields": { | |
| 504 | + "keyword": { | |
| 505 | + "type": "keyword", | |
| 506 | + "normalizer": "lowercase" | |
| 507 | + } | |
| 508 | + } | |
| 509 | + } | |
| 510 | + } | |
| 511 | + }, | |
| 512 | + "option2_values": { | |
| 513 | + "type": "object", | |
| 514 | + "properties": { | |
| 515 | + "zh": { | |
| 516 | + "type": "text", | |
| 517 | + "analyzer": "index_ik", | |
| 518 | + "search_analyzer": "query_ik", | |
| 519 | + "fields": { | |
| 520 | + "keyword": { | |
| 521 | + "type": "keyword", | |
| 522 | + "normalizer": "lowercase" | |
| 523 | + } | |
| 524 | + } | |
| 525 | + }, | |
| 526 | + "en": { | |
| 527 | + "type": "text", | |
| 528 | + "analyzer": "english", | |
| 529 | + "fields": { | |
| 530 | + "keyword": { | |
| 531 | + "type": "keyword", | |
| 532 | + "normalizer": "lowercase" | |
| 533 | + } | |
| 534 | + } | |
| 535 | + } | |
| 536 | + } | |
| 537 | + }, | |
| 538 | + "option3_values": { | |
| 539 | + "type": "object", | |
| 540 | + "properties": { | |
| 541 | + "zh": { | |
| 542 | + "type": "text", | |
| 543 | + "analyzer": "index_ik", | |
| 544 | + "search_analyzer": "query_ik", | |
| 545 | + "fields": { | |
| 546 | + "keyword": { | |
| 547 | + "type": "keyword", | |
| 548 | + "normalizer": "lowercase" | |
| 549 | + } | |
| 550 | + } | |
| 551 | + }, | |
| 552 | + "en": { | |
| 553 | + "type": "text", | |
| 554 | + "analyzer": "english", | |
| 555 | + "fields": { | |
| 556 | + "keyword": { | |
| 557 | + "type": "keyword", | |
| 558 | + "normalizer": "lowercase" | |
| 559 | + } | |
| 560 | + } | |
| 561 | + } | |
| 562 | + } | |
| 563 | + }, | |
| 564 | + "min_price": { | |
| 565 | + "type": "float" | |
| 566 | + }, | |
| 567 | + "max_price": { | |
| 568 | + "type": "float" | |
| 569 | + }, | |
| 570 | + "compare_at_price": { | |
| 571 | + "type": "float" | |
| 572 | + }, | |
| 573 | + "sku_prices": { | |
| 574 | + "type": "float" | |
| 575 | + }, | |
| 576 | + "sku_weights": { | |
| 577 | + "type": "long" | |
| 578 | + }, | |
| 579 | + "sku_weight_units": { | |
| 580 | + "type": "keyword" | |
| 581 | + }, | |
| 582 | + "total_inventory": { | |
| 583 | + "type": "long" | |
| 584 | + }, | |
| 585 | + "sales": { | |
| 586 | + "type": "long" | |
| 587 | + }, | |
| 588 | + "skus": { | |
| 589 | + "type": "nested", | |
| 590 | + "properties": { | |
| 591 | + "sku_id": { | |
| 592 | + "type": "keyword" | |
| 593 | + }, | |
| 594 | + "price": { | |
| 595 | + "type": "float" | |
| 596 | + }, | |
| 597 | + "compare_at_price": { | |
| 598 | + "type": "float" | |
| 599 | + }, | |
| 600 | + "sku_code": { | |
| 601 | + "type": "keyword" | |
| 602 | + }, | |
| 603 | + "stock": { | |
| 604 | + "type": "long" | |
| 605 | + }, | |
| 606 | + "weight": { | |
| 607 | + "type": "float" | |
| 608 | + }, | |
| 609 | + "weight_unit": { | |
| 610 | + "type": "keyword" | |
| 611 | + }, | |
| 612 | + "option1_value": { | |
| 613 | + "type": "keyword" | |
| 614 | + }, | |
| 615 | + "option2_value": { | |
| 616 | + "type": "keyword" | |
| 617 | + }, | |
| 618 | + "option3_value": { | |
| 619 | + "type": "keyword" | |
| 620 | + }, | |
| 621 | + "image_src": { | |
| 622 | + "type": "keyword", | |
| 623 | + "index": false | |
| 624 | + } | |
| 625 | + } | |
| 626 | + } | |
| 627 | + } | |
| 628 | + } | |
| 629 | +} | |
| 0 | 630 | \ No newline at end of file | ... | ... |
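The `option1_values`, `option2_values`, and `option3_values` blocks in the JSON above are identical: each is an `object` with `zh` and `en` text sub-fields, and each sub-field has a lowercase-normalized `keyword` multi-field. This matches what the README calls the `core_language_text_with_keyword` template. The following is a minimal sketch of how such a template could be expanded. It is not the actual contents of `generate_search_products_mapping.py`; the names `CORE_LANGUAGE_ANALYZERS` and `keyword_subfield` and the function layout are assumptions made for illustration.

```python
import json

# Assumed analyzer table, taken from the zh/en settings visible in the mapping above.
CORE_LANGUAGE_ANALYZERS = {
    "zh": {"analyzer": "index_ik", "search_analyzer": "query_ik"},
    "en": {"analyzer": "english"},
}


def keyword_subfield():
    """Return a fresh `keyword` multi-field with the lowercase normalizer."""
    return {"keyword": {"type": "keyword", "normalizer": "lowercase"}}


def core_language_text_with_keyword():
    """Build an object field with one text sub-field per core language."""
    properties = {}
    for lang, analyzers in CORE_LANGUAGE_ANALYZERS.items():
        field = {"type": "text", **analyzers}
        field["fields"] = keyword_subfield()
        properties[lang] = field
    return {"type": "object", "properties": properties}


# Hypothetical usage: expand the template once per option field.
option_fields = {
    f"option{i}_values": core_language_text_with_keyword() for i in (1, 2, 3)
}

if __name__ == "__main__":
    print(json.dumps(option_fields["option1_values"], indent=2))
```

Each call builds a fresh dictionary, so a later change to one field's definition does not leak into the other two.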