Commit fca871fb17bf0366c5e2c324a76c4629c32dd729

Authored by tangwang
1 parent 36cf0ef9

Modify index fields

mappings/README.md
... ... @@ -2,32 +2,280 @@
2 2  
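3 3 ## Overview
4 4  
5   -All tenants share the same ES mapping structure. It is maintained directly as a hand-written JSON file and does not need to be generated from config.yaml.
  5 +All tenants share the same Elasticsearch mapping structure.
6 6  
7   -## Mapping files
  7 +This directory maintains the `search_products` index definition as a "declarative Python spec + field templates + final JSON artifact":
8 8  
9   -- `search_products.json`: the complete ES index configuration, including settings and mappings
  9 +- `generate_search_products_mapping.py`: the single source of truth for generation; contains the field templates, language list, analyzer configuration, and recursive generation logic
  10 +- `search_products.json`: the complete ES index configuration generated by the script, including `settings` and `mappings`
  11 +- `search_suggestions.json`: the index configuration for search suggestions
10 12  
11   -## Usage
  13 +By default, change the spec definitions in the generation script rather than editing `search_products.json` by hand.
12 14  
13   -### Create the index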
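  15 +## Field abstraction
  16 +
  17 +The script defines four text templates based on business semantics:
  18 +
  19 +- `all_language_text`: all-language field, without `keyword`
  20 +- `all_language_text_with_keyword`: all-language field; every supported language gets a `keyword` sub-field
  21 +- `core_language_text`: core-index-language field, without `keyword`
  22 +- `core_language_text_with_keyword`: core-index-language field; each core language gets a `keyword` sub-field
  23 +
  24 +"Core index languages" does not mean the system supports only two languages. It means every shop and every product must produce index content in at least these two languages. The core index languages are currently fixed as:
  25 +
  26 +- `zh`
  27 +- `en`
  28 +
  29 +"All-language" means the mapping reserves additional language slots for the product's original language. When a product is ingested, a field does not need every language filled in. The only requirements are:
  30 +
  31 +- Core-index-language fields must populate `zh` and `en`
  32 +- All-language fields must populate `zh` and `en`
  33 +- If the product's original language is a supported language, the matching original-language sub-field should also be populated, for example `ru`
  34 +
  35 +The current fields fall roughly into these groups:
  36 +
  37 +- All-language fields: `title`, `keywords`, `brief`, `description`, `vendor`, `category_path`, `category_name_text`, `specifications.value`
  38 +- Core-index-language fields: `qanchors`, `tags`, `option1_values`, `option2_values`, `option3_values`, `enriched_attributes.value`
  39 +- Composite nested fields: `image_embedding`, `specifications`, `enriched_attributes`, `skus`
  40 +- Other scalar fields: `tenant_id`, `spu_id`, prices, inventory, categories, and so on
  41 +
  42 +Basic constraints in the generation rules:
  43 +
  44 +- Chinese fields use `index_ik` and additionally set `search_analyzer: query_ik`
  45 +- Non-Chinese languages use their respective built-in Elasticsearch analyzers
  46 +- Templates with `with_keyword` add a `.keyword` sub-field for each language they cover
  47 +- `settings.analysis`, `normalizer`, and `similarity` are also part of the generated output, so maintaining only `mappings.properties` is not enough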
  48 +
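As a rough illustration of the constraints above, the sketch below mirrors the `build_text_field` logic from `generate_search_products_mapping.py`. The three-entry `ANALYZERS` table is a reduced stand-in used only for brevity; the real script lists every supported language.

```python
import json
from typing import Any

# Reduced analyzer table for illustration only (assumption: the real table is much longer).
ANALYZERS = {"zh": "index_ik", "en": "english", "ru": "russian"}


def build_text_field(language: str, *, add_keyword: bool) -> dict[str, Any]:
    """Render the mapping for a single language sub-field."""
    field: dict[str, Any] = {"type": "text", "analyzer": ANALYZERS[language]}
    if language == "zh":
        # Chinese is indexed with index_ik but queried with query_ik.
        field["search_analyzer"] = "query_ik"
    if add_keyword:
        # `*_with_keyword` templates add a lowercase-normalized `.keyword` sub-field.
        field["fields"] = {"keyword": {"type": "keyword", "normalizer": "lowercase"}}
    return field


zh_field = build_text_field("zh", add_keyword=True)
ru_field = build_text_field("ru", add_keyword=False)
print(json.dumps({"zh": zh_field, "ru": ru_field}, indent=2))
```

Only the `zh` sub-field carries a separate `search_analyzer`; the `.keyword` sub-field depends solely on the template's `with_keyword` flag.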
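  49 +## Index ingestion guide
  50 +
  51 +### Basic principles
  52 +
  53 +1. Every product must produce the core index language versions, that is, `zh` and `en`.
  54 +2. Besides the required `zh` and `en`, all-language fields should keep the product's original-language version wherever possible.
  55 +3. If the product's original language is itself `zh` or `en`, write the original text directly into the matching sub-field and fill in the other core language through translation.
  56 +4. If the product's original language is a supported non-core language such as `ru`, write both the original-language sub-field and the `zh/en` translations.
  57 +5. If a value is empty, do not write fabricated content; clean the data upstream and then decide whether to skip the field.
  58 +
  59 +### Core-index-language fields
  60 +
  61 +The goal of these fields is to guarantee that every product can be found through at least Chinese and English search. Whatever the product's original language, translation or normalization should produce both a `zh` and an `en` value.
  62 +
  63 +Typical fields: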
  64 +
  65 +- `qanchors`
  66 +- `tags`
  67 +- `option1_values`
  68 +- `option2_values`
  69 +- `option3_values`
  70 +- `enriched_attributes.value`
  71 +
  72 +Taking `category_path` and `option*_values` as examples, the ingested core-language result should contain at least:
  73 +
  74 +- `category_path.zh`
  75 +- `category_path.en`
  76 +- `option1_values.zh`
  77 +- `option1_values.en`
  78 +- `option2_values.zh`
  79 +- `option2_values.en`
  80 +- `option3_values.zh`
  81 +- `option3_values.en`
  82 +
  83 +Example: the product's original language is Russian and the original `option1_values` is `красный, синий`
  84 +
  85 +```json
  86 +{
  87 + "option1_values": {
  88 + "zh": "红色, 蓝色",
  89 + "en": "red, blue"
  90 + }
  91 +}
  92 +```
  93 +
  94 +Example: the product's original language is Russian and the category path is `Одежда > Женская одежда > Куртки`
  95 +
  96 +```json
  97 +{
  98 + "category_path": {
  99 + "zh": "服饰 > 女装 > 夹克",
  100 + "en": "Apparel > Women's Clothing > Jackets",
  101 + "ru": "Одежда > Женская одежда > Куртки"
  102 + }
  103 +}
  104 +```
  105 +
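  106 +Note: `category_path` is an all-language field in the mapping, but the ingestion rules still require `zh/en` for it.
  107 +
  108 +### All-language fields
  109 +
  110 +These fields must keep the two core index languages `zh/en` available and should also preserve the product's original language wherever possible, which supports recall in the original language and more natural search.
  111 +
  112 +Typical fields: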
  113 +
  114 +- `title`
  115 +- `keywords`
  116 +- `brief`
  117 +- `description`
  118 +- `vendor`
  119 +- `category_path`
  120 +- `category_name_text`
  121 +- `specifications.value`
  122 +
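  123 +Ingestion rules:
  124 +
  125 +1. Determine the product's original language, for example `ru`
  126 +2. Write the original text into the matching language sub-field, for example `title.ru`
  127 +3. Translate the original text into `zh` and `en`
  128 +4. Write the translations into `title.zh` and `title.en` respectively
  129 +
  130 +Example: the product's original language is Russian and the title is `Женская зимняя куртка`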
  131 +
  132 +```json
  133 +{
  134 + "title": {
  135 + "zh": "女士冬季夹克",
  136 + "en": "Women's winter jacket",
  137 + "ru": "Женская зимняя куртка"
  138 + }
  139 +}
  140 +```
  141 +
  142 +Example: the product's original language is Russian and the category name is `Женские куртки`
  143 +
  144 +```json
  145 +{
  146 + "category_name_text": {
  147 + "zh": "女式夹克",
  148 + "en": "Women's jackets",
  149 + "ru": "Женские куртки"
  150 + }
  151 +}
  152 +```
  153 +
  154 +Example: the spec value `specifications.value`
  155 +
  156 +```json
  157 +{
  158 + "specifications": [
  159 + {
  160 + "sku_id": "sku-red-s",
  161 + "name": "color",
  162 + "value": {
  163 + "zh": "红色",
  164 + "en": "red",
  165 + "ru": "красный"
  166 + }
  167 + }
  168 + ]
  169 +}
  170 +```
  171 +
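  172 +### When the original language is Chinese or English
  173 +
  174 +If the original language is already one of the core index languages, there is no need to write a third language sub-field.
  175 +
  176 +Example: the original language is Chinese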
  177 +
  178 +```json
  179 +{
  180 + "title": {
  181 + "zh": "女士冬季夹克",
  182 + "en": "Women's winter jacket"
  183 + },
  184 + "option1_values": {
  185 + "zh": "红色, 蓝色",
  186 + "en": "red, blue"
  187 + }
  188 +}
  189 +```
  190 +
  191 +Example: the original language is English
  192 +
  193 +```json
  194 +{
  195 + "title": {
  196 + "zh": "女士冬季夹克",
  197 + "en": "Women's winter jacket"
  198 + },
  199 + "vendor": {
  200 + "zh": "北境服饰",
  201 + "en": "Northern Apparel"
  202 + }
  203 +}
  204 +```
  205 +
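  206 +### How to ingest each kind of field
  207 +
  208 +The following is one way to reason about and implement ingestion:
  209 +
  210 +- Scalar fields: write the value directly, for example `tenant_id`, `spu_id`, `min_price`
  211 +- Core-index-language fields: generate only `zh/en`
  212 +- All-language fields: generate `zh/en`, then add one sub-field for the original language
  213 +- Nested fields: apply the same rules inside each element, for example `specifications[].value`
  214 +
  215 +### Recommended ingestion flow
  216 +
  217 +1. Identify the product's original language
  218 +2. Extract the original title, description, category, specifications, attributes, option values, and other fields
  219 +3. Generate the `zh` and `en` core index language content
  220 +4. For all-language fields, if the original language is supported, also write the original-language sub-field
  221 +5. Assemble the final ES document and write it to the index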
  222 +
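The flow above could be sketched roughly as follows. This is a minimal illustration, not code from the repository: `build_language_field`, its parameters, and the reduced `SUPPORTED_LANGUAGES` set are all hypothetical, and the translations are assumed to have been produced upstream.

```python
from typing import Mapping

CORE_INDEX_LANGUAGES = ("zh", "en")
# Hypothetical subset; the authoritative list lives in the generator script.
SUPPORTED_LANGUAGES = {"zh", "en", "ru", "de", "fr"}


def build_language_field(
    translations: Mapping[str, str],
    *,
    source_language: str,
    source_text: str,
    all_language: bool,
) -> dict[str, str]:
    """Assemble the per-language object for one text field."""
    values = dict(translations)
    if source_language in CORE_INDEX_LANGUAGES:
        # The original text is written as-is into its own core sub-field.
        values[source_language] = source_text

    missing = [lang for lang in CORE_INDEX_LANGUAGES if not values.get(lang)]
    if missing:
        # Never fabricate content; let the caller decide how to handle gaps.
        raise ValueError(f"missing core index languages: {missing}")

    field = {lang: values[lang] for lang in CORE_INDEX_LANGUAGES}
    if all_language and source_language in SUPPORTED_LANGUAGES:
        # All-language fields also keep the original-language text.
        field[source_language] = source_text
    return field


title = build_language_field(
    {"zh": "女士冬季夹克", "en": "Women's winter jacket"},
    source_language="ru",
    source_text="Женская зимняя куртка",
    all_language=True,
)
option1_values = build_language_field(
    {"zh": "红色, 蓝色", "en": "red, blue"},
    source_language="ru",
    source_text="красный, синий",
    all_language=False,
)
print(title)
print(option1_values)
```

Raising on a missing core language, instead of silently writing an empty string, is one way to honor the rule that empty values must not be filled with fabricated content.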
  223 +## Generating the mapping
  224 +
  225 +Run from the repository root:
  226 +
  227 +```bash
  228 +source activate.sh
  229 +python mappings/generate_search_products_mapping.py > mappings/search_products.json
  230 +```
  231 +
  232 +If you only want to view the output without overwriting the file:
  233 +
  234 +```bash
  235 +source activate.sh
  236 +python mappings/generate_search_products_mapping.py
  237 +```
  238 +
  239 +To generate into a temporary file first:
  240 +
  241 +```bash
  242 +source activate.sh
  243 +python mappings/generate_search_products_mapping.py > mappings/search_products.generated.json
  244 +```
  245 +
  246 +## Validating the mapping
  247 +
  248 +Confirm that the current `search_products.json` matches the generation rules exactly:
  249 +
  250 +```bash
  251 +source activate.sh
  252 +python mappings/generate_search_products_mapping.py --check mappings/search_products.json
  253 +```
  254 +
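Because `--check` compares the file content with the generated string exactly, formatting-only differences count as a mismatch. The sketch below reproduces that comparison on a toy mapping; `toy_mapping`, `render`, `matches`, and the temporary files are illustrative and not part of the repository. The generator prints without a trailing newline, so a file re-saved by an editor that appends one would likely fail the check.

```python
import json
import tempfile
from pathlib import Path


def render(mapping: dict) -> str:
    # Same serialization settings as render_mapping() in the generator script.
    return json.dumps(mapping, indent=2, ensure_ascii=False)


def matches(path: Path, mapping: dict) -> bool:
    # Exact string comparison, as in the script's --check branch.
    return path.read_text(encoding="utf-8") == render(mapping)


toy_mapping = {"mappings": {"properties": {"tenant_id": {"type": "keyword"}}}}

with tempfile.TemporaryDirectory() as tmp:
    exact = Path(tmp) / "exact.json"
    exact.write_text(render(toy_mapping), encoding="utf-8")

    with_newline = Path(tmp) / "with_newline.json"
    with_newline.write_text(render(toy_mapping) + "\n", encoding="utf-8")

    exact_ok = matches(exact, toy_mapping)
    newline_ok = matches(with_newline, toy_mapping)

print(exact_ok, newline_ok)
```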
  255 +## Creating the index
14 256  
15 257 ```python
16 258 from indexer.mapping_generator import load_mapping, create_index_if_not_exists
17 259 from utils.es_client import ESClient
18 260  
19 261 es_client = ESClient(hosts=["http://localhost:9200"])
20   -mapping = load_mapping() # loaded from mappings/search_products.json
  262 +mapping = load_mapping()
21 263 create_index_if_not_exists(es_client, "search_products", mapping)
22 264 ```
23 265  
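24   -### Modifying the mapping
  266 +## Modifying the mapping
  267 +
  268 +Recommended flow:
  269 +
  270 +1. Edit `mappings/generate_search_products_mapping.py`
  271 +2. Regenerate `mappings/search_products.json`
  272 +3. Use `--check` or a diff to confirm the change is as expected
  273 +4. Recreate the index and reimport the data
25 274  
26   -Edit the `mappings/search_products.json` file directly, then recreate the index.
  275 +Note: Elasticsearch does not support changing the mapping type of an existing field in place; only new fields can be added. To change a field's type:
27 276  
28   -Note: ES does not support changing the mapping type of an existing field; only new fields can be added. To change a field's type:
29 277 1. Delete the old index
30   -2. Create the index with the new mapping
  278 +2. Create the index with the new mapping
31 279 3. Reimport the data
32 280  
33 281 ## Field reference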
... ...
mappings/generate_search_products_mapping.py 0 → 100644
... ... @@ -0,0 +1,355 @@
  1 +#!/usr/bin/env python3
  2 +from __future__ import annotations
  3 +
  4 +import argparse
  5 +import json
  6 +from pathlib import Path
  7 +from typing import Any
  8 +
  9 +ALL_LANGUAGE_CODES = [
  10 + "zh",
  11 + "en",
  12 + "ar",
  13 + "hy",
  14 + "eu",
  15 + "pt_br",
  16 + "bg",
  17 + "ca",
  18 + "cjk",
  19 + "cs",
  20 + "da",
  21 + "nl",
  22 + "fi",
  23 + "fr",
  24 + "gl",
  25 + "de",
  26 + "el",
  27 + "hi",
  28 + "hu",
  29 + "id",
  30 + "it",
  31 + "no",
  32 + "fa",
  33 + "pt",
  34 + "ro",
  35 + "ru",
  36 + "es",
  37 + "sv",
  38 + "tr",
  39 + "th",
  40 +]
  41 +
  42 +CORE_INDEX_LANGUAGES = ["zh", "en"]
  43 +
  44 +LANGUAGE_GROUPS = {
  45 + "all": ALL_LANGUAGE_CODES,
  46 + "core": CORE_INDEX_LANGUAGES,
  47 +}
  48 +
  49 +ANALYZERS = {
  50 + "zh": "index_ik",
  51 + "en": "english",
  52 + "ar": "arabic",
  53 + "hy": "armenian",
  54 + "eu": "basque",
  55 + "pt_br": "brazilian",
  56 + "bg": "bulgarian",
  57 + "ca": "catalan",
  58 + "cjk": "cjk",
  59 + "cs": "czech",
  60 + "da": "danish",
  61 + "nl": "dutch",
  62 + "fi": "finnish",
  63 + "fr": "french",
  64 + "gl": "galician",
  65 + "de": "german",
  66 + "el": "greek",
  67 + "hi": "hindi",
  68 + "hu": "hungarian",
  69 + "id": "indonesian",
  70 + "it": "italian",
  71 + "no": "norwegian",
  72 + "fa": "persian",
  73 + "pt": "portuguese",
  74 + "ro": "romanian",
  75 + "ru": "russian",
  76 + "es": "spanish",
  77 + "sv": "swedish",
  78 + "tr": "turkish",
  79 + "th": "thai",
  80 +}
  81 +
  82 +SETTINGS = {
  83 + "number_of_shards": 1,
  84 + "number_of_replicas": 0,
  85 + "refresh_interval": "30s",
  86 + "analysis": {
  87 + "analyzer": {
  88 + "index_ik": {
  89 + "type": "custom",
  90 + "tokenizer": "ik_max_word",
  91 + "filter": ["lowercase", "asciifolding"],
  92 + },
  93 + "query_ik": {
  94 + "type": "custom",
  95 + "tokenizer": "ik_smart",
  96 + "filter": ["lowercase", "asciifolding"],
  97 + },
  98 + },
  99 + "normalizer": {
  100 + "lowercase": {
  101 + "type": "custom",
  102 + "filter": ["lowercase"],
  103 + }
  104 + },
  105 + },
  106 + "similarity": {
  107 + "default": {
  108 + "type": "BM25",
  109 + "b": 0.0,
  110 + "k1": 0.0,
  111 + }
  112 + },
  113 +}
  114 +
  115 +TEXT_FIELD_TEMPLATES = {
  116 + "all_language_text": {
  117 + "language_group": "all",
  118 + "with_keyword": False,
  119 + },
  120 + "all_language_text_with_keyword": {
  121 + "language_group": "all",
  122 + "with_keyword": True,
  123 + },
  124 + "core_language_text": {
  125 + "language_group": "core",
  126 + "with_keyword": False,
  127 + },
  128 + "core_language_text_with_keyword": {
  129 + "language_group": "core",
  130 + "with_keyword": True,
  131 + },
  132 +}
  133 +
  134 +
  135 +def scalar_field(name: str, field_type: str, **extra: Any) -> dict[str, Any]:
  136 +    spec = {
  137 +        "name": name,
  138 +        "kind": "scalar",
  139 +        "type": field_type,
  140 +    }
  141 +    if extra:
  142 +        spec["extra"] = extra
  143 +    return spec
  144 +
  145 +
  146 +def text_field(name: str, template: str) -> dict[str, Any]:
  147 +    return {
  148 +        "name": name,
  149 +        "kind": "text",
  150 +        "template": template,
  151 +    }
  152 +
  153 +
  154 +def nested_field(name: str, *fields: dict[str, Any]) -> dict[str, Any]:
  155 +    return {
  156 +        "name": name,
  157 +        "kind": "nested",
  158 +        "fields": list(fields),
  159 +    }
  160 +
  161 +TEXT_EMBEDDING_SIZE = 1024
  162 +IMAGE_EMBEDDING_SIZE = 768
  163 +
  164 +FIELD_SPECS = [
  165 + scalar_field("tenant_id", "keyword"),
  166 + scalar_field("spu_id", "keyword"),
  167 + scalar_field("create_time", "date"),
  168 + scalar_field("update_time", "date"),
  169 + text_field("title", "all_language_text"),
  170 + text_field("keywords", "all_language_text_with_keyword"),
  171 + text_field("brief", "all_language_text"),
  172 + text_field("description", "all_language_text"),
  173 + text_field("vendor", "all_language_text_with_keyword"),
  174 + scalar_field("image_url", "keyword", index=False),
  175 + scalar_field(
  176 + "title_embedding",
  177 + "dense_vector",
  178 + dims=TEXT_EMBEDDING_SIZE,
  179 + index=True,
  180 + similarity="dot_product",
  181 + element_type="bfloat16",
  182 + ),
  183 + nested_field(
  184 + "image_embedding",
  185 + scalar_field(
  186 + "vector",
  187 + "dense_vector",
  188 + dims=IMAGE_EMBEDDING_SIZE,
  189 + index=True,
  190 + similarity="dot_product",
  191 + element_type="bfloat16",
  192 + ),
  193 + scalar_field("url", "text"),
  194 + ),
  195 + text_field("category_path", "all_language_text_with_keyword"),
  196 + text_field("category_name_text", "all_language_text_with_keyword"),
  197 + text_field("qanchors", "core_language_text"),
  198 + text_field("tags", "core_language_text_with_keyword"),
  199 + scalar_field("category_id", "keyword"),
  200 + scalar_field("category_name", "keyword"),
  201 + scalar_field("category_level", "integer"),
  202 + scalar_field("category1_name", "keyword"),
  203 + scalar_field("category2_name", "keyword"),
  204 + scalar_field("category3_name", "keyword"),
  205 + nested_field(
  206 + "specifications",
  207 + scalar_field("sku_id", "keyword"),
  208 + scalar_field("name", "keyword"),
  209 + scalar_field("value_keyword", "keyword"),
  210 + text_field("value_text", "core_language_text_with_keyword"),
  211 + ),
  212 + nested_field(
  213 + "enriched_attributes",
  214 + scalar_field("name", "keyword"),
  215 + text_field("value", "core_language_text_with_keyword"),
  216 + ),
  217 + scalar_field("option1_name", "keyword"),
  218 + scalar_field("option2_name", "keyword"),
  219 + scalar_field("option3_name", "keyword"),
  220 + text_field("option1_values", "core_language_text_with_keyword"),
  221 + text_field("option2_values", "core_language_text_with_keyword"),
  222 + text_field("option3_values", "core_language_text_with_keyword"),
  223 + scalar_field("min_price", "float"),
  224 + scalar_field("max_price", "float"),
  225 + scalar_field("compare_at_price", "float"),
  226 + scalar_field("sku_prices", "float"),
  227 + scalar_field("sku_weights", "long"),
  228 + scalar_field("sku_weight_units", "keyword"),
  229 + scalar_field("total_inventory", "long"),
  230 + scalar_field("sales", "long"),
  231 + nested_field(
  232 + "skus",
  233 + scalar_field("sku_id", "keyword"),
  234 + scalar_field("price", "float"),
  235 + scalar_field("compare_at_price", "float"),
  236 + scalar_field("sku_code", "keyword"),
  237 + scalar_field("stock", "long"),
  238 + scalar_field("weight", "float"),
  239 + scalar_field("weight_unit", "keyword"),
  240 + scalar_field("option1_value", "keyword"),
  241 + scalar_field("option2_value", "keyword"),
  242 + scalar_field("option3_value", "keyword"),
  243 + scalar_field("image_src", "keyword", index=False),
  244 + ),
  245 +]
  246 +
  247 +
  248 +def build_keyword_fields() -> dict[str, Any]:
  249 +    return {
  250 +        "keyword": {
  251 +            "type": "keyword",
  252 +            "normalizer": "lowercase",
  253 +        }
  254 +    }
  255 +
  256 +
  257 +def build_text_field(language: str, *, add_keyword: bool) -> dict[str, Any]:
  258 +    field = {
  259 +        "type": "text",
  260 +        "analyzer": ANALYZERS[language],
  261 +    }
  262 +    if language == "zh":
  263 +        field["search_analyzer"] = "query_ik"
  264 +    if add_keyword:
  265 +        field["fields"] = build_keyword_fields()
  266 +    return field
  267 +
  268 +
  269 +def render_field(spec: dict[str, Any]) -> dict[str, Any]:
  270 +    kind = spec["kind"]
  271 +
  272 +    if kind == "scalar":
  273 +        rendered = {"type": spec["type"]}
  274 +        rendered.update(spec.get("extra", {}))
  275 +        return rendered
  276 +
  277 +    if kind == "text":
  278 +        template = TEXT_FIELD_TEMPLATES[spec["template"]]
  279 +        languages = LANGUAGE_GROUPS[template["language_group"]]
  280 +        properties = {}
  281 +        for language in languages:
  282 +            properties[language] = build_text_field(
  283 +                language,
  284 +                add_keyword=template["with_keyword"],
  285 +            )
  286 +        return {
  287 +            "type": "object",
  288 +            "properties": properties,
  289 +        }
  290 +
  291 +    if kind == "nested":
  292 +        properties = {}
  293 +        for child in spec["fields"]:
  294 +            properties[child["name"]] = render_field(child)
  295 +        return {
  296 +            "type": "nested",
  297 +            "properties": properties,
  298 +        }
  299 +
  300 +    raise ValueError(f"Unknown field kind: {kind}")
  301 +
  302 +
  303 +def build_mapping() -> dict[str, Any]:
  304 +    properties = {}
  305 +    for spec in FIELD_SPECS:
  306 +        properties[spec["name"]] = render_field(spec)
  307 +
  308 +    return {
  309 +        "settings": SETTINGS,
  310 +        "mappings": {
  311 +            "properties": properties,
  312 +        },
  313 +    }
  314 +
  315 +
  316 +def render_mapping() -> str:
  317 +    return json.dumps(build_mapping(), indent=2, ensure_ascii=False)
  318 +
  319 +
  320 +def main() -> int:
  321 +    parser = argparse.ArgumentParser(
  322 +        description="Generate mappings/search_products.json from a compact Python spec.",
  323 +    )
  324 +    parser.add_argument(
  325 +        "-o",
  326 +        "--output",
  327 +        type=Path,
  328 +        help="Write the generated mapping to this file. Defaults to stdout.",
  329 +    )
  330 +    parser.add_argument(
  331 +        "--check",
  332 +        type=Path,
  333 +        help="Fail if the generated output does not exactly match this file.",
  334 +    )
  335 +    args = parser.parse_args()
  336 +
  337 +    rendered = render_mapping()
  338 +
  339 +    if args.check is not None:
  340 +        existing = args.check.read_text(encoding="utf-8")
  341 +        if existing != rendered:
  342 +            print(f"Generated mapping does not match {args.check}")
  343 +            return 1
  344 +        print(f"Generated mapping matches {args.check}")
  345 +
  346 +    if args.output is not None:
  347 +        args.output.write_text(rendered, encoding="utf-8")
  348 +    elif args.check is None:
  349 +        print(rendered, end="")
  350 +
  351 +    return 0
  352 +
  353 +
  354 +if __name__ == "__main__":
  355 +    raise SystemExit(main())
... ...
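The generator above builds `properties` by plain dict assignment, so a repeated field name would silently overwrite the earlier entry, and a language code without an `ANALYZERS` entry would only surface as a `KeyError` at render time. A pre-render consistency check might look like the sketch below. `find_spec_problems` is a hypothetical helper that is not part of this commit, and the reduced inputs contain both problems on purpose.

```python
from typing import Any

# Reduced, illustrative inputs: "ru" has no analyzer and "sku_id" is repeated on purpose.
ANALYZERS = {"zh": "index_ik", "en": "english"}
LANGUAGE_GROUPS = {"core": ["zh", "en"], "all": ["zh", "en", "ru"]}
FIELD_SPECS = [
    {"name": "tenant_id", "kind": "scalar", "type": "keyword"},
    {"name": "title", "kind": "text", "template": "all_language_text"},
    {
        "name": "skus",
        "kind": "nested",
        "fields": [
            {"name": "sku_id", "kind": "scalar", "type": "keyword"},
            {"name": "sku_id", "kind": "scalar", "type": "long"},
        ],
    },
]


def find_spec_problems(
    specs: list[dict[str, Any]],
    language_groups: dict[str, list[str]],
    analyzers: dict[str, str],
) -> list[str]:
    """Hypothetical pre-render check; returns human-readable problems."""
    problems: list[str] = []

    # Every language referenced by a group needs an analyzer entry.
    for group, languages in language_groups.items():
        for language in languages:
            if language not in analyzers:
                problems.append(f"group {group!r}: no analyzer for {language!r}")

    def walk(fields: list[dict[str, Any]], prefix: str) -> None:
        # Field names must be unique within each nesting level.
        seen: set[str] = set()
        for spec in fields:
            path = prefix + spec["name"]
            if spec["name"] in seen:
                problems.append(f"duplicate field name: {path}")
            seen.add(spec["name"])
            if spec["kind"] == "nested":
                walk(spec["fields"], path + ".")

    walk(specs, "")
    return problems


problems = find_spec_problems(FIELD_SPECS, LANGUAGE_GROUPS, ANALYZERS)
for problem in problems:
    print(problem)
```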
mappings/search_products.json
... ... @@ -185,7 +185,13 @@
185 185 "zh": {
186 186 "type": "text",
187 187 "analyzer": "index_ik",
188   - "search_analyzer": "query_ik"
  188 + "search_analyzer": "query_ik",
  189 + "fields": {
  190 + "keyword": {
  191 + "type": "keyword",
  192 + "normalizer": "lowercase"
  193 + }
  194 + }
189 195 },
190 196 "en": {
191 197 "type": "text",
... ... @@ -737,7 +743,13 @@
737 743 "zh": {
738 744 "type": "text",
739 745 "analyzer": "index_ik",
740   - "search_analyzer": "query_ik"
  746 + "search_analyzer": "query_ik",
  747 + "fields": {
  748 + "keyword": {
  749 + "type": "keyword",
  750 + "normalizer": "lowercase"
  751 + }
  752 + }
741 753 },
742 754 "en": {
743 755 "type": "text",
... ... @@ -1063,123 +1075,303 @@
1063 1075 "zh": {
1064 1076 "type": "text",
1065 1077 "analyzer": "index_ik",
1066   - "search_analyzer": "query_ik"
  1078 + "search_analyzer": "query_ik",
  1079 + "fields": {
  1080 + "keyword": {
  1081 + "type": "keyword",
  1082 + "normalizer": "lowercase"
  1083 + }
  1084 + }
1067 1085 },
1068 1086 "en": {
1069 1087 "type": "text",
1070   - "analyzer": "english"
  1088 + "analyzer": "english",
  1089 + "fields": {
  1090 + "keyword": {
  1091 + "type": "keyword",
  1092 + "normalizer": "lowercase"
  1093 + }
  1094 + }
1071 1095 },
1072 1096 "ar": {
1073 1097 "type": "text",
1074   - "analyzer": "arabic"
  1098 + "analyzer": "arabic",
  1099 + "fields": {
  1100 + "keyword": {
  1101 + "type": "keyword",
  1102 + "normalizer": "lowercase"
  1103 + }
  1104 + }
1075 1105 },
1076 1106 "hy": {
1077 1107 "type": "text",
1078   - "analyzer": "armenian"
  1108 + "analyzer": "armenian",
  1109 + "fields": {
  1110 + "keyword": {
  1111 + "type": "keyword",
  1112 + "normalizer": "lowercase"
  1113 + }
  1114 + }
1079 1115 },
1080 1116 "eu": {
1081 1117 "type": "text",
1082   - "analyzer": "basque"
  1118 + "analyzer": "basque",
  1119 + "fields": {
  1120 + "keyword": {
  1121 + "type": "keyword",
  1122 + "normalizer": "lowercase"
  1123 + }
  1124 + }
1083 1125 },
1084 1126 "pt_br": {
1085 1127 "type": "text",
1086   - "analyzer": "brazilian"
  1128 + "analyzer": "brazilian",
  1129 + "fields": {
  1130 + "keyword": {
  1131 + "type": "keyword",
  1132 + "normalizer": "lowercase"
  1133 + }
  1134 + }
1087 1135 },
1088 1136 "bg": {
1089 1137 "type": "text",
1090   - "analyzer": "bulgarian"
  1138 + "analyzer": "bulgarian",
  1139 + "fields": {
  1140 + "keyword": {
  1141 + "type": "keyword",
  1142 + "normalizer": "lowercase"
  1143 + }
  1144 + }
1091 1145 },
1092 1146 "ca": {
1093 1147 "type": "text",
1094   - "analyzer": "catalan"
  1148 + "analyzer": "catalan",
  1149 + "fields": {
  1150 + "keyword": {
  1151 + "type": "keyword",
  1152 + "normalizer": "lowercase"
  1153 + }
  1154 + }
1095 1155 },
1096 1156 "cjk": {
1097 1157 "type": "text",
1098   - "analyzer": "cjk"
  1158 + "analyzer": "cjk",
  1159 + "fields": {
  1160 + "keyword": {
  1161 + "type": "keyword",
  1162 + "normalizer": "lowercase"
  1163 + }
  1164 + }
1099 1165 },
1100 1166 "cs": {
1101 1167 "type": "text",
1102   - "analyzer": "czech"
  1168 + "analyzer": "czech",
  1169 + "fields": {
  1170 + "keyword": {
  1171 + "type": "keyword",
  1172 + "normalizer": "lowercase"
  1173 + }
  1174 + }
1103 1175 },
1104 1176 "da": {
1105 1177 "type": "text",
1106   - "analyzer": "danish"
  1178 + "analyzer": "danish",
  1179 + "fields": {
  1180 + "keyword": {
  1181 + "type": "keyword",
  1182 + "normalizer": "lowercase"
  1183 + }
  1184 + }
1107 1185 },
1108 1186 "nl": {
1109 1187 "type": "text",
1110   - "analyzer": "dutch"
  1188 + "analyzer": "dutch",
  1189 + "fields": {
  1190 + "keyword": {
  1191 + "type": "keyword",
  1192 + "normalizer": "lowercase"
  1193 + }
  1194 + }
1111 1195 },
1112 1196 "fi": {
1113 1197 "type": "text",
1114   - "analyzer": "finnish"
  1198 + "analyzer": "finnish",
  1199 + "fields": {
  1200 + "keyword": {
  1201 + "type": "keyword",
  1202 + "normalizer": "lowercase"
  1203 + }
  1204 + }
1115 1205 },
1116 1206 "fr": {
1117 1207 "type": "text",
1118   - "analyzer": "french"
  1208 + "analyzer": "french",
  1209 + "fields": {
  1210 + "keyword": {
  1211 + "type": "keyword",
  1212 + "normalizer": "lowercase"
  1213 + }
  1214 + }
1119 1215 },
1120 1216 "gl": {
1121 1217 "type": "text",
1122   - "analyzer": "galician"
  1218 + "analyzer": "galician",
  1219 + "fields": {
  1220 + "keyword": {
  1221 + "type": "keyword",
  1222 + "normalizer": "lowercase"
  1223 + }
  1224 + }
1123 1225 },
1124 1226 "de": {
1125 1227 "type": "text",
1126   - "analyzer": "german"
  1228 + "analyzer": "german",
  1229 + "fields": {
  1230 + "keyword": {
  1231 + "type": "keyword",
  1232 + "normalizer": "lowercase"
  1233 + }
  1234 + }
1127 1235 },
1128 1236 "el": {
1129 1237 "type": "text",
1130   - "analyzer": "greek"
  1238 + "analyzer": "greek",
  1239 + "fields": {
  1240 + "keyword": {
  1241 + "type": "keyword",
  1242 + "normalizer": "lowercase"
  1243 + }
  1244 + }
1131 1245 },
1132 1246 "hi": {
1133 1247 "type": "text",
1134   - "analyzer": "hindi"
  1248 + "analyzer": "hindi",
  1249 + "fields": {
  1250 + "keyword": {
  1251 + "type": "keyword",
  1252 + "normalizer": "lowercase"
  1253 + }
  1254 + }
1135 1255 },
1136 1256 "hu": {
1137 1257 "type": "text",
1138   - "analyzer": "hungarian"
  1258 + "analyzer": "hungarian",
  1259 + "fields": {
  1260 + "keyword": {
  1261 + "type": "keyword",
  1262 + "normalizer": "lowercase"
  1263 + }
  1264 + }
1139 1265 },
1140 1266 "id": {
1141 1267 "type": "text",
1142   - "analyzer": "indonesian"
  1268 + "analyzer": "indonesian",
  1269 + "fields": {
  1270 + "keyword": {
  1271 + "type": "keyword",
  1272 + "normalizer": "lowercase"
  1273 + }
  1274 + }
1143 1275 },
1144 1276 "it": {
1145 1277 "type": "text",
1146   - "analyzer": "italian"
  1278 + "analyzer": "italian",
  1279 + "fields": {
  1280 + "keyword": {
  1281 + "type": "keyword",
  1282 + "normalizer": "lowercase"
  1283 + }
  1284 + }
1147 1285 },
1148 1286 "no": {
1149 1287 "type": "text",
1150   - "analyzer": "norwegian"
  1288 + "analyzer": "norwegian",
  1289 + "fields": {
  1290 + "keyword": {
  1291 + "type": "keyword",
  1292 + "normalizer": "lowercase"
  1293 + }
  1294 + }
1151 1295 },
1152 1296 "fa": {
1153 1297 "type": "text",
1154   - "analyzer": "persian"
  1298 + "analyzer": "persian",
  1299 + "fields": {
  1300 + "keyword": {
  1301 + "type": "keyword",
  1302 + "normalizer": "lowercase"
  1303 + }
  1304 + }
1155 1305 },
1156 1306 "pt": {
1157 1307 "type": "text",
1158   - "analyzer": "portuguese"
  1308 + "analyzer": "portuguese",
  1309 + "fields": {
  1310 + "keyword": {
  1311 + "type": "keyword",
  1312 + "normalizer": "lowercase"
  1313 + }
  1314 + }
1159 1315 },
1160 1316 "ro": {
1161 1317 "type": "text",
1162   - "analyzer": "romanian"
  1318 + "analyzer": "romanian",
  1319 + "fields": {
  1320 + "keyword": {
  1321 + "type": "keyword",
  1322 + "normalizer": "lowercase"
  1323 + }
  1324 + }
1163 1325 },
1164 1326 "ru": {
1165 1327 "type": "text",
1166   - "analyzer": "russian"
  1328 + "analyzer": "russian",
  1329 + "fields": {
  1330 + "keyword": {
  1331 + "type": "keyword",
  1332 + "normalizer": "lowercase"
  1333 + }
  1334 + }
1167 1335 },
1168 1336 "es": {
1169 1337 "type": "text",
1170   - "analyzer": "spanish"
  1338 + "analyzer": "spanish",
  1339 + "fields": {
  1340 + "keyword": {
  1341 + "type": "keyword",
  1342 + "normalizer": "lowercase"
  1343 + }
  1344 + }
1171 1345 },
1172 1346 "sv": {
1173 1347 "type": "text",
1174   - "analyzer": "swedish"
  1348 + "analyzer": "swedish",
  1349 + "fields": {
  1350 + "keyword": {
  1351 + "type": "keyword",
  1352 + "normalizer": "lowercase"
  1353 + }
  1354 + }
1175 1355 },
1176 1356 "tr": {
1177 1357 "type": "text",
1178   - "analyzer": "turkish"
  1358 + "analyzer": "turkish",
  1359 + "fields": {
  1360 + "keyword": {
  1361 + "type": "keyword",
  1362 + "normalizer": "lowercase"
  1363 + }
  1364 + }
1179 1365 },
1180 1366 "th": {
1181 1367 "type": "text",
1182   - "analyzer": "thai"
  1368 + "analyzer": "thai",
  1369 + "fields": {
  1370 + "keyword": {
  1371 + "type": "keyword",
  1372 + "normalizer": "lowercase"
  1373 + }
  1374 + }
1183 1375 }
1184 1376 }
1185 1377 },
... ... @@ -1189,123 +1381,303 @@
1189 1381 "zh": {
1190 1382 "type": "text",
1191 1383 "analyzer": "index_ik",
1192   - "search_analyzer": "query_ik"
  1384 + "search_analyzer": "query_ik",
  1385 + "fields": {
  1386 + "keyword": {
  1387 + "type": "keyword",
  1388 + "normalizer": "lowercase"
  1389 + }
  1390 + }
1193 1391 },
1194 1392 "en": {
1195 1393 "type": "text",
1196   - "analyzer": "english"
  1394 + "analyzer": "english",
  1395 + "fields": {
  1396 + "keyword": {
  1397 + "type": "keyword",
  1398 + "normalizer": "lowercase"
  1399 + }
  1400 + }
1197 1401 },
1198 1402 "ar": {
1199 1403 "type": "text",
1200   - "analyzer": "arabic"
  1404 + "analyzer": "arabic",
  1405 + "fields": {
  1406 + "keyword": {
  1407 + "type": "keyword",
  1408 + "normalizer": "lowercase"
  1409 + }
  1410 + }
1201 1411 },
1202 1412 "hy": {
1203 1413 "type": "text",
1204   - "analyzer": "armenian"
  1414 + "analyzer": "armenian",
  1415 + "fields": {
  1416 + "keyword": {
  1417 + "type": "keyword",
  1418 + "normalizer": "lowercase"
  1419 + }
  1420 + }
1205 1421 },
1206 1422 "eu": {
1207 1423 "type": "text",
1208   - "analyzer": "basque"
  1424 + "analyzer": "basque",
  1425 + "fields": {
  1426 + "keyword": {
  1427 + "type": "keyword",
  1428 + "normalizer": "lowercase"
  1429 + }
  1430 + }
1209 1431 },
1210 1432 "pt_br": {
1211 1433 "type": "text",
1212   - "analyzer": "brazilian"
  1434 + "analyzer": "brazilian",
  1435 + "fields": {
  1436 + "keyword": {
  1437 + "type": "keyword",
  1438 + "normalizer": "lowercase"
  1439 + }
  1440 + }
1213 1441 },
1214 1442 "bg": {
1215 1443 "type": "text",
1216   - "analyzer": "bulgarian"
  1444 + "analyzer": "bulgarian",
  1445 + "fields": {
  1446 + "keyword": {
  1447 + "type": "keyword",
  1448 + "normalizer": "lowercase"
  1449 + }
  1450 + }
1217 1451 },
1218 1452 "ca": {
1219 1453 "type": "text",
1220   - "analyzer": "catalan"
  1454 + "analyzer": "catalan",
  1455 + "fields": {
  1456 + "keyword": {
  1457 + "type": "keyword",
  1458 + "normalizer": "lowercase"
  1459 + }
  1460 + }
1221 1461 },
1222 1462 "cjk": {
1223 1463 "type": "text",
1224   - "analyzer": "cjk"
  1464 + "analyzer": "cjk",
  1465 + "fields": {
  1466 + "keyword": {
  1467 + "type": "keyword",
  1468 + "normalizer": "lowercase"
  1469 + }
  1470 + }
1225 1471 },
1226 1472 "cs": {
1227 1473 "type": "text",
1228   - "analyzer": "czech"
  1474 + "analyzer": "czech",
  1475 + "fields": {
  1476 + "keyword": {
  1477 + "type": "keyword",
  1478 + "normalizer": "lowercase"
  1479 + }
  1480 + }
1229 1481 },
1230 1482 "da": {
1231 1483 "type": "text",
1232   - "analyzer": "danish"
  1484 + "analyzer": "danish",
  1485 + "fields": {
  1486 + "keyword": {
  1487 + "type": "keyword",
  1488 + "normalizer": "lowercase"
  1489 + }
  1490 + }
1233 1491 },
1234 1492 "nl": {
1235 1493 "type": "text",
1236   - "analyzer": "dutch"
  1494 + "analyzer": "dutch",
  1495 + "fields": {
  1496 + "keyword": {
  1497 + "type": "keyword",
  1498 + "normalizer": "lowercase"
  1499 + }
  1500 + }
1237 1501 },
1238 1502 "fi": {
1239 1503 "type": "text",
1240   - "analyzer": "finnish"
  1504 + "analyzer": "finnish",
  1505 + "fields": {
  1506 + "keyword": {
  1507 + "type": "keyword",
  1508 + "normalizer": "lowercase"
  1509 + }
  1510 + }
1241 1511 },
1242 1512 "fr": {
1243 1513 "type": "text",
1244   - "analyzer": "french"
  1514 + "analyzer": "french",
  1515 + "fields": {
  1516 + "keyword": {
  1517 + "type": "keyword",
  1518 + "normalizer": "lowercase"
  1519 + }
  1520 + }
1245 1521 },
1246 1522 "gl": {
1247 1523 "type": "text",
1248   - "analyzer": "galician"
  1524 + "analyzer": "galician",
  1525 + "fields": {
  1526 + "keyword": {
  1527 + "type": "keyword",
  1528 + "normalizer": "lowercase"
  1529 + }
  1530 + }
1249 1531 },
1250 1532 "de": {
1251 1533 "type": "text",
1252   - "analyzer": "german"
  1534 + "analyzer": "german",
  1535 + "fields": {
  1536 + "keyword": {
  1537 + "type": "keyword",
  1538 + "normalizer": "lowercase"
  1539 + }
  1540 + }
1253 1541 },
1254 1542 "el": {
1255 1543 "type": "text",
1256   - "analyzer": "greek"
  1544 + "analyzer": "greek",
  1545 + "fields": {
  1546 + "keyword": {
  1547 + "type": "keyword",
  1548 + "normalizer": "lowercase"
  1549 + }
  1550 + }
1257 1551 },
1258 1552 "hi": {
1259 1553 "type": "text",
1260   - "analyzer": "hindi"
  1554 + "analyzer": "hindi",
  1555 + "fields": {
  1556 + "keyword": {
  1557 + "type": "keyword",
  1558 + "normalizer": "lowercase"
  1559 + }
  1560 + }
1261 1561 },
1262 1562 "hu": {
1263 1563 "type": "text",
1264   - "analyzer": "hungarian"
  1564 + "analyzer": "hungarian",
  1565 + "fields": {
  1566 + "keyword": {
  1567 + "type": "keyword",
  1568 + "normalizer": "lowercase"
  1569 + }
  1570 + }
1265 1571 },
1266 1572 "id": {
1267 1573 "type": "text",
1268   - "analyzer": "indonesian"
  1574 + "analyzer": "indonesian",
  1575 + "fields": {
  1576 + "keyword": {
  1577 + "type": "keyword",
  1578 + "normalizer": "lowercase"
  1579 + }
  1580 + }
1269 1581 },
1270 1582 "it": {
1271 1583 "type": "text",
1272   - "analyzer": "italian"
  1584 + "analyzer": "italian",
  1585 + "fields": {
  1586 + "keyword": {
  1587 + "type": "keyword",
  1588 + "normalizer": "lowercase"
  1589 + }
  1590 + }
1273 1591 },
1274 1592 "no": {
1275 1593 "type": "text",
1276   - "analyzer": "norwegian"
  1594 + "analyzer": "norwegian",
  1595 + "fields": {
  1596 + "keyword": {
  1597 + "type": "keyword",
  1598 + "normalizer": "lowercase"
  1599 + }
  1600 + }
1277 1601 },
1278 1602 "fa": {
1279 1603 "type": "text",
1280   - "analyzer": "persian"
  1604 + "analyzer": "persian",
  1605 + "fields": {
  1606 + "keyword": {
  1607 + "type": "keyword",
  1608 + "normalizer": "lowercase"
  1609 + }
  1610 + }
1281 1611 },
1282 1612 "pt": {
1283 1613 "type": "text",
1284   - "analyzer": "portuguese"
  1614 + "analyzer": "portuguese",
  1615 + "fields": {
  1616 + "keyword": {
  1617 + "type": "keyword",
  1618 + "normalizer": "lowercase"
  1619 + }
  1620 + }
1285 1621 },
1286 1622 "ro": {
1287 1623 "type": "text",
1288   - "analyzer": "romanian"
  1624 + "analyzer": "romanian",
  1625 + "fields": {
  1626 + "keyword": {
  1627 + "type": "keyword",
  1628 + "normalizer": "lowercase"
  1629 + }
  1630 + }
1289 1631 },
1290 1632 "ru": {
1291 1633 "type": "text",
1292   - "analyzer": "russian"
  1634 + "analyzer": "russian",
  1635 + "fields": {
  1636 + "keyword": {
  1637 + "type": "keyword",
  1638 + "normalizer": "lowercase"
  1639 + }
  1640 + }
1293 1641 },
1294 1642 "es": {
1295 1643 "type": "text",
1296   - "analyzer": "spanish"
  1644 + "analyzer": "spanish",
  1645 + "fields": {
  1646 + "keyword": {
  1647 + "type": "keyword",
  1648 + "normalizer": "lowercase"
  1649 + }
  1650 + }
1297 1651 },
1298 1652 "sv": {
1299 1653 "type": "text",
1300   - "analyzer": "swedish"
  1654 + "analyzer": "swedish",
  1655 + "fields": {
  1656 + "keyword": {
  1657 + "type": "keyword",
  1658 + "normalizer": "lowercase"
  1659 + }
  1660 + }
1301 1661 },
1302 1662 "tr": {
1303 1663 "type": "text",
1304   - "analyzer": "turkish"
  1664 + "analyzer": "turkish",
  1665 + "fields": {
  1666 + "keyword": {
  1667 + "type": "keyword",
  1668 + "normalizer": "lowercase"
  1669 + }
  1670 + }
1305 1671 },
1306 1672 "th": {
1307 1673 "type": "text",
1308   - "analyzer": "thai"
  1674 + "analyzer": "thai",
  1675 + "fields": {
  1676 + "keyword": {
  1677 + "type": "keyword",
  1678 + "normalizer": "lowercase"
  1679 + }
  1680 + }
1309 1681 }
1310 1682 }
1311 1683 },
... ... @@ -1377,6 +1749,9 @@
1377 1749 "type": "keyword"
1378 1750 },
1379 1751 "value": {
  1752 + "type": "keyword"
  1753 + },
  1754 + "value_text": {
1380 1755 "type": "object",
1381 1756 "properties": {
1382 1757 "zh": {
... ... @@ -1399,6 +1774,286 @@
1399 1774 "normalizer": "lowercase"
1400 1775 }
1401 1776 }
  1777 + },
  1778 + "ar": {
  1779 + "type": "text",
  1780 + "analyzer": "arabic",
  1781 + "fields": {
  1782 + "keyword": {
  1783 + "type": "keyword",
  1784 + "normalizer": "lowercase"
  1785 + }
  1786 + }
  1787 + },
  1788 + "hy": {
  1789 + "type": "text",
  1790 + "analyzer": "armenian",
  1791 + "fields": {
  1792 + "keyword": {
  1793 + "type": "keyword",
  1794 + "normalizer": "lowercase"
  1795 + }
  1796 + }
  1797 + },
  1798 + "eu": {
  1799 + "type": "text",
  1800 + "analyzer": "basque",
  1801 + "fields": {
  1802 + "keyword": {
  1803 + "type": "keyword",
  1804 + "normalizer": "lowercase"
  1805 + }
  1806 + }
  1807 + },
  1808 + "pt_br": {
  1809 + "type": "text",
  1810 + "analyzer": "brazilian",
  1811 + "fields": {
  1812 + "keyword": {
  1813 + "type": "keyword",
  1814 + "normalizer": "lowercase"
  1815 + }
  1816 + }
  1817 + },
  1818 + "bg": {
  1819 + "type": "text",
  1820 + "analyzer": "bulgarian",
  1821 + "fields": {
  1822 + "keyword": {
  1823 + "type": "keyword",
  1824 + "normalizer": "lowercase"
  1825 + }
  1826 + }
  1827 + },
  1828 + "ca": {
  1829 + "type": "text",
  1830 + "analyzer": "catalan",
  1831 + "fields": {
  1832 + "keyword": {
  1833 + "type": "keyword",
  1834 + "normalizer": "lowercase"
  1835 + }
  1836 + }
  1837 + },
  1838 + "cjk": {
  1839 + "type": "text",
  1840 + "analyzer": "cjk",
  1841 + "fields": {
  1842 + "keyword": {
  1843 + "type": "keyword",
  1844 + "normalizer": "lowercase"
  1845 + }
  1846 + }
  1847 + },
  1848 + "cs": {
  1849 + "type": "text",
  1850 + "analyzer": "czech",
  1851 + "fields": {
  1852 + "keyword": {
  1853 + "type": "keyword",
  1854 + "normalizer": "lowercase"
  1855 + }
  1856 + }
  1857 + },
  1858 + "da": {
  1859 + "type": "text",
  1860 + "analyzer": "danish",
  1861 + "fields": {
  1862 + "keyword": {
  1863 + "type": "keyword",
  1864 + "normalizer": "lowercase"
  1865 + }
  1866 + }
  1867 + },
  1868 + "nl": {
  1869 + "type": "text",
  1870 + "analyzer": "dutch",
  1871 + "fields": {
  1872 + "keyword": {
  1873 + "type": "keyword",
  1874 + "normalizer": "lowercase"
  1875 + }
  1876 + }
  1877 + },
  1878 + "fi": {
  1879 + "type": "text",
  1880 + "analyzer": "finnish",
  1881 + "fields": {
  1882 + "keyword": {
  1883 + "type": "keyword",
  1884 + "normalizer": "lowercase"
  1885 + }
  1886 + }
  1887 + },
  1888 + "fr": {
  1889 + "type": "text",
  1890 + "analyzer": "french",
  1891 + "fields": {
  1892 + "keyword": {
  1893 + "type": "keyword",
  1894 + "normalizer": "lowercase"
  1895 + }
  1896 + }
  1897 + },
  1898 + "gl": {
  1899 + "type": "text",
  1900 + "analyzer": "galician",
  1901 + "fields": {
  1902 + "keyword": {
  1903 + "type": "keyword",
  1904 + "normalizer": "lowercase"
  1905 + }
  1906 + }
  1907 + },
  1908 + "de": {
  1909 + "type": "text",
  1910 + "analyzer": "german",
  1911 + "fields": {
  1912 + "keyword": {
  1913 + "type": "keyword",
  1914 + "normalizer": "lowercase"
  1915 + }
  1916 + }
  1917 + },
  1918 + "el": {
  1919 + "type": "text",
  1920 + "analyzer": "greek",
  1921 + "fields": {
  1922 + "keyword": {
  1923 + "type": "keyword",
  1924 + "normalizer": "lowercase"
  1925 + }
  1926 + }
  1927 + },
  1928 + "hi": {
  1929 + "type": "text",
  1930 + "analyzer": "hindi",
  1931 + "fields": {
  1932 + "keyword": {
  1933 + "type": "keyword",
  1934 + "normalizer": "lowercase"
  1935 + }
  1936 + }
  1937 + },
  1938 + "hu": {
  1939 + "type": "text",
  1940 + "analyzer": "hungarian",
  1941 + "fields": {
  1942 + "keyword": {
  1943 + "type": "keyword",
  1944 + "normalizer": "lowercase"
  1945 + }
  1946 + }
  1947 + },
  1948 + "id": {
  1949 + "type": "text",
  1950 + "analyzer": "indonesian",
  1951 + "fields": {
  1952 + "keyword": {
  1953 + "type": "keyword",
  1954 + "normalizer": "lowercase"
  1955 + }
  1956 + }
  1957 + },
  1958 + "it": {
  1959 + "type": "text",
  1960 + "analyzer": "italian",
  1961 + "fields": {
  1962 + "keyword": {
  1963 + "type": "keyword",
  1964 + "normalizer": "lowercase"
  1965 + }
  1966 + }
  1967 + },
  1968 + "no": {
  1969 + "type": "text",
  1970 + "analyzer": "norwegian",
  1971 + "fields": {
  1972 + "keyword": {
  1973 + "type": "keyword",
  1974 + "normalizer": "lowercase"
  1975 + }
  1976 + }
  1977 + },
  1978 + "fa": {
  1979 + "type": "text",
  1980 + "analyzer": "persian",
  1981 + "fields": {
  1982 + "keyword": {
  1983 + "type": "keyword",
  1984 + "normalizer": "lowercase"
  1985 + }
  1986 + }
  1987 + },
  1988 + "pt": {
  1989 + "type": "text",
  1990 + "analyzer": "portuguese",
  1991 + "fields": {
  1992 + "keyword": {
  1993 + "type": "keyword",
  1994 + "normalizer": "lowercase"
  1995 + }
  1996 + }
  1997 + },
  1998 + "ro": {
  1999 + "type": "text",
  2000 + "analyzer": "romanian",
  2001 + "fields": {
  2002 + "keyword": {
  2003 + "type": "keyword",
  2004 + "normalizer": "lowercase"
  2005 + }
  2006 + }
  2007 + },
  2008 + "ru": {
  2009 + "type": "text",
  2010 + "analyzer": "russian",
  2011 + "fields": {
  2012 + "keyword": {
  2013 + "type": "keyword",
  2014 + "normalizer": "lowercase"
  2015 + }
  2016 + }
  2017 + },
  2018 + "es": {
  2019 + "type": "text",
  2020 + "analyzer": "spanish",
  2021 + "fields": {
  2022 + "keyword": {
  2023 + "type": "keyword",
  2024 + "normalizer": "lowercase"
  2025 + }
  2026 + }
  2027 + },
  2028 + "sv": {
  2029 + "type": "text",
  2030 + "analyzer": "swedish",
  2031 + "fields": {
  2032 + "keyword": {
  2033 + "type": "keyword",
  2034 + "normalizer": "lowercase"
  2035 + }
  2036 + }
  2037 + },
  2038 + "tr": {
  2039 + "type": "text",
  2040 + "analyzer": "turkish",
  2041 + "fields": {
  2042 + "keyword": {
  2043 + "type": "keyword",
  2044 + "normalizer": "lowercase"
  2045 + }
  2046 + }
  2047 + },
  2048 + "th": {
  2049 + "type": "text",
  2050 + "analyzer": "thai",
  2051 + "fields": {
  2052 + "keyword": {
  2053 + "type": "keyword",
  2054 + "normalizer": "lowercase"
  2055 + }
  2056 + }
1402 2057 }
1403 2058 }
1404 2059 }
... ...
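Review note: every per-language block added in the hunks above has the same shape, a `text` field with a language analyzer plus a `keyword` subfield using the `lowercase` normalizer. The sketch below shows how such a block could be produced from a template rather than written by hand. It is an illustration only: the names `LANGUAGE_ANALYZERS`, `text_field`, and `language_object` are hypothetical and are not taken from `generate_search_products_mapping.py`, and the language table is a small subset of the codes visible in the diff.

```python
import json

# Subset of the language-code -> analyzer pairs visible in the diff above.
# (Hypothetical table; the real generator's data structure is not shown here.)
LANGUAGE_ANALYZERS = {"en": "english", "ru": "russian", "th": "thai"}


def text_field(analyzer, with_keyword, search_analyzer=None):
    """Build one per-language text field, optionally with a `.keyword` subfield."""
    field = {"type": "text", "analyzer": analyzer}
    if search_analyzer is not None:
        field["search_analyzer"] = search_analyzer
    if with_keyword:
        # Same subfield shape as the lines this commit adds for each language.
        field["fields"] = {
            "keyword": {"type": "keyword", "normalizer": "lowercase"}
        }
    return field


def language_object(analyzers, with_keyword):
    """Build an object field holding one text field per language."""
    # Chinese uses the custom index_ik / query_ik analyzer pair.
    properties = {"zh": text_field("index_ik", with_keyword, "query_ik")}
    for code, analyzer in analyzers.items():
        properties[code] = text_field(analyzer, with_keyword)
    return {"type": "object", "properties": properties}


if __name__ == "__main__":
    print(json.dumps(language_object(LANGUAGE_ANALYZERS, True), indent=2))
```

Calling `language_object(LANGUAGE_ANALYZERS, False)` would yield the older shape seen on the removed lines, where each language has only `type` and `analyzer`.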
mappings/search_products.json.bak 0 → 100644
... ... @@ -0,0 +1,629 @@
  1 +{
  2 + "settings": {
  3 + "number_of_shards": 1,
  4 + "number_of_replicas": 0,
  5 + "refresh_interval": "30s",
  6 + "analysis": {
  7 + "analyzer": {
  8 + "index_ik": {
  9 + "type": "custom",
  10 + "tokenizer": "ik_max_word",
  11 + "filter": [
  12 + "lowercase",
  13 + "asciifolding"
  14 + ]
  15 + },
  16 + "query_ik": {
  17 + "type": "custom",
  18 + "tokenizer": "ik_smart",
  19 + "filter": [
  20 + "lowercase",
  21 + "asciifolding"
  22 + ]
  23 + }
  24 + },
  25 + "normalizer": {
  26 + "lowercase": {
  27 + "type": "custom",
  28 + "filter": [
  29 + "lowercase"
  30 + ]
  31 + }
  32 + }
  33 + },
  34 + "similarity": {
  35 + "default": {
  36 + "type": "BM25",
  37 + "b": 0.0,
  38 + "k1": 0.0
  39 + }
  40 + }
  41 + },
  42 + "mappings": {
  43 + "properties": {
  44 + "tenant_id": {
  45 + "type": "keyword"
  46 + },
  47 + "spu_id": {
  48 + "type": "keyword"
  49 + },
  50 + "create_time": {
  51 + "type": "date"
  52 + },
  53 + "update_time": {
  54 + "type": "date"
  55 + },
  56 + "title": {
  57 + "type": "object",
  58 + "properties": {
  59 + "zh": {
  60 + "type": "text",
  61 + "analyzer": "index_ik",
  62 + "search_analyzer": "query_ik"
  63 + },
  64 + "en": {
  65 + "type": "text",
  66 + "analyzer": "english"
  67 + },
  68 + "ar": {
  69 + "type": "text",
  70 + "analyzer": "arabic"
  71 + },
  72 + "hy": {
  73 + "type": "text",
  74 + "analyzer": "armenian"
  75 + },
  76 + "eu": {
  77 + "type": "text",
  78 + "analyzer": "basque"
  79 + },
  80 + "pt_br": {
  81 + "type": "text",
  82 + "analyzer": "brazilian"
  83 + },
  84 + "bg": {
  85 + "type": "text",
  86 + "analyzer": "bulgarian"
  87 + },
  88 + "ca": {
  89 + "type": "text",
  90 + "analyzer": "catalan"
  91 + },
  92 + "cjk": {
  93 + "type": "text",
  94 + "analyzer": "cjk"
  95 + },
  96 + "cs": {
  97 + "type": "text",
  98 + "analyzer": "czech"
  99 + },
  100 + "da": {
  101 + "type": "text",
  102 + "analyzer": "danish"
  103 + },
  104 + "nl": {
  105 + "type": "text",
  106 + "analyzer": "dutch"
  107 + },
  108 + "fi": {
  109 + "type": "text",
  110 + "analyzer": "finnish"
  111 + },
  112 + "fr": {
  113 + "type": "text",
  114 + "analyzer": "french"
  115 + },
  116 + "gl": {
  117 + "type": "text",
  118 + "analyzer": "galician"
  119 + },
  120 + "de": {
  121 + "type": "text",
  122 + "analyzer": "german"
  123 + },
  124 + "el": {
  125 + "type": "text",
  126 + "analyzer": "greek"
  127 + },
  128 + "hi": {
  129 + "type": "text",
  130 + "analyzer": "hindi"
  131 + },
  132 + "hu": {
  133 + "type": "text",
  134 + "analyzer": "hungarian"
  135 + },
  136 + "id": {
  137 + "type": "text",
  138 + "analyzer": "indonesian"
  139 + },
  140 + "it": {
  141 + "type": "text",
  142 + "analyzer": "italian"
  143 + },
  144 + "no": {
  145 + "type": "text",
  146 + "analyzer": "norwegian"
  147 + },
  148 + "fa": {
  149 + "type": "text",
  150 + "analyzer": "persian"
  151 + },
  152 + "pt": {
  153 + "type": "text",
  154 + "analyzer": "portuguese"
  155 + },
  156 + "ro": {
  157 + "type": "text",
  158 + "analyzer": "romanian"
  159 + },
  160 + "ru": {
  161 + "type": "text",
  162 + "analyzer": "russian"
  163 + },
  164 + "es": {
  165 + "type": "text",
  166 + "analyzer": "spanish"
  167 + },
  168 + "sv": {
  169 + "type": "text",
  170 + "analyzer": "swedish"
  171 + },
  172 + "tr": {
  173 + "type": "text",
  174 + "analyzer": "turkish"
  175 + },
  176 + "th": {
  177 + "type": "text",
  178 + "analyzer": "thai"
  179 + }
  180 + }
  181 + },
  182 + "keywords": {
  183 + "type": "object",
  184 + "properties": {
  185 + "zh": {
  186 + "type": "text",
  187 + "analyzer": "index_ik",
  188 + "search_analyzer": "query_ik"
  189 + },
  190 + "en": {
  191 + "type": "text",
  192 + "analyzer": "english",
  193 + "fields": {
  194 + "keyword": {
  195 + "type": "keyword",
  196 + "normalizer": "lowercase"
  197 + }
  198 + }
  199 + },
  200 + "ar": {
  201 + "type": "text",
  202 + "analyzer": "arabic",
  203 + "fields": {
  204 + "keyword": {
  205 + "type": "keyword",
  206 + "normalizer": "lowercase"
  207 + }
  208 + }
  209 + },
  210 +...
  211 + }
  212 + },
  213 + "brief": {
  214 + "type": "object",
  215 + "properties": {
  216 + "zh": {
  217 + "type": "text",
  218 + "analyzer": "index_ik",
  219 + "search_analyzer": "query_ik"
  220 + },
  221 + "en": {
  222 + "type": "text",
  223 + "analyzer": "english"
  224 + },
  225 + "ar": {
  226 + "type": "text",
  227 + "analyzer": "arabic"
  228 + },
  229 + ...
  230 + }
  231 + },
  232 + "description": {
  233 + "type": "object",
  234 + "properties": {
  235 + "zh": {
  236 + "type": "text",
  237 + "analyzer": "index_ik",
  238 + "search_analyzer": "query_ik"
  239 + },
  240 + "en": {
  241 + "type": "text",
  242 + "analyzer": "english"
  243 + },
  244 + "ar": {
  245 + "type": "text",
  246 + "analyzer": "arabic"
  247 + },
  248 + ...
  249 + }
  250 + },
  251 + "vendor": {
  252 + "type": "object",
  253 + "properties": {
  254 + "zh": {
  255 + "type": "text",
  256 + "analyzer": "index_ik",
  257 + "search_analyzer": "query_ik"
  258 + },
  259 + "en": {
  260 + "type": "text",
  261 + "analyzer": "english",
  262 + "fields": {
  263 + "keyword": {
  264 + "type": "keyword",
  265 + "normalizer": "lowercase"
  266 + }
  267 + }
  268 + },
  269 + "ar": {
  270 + "type": "text",
  271 + "analyzer": "arabic",
  272 + "fields": {
  273 + "keyword": {
  274 + "type": "keyword",
  275 + "normalizer": "lowercase"
  276 + }
  277 + }
  278 + },
  279 + ...
  280 + }
  281 + },
  282 + "image_url": {
  283 + "type": "keyword",
  284 + "index": false
  285 + },
  286 + "title_embedding": {
  287 + "type": "dense_vector",
  288 + "dims": 1024,
  289 + "index": true,
  290 + "similarity": "dot_product",
  291 + "element_type": "bfloat16"
  292 + },
  293 + "image_embedding": {
  294 + "type": "nested",
  295 + "properties": {
  296 + "vector": {
  297 + "type": "dense_vector",
  298 + "dims": 768,
  299 + "index": true,
  300 + "similarity": "dot_product",
  301 + "element_type": "bfloat16"
  302 + },
  303 + "url": {
  304 + "type": "text"
  305 + }
  306 + }
  307 + },
  308 + "category_path": {
  309 + "type": "object",
  310 + "properties": {
  311 + "zh": {
  312 + "type": "text",
  313 + "analyzer": "index_ik",
  314 + "search_analyzer": "query_ik"
  315 + },
  316 + "en": {
  317 + "type": "text",
  318 + "analyzer": "english"
  319 + },
  320 + "ar": {
  321 + "type": "text",
  322 + "analyzer": "arabic"
  323 + },
  324 + ...
  325 + }
  326 + }
  327 + },
  328 + "category_name_text": {
  329 + "type": "object",
  330 + "properties": {
  331 + "zh": {
  332 + "type": "text",
  333 + "analyzer": "index_ik",
  334 + "search_analyzer": "query_ik"
  335 + },
  336 + "en": {
  337 + "type": "text",
  338 + "analyzer": "english"
  339 + },
  340 + "ar": {
  341 + "type": "text",
  342 + "analyzer": "arabic"
  343 + },
  344 + ...
  345 +
  346 + }
  347 + },
  348 + "qanchors": {
  349 + "type": "object",
  350 + "properties": {
  351 + "zh": {
  352 + "type": "text",
  353 + "analyzer": "index_ik",
  354 + "search_analyzer": "query_ik"
  355 + },
  356 + "en": {
  357 + "type": "text",
  358 + "analyzer": "english"
  359 + }
  360 + }
  361 + },
  362 + "tags": {
  363 + "type": "object",
  364 + "properties": {
  365 + "zh": {
  366 + "type": "text",
  367 + "analyzer": "index_ik",
  368 + "search_analyzer": "query_ik",
  369 + "fields": {
  370 + "keyword": {
  371 + "type": "keyword",
  372 + "normalizer": "lowercase"
  373 + }
  374 + }
  375 + },
  376 + "en": {
  377 + "type": "text",
  378 + "analyzer": "english",
  379 + "fields": {
  380 + "keyword": {
  381 + "type": "keyword",
  382 + "normalizer": "lowercase"
  383 + }
  384 + }
  385 + }
  386 + }
  387 + },
  388 + "category_id": {
  389 + "type": "keyword"
  390 + },
  391 + "category_name": {
  392 + "type": "keyword"
  393 + },
  394 + "category_level": {
  395 + "type": "integer"
  396 + },
  397 + "category1_name": {
  398 + "type": "keyword"
  399 + },
  400 + "category2_name": {
  401 + "type": "keyword"
  402 + },
  403 + "category3_name": {
  404 + "type": "keyword"
  405 + },
  406 + "specifications": {
  407 + "type": "nested",
  408 + "properties": {
  409 + "sku_id": {
  410 + "type": "keyword"
  411 + },
  412 + "name": {
  413 + "type": "keyword"
  414 + },
  415 + "value": {
  416 + "type": "object",
  417 + "properties": {
  418 + "zh": {
  419 + "type": "text",
  420 + "analyzer": "index_ik",
  421 + "search_analyzer": "query_ik",
  422 + "fields": {
  423 + "keyword": {
  424 + "type": "keyword",
  425 + "normalizer": "lowercase"
  426 + }
  427 + }
  428 + },
  429 + "en": {
  430 + "type": "text",
  431 + "analyzer": "english",
  432 + "fields": {
  433 + "keyword": {
  434 + "type": "keyword",
  435 + "normalizer": "lowercase"
  436 + }
  437 + }
  438 + }
  439 + }
  440 + }
  441 + }
  442 + },
  443 + "enriched_attributes": {
  444 + "type": "nested",
  445 + "properties": {
  446 + "name": {
  447 + "type": "keyword"
  448 + },
  449 + "value": {
  450 + "type": "object",
  451 + "properties": {
  452 + "zh": {
  453 + "type": "text",
  454 + "analyzer": "index_ik",
  455 + "search_analyzer": "query_ik",
  456 + "fields": {
  457 + "keyword": {
  458 + "type": "keyword",
  459 + "normalizer": "lowercase"
  460 + }
  461 + }
  462 + },
  463 + "en": {
  464 + "type": "text",
  465 + "analyzer": "english",
  466 + "fields": {
  467 + "keyword": {
  468 + "type": "keyword",
  469 + "normalizer": "lowercase"
  470 + }
  471 + }
  472 + }
  473 + }
  474 + }
  475 + }
  476 + },
  477 + "option1_name": {
  478 + "type": "keyword"
  479 + },
  480 + "option2_name": {
  481 + "type": "keyword"
  482 + },
  483 + "option3_name": {
  484 + "type": "keyword"
  485 + },
  486 + "option1_values": {
  487 + "type": "object",
  488 + "properties": {
  489 + "zh": {
  490 + "type": "text",
  491 + "analyzer": "index_ik",
  492 + "search_analyzer": "query_ik",
  493 + "fields": {
  494 + "keyword": {
  495 + "type": "keyword",
  496 + "normalizer": "lowercase"
  497 + }
  498 + }
  499 + },
  500 + "en": {
  501 + "type": "text",
  502 + "analyzer": "english",
  503 + "fields": {
  504 + "keyword": {
  505 + "type": "keyword",
  506 + "normalizer": "lowercase"
  507 + }
  508 + }
  509 + }
  510 + }
  511 + },
  512 + "option2_values": {
  513 + "type": "object",
  514 + "properties": {
  515 + "zh": {
  516 + "type": "text",
  517 + "analyzer": "index_ik",
  518 + "search_analyzer": "query_ik",
  519 + "fields": {
  520 + "keyword": {
  521 + "type": "keyword",
  522 + "normalizer": "lowercase"
  523 + }
  524 + }
  525 + },
  526 + "en": {
  527 + "type": "text",
  528 + "analyzer": "english",
  529 + "fields": {
  530 + "keyword": {
  531 + "type": "keyword",
  532 + "normalizer": "lowercase"
  533 + }
  534 + }
  535 + }
  536 + }
  537 + },
  538 + "option3_values": {
  539 + "type": "object",
  540 + "properties": {
  541 + "zh": {
  542 + "type": "text",
  543 + "analyzer": "index_ik",
  544 + "search_analyzer": "query_ik",
  545 + "fields": {
  546 + "keyword": {
  547 + "type": "keyword",
  548 + "normalizer": "lowercase"
  549 + }
  550 + }
  551 + },
  552 + "en": {
  553 + "type": "text",
  554 + "analyzer": "english",
  555 + "fields": {
  556 + "keyword": {
  557 + "type": "keyword",
  558 + "normalizer": "lowercase"
  559 + }
  560 + }
  561 + }
  562 + }
  563 + },
  564 + "min_price": {
  565 + "type": "float"
  566 + },
  567 + "max_price": {
  568 + "type": "float"
  569 + },
  570 + "compare_at_price": {
  571 + "type": "float"
  572 + },
  573 + "sku_prices": {
  574 + "type": "float"
  575 + },
  576 + "sku_weights": {
  577 + "type": "long"
  578 + },
  579 + "sku_weight_units": {
  580 + "type": "keyword"
  581 + },
  582 + "total_inventory": {
  583 + "type": "long"
  584 + },
  585 + "sales": {
  586 + "type": "long"
  587 + },
  588 + "skus": {
  589 + "type": "nested",
  590 + "properties": {
  591 + "sku_id": {
  592 + "type": "keyword"
  593 + },
  594 + "price": {
  595 + "type": "float"
  596 + },
  597 + "compare_at_price": {
  598 + "type": "float"
  599 + },
  600 + "sku_code": {
  601 + "type": "keyword"
  602 + },
  603 + "stock": {
  604 + "type": "long"
  605 + },
  606 + "weight": {
  607 + "type": "float"
  608 + },
  609 + "weight_unit": {
  610 + "type": "keyword"
  611 + },
  612 + "option1_value": {
  613 + "type": "keyword"
  614 + },
  615 + "option2_value": {
  616 + "type": "keyword"
  617 + },
  618 + "option3_value": {
  619 + "type": "keyword"
  620 + },
  621 + "image_src": {
  622 + "type": "keyword",
  623 + "index": false
  624 + }
  625 + }
  626 + }
  627 + }
  628 + }
  629 +}
0 630 \ No newline at end of file
... ...
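Review note: the `similarity.default` block in this backup file sets BM25 with `b` and `k1` both at `0.0`. As a worked illustration, assuming the textbook form of BM25 (the exact constant factors vary between implementations), those values remove both term-frequency saturation and length normalization, so each matching term contributes only its IDF weight:

```latex
% Textbook BM25 for query q and document d, where f(t,d) is the term frequency,
% |d| the field length, and avgdl the average field length:
\[
\mathrm{score}(q,d) \;=\; \sum_{t \in q} \mathrm{idf}(t)\,
  \frac{f(t,d)\,(k_1 + 1)}
       {f(t,d) + k_1\left(1 - b + b\,\frac{|d|}{\mathrm{avgdl}}\right)}
\]
% With k_1 = 0 the fraction reduces to f(t,d)/f(t,d) = 1 whenever f(t,d) > 0,
% and b no longer appears at all:
\[
\mathrm{score}(q,d)\big|_{k_1 = 0} \;=\; \sum_{t \in q,\; f(t,d) > 0} \mathrm{idf}(t)
\]
```

Under this assumption, repeating a term in a field or shortening the field does not change its score. Variants of the formula that omit the $(k_1 + 1)$ numerator reduce to the same result at $k_1 = 0$.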