Commit 3b84605d2be02bbf0e64d54b5f2a388144d9b3bc
1 parent 484adbfe
docs
Showing 3 changed files with 731 additions and 730 deletions
docs/ES/ES_8.18/1_ES配置和使用.md
| 1 | -# Elasticsearch Documentation | |
| 2 | - | |
| 3 | -## Related Links | |
| 4 | -- API documentation: http://rap.essa.top:88/workspace/myWorkspace.do?projectId=78#2187 | |
| 5 | -- Kibana console: http://43.166.252.75:5601/app/dev_tools#/console/shell | |
| 6 | - | |
| 7 | -## Tokenization | |
| 8 | - | |
| 9 | -Installing the Ansj tokenizer plugin | |
| 10 | -Among the Chinese tokenizers available for ES, hanLP and ansj give the best results, followed by jieba. | |
| 11 | - | |
| 12 | -Our old Solr search dropped ik in favor of mmseg several years ago, but I could not find an mmseg plugin for ES. | |
| 13 | - | |
| 14 | -So that tokenization is no worse than in the old version, ansj is installed here first. | |
| 15 | - | |
| 16 | -### 1. Download the plugin | |
| 17 | -Download the matching version from [elasticsearch-analysis-ansj releases](https://github.com/NLPchina/elasticsearch-analysis-ansj/releases): | |
| 18 | - | |
| 19 | -- ES 8.18: | |
| 20 | -```bash | |
| 21 | -wget https://github.com/NLPchina/elasticsearch-analysis-ansj/archive/refs/tags/v8.18.0.zip | |
| 22 | -``` | |
| 23 | - | |
| 24 | -- ES 8.17: | |
| 25 | -```bash | |
| 26 | -wget https://github.com/NLPchina/elasticsearch-analysis-ansj/archive/refs/tags/v8.17.6.zip | |
| 27 | -``` | |
| 28 | - | |
| 29 | -### 2. Build | |
| 30 | -Run `mvn package`. On success, the plugin archive is generated in the `target/releases/` directory: | |
| 31 | -`elasticsearch-analysis-ansj-<version>-release.zip` | |
| 32 | - | |
| 33 | -### 3. Installation steps | |
| 34 | -1. Go to the ES installation path (default: `/usr/share/elasticsearch/`) | |
| 35 | -2. Run the install command: | |
| 36 | -```bash | |
| 37 | -bin/elasticsearch-plugin install file:///xxx/absolute-path-to/elasticsearch-analysis-ansj-8.18.0.0-release.zip | |
| 38 | -``` | |
| 39 | -3. Restart the service: | |
| 40 | -```bash | |
| 41 | -systemctl restart elasticsearch | |
| 42 | -``` | |
| 43 | - | |
| 44 | -Installation guides for other tokenizer plugins: | |
| 45 | -《3.1_hanlp安装.md》 | |
| 46 | -《3.2_jieba插件安装.md》 | |
| 47 | -These were installed on ES 8, but have not been tried on the specific versions 8.17 and 8.18. | |
| 48 | - | |
| 49 | -### 4. Configuration | |
| 50 | -Stop-word and synonym settings live in `<ES_HOME>/config/elasticsearch-analysis-ansj/ansj.cfg.yml` (not used yet). | |
| 51 | - | |
| 52 | -## Field Reference | |
| 53 | - | |
| 54 | -```bash | |
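| 55 | -Required fields: | |
| 56 | -id: product SKU ID | |
| 57 | -goods_id: product SPU ID | |
| 58 | -buyer_id: ID of the exclusive buyer the product belongs to | |
| 59 | -trader_buyer_ids: exclusive buyer IDs of the platform customers under the owning trader | |
| 60 | -goods_certification_types: product certificate types | |
| 61 | -supplier_code: supplier code | |
| 62 | -supplier_name: supplier name | |
| 63 | -supplier_certification_code: supplier company certificate codes (list) | |
| 64 | -auth_buyer_level_list: buyer levels that can see the product (set) | |
| 65 | -show_price_level_list: buyer levels that can see the price (set) | |
| 66 | -goods_composition: composition list (materials) | |
| 67 | -compositions_main_secondary: material main/secondary flag (main: 1, secondary: 2), format: materialCode_mainSecondaryType | |
| 68 | -goods_key_word_zh: product keywords, Chinese | |
| 69 | -goods_key_word_en: product keywords, English | |
| 70 | -goods_key_word_ru: product keywords, Russian | |
| 71 | -goods_copyright: copyright (own, third-party, unlicensed, replica) | |
| 72 | -goods_main_material: main material (dictionary: material) | |
| 73 | -is_in_new_protect: whether the product is in its new-product protection period (0 no, 1 yes) | |
| 74 | -goods_new_protect_date_stamp: new-product protection period date, as a timestamp | |
| 75 | -goods_attribute_name_zh: SPU attributes, Chinese (list) | |
| 76 | -goods_attribute_name_en: SPU attributes, English (list) | |
| 77 | -goods_attribute_name_ru: SPU attributes, Russian (list) | |
| 78 | -purchase_moq: purchase MOQ | |
| 79 | -ts: time at which indexing was triggered | |
| 80 | -deliver_day: lead time | |
| 81 | -factory_no: factory item number | |
| 82 | -factory_no_buyer: factory item number (customer) | |
| 83 | -fir_on_sell_time: first listing time | |
| 84 | -fir_on_sell_time_stamp: first listing time, as a timestamp | |
| 85 | -no: product code | |
| 86 | -hs_no: Hongsheng (宏升) code | |
| 87 | -package_type: packaging type value (from product attribute code: PKG) | |
| 88 | -package_type_id: packaging type ID (from product attribute code: PKG) | |
| 89 | -labelId_by_skuId_essaone_*: essaone product labels, keyed by country code | |
| 90 | -sale_goods_certificate_*: product certificate IDs, keyed by country code | |
| 91 | -labelId_by_skuId_essa_*: essa product labels, keyed by region ID | |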
| 92 | -``` | |
| 93 | - | |
| 94 | -## Mapping Configuration | |
| 95 | -See the file `create_index.sh`. | |
| 96 | - | |
| 97 | -## Quick Start | |
| 98 | - | |
| 99 | -### Shell | |
| 100 | -See [Index and query tests](../docs/3.3_索引和查询测试.md), which covers common query operations run locally on the ES server. | |
| 101 | - | |
| 102 | -### Python | |
| 103 | -- [test_index_and_search.py](../tests/test_index_and_search.py) is a simple example that creates an index, imports data, and queries it. | |
| 104 | -- [batch_bulk_goods.py](../batch_bulk_goods.py) reads all data from the last 3 years via SQL and writes it to the `goods` index through the bulk API in batches of 1000. | |
| 105 | - | |
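The batching described for `batch_bulk_goods.py` can be sketched roughly as follows. This is a minimal illustration, not the script itself: the helper names `batched` and `to_bulk_ndjson` are hypothetical, the rows are fake stand-ins for the SQL result, and using the `id` field as the document `_id` is an assumption.

```python
import json
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def batched(rows: Iterable[Dict], size: int = 1000) -> Iterator[List[Dict]]:
    """Yield consecutive lists of at most `size` rows (hypothetical helper)."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def to_bulk_ndjson(batch: List[Dict], index: str = "goods") -> str:
    """Serialize one batch into the newline-delimited body used by the _bulk API."""
    lines = []
    for row in batch:
        # Action line: index the document; using `id` as `_id` is an assumption.
        lines.append(json.dumps({"index": {"_index": index, "_id": row["id"]}}))
        # Source line: the document itself.
        lines.append(json.dumps(row, ensure_ascii=False))
    # A bulk body has to end with a trailing newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Fake rows standing in for the SQL query result.
    fake_rows = ({"id": i, "name_zh": f"商品{i}"} for i in range(2500))
    for n, batch in enumerate(batched(fake_rows), start=1):
        body = to_bulk_ndjson(batch)
        print(f"batch {n}: {len(batch)} docs, {len(body)} characters")
```

Each string returned by `to_bulk_ndjson` would then be sent as the body of one `_bulk` request, so memory use stays bounded by the batch size rather than by the full three years of data.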
| 106 | -### Kibana | |
| 107 | - | |
| 108 | -#### Tokenization | |
| 109 | -```bash | |
| 110 | -# Index-time tokenization | |
| 111 | -GET /_cat/ansj?text=14寸第4代真眼珠实身冰雪公仔带手动大推车,搪胶雪宝宝&type=index_ansj | |
| 112 | - | |
| 113 | -# Query-time tokenization | |
| 114 | -GET /_cat/ansj?text=14寸第4代真眼珠实身冰雪公仔带手动大推车,搪胶雪宝宝&type=query_ansj | |
| 115 | - | |
| 116 | -# View the configuration | |
| 117 | -GET /_cat/ansj/config | |
| 118 | -``` | |
| 119 | -#### Queries | |
| 120 | -GET /goods/_search | |
| 121 | -{ | |
| 122 | - "query": { | |
| 123 | - "match_all": {} | |
| 124 | - }, | |
| 125 | - "size": 5 | |
| 126 | -} | |
| 127 | - | |
| 128 | -#### 1. View the tokenization result for a field | |
| 129 | -```bash | |
| 130 | -# Tokenization result for a Chinese name | |
| 131 | -GET /_cat/ansj?text=14寸第4代真眼珠实身冰雪公仔带手动大推车&type=index_ansj | |
| 132 | - | |
| 133 | -# Tokenization result for an English name | |
| 134 | -GET /_cat/ansj?text=14 inch 4th generation real eye snow doll with manual cart&type=standard | |
| 135 | -``` | |
| 136 | - | |
| 137 | -#### 2. View 10 random documents from the index | |
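Outside the Kibana console, the `text` parameter of the `/_cat/ansj` requests above contains spaces and non-ASCII characters, so it needs percent-encoding. A minimal sketch, assuming a node at `http://localhost:9200` (the host and the helper name `ansj_analyze_url` are assumptions for illustration):

```python
from urllib.parse import quote, urlencode


def ansj_analyze_url(text: str, analyzer: str = "index_ansj",
                     base_url: str = "http://localhost:9200") -> str:
    """Build a percent-encoded URL for the plugin's /_cat/ansj endpoint.

    `base_url` is an assumed local node address; adjust it to your cluster.
    """
    # quote_via=quote encodes spaces as %20 instead of '+'.
    params = urlencode({"text": text, "type": analyzer}, quote_via=quote)
    return f"{base_url}/_cat/ansj?{params}"


if __name__ == "__main__":
    print(ansj_analyze_url("14寸第4代真眼珠实身冰雪公仔带手动大推车"))
    print(ansj_analyze_url("14 inch 4th generation real eye snow doll", analyzer="standard"))
```

The resulting URL can be passed to any HTTP client; the analyzer names (`index_ansj`, `query_ansj`, `standard`) are the ones used in the console examples.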
| 138 | -```bash | |
| 139 | -GET /goods/_search | |
| 140 | -{ | |
| 141 | - "size": 10, | |
| 142 | - "query": { | |
| 143 | - "function_score": { | |
| 144 | - "query": { "match_all": {} }, | |
| 145 | - "random_score": {} | |
| 146 | - } | |
| 147 | - } | |
| 148 | -} | |
| 149 | -``` | |
| 150 | - | |
| 151 | -#### 3. Keyword queries | |
| 152 | -```bash | |
| 153 | -# Simple keyword match | |
| 154 | -GET /goods/_search | |
| 155 | -{ | |
| 156 | - "query": { | |
| 157 | - "match": { | |
| 158 | - "name_zh": "冰雪公仔" | |
| 159 | - } | |
| 160 | - } | |
| 161 | -} | |
| 162 | - | |
| 163 | -# Multi-field keyword match | |
| 164 | -GET /goods/_search | |
| 165 | -{ | |
| 166 | - "query": { | |
| 167 | - "multi_match": { | |
| 168 | - "query": "冰雪公仔", | |
| 169 | - "fields": ["name_zh", "sub_name_zh", "category_name_zh"] | |
| 170 | - } | |
| 171 | - } | |
| 172 | -} | |
| 173 | -``` | |
| 174 | - | |
| 175 | -#### 4. Vector query | |
| 176 | -```bash | |
| 177 | -# Query by vector similarity | |
| 178 | -GET /goods/_search | |
| 179 | -{ | |
| 180 | - "query": { | |
| 181 | - "script_score": { | |
| 182 | - "query": { "match_all": {} }, | |
| 183 | - "script": { | |
| 184 | - "source": "cosineSimilarity(params.query_vector, 'name_prefix') + 1.0", | |
| 185 | - "params": { | |
| 186 | - "query_vector": [0.1, 0.2, ...] # 1024-dimensional vector | |
| 187 | - } | |
| 188 | - } | |
| 189 | - } | |
| 190 | - } | |
| 191 | -} | |
| 192 | -``` | |
| 193 | - | |
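The score produced by the script above can be reproduced by hand for small vectors. The sketch below is a plain-Python illustration of `cosineSimilarity(...) + 1.0`, not Elasticsearch code; the function names are hypothetical. The `+ 1.0` shifts the cosine range [-1, 1] to [0, 2], since Elasticsearch does not accept negative scores.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def script_score(query_vector: Sequence[float], doc_vector: Sequence[float]) -> float:
    """Mirror of the script source: cosine similarity shifted into [0, 2]."""
    return cosine_similarity(query_vector, doc_vector) + 1.0


if __name__ == "__main__":
    print(script_score([1.0, 0.0], [1.0, 0.0]))   # same direction
    print(script_score([1.0, 0.0], [0.0, 1.0]))   # orthogonal
    print(script_score([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

The dimension check corresponds to the requirement that the query vector match the dimension defined in the index mapping.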
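| 194 | -#### 5. Query by SKU ID | |
| 195 | -```bash | |
| 196 | -# Exact match by ID. Note: per the field list, `id` is the SKU ID and `goods_id` is the SPU ID; this example filters on `goods_id`. | |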
| 197 | -GET /goods/_search | |
| 198 | -{ | |
| 199 | - "query": { | |
| 200 | - "term": { | |
| 201 | - "goods_id": "2817667" | |
| 202 | - } | |
| 203 | - } | |
| 204 | -} | |
| 205 | -``` | |
| 206 | - | |
| 207 | -#### 6. Name query tests | |
| 208 | -```bash | |
| 209 | -# Fuzzy match on the Chinese name | |
| 210 | -GET /goods/_search | |
| 211 | -{ | |
| 212 | - "query": { | |
| 213 | - "match": { | |
| 214 | - "name_zh": { | |
| 215 | - "query": "冰雪公仔", | |
| 216 | - "fuzziness": "AUTO" | |
| 217 | - } | |
| 218 | - } | |
| 219 | - } | |
| 220 | -} | |
| 221 | - | |
| 222 | -# English name match | |
| 223 | -GET /goods/_search | |
| 224 | -{ | |
| 225 | - "query": { | |
| 226 | - "match": { | |
| 227 | - "name_en": "snow doll" | |
| 228 | - } | |
| 229 | - } | |
| 230 | -} | |
| 231 | - | |
| 232 | -# Russian name match | |
| 233 | -GET /goods/_search | |
| 234 | -{ | |
| 235 | - "query": { | |
| 236 | - "match": { | |
| 237 | - "name_ru": "снежная кукла" | |
| 238 | - } | |
| 239 | - } | |
| 240 | -} | |
| 241 | - | |
| 242 | -# Phrase matching with match_phrase | |
| 243 | -GET /goods/_search | |
| 244 | -{ | |
| 245 | - "query": { | |
| 246 | - "match_phrase": { | |
| 247 | - "name_zh": "冰雪公仔" | |
| 248 | - } | |
| 249 | - } | |
| 250 | -} | |
| 251 | - | |
| 252 | -# Multi-language phrase matching with match_phrase | |
| 253 | -GET /goods/_search | |
| 254 | -{ | |
| 255 | - "query": { | |
| 256 | - "bool": { | |
| 257 | - "should": [ | |
| 258 | - { | |
| 259 | - "match_phrase": { | |
| 260 | - "name_zh": "冰雪公仔" | |
| 261 | - } | |
| 262 | - }, | |
| 263 | - { | |
| 264 | - "match_phrase": { | |
| 265 | - "name_en": "snow doll" | |
| 266 | - } | |
| 267 | - }, | |
| 268 | - { | |
| 269 | - "match_phrase": { | |
| 270 | - "name_ru": "снежная кукла" | |
| 271 | - } | |
| 272 | - } | |
| 273 | - ], | |
| 274 | - "minimum_should_match": 1 | |
| 275 | - } | |
| 276 | - } | |
| 277 | -} | |
| 278 | - | |
| 279 | -# Loose phrase matching using match_phrase with the slop parameter | |
| 280 | -GET /goods/_search | |
| 281 | -{ | |
| 282 | - "query": { | |
| 283 | - "match_phrase": { | |
| 284 | - "name_zh": { | |
| 285 | - "query": "冰雪公仔", | |
| 286 | - "slop": 2 | |
| 287 | - } | |
| 288 | - } | |
| 289 | - } | |
| 290 | -} | |
| 291 | - | |
| 292 | -# Multi-language match_phrase with the slop parameter | |
| 293 | -GET /goods/_search | |
| 294 | -{ | |
| 295 | - "query": { | |
| 296 | - "bool": { | |
| 297 | - "should": [ | |
| 298 | - { | |
| 299 | - "match_phrase": { | |
| 300 | - "name_zh": { | |
| 301 | - "query": "冰雪公仔", | |
| 302 | - "slop": 2 | |
| 303 | - } | |
| 304 | - } | |
| 305 | - }, | |
| 306 | - { | |
| 307 | - "match_phrase": { | |
| 308 | - "name_en": { | |
| 309 | - "query": "snow doll", | |
| 310 | - "slop": 1 | |
| 311 | - } | |
| 312 | - } | |
| 313 | - }, | |
| 314 | - { | |
| 315 | - "match_phrase": { | |
| 316 | - "name_ru": { | |
| 317 | - "query": "снежная кукла", | |
| 318 | - "slop": 2 | |
| 319 | - } | |
| 320 | - } | |
| 321 | - } | |
| 322 | - ], | |
| 323 | - "minimum_should_match": 1 | |
| 324 | - } | |
| 325 | - } | |
| 326 | -} | |
| 327 | -``` | |
| 328 | - | |
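The multi-language `match_phrase` bodies above all follow one pattern, so they can be generated rather than written by hand. A minimal sketch, assuming the field names `name_zh`, `name_en`, and `name_ru` from the examples; the helper name `multilang_phrase_query` is hypothetical, and it applies a single `slop` to every language for simplicity:

```python
import json
from typing import Any, Dict


def multilang_phrase_query(phrases: Dict[str, str], slop: int = 0) -> Dict[str, Any]:
    """Build a bool/should query with one match_phrase clause per field.

    `phrases` maps a field name (for example "name_zh") to the phrase to search.
    Empty phrases are skipped so a missing translation adds no clause.
    """
    should = [
        {"match_phrase": {field: {"query": text, "slop": slop}}}
        for field, text in phrases.items()
        if text
    ]
    if not should:
        raise ValueError("at least one non-empty phrase is required")
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


if __name__ == "__main__":
    body = multilang_phrase_query(
        {"name_zh": "冰雪公仔", "name_en": "snow doll", "name_ru": "снежная кукла"},
        slop=2,
    )
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

With `minimum_should_match` set to 1, a document matches if the phrase is found in any one of the language fields.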
| 329 | -#### 7. Multi-language query test | |
| 330 | -```bash | |
| 331 | -# Query the Chinese and English names at the same time | |
| 332 | -GET /goods/_search | |
| 333 | -{ | |
| 334 | - "query": { | |
| 335 | - "bool": { | |
| 336 | - "should": [ | |
| 337 | - { | |
| 338 | - "match": { | |
| 339 | - "name_zh": "冰雪公仔" | |
| 340 | - } | |
| 341 | - }, | |
| 342 | - { | |
| 343 | - "match": { | |
| 344 | - "name_en": "snow doll" | |
| 345 | - } | |
| 346 | - } | |
| 347 | - ], | |
| 348 | - "minimum_should_match": 1 | |
| 349 | - } | |
| 350 | - } | |
| 351 | -} | |
| 352 | -``` | |
| 353 | - | |
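| 354 | -#### 8. Vector index query test | |
| 355 | -Note: the dimension of the query vector must match the dimension defined in the index (1024). | |
| 356 | -```bash | |
| 357 | -# Product recommendation by vector similarity | |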
| 358 | -GET /goods/_search | |
| 359 | -{ | |
| 360 | - "query": { | |
| 361 | - "script_score": { | |
| 362 | - "query": { "match_all": {} }, | |
| 363 | - "script": { | |
| 364 | - "source": "cosineSimilarity(params.query_vector, 'ru_name') + 1.0", | |
| 365 | - "params": { | |
| 366 | - "query_vector": [0.1, 0.2, ...] # 1024-dimensional vector | |
| 367 | - } | |
| 368 | - } | |
| 369 | - } | |
| 370 | - }, | |
| 371 | - "size": 10 | |
| 372 | -} | |
| 373 | -``` | |
| 374 | - | |
| 375 | -#### 9. Combined keyword + vector index query test | |
| 376 | -```bash | |
| 377 | -# Keyword search + boosting by vector similarity | |
| 378 | -GET /goods/_search | |
| 379 | -{ | |
| 380 | - "query": { | |
| 381 | - "function_score": { | |
| 382 | - "query": { | |
| 383 | - "match": { | |
| 384 | - "name_zh": { "query": "冰雪公仔", | |
| 385 | - "boost": 1.0 } | |
| 386 | - } | |
| 387 | - }, | |
| 388 | - "functions": [ | |
| 389 | - { | |
| 390 | - "script_score": { | |
| 391 | - "script": { | |
| 392 | - "source": "cosineSimilarity(params.query_vector, 'name_prefix') + 1.0", | |
| 393 | - "params": { | |
| 394 | - "query_vector": [0.1, 0.2, ...] # 1024-dimensional vector | |
| 395 | - } | |
| 396 | - } | |
| 397 | - } | |
| 398 | - } | |
| 399 | - ], | |
| 400 | - "boost_mode": "multiply" | |
| 401 | - } | |
| 402 | - } | |
| 403 | -} | |
| 404 | - | |
| 405 | -# `source` can be written to tolerate an empty embedding: "source": "doc['embedding'].isEmpty() ? 1.0 : dotProduct(params.query_vector, 'embedding') + 1.0", | |
| 406 | -``` | |
| 407 | -Multiplying the two scores together: | |
| 408 | -{ | |
| 409 | - "query": { | |
| 410 | - "function_score": { | |
| 411 | - "score_mode": "sum", | |
| 412 | - "boost_mode": "multiply", | |
| 413 | - "query": { | |
| 414 | - "match": { | |
| 415 | - "content": { | |
| 416 | - "query": keywords, | |
| 417 | - "boost": 1.0 | |
| 418 | - } | |
| 419 | - } | |
| 420 | - }, | |
| 421 | - "functions": [ | |
| 422 | - { | |
| 423 | - "script_score": { | |
| 424 | - "script": { | |
| 425 | - "source": "doc['embedding'].isEmpty() ? 1.0 : dotProduct(params.query_vector, 'embedding') + 1.0", | |
| 426 | - "params": {"query_vector": context.embeddings[0][1]} | |
| 427 | - } | |
| 428 | - } | |
| 429 | - } | |
| 430 | - ] | |
| 431 | - } | |
| 432 | - } | |
| 433 | - } | |
| 434 | - | |
| 435 | -#### 10. Vector search + keyword search | |
| 436 | -GET /goods/_search | |
| 437 | -{ | |
| 438 | - "query": { | |
| 439 | - "match": { | |
| 440 | - "content": { | |
| 441 | - "query": "玩具", | |
| 442 | - "boost": 1.0 | |
| 443 | - } | |
| 444 | - } | |
| 445 | - }, | |
| 446 | - "knn": { | |
| 447 | - "field": "name_prefix", | |
| 448 | - "query_vector": [-0.05291186273097992, ...], | |
| 449 | - "k": 5, | |
| 450 | - "num_candidates": 10, | |
| 451 | - "boost": 1.0 | |
| 452 | - } | |
| 453 | - } | |
| 454 | - | |
| 455 | - | |
| 456 | - | |
| 457 | - | |
| 458 | -Reference code (a method excerpt; it assumes `import time` and a project-level `settings` module): | |
| 459 | -```python | |
| 460 | -    def execute_search(self, context, search_type="match_phrase", search_type_attachment=0, size=10): | |
| 461 | -        query = context.query | |
| 462 | -        normalized_query = context.normalized_query | |
| 463 | -        core_term = context.core_term | |
| 464 | -        keywords = context.keywords | |
| 465 | -        knn_boost_keywords = core_term if core_term else keywords | |
| 466 | -        expand = context.expand | |
| 467 | - | |
| 468 | -        seen_queries = set() | |
| 469 | -        unique_queries = [] | |
| 470 | -        for q, weight in [(query, 1.0), (normalized_query, 1.0), (keywords, 0.5)]: | |
| 471 | -            if q and q not in seen_queries: | |
| 472 | -                unique_queries.append((q, weight)) | |
| 473 | -                seen_queries.add(q) | |
| 474 | - | |
| 475 | -        # About hybrid retrieval: | |
| 476 | -        # Using knn and a text query together: | |
| 477 | -        # Before 8.12 a query could not contain knn; kNN search as a query was introduced in 8.12: https://www.elastic.co/search-labs/blog/knn-query-elasticsearch | |
| 478 | -        # { | |
| 479 | -        #   "size": 3, | |
| 480 | -        #   "query": { | |
| 481 | -        #     "bool": { | |
| 482 | -        #       "should": [ | |
| 483 | -        #         { | |
| 484 | -        #           "knn": { | |
| 485 | -        #             "field": "embedding", | |
| 486 | -        #             "query_vector": [2,2,2,0], | |
| 487 | -        #             "num_candidates": 10, | |
| 488 | -        #             "_name": "knn_query" | |
| 489 | -        #           } | |
| 490 | -        #         }, | |
| 491 | -        #         { | |
| 492 | -        #           "match": { | |
| 493 | -        #             "description": { | |
| 494 | -        #               "query": "luxury", | |
| 495 | -        #               "_name": "bm25query" | |
| 496 | -        #             } | |
| 497 | -        #           } | |
| 498 | -        #         } | |
| 499 | -        #       ] | |
| 500 | -        #     } | |
| 501 | -        # | |
| 502 | -        # A knn clause cannot contain a query. | |
| 503 | -        # Placing knn alongside query (hybrid search) combines the two as an OR. | |
| 504 | -        # A filter can be added inside knn, for example: "filter": {"match": {"my_label": "red"}} | |
| 505 | - | |
| 506 | -        if search_type == "match_phrase": | |
| 507 | -            body = { | |
| 508 | -                "query": { | |
| 509 | -                    "bool": { | |
| 510 | -                        "should": [ | |
| 511 | -                            { | |
| 512 | -                                "match_phrase": { | |
| 513 | -                                    "content": { | |
| 514 | -                                        "query": unique_query, | |
| 515 | -                                        "boost": weight, | |
| 516 | -                                        "slop": search_type_attachment | |
| 517 | -                                    } | |
| 518 | -                                } | |
| 519 | -                            } for unique_query, weight in unique_queries | |
| 520 | -                        ], | |
| 521 | -                        "minimum_should_match": 1 | |
| 522 | -                    } | |
| 523 | -                } | |
| 524 | -            } | |
| 525 | -        # Pure keyword search 2 | |
| 526 | -        elif search_type == "match_keywords": | |
| 527 | -            body = { | |
| 528 | -                "query": { | |
| 529 | -                    "bool": { | |
| 530 | -                        "must": [ | |
| 531 | -                            { | |
| 532 | -                                "match": { | |
| 533 | -                                    "content": {"query": core_term, "boost": 1.0} | |
| 534 | -                                } | |
| 535 | -                            } | |
| 536 | -                        ], | |
| 537 | -                        "should": [ | |
| 538 | -                            { | |
| 539 | -                                "match": { | |
| 540 | -                                    "content": {"query": q, "boost": boost} | |
| 541 | -                                } | |
| 542 | -                            } for (q, boost) in [(keywords, 1.0), (expand, 0.6), (normalized_query, 1.0)] if q | |
| 543 | -                        ], | |
| 544 | -                        "minimum_should_match": 1 | |
| 545 | -                    } | |
| 546 | -                } | |
| 547 | -            } | |
| 548 | -        # Keyword search + vector ranking | |
| 549 | -        elif search_type == "match&boost": | |
| 550 | -            body = { | |
| 551 | -                "query": { | |
| 552 | -                    "function_score": { | |
| 553 | -                        "score_mode": "sum", | |
| 554 | -                        "boost_mode": "multiply", | |
| 555 | -                        "query": { | |
| 556 | -                            "match": { | |
| 557 | -                                "content": { | |
| 558 | -                                    "query": keywords, | |
| 559 | -                                    "boost": 1.0 | |
| 560 | -                                } | |
| 561 | -                            } | |
| 562 | -                        }, | |
| 563 | -                        "functions": [ | |
| 564 | -                            { | |
| 565 | -                                "script_score": { | |
| 566 | -                                    "script": { | |
| 567 | -                                        "source": "doc['embedding'].isEmpty() ? 1.0 : dotProduct(params.query_vector, 'embedding') + 1.0", | |
| 568 | -                                        "params": {"query_vector": context.embeddings[0][1]} | |
| 569 | -                                    } | |
| 570 | -                                } | |
| 571 | -                            } | |
| 572 | -                        ] | |
| 573 | -                    } | |
| 574 | -                } | |
| 575 | -            } | |
| 576 | -        # This variant is too slow | |
| 577 | -        elif search_type == "match&boost2": | |
| 578 | -            body = { | |
| 579 | -                "query": { | |
| 580 | -                    "script_score": { | |
| 581 | -                        "query": { | |
| 582 | -                            "match": { | |
| 583 | -                                "content": { | |
| 584 | -                                    "query": keywords, | |
| 585 | -                                    "boost": 1.0 | |
| 586 | -                                } | |
| 587 | -                            } | |
| 588 | -                        }, | |
| 589 | -                        "script": { | |
| 590 | -                            "source": "doc['embedding'].isEmpty() ? 1.0 : dotProduct(params.query_vector, 'embedding') + 1.0", | |
| 591 | -                            "params": {"query_vector": context.embeddings[0][1]} | |
| 592 | -                        } | |
| 593 | -                    } | |
| 594 | -                } | |
| 595 | -            } | |
| 596 | -        # Vector search + keyword search | |
| 597 | -        elif search_type == "match&knn": | |
| 598 | -            body = { | |
| 599 | -                "query": { | |
| 600 | -                    "match": { | |
| 601 | -                        "content": { | |
| 602 | -                            "query": knn_boost_keywords, | |
| 603 | -                            "boost": 1.0 | |
| 604 | -                        } | |
| 605 | -                    } | |
| 606 | -                }, | |
| 607 | -                "knn": { | |
| 608 | -                    "field": "embedding", | |
| 609 | -                    "query_vector": context.embeddings[search_type_attachment][1], | |
| 610 | -                    "k": 5, | |
| 611 | -                    "num_candidates": 10, | |
| 612 | -                    "boost": 1.0 | |
| 613 | -                } | |
| 614 | -            } | |
| 615 | -        # Pure vector search | |
| 616 | -        elif search_type == "knn": | |
| 617 | -            body = { | |
| 618 | -                "knn": { | |
| 619 | -                    "field": "embedding", | |
| 620 | -                    "query_vector": context.embeddings[search_type_attachment][1], | |
| 621 | -                    "k": 5, | |
| 622 | -                    "num_candidates": 10 | |
| 623 | -                } | |
| 624 | -            } | |
| 625 | - | |
| 626 | -        need_embedding = (search_type == "match&boost") | |
| 627 | -        need_highlights = (search_type != "knn") | |
| 628 | - | |
| 629 | -        body["_source"] = {"excludes": ["keywords", "quotes"]} | |
| 630 | -        if not need_embedding: | |
| 631 | -            body["_source"]["excludes"].append("embedding") | |
| 632 | -        # Record search_from before the highlight settings are filled in | |
| 633 | -        search_from = f'searchtype[{search_type}],param[{search_type_attachment}],body:{body}' | |
| 634 | - | |
| 635 | -        if need_highlights: | |
| 636 | -            body["_source"]["excludes"] = [] | |
| 637 | -            body["highlight"] = { | |
| 638 | -                "pre_tags": [settings.HIGHTLIGHT_PRE_TAG], | |
| 639 | -                "post_tags": [settings.HIGHTLIGHT_POST_TAG], | |
| 640 | -                "fields": {"chapter_name": {}, "content": {}} | |
| 641 | -            } | |
| 642 | - | |
| 643 | -        body["size"] = size | |
| 644 | - | |
| 645 | -        se_debug_info = '' | |
| 646 | -        start_time = time.time() | |
| 647 | -        try: | |
| 648 | -            es_response = context.es.search(index=context.index_name, body=body) | |
| 649 | -        except Exception as e: | |
| 650 | -            se_debug_info = f'Error in executing search: {e}. request: {body}' | |
| 651 | -            return None, se_debug_info | |
| 652 | -        end_time = time.time() | |
| 653 | -        elapsed_time = end_time - start_time | |
| 654 | -        total_hits = es_response.get("hits", {}).get("total", {}).get("value", 0) | |
| 655 | -        returned_hits = len(es_response.get("hits", {}).get("hits", [])) | |
| 656 | - | |
| 657 | -        if '"' not in search_from: | |
| 658 | -            search_from = search_from.replace('\'', '"') | |
| 659 | -        search_from = search_from if len(search_from) < 400 else search_from[:400] + '...' | |
| 660 | - | |
| 661 | -        str_body = str(body) | |
| 662 | -        if '"' not in str_body: | |
| 663 | -            str_body = str_body.replace('\'', '"') | |
| 664 | -        se_debug_info = f'({elapsed_time:.2f} seconds. Total: {total_hits}. Returned: {returned_hits}) : {search_from[:400]}' | |
| 665 | - | |
| 666 | -        if 'hits' not in es_response or 'hits' not in es_response['hits']: | |
| 667 | -            se_debug_info += f' InvalidResponse in executing search. request: {body}' | |
| 668 | -            return None, se_debug_info | |
| 669 | - | |
| 670 | -        for hit in es_response['hits']['hits']: | |
| 671 | -            hit['search_from'] = search_from | |
| 672 | - | |
| 673 | -        return es_response, se_debug_info | |
| 674 | - | |
| 675 | -``` | |
| 676 | - | |
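The `match&boost` branch of the reference code combines a keyword score with a vector score through `function_score`. With a single function, `score_mode: sum` leaves the function score unchanged and `boost_mode: multiply` multiplies it with the query score. The sketch below reproduces that arithmetic in plain Python for illustration only; the function names are hypothetical and the input scores are made up.

```python
from typing import Optional, Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def combined_score(keyword_score: float,
                   query_vector: Sequence[float],
                   doc_embedding: Optional[Sequence[float]]) -> float:
    """Mirror of the script: empty embedding -> 1.0, otherwise dotProduct + 1.0,
    then multiplied with the keyword (match) score."""
    if not doc_embedding:
        function_score = 1.0  # neutral factor: the keyword score passes through
    else:
        function_score = dot_product(query_vector, doc_embedding) + 1.0
    return keyword_score * function_score


if __name__ == "__main__":
    q = [0.6, 0.8]
    print(combined_score(3.0, q, None))        # document without an embedding
    print(combined_score(2.0, q, [0.6, 0.8]))  # embedding aligned with the query
    print(combined_score(2.0, q, [0.8, -0.6])) # embedding orthogonal to the query
```

Because the factor is 1.0 for documents without an embedding, such documents keep their keyword score instead of being dropped or zeroed out.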
| 677 | - | |
| 678 | -# Test vector: | |
| 679 | -# [-0.05291186273097992, 0.0274342093616724, -0.016730275005102158, 0.010487289167940617, -0.022640341892838478, -0.048682719469070435, 0.04544096067547798, 0.023079438135027885, 0.007221410982310772, 0.023566091433167458, 0.026696473360061646, 0.08252757787704468, -0.042835772037506104, 0.0009668126585893333, -0.02860398218035698, -0.004426108207553625, -0.002644421299919486, -0.027699561789631844, 0.005749804899096489, -0.04468372091650963, -0.0296687763184309, -0.009487600065767765, 0.020041221752762794, 0.00778265530243516, 0.008522099815309048, 0.03497027978301048, -0.021573258563876152, -0.028293319046497345, -8.54598984005861e-05, -0.03164539486169815, -0.017121458426117897, -0.0006902766763232648, 0.04650883004069328, -0.030234992504119873, -0.010207684710621834, -0.035288386046886444, -0.0047269039787352085, -0.0006454040994867682, -0.056146346032619476, 0.008901881985366344, 0.010757357813417912, -0.013022932223975658, 0.04627145081758499, -0.020669423043727875, -0.02031278982758522, -0.052186835557222366, -0.0148158585652709, -0.018267231062054634, -0.059003304690122604, -0.011793344281613827, 0.027096575126051903, 0.019299808889627457, 0.04161312058568001, -0.019393721595406532, -0.02361445501446724, 0.07711422443389893, -0.02068573422729969, -0.004702702630311251, -0.011135494336485863, 0.0101374052464962, -0.020808257162570953, 0.011924360878765583, -0.020093027502298355, -0.007138500455766916, 0.014727798290550709, 0.05770261213183403, 0.017841406166553497, 0.044339124113321304, -0.01490224339067936, -0.008343652822077274, -0.04842463508248329, 0.0336640290915966, -0.004893577191978693, -0.021536342799663544, -0.032384153455495834, -0.009452177211642265, -0.027460120618343353, -0.009426826611161232, 0.006357531528919935, 0.019494572654366493, 0.009722599759697914, -0.00497430982068181, 0.023032115772366524, 0.05221958085894585, -0.01671120524406433, 0.061740316450595856, -0.06789620220661163, -0.023851843550801277, -0.02249223366379738, 
-0.01231105625629425, -0.0499565526843071, 0.004251780919730663, 0.05466651916503906, -0.024449756368994713, -0.034151963889598846, 0.037387508898973465, -0.0016276679234579206, -0.02609393745660782, 0.01800747588276863, -0.0028136332985013723, -0.06036405637860298, 0.028903907164931297, 0.006318055558949709, 0.012870929203927517, -0.0021476889960467815, -0.012034566141664982, -0.008372323587536812, 0.024942906573414803, 0.08258169889450073, 0.006757829803973436, 0.032017264515161514, -0.012414710596203804, 0.014826267957687378, -0.040858786553144455, -0.0060302577912807465, 0.00843990221619606, -0.031066348776221275, -0.06313654035329819, 0.0056659989058971405, -0.007768781390041113, 0.011673268862068653, 0.007261875085532665, 0.006112886127084494, -0.07374890148639679, 0.06602894514799118, -0.05385972931981087, -0.0010994652984663844, 0.05939924344420433, 0.015503636561334133, 0.034621711820364, 0.008040975779294968, -0.023962488397955894, -0.06270411610603333, 0.00027893096557818353, -0.0436306893825531, -0.006309020332992077, 0.02416943572461605, -0.015391307882964611, -0.012442439794540405, -0.003181715961545706, -0.0021985983476042747, 0.008671553805470467, 0.004063367377966642, -0.02560708485543728, 0.03469422832131386, -0.04249674826860428, -0.013552767224609852, -0.052823010832071304, 0.014670411124825478, -0.011493593454360962, 0.024076055735349655, 0.056352417916059494, -0.008510314859449863, 0.015936613082885742, 0.003935575485229492, 0.0037949192337691784, 0.015074086375534534, 0.016583971679210663, -0.0057802870869636536, 0.005751866847276688, -0.009386995807290077, -0.03710195794701576, -0.03144300729036331, -0.07106415182352066, -0.003882911056280136, -0.010697683319449425, -0.014338435605168343, 0.007036983501166105, -0.035716522485017776, 0.06593189388513565, 0.007752529811114073, -0.030261363834142685, -0.02513342909514904, -0.039278656244277954, 0.015320679172873497, -0.012659071013331413, 0.014207725413143635, 0.010264124721288681, 
0.01617652177810669, -0.022644126787781715, -0.031033707782626152, 0.04160666465759277, -0.05329348146915436, 0.02423500455915928, -0.019389694556593895, 0.008645910769701004, -0.005958682857453823, -0.03648180514574051, 0.011972597800195217, 0.037404924631118774, -0.007001751102507114, -0.05138246342539787, 0.0013400549069046974, -0.03268183395266533, 0.07687076926231384, -0.02033335529267788, -0.020667986944317818, 0.0038236891850829124, 0.029960744082927704, 0.015430699102580547, 0.05047214776277542, 0.0052254535257816315, 0.013995353132486343, -0.031164521351456642, -0.014291719533503056, 0.015829795971512794, -0.0013409113744273782, -0.044300951063632965, 0.045415859669446945, -0.005037966184318066, -0.03883415088057518, 0.027200160548090935, 0.008182630874216557, -0.046456750482320786, -0.029778052121400833, 0.02067168429493904, -0.006381513085216284, -0.04693000763654709, 0.009974686428904533, 0.03109011799097061, -0.012696364894509315, 0.030124813318252563, 0.02372679114341736, 0.06566771119832993, 0.03553507477045059, -0.032816141843795776, 0.028003521263599396, 0.06498659402132034, -0.013530750758945942, 0.0312667116522789, -0.015660811215639114, -0.00776742585003376, -0.004829467739909887, -0.015968922525644302, 0.04765664413571358, -0.0026502758264541626, 0.01891564577817917, 0.04119837284088135, 0.012158435769379139, 0.008338023908436298, -0.006039333995431662, 0.0630166307091713, -0.02758428454399109, 0.029347822070121765, -0.030129415914416313, 0.023165738210082054, 0.04064684361219406, 0.04446929693222046, -0.006133638322353363, -0.013095719739794731, -0.041152223944664, -0.01038535125553608, 0.01738007925450802, 0.0010595708154141903, -0.055003564804792404, 0.036829687654972076, -0.030270753428339958, -0.009607627056539059, 0.014103117398917675, 0.005140293389558792, 0.032931022346019745, 0.026972685009241104, -0.00039128100615926087, 0.00550195062533021, 0.062454141676425934, 0.02344602160155773, -0.01688288524746895, 0.011600837111473083, 
0.009648085571825504, 0.012827200815081596, 0.02368510514497757, -0.044808436185121536, 0.006574536208063364, 0.03677171841263771, 0.021754244342446327, -0.0031720376573503017, -0.03498553857207298, -0.027119319885969162, 0.05196662247180939, 0.0063033513724803925, -0.002766692778095603, -0.03879206255078316, -0.005737128667533398, -0.02351462095975876, 0.04338989034295082, -0.03623301535844803, 0.003727369010448456, 0.044172726571559906, 0.06180792301893234, -0.025736358016729355, 0.01280374638736248, -0.01768171414732933, 0.0413120836019516, 0.036350950598716736, 0.020034022629261017, -0.00938474852591753, -0.04920303076505661, -0.1626604050397873, 0.0016566020203754306, -0.010797491297125816, 0.0037245014682412148, 0.039030417799949646, -0.009399985894560814, 0.016659803688526154, -0.047097429633140564, -0.00987484585493803, 0.020634479820728302, 0.005361238028854132, -0.05283225327730179, 0.002501025330275297, -0.004766151309013367, 0.00850654486566782, -0.0050267502665519714, -0.046555373817682266, 0.012670878320932388, 0.0018581973854452372, -0.010647253133356571, 0.01990092545747757, 0.02013244479894638, 0.04490885138511658, 0.029433563351631165, -0.01408607978373766, 0.029722925275564194, 0.04512600228190422, -0.04305345192551613, 0.0053901285864412785, -0.010685979388654232, 0.01516974437981844, 0.02340293675661087, -0.014181641861796379, -0.0013334851246327162, 0.020624764263629913, 0.06469231843948364, 0.016654038801789284, -0.043994754552841187, 0.025707466527819633, -0.004160136915743351, 0.021129926666617393, 0.041262850165367126, 0.006293899845331907, 0.056005991995334625, -0.006883381400257349, -0.07502268254756927, -0.02920101210474968, -0.019043054431676865, 0.00737513042986393, 0.013621360063552856, -0.02504715882241726, -0.01138006430119276, -0.010744514875113964, -0.02502342313528061, -0.03335903584957123, 0.012180354446172714, -0.03276645019650459, 0.05202409625053406, 0.03246080502867699, 0.03068908303976059, -0.029587913304567337, 
-0.04850265011191368, -0.006388102192431688, -0.03203853219747543, -0.050761956721544266, -0.021925227716565132, 0.036384399980306625, -0.011895880103111267, -0.007408954203128815, -0.012625153176486492, 0.0024322718381881714, -0.012196220457553864, -0.007011729292571545, -0.0337890200316906, -0.030034994706511497, 0.04638829082250595, -0.028362803161144257, -0.01176459901034832, 0.00956833828240633, -0.12054562568664551, -0.020540419965982437, 0.014624865725636482, -0.025515791028738022, -0.005027926992624998, -0.03586679324507713, -0.05585843697190285, -0.01700599677860737, -0.00044939795043319464, 0.029278729110956192, 0.25503888726234436, -0.024952411651611328, 0.005794796161353588, -0.007252118084579706, 0.03397773951292038, -0.0030146583449095488, -0.016645856201648712, -0.0008194005931727588, 0.02789629064500332, -0.039116114377975464, -0.035631854087114334, 0.04917449131608009, -0.006455820053815842, -0.011818122118711472, -0.00958359707146883, 0.013176187872886658, 0.037286531180143356, 0.022334400564432144, 0.05832865461707115, 0.010104321874678135, -0.04915979504585266, -0.022671189159154892, -0.016606582328677177, -0.007431587669998407, 0.0025214774068444967, -0.038979604840278625, 0.014895224012434483, 0.03583076596260071, 0.0006473385728895664, 0.04958082735538483, -0.017827684059739113, 0.015710417181253433, 0.062094446271657944, -0.014381879940629005, 0.0002880772517528385, 0.004948006477206945, -8.711735063116066e-06, -0.0029445397667586803, -0.044325683265924454, 0.047702621668577194, -0.03197811171412468, -0.02109563909471035, 0.03041824884712696, 0.021582895889878273, -0.004118872340768576, -0.025784745812416077, 0.06275995075702667, 0.006879465654492378, 0.04185185581445694, 0.02031264826655388, -0.02274201810359955, -0.009617358446121216, -0.04315454140305519, -0.033287111669778824, -0.025126483291387558, -0.003923895303159952, -0.041508499532938004, -0.0009355457150377333, -0.033565372228622437, 0.02229289337992668, -0.0026574484072625637, 
-0.0028596664778888226, -0.02223617024719715, -0.016868866980075836, 0.04172029718756676, 0.0014162511797621846, -0.037737537175416946, -0.010155809111893177, -0.010357595980167389, 0.04541466012597084, 0.03563382104039192, -0.019189776852726936, -0.012577632442116737, -0.013781189918518066, 0.026566311717033386, 0.020911909639835358, 0.02781282551586628, 0.053938526660203934, 0.0194545891135931, 0.0015139722963795066, -0.0357731431722641, -0.005088387057185173, 0.004257760010659695, 0.04332628846168518, -0.012149352580308914, -0.04734082147479057, 0.018029984086751938, -0.01322091929614544, -0.059820450842380524, -0.03677783161401749, -0.006745075341314077, -0.02209635078907013, -0.012663901783525944, -0.0059855030849576, 0.016270749270915985, -0.00725028058513999, 0.03019685670733452, 0.010252268984913826, -0.06314245611429214, -0.005512078758329153, -0.016377074643969536, -0.0014438428916037083, 0.029021194204688072, -0.015355946496129036, 0.02559172362089157, -0.04241044819355011, 0.010147088207304478, -0.016036594286561012, 0.023162752389907837, 0.047236304730176926, 0.0166736152023077, 0.01226564310491085, -0.015224735252559185, -0.01298521552234888, -0.008012642152607441, 0.028470756486058235, -0.013741613365709782, 0.019896863028407097, 0.01720179058611393, 0.01571199856698513, 0.030143165960907936, -0.02969514951109886, 0.014739652164280415, -0.01854291744530201, -0.045576371252536774, -0.04516203701496124, 0.02147211693227291, 0.007073952350765467, 0.008106761611998081, -0.01828523352742195, 0.002731812885031104, -0.04545339569449425, 0.019007619470357895, 0.03504781052470207, 0.037705861032009125, -0.0045634908601641655, 0.0070000626146793365, 0.0037205498665571213, 0.005224148277193308, -0.017060590907931328, -0.04246727377176285, -0.006265614647418261, -0.015374364331364632, -0.03380871191620827, -0.005029333755373955, 0.007065227720886469, 0.003886009333655238, 0.008613690733909607, -0.012133199721574783, 0.005556005053222179, -0.021959641948342323, 
0.04834386706352234, 0.03787781298160553, -0.057815466076135635, 0.015909207984805107, -0.03855409845709801, 0.0018244135426357388, 0.04186264052987099, -0.054983459413051605, 0.006219237111508846, 0.03494301065802574, 0.023722950369119644, 0.0312604121863842, 0.05597991123795509, -0.030345493927598, 0.016615940257906914, -0.0207205917686224, 0.055960651487112045, -0.012713379226624966, -0.0261109359562397, 0.014332456514239311, -0.017245708033442497, -0.06636268645524979, 0.00592504907399416, 0.04649018123745918, -0.018362276256084442, 0.009620632976293564, -0.0044480785727500916, -0.0014729035319760442, 0.015621249563992023, 0.0367378331720829, -0.011857259087264538, -0.045088741928339005, 0.0006832792423665524, 0.02601524256169796, -0.02120809443295002, 0.018104318529367447, 0.008069046773016453, 0.013658273033797741, 0.004183551296591759, -0.04133244603872299, 0.05436890944838524, 0.009334285743534565, -0.014695074409246445, -0.011054124683141708, 0.009796642698347569, -0.008759389631450176, -0.06399217247962952, -0.0028859861195087433, -0.008736967109143734, -0.003506746841594577, 0.008123806677758694, 0.008794951252639294, -0.02940259501338005, 0.009597218595445156, -0.02197900228202343, -0.02082076109945774, 0.023915970697999, -0.059058744460344315, -0.010253551416099072, 0.024443935602903366, -0.029604850336909294, 0.008135135285556316, 0.03568771481513977, -0.017330091446638107, -0.003135789418593049, 0.035103678703308105, 0.0370408296585083, -0.01022601593285799, -0.045891791582107544, 0.01726667769253254, -0.008570673875510693, 0.015297998674213886, -0.015412220731377602, -0.01425748411566019, 0.031544867902994156, 0.013110813684761524, -0.057211123406887054, -0.0008968000765889883, 0.001981658162549138, -0.002101168967783451, -0.09516698867082596, -0.034693196415901184, 0.011157260276377201, 0.010063023306429386, -0.02550840750336647, 0.009959851391613483, 0.022281678393483162, -0.03908146917819977, 0.02196437120437622, 0.03520793840289116, 
-0.06856158375740051, -0.004901218693703413, 0.1122148334980011, -0.01498009730130434, 0.03165500983595848, -0.07618033140897751, -0.014297851361334324, 0.02150021120905876, 0.005999598652124405, -0.013493427075445652, 0.013868110254406929, 0.00079053093213588, 0.006475066766142845, 0.000955471652559936, -0.03403160721063614, -0.02295752801001072, 0.0041635241359472275, -0.03955964744091034, -0.04943346977233887, 0.00032474088948220015, 0.039174411445856094, -0.011974001303315163, 0.008057610131800175, 0.03809700161218643, -0.041719768196344376, 0.037615906447172165, -0.035932306200265884, 0.008293192833662033, -0.03261689469218254, -0.023902395740151405, -7.811257091816515e-05, -0.011328466236591339, -0.026476409286260605, 0.055370282381772995, 0.03128054738044739, -0.014991461299359798, 0.017835773527622223, 0.01642710715532303, 0.029273470863699913, -0.012139911763370037, 0.01371818222105503, -0.013113478198647499, -0.04071088507771492, 0.0233455840498209, -0.019497444853186607, -0.01747158169746399, 0.02493683062493801, 0.024074571207165718, -0.03614620864391327, -0.025289475917816162, -0.04030011221766472, -0.046772539615631104, 0.009969661012291908, 0.003724620910361409, 0.007474626414477825, -0.04855594411492348, 0.04697829484939575, 0.010695616714656353, 0.027944304049015045, -0.003937696572393179, -0.011591222137212753, -0.011533009819686413, 0.03215765953063965, -0.04699324443936348, -9.356102236779407e-05, -0.01535400003194809, -0.010238519869744778, 0.002703386126086116, 0.04759520664811134, 0.0074842446483671665, -0.04050430282950401, -0.028402622789144516, -0.03205197677016258, 0.011288953013718128, 0.006053865421563387, 0.04641448333859444, 0.005652922671288252, -0.018560705706477165, 0.02581481821835041, 0.00962467584758997, -0.017888177186250687, -0.026476262137293816, -0.005547264125198126, 0.012222226709127426, -0.004069746006280184, -0.020438821986317635, 0.01929863728582859, -0.0053736320696771145, 0.02221786603331566, -0.007175051141530275, 
0.003961225971579552, -0.012380941770970821, -0.0040277824737131596, 0.009086307138204575, 0.012202796526253223, 0.018483169376850128, 0.017530532553792, 0.0422886498272419, 0.04987001419067383, 0.003722204128280282, 0.06421508640050888, -0.016258088871836662, -0.027659112587571144, 0.004458434879779816, -0.02898143045604229, -0.014475414529442787, 0.032039571553468704, -0.025734663009643555, -0.01585981249809265, 0.04900333285331726, -0.06422552466392517, -0.0007134959450922906, -0.04035528376698494, 0.03290264680981636, -0.0018848407780751586, 0.0068516512401402, 0.00032433189335279167, -0.002669606124982238, -0.017596688121557236, -0.026878179982304573, 0.014075388200581074, 0.020072080194950104, -0.00295435544103384, -0.01918656937777996, -0.007689833641052246, 0.039347097277641296, 0.0026605715975165367, 0.011779646389186382, 0.04189120978116989, -0.03846775367856026, -0.01993645168840885, 0.04546443000435829, 0.05682912468910217, -0.012384516187012196, -0.004507445730268955, 0.007476931903511286, -0.01160018052905798, 0.006559243891388178, 0.04354899004101753, 0.006185194011777639, 0.028355205431580544, -0.006518798414617777, -0.029528537765145302, 0.06740271300077438, -0.052158474922180176, 0.0025031850673258305, -0.005957300774753094, 0.00500349560752511, 0.022637680172920227, -0.0027129461523145437, -0.011677206493914127, -0.042732879519462585, -0.0021236639004200697, -0.1499215066432953, 0.02914350852370262, -0.031246500089764595, -0.027244996279478073, -0.006904688663780689, 0.01088196225464344, 0.01271661464124918, -0.0430884025990963, -0.020760131999850273, -0.006593034137040377, -0.0007962957606650889, -0.031729113310575485, 0.052976224571466446, -0.03149586543440819, 0.0392388291656971, 0.023318620398640633, -0.01383691094815731, 0.02858218550682068, 0.023135144263505936, 0.026421336457133293, 0.00027594034327194095, -0.03901490569114685, 0.008533132262527943, -0.03802476078271866, -0.011105065234005451, -0.028275510296225548, 0.04846742004156113, 
0.021237077191472054, -0.027375172823667526, -0.02717825025320053, -0.031243441626429558, -0.021638689562678337, 0.024066096171736717, 0.05689090117812157, -0.04352620989084244, 0.03599394112825394, 0.05153508856892586, 0.002263782313093543, 0.047110624611377716, 0.006084555760025978, 0.003244618885219097, -0.0015037712873890996, 0.027960799634456635, -0.013650861568748951, 0.03281615301966667, 0.012363187968730927, 0.02162906341254711, -0.010951842181384563, -0.02786285988986492, 0.03754381462931633, 0.01957041770219803, -0.017010418698191643, -0.008339766412973404, 0.0755641758441925, 0.023412147536873817, -0.005748848430812359, -0.05465301498770714, -0.02190011739730835, 0.0054182386957108974, 0.032733004540205, -0.05342638120055199, 0.009907999075949192, -0.02370712347328663, -0.015652501955628395, -0.011254304088652134, -0.019827252253890038, -0.021032121032476425, -0.02607329562306404, -0.0008710312540642917, -0.06800976395606995, -0.017296750098466873, 0.015312970615923405, -0.015649013221263885, -0.016449443995952606, -0.012058117426931858, 0.002104945247992873, 0.020476385951042175, 0.014795565977692604, -0.02145536057651043, -0.028734024614095688, -0.041212357580661774, -0.008211270906031132, 0.033569078892469406, -0.0033273063600063324, -0.02339683100581169, 0.0421740785241127, -0.009677124209702015, -0.006869456730782986, -0.016001028940081596, 0.029614608734846115, -0.06062136963009834, -0.011824233457446098, 0.012096629478037357, -0.028248939663171768, -0.03703905642032623, 0.012119539082050323, -0.041021380573511124, 0.01975782960653305, -0.028443211689591408, 0.020459437742829323, 0.0073023103177547455, -0.06498327851295471, -0.004016770515590906, 0.06460512429475784, -0.053343966603279114, 0.03865537419915199, -5.4113028454594314e-05, -0.008642046712338924, -0.009384138509631157, -0.037736788392066956, -0.035090748220682144, 0.018596891313791275, -0.008763385005295277, 0.040228284895420074, 0.03811536356806755, -0.034618355333805084, 
-0.004665717948228121, 0.04813361540436745, -0.004303373862057924, 0.00795511994510889, -0.017838604748249054, 0.00563138909637928, -0.03171280398964882, -0.0259436946362257, 0.004301885142922401, -0.02739236131310463, 0.03270035237073898, 0.009064823389053345, -0.0363747663795948, 0.02325567975640297, 0.03453107923269272, -0.012906554155051708, 0.028347544372081757, 0.01234712265431881, 0.030589573085308075, 0.0024874424561858177, -0.0173872709274292, 0.0247347354888916, 0.004171399865299463, 0.02350561134517193, -0.05499064922332764, -0.023146219551563263, -0.012485259212553501, -0.0228674728423357, 0.013267520815134048, 0.021304689347743988, -0.018937893211841583, -0.0260267723351717, -0.022532619535923004, 0.0030378480441868305, -0.008528024889528751, -0.030528495088219643, -0.009305189363658428, -0.0074027362279593945, -0.020641637966036797, 0.006984233390539885, 0.04300186410546303, -0.033014994114637375, -0.006089311558753252, 0.04753036051988602, -0.036625705659389496, -0.04691743850708008, -0.007467558141797781, 0.0652017593383789, -0.03861508145928383, -0.00741452956572175, 0.003471594536677003, 0.016132064163684845, 0.01570185460150242, 0.018733495846390724, -0.019025148823857307, 0.003490244736894965, -0.017714614048600197, -0.003447450464591384, 0.015267218463122845, 0.015076974406838417, -0.002631498035043478, 0.005311752203851938, 0.014075293205678463, 0.0026123111601918936, 0.011874910444021225, 0.0714355856180191, 0.06941138952970505, 0.022251378744840622, 0.01972009800374508, 0.04719123989343643, 0.023544959723949432, 0.017852554097771645, 0.01843070052564144, -0.05294886603951454, -0.008682304993271828, 0.010625398717820644, 0.0428495928645134, 0.002173527143895626, 0.06291069090366364, 0.024296458810567856, 0.008714474737644196, 0.06520587205886841, 0.015627536922693253, 0.04247526824474335, 0.0009774811333045363, 0.00738496845588088, -0.024803027510643005, 0.013228596188127995, -0.037615202367305756, -0.028807995840907097, 0.012890785001218319, 
-0.01587829552590847, -0.01928863860666752, 0.0011809614952653646, -0.026926854625344276, -0.020252779126167297, -0.010968486778438091, -0.015348547138273716, 0.008559435606002808, -0.009286923334002495, 0.0014621232403442264, 0.03831499442458153, 0.016517579555511475, 0.037184324115514755, -0.041231196373701096, 0.03757374733686447, -0.039465345442295074, -0.04308579862117767, 0.0011091071646660566, -0.029794104397296906, 0.008459310978651047, -0.01713281124830246, -0.016625113785266876, -0.05582521855831146, -0.0415986105799675, 0.028725938871502876, 0.04966316372156143, 0.012718678452074528, -0.025533588603138924, 0.013822318986058235, -5.168768620933406e-05, 0.02616700902581215, -0.06113629788160324, -0.03175340220332146, 0.03593592345714569, -0.04014921560883522, -0.020605407655239105, 0.02186705358326435] | |
| 1 | +# Elasticsearch Documentation | |
| 2 | + | |
| 3 | +## Related Links | |
| 4 | +- API documentation: http://rap.essa.top:88/workspace/myWorkspace.do?projectId=78#2187 | |
| 5 | +- Kibana console: http://43.166.252.75:5601/app/dev_tools#/console/shell | |
| 6 | + | |
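| 7 | +## Tokenization | |
| 8 | + | |
| 9 | +Installing the Ansj tokenizer plugin | |
| 10 | +Of the Chinese tokenizers available for ES, hanLP and ansj give the best results, followed by jieba. | |
| 11 | + | |
| 12 | +Our legacy Solr search replaced ik with mmseg several years ago, but I could not find an mmseg plugin for ES. | |
| 13 | + | |
| 14 | +So that tokenization is no worse than in the old version, ansj is installed here for now. | |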
| 15 | + | |
| 16 | +### 1. Download the plugin | |
| 17 | +Choose the matching version from [elasticsearch-analysis-ansj releases](https://github.com/NLPchina/elasticsearch-analysis-ansj/releases) and download it: | |
| 18 | + | |
| 19 | +- ES 8.18: | |
| 20 | +```bash | |
| 21 | +wget https://github.com/NLPchina/elasticsearch-analysis-ansj/archive/refs/tags/v8.18.0.zip | |
| 22 | +``` | |
| 23 | + | |
| 24 | +- ES 8.17: | |
| 25 | +```bash | |
| 26 | +wget https://github.com/NLPchina/elasticsearch-analysis-ansj/archive/refs/tags/v8.17.6.zip | |
| 27 | +``` | |
| 28 | + | |
| 29 | +### 2. Build | |
| 30 | +Run `mvn package`. On a successful build, the plugin archive is generated under `target/releases/`: | |
| 31 | +`elasticsearch-analysis-ansj-<version>-release.zip` | |
| 32 | + | |
| 33 | +### 3. Installation steps | |
| 34 | +1. Go to the ES installation path (default: `/usr/share/elasticsearch/`) | |
| 35 | +2. Run the install command: | |
| 36 | +```bash | |
| 37 | +bin/elasticsearch-plugin install file:///xxx/absolute/path/to/elasticsearch-analysis-ansj-8.18.0.0-release.zip | |
| 38 | +``` | |
| 39 | +3. Restart the service: | |
| 40 | +```bash | |
| 41 | +systemctl restart elasticsearch | |
| 42 | +``` | |
| 43 | + | |
| 44 | +Installation guides for the other tokenizer plugins: | |
| 45 | +《3.1_hanlp安装.md》 | |
| 46 | +《3.2_jieba插件安装.md》 | |
| 47 | +These have been installed on ES 8, but not tried on the specific versions 8.17 and 8.18. | |
| 48 | + | |
| 49 | +### 4. Configuration | |
| 50 | +Stop-word and synonym settings live in `<ES_HOME>/config/elasticsearch-analysis-ansj/ansj.cfg.yml` (not used yet). | |
| 51 | + | |
| 52 | +## Field Reference | |
| 53 | + | |
| 54 | +```bash | |
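| 55 | +Required fields: | |
| 56 | +id product skuId | |
| 57 | +goods_id product spuId | |
| 58 | +buyer_id ID of the exclusive buyer the product belongs to | |
| 59 | +trader_buyer_ids exclusive buyer IDs of the platform customers under the owning trader | |
| 60 | +goods_certification_types product certificate types | |
| 61 | +supplier_code supplier code | |
| 62 | +supplier_name supplier name | |
| 63 | +supplier_certification_code supplier company certificate codes (list) | |
| 64 | +auth_buyer_level_list buyer levels that can see the product (set) | |
| 65 | +show_price_level_list buyer levels that can see the price (set) | |
| 66 | +goods_composition composition list (materials) | |
| 67 | +compositions_main_secondary main/secondary material (main: 1, secondary: 2), format: materialCode_mainSecondaryType | |
| 68 | +goods_key_word_zh product keywords (Chinese) | |
| 69 | +goods_key_word_en product keywords (English) | |
| 70 | +goods_key_word_ru product keywords (Russian) | |
| 71 | +goods_copyright copyright (own, third-party, unlicensed, replica "A货") | |
| 72 | +goods_main_material main material (dictionary: material) | |
| 73 | +is_in_new_protect whether the product is in its new-product protection period (0 no, 1 yes) | |
| 74 | +goods_new_protect_date_stamp new-product protection period date, as a timestamp | |
| 75 | +goods_attribute_name_zh SPU attributes in Chinese (list) | |
| 76 | +goods_attribute_name_en SPU attributes in English (list) | |
| 77 | +goods_attribute_name_ru SPU attributes in Russian (list) | |
| 78 | +purchase_moq purchase MOQ | |
| 79 | +ts time the indexing was triggered | |
| 80 | +deliver_day lead time | |
| 81 | +factory_no factory item number | |
| 82 | +factory_no_buyer factory item number (customer) | |
| 83 | +fir_on_sell_time first listing time | |
| 84 | +fir_on_sell_time_stamp first listing time, as a timestamp | |
| 85 | +no product code | |
| 86 | +hs_no 宏升 (Hongsheng) code | |
| 87 | +package_type packaging type value (from product attribute code: PKG) | |
| 88 | +package_type_id packaging type ID (from product attribute code: PKG) | |
| 89 | +labelId_by_skuId_essaone_* essaone product label, keyed by country code | |
| 90 | +sale_goods_certificate_* product certificate ID, keyed by country code | |
| 91 | +labelId_by_skuId_essa_* essa product label, keyed by region ID | |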
| 92 | +``` | |
| 93 | + | |
| 94 | +## Mapping Configuration | |
| 95 | +See the file `create_index.sh`. | |
| 96 | + | |
| 97 | +## Quick Start | |
| 98 | + | |
| 99 | +### Shell | |
| 100 | +See [Indexing and query tests](../docs/3.3_索引和查询测试.md), which covers common query operations run locally on the ES server. | |
| 101 | + | |
| 102 | +### Python | |
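| 103 | +- [test_index_and_search.py](../tests/test_index_and_search.py) is a simple example that creates an index, imports data, and queries it. | |
| 104 | +- [batch_bulk_goods.py](../batch_bulk_goods.py) reads all data from the last 3 years via SQL and writes it to the `goods` index through the bulk API, one batch (1000 rows) at a time. | |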
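
The batching described for `batch_bulk_goods.py` can be sketched roughly as follows. This is not the script itself: the helper names `iter_batches` and `to_bulk_actions` are hypothetical, and the sketch assumes each row is a dict whose `id` (skuId) serves as the document ID. The action dicts use the `_index` / `_id` / `_source` shape accepted by `elasticsearch.helpers.bulk`.

```python
from itertools import islice


def iter_batches(rows, batch_size=1000):
    """Yield consecutive lists of at most batch_size rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def to_bulk_actions(batch, index_name="goods"):
    """Turn row dicts into bulk actions; assumes every row carries an 'id' key."""
    return [
        {"_index": index_name, "_id": row["id"], "_source": row}
        for row in batch
    ]
```

Each batch returned by `iter_batches` would then be converted with `to_bulk_actions` and passed to the bulk helper, so that only one batch is held in memory at a time.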
| 105 | + | |
| 106 | +### Kibana | |
| 107 | + | |
| 108 | +#### Tokenization | |
| 109 | +```bash | |
| 110 | +# Index-time tokenization | |
| 111 | +GET /_cat/ansj?text=14寸第4代真眼珠实身冰雪公仔带手动大推车,搪胶雪宝宝&type=index_ansj | |
| 112 | + | |
| 113 | +# Query-time tokenization | |
| 114 | +GET /_cat/ansj?text=14寸第4代真眼珠实身冰雪公仔带手动大推车,搪胶雪宝宝&type=query_ansj | |
| 115 | + | |
| 116 | +# View the configuration | |
| 117 | +GET /_cat/ansj/config | |
| 118 | +``` | |
| 119 | +#### Queries | |
| 120 | +GET /goods/_search | |
| 121 | +{ | |
| 122 | + "query": { | |
| 123 | + "match_all": {} | |
| 124 | + }, | |
| 125 | + "size": 5 | |
| 126 | +} | |
| 127 | + | |
| 128 | +#### 1. View tokenization results for a field | |
| 129 | +```bash | |
| 130 | +# Tokenization result for a Chinese name | |
| 131 | +GET /_cat/ansj?text=14寸第4代真眼珠实身冰雪公仔带手动大推车&type=index_ansj | |
| 132 | + | |
| 133 | +# Tokenization result for an English name | |
| 134 | +GET /_cat/ansj?text=14 inch 4th generation real eye snow doll with manual cart&type=standard | |
| 135 | +``` | |
| 137 | +#### 2. View 10 random documents from the index | |
| 138 | +```bash | |
| 139 | +GET /goods/_search | |
| 140 | +{ | |
| 141 | + "size": 10, | |
| 142 | + "query": { | |
| 143 | + "function_score": { | |
| 144 | + "query": { "match_all": {} }, | |
| 145 | + "random_score": {} | |
| 146 | + } | |
| 147 | + } | |
| 148 | +} | |
| 149 | +``` | |
| 150 | + | |
| 151 | +#### 3. Keyword query | |
| 152 | +```bash | |
| 153 | +# Simple keyword match | |
| 154 | +GET /goods/_search | |
| 155 | +{ | |
| 156 | + "query": { | |
| 157 | + "match": { | |
| 158 | + "name_zh": "冰雪公仔" | |
| 159 | + } | |
| 160 | + } | |
| 161 | +} | |
| 162 | + | |
| 163 | +# Keyword match across multiple fields | |
| 164 | +GET /goods/_search | |
| 165 | +{ | |
| 166 | + "query": { | |
| 167 | + "multi_match": { | |
| 168 | + "query": "冰雪公仔", | |
| 169 | + "fields": ["name_zh", "sub_name_zh", "category_name_zh"] | |
| 170 | + } | |
| 171 | + } | |
| 172 | +} | |
| 173 | +``` | |
| 174 | + | |
| 175 | +#### 4. Vector query | |
| 176 | +```bash | |
| 177 | +# Query by vector similarity | |
| 178 | +GET /goods/_search | |
| 179 | +{ | |
| 180 | + "query": { | |
| 181 | + "script_score": { | |
| 182 | + "query": { "match_all": {} }, | |
| 183 | + "script": { | |
| 184 | + "source": "cosineSimilarity(params.query_vector, 'name_prefix') + 1.0", | |
| 185 | + "params": { | |
| 186 | + "query_vector": [0.1, 0.2, ...] # 1024-dimensional vector | |
| 187 | + } | |
| 188 | + } | |
| 189 | + } | |
| 190 | + } | |
| 191 | +} | |
| 192 | +``` | |
| 193 | + | |
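| 194 | +#### 5. SKU ID query | |
| 195 | +```bash | |
| 196 | +# Exact match on the SKU ID (note: the field reference lists `id` as the skuId and `goods_id` as the spuId) | |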
| 197 | +GET /goods/_search | |
| 198 | +{ | |
| 199 | + "query": { | |
| 200 | + "term": { | |
| 201 | + "goods_id": "2817667" | |
| 202 | + } | |
| 203 | + } | |
| 204 | +} | |
| 205 | +``` | |
| 206 | + | |
| 207 | +#### 6. Name query tests | |
| 208 | +```bash | |
| 209 | +# Fuzzy match on the Chinese name | |
| 210 | +GET /goods/_search | |
| 211 | +{ | |
| 212 | + "query": { | |
| 213 | + "match": { | |
| 214 | + "name_zh": { | |
| 215 | + "query": "冰雪公仔", | |
| 216 | + "fuzziness": "AUTO" | |
| 217 | + } | |
| 218 | + } | |
| 219 | + } | |
| 220 | +} | |
| 221 | + | |
| 222 | +# Match on the English name | |
| 223 | +GET /goods/_search | |
| 224 | +{ | |
| 225 | + "query": { | |
| 226 | + "match": { | |
| 227 | + "name_en": "snow doll" | |
| 228 | + } | |
| 229 | + } | |
| 230 | +} | |
| 231 | + | |
| 232 | +# Match on the Russian name | |
| 233 | +GET /goods/_search | |
| 234 | +{ | |
| 235 | + "query": { | |
| 236 | + "match": { | |
| 237 | + "name_ru": "снежная кукла" | |
| 238 | + } | |
| 239 | + } | |
| 240 | +} | |
| 241 | + | |
| 242 | +# Phrase match using match_phrase | |
| 243 | +GET /goods/_search | |
| 244 | +{ | |
| 245 | + "query": { | |
| 246 | + "match_phrase": { | |
| 247 | + "name_zh": "冰雪公仔" | |
| 248 | + } | |
| 249 | + } | |
| 250 | +} | |
| 251 | + | |
| 252 | +# Multi-language phrase match using match_phrase | |
| 253 | +GET /goods/_search | |
| 254 | +{ | |
| 255 | + "query": { | |
| 256 | + "bool": { | |
| 257 | + "should": [ | |
| 258 | + { | |
| 259 | + "match_phrase": { | |
| 260 | + "name_zh": "冰雪公仔" | |
| 261 | + } | |
| 262 | + }, | |
| 263 | + { | |
| 264 | + "match_phrase": { | |
| 265 | + "name_en": "snow doll" | |
| 266 | + } | |
| 267 | + }, | |
| 268 | + { | |
| 269 | + "match_phrase": { | |
| 270 | + "name_ru": "снежная кукла" | |
| 271 | + } | |
| 272 | + } | |
| 273 | + ], | |
| 274 | + "minimum_should_match": 1 | |
| 275 | + } | |
| 276 | + } | |
| 277 | +} | |
| 278 | + | |
| 279 | +# Loose phrase match using match_phrase with the slop parameter | |
| 280 | +GET /goods/_search | |
| 281 | +{ | |
| 282 | + "query": { | |
| 283 | + "match_phrase": { | |
| 284 | + "name_zh": { | |
| 285 | + "query": "冰雪公仔", | |
| 286 | + "slop": 2 | |
| 287 | + } | |
| 288 | + } | |
| 289 | + } | |
| 290 | +} | |
| 291 | + | |
| 292 | +# Multi-language match_phrase with the slop parameter | |
| 293 | +GET /goods/_search | |
| 294 | +{ | |
| 295 | + "query": { | |
| 296 | + "bool": { | |
| 297 | + "should": [ | |
| 298 | + { | |
| 299 | + "match_phrase": { | |
| 300 | + "name_zh": { | |
| 301 | + "query": "冰雪公仔", | |
| 302 | + "slop": 2 | |
| 303 | + } | |
| 304 | + } | |
| 305 | + }, | |
| 306 | + { | |
| 307 | + "match_phrase": { | |
| 308 | + "name_en": { | |
| 309 | + "query": "snow doll", | |
| 310 | + "slop": 1 | |
| 311 | + } | |
| 312 | + } | |
| 313 | + }, | |
| 314 | + { | |
| 315 | + "match_phrase": { | |
| 316 | + "name_ru": { | |
| 317 | + "query": "снежная кукла", | |
| 318 | + "slop": 2 | |
| 319 | + } | |
| 320 | + } | |
| 321 | + } | |
| 322 | + ], | |
| 323 | + "minimum_should_match": 1 | |
| 324 | + } | |
| 325 | + } | |
| 326 | +} | |
| 327 | +``` | |
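
The multi-language `match_phrase` bodies above all share one shape, so they can be generated rather than written by hand. The following is a minimal sketch under that assumption; the helper name `build_phrase_query` is hypothetical, and the field names are taken from the examples in this section.

```python
def build_phrase_query(phrases, slop=0):
    """Build a bool/should body with one match_phrase clause per non-empty field.

    phrases maps a field name (e.g. "name_zh") to the phrase to search for.
    """
    should = [
        {"match_phrase": {field: {"query": text, "slop": slop}}}
        for field, text in phrases.items()
        if text  # skip languages for which no phrase was supplied
    ]
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}
```

For example, `build_phrase_query({"name_zh": "冰雪公仔", "name_en": "snow doll"}, slop=2)` produces a body equivalent to the two-language variant of the last query above, with the same slop applied to every field.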
| 328 | + | |
| 329 | +#### 7. Multi-language query tests | |
| 330 | +```bash | |
| 331 | +# Query the Chinese and English names at the same time | |
| 332 | +GET /goods/_search | |
| 333 | +{ | |
| 334 | + "query": { | |
| 335 | + "bool": { | |
| 336 | + "should": [ | |
| 337 | + { | |
| 338 | + "match": { | |
| 339 | + "name_zh": "冰雪公仔" | |
| 340 | + } | |
| 341 | + }, | |
| 342 | + { | |
| 343 | + "match": { | |
| 344 | + "name_en": "snow doll" | |
| 345 | + } | |
| 346 | + } | |
| 347 | + ], | |
| 348 | + "minimum_should_match": 1 | |
| 349 | + } | |
| 350 | + } | |
| 351 | +} | |
| 352 | +``` | |
| 353 | + | |
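| 354 | +#### 8. Vector index query tests | |
| 355 | +Note: the dimension of the query vector must match the dimension defined in the index (1024). | |
| 356 | +```bash | |
| 357 | +# Product recommendation by vector similarity | |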
| 358 | +GET /goods/_search | |
| 359 | +{ | |
| 360 | + "query": { | |
| 361 | + "script_score": { | |
| 362 | + "query": { "match_all": {} }, | |
| 363 | + "script": { | |
| 364 | + "source": "cosineSimilarity(params.query_vector, 'ru_name') + 1.0", | |
| 365 | + "params": { | |
| 366 | + "query_vector": [0.1, 0.2, ...] # 1024-dimensional vector | |
| 367 | + } | |
| 368 | + } | |
| 369 | + } | |
| 370 | + }, | |
| 371 | + "size": 10 | |
| 372 | +} | |
| 373 | +``` | |
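
Because a vector of the wrong length only fails once the request reaches ES, it can help to check the dimension on the client first. The following is a minimal sketch, not part of the project code: the helper name `build_vector_query` is hypothetical, the default field `name_prefix` is borrowed from the vector query in section 4, and 1024 comes from the note above.

```python
EXPECTED_DIMS = 1024  # dimension stated in the note above


def build_vector_query(query_vector, field="name_prefix", size=10, dims=EXPECTED_DIMS):
    """Build a script_score body, rejecting vectors whose length is not dims."""
    if len(query_vector) != dims:
        raise ValueError(
            f"query_vector has {len(query_vector)} dims, expected {dims}"
        )
    return {
        "size": size,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # +1.0 keeps the score non-negative, as in the examples above
                    "source": f"cosineSimilarity(params.query_vector, '{field}') + 1.0",
                    "params": {"query_vector": list(query_vector)},
                },
            }
        },
    }
```

The returned dict can be sent as the request body of `GET /goods/_search`.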
| 374 | + | |
| 375 | +#### 9. Combined keyword + vector index query tests | |
| 376 | +```bash | |
| 377 | +# Keyword search boosted by vector similarity | |
| 378 | +GET /goods/_search | |
| 379 | +{ | |
| 380 | + "query": { | |
| 381 | + "function_score": { | |
| 382 | + "query": { | |
| 383 | + "match": { | |
| 384 | + "name_zh": { | |
| 385 | + "query": "冰雪公仔", "boost": 1.0 } | |
| 386 | + } | |
| 387 | + }, | |
| 388 | + "functions": [ | |
| 389 | + { | |
| 390 | + "script_score": { | |
| 391 | + "script": { | |
| 392 | + "source": "cosineSimilarity(params.query_vector, 'name_prefix') + 1.0", | |
| 393 | + "params": { | |
| 394 | + "query_vector": [0.1, 0.2, ...] # 1024-dimensional vector | |
| 395 | + } | |
| 396 | + } | |
| 397 | + } | |
| 398 | + } | |
| 399 | + ], | |
| 400 | + "boost_mode": "multiply" | |
| 401 | + } | |
| 402 | + } | |
| 403 | +} | |
| 404 | + | |
| 405 | +# The script source can handle an empty embedding: "source": "doc['embedding'].isEmpty() ? 1.0 : dotProduct(params.query_vector, 'embedding') + 1.0", | |
| 406 | + | |
| 407 | +Multiplying the two scores together: | |
| 408 | +{ | |
| 409 | + "query": { | |
| 410 | + "function_score": { | |
| 411 | + "score_mode": "sum", | |
| 412 | + "boost_mode": "multiply", | |
| 413 | + "query": { | |
| 414 | + "match": { | |
| 415 | + "content": { | |
| 416 | + "query": keywords, | |
| 417 | + "boost": 1.0 | |
| 418 | + } | |
| 419 | + } | |
| 420 | + }, | |
| 421 | + "functions": [ | |
| 422 | + { | |
| 423 | + "script_score": { | |
| 424 | + "script": { | |
| 425 | + "source": "doc['embedding'].isEmpty() ? 1.0 : dotProduct(params.query_vector, 'embedding') + 1.0", | |
| 426 | + "params": {"query_vector": context.embeddings[0][1]} | |
| 427 | + } | |
| 428 | + } | |
| 429 | + } | |
| 430 | + ] | |
| 431 | + } | |
| 432 | + } | |
| 433 | + } | |
| 434 | + | |
| 435 | +#### 10. Vector search + keyword search | |
| 436 | +GET /goods/_search | |
| 437 | +{ | |
| 438 | + "query": { | |
| 439 | + "match": { | |
| 440 | + "content": { | |
| 441 | + "query": "玩具", | |
| 442 | + "boost": 1.0 | |
| 443 | + } | |
| 444 | + } | |
| 445 | + }, | |
| 446 | + "knn": { | |
| 447 | + "field": "name_prefix", | |
| 448 | + "query_vector": [-0.05291186273097992, ...], | |
| 449 | + "k": 5, | |
| 450 | + "num_candidates": 10, | |
| 451 | + "boost": 1.0 | |
| 452 | + } | |
| 453 | + } | |
| 454 | + | |
| 455 | + | |
| 456 | + | |
| 457 | + | |
| 458 | +Reference code: | |
| 459 | +```python | |
| 460 | +    def execute_search(self, context, search_type="match_phrase", search_type_attachment=0, size=10): | |
| 461 | +        query = context.query | |
| 462 | +        normalized_query = context.normalized_query | |
| 463 | +        core_term = context.core_term | |
| 464 | +        keywords = context.keywords | |
| 465 | +        knn_boost_keywords = core_term if core_term else keywords | |
| 466 | +        expand = context.expand | |
| 467 | + | |
| 468 | +        seen_queries = set() | |
| 469 | +        unique_queries = [] | |
| 470 | +        for q, weight in [(query, 1.0), (normalized_query, 1.0), (keywords, 0.5)]: | |
| 471 | +            if q and q not in seen_queries: | |
| 472 | +                unique_queries.append((q, weight)) | |
| 473 | +                seen_queries.add(q) | |
| 474 | + | |
| 475 | +        # About hybrid retrieval: | |
| 476 | +        # Using knn together with a text query: | |
| 477 | +        # Before 8.12 a query could not contain knn; kNN search as a query was introduced in 8.12: https://www.elastic.co/search-labs/blog/knn-query-elasticsearch | |
| 478 | +        # { | |
| 479 | +        #   "size": 3, | |
| 480 | +        #   "query": { | |
| 481 | +        #     "bool": { | |
| 482 | +        #       "should": [ | |
| 483 | +        #         { | |
| 484 | +        #           "knn": { | |
| 485 | +        #             "field": "embedding", | |
| 486 | +        #             "query_vector": [2,2,2,0], | |
| 487 | +        #             "num_candidates": 10, | |
| 488 | +        #             "_name": "knn_query" | |
| 489 | +        #           } | |
| 490 | +        #         }, | |
| 491 | +        #         { | |
| 492 | +        #           "match": { | |
| 493 | +        #             "description": { | |
| 494 | +        #               "query": "luxury", | |
| 495 | +        #               "_name": "bm25query" | |
| 496 | +        #             } | |
| 497 | +        #           } | |
| 498 | +        #         } | |
| 499 | +        #       ] | |
| 500 | +        #     } | |
| 501 | +        # | |
| 502 | +        # A knn clause cannot contain a query. | |
| 503 | +        # A top-level knn placed alongside query (hybrid search) combines the two as a disjunction (OR). | |
| 504 | +        # A knn clause can take a filter, for example: "filter": {"match": {"my_label": "red"}} | |
| 505 | + | |
| 506 | +        if search_type == "match_phrase": | |
| 507 | +            body = { | |
| 508 | +                "query": { | |
| 509 | +                    "bool": { | |
| 510 | +                        "should": [ | |
| 511 | +                            { | |
| 512 | +                                "match_phrase": { | |
| 513 | +                                    "content": { | |
| 514 | +                                        "query": unique_query, | |
| 515 | +                                        "boost": weight, | |
| 516 | +                                        "slop": search_type_attachment | |
| 517 | +                                    } | |
| 518 | +                                } | |
| 519 | +                            } for unique_query, weight in unique_queries | |
| 520 | +                        ], | |
| 521 | +                        "minimum_should_match": 1 | |
| 522 | +                    } | |
| 523 | +                } | |
| 524 | +            } | |
| 525 | +        # Pure keyword retrieval 2 | |
| 526 | +        elif search_type == "match_keywords": | |
| 527 | +            body = { | |
| 528 | +                "query": { | |
| 529 | +                    "bool": { | |
| 530 | +                        "must": [ | |
| 531 | +                            { | |
| 532 | +                                "match": { | |
| 533 | +                                    "content": {"query": core_term, "boost": 1.0} | |
| 534 | +                                } | |
| 535 | +                            } | |
| 536 | +                        ], | |
| 537 | +                        "should": [ | |
| 538 | +                            { | |
| 539 | +                                "match": { | |
| 540 | +                                    "content": {"query": q, "boost": boost} | |
| 541 | +                                } | |
| 542 | +                            } for (q, boost) in [(keywords, 1.0), (expand, 0.6), (normalized_query, 1.0)] if q | |
| 543 | +                        ], | |
| 544 | +                        "minimum_should_match": 1 | |
| 545 | +                    } | |
| 546 | +                } | |
| 547 | +            } | |
| 548 | +        # Keyword search + vector ranking | |
| 549 | +        elif search_type == "match&boost": | |
| 550 | +            body = { | |
| 551 | +                "query": { | |
| 552 | +                    "function_score": { | |
| 553 | +                        "score_mode": "sum", | |
| 554 | +                        "boost_mode": "multiply", | |
| 555 | +                        "query": { | |
| 556 | +                            "match": { | |
| 557 | +                                "content": { | |
| 558 | +                                    "query": keywords, | |
| 559 | +                                    "boost": 1.0 | |
| 560 | +                                } | |
| 561 | +                            } | |
| 562 | +                        }, | |
| 563 | +                        "functions": [ | |
| 564 | +                            { | |
| 565 | +                                "script_score": { | |
| 566 | +                                    "script": { | |
| 567 | +                                        "source": "doc['embedding'].isEmpty() ? 1.0 : dotProduct(params.query_vector, 'embedding') + 1.0", | |
| 568 | +                                        "params": {"query_vector": context.embeddings[0][1]} | |
| 569 | +                                    } | |
| 570 | +                                } | |
| 571 | +                            } | |
| 572 | +                        ] | |
| 573 | +                    } | |
| 574 | +                } | |
| 575 | +            } | |
| 576 | +        # This variant is too slow | |
| 577 | +        elif search_type == "match&boost2": | |
| 578 | +            body = { | |
| 579 | +                "query": { | |
| 580 | +                    "script_score": { | |
| 581 | +                        "query": { | |
| 582 | +                            "match": { | |
| 583 | +                                "content": { | |
| 584 | +                                    "query": keywords, | |
| 585 | +                                    "boost": 1.0 | |
| 586 | +                                } | |
| 587 | +                            } | |
| 588 | +                        }, | |
| 589 | +                        "script": { | |
| 590 | +                            "source": "doc['embedding'].isEmpty() ? 1.0 : dotProduct(params.query_vector, 'embedding') + 1.0", | |
| 591 | +                            "params": {"query_vector": context.embeddings[0][1]} | |
| 592 | +                        } | |
| 593 | +                    } | |
| 594 | +                } | |
| 595 | +            } | |
| 596 | +        # Vector search + keyword search | |
| 597 | +        elif search_type == "match&knn": | |
| 598 | +            body = { | |
| 599 | +                "query": { | |
| 600 | +                    "match": { | |
| 601 | +                        "content": { | |
| 602 | +                            "query": knn_boost_keywords, | |
| 603 | +                            "boost": 1.0 | |
| 604 | +                        } | |
| 605 | +                    } | |
| 606 | +                }, | |
| 607 | +                "knn": { | |
| 608 | +                    "field": "embedding", | |
| 609 | +                    "query_vector": context.embeddings[search_type_attachment][1], | |
| 610 | +                    "k": 5, | |
| 611 | +                    "num_candidates": 10, | |
| 612 | +                    "boost": 1.0 | |
| 613 | +                } | |
| 614 | +            } | |
| 615 | +        # Pure vector search | |
| 616 | +        elif search_type == "knn": | |
| 617 | +            body = { | |
| 618 | +                "knn": { | |
| 619 | +                    "field": "embedding", | |
| 620 | +                    "query_vector": context.embeddings[search_type_attachment][1], | |
| 621 | +                    "k": 5, | |
| 622 | +                    "num_candidates": 10 | |
| 623 | +                } | |
| 624 | +            } | |
| 625 | + | |
| 626 | +        need_embedding = (search_type == "match&boost") | |
| 627 | +        need_highlights = (search_type != "knn") | |
| 628 | + | |
| 629 | +        body["_source"] = {"excludes": ["keywords", "quotes"]} | |
| 630 | +        if not need_embedding: | |
| 631 | +            body["_source"]["excludes"].append("embedding") | |
| 632 | +        # Record search_from before the highlight settings are filled in | |
| 633 | +        search_from = f'searchtype[{search_type}],param[{search_type_attachment}],body:{body}' | |
| 634 | + | |
| 635 | +        if need_highlights: | |
| 636 | +            body["_source"]["excludes"] = [] | |
| 637 | +            body["highlight"] = { | |
| 638 | +                "pre_tags": [settings.HIGHTLIGHT_PRE_TAG], | |
| 639 | +                "post_tags": [settings.HIGHTLIGHT_POST_TAG], | |
| 640 | +                "fields": {"chapter_name": {}, "content": {}} | |
| 641 | +            } | |
| 642 | + | |
| 643 | +        body["size"] = size | |
| 644 | + | |
| 645 | +        se_debug_info = '' | |
| 646 | +        start_time = time.time() | |
| 647 | +        try: | |
| 648 | +            es_response = context.es.search(index=context.index_name, body=body) | |
| 649 | +        except Exception as e: | |
| 650 | +            se_debug_info = f'Error in executing search: {e}. request: {body}' | |
| 651 | +            return None, se_debug_info | |
| 652 | +        end_time = time.time() | |
| 653 | +        elapsed_time = end_time - start_time | |
| 654 | +        total_hits = es_response.get("hits", {}).get("total", {}).get("value", 0) | |
| 655 | +        returned_hits = len(es_response.get("hits", {}).get("hits", [])) | |
| 656 | + | |
| 657 | +        if '"' not in search_from: | |
| 658 | +            search_from = search_from.replace('\'', '"') | |
| 659 | +        search_from = search_from if len(search_from) < 400 else search_from[:400] + '...' | |
| 660 | + | |
| 661 | +        str_body = str(body) | |
| 662 | +        if '"' not in str_body: | |
| 663 | +            str_body = str_body.replace('\'', '"') | |
| 664 | +        se_debug_info = f'({elapsed_time:.2f} seconds. Total: {total_hits}. Returned: {returned_hits}) : {search_from[:400]}' | |
| 665 | + | |
| 666 | +        if 'hits' not in es_response or 'hits' not in es_response['hits']: | |
| 667 | +            se_debug_info += f' InvalidResponse in executing search. request: {body}' | |
| 668 | +            return None, se_debug_info | |
| 669 | + | |
| 670 | +        for hit in es_response['hits']['hits']: | |
| 671 | +            hit['search_from'] = search_from | |
| 672 | + | |
| 673 | +        return es_response, se_debug_info | |
| 674 | + | |
| 675 | +``` | |
| 676 | + | |
| 677 | + | |
| 678 | +# Test vector: | |
| 679 | +# [-0.05291186273097992, 0.0274342093616724, -0.016730275005102158, 0.010487289167940617, -0.022640341892838478, -0.048682719469070435, 0.04544096067547798, 0.023079438135027885, 0.007221410982310772, 0.023566091433167458, 0.026696473360061646, 0.08252757787704468, -0.042835772037506104, 0.0009668126585893333, -0.02860398218035698, -0.004426108207553625, -0.002644421299919486, -0.027699561789631844, 0.005749804899096489, -0.04468372091650963, -0.0296687763184309, -0.009487600065767765, 0.020041221752762794, 0.00778265530243516, 0.008522099815309048, 0.03497027978301048, -0.021573258563876152, -0.028293319046497345, -8.54598984005861e-05, -0.03164539486169815, -0.017121458426117897, -0.0006902766763232648, 0.04650883004069328, -0.030234992504119873, -0.010207684710621834, -0.035288386046886444, -0.0047269039787352085, -0.0006454040994867682, -0.056146346032619476, 0.008901881985366344, 0.010757357813417912, -0.013022932223975658, 0.04627145081758499, -0.020669423043727875, -0.02031278982758522, -0.052186835557222366, -0.0148158585652709, -0.018267231062054634, -0.059003304690122604, -0.011793344281613827, 0.027096575126051903, 0.019299808889627457, 0.04161312058568001, -0.019393721595406532, -0.02361445501446724, 0.07711422443389893, -0.02068573422729969, -0.004702702630311251, -0.011135494336485863, 0.0101374052464962, -0.020808257162570953, 0.011924360878765583, -0.020093027502298355, -0.007138500455766916, 0.014727798290550709, 0.05770261213183403, 0.017841406166553497, 0.044339124113321304, -0.01490224339067936, -0.008343652822077274, -0.04842463508248329, 0.0336640290915966, -0.004893577191978693, -0.021536342799663544, -0.032384153455495834, -0.009452177211642265, -0.027460120618343353, -0.009426826611161232, 0.006357531528919935, 0.019494572654366493, 0.009722599759697914, -0.00497430982068181, 0.023032115772366524, 0.05221958085894585, -0.01671120524406433, 0.061740316450595856, -0.06789620220661163, -0.023851843550801277, -0.02249223366379738, 
-0.01231105625629425, -0.0499565526843071, 0.004251780919730663, 0.05466651916503906, -0.024449756368994713, -0.034151963889598846, 0.037387508898973465, -0.0016276679234579206, -0.02609393745660782, 0.01800747588276863, -0.0028136332985013723, -0.06036405637860298, 0.028903907164931297, 0.006318055558949709, 0.012870929203927517, -0.0021476889960467815, -0.012034566141664982, -0.008372323587536812, 0.024942906573414803, 0.08258169889450073, 0.006757829803973436, 0.032017264515161514, -0.012414710596203804, 0.014826267957687378, -0.040858786553144455, -0.0060302577912807465, 0.00843990221619606, -0.031066348776221275, -0.06313654035329819, 0.0056659989058971405, -0.007768781390041113, 0.011673268862068653, 0.007261875085532665, 0.006112886127084494, -0.07374890148639679, 0.06602894514799118, -0.05385972931981087, -0.0010994652984663844, 0.05939924344420433, 0.015503636561334133, 0.034621711820364, 0.008040975779294968, -0.023962488397955894, -0.06270411610603333, 0.00027893096557818353, -0.0436306893825531, -0.006309020332992077, 0.02416943572461605, -0.015391307882964611, -0.012442439794540405, -0.003181715961545706, -0.0021985983476042747, 0.008671553805470467, 0.004063367377966642, -0.02560708485543728, 0.03469422832131386, -0.04249674826860428, -0.013552767224609852, -0.052823010832071304, 0.014670411124825478, -0.011493593454360962, 0.024076055735349655, 0.056352417916059494, -0.008510314859449863, 0.015936613082885742, 0.003935575485229492, 0.0037949192337691784, 0.015074086375534534, 0.016583971679210663, -0.0057802870869636536, 0.005751866847276688, -0.009386995807290077, -0.03710195794701576, -0.03144300729036331, -0.07106415182352066, -0.003882911056280136, -0.010697683319449425, -0.014338435605168343, 0.007036983501166105, -0.035716522485017776, 0.06593189388513565, 0.007752529811114073, -0.030261363834142685, -0.02513342909514904, -0.039278656244277954, 0.015320679172873497, -0.012659071013331413, 0.014207725413143635, 0.010264124721288681, 
0.01617652177810669, -0.022644126787781715, -0.031033707782626152, 0.04160666465759277, -0.05329348146915436, 0.02423500455915928, -0.019389694556593895, 0.008645910769701004, -0.005958682857453823, -0.03648180514574051, 0.011972597800195217, 0.037404924631118774, -0.007001751102507114, -0.05138246342539787, 0.0013400549069046974, -0.03268183395266533, 0.07687076926231384, -0.02033335529267788, -0.020667986944317818, 0.0038236891850829124, 0.029960744082927704, 0.015430699102580547, 0.05047214776277542, 0.0052254535257816315, 0.013995353132486343, -0.031164521351456642, -0.014291719533503056, 0.015829795971512794, -0.0013409113744273782, -0.044300951063632965, 0.045415859669446945, -0.005037966184318066, -0.03883415088057518, 0.027200160548090935, 0.008182630874216557, -0.046456750482320786, -0.029778052121400833, 0.02067168429493904, -0.006381513085216284, -0.04693000763654709, 0.009974686428904533, 0.03109011799097061, -0.012696364894509315, 0.030124813318252563, 0.02372679114341736, 0.06566771119832993, 0.03553507477045059, -0.032816141843795776, 0.028003521263599396, 0.06498659402132034, -0.013530750758945942, 0.0312667116522789, -0.015660811215639114, -0.00776742585003376, -0.004829467739909887, -0.015968922525644302, 0.04765664413571358, -0.0026502758264541626, 0.01891564577817917, 0.04119837284088135, 0.012158435769379139, 0.008338023908436298, -0.006039333995431662, 0.0630166307091713, -0.02758428454399109, 0.029347822070121765, -0.030129415914416313, 0.023165738210082054, 0.04064684361219406, 0.04446929693222046, -0.006133638322353363, -0.013095719739794731, -0.041152223944664, -0.01038535125553608, 0.01738007925450802, 0.0010595708154141903, -0.055003564804792404, 0.036829687654972076, -0.030270753428339958, -0.009607627056539059, 0.014103117398917675, 0.005140293389558792, 0.032931022346019745, 0.026972685009241104, -0.00039128100615926087, 0.00550195062533021, 0.062454141676425934, 0.02344602160155773, -0.01688288524746895, 0.011600837111473083, 
0.009648085571825504, 0.012827200815081596, 0.02368510514497757, -0.044808436185121536, 0.006574536208063364, 0.03677171841263771, 0.021754244342446327, -0.0031720376573503017, -0.03498553857207298, -0.027119319885969162, 0.05196662247180939, 0.0063033513724803925, -0.002766692778095603, -0.03879206255078316, -0.005737128667533398, -0.02351462095975876, 0.04338989034295082, -0.03623301535844803, 0.003727369010448456, 0.044172726571559906, 0.06180792301893234, -0.025736358016729355, 0.01280374638736248, -0.01768171414732933, 0.0413120836019516, 0.036350950598716736, 0.020034022629261017, -0.00938474852591753, -0.04920303076505661, -0.1626604050397873, 0.0016566020203754306, -0.010797491297125816, 0.0037245014682412148, 0.039030417799949646, -0.009399985894560814, 0.016659803688526154, -0.047097429633140564, -0.00987484585493803, 0.020634479820728302, 0.005361238028854132, -0.05283225327730179, 0.002501025330275297, -0.004766151309013367, 0.00850654486566782, -0.0050267502665519714, -0.046555373817682266, 0.012670878320932388, 0.0018581973854452372, -0.010647253133356571, 0.01990092545747757, 0.02013244479894638, 0.04490885138511658, 0.029433563351631165, -0.01408607978373766, 0.029722925275564194, 0.04512600228190422, -0.04305345192551613, 0.0053901285864412785, -0.010685979388654232, 0.01516974437981844, 0.02340293675661087, -0.014181641861796379, -0.0013334851246327162, 0.020624764263629913, 0.06469231843948364, 0.016654038801789284, -0.043994754552841187, 0.025707466527819633, -0.004160136915743351, 0.021129926666617393, 0.041262850165367126, 0.006293899845331907, 0.056005991995334625, -0.006883381400257349, -0.07502268254756927, -0.02920101210474968, -0.019043054431676865, 0.00737513042986393, 0.013621360063552856, -0.02504715882241726, -0.01138006430119276, -0.010744514875113964, -0.02502342313528061, -0.03335903584957123, 0.012180354446172714, -0.03276645019650459, 0.05202409625053406, 0.03246080502867699, 0.03068908303976059, -0.029587913304567337, 
-0.04850265011191368, -0.006388102192431688, -0.03203853219747543, -0.050761956721544266, -0.021925227716565132, 0.036384399980306625, -0.011895880103111267, -0.007408954203128815, -0.012625153176486492, 0.0024322718381881714, -0.012196220457553864, -0.007011729292571545, -0.0337890200316906, -0.030034994706511497, 0.04638829082250595, -0.028362803161144257, -0.01176459901034832, 0.00956833828240633, -0.12054562568664551, -0.020540419965982437, 0.014624865725636482, -0.025515791028738022, -0.005027926992624998, -0.03586679324507713, -0.05585843697190285, -0.01700599677860737, -0.00044939795043319464, 0.029278729110956192, 0.25503888726234436, -0.024952411651611328, 0.005794796161353588, -0.007252118084579706, 0.03397773951292038, -0.0030146583449095488, -0.016645856201648712, -0.0008194005931727588, 0.02789629064500332, -0.039116114377975464, -0.035631854087114334, 0.04917449131608009, -0.006455820053815842, -0.011818122118711472, -0.00958359707146883, 0.013176187872886658, 0.037286531180143356, 0.022334400564432144, 0.05832865461707115, 0.010104321874678135, -0.04915979504585266, -0.022671189159154892, -0.016606582328677177, -0.007431587669998407, 0.0025214774068444967, -0.038979604840278625, 0.014895224012434483, 0.03583076596260071, 0.0006473385728895664, 0.04958082735538483, -0.017827684059739113, 0.015710417181253433, 0.062094446271657944, -0.014381879940629005, 0.0002880772517528385, 0.004948006477206945, -8.711735063116066e-06, -0.0029445397667586803, -0.044325683265924454, 0.047702621668577194, -0.03197811171412468, -0.02109563909471035, 0.03041824884712696, 0.021582895889878273, -0.004118872340768576, -0.025784745812416077, 0.06275995075702667, 0.006879465654492378, 0.04185185581445694, 0.02031264826655388, -0.02274201810359955, -0.009617358446121216, -0.04315454140305519, -0.033287111669778824, -0.025126483291387558, -0.003923895303159952, -0.041508499532938004, -0.0009355457150377333, -0.033565372228622437, 0.02229289337992668, -0.0026574484072625637, 
-0.0028596664778888226, -0.02223617024719715, -0.016868866980075836, 0.04172029718756676, 0.0014162511797621846, -0.037737537175416946, -0.010155809111893177, -0.010357595980167389, 0.04541466012597084, 0.03563382104039192, -0.019189776852726936, -0.012577632442116737, -0.013781189918518066, 0.026566311717033386, 0.020911909639835358, 0.02781282551586628, 0.053938526660203934, 0.0194545891135931, 0.0015139722963795066, -0.0357731431722641, -0.005088387057185173, 0.004257760010659695, 0.04332628846168518, -0.012149352580308914, -0.04734082147479057, 0.018029984086751938, -0.01322091929614544, -0.059820450842380524, -0.03677783161401749, -0.006745075341314077, -0.02209635078907013, -0.012663901783525944, -0.0059855030849576, 0.016270749270915985, -0.00725028058513999, 0.03019685670733452, 0.010252268984913826, -0.06314245611429214, -0.005512078758329153, -0.016377074643969536, -0.0014438428916037083, 0.029021194204688072, -0.015355946496129036, 0.02559172362089157, -0.04241044819355011, 0.010147088207304478, -0.016036594286561012, 0.023162752389907837, 0.047236304730176926, 0.0166736152023077, 0.01226564310491085, -0.015224735252559185, -0.01298521552234888, -0.008012642152607441, 0.028470756486058235, -0.013741613365709782, 0.019896863028407097, 0.01720179058611393, 0.01571199856698513, 0.030143165960907936, -0.02969514951109886, 0.014739652164280415, -0.01854291744530201, -0.045576371252536774, -0.04516203701496124, 0.02147211693227291, 0.007073952350765467, 0.008106761611998081, -0.01828523352742195, 0.002731812885031104, -0.04545339569449425, 0.019007619470357895, 0.03504781052470207, 0.037705861032009125, -0.0045634908601641655, 0.0070000626146793365, 0.0037205498665571213, 0.005224148277193308, -0.017060590907931328, -0.04246727377176285, -0.006265614647418261, -0.015374364331364632, -0.03380871191620827, -0.005029333755373955, 0.007065227720886469, 0.003886009333655238, 0.008613690733909607, -0.012133199721574783, 0.005556005053222179, -0.021959641948342323, 
0.04834386706352234, 0.03787781298160553, -0.057815466076135635, 0.015909207984805107, -0.03855409845709801, 0.0018244135426357388, 0.04186264052987099, -0.054983459413051605, 0.006219237111508846, 0.03494301065802574, 0.023722950369119644, 0.0312604121863842, 0.05597991123795509, -0.030345493927598, 0.016615940257906914, -0.0207205917686224, 0.055960651487112045, -0.012713379226624966, -0.0261109359562397, 0.014332456514239311, -0.017245708033442497, -0.06636268645524979, 0.00592504907399416, 0.04649018123745918, -0.018362276256084442, 0.009620632976293564, -0.0044480785727500916, -0.0014729035319760442, 0.015621249563992023, 0.0367378331720829, -0.011857259087264538, -0.045088741928339005, 0.0006832792423665524, 0.02601524256169796, -0.02120809443295002, 0.018104318529367447, 0.008069046773016453, 0.013658273033797741, 0.004183551296591759, -0.04133244603872299, 0.05436890944838524, 0.009334285743534565, -0.014695074409246445, -0.011054124683141708, 0.009796642698347569, -0.008759389631450176, -0.06399217247962952, -0.0028859861195087433, -0.008736967109143734, -0.003506746841594577, 0.008123806677758694, 0.008794951252639294, -0.02940259501338005, 0.009597218595445156, -0.02197900228202343, -0.02082076109945774, 0.023915970697999, -0.059058744460344315, -0.010253551416099072, 0.024443935602903366, -0.029604850336909294, 0.008135135285556316, 0.03568771481513977, -0.017330091446638107, -0.003135789418593049, 0.035103678703308105, 0.0370408296585083, -0.01022601593285799, -0.045891791582107544, 0.01726667769253254, -0.008570673875510693, 0.015297998674213886, -0.015412220731377602, -0.01425748411566019, 0.031544867902994156, 0.013110813684761524, -0.057211123406887054, -0.0008968000765889883, 0.001981658162549138, -0.002101168967783451, -0.09516698867082596, -0.034693196415901184, 0.011157260276377201, 0.010063023306429386, -0.02550840750336647, 0.009959851391613483, 0.022281678393483162, -0.03908146917819977, 0.02196437120437622, 0.03520793840289116, 
-0.06856158375740051, -0.004901218693703413, 0.1122148334980011, -0.01498009730130434, 0.03165500983595848, -0.07618033140897751, -0.014297851361334324, 0.02150021120905876, 0.005999598652124405, -0.013493427075445652, 0.013868110254406929, 0.00079053093213588, 0.006475066766142845, 0.000955471652559936, -0.03403160721063614, -0.02295752801001072, 0.0041635241359472275, -0.03955964744091034, -0.04943346977233887, 0.00032474088948220015, 0.039174411445856094, -0.011974001303315163, 0.008057610131800175, 0.03809700161218643, -0.041719768196344376, 0.037615906447172165, -0.035932306200265884, 0.008293192833662033, -0.03261689469218254, -0.023902395740151405, -7.811257091816515e-05, -0.011328466236591339, -0.026476409286260605, 0.055370282381772995, 0.03128054738044739, -0.014991461299359798, 0.017835773527622223, 0.01642710715532303, 0.029273470863699913, -0.012139911763370037, 0.01371818222105503, -0.013113478198647499, -0.04071088507771492, 0.0233455840498209, -0.019497444853186607, -0.01747158169746399, 0.02493683062493801, 0.024074571207165718, -0.03614620864391327, -0.025289475917816162, -0.04030011221766472, -0.046772539615631104, 0.009969661012291908, 0.003724620910361409, 0.007474626414477825, -0.04855594411492348, 0.04697829484939575, 0.010695616714656353, 0.027944304049015045, -0.003937696572393179, -0.011591222137212753, -0.011533009819686413, 0.03215765953063965, -0.04699324443936348, -9.356102236779407e-05, -0.01535400003194809, -0.010238519869744778, 0.002703386126086116, 0.04759520664811134, 0.0074842446483671665, -0.04050430282950401, -0.028402622789144516, -0.03205197677016258, 0.011288953013718128, 0.006053865421563387, 0.04641448333859444, 0.005652922671288252, -0.018560705706477165, 0.02581481821835041, 0.00962467584758997, -0.017888177186250687, -0.026476262137293816, -0.005547264125198126, 0.012222226709127426, -0.004069746006280184, -0.020438821986317635, 0.01929863728582859, -0.0053736320696771145, 0.02221786603331566, -0.007175051141530275, 
0.003961225971579552, -0.012380941770970821, -0.0040277824737131596, 0.009086307138204575, 0.012202796526253223, 0.018483169376850128, 0.017530532553792, 0.0422886498272419, 0.04987001419067383, 0.003722204128280282, 0.06421508640050888, -0.016258088871836662, -0.027659112587571144, 0.004458434879779816, -0.02898143045604229, -0.014475414529442787, 0.032039571553468704, -0.025734663009643555, -0.01585981249809265, 0.04900333285331726, -0.06422552466392517, -0.0007134959450922906, -0.04035528376698494, 0.03290264680981636, -0.0018848407780751586, 0.0068516512401402, 0.00032433189335279167, -0.002669606124982238, -0.017596688121557236, -0.026878179982304573, 0.014075388200581074, 0.020072080194950104, -0.00295435544103384, -0.01918656937777996, -0.007689833641052246, 0.039347097277641296, 0.0026605715975165367, 0.011779646389186382, 0.04189120978116989, -0.03846775367856026, -0.01993645168840885, 0.04546443000435829, 0.05682912468910217, -0.012384516187012196, -0.004507445730268955, 0.007476931903511286, -0.01160018052905798, 0.006559243891388178, 0.04354899004101753, 0.006185194011777639, 0.028355205431580544, -0.006518798414617777, -0.029528537765145302, 0.06740271300077438, -0.052158474922180176, 0.0025031850673258305, -0.005957300774753094, 0.00500349560752511, 0.022637680172920227, -0.0027129461523145437, -0.011677206493914127, -0.042732879519462585, -0.0021236639004200697, -0.1499215066432953, 0.02914350852370262, -0.031246500089764595, -0.027244996279478073, -0.006904688663780689, 0.01088196225464344, 0.01271661464124918, -0.0430884025990963, -0.020760131999850273, -0.006593034137040377, -0.0007962957606650889, -0.031729113310575485, 0.052976224571466446, -0.03149586543440819, 0.0392388291656971, 0.023318620398640633, -0.01383691094815731, 0.02858218550682068, 0.023135144263505936, 0.026421336457133293, 0.00027594034327194095, -0.03901490569114685, 0.008533132262527943, -0.03802476078271866, -0.011105065234005451, -0.028275510296225548, 0.04846742004156113, 
0.021237077191472054, -0.027375172823667526, -0.02717825025320053, -0.031243441626429558, -0.021638689562678337, 0.024066096171736717, 0.05689090117812157, -0.04352620989084244, 0.03599394112825394, 0.05153508856892586, 0.002263782313093543, 0.047110624611377716, 0.006084555760025978, 0.003244618885219097, -0.0015037712873890996, 0.027960799634456635, -0.013650861568748951, 0.03281615301966667, 0.012363187968730927, 0.02162906341254711, -0.010951842181384563, -0.02786285988986492, 0.03754381462931633, 0.01957041770219803, -0.017010418698191643, -0.008339766412973404, 0.0755641758441925, 0.023412147536873817, -0.005748848430812359, -0.05465301498770714, -0.02190011739730835, 0.0054182386957108974, 0.032733004540205, -0.05342638120055199, 0.009907999075949192, -0.02370712347328663, -0.015652501955628395, -0.011254304088652134, -0.019827252253890038, -0.021032121032476425, -0.02607329562306404, -0.0008710312540642917, -0.06800976395606995, -0.017296750098466873, 0.015312970615923405, -0.015649013221263885, -0.016449443995952606, -0.012058117426931858, 0.002104945247992873, 0.020476385951042175, 0.014795565977692604, -0.02145536057651043, -0.028734024614095688, -0.041212357580661774, -0.008211270906031132, 0.033569078892469406, -0.0033273063600063324, -0.02339683100581169, 0.0421740785241127, -0.009677124209702015, -0.006869456730782986, -0.016001028940081596, 0.029614608734846115, -0.06062136963009834, -0.011824233457446098, 0.012096629478037357, -0.028248939663171768, -0.03703905642032623, 0.012119539082050323, -0.041021380573511124, 0.01975782960653305, -0.028443211689591408, 0.020459437742829323, 0.0073023103177547455, -0.06498327851295471, -0.004016770515590906, 0.06460512429475784, -0.053343966603279114, 0.03865537419915199, -5.4113028454594314e-05, -0.008642046712338924, -0.009384138509631157, -0.037736788392066956, -0.035090748220682144, 0.018596891313791275, -0.008763385005295277, 0.040228284895420074, 0.03811536356806755, -0.034618355333805084, 
-0.004665717948228121, 0.04813361540436745, -0.004303373862057924, 0.00795511994510889, -0.017838604748249054, 0.00563138909637928, -0.03171280398964882, -0.0259436946362257, 0.004301885142922401, -0.02739236131310463, 0.03270035237073898, 0.009064823389053345, -0.0363747663795948, 0.02325567975640297, 0.03453107923269272, -0.012906554155051708, 0.028347544372081757, 0.01234712265431881, 0.030589573085308075, 0.0024874424561858177, -0.0173872709274292, 0.0247347354888916, 0.004171399865299463, 0.02350561134517193, -0.05499064922332764, -0.023146219551563263, -0.012485259212553501, -0.0228674728423357, 0.013267520815134048, 0.021304689347743988, -0.018937893211841583, -0.0260267723351717, -0.022532619535923004, 0.0030378480441868305, -0.008528024889528751, -0.030528495088219643, -0.009305189363658428, -0.0074027362279593945, -0.020641637966036797, 0.006984233390539885, 0.04300186410546303, -0.033014994114637375, -0.006089311558753252, 0.04753036051988602, -0.036625705659389496, -0.04691743850708008, -0.007467558141797781, 0.0652017593383789, -0.03861508145928383, -0.00741452956572175, 0.003471594536677003, 0.016132064163684845, 0.01570185460150242, 0.018733495846390724, -0.019025148823857307, 0.003490244736894965, -0.017714614048600197, -0.003447450464591384, 0.015267218463122845, 0.015076974406838417, -0.002631498035043478, 0.005311752203851938, 0.014075293205678463, 0.0026123111601918936, 0.011874910444021225, 0.0714355856180191, 0.06941138952970505, 0.022251378744840622, 0.01972009800374508, 0.04719123989343643, 0.023544959723949432, 0.017852554097771645, 0.01843070052564144, -0.05294886603951454, -0.008682304993271828, 0.010625398717820644, 0.0428495928645134, 0.002173527143895626, 0.06291069090366364, 0.024296458810567856, 0.008714474737644196, 0.06520587205886841, 0.015627536922693253, 0.04247526824474335, 0.0009774811333045363, 0.00738496845588088, -0.024803027510643005, 0.013228596188127995, -0.037615202367305756, -0.028807995840907097, 0.012890785001218319, 
-0.01587829552590847, -0.01928863860666752, 0.0011809614952653646, -0.026926854625344276, -0.020252779126167297, -0.010968486778438091, -0.015348547138273716, 0.008559435606002808, -0.009286923334002495, 0.0014621232403442264, 0.03831499442458153, 0.016517579555511475, 0.037184324115514755, -0.041231196373701096, 0.03757374733686447, -0.039465345442295074, -0.04308579862117767, 0.0011091071646660566, -0.029794104397296906, 0.008459310978651047, -0.01713281124830246, -0.016625113785266876, -0.05582521855831146, -0.0415986105799675, 0.028725938871502876, 0.04966316372156143, 0.012718678452074528, -0.025533588603138924, 0.013822318986058235, -5.168768620933406e-05, 0.02616700902581215, -0.06113629788160324, -0.03175340220332146, 0.03593592345714569, -0.04014921560883522, -0.020605407655239105, 0.02186705358326435] | |
| 680 | 680 | ``` |
| 681 | 681 | \ No newline at end of file | ... | ... |
docs/ES/ES_8.18/2_kibana安装.md
| 1 | - | |
| 2 | - | |
| 3 | - | |
| 4 | - | |
| 5 | - | |
| 6 | -## 1. Install via yum | |
| 7 | -Add the yum repository. | |
| 8 | -Kibana is usually not part of the default yum repositories, so the Elastic repository must be added: | |
| 9 | -```shell | |
| 10 | - | |
| 11 | -# Import the Elastic signing key: | |
| 12 | -sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch | |
| 13 | - | |
| 14 | -# Add the Elastic repository: | |
| 15 | -sudo tee /etc/yum.repos.d/elastic.repo <<EOF | |
| 16 | -[elastic-8.x] | |
| 17 | -name=Elastic repository for 8.x packages | |
| 18 | -baseurl=https://artifacts.elastic.co/packages/8.x/yum | |
| 19 | -gpgcheck=1 | |
| 20 | -gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch | |
| 21 | -enabled=1 | |
| 22 | -autorefresh=1 | |
| 23 | -type=rpm-md | |
| 24 | -EOF | |
| 25 | -``` | |
| 26 | -Then install Kibana (the version must match the Elasticsearch version, otherwise errors will occur later) | |
| 27 | -```shell | |
| 28 | -sudo yum install -y kibana-8.18.0 | |
| 29 | -``` | |
| 30 | - | |
| 31 | -## 2. Edit the configuration file | |
| 32 | -```shell | |
| 33 | -# When installed via yum, Kibana's home directory defaults to /usr/share/kibana and its configuration file is /etc/kibana/kibana.yml. | |
| 34 | -vim /etc/kibana/kibana.yml | |
| 35 | -# Add the following settings: | |
| 36 | -server.host: "0.0.0.0" | |
| 37 | -elasticsearch.hosts: ["http://ip:9200"] | |
| 38 | -i18n.locale: "zh-CN" | |
| 39 | -``` | |
| 40 | -## 3. Start | |
| 41 | -```shell | |
| 42 | -# Start Kibana | |
| 43 | -systemctl start kibana | |
| 44 | -systemctl status kibana | |
| 45 | -# Enable start on boot | |
| 46 | -systemctl enable kibana | |
| 47 | -``` | |
| 48 | - | |
| 49 | -After allowing access to port 5601 on Alibaba Cloud, open Kibana in a browser: | |
| 50 | -http://43.166.252.75:5601/ | |
| 51 | - | |
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | +## 1. Install via yum | |
| 7 | +Add the yum repository. | |
| 8 | +Kibana is usually not part of the default yum repositories, so the Elastic repository must be added: | |
| 9 | +```shell | |
| 10 | + | |
| 11 | +# Import the Elastic signing key: | |
| 12 | +sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch | |
| 13 | + | |
| 14 | +# Add the Elastic repository: | |
| 15 | +sudo tee /etc/yum.repos.d/elastic.repo <<EOF | |
| 16 | +[elastic-8.x] | |
| 17 | +name=Elastic repository for 8.x packages | |
| 18 | +baseurl=https://artifacts.elastic.co/packages/8.x/yum | |
| 19 | +gpgcheck=1 | |
| 20 | +gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch | |
| 21 | +enabled=1 | |
| 22 | +autorefresh=1 | |
| 23 | +type=rpm-md | |
| 24 | +EOF | |
| 25 | +``` | |
| 26 | +Then install Kibana (the version must match the Elasticsearch version, otherwise errors will occur later) | |
| 27 | +```shell | |
| 28 | +sudo yum install -y kibana-8.18.0 | |
| 29 | +``` | |
| 30 | + | |
| 31 | +## 2. Edit the configuration file | |
| 32 | +```shell | |
| 33 | +# When installed via yum, Kibana's home directory defaults to /usr/share/kibana and its configuration file is /etc/kibana/kibana.yml. | |
| 34 | +vim /etc/kibana/kibana.yml | |
| 35 | +# Add the following settings: | |
| 36 | +server.host: "0.0.0.0" | |
| 37 | +elasticsearch.hosts: ["http://ip:9200"] | |
| 38 | +i18n.locale: "zh-CN" | |
| 39 | +``` | |
| 40 | +## 3. Start | |
| 41 | +```shell | |
| 42 | +# Start Kibana | |
| 43 | +systemctl start kibana | |
| 44 | +systemctl status kibana | |
| 45 | +# Enable start on boot | |
| 46 | +systemctl enable kibana | |
| 47 | +``` | |
| 48 | + | |
| 49 | +After allowing access to port 5601 on Alibaba Cloud, open Kibana in a browser: | |
| 50 | +http://43.166.252.75:5601/ | |
| 51 | + | ... | ... |
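
The Kibana install step in this file pins `kibana-8.18.0` and warns that a version mismatch causes errors later. As a rough illustration only, a pre-install check could compare the two version strings. The sketch below assumes that "matching" means every numeric component is equal; `parse_version` and `versions_match` are hypothetical helpers and are not part of this commit.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted version string such as '8.18.0' into integer parts."""
    return tuple(int(part) for part in version.strip().split("."))


def versions_match(es_version: str, kibana_version: str) -> bool:
    """Return True when both strings denote the same release.

    Assumption: "matching" means all numeric components are equal.
    """
    return parse_version(es_version) == parse_version(kibana_version)


if __name__ == "__main__":
    print(versions_match("8.18.0", "8.18.0"))  # True
    print(versions_match("8.17.6", "8.18.0"))  # False
```

Comparing parsed integers rather than raw strings keeps the check tolerant of surrounding whitespace in the input.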
query/translator.py
| ... | ... | @@ -62,6 +62,7 @@ class Translator: |
| 62 | 62 | |
| 63 | 63 | DEEPL_API_URL = "https://api.deepl.com/v2/translate" # Pro tier |
| 64 | 64 | QWEN_BASE_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1" # Beijing region |
| 65 | + # QWEN_BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1" # Singapore | |
| 65 | 66 | # For models in the Singapore region, replace base_url with: https://dashscope-intl.aliyuncs.com/compatible-mode/v1 |
| 66 | 67 | QWEN_MODEL = "qwen-mt-flash" # fast translation model |
| 67 | 68 | ... | ... |
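
The hunk above switches regions by commenting one `QWEN_BASE_URL` constant in or out. One possible alternative, shown here only as a sketch, is to keep both endpoints in a mapping and select by region name. The two URLs are copied from the diff; the `QWEN_BASE_URLS` mapping, the region keys, and the `qwen_base_url` helper are hypothetical names and do not appear in `query/translator.py`.

```python
# Endpoints copied from the diff; the region keys are illustrative names.
QWEN_BASE_URLS = {
    "beijing": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "singapore": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
}


def qwen_base_url(region: str = "beijing") -> str:
    """Return the base URL for a region, raising ValueError if unknown."""
    key = region.strip().lower()
    if key not in QWEN_BASE_URLS:
        known = ", ".join(sorted(QWEN_BASE_URLS))
        raise ValueError(f"unknown region {region!r}; expected one of: {known}")
    return QWEN_BASE_URLS[key]


if __name__ == "__main__":
    print(qwen_base_url("Singapore"))
```

With a helper like this, the region could come from configuration or an environment variable instead of a source edit, though whether that fits the project is a design decision the diff does not address.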