# Embedding Module and API Documentation

This document describes the architecture, API endpoints, configuration, and usage of the embedding (vectorization) module in the SearchEngine project.

## Table of Contents

1. [Overview](#overview)
   - 1.1 [Embedding Module Overview](#11-embedding-module-overview)
   - 1.2 [Technology Choices](#12-technology-choices)
   - 1.3 [Use Cases](#13-use-cases)

2. [Embedding Service Architecture](#embedding-service-architecture)
   - 2.1 [Local Embedding Service](#21-local-embedding-service)
   - 2.2 [Cloud Embedding Service](#22-cloud-embedding-service)
   - 2.3 [Architecture Comparison](#23-architecture-comparison)

3. [Local Embedding Service](#local-embedding-service)
   - 3.1 [Starting the Service](#31-starting-the-service)
   - 3.2 [Service Configuration](#32-service-configuration)
   - 3.3 [Model Details](#33-model-details)

4. [Cloud Embedding Service](#cloud-embedding-service)
   - 4.1 [Alibaba Cloud DashScope](#41-alibaba-cloud-dashscope)
   - 4.2 [API Key Configuration](#42-api-key-configuration)
   - 4.3 [Usage](#43-usage)

5. [Embedding API Reference](#embedding-api-reference)
   - 5.1 [API Overview](#51-api-overview)
   - 5.2 [Health Check Endpoint](#52-health-check-endpoint)
   - 5.3 [Text Embedding Endpoint](#53-text-embedding-endpoint)
   - 5.4 [Image Embedding Endpoint](#54-image-embedding-endpoint)
   - 5.5 [Error Handling](#55-error-handling)

6. [Configuration](#configuration)
   - 6.1 [Service Configuration](#61-service-configuration)
   - 6.2 [Model Configuration](#62-model-configuration)
   - 6.3 [Batch Configuration](#63-batch-configuration)

7. [Client Integration Examples](#client-integration-examples)
   - 7.1 [Python Client](#71-python-client)
   - 7.2 [Java Client](#72-java-client)
   - 7.3 [cURL Examples](#73-curl-examples)

8. [Performance Comparison and Optimization](#performance-comparison-and-optimization)
   - 8.1 [Performance Comparison](#81-performance-comparison)
   - 8.2 [Cost Comparison](#82-cost-comparison)
   - 8.3 [Optimization Recommendations](#83-optimization-recommendations)

9. [Troubleshooting](#troubleshooting)
   - 9.1 [Common Issues](#91-common-issues)
   - 9.2 [Viewing Logs](#92-viewing-logs)
   - 9.3 [Performance Tuning](#93-performance-tuning)

10. [Appendix](#appendix)
    - 10.1 [Vector Dimensions](#101-vector-dimensions)
    - 10.2 [Model Versions](#102-model-versions)
    - 10.3 [Related Documents](#103-related-documents)

---

## Overview

### 1.1 Embedding Module Overview

The SearchEngine project implements text and image embedding end to end and supports two deployment modes:

1. **Local embedding service**: an independently deployed microservice that runs the BGE-M3 and CN-CLIP models on a local GPU or CPU
2. **Cloud embedding service**: an integration with the Alibaba Cloud DashScope API, billed by usage

The embedding module is a core component of the search engine. It provides the AI-driven similarity computation behind semantic search and image search.

### 1.2 Technology Choices

| Feature | Local service | Cloud service |
|------|---------|---------|
| **Text model** | BGE-M3 (Xorbits/bge-m3) | text-embedding-v4 |
| **Image model** | CN-CLIP (ViT-H-14) | - |
| **Vector dimension** | 1024 | 1024 |
| **Service framework** | FastAPI | Alibaba Cloud API |
| **Deployment** | Docker / local | Cloud API |

### 1.3 Use Cases

- **Semantic search**: embed the query text and compute its similarity to product vectors
- **Image search**: embed product images to support search by image
- **Hybrid retrieval**: rank by a combination of BM25 and vector similarity
- **Multilingual search**: cross-lingual semantic matching between Chinese and English

---

## Embedding Service Architecture

### 2.1 Local Embedding Service

```
┌─────────────────────────────────────────┐
│    Embedding Microservice (FastAPI)     │
│    Port: 6005, Workers: 1               │
└──────────────┬──────────────────────────┘
               │
       ┌───────┴───────┐
       │               │
┌──────▼──────┐   ┌────▼─────┐
│   BGE-M3    │   │ CN-CLIP  │
│ Text Model  │   │  Image   │
│ (CUDA/CPU)  │   │  Model   │
└─────────────┘   └──────────┘
```

**Core features**:
- Independently deployed and horizontally scalable
- GPU acceleration
- Thread-safe design
- Models preloaded at startup

### 2.2 Cloud Embedding Service

```
┌─────────────────────────────────────┐
│      SearchEngine Main Service      │
│       (uses CloudTextEncoder)       │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│        Aliyun DashScope API         │
│          text-embedding-v4          │
│             (HTTP/REST)             │
└─────────────────────────────────────┘
```

**Core features**:
- No GPU resources required
- Pay-per-use billing
- Automatic scaling
- Low operational overhead

### 2.3 Architecture Comparison

| Dimension | Local service | Cloud service |
|------|---------|---------|
| **Upfront cost** | High (GPU server) | Low (pay as you go) |
| **Running cost** | Fixed | Variable (by call volume) |
| **Latency** | <100ms | 300-400ms |
| **Throughput** | High (~32 qps) | Medium (~2-3 qps) |
| **Offline support** | ✅ | ❌ |
| **Maintenance cost** | High | Low |
| **Scalability** | Manual scaling | Automatic scaling |
| **Best for** | Large-scale production | Early development / small-scale applications |

---

## Local Embedding Service

### 3.1 Starting the Service

#### Option 1: Start with the script (recommended)

```bash
# Start the embedding service
./scripts/start_embedding_service.sh
```

What the script does:
- Activates the conda environment automatically
- Reads the port from the configuration file
- Starts the service in single-worker mode

#### Option 2: Start manually

```bash
# Activate the environment (adjust the conda path to your installation)
source /home/tw/miniconda3/etc/profile.d/conda.sh
conda activate searchengine

# Start the service
python -m uvicorn embeddings.server:app \
  --host 0.0.0.0 \
  --port 6005 \
  --workers 1
```

#### Option 3: Docker deployment (production)

```bash
# Build the image
docker build -t searchengine-embedding:latest .

# Start the container
docker run -d \
  --name embedding-service \
  --gpus all \
  -p 6005:6005 \
  searchengine-embedding:latest
```

### 3.2 Service Configuration

Configuration file: `embeddings/config.py`

```python
class EmbeddingConfig:
    # Service settings
    HOST = "0.0.0.0"                   # Listen address
    PORT = 6005                        # Listen port

    # Text model (BGE-M3)
    TEXT_MODEL_DIR = "Xorbits/bge-m3"  # Model path or HuggingFace ID
    TEXT_DEVICE = "cuda"               # Device: "cuda" or "cpu"
    TEXT_BATCH_SIZE = 32               # Batch size

    # Image model (CN-CLIP)
    IMAGE_MODEL_NAME = "ViT-H-14"      # Model name
    IMAGE_DEVICE = None                # None = auto-detect, or "cuda" / "cpu"
    IMAGE_BATCH_SIZE = 8               # Batch size
```

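As a rough illustration of how a `None` device setting might be resolved, the sketch below picks `"cuda"` when a GPU is reported as available and falls back to `"cpu"` otherwise. `resolve_device` and its `cuda_available` argument are hypothetical names and are not part of `embeddings/config.py`; in a PyTorch-based service the flag would typically come from `torch.cuda.is_available()`.

```python
from typing import Optional


def resolve_device(requested: Optional[str], cuda_available: bool) -> str:
    """Turn a configured device value into a concrete device string.

    Hypothetical helper: `requested` mirrors TEXT_DEVICE / IMAGE_DEVICE,
    where None means "auto-detect".
    """
    if requested is None:
        # Auto mode: prefer the GPU when one is available.
        return "cuda" if cuda_available else "cpu"
    if requested not in ("cuda", "cpu"):
        raise ValueError(f"unsupported device: {requested!r}")
    if requested == "cuda" and not cuda_available:
        # Fail early instead of crashing later during model loading.
        raise RuntimeError("device 'cuda' requested but no GPU is available")
    return requested
```

Failing early on an unavailable GPU is a design choice in this sketch; silently falling back to the CPU is an equally plausible policy.
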
### 3.3 Model Details

#### BGE-M3 text model

- **Model ID**: `Xorbits/bge-m3`
- **Vector dimension**: 1024
- **Languages**: Chinese, English, and 100+ other languages
- **Strengths**: strong semantic understanding; supports long text
- **Deployment**: downloaded automatically from HuggingFace

#### CN-CLIP image model

- **Model**: ViT-H-14 (Chinese CLIP)
- **Vector dimension**: 1024
- **Input**: image URL or local path
- **Strengths**: Chinese image-text understanding; well suited to e-commerce
- **Preprocessing**: automatic download, resizing, and normalization

---

## Cloud Embedding Service

### 4.1 Alibaba Cloud DashScope

**Service endpoints**:
- Beijing region: `https://dashscope.aliyuncs.com/compatible-mode/v1`
- Singapore region: `https://dashscope-intl.aliyuncs.com/compatible-mode/v1`

**Model information**:
- **Model name**: `text-embedding-v4`
- **Vector dimension**: 1024
- **Input limits**: up to 2048 texts per request, up to 8192 tokens per text
- **Rate limits**: depend on the API plan

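Because each request is capped at a fixed number of texts, a large input list has to be split before it is sent. The following is a minimal sketch of that splitting step; `iter_batches` is a hypothetical helper and is not part of `CloudTextEncoder`.

```python
from typing import Iterator, List, Sequence


def iter_batches(texts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start:start + batch_size])


# Example: 5 texts with a cap of 2 per request become 3 requests.
batches = list(iter_batches(["a", "b", "c", "d", "e"], batch_size=2))
print(batches)  # [['a', 'b'], ['c', 'd'], ['e']]
```

Input order is preserved, so the per-batch results can be concatenated to line up with the original list.
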
### 4.2 API Key Configuration

#### Option 1: Environment variable (recommended)

```bash
# Set for the current shell session
export DASHSCOPE_API_KEY="sk-your-api-key-here"

# Set permanently (append to ~/.bashrc or ~/.zshrc)
echo 'export DASHSCOPE_API_KEY="sk-your-api-key-here"' >> ~/.bashrc
source ~/.bashrc
```

#### Option 2: `.env` file

Create a `.env` file in the project root:

```bash
DASHSCOPE_API_KEY=sk-your-api-key-here
```

**Getting an API key**: https://help.aliyun.com/zh/model-studio/get-api-key

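One possible lookup order is to prefer the `DASHSCOPE_API_KEY` environment variable and fall back to the `.env` file. The sketch below is an illustration only: `load_api_key` and `parse_env_file` are hypothetical helpers, and the source does not show how `CloudTextEncoder` actually reads the key.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def parse_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_api_key(env: Optional[Mapping[str, str]] = None,
                 env_file: Path = Path(".env")) -> str:
    """Return the API key from the environment, else from the .env file."""
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY") or parse_env_file(env_file).get("DASHSCOPE_API_KEY")
    if not key:
        raise RuntimeError("DASHSCOPE_API_KEY is not set")
    return key
```
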
### 4.3 Usage

```python
from embeddings.cloud_text_encoder import CloudTextEncoder

# Initialize the encoder (reads the API key from the environment automatically)
encoder = CloudTextEncoder()

# Embed a single text
text = "衣服的质量杠杠的"  # Chinese sample: "the quality of the clothes is excellent"
embedding = encoder.encode(text)
print(embedding.shape)  # (1, 1024)

# Embed a batch of texts
texts = ["text 1", "text 2", "text 3"]
embeddings = encoder.encode(texts)
print(embeddings.shape)  # (3, 1024)

# Large batches (split into sub-batches automatically)
large_texts = [f"product {i}" for i in range(1000)]
embeddings = encoder.encode_batch(large_texts, batch_size=32)
```

**Custom configuration**:

```python
# Use the Singapore region
encoder = CloudTextEncoder(
    api_key="sk-xxx",
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)
```

---

## Embedding API Reference

### 5.1 API Overview

The local embedding service exposes a RESTful API:

| Endpoint | Method | Purpose |
|------|------|------|
| `/health` | GET | Health check |
| `/embed/text` | POST | Text embedding |
| `/embed/image` | POST | Image embedding |

**Base URL**:
- Default: `http://localhost:6005`
- Production: `http://<your-server>:6005`

### 5.2 Health Check Endpoint

```http
GET /health
```

**Example response**:
```json
{
  "status": "ok",
  "text_model_loaded": true,
  "image_model_loaded": true
}
```

**Fields**:
- `status`: service status; `"ok"` means healthy
- `text_model_loaded`: whether the text model loaded successfully
- `image_model_loaded`: whether the image model loaded successfully

**cURL example**:
```bash
curl http://localhost:6005/health
```

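Since the models are preloaded at startup, a client may want to wait until both are reported as loaded before sending traffic. The following is a sketch only: `is_ready` and `wait_until_ready` are hypothetical helpers, and the health payload is fetched through an injected callable so that the polling logic stays independent of any HTTP library.

```python
import time
from typing import Callable, Mapping


def is_ready(health: Mapping[str, object]) -> bool:
    """True when the payload reports status "ok" and both models loaded."""
    return (
        health.get("status") == "ok"
        and health.get("text_model_loaded") is True
        and health.get("image_model_loaded") is True
    )


def wait_until_ready(fetch_health: Callable[[], Mapping[str, object]],
                     attempts: int = 10,
                     delay_s: float = 1.0) -> bool:
    """Poll `fetch_health` until the service is ready or attempts run out."""
    for attempt in range(attempts):
        try:
            if is_ready(fetch_health()):
                return True
        except OSError:
            # Connection refused or timed out: the service may still be starting.
            pass
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False
```

With the `requests` library, `fetch_health` could be something like `lambda: requests.get("http://localhost:6005/health", timeout=5).json()`.
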
### 5.3 Text Embedding Endpoint

```http
POST /embed/text
Content-Type: application/json
```

#### Request format

**Request body** (JSON array):
```json
[
  "衣服的质量杠杠的",
  "Bohemian Maxi Dress",
  "Vintage Denim Jacket"
]
```

**Parameters**:
- Type: `List[str]`
- Length: 100 items or fewer recommended (to avoid timeouts)
- Per text: 512 characters or fewer recommended

#### Response format

**Success response** (200 OK):
```json
[
  [0.1234, -0.5678, 0.9012, ..., 0.3456],  // 1024-dimensional vector
  [0.2345, 0.6789, -0.1234, ..., 0.4567],  // 1024-dimensional vector
  [0.3456, -0.7890, 0.2345, ..., 0.5678]   // 1024-dimensional vector
]
```

**Fields**:
- Type: `List[List[float]]`
- Each vector: 1024 floats
- Alignment: the output array corresponds to the input array index by index
- Failed items: returned as `null`

**Error example**:
```json
[
  [0.1234, -0.5678, ...],  // success
  null,                    // failure (empty text or another error)
  [0.3456, 0.7890, ...]    // success
]
```

#### cURL example

```bash
# Single text
curl -X POST http://localhost:6005/embed/text \
  -H "Content-Type: application/json" \
  -d '["test query text"]'

# Batch of texts
curl -X POST http://localhost:6005/embed/text \
  -H "Content-Type: application/json" \
  -d '["红色连衣裙", "blue jeans", "vintage dress"]'
```

#### Python example

```python
import requests
import numpy as np

def embed_texts(texts):
    """Embed a list of texts."""
    response = requests.post(
        "http://localhost:6005/embed/text",
        json=texts,
        timeout=30
    )
    response.raise_for_status()
    embeddings = response.json()

    # Convert to a numpy array, dropping failed (null) items
    valid_embeddings = [e for e in embeddings if e is not None]
    return np.array(valid_embeddings)

# Usage
texts = ["红色连衣裙", "blue jeans"]  # "red dress" in Chinese, plus an English query
embeddings = embed_texts(texts)
print(f"Shape: {embeddings.shape}")  # (2, 1024) when both texts succeed

# Compute similarity (the dot product equals cosine similarity only for L2-normalized vectors)
similarity = np.dot(embeddings[0], embeddings[1])
print(f"Similarity: {similarity}")
```

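Filtering out `null` entries, as the example above does, discards the index alignment between inputs and outputs. When the caller needs to know which inputs failed, the pairing can be kept explicitly. A minimal sketch, with `split_results` as a hypothetical helper name:

```python
from typing import Dict, List, Optional, Sequence, Tuple


def split_results(
    inputs: Sequence[str],
    embeddings: Sequence[Optional[List[float]]],
) -> Tuple[Dict[str, List[float]], List[str]]:
    """Pair each input with its vector and collect inputs whose result is None."""
    if len(inputs) != len(embeddings):
        raise ValueError("response length does not match request length")
    succeeded: Dict[str, List[float]] = {}
    failed: List[str] = []
    for item, vector in zip(inputs, embeddings):
        if vector is None:
            failed.append(item)
        else:
            # Note: duplicate inputs share a single dictionary entry.
            succeeded[item] = vector
    return succeeded, failed


# Example with a mocked response in which the second input failed.
ok, bad = split_results(["a", "", "c"], [[0.1, 0.2], None, [0.3, 0.4]])
print(sorted(ok))  # ['a', 'c']
print(bad)         # ['']
```
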
### 5.4 Image Embedding Endpoint

```http
POST /embed/image
Content-Type: application/json
```

#### Request format

**Request body** (JSON array):
```json
[
  "https://example.com/product1.jpg",
  "https://example.com/product2.png",
  "/local/path/to/product3.jpg"
]
```

**Parameters**:
- Type: `List[str]`
- Supported inputs: HTTP URLs or local file paths
- Formats: JPG, PNG, and other common image formats
- Length: 10 items or fewer recommended (image processing is slow)

#### Response format

**Success response** (200 OK):
```json
[
  [0.1234, 0.5678, 0.9012, ..., 0.3456],  // 1024-dimensional vector
  null,                                   // failure (invalid image or download failed)
  [0.3456, 0.7890, 0.2345, ..., 0.5678]   // 1024-dimensional vector
]
```

**Behavior**:
- Automatic download: images given as HTTP URLs are downloaded automatically
- One at a time: images are processed serially (under a lock, for thread safety)
- Fault tolerance: a single failure does not affect the other images

#### cURL example

```bash
# Single image (URL)
curl -X POST http://localhost:6005/embed/image \
  -H "Content-Type: application/json" \
  -d '["https://example.com/product.jpg"]'

# Multiple images (mixing a URL and a local path)
curl -X POST http://localhost:6005/embed/image \
  -H "Content-Type: application/json" \
  -d '["https://example.com/img1.jpg", "/data/images/img2.png"]'
```

#### Python example

```python
import requests
import numpy as np

def embed_images(image_urls):
    """Embed a list of images."""
    response = requests.post(
        "http://localhost:6005/embed/image",
        json=image_urls,
        timeout=120  # Image processing is slow, so use a longer timeout
    )
    response.raise_for_status()
    embeddings = response.json()

    # Keep only the images that were embedded successfully
    valid_embeddings = [(url, emb) for url, emb in zip(image_urls, embeddings) if emb is not None]
    return valid_embeddings

# Usage
image_urls = [
    "https://example.com/dress1.jpg",
    "https://example.com/dress2.jpg"
]

results = embed_images(image_urls)
for url, embedding in results:
    print(f"{url}: {len(embedding)} dimensions")
```

### 5.5 Error Handling

#### HTTP status codes

| Status code | Meaning | Action |
|--------|------|---------|
| 200 | Success | Process the response normally |
| 500 | Server error | Check the service logs |
| 503 | Service unavailable | Model not loaded; check the startup logs |

#### Common error scenarios

1. **Model not loaded**
```json
{
  "detail": "Runtime Error: Text model not loaded"
}
```
**Fix**: check the service startup logs and confirm that the model loaded successfully

2. **Invalid input**
```json
[null, null]
```
**Cause**: the input contains empty strings or `None`

3. **Image download failed**
```json
[
  [0.123, ...],
  null  // invalid URL or network problem
]
```
**Fix**: check that the URL is reachable

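For failures that may be transient, such as a 503 while the models are still loading, a client can retry with exponential backoff. The sketch below is an illustration under stated assumptions, not the project's client: `post_with_retry` is a hypothetical helper, treating 500 and 503 as retryable is an assumption based on the status table above, and the HTTP call is injected as a callable that returns `(status_code, body)`.

```python
import time
from typing import Any, Callable, Tuple

RETRYABLE_STATUS = {500, 503}  # assumption: treat these as possibly transient


def post_with_retry(send: Callable[[], Tuple[int, Any]],
                    max_attempts: int = 3,
                    base_delay_s: float = 0.5) -> Any:
    """Call `send` until it returns 200, retrying retryable statuses with backoff."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if status not in RETRYABLE_STATUS or attempt == max_attempts:
            raise RuntimeError(f"request failed with status {status}: {body}")
        # Exponential backoff: base, 2x base, 4x base, ...
        time.sleep(base_delay_s * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")
```
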
---

## Configuration

### 6.1 Service Configuration

Edit `embeddings/config.py` to change the service settings:

```python
class EmbeddingConfig:
    # ========== Service settings ==========
    HOST = "0.0.0.0"  # Listen on all network interfaces
    PORT = 6005       # Default port
```

**Production recommendations**:
- Put a reverse proxy (Nginx) in front of the service to handle SSL
- Restrict access with firewall rules
- Isolate the service in a Docker container

### 6.2 Model Configuration

#### Text model configuration

```python
# ========== BGE-M3 text model ==========
TEXT_MODEL_DIR = "Xorbits/bge-m3"  # HuggingFace model ID
TEXT_DEVICE = "cuda"               # Device selection
TEXT_BATCH_SIZE = 32               # Batch size
```

**Choosing TEXT_DEVICE**:
- `"cuda"`: GPU acceleration (recommended; requires CUDA)
- `"cpu"`: CPU mode (slower, but broadly compatible)

**Suggested batch sizes**:
- GPU (16GB VRAM): 32-64
- GPU (8GB VRAM): 16-32
- CPU: 8-16

#### Image model configuration

```python
# ========== CN-CLIP image model ==========
IMAGE_MODEL_NAME = "ViT-H-14"  # Model name
IMAGE_DEVICE = None            # None = auto-detect
IMAGE_BATCH_SIZE = 8           # Batch size
```

**Choosing IMAGE_DEVICE**:
- `None`: auto-detect (recommended)
- `"cuda"`: force GPU
- `"cpu"`: force CPU

### 6.3 Batch Configuration

**Batch size tuning**:

| Scenario | Text batch size | Image batch size | Notes |
|------|---------------|---------------|------|
| Development / testing | 16 | 1 | Fast response |
| Production (GPU) | 32-64 | 4-8 | Balanced performance |
| Production (CPU) | 8-16 | 1-2 | Avoids running out of memory |
| Offline batch jobs | 128+ | 16+ | Maximizes throughput |

**Batch tuning tips**:
1. Monitor GPU memory usage with `nvidia-smi`
2. Increase `batch_size` step by step until an out-of-memory (OOM) error occurs
3. Keep about 20% of memory as headroom

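The tuning procedure above can be sketched as a small search loop. Everything here is illustrative: `find_batch_size` is a hypothetical helper, `fits` stands in for a trial run that reports whether a given batch size completed without running out of memory, and applying the 20% headroom to the batch size (rather than to measured memory) is a simplifying assumption.

```python
from typing import Callable


def find_batch_size(fits: Callable[[int], bool],
                    start: int = 8,
                    limit: int = 1024,
                    headroom: float = 0.2) -> int:
    """Double the batch size while `fits` succeeds, then back off by `headroom`."""
    if not fits(start):
        raise RuntimeError(f"starting batch size {start} does not fit")
    largest_ok = start
    while largest_ok * 2 <= limit and fits(largest_ok * 2):
        largest_ok *= 2
    # Keep a safety margin below the largest size that worked.
    return max(1, int(largest_ok * (1 - headroom)))


# Example with a stand-in check: pretend anything above 100 runs out of memory.
print(find_batch_size(lambda n: n <= 100))  # 51
```
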
---

## Client Integration Examples

### 7.1 Python Client

#### Basic client class

```python
import requests
from typing import List, Optional
import numpy as np

class EmbeddingServiceClient:
    """Client for the embedding service."""

    def __init__(self, base_url: str = "http://localhost:6005"):
        self.base_url = base_url.rstrip('/')
        self.timeout = 30

    def health_check(self) -> dict:
        """Health check."""
        response = requests.get(f"{self.base_url}/health", timeout=5)
        response.raise_for_status()
        return response.json()

    def embed_text(self, text: str) -> Optional[List[float]]:
        """Embed a single text."""
        result = self.embed_texts([text])
        return result[0] if result else None

    def embed_texts(self, texts: List[str]) -> List[Optional[List[float]]]:
        """Embed a batch of texts."""
        if not texts:
            return []

        response = requests.post(
            f"{self.base_url}/embed/text",
            json=texts,
            timeout=self.timeout
        )
        response.raise_for_status()
        return response.json()

    def embed_image(self, image_url: str) -> Optional[List[float]]:
        """Embed a single image."""
        result = self.embed_images([image_url])
        return result[0] if result else None

    def embed_images(self, image_urls: List[str]) -> List[Optional[List[float]]]:
        """Embed a batch of images."""
        if not image_urls:
            return []

        response = requests.post(
            f"{self.base_url}/embed/image",
            json=image_urls,
            timeout=120  # Image processing needs more time
        )
        response.raise_for_status()
        return response.json()

    def embed_texts_to_numpy(self, texts: List[str]) -> Optional[np.ndarray]:
        """Embed a batch of texts and return a numpy array (None if every item failed)."""
        embeddings = self.embed_texts(texts)
        valid_embeddings = [e for e in embeddings if e is not None]
        if not valid_embeddings:
            return None
        return np.array(valid_embeddings, dtype=np.float32)

# Usage example
if __name__ == "__main__":
    client = EmbeddingServiceClient()

    # Health check
    health = client.health_check()
    print(f"Service status: {health}")

    # Text embedding
    texts = ["红色连衣裙", "blue jeans", "vintage dress"]
    embeddings = client.embed_texts_to_numpy(texts)
    if embeddings is None:
        raise SystemExit("All texts failed to embed")
    print(f"Embeddings shape: {embeddings.shape}")

    # Compute similarity (requires scikit-learn)
    from sklearn.metrics.pairwise import cosine_similarity
    similarities = cosine_similarity(embeddings)
    print(f"Similarity matrix:\n{similarities}")
```

#### Advanced usage: asynchronous client

```python
import aiohttp
import asyncio
from typing import List, Optional

class AsyncEmbeddingClient:
    """Asynchronous client for the embedding service."""

    def __init__(self, base_url: str = "http://localhost:6005"):
        self.base_url = base_url.rstrip('/')
        self.session: Optional[aiohttp.ClientSession] = None

    async def __aenter__(self):
        self.session = aiohttp.ClientSession()
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        if self.session:
            await self.session.close()

    async def embed_texts(self, texts: List[str]) -> List[Optional[List[float]]]:
        """Embed a batch of texts asynchronously."""
        if not texts:
            return []

        if not self.session:
            raise RuntimeError("Client not initialized. Use 'async with'.")

        async with self.session.post(
            f"{self.base_url}/embed/text",
            json=texts,
            timeout=aiohttp.ClientTimeout(total=30)
        ) as response:
            response.raise_for_status()
            return await response.json()

# Usage example
async def main():
    async with AsyncEmbeddingClient() as client:
        texts = ["text1", "text2", "text3"]
        embeddings = await client.embed_texts(texts)
        print(f"Got {len(embeddings)} embeddings")

asyncio.run(main())
```

### 7.2 Java Client

#### Basic client class

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ArrayNode;

public class EmbeddingServiceClient {
    private final HttpClient httpClient;
    private final ObjectMapper objectMapper;
    private final String baseUrl;

    public EmbeddingServiceClient(String baseUrl) {
        this.baseUrl = baseUrl.replaceAll("/$", "");
        this.httpClient = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(10))
            .build();
        this.objectMapper = new ObjectMapper();
    }

    /**
     * Health check
     */
    public HealthStatus healthCheck() throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/health"))
            .timeout(Duration.ofSeconds(5))
            .GET()
            .build();

        HttpResponse<String> response = httpClient.send(
            request,
            HttpResponse.BodyHandlers.ofString()
        );

        JsonNode json = objectMapper.readTree(response.body());
        return new HealthStatus(
            json.get("status").asText(),
            json.get("text_model_loaded").asBoolean(),
            json.get("image_model_loaded").asBoolean()
        );
    }

    /**
     * Embed a batch of texts
     */
    public List<float[]> embedTexts(List<String> texts) throws Exception {
        // Build the request body
        ArrayNode requestBody = objectMapper.createArrayNode();
        for (String text : texts) {
            requestBody.add(text);
        }

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/embed/text"))
            .header("Content-Type", "application/json")
            .timeout(Duration.ofSeconds(30))
            .POST(HttpRequest.BodyPublishers.ofString(
                objectMapper.writeValueAsString(requestBody)
            ))
            .build();

        HttpResponse<String> response = httpClient.send(
            request,
            HttpResponse.BodyHandlers.ofString()
        );

        if (response.statusCode() != 200) {
            throw new RuntimeException("API error: " + response.body());
        }

        // Parse the response; failed items stay null to preserve index alignment
        JsonNode root = objectMapper.readTree(response.body());
        List<float[]> embeddings = new ArrayList<>();

        for (JsonNode item : root) {
            if (item.isNull()) {
                embeddings.add(null);
            } else {
                float[] vector = objectMapper.treeToValue(item, float[].class);
                embeddings.add(vector);
            }
        }

        return embeddings;
    }

    /**
     * Compute cosine similarity
     */
    public static float cosineSimilarity(float[] v1, float[] v2) {
        if (v1.length != v2.length) {
            throw new IllegalArgumentException("Vectors must be same length");
        }

        float dotProduct = 0.0f;
        float norm1 = 0.0f;
        float norm2 = 0.0f;

        for (int i = 0; i < v1.length; i++) {
            dotProduct += v1[i] * v2[i];
            norm1 += v1[i] * v1[i];
            norm2 += v2[i] * v2[i];
        }

        return (float) (dotProduct / (Math.sqrt(norm1) * Math.sqrt(norm2)));
    }

    // Health status data class
    public static class HealthStatus {
        public final String status;
        public final boolean textModelLoaded;
        public final boolean imageModelLoaded;

        public HealthStatus(String status, boolean textModelLoaded, boolean imageModelLoaded) {
            this.status = status;
            this.textModelLoaded = textModelLoaded;
            this.imageModelLoaded = imageModelLoaded;
        }

        @Override
        public String toString() {
            return String.format("HealthStatus{status='%s', textModelLoaded=%b, imageModelLoaded=%b}",
                status, textModelLoaded, imageModelLoaded);
        }
    }

    // Usage example
    public static void main(String[] args) throws Exception {
        EmbeddingServiceClient client = new EmbeddingServiceClient("http://localhost:6005");

        // Health check
        HealthStatus health = client.healthCheck();
        System.out.println("Health: " + health);

        // Text embedding
        List<String> texts = List.of("红色连衣裙", "blue jeans", "vintage dress");
        List<float[]> embeddings = client.embedTexts(texts);

        System.out.println("Got " + embeddings.size() + " embeddings");
        for (int i = 0; i < embeddings.size(); i++) {
            System.out.println("Embedding " + i + " dimensions: " +
                (embeddings.get(i) != null ? embeddings.get(i).length : "null"));
        }

        // Compute similarity
        if (embeddings.get(0) != null && embeddings.get(1) != null) {
            float similarity = cosineSimilarity(embeddings.get(0), embeddings.get(1));
            System.out.println("Similarity between text 0 and 1: " + similarity);
        }
    }
}
```

**Maven dependency** (`pom.xml`):

```xml
<dependencies>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.15.2</version>
    </dependency>
</dependencies>
```

945
|
+### 7.3 cURL示例 |
|
|
946
|
+ |
|
|
947
|
+#### 健康检查 |
|
|
948
|
+ |
|
|
949
|
+```bash |
|
|
950
|
+curl http://localhost:6005/health |
|
|
951
|
+``` |
|
|
952
|
+ |
|
|
953
|
+#### 文本向量化 |
|
|
954
|
+ |
|
|
955
|
+```bash |
|
|
956
|
+# 单个文本 |
|
|
957
|
+curl -X POST http://localhost:6005/embed/text \ |
|
|
958
|
+ -H "Content-Type: application/json" \ |
|
|
959
|
+ -d '["衣服的质量杠杠的"]' \ |
|
|
960
|
+ | jq '.[0][0:10]' # 打印前10维 |
|
|
961
|
+ |
|
|
962
|
+# 批量文本 |
|
|
963
|
+curl -X POST http://localhost:6005/embed/text \ |
|
|
964
|
+ -H "Content-Type: application/json" \ |
|
|
965
|
+ -d '["红色连衣裙", "blue jeans", "vintage dress"]' \ |
|
|
966
|
+ | jq '. | length' # 检查返回数量 |
|
|
967
|
+``` |
|
|
968
|
+ |
|
|
969
|
+#### 图片向量化 |
|
|
970
|
+ |
|
|
971
|
+```bash |
|
|
972
|
+# URL图片 |
|
|
973
|
+curl -X POST http://localhost:6005/embed/image \ |
|
|
974
|
+ -H "Content-Type: application/json" \ |
|
|
975
|
+ -d '["https://example.com/product.jpg"]' \ |
|
|
976
|
+ | jq '.[0][0:5]' |
|
|
977
|
+ |
|
|
978
|
+# 本地图片 |
|
|
979
|
+curl -X POST http://localhost:6005/embed/image \ |
|
|
980
|
+ -H "Content-Type: application/json" \ |
|
|
981
|
+ -d '["/data/images/product.jpg"]' |
|
|
982
|
+``` |
|
|
983
|
+ |
|
|
984
|
#### Error Handling Example

```bash
# Check that the service is up
if ! curl -f http://localhost:6005/health > /dev/null 2>&1; then
  echo "Embedding service is not healthy!"
  exit 1
fi

# Call the API and check the result
response=$(curl -s -X POST http://localhost:6005/embed/text \
  -H "Content-Type: application/json" \
  -d '["test query"]')

# Succeed only if the body is a JSON array whose first element is not null.
# An error object or invalid JSON makes jq exit non-zero and is treated as a failure.
if ! echo "$response" | jq -e 'if type == "array" then .[0] != null else false end' > /dev/null; then
  echo "Embedding failed!"
  echo "$response"
  exit 1
fi

echo "Embedding succeeded!"
```

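The same checks can be applied in Python after the JSON body has been parsed. The sketch below is a minimal, hypothetical helper (`validate_embeddings` is not part of the project). It assumes the endpoint returns a JSON array with one entry per input, that a failed item comes back as `null`, and that vectors have 1024 dimensions as stated in the appendix. The final line runs it on toy data with `dim=2` so the input stays readable.

```python
from typing import Any, List


def validate_embeddings(response: Any, expected_count: int, dim: int = 1024) -> List[int]:
    """Return the indices of inputs whose embedding is missing or malformed.

    Raises ValueError when the response is not a list of the expected length,
    for example when the service returned an error object instead.
    """
    if not isinstance(response, list):
        raise ValueError(f"expected a JSON array, got {type(response).__name__}")
    if len(response) != expected_count:
        raise ValueError(f"expected {expected_count} embeddings, got {len(response)}")

    failed = []
    for index, vector in enumerate(response):
        # A null entry (None after JSON parsing) marks an input that failed to embed.
        if not isinstance(vector, list) or len(vector) != dim:
            failed.append(index)
    return failed


# Toy data: entry 1 is null and entry 2 has the wrong length.
print(validate_embeddings([[0.1, 0.2], None, [0.3]], expected_count=3, dim=2))
```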
---

## Performance Comparison and Optimization

### 8.1 Performance Comparison

#### Local Service Performance

| Operation | Hardware | Latency | Throughput |
|-----------|----------|---------|------------|
| Text embedding (single) | GPU (RTX 3090) | ~80 ms | ~12 qps |
| Text embedding (batch of 32) | GPU (RTX 3090) | ~2.5 s | ~256 qps |
| Text embedding (single) | CPU (16 cores) | ~500 ms | ~2 qps |
| Image embedding (single) | GPU (RTX 3090) | ~150 ms | ~6 qps |
| Image embedding (batch of 4) | GPU (RTX 3090) | ~600 ms | ~6 qps |

Throughput counts embedded items per second. The batch-of-32 row does not agree with itself: 32 texts in ~2.5 s is about 13 texts per second, while ~256 qps would correspond to roughly 125 ms per batch, so one of the two figures should be re-measured.

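Latency and throughput in the local-service table are linked: when batches are processed one after another, throughput is roughly the batch size divided by the batch latency. A small sketch of that arithmetic follows; the helper name is illustrative, and it assumes consecutive batches do not overlap.

```python
def throughput_per_second(batch_size: int, latency_seconds: float) -> float:
    """Items processed per second when batches run back to back."""
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    return batch_size / latency_seconds


# Batch sizes and latencies taken from the local-service table.
rows = [
    ("text, single, GPU", 1, 0.080),
    ("text, batch of 32, GPU", 32, 2.5),
    ("text, single, CPU", 1, 0.500),
    ("image, single, GPU", 1, 0.150),
    ("image, batch of 4, GPU", 4, 0.600),
]
for name, batch_size, latency in rows:
    print(f"{name}: {throughput_per_second(batch_size, latency):.1f} items/s")
```

Comparing the computed values with the listed throughput is a quick sanity check on measured numbers.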
#### Cloud Service Performance

| Operation | Metric | Value |
|-----------|--------|-------|
| Text embedding (single) | Latency | 300-400 ms |
| Text embedding (batch) | Throughput | ~2-3 qps |
| API limits | Rate limit | Depends on the plan |
| Availability | SLA | 99.9% |

### 8.2 Cost Comparison

#### Local Service Cost

| Configuration | Hardware cost (monthly) | Electricity (monthly) | Total (monthly) |
|---------------|-------------------------|-----------------------|-----------------|
| GPU server (RTX 3090) | ¥3000 | ¥500 | ¥3500 |
| GPU server (A100) | ¥8000 | ¥800 | ¥8800 |
| CPU server (16 cores) | ¥800 | ¥200 | ¥1000 |

#### Cloud Service Cost

Alibaba Cloud DashScope pricing (for reference only):

| Plan | Price | Quota | Use case |
|------|-------|-------|----------|
| Pay-as-you-go | ¥0.0007 / 1K tokens | Unlimited | Testing / small scale |
| Basic | ¥100 / month | 1M tokens | Small applications |
| Professional | ¥500 / month | 10M tokens | Medium scale |
| Enterprise | Custom | Unlimited | Large scale |

**Cost calculation example**:

Assume 100,000 searches per day with an average of 10 tokens per query:
- Daily volume: 100,000 × 10 = 1M tokens
- Monthly volume: 1M × 30 = 30M tokens
- Monthly cost (pay-as-you-go): 30M tokens × ¥0.7 per 1M tokens = ¥21

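
The calculation above can be reproduced for other traffic levels with a few lines of Python. This is a sketch only: the function name and parameters are illustrative, and the default price is the pay-as-you-go reference figure from the pricing table, which may not match current pricing.

```python
def monthly_cost_cny(searches_per_day: int,
                     tokens_per_query: int,
                     price_per_1k_tokens: float = 0.0007,
                     days_per_month: int = 30) -> float:
    """Estimated pay-as-you-go cost in CNY for one month."""
    monthly_tokens = searches_per_day * tokens_per_query * days_per_month
    return monthly_tokens / 1000 * price_per_1k_tokens


cost = monthly_cost_cny(searches_per_day=100_000, tokens_per_query=10)
print(f"¥{cost:.2f}")  # prints ¥21.00
```

At this reference price, the ¥3500 monthly total listed for an RTX 3090 server corresponds to about 5,000M tokens per month (3500 / 0.7), so on cost alone the cloud API is cheaper below that volume; latency and operational effort are separate considerations.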
### 8.3 Optimization Recommendations

#### Local Service Optimization

1. **GPU utilization**
```python
# Increase the batch size
TEXT_BATCH_SIZE = 64  # raised from 32 to 64
```

2. **Reduced precision (FP16)**
```python
# Use half-precision floats (saves GPU memory)
import torch
model = model.half()  # FP16; `model` is the already-loaded torch module
```

3. **Model warm-up**
```python
# Run one dummy request when the service starts so the first real request is not slow
@app.on_event("startup")
async def warmup():
    _text_model.encode(["warmup"], device="cuda")
```

4. **Connection tuning**
```bash
# uvicorn options (append them to the uvicorn command that starts the service):
#   --workers 1             single worker (GPU model constraint: each worker loads its own copy)
#   --backlog 2048          larger queue for pending connections
#   --limit-concurrency 32  cap on concurrent connections
--workers 1 --backlog 2048 --limit-concurrency 32
```

#### Cloud Service Optimization

1. **Batch merging**
```python
import asyncio

# Accumulate individual requests and send them to the API as one batch
class BatchEncoder:
    def __init__(self, batch_size=32, timeout=0.1):
        self.batch_size = batch_size
        self.timeout = timeout  # seconds; not used in this excerpt
        self.queue = []

    async def encode(self, text: str):
        # Queue the text together with a future that will receive its embedding
        future = asyncio.get_running_loop().create_future()
        self.queue.append((text, future))

        # Flush once a full batch has accumulated.
        # _flush() is not shown in this excerpt; it has to send the queued
        # texts in one API call and resolve each future.
        if len(self.queue) >= self.batch_size:
            self._flush()

        return await future
```

2. **Local cache**
```python
import hashlib
import pickle  # presumably used by _load_cache (not shown) to read the cache file

class CachedEncoder:
    def __init__(self, cache_file="embedding_cache.pkl"):
        self.cache = self._load_cache(cache_file)

    def encode(self, text: str):
        # Key the cache on a hash of the text
        key = hashlib.md5(text.encode()).hexdigest()
        if key in self.cache:
            return self.cache[key]

        # Cache miss: call the cloud API (_call_api, not shown) and remember the result
        embedding = self._call_api(text)
        self.cache[key] = embedding
        return embedding
```

3. **Fallback strategy**
```python
import logging

logger = logging.getLogger(__name__)

class HybridEncoder:
    def __init__(self):
        self.cloud_encoder = CloudTextEncoder()
        self.local_encoder = None  # loaded on demand

    def encode(self, text: str):
        try:
            return self.cloud_encoder.encode(text)
        except Exception as e:
            logger.warning(f"Cloud API failed: {e}, falling back to local")
            # Load the local model only the first time the cloud call fails
            if self.local_encoder is None:
                self.local_encoder = BgeEncoder()
            return self.local_encoder.encode(text)
```

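The batch-merging idea needs a flush step and a timeout so that a partially filled batch is still sent. The following is one possible sketch, not the project's implementation: `MicroBatcher`, its methods, and the `encode_batch` callable (an async function that takes a list of texts and returns one vector per text) are all assumptions made for illustration. The demo at the end uses a fake batch function, so no API call is made; with `batch_size=2` the three texts are sent as one full batch plus one batch flushed by the timeout.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence

Vector = List[float]


class MicroBatcher:
    """Group concurrent encode() calls into batched calls to `encode_batch`."""

    def __init__(self,
                 encode_batch: Callable[[Sequence[str]], Awaitable[List[Vector]]],
                 batch_size: int = 32,
                 timeout: float = 0.1) -> None:
        self._encode_batch = encode_batch  # hypothetical async batch API call
        self._batch_size = batch_size
        self._timeout = timeout
        self._pending: list = []   # (text, future) pairs waiting for a flush
        self._timer = None         # handle of the scheduled timeout flush
        self._tasks: set = set()   # strong references to running batch tasks

    async def encode(self, text: str) -> Vector:
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._pending.append((text, future))
        if len(self._pending) >= self._batch_size:
            self._start_flush()
        elif self._timer is None:
            # First item of a new batch: flush after `timeout` even if not full.
            self._timer = loop.call_later(self._timeout, self._start_flush)
        return await future

    def _start_flush(self) -> None:
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        batch, self._pending = self._pending, []
        if batch:
            task = asyncio.get_running_loop().create_task(self._run_batch(batch))
            self._tasks.add(task)
            task.add_done_callback(self._tasks.discard)

    async def _run_batch(self, batch) -> None:
        texts = [text for text, _ in batch]
        try:
            vectors = await self._encode_batch(texts)
            if len(vectors) != len(batch):
                raise RuntimeError("batch API returned a different number of vectors")
            for (_, future), vector in zip(batch, vectors):
                if not future.done():
                    future.set_result(vector)
        except Exception as exc:
            # Propagate the failure to every caller waiting on this batch.
            for _, future in batch:
                if not future.done():
                    future.set_exception(exc)


async def _demo() -> None:
    calls = []

    async def fake_encode_batch(texts):
        # Stand-in for the cloud API: one 1-dimensional "vector" per text.
        calls.append(list(texts))
        return [[float(len(text))] for text in texts]

    batcher = MicroBatcher(fake_encode_batch, batch_size=2, timeout=0.05)
    results = await asyncio.gather(*(batcher.encode(t) for t in ["a", "bb", "ccc"]))
    print(results, calls)


asyncio.run(_demo())
```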
---

## Troubleshooting

### 9.1 Common Problems

#### Problem 1: The service fails to start

**Symptom**:
```bash
$ ./scripts/start_embedding_service.sh
Error: Port 6005 already in use
```

**Solution**:
```bash
# Find the process that is using the port
lsof -i :6005

# Stop that process (fall back to kill -9 only if it does not exit)
kill <PID>

# Or change the port in the configuration file
# embeddings/config.py: PORT = 6006
```

#### Problem 2: CUDA Out of Memory

**Symptom**:
```
RuntimeError: CUDA out of memory. Tried to allocate 2.00 GiB
```

**Solution**:
```python
# Reduce the batch size
TEXT_BATCH_SIZE = 16  # lowered from 32 to 16

# Or run on the CPU
TEXT_DEVICE = "cpu"
```

#### Problem 3: Model download fails

**Symptom**:
```
OSError: Can't load tokenizer for 'Xorbits/bge-m3'
```

**Solution**:
```bash
# Download the model manually
huggingface-cli download Xorbits/bge-m3

# Or use a mirror (set this before downloading)
export HF_ENDPOINT=https://hf-mirror.com
```

#### Problem 4: Cloud API key invalid or not set

**Symptom**:
```
ERROR: DASHSCOPE_API_KEY environment variable is not set!
```

**Solution**:
```bash
# Set the environment variable
export DASHSCOPE_API_KEY="sk-your-key"

# Verify that it is set
echo "$DASHSCOPE_API_KEY"
```

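An application can perform the same check at startup so that a missing key fails fast with a clear message. This is a minimal sketch; `require_api_key` and `mask` are hypothetical helpers, and only the variable name `DASHSCOPE_API_KEY` comes from the error message above.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None,
                    name: str = "DASHSCOPE_API_KEY") -> str:
    """Return the API key from the environment or raise a descriptive error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} environment variable is not set")
    return key


def mask(key: str) -> str:
    """Show only the last four characters so the key can be logged safely."""
    return "*" * max(len(key) - 4, 0) + key[-4:]


# A dict stands in for os.environ here so the example runs anywhere.
print(mask(require_api_key({"DASHSCOPE_API_KEY": "sk-your-key"})))
```

Logging the masked value rather than the key itself confirms which key is in use without exposing it.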
#### Problem 5: API rate limiting

**Symptom**:
```
Rate limit exceeded. Please try again later.
```

**Solution**:
```python
# Add a delay between batches
import time
for batch in batches:
    embeddings = encoder.encode_batch(batch)
    time.sleep(0.1)  # wait 100 ms between batches
```

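A fixed delay slows down every batch, including the ones that would have succeeded. A common alternative is to retry only the failed call with exponential backoff and jitter. The sketch below assumes nothing about the DashScope client: `call_with_backoff` is a hypothetical helper, and the exception type that signals rate limiting depends on the client in use, so it is passed in through `retry_on`.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(func: Callable[[], T],
                      max_attempts: int = 5,
                      base_delay: float = 0.1,
                      max_delay: float = 5.0,
                      retry_on: Tuple[Type[BaseException], ...] = (Exception,),
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `func`, retrying with exponential backoff and jitter on failure."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Jitter spreads out retries from clients that failed at the same time.
            sleep(delay * random.uniform(0.5, 1.0))
    raise ValueError("max_attempts must be at least 1")


# Demo: a function that is "rate limited" twice before it succeeds.
attempts = {"count": 0}

def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("Rate limit exceeded. Please try again later.")
    return "ok"

print(call_with_backoff(flaky, base_delay=0.01), attempts["count"])
```

In the loop above, each call would become `embeddings = call_with_backoff(lambda: encoder.encode_batch(batch))`.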
### 9.2 Viewing Logs

#### Service Logs

```bash
# Start the service, watch its output live, and save a copy to embedding.log
./scripts/start_embedding_service.sh 2>&1 | tee embedding.log

# Or use systemd (if the service is configured as a unit)
journalctl -u embedding-service -f
```

#### Python Application Logs

```python
import logging

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

logger = logging.getLogger(__name__)

# Usage (`encoder` and `texts` come from the surrounding application)
logger.info("Encoding texts...")
try:
    embeddings = encoder.encode(texts)
except Exception as e:
    logger.error("Encoding failed: %s", e)
```

#### GPU Monitoring

```bash
# Monitor GPU usage in real time
watch -n 1 nvidia-smi

# Query detailed information
nvidia-smi --query-gpu=timestamp,name,temperature.gpu,utilization.gpu,utilization.memory,memory.total,memory.used,memory.free --format=csv
```

### 9.3 Performance Tuning

#### Profiling

```python
import time
import numpy as np

def benchmark_encoder(encoder, texts, iterations=100):
    """Benchmark encoder.encode() on a fixed batch of texts."""
    times = []

    for _ in range(iterations):
        start = time.perf_counter()  # monotonic, high-resolution timer
        encoder.encode(texts)
        times.append(time.perf_counter() - start)

    times = np.array(times)
    print(f"Mean: {times.mean():.3f}s")
    print(f"Std:  {times.std():.3f}s")
    print(f"Min:  {times.min():.3f}s")
    print(f"Max:  {times.max():.3f}s")
    print(f"QPS:  {len(texts) / times.mean():.2f}")  # texts per second

# Usage
benchmark_encoder(encoder, texts=["test"] * 32, iterations=100)
```

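Mean and maximum hide how latency is distributed, so percentiles are often more useful for capacity planning. The following sketch uses only the standard library. `latency_percentiles` is a hypothetical helper, and the warm-up count is an arbitrary choice; the demo times a trivial function so it runs without a model.

```python
import statistics
import time
from typing import Callable, Dict


def latency_percentiles(func: Callable[[], object],
                        iterations: int = 100,
                        warmup: int = 5) -> Dict[str, float]:
    """Time `func` repeatedly and return p50/p95/p99 latency in seconds."""
    for _ in range(warmup):
        func()  # discard warm-up runs (model loading, caches, CUDA initialization)

    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)

    # quantiles(n=100) returns the 99 cut points for percentiles 1 through 99.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


stats = latency_percentiles(lambda: sum(range(10_000)), iterations=50)
print({name: f"{value * 1e3:.3f} ms" for name, value in stats.items()})
```

For the encoder, the call would look like `latency_percentiles(lambda: encoder.encode(texts))`.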
#### Memory Profiling

```bash
# Install the Python memory profiler
pip install memory_profiler
```

Decorate the function to profile:

```python
from memory_profiler import profile

@profile
def encode_batch(texts):
    return encoder.encode(texts)
```

Then run the script through the profiler:

```bash
python -m memory_profiler script.py
```

---

## Appendix

### 10.1 Vector Dimensions

#### Why 1024 dimensions?

1. **Expressiveness**: 1024 dimensions are enough to capture rich semantic information
2. **Compute efficiency**: the dimensionality is moderate, so similarity computation stays fast
3. **Storage balance**: each vector has a reasonable size (about 4 KB)
4. **Model choice**: both BGE-M3 and text-embedding-v4 use 1024 dimensions

#### Vector Storage Calculation

```
Size of one vector  = 1024 × 4 bytes (FP32) = 4 KB
1 million vectors   = 4 KB × 1,000,000  ≈ 4 GB
10 million vectors  = 4 KB × 10,000,000 ≈ 40 GB
```

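The same estimate can be computed for other collection sizes or value widths. This is a small sketch with an illustrative function name; it counts raw vector bytes only.

```python
def vector_storage_bytes(num_vectors: int, dim: int = 1024, bytes_per_value: int = 4) -> int:
    """Raw size of `num_vectors` dense vectors, excluding any index overhead."""
    return num_vectors * dim * bytes_per_value


for count in (1, 1_000_000, 10_000_000):
    size = vector_storage_bytes(count)
    print(f"{count:>10,} vectors: {size:,} bytes = {size / 1e9:.2f} GB ({size / 2**30:.2f} GiB)")

# Storing each value in 2 bytes (FP16) halves the raw size.
print(vector_storage_bytes(1_000_000, bytes_per_value=2) / 1e9, "GB")
```

Index structures built by the vector store add overhead on top of these raw sizes, so the figures are a lower bound.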
### 10.2 Model Versions

#### BGE-M3

- **HuggingFace ID**: `Xorbits/bge-m3`
- **Paper**: [BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation](https://arxiv.org/abs/2402.03216)
- **GitHub**: https://github.com/FlagOpen/FlagEmbedding
- **Features**:
  - Supports 100+ languages
  - Maximum input length of 8192 tokens
  - Rich semantic representations

#### CN-CLIP

- **Model**: ViT-H-14
- **Paper**: [Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese](https://arxiv.org/abs/2211.01335)
- **GitHub**: https://github.com/OFA-Sys/Chinese-CLIP
- **Features**:
  - Chinese image-text understanding
  - Supports both image retrieval and text retrieval
  - Well suited to e-commerce scenarios

#### Aliyun text-embedding-v4

- **Provider**: Alibaba Cloud DashScope
- **Documentation**: https://help.aliyun.com/zh/model-studio/getting-started/models
- **Features**:
  - Cloud API, no deployment required
  - High availability (99.9% SLA)
  - Automatic scaling

### 10.3 Related Documents

#### Project Documents

- **Search API integration guide**: `docs/搜索API对接指南.md`
- **Index field reference**: `docs/索引字段说明v2.md`
- **System design document**: `docs/系统设计文档.md`
- **CLAUDE project guide**: `CLAUDE.md`

#### External References

- **BGE-M3 official documentation**: https://github.com/FlagOpen/FlagEmbedding/tree/master/BGE_M3
- **Alibaba Cloud DashScope**: https://help.aliyun.com/zh/model-studio/
- **Elasticsearch vector search**: https://www.elastic.co/guide/en/elasticsearch/reference/current/knn-search.html
- **FastAPI documentation**: https://fastapi.tiangolo.com/

#### Test Scripts

```bash
# Test the local embedding service
./scripts/test_embedding_service.sh

# Test the cloud embedding service
python scripts/test_cloud_embedding.py

# Run the performance benchmark
python scripts/benchmark_embeddings.py
```

---

## Version History

| Version | Date | Changes |
|---------|------|---------|
| v1.0 | 2025-12-23 | Initial version: complete documentation of the embedding module |

---

## Contact

For questions or suggestions, please contact the project maintainers.

**Project repository**: `/data/tw/SearchEngine`

**Documentation directory**: `docs/`