Commit 7a013ca7b2030bf2b14b92eeb9f5282fea304300
1 parent: d47889b9
Multimodal text embedding service OK
Showing 12 changed files with 493 additions and 179 deletions
docs/CNCLIP_SERVICE说明文档.md
| ... | ... | @@ -4,9 +4,11 @@ |
| 4 | 4 | |
| 5 | 5 | ## 1. Scope and boundaries |
| 6 | 6 | |
| 7 | -- `cnclip` is a standalone gRPC service (default `grpc://127.0.0.1:51000`). | |
| 8 | -- The image embedding service (default `6008`) calls it to serve `/embed/image` when `USE_CLIP_AS_SERVICE=true`. | |
| 9 | -- `cnclip` does not handle text vectors; text vectors are served by TEI (8080). | |
| 7 | +- `cnclip` is a standalone gRPC service (default `grpc://127.0.0.1:51000`) backed by **Chinese-CLIP**: **images and text share one vector space** (so text and images can retrieve each other). | |
| 8 | +- The image embedding service (default `6008`) calls it over gRPC when `image_backend: clip_as_service` to serve: | |
| 9 | +  - `POST /embed/image`: image URL / local path → image vector; | |
| 10 | +  - `POST /embed/clip_text`: **natural-language phrase** → text-tower vector (same space as the image vectors above; used for `image_embedding` retrieval, text-to-image search, etc.). | |
| 11 | +- **Text vectors for semantic retrieval** (`title_embedding`, semantic query recall) are still served by **TEI (8080) + `POST /embed/text` (6005)**; that is a different model and a different vector space from CN-CLIP, so do not mix them up. | |
| 10 | 12 | |
| 11 | 13 | ## 2. Code and script entry points |
| 12 | 14 | |
| ... | ... | @@ -138,6 +140,8 @@ cat third-party/clip-as-service/server/torch-flow-temp.yml |
| 138 | 140 | |
| 139 | 141 | ### 7.3 Send one encode request (triggers model load) |
| 140 | 142 | |
| 143 | +**Direct gRPC (when text and image URLs are mixed, the client tells them apart automatically):** | |
| 144 | + | |
| 141 | 145 | ```bash |
| 142 | 146 | PYTHONPATH="third-party/clip-as-service/client:${PYTHONPATH}" \ |
| 143 | 147 | NO_VERSION_CHECK=1 \ |
| ... | ... | @@ -151,6 +155,14 @@ PY |
| 151 | 155 | |
| 152 | 156 | The expected `shape` is `(1, 1024)`. |
| 153 | 157 | |
| 158 | +**HTTP (recommended for business callers): images go to 6008 `/embed/image`, plain text goes to `/embed/clip_text` (do not send `http://` image URLs to `clip_text`):** | |
| 159 | + | |
| 160 | +```bash | |
| 161 | +curl -sS -X POST "http://127.0.0.1:6008/embed/clip_text?normalize=true&priority=1" \ | |
| 162 | + -H "Content-Type: application/json" \ | |
| 163 | + -d '["纯棉T恤", "芭比娃娃"]' | |
| 164 | +``` | |
| 165 | + | |
| 154 | 166 | ### 7.4 GPU verification |
| 155 | 167 | |
| 156 | 168 | ```bash | ... | ... |
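Since both `/embed/clip_text` and `/embed/image` return L2-normalized vectors in one space, text-to-image scoring reduces to a dot product. A minimal sketch with hypothetical 3-d vectors standing in for real 1024-d responses (the tiny vectors are illustrative assumptions, not service output):

```python
import numpy as np

# Hypothetical stand-ins for POST /embed/clip_text (query phrase)
# and POST /embed/image (catalog images); both towers L2-normalize,
# so cosine similarity is just a dot product.
text_vec = np.array([0.6, 0.8, 0.0], dtype=np.float32)
image_vecs = np.array([
    [0.6, 0.8, 0.0],  # image matching the phrase
    [0.0, 0.0, 1.0],  # unrelated image
], dtype=np.float32)

scores = image_vecs @ text_vec    # one similarity score per image
ranking = np.argsort(-scores)     # best match first
print(ranking.tolist())           # → [0, 1]
```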
docs/搜索API对接指南-07-微服务接口(Embedding-Reranker-Translation).md
| ... | ... | @@ -9,7 +9,7 @@ |
| 9 | 9 | | Service | Default port | Base URL | Notes | |
| 10 | 10 | |------|----------|----------|------| |
| 11 | 11 | | Embedding service (text) | 6005 | `http://localhost:6005` | Text vectorization for query/doc semantic retrieval | |
| 12 | -| Embedding service (image) | 6008 | `http://localhost:6008` | Image vectorization for image-to-image search | | |
| 12 | +| Embedding service (image / multimodal CN-CLIP) | 6008 | `http://localhost:6008` | Image vectors via `/embed/image`; same-space text vectors via `/embed/clip_text` (text-to-image search, etc.) | | |
| 13 | 13 | | Translation service | 6006 | `http://localhost:6006` | Multilingual translation (single entry point for cloud and local models) | |
| 14 | 14 | | Rerank service | 6007 | `http://localhost:6007` | Second-stage reordering of retrieval results | |
| 15 | 15 | |
| ... | ... | @@ -100,7 +100,35 @@ curl -X POST "http://localhost:6008/embed/image?normalize=true&priority=1" \ |
| 100 | 100 | |
| 101 | 101 | Real-time scenarios such as online image-to-image search may pass `priority=1`; offline index backfill should keep the default `priority=0`. |
| 102 | 102 | |
| 103 | -#### 7.1.3 `GET /health` — health check | |
| 103 | +#### 7.1.3 `POST /embed/clip_text` — CN-CLIP multimodal text vector (same space as images) | |
| 104 | + | |
| 105 | +Encodes a **natural-language phrase** into a vector that lives **in the same vector space** as the image vectors from `POST /embed/image` (Chinese-CLIP text tower / image tower); used for **text-to-image search**, KNN aligned with the ES `image_embedding` field, etc. | |
| 106 | + | |
| 107 | +This is **not the same model and not the same space** as `POST /embed/text` in **7.1.1** (TEI/BGE, semantic retrieval); do not mix them. | |
| 108 | + | |
| 109 | +**Request body** (JSON array of strings; do **not** pass `http://` / `https://` image URLs — for images use `/embed/image`): | |
| 110 | + | |
| 111 | +```json | |
| 112 | +["纯棉短袖T恤", "芭比娃娃连衣裙"] | |
| 113 | +``` | |
| 114 | + | |
| 115 | +**Response** (JSON array, aligned with the input by index): | |
| 116 | + | |
| 117 | +```json | |
| 118 | +[[0.01, -0.02, ...], [0.03, 0.01, ...], ...] | |
| 119 | +``` | |
| 120 | + | |
| 121 | +**curl example**: | |
| 122 | + | |
| 123 | +```bash | |
| 124 | +curl -X POST "http://localhost:6008/embed/clip_text?normalize=true&priority=1" \ | |
| 125 | + -H "Content-Type: application/json" \ | |
| 126 | + -d '["纯棉短袖", "street tee"]' | |
| 127 | +``` | |
| 128 | + | |
| 129 | +Note: shares the image-side rate limiting and `IMAGE_MAX_INFLIGHT` with `/embed/image`; the Redis cache key namespace is `clip_text`, kept separate from the TEI text cache. | |
| 130 | + | |
| 131 | +#### 7.1.4 `GET /health` — health check | |
| 104 | 132 | |
| 105 | 133 | ```bash |
| 106 | 134 | curl "http://localhost:6005/health" |
| ... | ... | @@ -110,18 +138,18 @@ curl "http://localhost:6008/health" |
| 110 | 138 | The response includes: |
| 111 | 139 | |
| 112 | 140 | - `service_kind`: `text` / `image` / `all` |
| 113 | -- `cache_enabled`: whether the text/image Redis caches are available | |
| 141 | +- `cache_enabled`: whether the text/image/clip_text Redis caches are available | |
| 114 | 142 | - `limits`: current inflight limit, active, rejected_total, etc. |
| 115 | 143 | - `stats`: request_total, cache_hits, cache_misses, avg_latency_ms, etc. |
| 116 | 144 | |
| 117 | -#### 7.1.4 `GET /ready` — readiness check | |
| 145 | +#### 7.1.5 `GET /ready` — readiness check | |
| 118 | 146 | |
| 119 | 147 | ```bash |
| 120 | 148 | curl "http://localhost:6005/ready" |
| 121 | 149 | curl "http://localhost:6008/ready" |
| 122 | 150 | ``` |
| 123 | 151 | |
| 124 | -#### 7.1.5 Cache and rate-limiting notes | |
| 152 | +#### 7.1.6 Cache and rate-limiting notes | |
| 125 | 153 | |
| 126 | 154 | - Both text and image requests check the Redis vector cache first. |
| 127 | 155 | - Values in Redis are still **BF16 bytes**, restored to `float32` before being returned. |
| ... | ... | @@ -130,8 +158,9 @@ curl "http://localhost:6008/ready" |
| 130 | 158 | - When the server sees `TEXT_MAX_INFLIGHT` / `IMAGE_MAX_INFLIGHT` exceeded, it rejects outright rather than queueing without bound. |
| 131 | 159 | - For `POST /embed/text`, `priority=0` is rejected outright under the inflight rule above; `priority>0` is never rejected at admission but still counts toward inflight, and is served ahead of `priority=0` requests when the server queues. |
| 132 | 160 | - For `POST /embed/image`, `priority=0` is bounded by `IMAGE_MAX_INFLIGHT`; `priority>0` is never rejected at admission but still counts toward inflight (no queue jumping). |
| 161 | +- `POST /embed/clip_text` shares the same backend and `IMAGE_MAX_INFLIGHT` with `/embed/image` (it counts toward image-side concurrency). | |
| 133 | 162 | |
| 134 | -#### 7.1.6 Unified TEI tuning advice (main service) | |
| 163 | +#### 7.1.7 Unified TEI tuning advice (main service) | |
| 135 | 164 | |
| 136 | 165 | A single main deployment is enough to cover both: |
| 137 | 166 | - Online query vectorization (low latency, typically `batch=1~4`) |
| ... | ... | @@ -147,7 +176,7 @@ curl "http://localhost:6008/ready" |
| 147 | 176 | Default ports: |
| 148 | 177 | - TEI: `http://127.0.0.1:8080` |
| 149 | 178 | - Text embedding service (`/embed/text`): `http://127.0.0.1:6005` |
| 150 | -- Image embedding service (`/embed/image`): `http://127.0.0.1:6008` | |
| 179 | +- Image embedding service (`/embed/image`, `/embed/clip_text`): `http://127.0.0.1:6008` | |
| 151 | 180 | |
| 152 | 181 | Current main TEI startup defaults (tuned for T4 / short-text workloads): |
| 153 | 182 | - `TEI_MAX_BATCH_TOKENS=4096` | ... | ... |
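The "KNN aligned with the ES `image_embedding` field" usage can be sketched as a query builder. Only the `image_embedding` field name comes from this guide; the `_source` fields, `k`, and `num_candidates` values here are illustrative assumptions:

```python
# Hedged sketch: wrap a /embed/clip_text vector into an Elasticsearch kNN
# body for text-to-image search. The "image_embedding" field name is from
# the docs above; everything else is a placeholder.
def build_text_to_image_knn(query_vector, k=10, num_candidates=100):
    return {
        "knn": {
            "field": "image_embedding",      # same space as /embed/image vectors
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
        },
        "_source": ["title", "image_url"],   # assumed result fields
    }

body = build_text_to_image_knn([0.01, -0.02, 0.03], k=5)
print(body["knn"]["field"], body["knn"]["k"])  # → image_embedding 5
```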
embeddings/README.md
| ... | ... | @@ -16,7 +16,7 @@ |
| 16 | 16 | - **clip-as-service client**: `clip_as_service_encoder.py` (image vectors, recommended) |
| 17 | 17 | - **Embedding service (FastAPI)**: `server.py` |
| 18 | 18 | - **Unified config**: `config.py` |
| 19 | -- **Interface contract**: `protocols.ImageEncoderProtocol` (image encoding is unified as `encode_image_urls(urls, batch_size, normalize_embeddings)`; both the local CN-CLIP and clip-as-service implement it) | |
| 19 | +- **Interface contract**: `protocols.ImageEncoderProtocol` (`encode_image_urls` + `encode_clip_texts`; both the local CN-CLIP and clip-as-service implement it) | |
| 20 | 20 | |
| 21 | 21 | Note: the historical cloud embedding experiment (DashScope) has been removed from the main repo. The current default deployment is two independent chains, the **text service on 6005** and the **image service on 6008**; `all` mode remains only as a single-process debugging entry point. |
| 22 | 22 | |
| ... | ... | @@ -36,8 +36,9 @@ |
| 36 | 36 | - Returns: `[[...], [...], ...]` |
| 37 | 37 | - Health endpoints: `GET /health`, `GET /ready` |
| 38 | 38 | - Image service (default `6008`) |
| 39 | - - `POST /embed/image` | |
| 40 | - - Request body: `["url_or_local_path1", ...]` | |
| 39 | + - `POST /embed/image`: image URL or local path | |
| 40 | + - `POST /embed/clip_text`: natural language (CN-CLIP text tower, same vector space as `/embed/image`; do not pass `http://` image URLs) | |
| 41 | + - Request body: `["...", ...]`, an array of strings | |
| 41 | 42 | - Optional query params: `normalize=true|false`, `priority=0|1` |
| 42 | 43 | - Returns: `[[...], [...], ...]` |
| 43 | 44 | - Health endpoints: `GET /health`, `GET /ready` |
| ... | ... | @@ -51,8 +52,9 @@ |
| 51 | 52 | - Client side: `text_encoder.py` / `image_encoder.py` |
| 52 | 53 | - Service side: `server.py` |
| 53 | 54 | - Current primary key formats: |
| 54 | - - Text: `embedding:embed:norm{0|1}:{text}` | |
| 55 | + - Text (TEI): `embedding:embed:norm{0|1}:{text}` | |
| 55 | 56 | - Image: `embedding:image:embed:norm{0|1}:{url_or_path}` |
| 57 | + - CN-CLIP text: `embedding:clip_text:clip_mm:norm{0|1}:{text}` | |
| 56 | 58 | - The current implementation no longer honors historical key rules; only this one format is kept, reducing code paths and cache ambiguity. |
| 57 | 59 | |
| 58 | 60 | ### Load isolation and rejection policy | ... | ... |
| 58 | 60 | ### 压力隔离与拒绝策略 | ... | ... |
embeddings/cache_keys.py
| 1 | 1 | """Shared cache key helpers for embedding inputs. |
| 2 | 2 | |
| 3 | 3 | Current canonical raw-key format: |
| 4 | -- text: ``embed:norm1:<text>`` / ``embed:norm0:<text>`` | |
| 5 | -- image: ``embed:norm1:<url>`` / ``embed:norm0:<url>`` | |
| 4 | +- text (TEI/BGE): ``embed:norm1:<text>`` / ``embed:norm0:<text>`` | |
| 5 | +- image (CLIP): ``embed:norm1:<url>`` / ``embed:norm0:<url>`` | |
| 6 | +- clip_text (CN-CLIP text, same space as images): ``clip_mm:norm1:<text>`` / ``clip_mm:norm0:<text>`` | |
| 6 | 7 | |
| 7 | 8 | `RedisEmbeddingCache` adds the configured key prefix and optional namespace on top. |
| 8 | 9 | """ |
| ... | ... | @@ -18,3 +19,9 @@ def build_text_cache_key(text: str, *, normalize: bool) -> str: |
| 18 | 19 | def build_image_cache_key(url: str, *, normalize: bool) -> str: |
| 19 | 20 | normalized_url = str(url or "").strip() |
| 20 | 21 | return f"embed:norm{1 if normalize else 0}:{normalized_url}" |
| 22 | + | |
| 23 | + | |
| 24 | +def build_clip_text_cache_key(text: str, *, normalize: bool) -> str: | |
| 25 | + """CN-CLIP / multimodal text (same vector space as /embed/image).""" | |
| 26 | + normalized_text = str(text or "").strip() | |
| 27 | + return f"clip_mm:norm{1 if normalize else 0}:{normalized_text}" | ... | ... |
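The three raw-key formats documented above can be restated as a self-contained sketch (mirroring `cache_keys.py` before `RedisEmbeddingCache` prepends its prefix and namespace):

```python
# Minimal re-statement of the documented raw-key formats. The separate
# "clip_mm" prefix keeps CN-CLIP text keys from ever colliding with TEI
# text keys that share the same raw text.
def build_text_cache_key(text, *, normalize):
    return f"embed:norm{1 if normalize else 0}:{str(text or '').strip()}"

def build_image_cache_key(url, *, normalize):
    return f"embed:norm{1 if normalize else 0}:{str(url or '').strip()}"

def build_clip_text_cache_key(text, *, normalize):
    return f"clip_mm:norm{1 if normalize else 0}:{str(text or '').strip()}"

print(build_clip_text_cache_key(" 纯棉T恤 ", normalize=True))
# → clip_mm:norm1:纯棉T恤
```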
embeddings/clip_as_service_encoder.py
| ... | ... | @@ -127,3 +127,41 @@ class ClipAsServiceImageEncoder: |
| 127 | 127 | if not results: |
| 128 | 128 | raise RuntimeError("clip-as-service returned empty result for single image URL") |
| 129 | 129 | return results[0] |
| 130 | + | |
| 131 | + def encode_clip_texts( | |
| 132 | + self, | |
| 133 | + texts: List[str], | |
| 134 | + batch_size: Optional[int] = None, | |
| 135 | + normalize_embeddings: bool = True, | |
| 136 | + ) -> List[np.ndarray]: | |
| 137 | + """ | |
| 138 | + CN-CLIP text tower: same vector space as the encode_image_urls output (image-text retrieval / image_embedding). | |
| 139 | + | |
| 140 | + Pass natural-language strings only; for the HTTP side see ``POST /embed/clip_text``. | |
| 141 | + """ | |
| 142 | + if not texts: | |
| 143 | + return [] | |
| 144 | + bs = batch_size if batch_size is not None else self._batch_size | |
| 145 | + arr = self._client.encode( | |
| 146 | + texts, | |
| 147 | + batch_size=bs, | |
| 148 | + show_progress=self._show_progress, | |
| 149 | + ) | |
| 150 | + if arr is None or not hasattr(arr, "shape"): | |
| 151 | + raise RuntimeError("clip-as-service encode (text) returned empty result") | |
| 152 | + if len(arr) != len(texts): | |
| 153 | + raise RuntimeError( | |
| 154 | + f"clip-as-service text encode length mismatch: expected {len(texts)}, got {len(arr)}" | |
| 155 | + ) | |
| 156 | + out: List[np.ndarray] = [] | |
| 157 | + for row in arr: | |
| 158 | + vec = np.asarray(row, dtype=np.float32) | |
| 159 | + if vec.ndim != 1 or vec.size == 0 or not np.isfinite(vec).all(): | |
| 160 | + raise RuntimeError("clip-as-service returned invalid text embedding vector") | |
| 161 | + if normalize_embeddings: | |
| 162 | + norm = float(np.linalg.norm(vec)) | |
| 163 | + if not np.isfinite(norm) or norm <= 0.0: | |
| 164 | + raise RuntimeError("clip-as-service returned zero/invalid norm vector") | |
| 165 | + vec = vec / norm | |
| 166 | + out.append(vec) | |
| 167 | + return out | ... | ... |
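The per-row validation and normalization loop above can be isolated as a small helper. This is a sketch of the same checks, not the encoder itself:

```python
import numpy as np

def validate_and_normalize(row):
    """Mirror of the per-row checks in encode_clip_texts (sketch):
    reject non-1-D, empty, or non-finite vectors, then L2-normalize."""
    vec = np.asarray(row, dtype=np.float32)
    if vec.ndim != 1 or vec.size == 0 or not np.isfinite(vec).all():
        raise RuntimeError("invalid embedding vector")
    norm = float(np.linalg.norm(vec))
    if not np.isfinite(norm) or norm <= 0.0:
        raise RuntimeError("zero/invalid norm vector")
    return vec / norm

v = validate_and_normalize([3.0, 4.0])
print(round(float(np.linalg.norm(v)), 6))  # → 1.0
```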
embeddings/clip_model.py
| ... | ... | @@ -131,3 +131,28 @@ class ClipImageModel(object): |
| 131 | 131 | else: |
| 132 | 132 | raise ValueError(f"Unsupported image input type: {type(img)!r}") |
| 133 | 133 | return results |
| 134 | + | |
| 135 | + def encode_clip_texts( | |
| 136 | + self, | |
| 137 | + texts: List[str], | |
| 138 | + batch_size: Optional[int] = None, | |
| 139 | + normalize_embeddings: bool = True, | |
| 140 | + ) -> List[np.ndarray]: | |
| 141 | + """ | |
| 142 | + CN-CLIP text-tower vectors, same space as encode_image; used by ``POST /embed/clip_text``. | |
| 143 | + """ | |
| 144 | + if not texts: | |
| 145 | + return [] | |
| 146 | + bs = batch_size or 8 | |
| 147 | + out: List[np.ndarray] = [] | |
| 148 | + for i in range(0, len(texts), bs): | |
| 149 | + batch = texts[i : i + bs] | |
| 150 | + text_data = clip.tokenize(batch).to(self.device) | |
| 151 | + with torch.no_grad(): | |
| 152 | + feats = self.model.encode_text(text_data) | |
| 153 | + if normalize_embeddings: | |
| 154 | + feats = feats / feats.norm(dim=-1, keepdim=True) | |
| 155 | + arr = feats.cpu().numpy().astype("float32") | |
| 156 | + for row in arr: | |
| 157 | + out.append(np.asarray(row, dtype=np.float32)) | |
| 158 | + return out | ... | ... |
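The batching pattern used here (`range(0, len(texts), bs)` slices, last batch possibly short) is worth stating on its own; a tiny standalone sketch:

```python
def iter_batches(items, batch_size):
    # Same chunking as encode_clip_texts: fixed-size slices,
    # with a shorter final batch if len(items) is not divisible.
    for i in range(0, len(items), batch_size):
        yield items[i : i + batch_size]

print(list(iter_batches(["a", "b", "c", "d", "e"], 2)))
# → [['a', 'b'], ['c', 'd'], ['e']]
```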
embeddings/image_encoder.py
| ... | ... | @@ -10,8 +10,8 @@ from PIL import Image |
| 10 | 10 | logger = logging.getLogger(__name__) |
| 11 | 11 | |
| 12 | 12 | from config.loader import get_app_config |
| 13 | -from config.services_config import get_embedding_image_base_url | |
| 14 | -from embeddings.cache_keys import build_image_cache_key | |
| 13 | +from config.services_config import get_embedding_image_backend_config, get_embedding_image_base_url | |
| 14 | +from embeddings.cache_keys import build_clip_text_cache_key, build_image_cache_key | |
| 15 | 15 | from embeddings.redis_embedding_cache import RedisEmbeddingCache |
| 16 | 16 | from request_log_context import build_downstream_request_headers, build_request_log_extra |
| 17 | 17 | |
| ... | ... | @@ -28,6 +28,7 @@ class CLIPImageEncoder: |
| 28 | 28 | redis_config = get_app_config().infrastructure.redis |
| 29 | 29 | self.service_url = str(resolved_url).rstrip("/") |
| 30 | 30 | self.endpoint = f"{self.service_url}/embed/image" |
| 31 | + self.clip_text_endpoint = f"{self.service_url}/embed/clip_text" | |
| 31 | 32 | # Reuse embedding cache prefix, but separate namespace for images to avoid collisions. |
| 32 | 33 | self.cache_prefix = str(redis_config.embedding_cache_prefix).strip() or "embedding" |
| 33 | 34 | logger.info("Creating CLIPImageEncoder instance with service URL: %s", self.service_url) |
| ... | ... | @@ -35,6 +36,10 @@ class CLIPImageEncoder: |
| 35 | 36 | key_prefix=self.cache_prefix, |
| 36 | 37 | namespace="image", |
| 37 | 38 | ) |
| 39 | + self._clip_text_cache = RedisEmbeddingCache( | |
| 40 | + key_prefix=self.cache_prefix, | |
| 41 | + namespace="clip_text", | |
| 42 | + ) | |
| 38 | 43 | |
| 39 | 44 | def _call_service( |
| 40 | 45 | self, |
| ... | ... | @@ -84,6 +89,108 @@ class CLIPImageEncoder: |
| 84 | 89 | ) |
| 85 | 90 | raise |
| 86 | 91 | |
| 92 | + def _clip_text_via_grpc( | |
| 93 | + self, | |
| 94 | + request_data: List[str], | |
| 95 | + normalize_embeddings: bool, | |
| 96 | + ) -> List[Any]: | |
| 97 | + """旧版 6008 无 ``/embed/clip_text`` 时走 gRPC(需 ``image_backend: clip_as_service``)。""" | |
| 98 | + backend, cfg = get_embedding_image_backend_config() | |
| 99 | + if backend != "clip_as_service": | |
| 100 | + raise RuntimeError( | |
| 101 | + "POST /embed/clip_text 返回 404:请重启图片向量服务(6008)以加载新路由;" | |
| 102 | + "或配置 services.embedding.image_backend=clip_as_service 并启动 grpc cnclip。" | |
| 103 | + ) | |
| 104 | + from embeddings.clip_as_service_encoder import ClipAsServiceImageEncoder | |
| 105 | + from embeddings.config import CONFIG | |
| 106 | + | |
| 107 | + enc = ClipAsServiceImageEncoder( | |
| 108 | + server=str(cfg.get("server") or CONFIG.CLIP_AS_SERVICE_SERVER), | |
| 109 | + batch_size=int(cfg.get("batch_size") or CONFIG.IMAGE_BATCH_SIZE), | |
| 110 | + ) | |
| 111 | + arrs = enc.encode_clip_texts( | |
| 112 | + request_data, | |
| 113 | + batch_size=len(request_data), | |
| 114 | + normalize_embeddings=normalize_embeddings, | |
| 115 | + ) | |
| 116 | + return [v.tolist() for v in arrs] | |
| 117 | + | |
| 118 | + def _call_clip_text_service( | |
| 119 | + self, | |
| 120 | + request_data: List[str], | |
| 121 | + normalize_embeddings: bool = True, | |
| 122 | + priority: int = 1, | |
| 123 | + request_id: Optional[str] = None, | |
| 124 | + user_id: Optional[str] = None, | |
| 125 | + ) -> List[Any]: | |
| 126 | + response = None | |
| 127 | + try: | |
| 128 | + response = requests.post( | |
| 129 | + self.clip_text_endpoint, | |
| 130 | + params={ | |
| 131 | + "normalize": "true" if normalize_embeddings else "false", | |
| 132 | + "priority": max(0, int(priority)), | |
| 133 | + }, | |
| 134 | + json=request_data, | |
| 135 | + headers=build_downstream_request_headers(request_id=request_id, user_id=user_id), | |
| 136 | + timeout=60, | |
| 137 | + ) | |
| 138 | + if response.status_code == 404: | |
| 139 | + logger.warning( | |
| 140 | + "POST %s returned 404; using clip-as-service gRPC fallback (restart 6008 after deploy to use HTTP)", | |
| 141 | + self.clip_text_endpoint, | |
| 142 | + ) | |
| 143 | + return self._clip_text_via_grpc(request_data, normalize_embeddings) | |
| 144 | + response.raise_for_status() | |
| 145 | + return response.json() | |
| 146 | + except requests.exceptions.RequestException as e: | |
| 147 | + body_preview = "" | |
| 148 | + if response is not None: | |
| 149 | + try: | |
| 150 | + body_preview = (response.text or "")[:300] | |
| 151 | + except Exception: | |
| 152 | + body_preview = "" | |
| 153 | + logger.error( | |
| 154 | + "CLIPImageEncoder clip_text request failed | status=%s body=%s error=%s", | |
| 155 | + getattr(response, "status_code", "n/a"), | |
| 156 | + body_preview, | |
| 157 | + e, | |
| 158 | + exc_info=True, | |
| 159 | + extra=build_request_log_extra(request_id=request_id, user_id=user_id), | |
| 160 | + ) | |
| 161 | + raise | |
| 162 | + | |
| 163 | + def encode_clip_text( | |
| 164 | + self, | |
| 165 | + text: str, | |
| 166 | + normalize_embeddings: bool = True, | |
| 167 | + priority: int = 1, | |
| 168 | + request_id: Optional[str] = None, | |
| 169 | + user_id: Optional[str] = None, | |
| 170 | + ) -> np.ndarray: | |
| 171 | + """ | |
| 172 | + CN-CLIP text tower (same vector space as ``/embed/image``), backed by the server-side ``POST /embed/clip_text``. | |
| 173 | + """ | |
| 174 | + cache_key = build_clip_text_cache_key(text, normalize=normalize_embeddings) | |
| 175 | + cached = self._clip_text_cache.get(cache_key) | |
| 176 | + if cached is not None: | |
| 177 | + return cached | |
| 178 | + | |
| 179 | + response_data = self._call_clip_text_service( | |
| 180 | + [text.strip()], | |
| 181 | + normalize_embeddings=normalize_embeddings, | |
| 182 | + priority=priority, | |
| 183 | + request_id=request_id, | |
| 184 | + user_id=user_id, | |
| 185 | + ) | |
| 186 | + if not response_data or len(response_data) != 1 or response_data[0] is None: | |
| 187 | + raise RuntimeError(f"No CLIP text embedding returned for: {text[:80]!r}") | |
| 188 | + vec = np.array(response_data[0], dtype=np.float32) | |
| 189 | + if vec.ndim != 1 or vec.size == 0 or not np.isfinite(vec).all(): | |
| 190 | + raise RuntimeError("Invalid CLIP text embedding returned") | |
| 191 | + self._clip_text_cache.set(cache_key, vec) | |
| 192 | + return vec | |
| 193 | + | |
| 87 | 194 | def encode_image(self, image: Image.Image) -> np.ndarray: |
| 88 | 195 | """ |
| 89 | 196 | Encode image to embedding vector using network service. | ... | ... |
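The 404-fallback shape of `_call_clip_text_service` (prefer the new HTTP route, transparently fall back to gRPC when an older 6008 answers 404) can be sketched with injected callables, so the pattern is testable without a live service. The callable names here are hypothetical:

```python
# Hedged sketch of the HTTP-then-gRPC fallback. `http_call` returns
# (status_code, payload); `grpc_fallback` is the legacy path.
def call_with_fallback(http_call, grpc_fallback, texts):
    status, payload = http_call(texts)
    if status == 404:
        # Old server without /embed/clip_text: use the gRPC path instead.
        return grpc_fallback(texts)
    if status != 200:
        raise RuntimeError(f"clip_text HTTP failed: {status}")
    return payload

# Simulate an old deployment that 404s on the new route.
old_server = lambda texts: (404, None)
grpc_path = lambda texts: [[0.1, 0.2] for _ in texts]
print(call_with_fallback(old_server, grpc_path, ["纯棉T恤"]))
# → [[0.1, 0.2]]
```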
embeddings/protocols.py
| ... | ... | @@ -27,3 +27,17 @@ class ImageEncoderProtocol(Protocol): |
| 27 | 27 | of returning partial None placeholders. |
| 28 | 28 | """ |
| 29 | 29 | ... |
| 30 | + | |
| 31 | + def encode_clip_texts( | |
| 32 | + self, | |
| 33 | + texts: List[str], | |
| 34 | + batch_size: Optional[int] = None, | |
| 35 | + normalize_embeddings: bool = True, | |
| 36 | + ) -> List[np.ndarray]: | |
| 37 | + """ | |
| 38 | + Encode natural-language strings with the CLIP/CN-CLIP text tower (same space as images). | |
| 39 | + | |
| 40 | + Returns: | |
| 41 | + List of vectors, same length as texts. | |
| 42 | + """ | |
| 43 | + ... | ... | ... |
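`ImageEncoderProtocol` relies on structural typing: any class providing both methods satisfies the contract without inheriting from it. A standalone sketch (this mirror Protocol and dummy encoder are hypothetical, not the repo's classes):

```python
from typing import List, Optional, Protocol, runtime_checkable

@runtime_checkable
class ImageEncoderLike(Protocol):
    """Hypothetical mirror of ImageEncoderProtocol for illustration."""
    def encode_image_urls(self, urls: List[str], batch_size: Optional[int] = None,
                          normalize_embeddings: bool = True) -> list: ...
    def encode_clip_texts(self, texts: List[str], batch_size: Optional[int] = None,
                          normalize_embeddings: bool = True) -> list: ...

class DummyEncoder:
    # No inheritance needed: matching method names are enough.
    def encode_image_urls(self, urls, batch_size=None, normalize_embeddings=True):
        return [[0.0] for _ in urls]
    def encode_clip_texts(self, texts, batch_size=None, normalize_embeddings=True):
        return [[0.0] for _ in texts]

# runtime_checkable only verifies method presence, not signatures.
print(isinstance(DummyEncoder(), ImageEncoderLike))  # → True
```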
embeddings/server.py
| ... | ... | @@ -2,8 +2,9 @@ |
| 2 | 2 | Embedding service (FastAPI). |
| 3 | 3 | |
| 4 | 4 | API (simple list-in, list-out; aligned by index): |
| 5 | -- POST /embed/text body: ["text1", "text2", ...] -> [[...], ...] | |
| 6 | -- POST /embed/image body: ["url_or_path1", ...] -> [[...], ...] | |
| 5 | +- POST /embed/text body: ["text1", "text2", ...] -> [[...], ...]  (TEI/BGE, semantic retrieval / title_embedding) | |
| 6 | +- POST /embed/image body: ["url_or_path1", ...] -> [[...], ...]  (CN-CLIP image vectors) | |
| 7 | +- POST /embed/clip_text body: ["phrase1", "phrase2", ...] -> [[...], ...]  (CN-CLIP text tower, same space as /embed/image) | |
| 7 | 8 | """ |
| 8 | 9 | |
| 9 | 10 | import logging |
| ... | ... | @@ -22,7 +23,7 @@ from fastapi.concurrency import run_in_threadpool |
| 22 | 23 | |
| 23 | 24 | from config.env_config import REDIS_CONFIG |
| 24 | 25 | from config.services_config import get_embedding_backend_config |
| 25 | -from embeddings.cache_keys import build_image_cache_key, build_text_cache_key | |
| 26 | +from embeddings.cache_keys import build_clip_text_cache_key, build_image_cache_key, build_text_cache_key | |
| 26 | 27 | from embeddings.config import CONFIG |
| 27 | 28 | from embeddings.protocols import ImageEncoderProtocol |
| 28 | 29 | from embeddings.redis_embedding_cache import RedisEmbeddingCache |
| ... | ... | @@ -373,6 +374,7 @@ _text_stats = _EndpointStats(name="text") |
| 373 | 374 | _image_stats = _EndpointStats(name="image") |
| 374 | 375 | _text_cache = RedisEmbeddingCache(key_prefix=_CACHE_PREFIX, namespace="") |
| 375 | 376 | _image_cache = RedisEmbeddingCache(key_prefix=_CACHE_PREFIX, namespace="image") |
| 377 | +_clip_text_cache = RedisEmbeddingCache(key_prefix=_CACHE_PREFIX, namespace="clip_text") | |
| 376 | 378 | |
| 377 | 379 | |
| 378 | 380 | @dataclass |
| ... | ... | @@ -752,13 +754,20 @@ def _try_full_text_cache_hit( |
| 752 | 754 | ) |
| 753 | 755 | |
| 754 | 756 | |
| 755 | -def _try_full_image_cache_hit( | |
| 756 | - urls: List[str], | |
| 757 | +def _try_full_image_lane_cache_hit( | |
| 758 | + items: List[str], | |
| 757 | 759 | effective_normalize: bool, |
| 760 | + *, | |
| 761 | + lane: str, | |
| 758 | 762 | ) -> Optional[_EmbedResult]: |
| 759 | 763 | out: List[Optional[List[float]]] = [] |
| 760 | - for url in urls: | |
| 761 | - cached = _image_cache.get(build_image_cache_key(url, normalize=effective_normalize)) | |
| 764 | + for item in items: | |
| 765 | + if lane == "image": | |
| 766 | + ck = build_image_cache_key(item, normalize=effective_normalize) | |
| 767 | + cached = _image_cache.get(ck) | |
| 768 | + else: | |
| 769 | + ck = build_clip_text_cache_key(item, normalize=effective_normalize) | |
| 770 | + cached = _clip_text_cache.get(ck) | |
| 762 | 771 | if cached is None: |
| 763 | 772 | return None |
| 764 | 773 | vec = _as_list(cached, normalize=False) |
| ... | ... | @@ -774,6 +783,108 @@ def _try_full_image_cache_hit( |
| 774 | 783 | ) |
| 775 | 784 | |
| 776 | 785 | |
| 786 | +def _embed_image_lane_impl( | |
| 787 | + items: List[str], | |
| 788 | + effective_normalize: bool, | |
| 789 | + request_id: str, | |
| 790 | + user_id: str, | |
| 791 | + *, | |
| 792 | + lane: str, | |
| 793 | +) -> _EmbedResult: | |
| 794 | + if _image_model is None: | |
| 795 | + raise RuntimeError("Image model not loaded") | |
| 796 | + | |
| 797 | + out: List[Optional[List[float]]] = [None] * len(items) | |
| 798 | + missing_indices: List[int] = [] | |
| 799 | + missing_items: List[str] = [] | |
| 800 | + missing_keys: List[str] = [] | |
| 801 | + cache_hits = 0 | |
| 802 | + for idx, item in enumerate(items): | |
| 803 | + if lane == "image": | |
| 804 | + ck = build_image_cache_key(item, normalize=effective_normalize) | |
| 805 | + cached = _image_cache.get(ck) | |
| 806 | + else: | |
| 807 | + ck = build_clip_text_cache_key(item, normalize=effective_normalize) | |
| 808 | + cached = _clip_text_cache.get(ck) | |
| 809 | + if cached is not None: | |
| 810 | + vec = _as_list(cached, normalize=False) | |
| 811 | + if vec is not None: | |
| 812 | + out[idx] = vec | |
| 813 | + cache_hits += 1 | |
| 814 | + continue | |
| 815 | + missing_indices.append(idx) | |
| 816 | + missing_items.append(item) | |
| 817 | + missing_keys.append(ck) | |
| 818 | + | |
| 819 | + if not missing_items: | |
| 820 | + logger.info( | |
| 821 | + "%s lane cache-only | inputs=%d normalize=%s dim=%d cache_hits=%d", | |
| 822 | + lane, | |
| 823 | + len(items), | |
| 824 | + effective_normalize, | |
| 825 | + len(out[0]) if out and out[0] is not None else 0, | |
| 826 | + cache_hits, | |
| 827 | + extra=build_request_log_extra(request_id=request_id, user_id=user_id), | |
| 828 | + ) | |
| 829 | + return _EmbedResult( | |
| 830 | + vectors=out, | |
| 831 | + cache_hits=cache_hits, | |
| 832 | + cache_misses=0, | |
| 833 | + backend_elapsed_ms=0.0, | |
| 834 | + mode="cache-only", | |
| 835 | + ) | |
| 836 | + | |
| 837 | + backend_t0 = time.perf_counter() | |
| 838 | + with _image_encode_lock: | |
| 839 | + if lane == "image": | |
| 840 | + vectors = _image_model.encode_image_urls( | |
| 841 | + missing_items, | |
| 842 | + batch_size=CONFIG.IMAGE_BATCH_SIZE, | |
| 843 | + normalize_embeddings=effective_normalize, | |
| 844 | + ) | |
| 845 | + else: | |
| 846 | + vectors = _image_model.encode_clip_texts( | |
| 847 | + missing_items, | |
| 848 | + batch_size=CONFIG.IMAGE_BATCH_SIZE, | |
| 849 | + normalize_embeddings=effective_normalize, | |
| 850 | + ) | |
| 851 | + if vectors is None or len(vectors) != len(missing_items): | |
| 852 | + raise RuntimeError( | |
| 853 | + f"{lane} lane length mismatch: expected {len(missing_items)}, " | |
| 854 | + f"got {0 if vectors is None else len(vectors)}" | |
| 855 | + ) | |
| 856 | + | |
| 857 | + for pos, ck, vec in zip(missing_indices, missing_keys, vectors): | |
| 858 | + out_vec = _as_list(vec, normalize=effective_normalize) | |
| 859 | + if out_vec is None: | |
| 860 | + raise RuntimeError(f"{lane} lane empty embedding at position {pos}") | |
| 861 | + out[pos] = out_vec | |
| 862 | + if lane == "image": | |
| 863 | + _image_cache.set(ck, np.asarray(out_vec, dtype=np.float32)) | |
| 864 | + else: | |
| 865 | + _clip_text_cache.set(ck, np.asarray(out_vec, dtype=np.float32)) | |
| 866 | + | |
| 867 | + backend_elapsed_ms = (time.perf_counter() - backend_t0) * 1000.0 | |
| 868 | + logger.info( | |
| 869 | + "%s lane backend-batch | inputs=%d normalize=%s dim=%d cache_hits=%d cache_misses=%d backend_elapsed_ms=%.2f", | |
| 870 | + lane, | |
| 871 | + len(items), | |
| 872 | + effective_normalize, | |
| 873 | + len(out[0]) if out and out[0] is not None else 0, | |
| 874 | + cache_hits, | |
| 875 | + len(missing_items), | |
| 876 | + backend_elapsed_ms, | |
| 877 | + extra=build_request_log_extra(request_id=request_id, user_id=user_id), | |
| 878 | + ) | |
| 879 | + return _EmbedResult( | |
| 880 | + vectors=out, | |
| 881 | + cache_hits=cache_hits, | |
| 882 | + cache_misses=len(missing_items), | |
| 883 | + backend_elapsed_ms=backend_elapsed_ms, | |
| 884 | + mode="backend-batch", | |
| 885 | + ) | |
| 886 | + | |
| 887 | + | |
| 777 | 888 | @app.get("/health") |
| 778 | 889 | def health() -> Dict[str, Any]: |
| 779 | 890 | """Health check endpoint. Returns status and current throttling stats.""" |
| ... | ... | @@ -789,6 +900,7 @@ def health() -> Dict[str, Any]: |
| 789 | 900 | "cache_enabled": { |
| 790 | 901 | "text": _text_cache.redis_client is not None, |
| 791 | 902 | "image": _image_cache.redis_client is not None, |
| 903 | + "clip_text": _clip_text_cache.redis_client is not None, | |
| 792 | 904 | }, |
| 793 | 905 | "limits": { |
| 794 | 906 | "text": _text_request_limiter.snapshot(), |
| ... | ... | @@ -1148,102 +1260,29 @@ async def embed_text( |
| 1148 | 1260 | reset_request_log_context(log_tokens) |
| 1149 | 1261 | |
| 1150 | 1262 | |
| 1151 | -def _embed_image_impl( | |
| 1152 | - urls: List[str], | |
| 1153 | - effective_normalize: bool, | |
| 1154 | - request_id: str, | |
| 1155 | - user_id: str, | |
| 1156 | -) -> _EmbedResult: | |
| 1157 | - if _image_model is None: | |
| 1158 | - raise RuntimeError("Image model not loaded") | |
| 1159 | - | |
| 1160 | - out: List[Optional[List[float]]] = [None] * len(urls) | |
| 1161 | - missing_indices: List[int] = [] | |
| 1162 | - missing_urls: List[str] = [] | |
| 1163 | - missing_cache_keys: List[str] = [] | |
| 1164 | - cache_hits = 0 | |
| 1165 | - for idx, url in enumerate(urls): | |
| 1166 | - cache_key = build_image_cache_key(url, normalize=effective_normalize) | |
| 1167 | - cached = _image_cache.get(cache_key) | |
| 1168 | - if cached is not None: | |
| 1169 | - vec = _as_list(cached, normalize=False) | |
| 1170 | - if vec is not None: | |
| 1171 | - out[idx] = vec | |
| 1172 | - cache_hits += 1 | |
| 1173 | - continue | |
| 1174 | - missing_indices.append(idx) | |
| 1175 | - missing_urls.append(url) | |
| 1176 | - missing_cache_keys.append(cache_key) | |
| 1177 | - | |
| 1178 | - if not missing_urls: | |
| 1179 | - logger.info( | |
| 1180 | - "image backend done | mode=cache-only inputs=%d normalize=%s dim=%d cache_hits=%d cache_misses=0 backend_elapsed_ms=0.00", | |
| 1181 | - len(urls), | |
| 1182 | - effective_normalize, | |
| 1183 | - len(out[0]) if out and out[0] is not None else 0, | |
| 1184 | - cache_hits, | |
| 1185 | - extra=build_request_log_extra(request_id, user_id), | |
| 1186 | - ) | |
| 1187 | - return _EmbedResult( | |
| 1188 | - vectors=out, | |
| 1189 | - cache_hits=cache_hits, | |
| 1190 | - cache_misses=0, | |
| 1191 | - backend_elapsed_ms=0.0, | |
| 1192 | - mode="cache-only", | |
| 1193 | - ) | |
| 1194 | - | |
| 1195 | - backend_t0 = time.perf_counter() | |
| 1196 | - with _image_encode_lock: | |
| 1197 | - vectors = _image_model.encode_image_urls( | |
| 1198 | - missing_urls, | |
| 1199 | - batch_size=CONFIG.IMAGE_BATCH_SIZE, | |
| 1200 | - normalize_embeddings=effective_normalize, | |
| 1201 | - ) | |
| 1202 | - if vectors is None or len(vectors) != len(missing_urls): | |
| 1203 | - raise RuntimeError( | |
| 1204 | - f"Image model response length mismatch: expected {len(missing_urls)}, " | |
| 1205 | - f"got {0 if vectors is None else len(vectors)}" | |
| 1206 | - ) | |
| 1263 | +def _parse_string_inputs(raw: List[Any], *, kind: str, empty_detail: str) -> List[str]: | |
| 1264 | + out: List[str] = [] | |
| 1265 | + for i, x in enumerate(raw): | |
| 1266 | + if not isinstance(x, str): | |
| 1267 | + raise HTTPException(status_code=400, detail=f"Invalid {kind} at index {i}: must be string") | |
| 1268 | + s = x.strip() | |
| 1269 | + if not s: | |
| 1270 | + raise HTTPException(status_code=400, detail=f"Invalid {kind} at index {i}: {empty_detail}") | |
| 1271 | + out.append(s) | |
| 1272 | + return out | |
| 1207 | 1273 | |
| 1208 | - for pos, cache_key, vec in zip(missing_indices, missing_cache_keys, vectors): | |
| 1209 | - out_vec = _as_list(vec, normalize=effective_normalize) | |
| 1210 | - if out_vec is None: | |
| 1211 | - raise RuntimeError(f"Image model returned empty embedding for position {pos}") | |
| 1212 | - out[pos] = out_vec | |
| 1213 | - _image_cache.set(cache_key, np.asarray(out_vec, dtype=np.float32)) | |
| 1214 | - | |
| 1215 | - backend_elapsed_ms = (time.perf_counter() - backend_t0) * 1000.0 | |
| 1216 | - | |
| 1217 | - logger.info( | |
| 1218 | - "image backend done | mode=backend-batch inputs=%d normalize=%s dim=%d cache_hits=%d cache_misses=%d backend_elapsed_ms=%.2f", | |
| 1219 | - len(urls), | |
| 1220 | - effective_normalize, | |
| 1221 | - len(out[0]) if out and out[0] is not None else 0, | |
| 1222 | - cache_hits, | |
| 1223 | - len(missing_urls), | |
| 1224 | - backend_elapsed_ms, | |
| 1225 | - extra=build_request_log_extra(request_id, user_id), | |
| 1226 | - ) | |
| 1227 | - return _EmbedResult( | |
| 1228 | - vectors=out, | |
| 1229 | - cache_hits=cache_hits, | |
| 1230 | - cache_misses=len(missing_urls), | |
| 1231 | - backend_elapsed_ms=backend_elapsed_ms, | |
| 1232 | - mode="backend-batch", | |
| 1233 | - ) | |
| 1234 | 1274 | |
| 1235 | - | |
| 1236 | -@app.post("/embed/image") | |
| 1237 | -async def embed_image( | |
| 1238 | - images: List[str], | |
| 1275 | +async def _run_image_lane_embed( | |
| 1276 | + *, | |
| 1277 | + route: str, | |
| 1278 | + lane: str, | |
| 1279 | + items: List[str], | |
| 1239 | 1280 | http_request: Request, |
| 1240 | 1281 | response: Response, |
| 1241 | - normalize: Optional[bool] = None, | |
| 1242 | - priority: int = 0, | |
| 1282 | + normalize: Optional[bool], | |
| 1283 | + priority: int, | |
| 1284 | + preview_chars: int, | |
| 1243 | 1285 | ) -> List[Optional[List[float]]]: |
| 1244 | - if _image_model is None: | |
| 1245 | - raise HTTPException(status_code=503, detail="Image embedding model not loaded in this service") | |
| 1246 | - | |
| 1247 | 1286 | request_id = _resolve_request_id(http_request) |
| 1248 | 1287 | user_id = _resolve_user_id(http_request) |
| 1249 | 1288 | _, _, log_tokens = bind_request_log_context(request_id, user_id) |
| ... | ... | @@ -1255,24 +1294,16 @@ async def embed_image( |
| 1255 | 1294 | cache_hits = 0 |
| 1256 | 1295 | cache_misses = 0 |
| 1257 | 1296 | limiter_acquired = False |
| 1297 | + items_in: List[str] = list(items) | |
| 1258 | 1298 | |
| 1259 | 1299 | try: |
| 1260 | 1300 | if priority < 0: |
| 1261 | 1301 | raise HTTPException(status_code=400, detail="priority must be >= 0") |
| 1262 | 1302 | effective_priority = _effective_priority(priority) |
| 1263 | - | |
| 1264 | 1303 | effective_normalize = bool(CONFIG.IMAGE_NORMALIZE_EMBEDDINGS) if normalize is None else bool(normalize) |
| 1265 | - urls: List[str] = [] | |
| 1266 | - for i, url_or_path in enumerate(images): | |
| 1267 | - if not isinstance(url_or_path, str): | |
| 1268 | - raise HTTPException(status_code=400, detail=f"Invalid image at index {i}: must be string URL/path") | |
| 1269 | - s = url_or_path.strip() | |
| 1270 | - if not s: | |
| 1271 | - raise HTTPException(status_code=400, detail=f"Invalid image at index {i}: empty URL/path") | |
| 1272 | - urls.append(s) | |
| 1273 | 1304 | |
| 1274 | 1305 | cache_check_started = time.perf_counter() |
| 1275 | - cache_only = _try_full_image_cache_hit(urls, effective_normalize) | |
| 1306 | + cache_only = _try_full_image_lane_cache_hit(items, effective_normalize, lane=lane) | |
| 1276 | 1307 | if cache_only is not None: |
| 1277 | 1308 | latency_ms = (time.perf_counter() - cache_check_started) * 1000.0 |
| 1278 | 1309 | _image_stats.record_completed( |
| ... | ... | @@ -1283,9 +1314,10 @@ async def embed_image( |
| 1283 | 1314 | cache_misses=0, |
| 1284 | 1315 | ) |
| 1285 | 1316 | logger.info( |
| 1286 | - "embed_image response | mode=cache-only priority=%s inputs=%d normalize=%s dim=%d cache_hits=%d cache_misses=0 first_vector=%s latency_ms=%.2f", | |
| 1317 | + "%s response | mode=cache-only priority=%s inputs=%d normalize=%s dim=%d cache_hits=%d first_vector=%s latency_ms=%.2f", | |
| 1318 | + route, | |
| 1287 | 1319 | _priority_label(effective_priority), |
| 1288 | - len(urls), | |
| 1320 | + len(items), | |
| 1289 | 1321 | effective_normalize, |
| 1290 | 1322 | len(cache_only.vectors[0]) if cache_only.vectors and cache_only.vectors[0] is not None else 0, |
| 1291 | 1323 | cache_only.cache_hits, |
| ... | ... | @@ -1299,14 +1331,15 @@ async def embed_image( |
| 1299 | 1331 | if not accepted: |
| 1300 | 1332 | _image_stats.record_rejected() |
| 1301 | 1333 | logger.warning( |
| 1302 | - "embed_image rejected | client=%s priority=%s inputs=%d normalize=%s active=%d limit=%d preview=%s", | |
| 1334 | + "%s rejected | client=%s priority=%s inputs=%d normalize=%s active=%d limit=%d preview=%s", | |
| 1335 | + route, | |
| 1303 | 1336 | _request_client(http_request), |
| 1304 | 1337 | _priority_label(effective_priority), |
| 1305 | - len(urls), | |
| 1338 | + len(items), | |
| 1306 | 1339 | effective_normalize, |
| 1307 | 1340 | active, |
| 1308 | 1341 | _IMAGE_MAX_INFLIGHT, |
| 1309 | - _preview_inputs(urls, _LOG_PREVIEW_COUNT, _LOG_IMAGE_PREVIEW_CHARS), | |
| 1342 | + _preview_inputs(items, _LOG_PREVIEW_COUNT, preview_chars), | |
| 1310 | 1343 | extra=build_request_log_extra(request_id, user_id), |
| 1311 | 1344 | ) |
| 1312 | 1345 | raise HTTPException( |
| ... | ... | @@ -1318,24 +1351,33 @@ async def embed_image( |
| 1318 | 1351 | ) |
| 1319 | 1352 | limiter_acquired = True |
| 1320 | 1353 | logger.info( |
| 1321 | - "embed_image request | client=%s priority=%s inputs=%d normalize=%s active=%d limit=%d preview=%s", | |
| 1354 | + "%s request | client=%s priority=%s inputs=%d normalize=%s active=%d limit=%d preview=%s", | |
| 1355 | + route, | |
| 1322 | 1356 | _request_client(http_request), |
| 1323 | 1357 | _priority_label(effective_priority), |
| 1324 | - len(urls), | |
| 1358 | + len(items), | |
| 1325 | 1359 | effective_normalize, |
| 1326 | 1360 | active, |
| 1327 | 1361 | _IMAGE_MAX_INFLIGHT, |
| 1328 | - _preview_inputs(urls, _LOG_PREVIEW_COUNT, _LOG_IMAGE_PREVIEW_CHARS), | |
| 1362 | + _preview_inputs(items, _LOG_PREVIEW_COUNT, preview_chars), | |
| 1329 | 1363 | extra=build_request_log_extra(request_id, user_id), |
| 1330 | 1364 | ) |
| 1331 | 1365 | verbose_logger.info( |
| 1332 | - "embed_image detail | payload=%s normalize=%s priority=%s", | |
| 1333 | - urls, | |
| 1366 | + "%s detail | payload=%s normalize=%s priority=%s", | |
| 1367 | + route, | |
| 1368 | + items, | |
| 1334 | 1369 | effective_normalize, |
| 1335 | 1370 | _priority_label(effective_priority), |
| 1336 | 1371 | extra=build_request_log_extra(request_id, user_id), |
| 1337 | 1372 | ) |
| 1338 | - result = await run_in_threadpool(_embed_image_impl, urls, effective_normalize, request_id, user_id) | |
| 1373 | + result = await run_in_threadpool( | |
| 1374 | + _embed_image_lane_impl, | |
| 1375 | + items, | |
| 1376 | + effective_normalize, | |
| 1377 | + request_id, | |
| 1378 | + user_id, | |
| 1379 | + lane=lane, | |
| 1380 | + ) | |
| 1339 | 1381 | success = True |
| 1340 | 1382 | backend_elapsed_ms = result.backend_elapsed_ms |
| 1341 | 1383 | cache_hits = result.cache_hits |
| ... | ... | @@ -1349,10 +1391,11 @@ async def embed_image( |
| 1349 | 1391 | cache_misses=cache_misses, |
| 1350 | 1392 | ) |
| 1351 | 1393 | logger.info( |
| 1352 | - "embed_image response | mode=%s priority=%s inputs=%d normalize=%s dim=%d cache_hits=%d cache_misses=%d first_vector=%s latency_ms=%.2f", | |
| 1394 | + "%s response | mode=%s priority=%s inputs=%d normalize=%s dim=%d cache_hits=%d cache_misses=%d first_vector=%s latency_ms=%.2f", | |
| 1395 | + route, | |
| 1353 | 1396 | result.mode, |
| 1354 | 1397 | _priority_label(effective_priority), |
| 1355 | - len(urls), | |
| 1398 | + len(items), | |
| 1356 | 1399 | effective_normalize, |
| 1357 | 1400 | len(result.vectors[0]) if result.vectors and result.vectors[0] is not None else 0, |
| 1358 | 1401 | cache_hits, |
| ... | ... | @@ -1362,7 +1405,8 @@ async def embed_image( |
| 1362 | 1405 | extra=build_request_log_extra(request_id, user_id), |
| 1363 | 1406 | ) |
| 1364 | 1407 | verbose_logger.info( |
| 1365 | - "embed_image result detail | count=%d first_vector=%s latency_ms=%.2f", | |
| 1408 | + "%s result detail | count=%d first_vector=%s latency_ms=%.2f", | |
| 1409 | + route, | |
| 1366 | 1410 | len(result.vectors), |
| 1367 | 1411 | result.vectors[0][: _VECTOR_PREVIEW_DIMS] |
| 1368 | 1412 | if result.vectors and result.vectors[0] is not None |
| ... | ... | @@ -1383,24 +1427,73 @@ async def embed_image( |
| 1383 | 1427 | cache_misses=cache_misses, |
| 1384 | 1428 | ) |
| 1385 | 1429 | logger.error( |
| 1386 | - "embed_image failed | priority=%s inputs=%d normalize=%s latency_ms=%.2f error=%s", | |
| 1430 | + "%s failed | priority=%s inputs=%d normalize=%s latency_ms=%.2f error=%s", | |
| 1431 | + route, | |
| 1387 | 1432 | _priority_label(effective_priority), |
| 1388 | - len(urls), | |
| 1433 | + len(items_in), | |
| 1389 | 1434 | effective_normalize, |
| 1390 | 1435 | latency_ms, |
| 1391 | 1436 | e, |
| 1392 | 1437 | exc_info=True, |
| 1393 | 1438 | extra=build_request_log_extra(request_id, user_id), |
| 1394 | 1439 | ) |
| 1395 | - raise HTTPException(status_code=502, detail=f"Image embedding backend failure: {e}") from e | |
| 1440 | + raise HTTPException(status_code=502, detail=f"{route} backend failure: {e}") from e | |
| 1396 | 1441 | finally: |
| 1397 | 1442 | if limiter_acquired: |
| 1398 | 1443 | remaining = _image_request_limiter.release(success=success) |
| 1399 | 1444 | logger.info( |
| 1400 | - "embed_image finalize | success=%s priority=%s active_after=%d", | |
| 1445 | + "%s finalize | success=%s priority=%s active_after=%d", | |
| 1446 | + route, | |
| 1401 | 1447 | success, |
| 1402 | 1448 | _priority_label(effective_priority), |
| 1403 | 1449 | remaining, |
| 1404 | 1450 | extra=build_request_log_extra(request_id, user_id), |
| 1405 | 1451 | ) |
| 1406 | 1452 | reset_request_log_context(log_tokens) |
| 1453 | + | |
| 1454 | + | |
| 1455 | +@app.post("/embed/image") | |
| 1456 | +async def embed_image( | |
| 1457 | + images: List[str], | |
| 1458 | + http_request: Request, | |
| 1459 | + response: Response, | |
| 1460 | + normalize: Optional[bool] = None, | |
| 1461 | + priority: int = 0, | |
| 1462 | +) -> List[Optional[List[float]]]: | |
| 1463 | + if _image_model is None: | |
| 1464 | + raise HTTPException(status_code=503, detail="Image embedding model not loaded in this service") | |
| 1465 | + items = _parse_string_inputs(images, kind="image", empty_detail="empty URL/path") | |
| 1466 | + return await _run_image_lane_embed( | |
| 1467 | + route="embed_image", | |
| 1468 | + lane="image", | |
| 1469 | + items=items, | |
| 1470 | + http_request=http_request, | |
| 1471 | + response=response, | |
| 1472 | + normalize=normalize, | |
| 1473 | + priority=priority, | |
| 1474 | + preview_chars=_LOG_IMAGE_PREVIEW_CHARS, | |
| 1475 | + ) | |
| 1476 | + | |
| 1477 | + | |
| 1478 | +@app.post("/embed/clip_text") | |
| 1479 | +async def embed_clip_text( | |
| 1480 | + texts: List[str], | |
| 1481 | + http_request: Request, | |
| 1482 | + response: Response, | |
| 1483 | + normalize: Optional[bool] = None, | |
| 1484 | + priority: int = 0, | |
| 1485 | +) -> List[Optional[List[float]]]: | |
| 1486 | +    """CN-CLIP text tower; same vector space as ``POST /embed/image``."""
| 1487 | + if _image_model is None: | |
| 1488 | + raise HTTPException(status_code=503, detail="Image embedding model not loaded in this service") | |
| 1489 | + items = _parse_string_inputs(texts, kind="text", empty_detail="empty string") | |
| 1490 | + return await _run_image_lane_embed( | |
| 1491 | + route="embed_clip_text", | |
| 1492 | + lane="clip_text", | |
| 1493 | + items=items, | |
| 1494 | + http_request=http_request, | |
| 1495 | + response=response, | |
| 1496 | + normalize=normalize, | |
| 1497 | + priority=priority, | |
| 1498 | + preview_chars=_LOG_TEXT_PREVIEW_CHARS, | |
| 1499 | + ) | ... | ... |
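From a caller's perspective, the two routes added above differ only in input kind: `POST /embed/image` takes URLs/paths, `POST /embed/clip_text` takes natural-language phrases, and both return vectors in the same CN-CLIP space. A minimal client sketch (hedged: `embed` and `cosine` are hypothetical helper names; assumes the embedding service listens on `127.0.0.1:6008` as in the docs above):

```python
import json
import math
import urllib.request


def embed(route: str, items: list, base: str = "http://127.0.0.1:6008") -> list:
    """POST a JSON array of strings to /embed/image or /embed/clip_text."""
    req = urllib.request.Request(
        f"{base}/embed/{route}?normalize=true&priority=1",
        data=json.dumps(items).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def cosine(a, b) -> float:
    """Cosine similarity; with normalize=true this reduces to a dot product."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


if __name__ == "__main__":
    # Text query vs. image vector, both in the CN-CLIP space (requires a running service).
    [qv] = embed("clip_text", ["纯棉T恤"])
    [iv] = embed("image", ["https://example.com/a.jpg"])
    print(cosine(qv, iv))
```

Because both lanes share one limiter, a 429 from either route carries the same `active`/`limit` semantics.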
scripts/es_debug_search.py
| ... | ... | @@ -412,7 +412,7 @@ def _looks_like_image_ref(url: str) -> bool: |
| 412 | 412 | |
| 413 | 413 | def _encode_clip_query_vector(query: str) -> List[float]: |
| 414 | 414 | """ |
| 415 | -    Same space as the indexed image_embedding: images go through HTTP /embed/image; text goes through clip-as-service gRPC encode.
| 415 | +    Same space as the indexed image_embedding: images go through ``POST /embed/image``; text goes through ``POST /embed/clip_text`` (6008).
| 416 | 416 | """ |
| 417 | 417 | import numpy as np |
| 418 | 418 | |
| ... | ... | @@ -420,40 +420,14 @@ def _encode_clip_query_vector(query: str) -> List[float]: |
| 420 | 420 | if not q: |
| 421 | 421 | raise ValueError("empty query") |
| 422 | 422 | |
| 423 | - from config.services_config import get_embedding_image_backend_config | |
| 424 | - | |
| 425 | - backend, cfg = get_embedding_image_backend_config() | |
| 423 | + from embeddings.image_encoder import CLIPImageEncoder | |
| 426 | 424 | |
| 425 | + enc = CLIPImageEncoder() | |
| 427 | 426 | if _looks_like_image_ref(q): |
| 428 | - from embeddings.image_encoder import CLIPImageEncoder | |
| 429 | - | |
| 430 | - enc = CLIPImageEncoder() | |
| 431 | 427 | vec = enc.encode_image_from_url(q, normalize_embeddings=True, priority=1) |
| 432 | - return vec.astype(np.float32).flatten().tolist() | |
| 433 | - | |
| 434 | - if backend != "clip_as_service": | |
| 435 | - raise RuntimeError( | |
| 436 | -            "mode 5 text-only queries need CN-CLIP text vectors (same space as clip-as-service). "
| 437 | -            "Current image_backend is local_cnclip and this script does not load the local model. "
| 438 | -            "Set services.embedding.image_backend to clip_as_service in config and start the grpc "
| 439 | -            "server (default 51000), or pass an image URL/path (POST /embed/image on 6008 will be called)."
| 440 | - ) | |
| 441 | - | |
| 442 | - from embeddings.clip_as_service_encoder import _ensure_clip_client_path | |
| 443 | - | |
| 444 | - _ensure_clip_client_path() | |
| 445 | - from clip_client import Client | |
| 446 | - | |
| 447 | - server = str(cfg.get("server") or "grpc://127.0.0.1:51000").strip() | |
| 448 | - normalize = bool(cfg.get("normalize_embeddings", True)) | |
| 449 | - client = Client(server) | |
| 450 | - arr = client.encode([q], batch_size=1, show_progress=False) | |
| 451 | - vec = np.asarray(arr[0], dtype=np.float32).flatten() | |
| 452 | - if normalize: | |
| 453 | - n = float(np.linalg.norm(vec)) | |
| 454 | - if np.isfinite(n) and n > 0.0: | |
| 455 | - vec = vec / n | |
| 456 | - return vec.tolist() | |
| 428 | + else: | |
| 429 | + vec = enc.encode_clip_text(q, normalize_embeddings=True, priority=1) | |
| 430 | + return vec.astype(np.float32).flatten().tolist() | |
| 457 | 431 | |
| 458 | 432 | |
| 459 | 433 | def search_title_knn(es: Any, index_name: str, query: str, size: int) -> List[Dict[str, Any]]: |
| ... | ... | @@ -467,7 +441,6 @@ def search_title_knn(es: Any, index_name: str, query: str, size: int) -> List[Di |
| 467 | 441 | qv = vec.astype("float32").flatten().tolist() |
| 468 | 442 | num_cand = max(size * 10, 100) |
| 469 | 443 | body: Dict[str, Any] = { |
| 470 | - "size": size, | |
| 471 | 444 | "knn": { |
| 472 | 445 | "field": "title_embedding", |
| 473 | 446 | "query_vector": qv, |
| ... | ... | @@ -484,7 +457,6 @@ def search_image_knn(es: Any, index_name: str, query: str, size: int) -> List[Di |
| 484 | 457 | num_cand = max(size * 10, 100) |
| 485 | 458 | field = "image_embedding.vector" |
| 486 | 459 | body: Dict[str, Any] = { |
| 487 | - "size": size, | |
| 488 | 460 | "knn": { |
| 489 | 461 | "field": field, |
| 490 | 462 | "query_vector": qv, | ... | ... |
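After this refactor, `_encode_clip_query_vector` no longer branches on the configured backend; it only decides whether the query string is an image reference and picks the matching 6008 route. The dispatch can be sketched like this (a simplified stand-in: `looks_like_image_ref` below is a hypothetical heuristic, not the script's actual `_looks_like_image_ref`):

```python
from urllib.parse import urlparse

_IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".webp", ".bmp", ".gif")


def looks_like_image_ref(s: str) -> bool:
    """Treat http(s) URLs and paths with image extensions as image references."""
    s = s.strip().lower()
    if urlparse(s).scheme in ("http", "https"):
        return True
    return s.endswith(_IMAGE_EXTS)


def pick_route(query: str) -> str:
    """Choose the 6008 route whose output lives in the image_embedding space."""
    return "/embed/image" if looks_like_image_ref(query) else "/embed/clip_text"
```

Either way the resulting vector is kNN-comparable against the indexed `image_embedding` field.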
scripts/service_ctl.sh
| ... | ... | @@ -871,7 +871,7 @@ Special targets: |
| 871 | 871 | |
| 872 | 872 | Examples: |
| 873 | 873 | ./scripts/service_ctl.sh up all |
| 874 | - ./scripts/service_ctl.sh up tei cnclip embedding translator reranker | |
| 874 | + ./scripts/service_ctl.sh up tei cnclip embedding embedding-image translator reranker | |
| 875 | 875 | ./scripts/service_ctl.sh up backend indexer frontend |
| 876 | 876 | ./scripts/service_ctl.sh restart |
| 877 | 877 | ./scripts/service_ctl.sh monitor-start all | ... | ... |
tests/ci/test_service_api_contracts.py
| ... | ... | @@ -556,6 +556,9 @@ class _FakeImageModel: |
| 556 | 556 | def encode_image_urls(self, urls, batch_size=8, normalize_embeddings=True): |
| 557 | 557 | return [np.array([0.3, 0.2, 0.1], dtype=np.float32) for _ in urls] |
| 558 | 558 | |
| 559 | + def encode_clip_texts(self, texts, batch_size=8, normalize_embeddings=True): | |
| 560 | + return [np.array([0.31, 0.21, 0.11], dtype=np.float32) for _ in texts] | |
| 561 | + | |
| 559 | 562 | |
| 560 | 563 | class _EmbeddingCacheMiss: |
| 561 | 564 | """Avoid Redis/module cache hits so contract tests exercise the encode path.""" |
| ... | ... | @@ -579,6 +582,7 @@ def embedding_module(): |
| 579 | 582 | emb_server._text_backend_name = "tei" |
| 580 | 583 | emb_server._text_cache = _EmbeddingCacheMiss() |
| 581 | 584 | emb_server._image_cache = _EmbeddingCacheMiss() |
| 585 | + emb_server._clip_text_cache = _EmbeddingCacheMiss() | |
| 582 | 586 | yield emb_server |
| 583 | 587 | |
| 584 | 588 | |
| ... | ... | @@ -604,6 +608,17 @@ def test_embedding_image_contract(embedding_module): |
| 604 | 608 | assert len(data[0]) == 3 |
| 605 | 609 | |
| 606 | 610 | |
| 611 | +def test_embedding_clip_text_contract(embedding_module): | |
| 612 | + from fastapi.testclient import TestClient | |
| 613 | + | |
| 614 | + with TestClient(embedding_module.app) as client: | |
| 615 | + resp = client.post("/embed/clip_text", json=["纯棉短袖", "street tee"]) | |
| 616 | + assert resp.status_code == 200 | |
| 617 | + data = resp.json() | |
| 618 | + assert len(data) == 2 | |
| 619 | + assert len(data[0]) == 3 | |
| 620 | + | |
| 621 | + | |
| 607 | 622 | class _FakeTranslator: |
| 608 | 623 | model = "qwen-mt" |
| 609 | 624 | supports_batch = True | ... | ... |