A minimal, production-ready reranker service based on **BAAI/bge-reranker-v2-m3**.
## Features
- FP16 on GPU
- Length-based sorting to reduce padding waste
- Deduplication to avoid redundant inference
- Scores returned in original input order
- Simple FastAPI service
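
The length-sorting and order-restoring behaviour listed above can be illustrated with a minimal sketch. It is not the code in `reranker/bge_reranker.py`: the function name `score_in_length_order`, the `score_batch` callback, and the toy scorer are all hypothetical, and character length is assumed here as a cheap proxy for token count.

```python
from typing import Callable, List, Sequence


def score_in_length_order(
    query: str,
    docs: Sequence[str],
    score_batch: Callable[[str, List[str]], List[float]],
    batch_size: int = 64,
) -> List[float]:
    """Score docs in length-sorted batches; return scores in input order."""
    # Indices of docs sorted by character length, so that each batch holds
    # docs of similar size and needs little padding.
    order = sorted(range(len(docs)), key=lambda i: len(docs[i]))
    scores = [0.0] * len(docs)
    for start in range(0, len(order), batch_size):
        batch_idx = order[start:start + batch_size]
        batch_scores = score_batch(query, [docs[i] for i in batch_idx])
        # Scatter each score back to the position its doc had in the input.
        for i, s in zip(batch_idx, batch_scores):
            scores[i] = s
    return scores


# Hypothetical stand-in for the model: fraction of query words found in the doc.
def toy_score_batch(query: str, batch: List[str]) -> List[float]:
    q = set(query.split())
    return [len(q & set(d.split())) / len(q) for d in batch]


docs = ["logitech mx master", "usb cable", "wireless mouse bluetooth"]
scores = score_in_length_order("wireless mouse", docs, toy_score_batch, batch_size=2)
print(scores)  # one score per doc, aligned with the order of `docs`
```

The shortest doc (`"usb cable"`) is scored first, yet its score still lands at index 1, which is the property the service promises to callers.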
## Files
- `reranker/bge_reranker.py`: core model loading + scoring logic
- `reranker/server.py`: FastAPI service with `/health` and `/rerank`
- `reranker/config.py`: simple configuration
## Requirements
Install the Python dependencies (already listed in the project requirements):
- `torch`
- `modelscope`
- `fastapi`
- `uvicorn`
## Configuration
Edit `reranker/config.py`:
- `MODEL_NAME`: default `BAAI/bge-reranker-v2-m3`
- `DEVICE`: `None` (auto), `cuda`, or `cpu`
- `USE_FP16`: enable fp16 on GPU
- `BATCH_SIZE`: default 64
- `MAX_LENGTH`: default 512
- `PORT`: default 6007
- `MAX_DOCS`: request limit (default 1000)
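
As a rough illustration, the settings above might look like the following in `reranker/config.py`. This is a sketch, not the actual file: the `USE_FP16 = True` default and the helpers `resolve_device` and `effective_fp16` are assumptions added to show how `DEVICE = None` (auto) and "fp16 on GPU" could be interpreted.

```python
# Hypothetical sketch of reranker/config.py; the real file may differ.
from typing import Optional

MODEL_NAME = "BAAI/bge-reranker-v2-m3"
DEVICE: Optional[str] = None  # None = auto-detect, or "cuda" / "cpu"
USE_FP16 = True               # assumed default; only meaningful on GPU
BATCH_SIZE = 64
MAX_LENGTH = 512
PORT = 6007
MAX_DOCS = 1000               # maximum number of docs per request


def resolve_device(device: Optional[str], cuda_available: bool) -> str:
    """Pick a concrete device; None means 'use CUDA when it is available'."""
    if device is None:
        return "cuda" if cuda_available else "cpu"
    return device


def effective_fp16(use_fp16: bool, device: str) -> bool:
    """FP16 is applied only when running on a GPU."""
    return use_fp16 and device == "cuda"
```

In a real service, `cuda_available` would typically come from `torch.cuda.is_available()`; it is passed in as a plain boolean here so the sketch runs without `torch` installed.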
## Run the Service
```bash
uvicorn reranker.server:app --host 0.0.0.0 --port 6007
```
## API
### Health
```
GET /health
```
### Rerank
```
POST /rerank
Content-Type: application/json

{
  "query": "wireless mouse",
  "docs": ["logitech mx master", "usb cable", "wireless mouse bluetooth"]
}
```
Response:
```
{
  "scores": [0.93, 0.02, 0.88],
  "meta": {
    "input_docs": 3,
    "usable_docs": 3,
    "unique_docs": 3,
    "dedup_ratio": 0.0,
    "elapsed_ms": 12.4,
    "model": "BAAI/bge-reranker-v2-m3",
    "device": "cuda",
    "fp16": true,
    "batch_size": 64,
    "max_length": 512,
    "normalize": true,
    "service_elapsed_ms": 13.1
  }
}
```
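
Because `scores` follows the input order rather than rank order, a client has to sort the docs itself. The sketch below shows one way to call the endpoint and rank the results using only the standard library. The helper names `rerank` and `rank_docs`, the `base_url` default, and the timeout are assumptions, not part of the service.

```python
import json
from typing import List, Tuple
from urllib import request


def rerank(query: str, docs: List[str],
           base_url: str = "http://localhost:6007") -> List[float]:
    """POST the query and docs to /rerank and return the `scores` list."""
    payload = json.dumps({"query": query, "docs": docs}).encode("utf-8")
    req = request.Request(
        f"{base_url}/rerank",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())["scores"]


def rank_docs(docs: List[str], scores: List[float]) -> List[Tuple[str, float]]:
    """Pair each doc with its score and sort best-first."""
    return sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True)
```

With the service running, `rank_docs(docs, rerank("wireless mouse", docs))` would return the docs ordered from most to least relevant.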
## Logging
The service uses standard Python logging. To set the log verbosity explicitly,
run uvicorn with:
```bash
uvicorn reranker.server:app --host 0.0.0.0 --port 6007 --log-level info
```
## Notes
- No caching is used by design.
- Inputs are deduplicated by exact string match.
- Empty or null docs are skipped and scored as 0.
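
The deduplication and empty-doc rules above can be sketched as follows. This is an illustration under stated assumptions, not the service's code: `score_with_dedup` and the `score_unique` callback are hypothetical names, and only `None` and the empty string are treated as "empty" (whitespace-only docs are scored normally in this sketch).

```python
from typing import Callable, Dict, List, Optional, Sequence


def score_with_dedup(
    query: str,
    docs: Sequence[Optional[str]],
    score_unique: Callable[[str, List[str]], List[float]],
) -> List[float]:
    """Score each distinct non-empty doc once; empty or null docs get 0.0."""
    unique: List[str] = []
    slot: Dict[str, int] = {}  # doc text -> its index in `unique`
    for d in docs:
        if d and d not in slot:  # `d` is falsy for None and ""
            slot[d] = len(unique)
            unique.append(d)
    # Skip the model call entirely when nothing usable was submitted.
    unique_scores = score_unique(query, unique) if unique else []
    # Fan the scores back out: duplicates share a score, empties get 0.0.
    return [unique_scores[slot[d]] if d else 0.0 for d in docs]
```

Exact string matching means that `"USB cable"` and `"usb cable"` count as different docs and are scored separately.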