09 Apr, 2026
1 commit
- Data conversion docs moved to scripts/data_import/README.md
- Diagnostics and inspection moved to scripts/inspect/README.md
- Ops helpers moved to scripts/ops/README.md
- The frontend helper service moved to scripts/frontend/frontend_server.py
- The translation model downloader moved to scripts/translation/download_translation_models.py
- The ad-hoc "backfill image embeddings" script consolidated into scripts/maintenance/embed_tenant_image_urls.py
- The Redis monitoring script folded into redis/; it is now scripts/redis/monitor_eviction.py

I also updated the real call chains to the new locations:

- scripts/start_frontend.sh
- scripts/start_cnclip_service.sh
- scripts/service_ctl.sh
- scripts/setup_translator_venv.sh
- scripts/README.md

The docs that reference these script paths were updated in sync as well, mainly docs/QUICKSTART.md and translation/README.md.
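A tiny sanity check for the new layout can be sketched as below. The paths are the ones listed above, but the helper itself (`missing_paths`) is hypothetical and illustrative, not part of the repo:

```python
from pathlib import Path

# Post-reorg script locations, as listed in the commit message.
EXPECTED = [
    "scripts/data_import/README.md",
    "scripts/inspect/README.md",
    "scripts/ops/README.md",
    "scripts/frontend/frontend_server.py",
    "scripts/translation/download_translation_models.py",
    "scripts/maintenance/embed_tenant_image_urls.py",
    "scripts/redis/monitor_eviction.py",
]

def missing_paths(repo_root: str) -> list[str]:
    """Return any expected file that is absent under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED if not (root / p).is_file()]
```

Running this against the repo root after the move should return an empty list; anything it returns points at a path that was not migrated.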
18 Mar, 2026
1 commit
Implemented CTranslate2 for the three local translation models and switched the existing local_nllb / local_marian factories over to it. The new runtime lives in local_ctranslate2.py, including HF->CT2 auto-conversion, float16 compute-type mapping, Marian direction handling, and NLLB target-prefix decoding. The service wiring is in service.py (line 113), and the three model configs now point at explicit ctranslate2-float16 dirs in config.yaml (line 133).

I also updated the setup path so this is usable end to end: ctranslate2>=4.7.0 was added to requirements_translator_service.txt and requirements.txt, the download script now supports pre-conversion in download_translation_models.py (line 27), and the docs/config examples were refreshed in translation/README.md.

I installed ctranslate2 into .venv-translator, pre-converted all three models, and the CT2 artifacts are already on disk:

- models/translation/facebook/nllb-200-distilled-600M/ctranslate2-float16
- models/translation/Helsinki-NLP/opus-mt-zh-en/ctranslate2-float16
- models/translation/Helsinki-NLP/opus-mt-en-zh/ctranslate2-float16

Verification was solid: python3 -m compileall passed, direct TranslationService smoke tests ran successfully in .venv-translator, and the focused NLLB benchmark on the local GPU showed a clear win:

- batch_size=16: HF 0.347 s/batch (46.1 items/s) vs CT2 0.130 s/batch (123.0 items/s)
- batch_size=1: HF 0.396 s/request vs CT2 0.126 s/request

One caveat: translation quality on some very short phrases, especially with opus-mt-en-zh, still looks a bit rough in smoke tests, so I'd run your real quality set before fully cutting over. If you want, I can take the next step and update the benchmark script/report so you have a fresh full CT2 performance report for all three models.
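For context, a minimal sketch of what the updated config.yaml entries might look like. Only the ctranslate2-float16 directory paths come from this commit; the key names (models, model_path, compute_type) and the factory labels are assumptions for illustration:

```yaml
# Hypothetical shape of the updated translation model configs.
# Key names are illustrative; only the directory paths are from the commit.
translation:
  models:
    local_nllb:
      model_path: models/translation/facebook/nllb-200-distilled-600M/ctranslate2-float16
      compute_type: float16
    local_marian_zh_en:
      model_path: models/translation/Helsinki-NLP/opus-mt-zh-en/ctranslate2-float16
      compute_type: float16
    local_marian_en_zh:
      model_path: models/translation/Helsinki-NLP/opus-mt-en-zh/ctranslate2-float16
      compute_type: float16
```

Pointing each model at an explicit pre-converted directory (rather than converting on first load) keeps service startup deterministic and avoids a cold-start conversion penalty.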
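As a sanity check on the benchmark numbers above: the items/s figures follow directly from the per-batch latencies (items/s = batch_size / seconds per batch), and the implied speedups are roughly 2.7x batched and 3.1x single-request:

```python
# Derive throughput and speedup from the latencies quoted in the report.
hf_batch, ct2_batch = 0.347, 0.130   # s/batch at batch_size=16
hf_req, ct2_req = 0.396, 0.126       # s/request at batch_size=1
batch_size = 16

hf_items_s = batch_size / hf_batch       # ~46.1 items/s, matches the report
ct2_items_s = batch_size / ct2_batch     # ~123.1 items/s (report rounds to 123.0)
batched_speedup = hf_batch / ct2_batch   # ~2.67x
latency_speedup = hf_req / ct2_req       # ~3.14x
```

The small 123.0 vs 123.1 discrepancy is just rounding in the reported per-batch time.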