24 Apr, 2026

1 commit

  • - `baseline` (best on top771, `seed_baseline`)
      - `es_bias: 10.0`, `es_exponent: 0.05`
      - `text_bias: 0.1`, `text_exponent: 0.35`, `text_translation_weight: 1.0`
      - `knn_text_weight: 1.0`, `knn_image_weight: 2.0`, `knn_tie_breaker: 0.3`
      - `knn_bias: 0.2`, `knn_exponent: 5.6`
      - `knn_text_bias: 0.2`, `knn_text_exponent: 0.0`
      - `knn_image_bias: 0.2`, `knn_image_exponent: 0.0`
    - Extreme solution found on the 54-item set (`seed_legacy_bo234`)
      - `es_bias: 7.214`, `es_exponent: 0.2025`
      - `text_bias: 4.0`, `text_exponent: 1.584`, `text_translation_weight: 1.4441`
      - `knn_text_weight: 0.1`, `knn_image_weight: 5.6232`, `knn_tie_breaker: 0.021`
      - `knn_bias: 0.0019`, `knn_exponent: 11.8477`
      - `knn_text_bias: 2.3125`, `knn_text_exponent: 1.1547`
      - `knn_image_bias: 0.9641`, `knn_image_exponent: 5.8671`
    - `bo_012` (`Primary_Metric_Score=0.485027`)
      - `es_bias: 6.6233`, `es_exponent: 0.2377`
      - `text_bias: 0.049`, `text_exponent: 0.4446`, `text_translation_weight: 1.6236`
      - `knn_text_weight: 1.0344`, `knn_image_weight: 1.3565`, `knn_tie_breaker: 0.212`
      - `knn_bias: 0.0052`, `knn_exponent: 4.4639`
      - `knn_text_bias: 0.1148`, `knn_text_exponent: 1.0926`
      - `knn_image_bias: 0.0114`, `knn_image_exponent: 5.2496`
    - `bo_018` (`Primary_Metric_Score=0.484691`)
      - `es_bias: 8.8861`, `es_exponent: 0.2794`
      - `text_bias: 0.0189`, `text_exponent: 0.2`, `text_translation_weight: 1.7178`
      - `knn_text_weight: 1.7459`, `knn_image_weight: 4.2658`, `knn_tie_breaker: 0.2814`
      - `knn_bias: 0.001`, `knn_exponent: 1.4923`
      - `knn_text_bias: 4.0`, `knn_text_exponent: 0.9309`
      - `knn_image_bias: 0.01`, `knn_image_exponent: 5.8289`
    
    **How to find them (reproducible)**
    - From `leaderboard.csv` (score and parameters together on one row): `artifacts/search_evaluation/tuning_runs/coarse_fusion_clothing_top771_resilient_20260422T091650Z/leaderboard.csv`
      - Example: `rg '^2,bo_012,' artifacts/search_evaluation/tuning_runs/coarse_fusion_clothing_top771_resilient_20260422T091650Z/leaderboard.csv`
    - From `trials.jsonl` (the most authoritative source: the params the tuner actually wrote): `artifacts/search_evaluation/tuning_runs/coarse_fusion_clothing_top771_resilient_20260422T091650Z/trials.jsonl`
      - Example: `rg '\"name\": \"bo_012\"' artifacts/search_evaluation/tuning_runs/coarse_fusion_clothing_top771_resilient_20260422T091650Z/trials.jsonl`
      - Example: `rg '\"name\": \"seed_legacy_bo234\"' artifacts/search_evaluation/tuning_runs/coarse_fusion_clothing_top771_resilient_20260422T091650Z/trials.jsonl`
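
The `rg` lookups above can also be done in Python when the record needs further processing. This is a minimal sketch, not part of the repo: it assumes each line of `trials.jsonl` is one JSON object with a top-level `"name"` field (as the `rg` pattern suggests). The helper name `find_trial` is hypothetical.

```python
import json
from typing import Iterable, Optional


def find_trial(lines: Iterable[str], name: str) -> Optional[dict]:
    """Return the last JSON record whose "name" field equals `name`, or None.

    Assumption: one JSON object per line with a top-level "name" key.
    """
    match = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Skip lines that are not valid JSON, e.g. a line that is
            # still being written while the tuning run is in progress.
            continue
        if isinstance(record, dict) and record.get("name") == name:
            # Keep the latest occurrence in case a trial was logged twice.
            match = record
    return match
```

To use it, open the run's `trials.jsonl` and pass the file handle, for example `find_trial(open(path, encoding="utf-8"), "bo_012")`. The layout of the returned record (such as a `params` key) depends on what the tuner writes and is not assumed here.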
    
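    **Added to `config.yaml`**
    - I added these 4 parameter sets as commented-out presets next to `coarse_rank.fusion`: `config/config.yaml:236`
    - Note: the values currently in effect for `coarse_rank.fusion` in your `config/config.yaml` are `knn_bias=0.6 / knn_exponent=0.4`, which look more like `seed_low_knn_global` than the baseline that scored best on this large set.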
    tangwang
     

22 Apr, 2026

1 commit

  • - Changed the batch timeout so a run can go on indefinitely:
      - [tune_fusion.py](/data/saas-search/scripts/evaluation/tune_fusion.py:400)
      - When `--batch-eval-timeout-sec <= 0`, no Python-level timeout is passed to `subprocess.run` any more
    - Added a resilient wrapper that resumes automatically:
      - [run_coarse_fusion_tuning_resilient.sh](/data/saas-search/scripts/evaluation/run_coarse_fusion_tuning_resilient.sh)
      - Logic: count the completed live evals in `trials.jsonl`; while the count is below `max_evals`, keep calling `resume-run`
      - Even after an abnormal exit, it sleeps and then resumes from the existing `run_dir`
    - The start and resume scripts both switched to resilient mode:
      - [start_coarse_fusion_tuning_long.sh](/data/saas-search/scripts/evaluation/start_coarse_fusion_tuning_long.sh)
      - [resume_coarse_fusion_tuning_long.sh](/data/saas-search/scripts/evaluation/resume_coarse_fusion_tuning_long.sh)
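
The timeout change described above can be sketched as follows. This is an illustration under stated assumptions, not the actual code in `tune_fusion.py`: the function names `resolve_timeout` and `run_batch` are hypothetical. The only library behavior relied on is that `subprocess.run` applies no timeout when `timeout=None`.

```python
import subprocess
from typing import Optional, Sequence


def resolve_timeout(batch_eval_timeout_sec: float) -> Optional[float]:
    """Map a non-positive CLI value to None, which means "no timeout"."""
    if batch_eval_timeout_sec > 0:
        return batch_eval_timeout_sec
    return None


def run_batch(cmd: Sequence[str], batch_eval_timeout_sec: float) -> subprocess.CompletedProcess:
    """Run one batch evaluation command (hypothetical wrapper).

    With timeout=None the call blocks until the child exits, so a long
    batch is never cut off by subprocess.TimeoutExpired.
    """
    return subprocess.run(
        list(cmd),
        timeout=resolve_timeout(batch_eval_timeout_sec),
        check=False,
    )
```

Passing `0` (as in `--batch-eval-timeout-sec 0`) therefore disables the Python-level limit, while any positive value keeps the previous bounded behavior.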
    
    **Current job**
    - `run_name`: `coarse_fusion_clothing_top771_resilient_20260422T091650Z`
    - `run_dir`: [coarse_fusion_clothing_top771_resilient_20260422T091650Z](/data/saas-search/artifacts/search_evaluation/tuning_runs/coarse_fusion_clothing_top771_resilient_20260422T091650Z)
    - `launch log`: [coarse_fusion_clothing_top771_resilient_20260422T091650Z.log](/data/saas-search/artifacts/search_evaluation/tuning_launches/coarse_fusion_clothing_top771_resilient_20260422T091650Z.log)
    
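    **Confirmed**
    - The wrapper has started and entered `attempt=1`
    - The flag actually passed is `--batch-eval-timeout-sec 0`
    - `tune_fusion.py` is running
    - `build_annotation_set.py batch` is already running
    - `eval.log` is already printing evaluation progress for the first few queries of this round, so the process is not idling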
    
    **Monitoring**
    - `tail -f artifacts/search_evaluation/tuning_launches/coarse_fusion_clothing_top771_resilient_20260422T091650Z.log`
    - `tail -f logs/eval.log`
    - `tail -f artifacts/search_evaluation/tuning_runs/coarse_fusion_clothing_top771_resilient_20260422T091650Z/trials.jsonl`
    - `cat artifacts/search_evaluation/tuning_runs/coarse_fusion_clothing_top771_resilient_20260422T091650Z/leaderboard.csv`
    
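    **Key differences from the previous run**
    - Last time: a single batch round was cut off by the Python timeout
    - This time: no Python timeout per round, plus an outer wrapper that resumes automatically
    - So long runs, mid-run interruptions, and later resumes all keep advancing in the same `run_dir`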
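
The resume behavior can be sketched in Python as a retry loop. The real wrapper is a shell script, so this is only an illustration. It assumes every parseable line in `trials.jsonl` counts as one completed live eval, which may not match the wrapper's actual filter. The names `count_completed` and `run_until_done` are hypothetical.

```python
import json
import time
from typing import Callable, Iterable


def count_completed(lines: Iterable[str]) -> int:
    """Count parseable JSON lines (assumption: one finished eval per line)."""
    done = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore a partially written trailing line
        done += 1
    return done


def run_until_done(
    read_lines: Callable[[], Iterable[str]],
    resume: Callable[[int], None],
    max_evals: int,
    sleep_sec: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `resume` repeatedly until `max_evals` trials are recorded.

    Returns the number of attempts made. A failed attempt is logged and
    retried after a pause, always against the same trial log.
    """
    attempt = 0
    while count_completed(read_lines()) < max_evals:
        if attempt > 0:
            sleep(sleep_sec)  # back off before the next attempt
        attempt += 1
        try:
            resume(attempt)
        except Exception as exc:  # an abnormal exit must not stop the loop
            print(f"attempt={attempt} failed: {exc!r}")
    return attempt
```

Because progress is read back from the trial log on every iteration, an interrupted attempt loses at most the work that was not yet written, and the next attempt continues from the recorded count.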
    tangwang