86d8358b
tangwang
config optimize
# Configuration System Review And Redesign
## 1. Goal
This document reviews the current configuration system and proposes a practical redesign for long-term maintainability.
The target is a configuration system that is:
- unified in loading and ownership
- clear in boundaries and precedence
- visible in effective behavior
- easy to evolve across development, deployment, and operations
This review is based on the current implementation, not only on the architecture described in the docs.
## 2. Project Context
The repo already defines the right architectural direction:
- `config/config.yaml` should be the main configuration source for search behavior and service wiring
- `.env` should mainly carry deployment-specific values and secrets
- provider/backend expansion should stay centralized instead of spreading through business code
That direction is described in:
- [`README.md`](/data/saas-search/README.md)
- [`docs/DEVELOPER_GUIDE.md`](/data/saas-search/docs/DEVELOPER_GUIDE.md)
- [`docs/QUICKSTART.md`](/data/saas-search/docs/QUICKSTART.md)
- [`translation/README.md`](/data/saas-search/translation/README.md)
The problem is not the architectural intent. The problem is that the current implementation only partially follows it.
## 3. Current-State Review
### 3.1 What exists today
The current system effectively has several configuration channels:
- `config/config.yaml`
  - search behavior
  - rerank behavior
  - services registry
  - tenant config
- `config/config_loader.py`
  - parses search behavior and tenant config into `SearchConfig`
  - also injects some defaults from code
- `config/services_config.py`
  - reparses `config/config.yaml` again, independently
  - resolves translation, embedding, rerank service config
  - also applies env overrides
- `config/env_config.py`
  - loads `.env`
  - defines ES, Redis, DB, host/port, service URLs, namespace, model path defaults
- service-local config modules
  - [`embeddings/config.py`](/data/saas-search/embeddings/config.py)
  - [`reranker/config.py`](/data/saas-search/reranker/config.py)
- startup scripts
  - derive defaults from shell env, Python config, and YAML in different combinations
- inline fallbacks in business logic
  - query parsing
  - indexing
  - service startup
### 3.2 Main findings
#### Finding A: there is no single loader for the full effective configuration
`ConfigLoader` and `services_config` both parse `config/config.yaml`, but they do so separately and with different responsibilities.
- [`config/config_loader.py`](/data/saas-search/config/config_loader.py#L148)
- [`config/services_config.py`](/data/saas-search/config/services_config.py#L33)
Impact:
- the same file is loaded twice through different code paths
- search config and services config can drift in interpretation
- alternative config paths are hard to support cleanly
- tests and tools cannot ask one place for the full effective config tree
#### Finding B: precedence is not explicit, stable, or globally enforced
Current precedence differs by subsystem:
- search behavior mostly comes from YAML plus code defaults
- embedding and rerank allow env overrides for provider/backend/url
- translation intentionally blocks some env overrides
- startup scripts still choose host/port and mode via env
- some values are reconstructed from other env vars
Examples:
- env override for embedding provider/url/backend:
  - [`config/services_config.py`](/data/saas-search/config/services_config.py#L52)
  - [`config/services_config.py`](/data/saas-search/config/services_config.py#L68)
  - [`config/services_config.py`](/data/saas-search/config/services_config.py#L139)
- host/port and service URL reconstruction:
  - [`config/env_config.py`](/data/saas-search/config/env_config.py#L55)
  - [`config/env_config.py`](/data/saas-search/config/env_config.py#L75)
- translator host/port still driven by startup env:
  - [`scripts/start_translator.sh`](/data/saas-search/scripts/start_translator.sh#L28)
Impact:
- operators cannot reliably predict the effective configuration by reading one file
- the same setting category behaves differently across services
- incidents become harder to debug because source-of-truth depends on the code path
#### Finding C: defaults are duplicated across YAML and code
There are several layers of default values:
- dataclass defaults in `QueryConfig`
- fallback defaults in `ConfigLoader._parse_config`
- defaults in `config.yaml`
- defaults in `env_config.py`
- defaults in `embeddings/config.py`
- defaults in `reranker/config.py`
- defaults in startup scripts
Examples:
- query defaults duplicated in dataclass and parser:
  - [`config/config_loader.py`](/data/saas-search/config/config_loader.py#L24)
  - [`config/config_loader.py`](/data/saas-search/config/config_loader.py#L240)
- embedding defaults duplicated in YAML, `services_config`, `embeddings/config.py`, and startup script:
  - [`config/config.yaml`](/data/saas-search/config/config.yaml#L196)
  - [`embeddings/config.py`](/data/saas-search/embeddings/config.py#L14)
  - [`scripts/start_embedding_service.sh`](/data/saas-search/scripts/start_embedding_service.sh#L29)
- reranker defaults duplicated in YAML and `reranker/config.py`:
  - [`config/config.yaml`](/data/saas-search/config/config.yaml#L214)
  - [`reranker/config.py`](/data/saas-search/reranker/config.py#L6)
Impact:
- changing a default is risky because there may be multiple hidden copies
- code review cannot easily tell whether a value is authoritative or dead legacy
- “same config” may behave differently across processes
#### Finding D: config is still embedded in runtime logic
Some important behavior remains encoded as inline fallback logic rather than declared config.
Examples:
- query-time translation target languages fall back to `["en", "zh"]`:
  - [`query/query_parser.py`](/data/saas-search/query/query_parser.py#L339)
- indexer text handling and LLM enrichment also fall back to `["en", "zh"]`:
  - [`indexer/document_transformer.py`](/data/saas-search/indexer/document_transformer.py#L216)
  - [`indexer/document_transformer.py`](/data/saas-search/indexer/document_transformer.py#L310)
  - [`indexer/document_transformer.py`](/data/saas-search/indexer/document_transformer.py#L649)
Impact:
- configuration is not fully visible in config files
- behavior can silently change when tenant config is missing or malformed
- “default behavior” is spread across business modules
#### Finding E: some configuration assets are not managed as first-class config
Query rewrite is configured through an external file, but the file path is hardcoded and currently inconsistent with the repository content.
- loader expects:
  - [`config/config_loader.py`](/data/saas-search/config/config_loader.py#L162)
- repo currently contains:
  - [`config/query_rewrite.dict`](/data/saas-search/config/query_rewrite.dict)
There is also an admin API that mutates rewrite rules in memory only:
- [`api/routes/admin.py`](/data/saas-search/api/routes/admin.py#L68)
- [`query/query_parser.py`](/data/saas-search/query/query_parser.py#L622)
Impact:
- rewrite rules are neither cleanly file-backed nor fully runtime-managed
- restart behavior is unclear
- configuration visibility and persistence are weak
#### Finding F: visibility is limited
The system exposes only a small sanitized subset at `/admin/config`.
- [`api/routes/admin.py`](/data/saas-search/api/routes/admin.py#L42)
At the same time, the true effective config includes:
- tenant overlays
- env overrides
- service backend selections
- script-selected modes
- hidden defaults in code
Impact:
- there is no authoritative “effective config” view
- debugging configuration mismatches requires source reading
- operators cannot easily verify what each process actually started with
#### Finding G: the indexer does not really consume the unified config as a first-class dependency
Indexer startup explicitly says config is loaded only for parity/logging and routes do not depend on it.
- [`api/indexer_app.py`](/data/saas-search/api/indexer_app.py#L76)
Impact:
- configuration is not truly system-wide
- search-side and indexer-side behavior can drift
- the current “unified config” is only partially unified
#### Finding H: docs still carry legacy and mixed mental models
Most high-level docs describe the desired centralized model, but some implementation code and docs still expose legacy concepts such as `translate_to_en` and `translate_to_zh`.
- desired model:
  - [`README.md`](/data/saas-search/README.md#L78)
  - [`docs/DEVELOPER_GUIDE.md`](/data/saas-search/docs/DEVELOPER_GUIDE.md#L207)
  - [`translation/README.md`](/data/saas-search/translation/README.md#L161)
- legacy tenant translation flags still documented:
  - [`indexer/README.md`](/data/saas-search/indexer/README.md#L39)
Impact:
- new developers may follow old mental models
- cleanup work keeps getting deferred because both the old and the new system appear to be “supported”
## 4. Design Principles For The Redesign
The redesign should follow these rules.
### 4.1 One logical configuration system
It is acceptable to have multiple files, but not multiple loaders with overlapping ownership.
There must be one loader pipeline that produces one typed `AppConfig`.
### 4.2 Configuration files declare, parser code interprets, env provides runtime injection
Responsibilities should be:
- configuration files
  - declare non-secret desired behavior and non-secret deployable settings
- parsing logic
  - load, merge, validate, normalize, and expose typed config
  - never invent hidden business behavior
- environment variables
  - carry secrets and a small set of runtime/process values
  - do not redefine business behavior casually
### 4.3 One precedence rule for the whole system
Every config category should follow the same merge model unless explicitly exempted.
### 4.4 No silent implicit fallback for business behavior
Fail fast at startup when required config is missing or invalid.
Do not silently fall back to legacy behavior such as hardcoded language lists.
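A minimal sketch of what fail-fast could look like for one tenant field. The `ConfigError` class and the `resolve_index_languages` helper are hypothetical names used for illustration only; `index_languages` is one of the tenant fields proposed later in this document.

```python
class ConfigError(ValueError):
    """Raised at startup when required configuration is missing or invalid."""


def resolve_index_languages(tenant_id: str, tenant_cfg: dict) -> list[str]:
    """Return the tenant's declared index languages, or fail loudly.

    Hypothetical helper: there is deliberately no hardcoded fallback list.
    """
    languages = tenant_cfg.get("index_languages")
    if not isinstance(languages, list) or not languages:
        raise ConfigError(
            f"tenant {tenant_id!r}: 'index_languages' must be a non-empty list"
        )
    if not all(isinstance(lang, str) and lang for lang in languages):
        raise ConfigError(
            f"tenant {tenant_id!r}: 'index_languages' entries must be non-empty strings"
        )
    return list(languages)
```

Calling this once during startup for every tenant turns a missing or malformed tenant entry into a startup error instead of a silent behavior change at query time.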
### 4.5 Effective configuration must be observable
Every service should be able to show:
- config version or hash
- source files loaded
- environment name
- sanitized effective configuration
## 5. Recommended Target Design
## 5.1 Boundary model
Use three clear layers.
### Layer 1: repository-managed static config
Purpose:
- search behavior
- tenant behavior
- provider/backend registry
- non-secret service topology defaults
- feature switches
Examples:
- field boosts
- query strategy
- rerank fusion parameters
- tenant language plans
- translation capability registry
- embedding backend selection default
### Layer 2: environment-specific overlays
Purpose:
- per-environment non-secret differences
- service endpoints by environment
- resource sizing defaults by environment
- dev/test/prod operational differences
Examples:
- local embedding URL vs production URL
- dev rerank backend vs prod rerank backend
- lower concurrency in local development
### Layer 3: environment variables
Purpose:
- secrets
- bind host/port
- external infrastructure credentials
- container-orchestrator last-mile injection
Examples:
- `ES_HOST`, `ES_USERNAME`, `ES_PASSWORD`
- `DB_HOST`, `DB_USERNAME`, `DB_PASSWORD`
- `REDIS_HOST`, `REDIS_PASSWORD`
- `DASHSCOPE_API_KEY`, `DEEPL_AUTH_KEY`
- `API_HOST`, `API_PORT`, `INDEXER_PORT`, `TRANSLATION_PORT`
Rule:
- environment variables should not be the normal path for choosing business behavior such as translation model, embedding backend, or tenant language policy
- if an env override is allowed for a non-secret field, it must be explicitly listed and documented as an operational override, not a hidden convention
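One way to make the allowed overrides explicit is a single allowlist owned by the loader. The sketch below is an assumption about shape, not existing code: `ALLOWED_ENV_OVERRIDES`, `apply_env_overrides`, and the dotted config paths are hypothetical.

```python
import copy
from typing import Mapping

# Hypothetical allowlist: env var name -> dotted path in the config tree.
# Anything not listed here is ignored by the loader.
ALLOWED_ENV_OVERRIDES = {
    "ES_HOST": "infrastructure.es.host",
    "REDIS_PASSWORD": "infrastructure.redis.password",
    "API_PORT": "runtime.api_port",
}


def apply_env_overrides(config: dict, environ: Mapping[str, str]) -> dict:
    """Return a copy of config with allowlisted env values applied.

    Values stay strings here; type coercion is left to schema validation.
    """
    result = copy.deepcopy(config)
    for env_name, dotted_path in ALLOWED_ENV_OVERRIDES.items():
        if env_name not in environ:
            continue
        *parents, leaf = dotted_path.split(".")
        node = result
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = environ[env_name]
    return result
```

Passing `os.environ` in from the loader, rather than reading it inside the function, keeps the override logic testable and keeps environment access in one place.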
## 5.2 Unified precedence
Recommended precedence:
1. schema defaults in code
2. `config/base.yaml`
3. `config/environments/<env>.yaml`
4. tenant overlay from `config/tenants/`
5. environment variables for the explicitly allowed runtime keys
6. CLI flags for the current process only
Important rule:
- only one module may implement this merge logic
- no business module may call `os.getenv()` directly for configuration
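The precedence list above can be implemented as an ordered deep merge where later layers win. This is a sketch under stated assumptions: `deep_merge` and `merge_layers` are hypothetical names, the sample values are illustrative, and lists are replaced wholesale rather than concatenated.

```python
from functools import reduce


def deep_merge(base: dict, override: dict) -> dict:
    """Merge override into base recursively; override wins on conflicts.

    Nested dicts are merged key by key. Any other value, including lists,
    is replaced as a whole (an assumption of this sketch).
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_layers(*layers: dict) -> dict:
    """Apply layers in precedence order, lowest priority first."""
    return reduce(deep_merge, layers, {})


# Illustrative layers only; keys and values are made up for the example.
schema_defaults = {"services": {"rerank": {"backend": "qwen3_vllm", "timeout_sec": 10}}}
base_yaml = {"services": {"rerank": {"endpoint": "http://reranker:6007/rerank"}}}
env_yaml = {"services": {"rerank": {"timeout_sec": 3}}}

merged = merge_layers(schema_defaults, base_yaml, env_yaml)
```

Here `merged["services"]["rerank"]` keeps `backend` from the schema defaults and `endpoint` from the base file, while `timeout_sec` takes the environment overlay's value of `3`.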
## 5.3 Recommended directory structure
```text
config/
  schema.py
  loader.py
  sources.py
  base.yaml
  environments/
    dev.yaml
    test.yaml
    prod.yaml
  tenants/
    _default.yaml
    1.yaml
    162.yaml
    170.yaml
  dictionaries/
    query_rewrite.dict
  README.md
.env.example
```
Notes:
- `base.yaml` contains shared defaults and feature behavior
- `environments/*.yaml` contains environment-specific non-secret overrides
- `tenants/*.yaml` contains tenant-specific overrides only
- `dictionaries/` stores first-class config assets such as rewrite dictionaries
- `schema.py` defines the typed config model
- `loader.py` is the only entry point that loads and merges config
If the team prefers fewer files, `tenants.yaml` is also acceptable. The key requirement is not “one file”, but “one loading model with clear ownership”.
## 5.4 Typed configuration model
Introduce one root object, for example:
```python
class AppConfig(BaseModel):
    runtime: RuntimeConfig
    infrastructure: InfrastructureConfig
    search: SearchConfig
    services: ServicesConfig
    tenants: TenantCatalogConfig
    assets: ConfigAssets
```
Suggested subtrees:
- `runtime`
  - environment name
  - config revision/hash
  - bind addresses/ports
- `infrastructure`
  - ES
  - DB
  - Redis
  - index namespace
- `search`
  - field boosts
  - query config
  - function score
  - rerank behavior
  - spu config
- `services`
  - translation
  - embedding
  - rerank
- `tenants`
  - default tenant config
  - tenant overrides
- `assets`
  - rewrite dictionary path
Benefits:
- one validated object shared by backend, indexer, translator, embedding, reranker
- one place for defaults
- one place for schema evolution
## 5.5 Loading flow
Recommended loading flow:
1. determine `APP_ENV` or `RUNTIME_ENV`
2. load schema defaults
3. load `config/base.yaml`
4. load `config/environments/<env>.yaml` if present
5. load tenant files
6. inject first-class assets such as rewrite dictionary
7. apply allowed env overrides
8. validate the final `AppConfig`
9. freeze and cache the config object
10. expose a sanitized effective-config view
Important:
- every process should call the same loader
- services should receive a resolved `AppConfig`, not re-open YAML independently
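The last steps of the flow (validate, freeze, cache) could look roughly like the sketch below. It uses plain dicts and the standard library only; `_load_sources` is a stand-in for steps 2 to 7, and `get_app_config` taking the environment name as a parameter is an assumption of this example, as are the sample values.

```python
from functools import lru_cache
from types import MappingProxyType


def freeze(value):
    """Recursively convert dicts and lists into read-only views and tuples."""
    if isinstance(value, dict):
        return MappingProxyType({k: freeze(v) for k, v in value.items()})
    if isinstance(value, list):
        return tuple(freeze(v) for v in value)
    return value


def _load_sources(env_name: str) -> dict:
    # Placeholder for steps 2-7: a real loader would read the YAML files,
    # merge them in precedence order, and apply allowed env overrides.
    return {
        "runtime": {"environment": env_name},
        "search": {"query": {"knn_boost": 0.25}},  # illustrative value
    }


@lru_cache(maxsize=None)
def get_app_config(env_name: str = "dev"):
    """Load, validate, freeze, and cache the config once per environment."""
    raw = _load_sources(env_name)
    if not raw.get("runtime", {}).get("environment"):
        raise ValueError("runtime.environment is required")
    return freeze(raw)
```

Because the result is cached and read-only, every caller in a process sees the same object, and an accidental write such as `cfg["runtime"]["environment"] = "prod"` raises `TypeError` instead of silently diverging.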
## 5.6 Clear responsibility split
### Configuration files are responsible for
- what the system should do
- what providers/backends are available
- which features are enabled
- tenant language/index policies
- non-secret service topology
### Parser/loader code is responsible for
- locating sources
- merge precedence
- type validation
- normalization
- deprecation warnings
- producing the final immutable config object
### Environment variables are responsible for
- secrets
- bind addresses/ports
- infrastructure endpoints when the deployment platform injects them
- a very small set of documented operational overrides
### Business code is not responsible for
- inventing defaults for missing config
- loading YAML directly
- calling `os.getenv()` for normal application behavior
## 5.7 How to handle service config
Unify all service-facing config under one structure:
```yaml
services:
  translation:
    endpoint: "http://translator:6006"
    timeout_sec: 10
    default_model: "llm"
    default_scene: "general"
    capabilities: ...
  embedding:
    endpoint:
      text: "http://embedding:6005"
      image: "http://embedding-image:6008"
    backend: "tei"
    backends: ...
  rerank:
    endpoint: "http://reranker:6007/rerank"
    backend: "qwen3_vllm"
    backends: ...
```
Rules:
- `endpoint` is how callers reach the service
- `backend` is how the service itself is implemented
- only the service process cares about `backend`
- only callers care about `endpoint`
- both still belong to the same config tree, because they are part of one system
## 5.8 How to handle tenant config
Tenant config should become explicit policy, not translation-era leftovers.
Recommended tenant fields:
- `primary_language`
- `index_languages`
- `search_languages`
- `translation_policy`
- `facet_policy`
- optional tenant-specific ranking overrides
Avoid keeping `translate_to_en` and `translate_to_zh` as active concepts in the long-term model.
If compatibility is needed, support them only in the loader as deprecated aliases and emit warnings.
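A sketch of loader-side alias handling is shown below. The mapping from each legacy flag to a language code is an assumption made for illustration; the real semantics of `translate_to_en` and `translate_to_zh` must be confirmed against the existing loader before adopting anything like this. `normalize_tenant_config` is a hypothetical name.

```python
import warnings

# ASSUMPTION: each legacy flag maps to one language code. Verify against
# the current implementation before relying on this mapping.
_LEGACY_FLAGS = {"translate_to_en": "en", "translate_to_zh": "zh"}


def normalize_tenant_config(tenant_id: str, raw: dict) -> dict:
    """Translate deprecated tenant keys into the new model and warn."""
    cfg = {k: v for k, v in raw.items() if k not in _LEGACY_FLAGS}
    used = [flag for flag in _LEGACY_FLAGS if flag in raw]
    if used:
        warnings.warn(
            f"tenant {tenant_id!r} uses deprecated keys {used}; "
            "declare 'index_languages' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # An explicit new-style value always wins over legacy flags.
        if "index_languages" not in cfg:
            cfg["index_languages"] = [
                lang for flag, lang in _LEGACY_FLAGS.items() if raw.get(flag)
            ]
    return cfg
```

Keeping this translation inside the loader means business modules only ever see the new field names, and the warnings give a concrete list of tenants that still need migration.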
## 5.9 How to handle rewrite rules and similar assets
Treat them as declared config assets.
Recommended rules:
- file path declared in config
- one canonical location under `config/dictionaries/`
- loader validates presence and format
- admin runtime updates either:
  - are removed, or
  - write back through a controlled persistence path
Do not keep a hybrid model where startup loads one file and admin mutates only in memory.
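For the "loader validates presence and format" rule, a minimal sketch follows. The on-disk format is an assumption: this example treats the dictionary as UTF-8 text with one `source<TAB>replacement` pair per line and `#` comments, which may not match the actual `query_rewrite.dict`. `load_rewrite_dictionary` is a hypothetical name.

```python
from pathlib import Path


def load_rewrite_dictionary(path: Path) -> dict[str, str]:
    """Load rewrite rules, failing on a missing file or a malformed line.

    ASSUMED format: 'source<TAB>replacement' per line; blank lines and
    lines starting with '#' are skipped.
    """
    if not path.is_file():
        raise FileNotFoundError(f"rewrite dictionary not found: {path}")
    rules: dict[str, str] = {}
    lines = path.read_text(encoding="utf-8").splitlines()
    for line_no, line in enumerate(lines, start=1):
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        parts = line.split("\t")
        if len(parts) != 2 or not parts[0].strip() or not parts[1].strip():
            raise ValueError(f"{path}:{line_no}: expected 'source<TAB>replacement'")
        rules[parts[0].strip()] = parts[1].strip()
    return rules
```

With the path declared in config and checked at startup, a mismatch between the expected and actual file name surfaces immediately rather than as silently missing rewrite rules.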
## 5.10 Observability improvements
Add the following:
- `config dump` CLI that prints sanitized effective config
- startup log with config hash, environment, and config file list
- `/admin/config/effective` endpoint returning sanitized effective config
- `/admin/config/meta` endpoint returning:
  - environment
  - config hash
  - loaded source files
  - deprecated keys in use
This is important for operations and for multi-service debugging.
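The sanitized view and the config hash can share one small helper module. The sketch below is illustrative: `SECRET_MARKERS`, `sanitize`, and `config_hash` are hypothetical names, and matching secrets by key name is an assumption that a real schema could replace with explicit secret field types.

```python
import hashlib
import json

# Hypothetical heuristic: any key containing one of these is masked.
SECRET_MARKERS = ("password", "secret", "api_key", "auth_key", "token")


def sanitize(value):
    """Return a copy of the config tree with secret-looking values masked."""
    if isinstance(value, dict):
        return {
            k: "***" if any(m in str(k).lower() for m in SECRET_MARKERS) else sanitize(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [sanitize(v) for v in value]
    return value


def config_hash(config: dict) -> str:
    """Short, stable fingerprint of a JSON-serializable config tree."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
```

Hashing the sanitized tree, for example `config_hash(sanitize(cfg))`, avoids deriving a public fingerprint from secret values; the trade-off is that rotating a secret alone does not change the hash.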
## 6. Practical Refactor Plan
The refactor should be incremental.
### Phase 1: establish the new config core without changing behavior
- create `config/schema.py`
- create `config/loader.py`
- move all current defaults into schema models
- make loader read current `config/config.yaml`
- make loader read `.env` only for approved keys
- expose one `get_app_config()`
Result:
- same behavior, but one typed root config becomes available
### Phase 2: remove duplicate readers
- make `services_config.py` a thin adapter over `get_app_config()`
- make `tenant_config_loader.py` read from `get_app_config()`
- stop reparsing YAML in `services_config.py`
- stop service modules from depending on legacy local config modules for behavior
Result:
- one parsing path
- fewer divergence risks
### Phase 3: move hidden defaults out of business logic
- remove hardcoded fallback language lists from query/indexer modules
- require tenant defaults to come from config schema only
- remove duplicate behavior defaults from service code
Result:
- behavior becomes visible and reviewable
### Phase 4: clean service startup configuration
- make startup scripts ask the unified loader for resolved values
- keep only bind host/port and secret injection in shell env
- retire or reduce `embeddings/config.py` and `reranker/config.py`
Result:
- startup behavior matches runtime config model
### Phase 5: split config files by responsibility
- keep a single root loader
- split current giant `config.yaml` into:
  - `base.yaml`
  - `environments/<env>.yaml`
  - `tenants/*.yaml`
  - `dictionaries/query_rewrite.dict`
Result:
- config remains unified logically, but is easier to read and maintain physically
### Phase 6: deprecate legacy compatibility
- deprecate `translate_to_en` and `translate_to_zh`
- deprecate env-based backend/provider selection except for explicitly approved keys
- remove old code paths after one or two release cycles
Result:
- the system becomes simpler instead of carrying two generations forever
## 7. Concrete Rules To Adopt
These rules should be documented and enforced in code review.
### Rule 1
Only `config/loader.py` may load config files or `.env`.
### Rule 2
Only `config/loader.py` may read `os.getenv()` for application config.
### Rule 3
Business modules receive typed config objects and do not read files or env directly.
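Rules 1 to 3 are easier to enforce when code review is backed by an automated check. The sketch below uses the standard `ast` module to flag direct `os.getenv` and `os.environ` usage in a source string; `find_env_reads` is a hypothetical helper, and it does not catch aliased forms such as `from os import getenv`.

```python
import ast


def find_env_reads(source: str) -> list[int]:
    """Return line numbers where source touches os.getenv or os.environ.

    Only the 'os.<attr>' spelling is detected; aliased imports are not.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == "os"
            and node.attr in {"getenv", "environ"}
        ):
            hits.append(node.lineno)
    return sorted(hits)
```

A test could run this over every module outside `config/loader.py` and fail when the result is non-empty, which turns the rule into a regression guard rather than a convention.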
### Rule 4
Each config key has one owner.
Examples:
- `search.query.knn_boost` belongs to search behavior config
- `services.embedding.backend` belongs to service implementation config
- `infrastructure.redis.password` belongs to env/secrets
### Rule 5
Every fallback must be either:
- declared in schema defaults, or
- rejected at startup
No hidden fallback in runtime logic.
### Rule 6
Every configuration asset must be visible in one of these places only:
- config file
- env var
- generated runtime metadata
Not inside parser code as an implicit constant.
## 8. Recommended Naming Conventions
Suggested conventions:
- config keys use noun-based hierarchical names
- avoid mixing transport and implementation concepts in one field
- use `endpoint` for caller-facing addresses
- use `backend` for service-internal implementation choice
- use `enabled` only for true feature toggles
- use `default_*` only when a real selection happens at runtime
Examples:
- good: `services.rerank.endpoint`
- good: `services.rerank.backend`
- good: `tenants.default.index_languages`
- avoid: `service_url`, `base_url`, `provider`, `backend`, and script env all meaning slightly different things without a common model
## 9. Highest-Priority Cleanup Items
If the team wants the shortest path to improvement, start here:
1. build one root `AppConfig`
2. make `services_config.py` stop reparsing YAML
3. declare rewrite dictionary path explicitly and fix the current mismatch
4. remove hardcoded `["en", "zh"]` fallbacks from query/indexer logic
5. replace `/admin/config` with an effective-config endpoint
6. retire `embeddings/config.py` and `reranker/config.py` as behavior sources
7. deprecate legacy tenant translation flags
## 10. Expected Outcome
After the redesign:
- developers can answer “where does this setting come from?” in one step
- operators can see effective config without reading source code
- backend, indexer, translator, embedding, and reranker all share one model
- tenant behavior is explicit instead of partially implicit
- migration becomes safer because defaults and precedence are centralized
- adding a new provider/backend becomes configuration extension, not configuration archaeology
## 11. Summary
The current system has the right intent but not yet the right implementation shape.
Today the main problems are:
- duplicate config loaders
- inconsistent precedence
- duplicated defaults
- config hidden in runtime logic
- weak effective-config visibility
- leftover legacy concepts
The recommended direction is:
- one root typed config
- one loader pipeline
- explicit layered sources
- narrow env responsibility
- no hidden business fallbacks
- observable effective config
That design is practical to implement incrementally in this repository and aligns well with the project's multi-tenant, multi-service, provider/backend-based architecture.