YAML Configuration Reference¶
Learning Objectives
By the end of this section you will understand:
The complete structure of a homodyne YAML configuration file
All configuration sections and their options
Default values for every setting
How to generate and validate configuration templates
Configuration precedence rules
—
Configuration Overview¶
All homodyne analysis is driven by a YAML configuration file. The ConfigManager
class loads, validates, and provides access to all settings:
from homodyne.config import ConfigManager
config = ConfigManager("config.yaml")
# Access configuration sections
print(config.analysis_mode) # "static" or "laminar_flow"
print(config.get_initial_parameters()) # dict of initial param values
print(config.get_cmc_config()) # CMC settings dict
Generate a template with the CLI:
# Static mode template
homodyne-config --mode static --output config_static.yaml
# Laminar flow template
homodyne-config --mode laminar_flow --output config_flow.yaml
# Validate an existing config
homodyne-config --validate --input my_config.yaml
—
Complete Configuration Schema¶
Below is a fully annotated configuration with all available options and their defaults:
# ============================================================
# DATA SECTION
# ============================================================
data:
# Path to the HDF5 file containing XPCS data
file_path: "data.h5" # Required
# Internal HDF5 path to the C2 array (n_phi, n_t1, n_t2)
dataset_path: "/exchange/data" # Default: "/exchange/data"
# Scattering vector magnitude in Å⁻¹
q_value: 0.054 # Required
# Stator-rotor gap distance in µm (only for laminar_flow)
gap_distance: 500.0 # Default: null (required for laminar_flow)
# Frame time interval in seconds
dt: 0.1 # Default: inferred from time arrays
# Optional: restrict angle range (degrees)
phi_range: null # Default: null (load all angles)
# phi_range: [-90, 90] # Example: load only ±90°
# Optional: restrict time range (seconds)
t_range: null # Default: null (load all times)
# ============================================================
# ANALYSIS SECTION
# ============================================================
analysis:
mode: "static" # "static" or "laminar_flow"
# ============================================================
# OPTIMIZATION SECTION
# ============================================================
optimization:
method: "nlsq" # "nlsq" or "cmc" or "both"
# -------- NLSQ Settings --------
nlsq:
# Memory management
memory_fraction: 0.75 # Trigger streaming above 75% RAM
# memory_threshold_gb: null # Or explicit GB limit
# Anti-degeneracy (per-angle scaling)
anti_degeneracy:
per_angle_mode: "auto" # "auto", "constant", "individual", "fourier"
constant_scaling_threshold: 3 # Angle count threshold for "auto" mode
# CMA-ES global optimization (optional)
cmaes:
enable: false # Set true for multi-scale problems
preset: "cmaes-global" # "cmaes-fast", "cmaes-global", "cmaes-hpc"
refine_with_nlsq: true # Run NLSQ after CMA-ES
# Hybrid recovery (when standard NLSQ fails)
hybrid_recovery:
enable: true # Enable 3-attempt retry strategy
max_attempts: 3
# -------- CMC Settings --------
cmc:
# Sharding (data partitioning)
sharding:
max_points_per_shard: "auto" # Always use "auto"
sharding_strategy: "stratified" # "stratified", "random", "contiguous"
max_shards: null # null = dynamic based on dataset size
# Per-shard MCMC settings
per_shard_mcmc:
num_warmup: 500 # Warmup/burn-in samples per chain
num_samples: 1500 # Posterior samples per chain
num_chains: 4 # NUTS chains per shard
target_accept_prob: 0.8 # NUTS target acceptance probability
max_tree_depth: 10 # Max NUTS tree depth (2^10 leapfrog steps)
chain_method: "parallel" # "parallel" (default), "vectorized", "sequential"
adaptive_sampling: true # Reduce samples for small shards
# JAX profiling (advanced)
enable_jax_profiling: false
jax_profile_dir: "./profiles/jax"
# Per-angle mode (must match nlsq.anti_degeneracy.per_angle_mode)
per_angle_mode: "auto"
constant_scaling_threshold: 3
# Quality filtering
validation:
max_divergence_rate: 0.10 # Reject shards with > 10% divergences
require_nlsq_warmstart: false # True = fail if no NLSQ warm-start
# Execution backend
backend_config:
name: "auto" # "auto", "multiprocessing", "pjit", "pbs"
# Posterior combination
combination:
method: "consensus_mc" # Recommended
min_success_rate: 0.90
# "consensus_mc": precision-weighted means (correct)
# "weighted_gaussian": legacy element-wise (deprecated)
# "simple_average": unweighted (deprecated)
# Convergence criteria (under validation section)
validation:
max_divergence_rate: 0.10
max_per_shard_rhat: 1.05 # Maximum R-hat for accepted shards
min_ess: 400 # Minimum effective sample size
# ============================================================
# PARAMETER SPACE SECTION
# ============================================================
parameter_space:
# Format for each parameter:
# <name>:
# initial: <float> # Initial value for optimization
# bounds: [<low>, <high>] # Hard bounds
# prior: "uniform" # "uniform" or "normal" (CMC)
# prior_mean: null # For normal prior (CMC)
# prior_std: null # For normal prior (CMC)
# --- Diffusion parameters (static + laminar_flow) ---
D0:
initial: 1.0
bounds: [0.001, 1.0e6]
prior: "uniform"
alpha:
initial: -0.5
bounds: [-2.0, 2.0]
prior: "uniform"
D_offset:
initial: 0.01
bounds: [0.0, 1.0e4]
prior: "uniform"
# --- Shear parameters (laminar_flow only) ---
gamma_dot_0:
initial: 0.01
bounds: [1.0e-6, 1.0e4]
prior: "uniform"
beta:
initial: 0.0
bounds: [-2.0, 2.0]
prior: "uniform"
gamma_dot_t_offset:
initial: 0.001
bounds: [0.0, 1.0e3]
prior: "uniform"
phi_0:
initial: 0.0
bounds: [-180.0, 180.0]
prior: "uniform"
# ============================================================
# OUTPUT SECTION
# ============================================================
output:
directory: "./results" # Output directory
save_arrays: true # Save NPZ arrays
save_json: true # Save JSON summary
save_plots: false # Save matplotlib plots
run_id: null # null = auto-generated timestamp
# Output formats
formats:
hdf5: true # Save results as HDF5
json: true # Save summary as JSON
csv: false # Save parameter table as CSV
# Compression
compress_hdf5: true # Enable HDF5 compression
compression_level: 6 # gzip level 0-9 (higher = smaller)
# ============================================================
# STRATIFICATION SECTION (NLSQ)
# ============================================================
# Angle-aware chunking for large datasets
stratification:
enabled: "auto" # "auto" | true | false
target_chunk_size: 100000 # Target chunk size [points]
max_imbalance_ratio: 5.0 # Max angle imbalance before fallback
force_sequential_fallback: false # Force per-angle optimization
check_memory_safety: true # Check available memory
use_index_based: false # Zero-copy indexing (low memory)
collect_diagnostics: false # Collect performance metrics
log_diagnostics: false # Log diagnostic report
# ============================================================
# SEQUENTIAL SECTION (NLSQ fallback)
# ============================================================
# Per-angle optimization fallback strategy
sequential:
min_success_rate: 0.5 # Min fraction of angles converged
weighting: "inverse_variance" # "inverse_variance" | "uniform" | "n_points"
# ============================================================
# PHI FILTERING SECTION
# ============================================================
phi_filtering:
enabled: true # Enable angle filtering
target_ranges:
- min_angle: -10.0 # Angular window [degrees]
max_angle: 10.0
description: "Parallel to flow"
fallback_to_all_angles: true # Use all angles if none match
algorithm: "range_based" # Currently the only option
tolerance: 3.0 # Angular tolerance [degrees]
quality_control:
min_angles_required: 1 # Minimum angles needed
max_angle_spread: 36.0 # Max spread within range [deg]
validate_coverage: true # Validate angle coverage
require_orthogonal_angles: false # Only for anisotropy analysis
# ============================================================
# PLOTTING SECTION
# ============================================================
plotting:
save_plots: true # Save plots to files
show_plots: false # Display interactive plots
format: "png" # "png" | "pdf" | "svg"
dpi: 300 # Resolution [dots per inch]
style: "publication" # Matplotlib style
preview_mode: false # true: fast Datashader preview
fit_surface: "solver" # "solver" | "posthoc"
color_scale:
mode: "legacy" # "legacy" | "percentile" | "fixed"
pin_legacy_range: true # Pin to legacy color range
percentile_min: 1.0 # Low percentile for auto-scale
percentile_max: 99.0 # High percentile
fixed_min: 1.0 # Fixed range minimum
fixed_max: 1.5 # Fixed range maximum
datashader:
canvas_width: 1200 # Canvas width [pixels]
canvas_height: 1200 # Canvas height [pixels]
matplotlib:
interpolation: "bilinear" # "none" | "bilinear" | "bicubic"
use_tight_layout: true # Use tight_layout
# ============================================================
# LOGGING SECTION
# ============================================================
logging:
enabled: true
level: "INFO" # "DEBUG" | "INFO" | "WARNING" | "ERROR"
console:
enabled: true
level: "INFO"
format: "detailed" # "simple" | "detailed"
colors: true
file:
enabled: false
level: "DEBUG"
path: "./logs/"
filename: "homodyne_analysis.log"
max_size_mb: 10 # Max file size before rotation
backup_count: 5 # Rotated files to keep
# ============================================================
# PERFORMANCE SECTION
# ============================================================
performance:
strategy_override: null # Force: "standard" | "large" | "streaming"
memory_limit_gb: null # Custom memory limit [GB]
computation:
enable_jit: true # Enable JAX JIT compilation
cpu_threads: "auto" # "auto" or explicit int
—
Additional Configuration Details¶
Stratification activates automatically when per_angle_scaling=True and
dataset size exceeds 100,000 points. It creates angle-balanced chunks for NLSQ.
When max_imbalance_ratio is exceeded, NLSQ falls back to sequential
(per-angle) optimization using the sequential settings.
Phi filtering selects which azimuthal angles to analyze. Each target_ranges
entry defines an angular window; angles within tolerance degrees of the range
are included. Supports wrapping (e.g., 170 to -170 = 170 to 190 degrees).
Performance strategy is normally auto-selected based on dataset size:
Strategy |
Dataset Size |
Memory |
Method |
|---|---|---|---|
|
< 10M points |
~2 GB per M |
Full Jacobian |
|
10M–100M |
~2 GB per M |
Chunked Jacobian |
|
> 100M |
~2 GB fixed |
Hybrid streaming |
Quick reference patterns:
Purpose |
Configuration |
|---|---|
Single static optimization |
|
Laminar flow with uncertainty |
|
Large dataset (>10M) |
|
Fast preview |
|
Publication figures |
|
Low memory |
|
—
Configuration Validation¶
Validate your configuration before running a long fit:
homodyne-config --validate --input config.yaml
Or validate programmatically:
from homodyne.config import ConfigManager
try:
config = ConfigManager("config.yaml")
print("Configuration valid")
except ValueError as e:
print(f"Configuration error: {e}")
Physics validation:
from homodyne.config import validate_all_parameters
params = config.get_initial_parameters()
violations = validate_all_parameters(params)
for v in violations:
print(f"{v.severity}: {v.parameter} — {v.message}")
—
Configuration Precedence¶
When the CLI is used, the base optimization.mcmc settings override
optimization.cmc.per_shard_mcmc. Keep them aligned:
# This is the correct way to align base mcmc and cmc settings:
optimization:
mcmc: # Base MCMC (CLI applies these to per_shard_mcmc)
num_warmup: 500
num_samples: 1500
num_chains: 4
cmc:
per_shard_mcmc:
num_warmup: 500 # Same as base mcmc
num_samples: 1500 # Same as base mcmc
num_chains: 4 # Same as base mcmc
—
Template Generation¶
The homodyne-config command generates complete, valid configuration
templates:
# Static mode template to stdout
homodyne-config --mode static
# Laminar flow template to file
homodyne-config --mode laminar_flow --output my_config.yaml
# Interactive configuration builder (asks questions)
homodyne-config --interactive
# Validate existing config
homodyne-config --validate --input existing_config.yaml
# Format/normalize config (auto-format YAML)
homodyne-config --format --input messy_config.yaml --output clean_config.yaml
—
See Also¶
Analysis Modes — Mode selection
NLSQ Fitting Guide — NLSQ workflow
Bayesian Inference with CMC — CMC workflow
Performance Tuning: CPU/NUMA Optimization — XLA and CPU optimization