homodyne.data¶
The homodyne.data package provides data ingestion for XPCS experiments.
XPCSDataLoader supports both legacy APS and modern APS-U HDF5 file formats
and produces JAX-compatible arrays ready for optimisation.
XPCSDataLoader¶
XPCSDataLoader is the single entry point for loading XPCS correlation
data. It handles:
Auto-detection of APS vs APS-U HDF5 format
Half-matrix reconstruction for correlation matrices
Mandatory diagonal correction applied post-load
Smart NPZ caching to avoid reloading large HDF5 files
Optional physics-based quality validation
JAX array output with NumPy fallback when JAX is unavailable
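This page does not show how the loader performs half-matrix reconstruction or diagonal correction. The NumPy sketch below illustrates one plausible version of each step. It assumes the stored half is the upper triangle (diagonal included) and that the correction replaces each diagonal element with the mean of its adjacent off-diagonal neighbours. Both are assumptions, and the function names are hypothetical rather than part of homodyne.

```python
import numpy as np


def reconstruct_from_upper(half: np.ndarray) -> np.ndarray:
    """Mirror an upper-triangular matrix into a full symmetric one."""
    upper = np.triu(half)
    # Adding the transpose counts the diagonal twice, so subtract it once.
    return upper + upper.T - np.diag(np.diag(upper))


def correct_diagonal(c2: np.ndarray) -> np.ndarray:
    """Replace each diagonal element with the mean of its off-diagonal neighbours."""
    out = c2.astype(float)          # astype returns a copy, the input is untouched
    n = out.shape[0]                # assumes a square matrix with n >= 2
    side = np.diag(out, k=1)        # elements c2[i, i + 1], length n - 1
    new_diag = np.empty(n)
    new_diag[0] = side[0]           # the first point has only one neighbour
    new_diag[-1] = side[-1]         # so does the last point
    new_diag[1:-1] = 0.5 * (side[:-1] + side[1:])
    out[np.diag_indices(n)] = new_diag
    return out


# Toy 3x3 example with an inflated diagonal (values are made up).
half = np.array([[9.0, 1.2, 1.1],
                 [0.0, 9.0, 1.3],
                 [0.0, 0.0, 9.0]])
full = correct_diagonal(reconstruct_from_upper(half))
```

After both steps `full` is symmetric and its diagonal no longer carries the inflated values from the input.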
- class homodyne.data.xpcs_loader.XPCSDataLoader[source]
Bases: object
Enhanced XPCS data loader for Homodyne.
Supports both APS (old) and APS-U (new) formats with YAML-first configuration, intelligent caching, and JAX integration.
Features:
- YAML-first configuration with JSON support
- Auto-detection of HDF5 format (APS vs APS-U)
- Smart NPZ caching with compression
- Half-matrix reconstruction for correlation matrices
- Mandatory diagonal correction applied consistently
- JAX array output when available
- Integration with v2 physics validation
- __init__(config_path=None, config_dict=None, configure_logging=True, generate_quality_reports=False)[source]
Initialize XPCS data loader with YAML-first configuration.
- Parameters:
config_path (str|None) – Path to YAML or JSON configuration file
config_dict (dict|None) – Configuration dictionary (alternative to config_path)
configure_logging (bool) – Whether to apply logging configuration from config
generate_quality_reports (bool) – Whether to generate quality reports (default: False)
- Raises:
XPCSDependencyError – If required dependencies are not available
XPCSConfigurationError – If configuration is invalid
- load_experimental_data()[source]
Load experimental data with priority: cache NPZ → raw HDF → error.
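The priority order can be pictured as a short fallback chain. The following is a minimal sketch of that idea, not the loader's actual code: `load_with_priority` and the `read_raw` callback are hypothetical names, and the raw file here is an empty stand-in rather than real HDF5.

```python
import tempfile
from pathlib import Path
from typing import Callable

import numpy as np


def load_with_priority(cache_path: Path, raw_path: Path,
                       read_raw: Callable[[Path], dict]) -> dict:
    """Try the NPZ cache first, then the raw file, otherwise raise."""
    if cache_path.exists():
        with np.load(cache_path, allow_pickle=False) as npz:
            return {key: npz[key] for key in npz.files}
    if raw_path.exists():
        return read_raw(raw_path)
    raise FileNotFoundError(f"Neither {cache_path} nor {raw_path} exists")


with tempfile.TemporaryDirectory() as tmp:
    cache, raw = Path(tmp) / "scan_cache.npz", Path(tmp) / "scan.h5"
    raw.write_bytes(b"")                       # empty stand-in, never parsed
    fake_reader = lambda path: {"c2": np.zeros(3)}
    from_raw = load_with_priority(cache, raw, fake_reader)    # no cache yet
    np.savez_compressed(cache, c2=np.ones(3))
    from_cache = load_with_priority(cache, raw, fake_reader)  # cache wins
```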
HDF5 Format Requirements¶
Homodyne supports two HDF5 layouts:
APS old format (legacy)
/exchange/
    correlation/    # C2 matrix: (n_phi, n_t1, n_t2)
    lag_steps/      # time lag indices
/measurement/
    sample/
        q_value     # scalar
        phi_values  # (n_phi,)
APS-U new format (APS Upgrade, current)
/xpcs/
    g2/             # C2 data: (n_phi, n_delay)
    delay_frames/   # frame delay values
    q_values/       # (n_phi,)
    phi_values/     # (n_phi,)
    dt              # frame time step (seconds)
Note
Homodyne detects the format automatically. If your file uses a non-standard
layout, pass format_hint="aps" or format_hint="apsu" to the
constructor to skip auto-detection.
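How the loader actually detects the format is not documented here. Given the two layouts listed above, one simple approach is to look at the top-level group names. The sketch below assumes that `/xpcs` marks APS-U and that `/exchange` together with `/measurement` marks the legacy format. `detect_format` is a hypothetical helper, not a homodyne function.

```python
from typing import Iterable


def detect_format(top_level_groups: Iterable[str]) -> str:
    """Classify a file as 'aps' or 'apsu' from its top-level HDF5 group names."""
    names = {name.strip("/") for name in top_level_groups}
    if "xpcs" in names:
        return "apsu"
    if {"exchange", "measurement"} <= names:
        return "aps"
    raise ValueError(f"Unrecognised XPCS layout; top-level groups: {sorted(names)}")


print(detect_format(["exchange", "measurement"]))  # aps
print(detect_format(["xpcs"]))                     # apsu
```

With h5py, the group names could come from `h5py.File(path, "r").keys()`. The sketch takes plain strings so that it runs without an HDF5 file on hand.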
NPZ Caching¶
Loading large HDF5 files repeatedly is slow, so XPCSDataLoader caches the
preprocessed arrays as compressed NPZ files alongside the HDF5 file. On
subsequent loads, if the cache is still valid (the HDF5 file's modification
time is unchanged), the NPZ is loaded directly, which is typically 10–100× faster.
Set use_cache: false in the YAML config to disable caching:
data:
  use_cache: false
  cache_dir: null  # defaults to same directory as HDF5 file
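To make the mtime-based validity check concrete, here is a small sketch under stated assumptions: the cache stores the source file's modification time in an array named `source_mtime`, and the helper names `write_cache` and `cache_is_valid` are invented for illustration. The real loader may record and compare this differently.

```python
import os
import tempfile

import numpy as np


def write_cache(cache_path: str, source_path: str, c2: np.ndarray) -> None:
    """Save the array plus the source file's mtime in a compressed NPZ."""
    np.savez_compressed(cache_path, c2=c2,
                        source_mtime=np.float64(os.path.getmtime(source_path)))


def cache_is_valid(cache_path: str, source_path: str) -> bool:
    """A cache is valid only if it records the source's current mtime."""
    if not os.path.exists(cache_path):
        return False
    with np.load(cache_path, allow_pickle=False) as npz:
        return float(npz["source_mtime"]) == os.path.getmtime(source_path)


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "scan.h5")
    cache = os.path.join(tmp, "scan_cache.npz")
    open(source, "wb").close()                 # stand-in for a real HDF5 file
    write_cache(cache, source, np.ones((2, 4, 4)))
    fresh = cache_is_valid(cache, source)
    os.utime(source, (1_000_000_000, 1_000_000_000))  # simulate a modified source
    stale = not cache_is_valid(cache, source)
```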
Note
Caches are loaded with allow_pickle=False (since v2.23.2). Cache metadata
is stored as a JSON-encoded scalar (cache_metadata_json) and parsed with
json.loads() rather than unpickled, so a cache file at a config-controlled
path cannot trigger arbitrary object deserialization. Legacy caches that used
the older cache_metadata object-array format are rejected with a clear
error — delete the stale .npz and it regenerates on the next load.
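The pattern described in this note can be reproduced with plain NumPy and `json`. The sketch below is illustrative only: the `cache_metadata_json` key comes from the note, while the metadata fields and the error message are made up.

```python
import json
import os
import tempfile

import numpy as np

metadata = {"format": "apsu", "n_phi": 3}      # made-up example fields

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cache.npz")
    # A JSON string becomes a 0-d unicode array, which needs no pickling.
    np.savez_compressed(path, c2=np.ones((3, 4, 4)),
                        cache_metadata_json=np.array(json.dumps(metadata)))

    with np.load(path, allow_pickle=False) as npz:
        if "cache_metadata_json" not in npz.files:
            raise ValueError("Legacy cache format; delete the .npz to regenerate it")
        restored = json.loads(npz["cache_metadata_json"].item())
        c2_shape = npz["c2"].shape
```

Because the metadata travels as a string, `np.load` never has to reconstruct arbitrary Python objects. An object-dtype array in the same file would raise under `allow_pickle=False` as soon as it was accessed.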
Data Validation¶
Optional physics-based validation checks:
| Check | Description |
|---|---|
| Shape consistency | Verifies C2 matrix dimensions against phi and time axes |
| NaN / Inf detection | Raises |
| Monotonicity | Verifies lag time array is strictly increasing |
| Value bounds | Checks C2 values fall in physically reasonable range |
Enable strict validation via:
data:
  validate: true
  strict_bounds: true
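The four checks in the table map onto a few NumPy calls. This is a rough sketch, not homodyne's validator: `validate_c2` is a hypothetical function, it collects messages instead of raising, and the default `bounds` are placeholder values chosen only for the example.

```python
import numpy as np


def validate_c2(c2, phi, lags, bounds=(0.0, 3.0)):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    # Shape consistency: one C2 matrix per phi angle.
    if c2.ndim != 3 or c2.shape[0] != phi.shape[0]:
        problems.append("shape: leading C2 axis must match the number of phi angles")
    # NaN / Inf detection.
    if not np.all(np.isfinite(c2)):
        problems.append("non-finite: C2 contains NaN or Inf")
    # Monotonicity: every step in the lag array must be positive.
    if np.any(np.diff(lags) <= 0):
        problems.append("monotonicity: lag times are not strictly increasing")
    # Value bounds, evaluated on the finite entries only.
    finite = c2[np.isfinite(c2)]
    if finite.size and (finite.min() < bounds[0] or finite.max() > bounds[1]):
        problems.append("bounds: C2 values fall outside the expected range")
    return problems


phi = np.array([0.0, 90.0])
good = validate_c2(np.full((2, 3, 3), 1.1), phi, np.array([1.0, 2.0, 3.0]))

broken = np.full((2, 3, 3), 1.1)
broken[0, 0, 0] = np.nan
bad = validate_c2(broken, phi, np.array([1.0, 1.0, 2.0]))
```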
Output Data Structure¶
XPCSDataLoader.load_experimental_data() returns a dictionary with the following keys:
| Key | Shape | Description |
|---|---|---|
| | | Flattened C2 correlation values |
| | | First time indices (absolute, seconds) |
| | | Second time indices (absolute, seconds) |
| | | Scattering angle per data point (degrees) |
| | scalar | Scattering wavevector magnitude (Å⁻¹) |
| | scalar | Gap / characteristic length (Å) |
| | scalar | Frame time step (seconds) |
| | scalar | Number of azimuthal angles |
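The flattened layout pairs every C2 value with its own t1, t2 and phi. The sketch below shows how such per-point arrays can be built from a (n_phi, n_t1, n_t2) matrix with NumPy. It assumes times start at zero and advance by dt per frame, and all variable names and values are illustrative rather than the loader's actual keys.

```python
import numpy as np

n_phi, n_t = 2, 3
dt = 0.1                                   # frame time step in seconds (made-up value)
phi_values = np.array([0.0, 90.0])         # degrees
c2_matrix = np.arange(n_phi * n_t * n_t, dtype=float).reshape(n_phi, n_t, n_t)

times = dt * np.arange(n_t)                # absolute time of each frame
# indexing="ij" keeps the axis order (phi, t1, t2), matching c2_matrix.
phi_grid, t1_grid, t2_grid = np.meshgrid(phi_values, times, times, indexing="ij")

# ravel() walks all four arrays in the same C order, so index k refers
# to the same (phi, t1, t2) point in each of them.
c2_flat = c2_matrix.ravel()
t1_flat = t1_grid.ravel()
t2_flat = t2_grid.ravel()
phi_flat = phi_grid.ravel()
```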
Usage Examples¶
From a YAML config file¶
from homodyne.data.xpcs_loader import XPCSDataLoader
loader = XPCSDataLoader(config_path="my_config.yaml")
data = loader.load_experimental_data()
print(f"Data points: {len(data['c2'])}")
print(f"Phi angles: {data['n_phi']}")
print(f"q: {data['q']:.4g} Angstrom^-1")
print(f"Time step dt: {data['dt']:.4g} s")
From a ConfigManager¶
from homodyne.config.manager import ConfigManager
from homodyne.data.xpcs_loader import XPCSDataLoader
config_manager = ConfigManager("config.yaml")
loader = XPCSDataLoader(config_dict=config_manager.config)
data = loader.load_experimental_data()
Using the convenience function¶
from homodyne.data import load_xpcs_data
data = load_xpcs_data(config_path="my_config.yaml")
Supplementary Modules¶
XPCS Data Loader for Homodyne¶
Enhanced XPCS data loader supporting both APS (old) and APS-U (new) HDF5 formats, with a YAML-first configuration system, JAX compatibility, and modern architecture integration.
This module provides:
- YAML-first configuration with JSON support
- Smart NPZ caching to avoid reloading large HDF5 files
- Auto-detection of APS vs APS-U format
- Half-matrix reconstruction for correlation matrices
- Mandatory diagonal correction applied post-load
- JAX array output with numpy fallback
- Integration with v2 logging and physics validation
Key Features:
- Format Support: APS old format and APS-U new format
- Configuration: YAML primary, JSON via converter
- Caching: Intelligent NPZ caching with compression
- Output: JAX arrays when available, numpy fallback
- Validation: Optional physics-based data quality checks
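The "JAX arrays when available, numpy fallback" behaviour is commonly implemented with a guarded import. The following is a generic sketch of that pattern, not homodyne's own code: `HAS_JAX` and `to_output_array` are hypothetical names.

```python
import numpy as np

try:
    import jax.numpy as jnp
    HAS_JAX = True
except ImportError:
    jnp = None
    HAS_JAX = False


def to_output_array(values):
    """Return a JAX array when JAX is importable, otherwise a NumPy array."""
    return jnp.asarray(values) if HAS_JAX else np.asarray(values)


c2 = to_output_array([1.0, 1.2, 1.1])
```

Downstream code that sticks to the array interface shared by both libraries (shapes, indexing, arithmetic) works with either result.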
- exception homodyne.data.xpcs_loader.XPCSDataFormatError[source]
Bases: Exception
Raised when XPCS data format is not recognized or invalid.
- exception homodyne.data.xpcs_loader.XPCSDependencyError[source]
Bases: Exception
Raised when required dependencies are not available.
- exception homodyne.data.xpcs_loader.XPCSConfigurationError[source]
Bases: Exception
Raised when configuration is invalid or missing required parameters.
- homodyne.data.xpcs_loader.load_xpcs_config(config_path)[source]
Load XPCS configuration from YAML or JSON file.
Primary format: YAML
JSON support: Automatically converted to YAML format
- homodyne.data.xpcs_loader.load_xpcs_data(config_path=None, config_dict=None)[source]
Convenience function to load XPCS data from configuration file or dict.
Supports both YAML and JSON configuration files with auto-detection, or direct configuration dictionary for programmatic use (backward compatible).
- Parameters:
- Return type:
- Returns:
Dictionary containing loaded experimental data with JAX arrays when available
Example
>>> # From config file
>>> data = load_xpcs_data(config_path="xpcs_config.yaml")
>>> print(data.keys())
dict_keys(['wavevector_q_list', 'phi_angles_list', 't1', 't2', 'c2_exp'])

>>> # From dict (backward compatible - positional)
>>> config = {"data_file": "experiment.h5", "analysis_mode": "static_isotropic"}
>>> data = load_xpcs_data(config)

>>> # From dict (keyword argument)
>>> data = load_xpcs_data(config_dict=config)