invode.sampling

Functions

lhs_sample(param_bounds, n_samples[, seed])

Generate Latin Hypercube Samples for parameter space exploration.

invode.sampling.lhs_sample(param_bounds, n_samples, seed=None)

Generate Latin Hypercube Samples for parameter space exploration.

This function creates a set of well-distributed parameter samples using Latin Hypercube Sampling (LHS), a stratified sampling technique. LHS divides each parameter dimension into n_samples equally probable intervals and draws exactly one value from each interval, so every dimension is covered evenly. Independent random sampling gives no such guarantee and can leave some intervals empty while others hold several points.
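The stratification idea can be sketched with the standard library alone. This is an illustrative sketch, not the function's actual implementation (which relies on SciPy); the helper name `lhs_unit` is hypothetical.

```python
import random

def lhs_unit(n_samples, n_dims, seed=None):
    """Return n_samples points in [0, 1)^n_dims, one point per stratum per axis."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of the n_samples equal-width strata.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that strata are paired at random across dimensions.
        rng.shuffle(column)
        columns.append(column)
    # Transpose the per-axis columns into a list of points.
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

points = lhs_unit(n_samples=5, n_dims=2, seed=0)
for axis in range(2):
    # Along each axis, the stratum indices 0..4 each appear exactly once.
    print(sorted(int(p[axis] * 5) for p in points))
```

Shuffling each axis independently is what decouples the dimensions: without it, every point would lie in the same stratum on all axes, along the diagonal of the hypercube.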

Latin Hypercube Sampling is particularly effective for:

  • High-dimensional parameter spaces where uniform coverage is important

  • Expensive function evaluations where sample efficiency matters

  • Situations requiring reproducible sampling with controlled randomness

  • Initial exploration phases of optimization algorithms

Parameters:
  • param_bounds (dict) –

    Dictionary mapping parameter names to their bounds. Each key should be a string parameter name, and each value should be a tuple (min_val, max_val) defining the lower and upper bounds for that parameter.

    Example: {'k1': (0.1, 10.0), 'k2': (0.01, 1.0), 'alpha': (-1, 1)}

  • n_samples (int) – Number of parameter samples to generate. Must be a positive integer. Each sample will contain values for all parameters specified in param_bounds.

  • seed (int, optional) – Random seed for reproducible sampling. If None, the sampling will be non-deterministic. Using the same seed with identical inputs guarantees identical sample sets, which is useful for debugging and reproducible research. Default is None.

Returns:

A list containing n_samples dictionaries, where each dictionary represents one parameter sample. Each dictionary has the same keys as param_bounds, with values sampled from the corresponding parameter ranges using LHS.

The returned samples have the following properties:

  • Each parameter dimension is divided into n_samples equally probable strata

  • Exactly one sample is drawn from each stratum in each dimension

  • Samples are randomly permuted to avoid correlation between dimensions

  • All parameter values are within their specified bounds
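These properties can be checked on any returned sample list. The sketch below uses a hypothetical helper, `stratum_counts`, and hand-written sample values standing in for real `lhs_sample` output. It assumes uniform sampling, so equally probable strata are equal-width intervals.

```python
from collections import Counter

def stratum_counts(samples, param_bounds):
    """Count samples per equal-width stratum, separately for each parameter."""
    n = len(samples)
    counts = {}
    for name, (low, high) in param_bounds.items():
        indices = []
        for s in samples:
            # Map the value to a stratum index in 0..n-1 (clamp the upper edge).
            idx = int((s[name] - low) / (high - low) * n)
            indices.append(min(idx, n - 1))
        counts[name] = Counter(indices)
    return counts

# Hand-written values for illustration only; not actual lhs_sample output.
bounds = {'rate': (0.0, 1.0), 'decay': (0.0, 10.0)}
samples = [
    {'rate': 0.10, 'decay': 7.5},
    {'rate': 0.45, 'decay': 1.2},
    {'rate': 0.90, 'decay': 4.0},
]
counts = stratum_counts(samples, bounds)
# A Latin design hits every stratum exactly once in every dimension.
is_latin = all(
    len(c) == len(samples) and set(c.values()) == {1}
    for c in counts.values()
)
print(is_latin)
```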

Return type:

list of dict

Raises:
  • ValueError – If n_samples is not a positive integer, or if any parameter bounds are invalid (e.g., min_val >= max_val).

  • TypeError – If param_bounds is not a dictionary or contains non-numeric bounds.
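A validation routine consistent with the exceptions listed above might look like the following. It is a sketch only: `check_lhs_inputs` is a hypothetical name, the order of the checks is an assumption, and the library's real validation may differ.

```python
from numbers import Real

def check_lhs_inputs(param_bounds, n_samples):
    """Hypothetical validator; not invode's actual implementation."""
    if not isinstance(param_bounds, dict):
        raise TypeError("param_bounds must be a dictionary")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(n_samples, bool) or not isinstance(n_samples, int) or n_samples <= 0:
        raise ValueError("n_samples must be a positive integer")
    # Assumes every value is a 2-item (min_val, max_val) sequence.
    for name, (low, high) in param_bounds.items():
        if not isinstance(low, Real) or not isinstance(high, Real):
            raise TypeError(f"bounds for {name!r} must be numeric")
        if low >= high:
            raise ValueError(f"bounds for {name!r} need min_val < max_val")

# Valid input passes silently.
check_lhs_inputs({'k1': (0.1, 10.0), 'alpha': (-1, 1)}, n_samples=5)

# Reversed bounds are rejected.
try:
    check_lhs_inputs({'k1': (10.0, 0.1)}, n_samples=5)
except ValueError as exc:
    print(f"rejected: {exc}")
```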

Notes

The function uses SciPy’s quasi-Monte Carlo (qmc) module for LHS generation, which provides high-quality space-filling sequences. The sampling process involves three steps:

  1. Generate unit hypercube samples using LHS in [0,1]^d

  2. Scale samples to the specified parameter bounds

  3. Convert arrays back to parameter dictionaries
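The three steps can be sketched with `scipy.stats.qmc` as follows. This is an illustration under stated assumptions rather than the library's source: the name `lhs_sample_sketch` is hypothetical, input validation is omitted, and the installed SciPy version is assumed to accept the `seed` keyword.

```python
from scipy.stats import qmc

def lhs_sample_sketch(param_bounds, n_samples, seed=None):
    """Illustrative three-step LHS pipeline; not invode's actual code."""
    names = list(param_bounds)
    lower = [param_bounds[name][0] for name in names]
    upper = [param_bounds[name][1] for name in names]

    # Step 1: LHS points in the unit hypercube, shape (n_samples, d).
    sampler = qmc.LatinHypercube(d=len(names), seed=seed)
    unit = sampler.random(n=n_samples)

    # Step 2: rescale each column to its (min_val, max_val) range.
    scaled = qmc.scale(unit, lower, upper)

    # Step 3: one dict per row, with plain Python floats as values.
    return [dict(zip(names, map(float, row))) for row in scaled]

samples = lhs_sample_sketch({'k1': (0.1, 10.0), 'k2': (0.01, 1.0)}, n_samples=4, seed=42)
for sample in samples:
    print(sample)
```

Converting each value with `float` is a design choice of this sketch: it keeps the returned dictionaries free of NumPy scalar types, which makes them easier to serialize or compare.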

The Latin Hypercube design ensures that:

  • The marginal distribution of each parameter is uniform over its bounds

  • No two samples share the same stratum in any single dimension

  • The samples collectively provide good coverage of the parameter space

  • Strata are paired at random across dimensions, which keeps spurious correlation between parameters low (plain LHS does not explicitly minimize it)

Examples

Basic usage with two parameters:

>>> from invode.sampling import lhs_sample
>>> bounds = {'rate': (0.1, 1.0), 'decay': (0.01, 0.1)}
>>> samples = lhs_sample(bounds, n_samples=5, seed=42)
>>> len(samples)
5
>>> samples[0].keys()
dict_keys(['rate', 'decay'])
>>> all(0.1 <= s['rate'] <= 1.0 for s in samples)
True

High-dimensional parameter space:

>>> param_bounds = {
...     'k1': (0.1, 10.0),
...     'k2': (0.01, 1.0),
...     'k3': (-5.0, 5.0),
...     'alpha': (0.0, 1.0),
...     'beta': (1e-6, 1e-3)
... }
>>> samples = lhs_sample(param_bounds, n_samples=100, seed=123)
>>> print(f"Generated {len(samples)} samples in {len(param_bounds)}D space")
Generated 100 samples in 5D space

Reproducible sampling for debugging:

>>> # Same seed produces identical samples
>>> samples1 = lhs_sample({'x': (0, 1)}, n_samples=3, seed=42)
>>> samples2 = lhs_sample({'x': (0, 1)}, n_samples=3, seed=42)
>>> samples1 == samples2
True

Integration with optimization loops:

>>> import numpy as np
>>>
>>> def objective_function(params):
...     return (params['x'] - 0.5)**2 + (params['y'] - 0.3)**2
...
>>> bounds = {'x': (0, 1), 'y': (0, 1)}
>>> candidates = lhs_sample(bounds, n_samples=50, seed=42)
>>> # Evaluate every candidate and keep the one with the lowest error
>>> errors = [objective_function(params) for params in candidates]
>>> best_idx = np.argmin(errors)
>>> best_params = candidates[best_idx]
>>> print(f"Best parameters: {best_params}")

Comparison with Random Sampling

LHS covers each axis more evenly than pure random sampling, which tends to leave gaps and clusters. Plotting both side by side makes the difference visible:

>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>>
>>> # Generate LHS samples
>>> lhs_samples = lhs_sample({'x': (0, 1), 'y': (0, 1)}, n_samples=20, seed=42)
>>> lhs_x = [s['x'] for s in lhs_samples]
>>> lhs_y = [s['y'] for s in lhs_samples]
>>>
>>> # Generate random samples for comparison
>>> np.random.seed(42)
>>> rand_x = np.random.uniform(0, 1, 20)
>>> rand_y = np.random.uniform(0, 1, 20)
>>>
>>> # Plot comparison
>>> fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
>>> ax1.scatter(lhs_x, lhs_y, alpha=0.7)
>>> ax1.set_title('Latin Hypercube Sampling')
>>> ax1.grid(True, alpha=0.3)
>>>
>>> ax2.scatter(rand_x, rand_y, alpha=0.7)
>>> ax2.set_title('Random Sampling')
>>> ax2.grid(True, alpha=0.3)
>>> plt.show()
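The visual comparison can be backed by a simple count of how many equal-width bins along one axis contain at least one point. The sketch below uses only the standard library and builds a single LHS axis by hand, so it does not call `lhs_sample`; the helper `occupied_bins` is hypothetical.

```python
import random

def occupied_bins(values, n_bins):
    """Number of equal-width bins of [0, 1) containing at least one value."""
    return len({min(int(v * n_bins), n_bins - 1) for v in values})

n = 20
rng = random.Random(42)

# One axis of an LHS design: one draw per stratum, then shuffled.
lhs_axis = [(i + rng.random()) / n for i in range(n)]
rng.shuffle(lhs_axis)

# Independent uniform draws for comparison.
rand_axis = [rng.random() for _ in range(n)]

print("LHS bins occupied:   ", occupied_bins(lhs_axis, n))
print("Random bins occupied:", occupied_bins(rand_axis, n))
```

By construction the LHS axis occupies all 20 bins, while independent uniform draws usually leave some bins empty.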

Performance Characteristics

  • Time Complexity: O(n_samples * d) where d is the number of parameters

  • Space Complexity: O(n_samples * d) for storing the sample matrix

  • Quality: For the same number of samples, each parameter's values are spread more evenly over its range than with independent random sampling

  • Scalability: Efficient for high-dimensional spaces (tested up to 100+ dimensions)

See also

scipy.stats.qmc.LatinHypercube – The underlying LHS generator

scipy.stats.qmc.scale – Function for scaling unit samples to custom bounds

numpy.random.uniform – Alternative random sampling approach
