# sampling.p
import numpy as np
from scipy.stats import qmc
[docs]
def lhs_sample(param_bounds, n_samples, seed=None):
"""
Generate Latin Hypercube Samples for parameter space exploration.
This function creates a set of well-distributed parameter samples using Latin
Hypercube Sampling (LHS), a stratified sampling technique that ensures good
coverage of the parameter space. LHS divides each parameter dimension into
equally probable intervals and samples exactly once from each interval,
providing better space-filling properties than random sampling.
Latin Hypercube Sampling is particularly effective for:
- High-dimensional parameter spaces where uniform coverage is important
- Expensive function evaluations where sample efficiency matters
- Situations requiring reproducible sampling with controlled randomness
- Initial exploration phases of optimization algorithms
Parameters
----------
param_bounds : dict
Dictionary mapping parameter names to their bounds. Each key should be
a string parameter name, and each value should be a tuple (min_val, max_val)
defining the lower and upper bounds for that parameter.
Example: ``{'k1': (0.1, 10.0), 'k2': (0.01, 1.0), 'alpha': (-1, 1)}``
n_samples : int
Number of parameter samples to generate. Must be a positive integer.
Each sample will contain values for all parameters specified in param_bounds.
seed : int, optional
Random seed for reproducible sampling. If None, the sampling will be
non-deterministic. Using the same seed with identical inputs guarantees
identical sample sets, which is useful for debugging and reproducible
research. Default is None.
Returns
-------
list of dict
A list containing n_samples dictionaries, where each dictionary represents
one parameter sample. Each dictionary has the same keys as param_bounds,
with values sampled from the corresponding parameter ranges using LHS.
The returned samples have the following properties:
- Each parameter dimension is divided into n_samples equally probable strata
- Exactly one sample is drawn from each stratum in each dimension
- Samples are randomly permuted to avoid correlation between dimensions
- All parameter values are within their specified bounds
Raises
------
ValueError
If n_samples is not a positive integer, or if any parameter bounds are
invalid (e.g., min_val >= max_val).
TypeError
If param_bounds is not a dictionary or contains non-numeric bounds.
Notes
-----
The function uses SciPy's quasi-Monte Carlo (qmc) module for LHS generation,
which provides high-quality space-filling sequences. The sampling process
involves three steps:
1. Generate unit hypercube samples using LHS in [0,1]^d
2. Scale samples to the specified parameter bounds
3. Convert arrays back to parameter dictionaries
The Latin Hypercube design ensures that:
- The marginal distribution of each parameter is uniform over its bounds
- No two samples share the same stratum in any single dimension
- The samples collectively provide good coverage of the parameter space
- Correlation between different parameter dimensions is minimized
Examples
--------
Basic usage with two parameters:
>>> bounds = {'rate': (0.1, 1.0), 'decay': (0.01, 0.1)}
>>> samples = lhs_sample(bounds, n_samples=5, seed=42)
>>> len(samples)
5
>>> samples[0].keys()
dict_keys(['rate', 'decay'])
>>> all(0.1 <= s['rate'] <= 1.0 for s in samples)
True
High-dimensional parameter space:
>>> param_bounds = {
... 'k1': (0.1, 10.0),
... 'k2': (0.01, 1.0),
... 'k3': (-5.0, 5.0),
... 'alpha': (0.0, 1.0),
... 'beta': (1e-6, 1e-3)
... }
>>> samples = lhs_sample(param_bounds, n_samples=100, seed=123)
>>> print(f"Generated {len(samples)} samples in {len(param_bounds)}D space")
Generated 100 samples in 5D space
Reproducible sampling for debugging:
>>> # Same seed produces identical samples
>>> samples1 = lhs_sample({'x': (0, 1)}, n_samples=3, seed=42)
>>> samples2 = lhs_sample({'x': (0, 1)}, n_samples=3, seed=42)
>>> samples1 == samples2
True
Integration with optimization loops:
>>> def objective_function(params):
... return (params['x'] - 0.5)**2 + (params['y'] - 0.3)**2
>>>
>>> bounds = {'x': (0, 1), 'y': (0, 1)}
>>> candidates = lhs_sample(bounds, n_samples=50, seed=42)
>>> errors = [objective_function(params) for params in candidates]
>>> best_idx = np.argmin(errors)
>>> best_params = candidates[best_idx]
>>> print(f"Best parameters: {best_params}")
Comparison with Random Sampling
-------------------------------
LHS provides better space coverage than pure random sampling:
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>>
>>> # Generate LHS samples
>>> lhs_samples = lhs_sample({'x': (0, 1), 'y': (0, 1)}, n_samples=20, seed=42)
>>> lhs_x = [s['x'] for s in lhs_samples]
>>> lhs_y = [s['y'] for s in lhs_samples]
>>>
>>> # Generate random samples for comparison
>>> np.random.seed(42)
>>> rand_x = np.random.uniform(0, 1, 20)
>>> rand_y = np.random.uniform(0, 1, 20)
>>>
>>> # Plot comparison
>>> fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
>>> ax1.scatter(lhs_x, lhs_y, alpha=0.7)
>>> ax1.set_title('Latin Hypercube Sampling')
>>> ax1.grid(True, alpha=0.3)
>>>
>>> ax2.scatter(rand_x, rand_y, alpha=0.7)
>>> ax2.set_title('Random Sampling')
>>> ax2.grid(True, alpha=0.3)
>>> plt.show()
Performance Characteristics
---------------------------
- **Time Complexity**: O(n_samples * d) where d is the number of parameters
- **Space Complexity**: O(n_samples * d) for storing the sample matrix
- **Quality**: Provides better uniformity than random sampling for the same
number of samples
- **Scalability**: Efficient for high-dimensional spaces (tested up to 100+ dimensions)
See Also
--------
scipy.stats.qmc.LatinHypercube : The underlying LHS generator
scipy.stats.qmc.scale : Function for scaling unit samples to custom bounds
numpy.random.uniform : Alternative random sampling approach
References
----------
.. [1] McKay, Michael D., Richard J. Beckman, and William J. Conover. "A comparison of three
methods for selecting values of input variables in the analysis of output from a computer code." Technometrics 42.1 (2000): 55-61.
.. [2] Stein, M. (1987). "Large sample properties of simulations using Latin
hypercube sampling." Technometrics, 29(2), 143-151.
"""
keys = list(param_bounds.keys())
bounds = np.array([param_bounds[k] for k in keys]) # shape: (n_params, 2)
sampler = qmc.LatinHypercube(d=len(keys), seed=seed)
unit_samples = sampler.random(n=n_samples)
scaled_samples = qmc.scale(unit_samples, bounds[:, 0], bounds[:, 1])
return [dict(zip(keys, sample)) for sample in scaled_samples]