gp_api package
Subpackages
Submodules
gp_api.gaussian_process module
Provides the GaussianProcess class
- class gp_api.gaussian_process.GaussianProcess(x, y, LL, predictor, kernel, train_err=1e-10, hypercube_rescale=False, param_names=None, metadata=None)
Bases: object
Fits and predicts a generic GaussianProcess
- LLy_cholesky(y)
Generate a predictor using the cholesky factor with object attributes
- classmethod available_labels(fname)
Get all labels in the specified file
Each of these should be a valid label to pass to the load method.
TODO: Needs to be implemented.
- static compute_LLy_cholesky(LL, y, sparse=True)
Generate a predictor by evaluating the cholesky factor on training data
- equiv(other)
Implementation detail-agnostic equality check
Compares to other, returning True if they are equivalent. Unlike __eq__, this does not consider differences in backend choices.
- classmethod fit(x, y, kernel=None, hypercube_rescale=False, param_names=None, metadata=None, train_err=None)
Fit a GP to training data and return the fitted object.
- classmethod load(filename, label=None)
Deserialize from an HDF5 file
- Parameters
filename (str) – Location of fit file
label (str) – Name of fit within file
- classmethod load_all(fname, max_workers=1)
Loads all GP objects in the specified file
Returns a dict mapping each label to the associated GP.
Can be run in parallel by setting max_workers larger than 1.
- mean(x_sample)
Computes the GP mean at each sample point
See page 88 of Rasmussen & Williams, Gaussian Processes for Machine Learning
- Parameters
x_sample (array like, shape = (n_pts,)) – Samples for evaluation under the kernel
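The posterior mean can be sketched with plain NumPy as below. This is a minimal illustration of the standard Cholesky-based recipe, not the gp_api implementation: the squared-exponential kernel rbf_kernel, the helper gp_mean, and the treatment of train_err as a diagonal noise term are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between 1-D point sets (assumed kernel)."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(1, -1)
    return np.exp(-0.5 * ((a - b) / length_scale) ** 2)

def gp_mean(x_train, y_train, x_sample, train_err=1e-10):
    """Posterior mean of a zero-mean GP at x_sample (hypothetical helper)."""
    # Training covariance with a small diagonal term for numerical stability
    K = rbf_kernel(x_train, x_train) + train_err * np.eye(len(x_train))
    L = np.linalg.cholesky(K)  # K = L @ L.T
    # Solve L L^T alpha = y with one solve per Cholesky factor
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    # Mean is the cross-covariance applied to alpha
    return rbf_kernel(x_sample, x_train) @ alpha

x_train = np.array([0.0, 1.0, 2.0])
y_train = np.sin(x_train)
mean_at_train = gp_mean(x_train, y_train, x_train)
```

With a tiny train_err, the mean evaluated at the training points reproduces the training values almost exactly, which is a quick sanity check for any implementation.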
- rvs(n_samples, x_sample, y_std=None, random_state=None)
Draws GP samples at each sample point
Takes n_samples draws from this Gaussian process, evaluated at each point in x_sample.
- sample_density(limits, n_sample, n_uniform, random_state=None)
Generate random samples, treating the GP as a density function
- Parameters
limits (array like, shape = (dim, 2)) – List of [min,max] pairs for each dimension
n_sample (int) – number of samples to draw from density
n_uniform (int) – number of samples used initially for potential samples
random_state (numpy.random.RandomState object) – supports random number generation
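One plausible reading of the n_uniform and n_sample parameters is a two-stage scheme: draw n_uniform candidate points uniformly inside limits, then resample n_sample of them in proportion to the density. The sketch below illustrates that idea under this assumption; sample_from_density and the density callable are hypothetical names, and the actual method may work differently.

```python
import numpy as np

def sample_from_density(density, limits, n_sample, n_uniform, random_state=None):
    """Draw n_sample points with probability proportional to density (hypothetical helper)."""
    rng = random_state if random_state is not None else np.random.RandomState()
    limits = np.asarray(limits, dtype=float)  # shape (dim, 2)
    lo, hi = limits[:, 0], limits[:, 1]
    # Stage 1: candidate points spread uniformly over the box
    candidates = rng.uniform(lo, hi, size=(n_uniform, len(limits)))
    # Stage 2: weight each candidate by its (non-negative) density value
    weights = np.clip(density(candidates), 0.0, None)
    weights = weights / weights.sum()
    # Stage 3: resample candidates in proportion to their weights
    picks = rng.choice(n_uniform, size=n_sample, replace=True, p=weights)
    return candidates[picks]

limits = [[-3.0, 3.0], [0.0, 1.0]]
density = lambda x: np.exp(-0.5 * x[:, 0] ** 2)  # toy density, for illustration only
samples = sample_from_density(density, limits, 100, 5000, np.random.RandomState(0))
```

Because the draws are taken from a finite candidate pool, n_uniform should be much larger than n_sample to limit repeated points.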
- save(filename, store=None, force=False, label=None)
Serialize to an HDF5 file
- Parameters
filename (str) – The location for saving the fit
store (unknown type, optional) – Save data as well as options?
force (bool, optional) – Overwrites saved file if enabled
label (str, optional) – Option for more than one fit to be stored in the same file
- store_options = frozenset({'predictor', 'x'})
- variance(x_sample)
Computes the GP variance at each sample point, given the training data
- Parameters
x_sample (array like, shape = (n_pts,)) – Samples for evaluation under the kernel
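The posterior variance can be sketched in the same Cholesky-based style. As before, this is an illustration under assumptions rather than the gp_api code: rbf_kernel, gp_variance, and the use of train_err as a diagonal term are hypothetical choices for the example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between 1-D point sets (assumed kernel)."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(1, -1)
    return np.exp(-0.5 * ((a - b) / length_scale) ** 2)

def gp_variance(x_train, x_sample, train_err=1e-10):
    """Posterior variance of a GP at each point of x_sample (hypothetical helper)."""
    x_sample = np.asarray(x_sample, dtype=float)
    K = rbf_kernel(x_train, x_train) + train_err * np.eye(len(x_train))
    L = np.linalg.cholesky(K)  # K = L @ L.T
    # v has shape (n_train, n_pts); each column is L^{-1} k(x_train, x)
    v = np.linalg.solve(L, rbf_kernel(x_train, x_sample))
    prior_var = np.ones(len(x_sample))  # k(x, x) = 1 for this kernel
    # Variance shrinks from the prior by the information in the training data
    return prior_var - np.sum(v ** 2, axis=0)

x_train = np.array([0.0, 1.0, 2.0])
var = gp_variance(x_train, np.array([0.0, 100.0]))
```

The variance is close to zero at a training point and returns to the prior value far from the training data.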
gp_api.marginals module
Fit the marginals of a set of samples
- class gp_api.marginals.Marginal(x_sample, limits, weights=None, verbose=False)
Bases: object
Represents marginal distributions
- fit_hist1d(bins, index, grab_edge=False)
Fit a 1d histogram of the marginal
- Parameters
bins (int) – Number of bins to use
index (int) – Axis along which to make the histogram
grab_edge (bool, optional) – Special bins for edges. If enabled, ensures the histogram has bins hanging halfway off the sample space. Since we pruned samples outside of our limits, this means our edge-bins are effectively half as large, but allows us to compute the histogram with uniform bin widths.
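The grab_edge layout described above can be sketched with NumPy as follows. This assumes that bins counts the edge bins and that the first and last bin centres sit exactly on the limits; grab_edge_bin_edges is a hypothetical helper, not part of gp_api.

```python
import numpy as np

def grab_edge_bin_edges(lo, hi, bins):
    """Uniform bin edges whose first and last bins are centred on lo and hi."""
    dx = (hi - lo) / (bins - 1)  # bin width when the centres span [lo, hi]
    # Shift the outer edges half a bin width beyond the limits
    return np.linspace(lo - 0.5 * dx, hi + 0.5 * dx, bins + 1)

edges = grab_edge_bin_edges(0.0, 1.0, bins=5)
centers = 0.5 * (edges[:-1] + edges[1:])

# Samples are confined to [0, 1], so only half of each edge bin can be populated
samples = np.random.RandomState(0).uniform(0.0, 1.0, size=1000)
counts, _ = np.histogram(samples, bins=edges)
```

All bins share the same width, but the two edge bins overlap the sample space by only half of it, so their raw counts need a correction before being read as densities.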
- fit_hist2d(bins, index, jndex, grab_edge=False)
Fit a 2d histogram of the marginal
- Parameters
index (int) – Axis along which to make the histogram’s first dimension
jndex (int) – Axis along which to make the histogram’s second dimension
bins (int) – Number of bins to use
grab_edge (bool, optional) – Special bins for edges. If enabled, ensures the histogram has bins hanging halfway off the sample space. Since we pruned samples outside of our limits, this means our edge-bins are effectively half as large, but allows us to compute the histogram with uniform bin widths.
- fit_histdd(indices, bins, grab_edge=False)
Fit a high-dimensional histogram of the distribution
- Parameters
indices (list of int) – Axes along which to make each of the histogram’s dimensions
bins (int) – Number of bins to use
grab_edge (bool, optional) – Special bins for edges. If enabled, ensures the histogram has bins hanging halfway off the sample space. Since we pruned samples outside of our limits, this means our edge-bins are effectively half as large, but allows us to compute the histogram with uniform bin widths.
Warning: It’s not advised to use more than three dimensions.
- fit_marginal(indices=None, ks_threshold=0.001, grab_edge=False, min_bins=2, max_bins=20, **fit_kwargs)
Fit marginal for one dimension
Minimizes error with automatic binning
- fit_marginal_methods(indices, bins=None, mode='like', min_bins=7, max_bins=20, **kwargs)
Different search modes for marginal fits in 2D
- joint_training_goodness(x_train_1, y_train_1, x_train_2, y_train_2, **fit_kwargs)
Evaluate the goodness of fit for two sets of training data
- Parameters
x_train_1 (array like, shape = (npts1, dim)) – First set of training samples
y_train_1 (array like, shape = (npts1,)) – First set of training values
x_train_2 (array like, shape = (npts2, dim)) – Second set of training samples
y_train_2 (array like, shape = (npts2,)) – Second set of training values
- multifit_marginal1d(ks_threshold=0.001, grab_edge=False, max_bins=20, **fit_kwargs)
- multifit_marginal1d2d(ks_threshold=0.001, grab_edge=False, max_bins=20, **fit_kwargs)
- multifit_marginal2d(ks_threshold=0.001, grab_edge=False, max_bins=20, **fit_kwargs)
- gp_api.marginals.bin_combination_seeds(ndim, max_bins)
Lists all possible starting bin configurations
- gp_api.marginals.grab_edge_factor(x_train)
Grab the factors for edge values
- gp_api.marginals.grab_edge_ind(x_train)
Grab the indices of each training point on an edge
- gp_api.marginals.min_bins_for_grab_edge(ndim)
Verifies more area is inside the box than outside
Specifically, checks that:
\[(n+1)/n < 2^{1/k}\]
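The inequality is equivalent to ((n+1)/n)^k < 2. The source does not define n and k; a natural reading, assumed here, is that n is the number of bin widths spanning the box in each dimension and k is the number of dimensions. The sketch below finds the smallest n satisfying the inequality for a given k; smallest_n is a hypothetical helper and may not match the value the library function returns.

```python
def smallest_n(k):
    """Smallest positive integer n with (n + 1) / n < 2 ** (1 / k)."""
    n = 1
    # Increase n until the ratio drops strictly below the threshold
    while (n + 1) / n >= 2 ** (1.0 / k):
        n += 1
    return n

minimums = [smallest_n(k) for k in (1, 2, 3)]
```

For k = 1 the condition reduces to (n+1)/n < 2, which first holds at n = 2.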
- gp_api.marginals.prune_samples(x_sample, limits)
Prune samples not within limits
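Pruning to a box can be expressed as a boolean mask. The sketch below is illustrative only: prune_to_limits is a hypothetical helper, and treating both bounds as inclusive is an assumption, since the source does not say how points exactly on a limit are handled.

```python
import numpy as np

def prune_to_limits(x_sample, limits):
    """Keep only the rows of x_sample that lie inside limits (bounds inclusive)."""
    x_sample = np.asarray(x_sample, dtype=float)  # shape (n_pts, dim)
    limits = np.asarray(limits, dtype=float)      # shape (dim, 2)
    # A point survives only if every coordinate is within its [min, max] pair
    inside = np.all(
        (x_sample >= limits[:, 0]) & (x_sample <= limits[:, 1]), axis=1
    )
    return x_sample[inside]

x = [[0.5, 0.5], [1.5, 0.5], [0.2, -0.1], [0.0, 1.0]]
kept = prune_to_limits(x, [[0.0, 1.0], [0.0, 1.0]])
```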
- gp_api.marginals.update_edge_centers(x_train, dx)
Update the edge centers
- gp_api.marginals.weighted_histogram_density_error(raw_hist_output, N, dx)
Estimate error values