gp_api package

Subpackages

Submodules

gp_api.gaussian_process module

Provides the GaussianProcess class

class gp_api.gaussian_process.GaussianProcess(x, y, LL, predictor, kernel, train_err=1e-10, hypercube_rescale=False, param_names=None, metadata=None)

Bases: object

Fits a generic Gaussian process and predicts with it

LLy_cholesky(y)

Generate a predictor for y using the Cholesky factor stored in the object's attributes

classmethod available_labels(fname)

Get all labels in the specified file

Each of these should be a valid label to pass to the load method.

TODO: Needs to be implemented.

static compute_LLy_cholesky(LL, y, sparse=True)

Generate a predictor by evaluating the Cholesky factor on training data
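As a rough illustration of what a Cholesky-based predictor looks like, the sketch below follows the standard construction (Algorithm 2.1 in Rasmussen & Williams): factor the training covariance as L Lᵀ, then solve for alpha = K⁻¹ y. It is not gp_api's implementation. The squared-exponential kernel, the helper names sq_exp_kernel and cholesky_predictor, and the dense (non-sparse) solve are all assumptions made for the example.

```python
import numpy as np

def sq_exp_kernel(xa, xb, length=0.3):
    """Squared-exponential kernel for 1-D inputs (illustrative choice only)."""
    diff = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (diff / length) ** 2)

def cholesky_predictor(x, y, train_err=1e-10):
    """Hypothetical helper: return (L, alpha) with L @ L.T = K + train_err * I."""
    K = sq_exp_kernel(x, x) + train_err * np.eye(len(x))
    L = np.linalg.cholesky(K)        # lower-triangular Cholesky factor
    z = np.linalg.solve(L, y)        # solve L z = y
    alpha = np.linalg.solve(L.T, z)  # solve L^T alpha = z, so alpha = K^{-1} y
    return L, alpha

x_train = np.linspace(0.0, 1.0, 5)
y_train = np.sin(2.0 * np.pi * x_train)
L, alpha = cholesky_predictor(x_train, y_train)

# With alpha in hand, a mean evaluation is a single matrix-vector product.
x_new = np.array([0.1, 0.6])
mean_new = sq_exp_kernel(x_new, x_train) @ alpha
```

Computing alpha once and reusing it is what makes repeated mean evaluations cheap: only the cross-covariance between the new points and the training points has to be recomputed.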

equiv(other)

Implementation detail-agnostic equality check

Compares to other, returning True if they are equivalent. Unlike __eq__, this does not consider differences in backend choices.

classmethod fit(x, y, kernel=None, hypercube_rescale=False, param_names=None, metadata=None, train_err=None)

Fit a GP to training data and return it.

classmethod load(filename, label=None)

Deserialize from an HDF5 file

Parameters
  • filename (str) – Location of fit file

  • label (str) – Name of fit within file

classmethod load_all(fname, max_workers=1)

Loads all GP objects in the specified file

Returns a dict mapping each label to the associated GP.

Can be run in parallel by setting max_workers larger than 1.

mean(x_sample)

Computes the GP mean at each sample point

See page 88 of Rasmussen & Williams, Gaussian Processes for Machine Learning

Parameters

x_sample (array like, shape = (n_pts,)) – Samples for evaluation under the kernel

rvs(n_samples, x_sample, y_std=None, random_state=None)

Draws GP samples at each sample point

Takes n_samples draws from this Gaussian process, evaluated at each point in x_sample.
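A minimal sketch of how such draws can be generated, assuming the mean vector and covariance matrix at x_sample are already known. The helper draw_gaussian, its jitter argument, and the squared-exponential covariance are hypothetical and are not part of gp_api. The y_std argument of rvs is not modeled here.

```python
import numpy as np

def draw_gaussian(mean, cov, n_samples, random_state=None, jitter=1e-10):
    """Hypothetical helper: draw rows from N(mean, cov) via a Cholesky factor."""
    rng = np.random.default_rng(random_state)
    n_pts = len(mean)
    # A small jitter on the diagonal keeps the factorization numerically stable.
    L = np.linalg.cholesky(cov + jitter * np.eye(n_pts))
    z = rng.standard_normal((n_samples, n_pts))  # independent unit normals
    return mean + z @ L.T                        # each row has covariance L @ L.T

x_sample = np.linspace(0.0, 1.0, 4)
diff = x_sample[:, None] - x_sample[None, :]
cov = np.exp(-0.5 * (diff / 0.3) ** 2)   # illustrative covariance at x_sample
mean = np.zeros_like(x_sample)

# One row per draw, one column per point in x_sample.
draws = draw_gaussian(mean, cov, n_samples=3, random_state=0)
```

Passing the same random_state reproduces the same draws, which is the usual reason for exposing that argument.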

save(filename, store=None, force=False, label=None)

Serialize to an HDF5 file

Parameters
  • filename (str) – The location for saving the fit

  • store (unknown type, optional) – Save data as well as options?

  • force (bool, optional) – Overwrites saved file if enabled

  • label (str, optional) – Option for more than one fit to be stored in the same file

store_options = frozenset({'predictor', 'x'})
variance(x_sample)

Computes the GP variance at each sample point, given the training data

Parameters

x_sample (array like, shape = (n_pts,)) – Samples for evaluation under the kernel
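For orientation, the textbook pointwise variance (Algorithm 2.1 in Rasmussen & Williams) is k(x*, x*) − vᵀv with v solving L v = k(x, x*). The sketch below implements that formula with a squared-exponential kernel. The kernel choice and the helper names are assumptions for illustration, and gp_api may compute the variance differently.

```python
import numpy as np

def sq_exp_kernel(xa, xb, length=0.3):
    """Squared-exponential kernel for 1-D inputs (illustrative choice only)."""
    diff = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (diff / length) ** 2)

def gp_variance(x_train, x_sample, train_err=1e-10):
    """Hypothetical helper: pointwise posterior variance at x_sample."""
    K = sq_exp_kernel(x_train, x_train) + train_err * np.eye(len(x_train))
    L = np.linalg.cholesky(K)
    k_star = sq_exp_kernel(x_train, x_sample)  # shape (n_train, n_pts)
    v = np.linalg.solve(L, k_star)             # solve L v = k_star
    prior_var = np.ones(len(x_sample))         # k(x, x) = 1 for this kernel
    return prior_var - np.sum(v ** 2, axis=0)

x_train = np.linspace(0.0, 1.0, 5)
var_at_train = gp_variance(x_train, x_train)           # close to zero
var_far_away = gp_variance(x_train, np.array([5.0]))   # close to the prior variance
```

The two evaluations show the expected behavior: the variance collapses at the training points and returns to the prior value far away from them.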

gp_api.marginals module

Fit the marginals of a set of samples

class gp_api.marginals.Marginal(x_sample, limits, weights=None, verbose=False)

Bases: object

Represents marginal distributions

fit_hist1d(bins, index, grab_edge=False)

Fit a 1d histogram of the marginal

Parameters
  • bins (int) – Number of bins to use

  • index (int) – Axis along which to make the histogram

  • grab_edge (bool, optional) – Special bins for edges. If enabled, the histogram is built with edge bins that hang halfway off the sample space. Because samples outside the limits are pruned, the edge bins are effectively half as large, but the histogram can be computed with uniform bin widths.
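One way to picture the grab_edge layout is to center the first and last bins on the lower and upper limits, so that half of each edge bin lies outside the sample space. The sketch below builds such edges with NumPy. The helper grab_edge_bin_edges and the convention that bins counts the two edge bins are assumptions, not gp_api's documented behavior.

```python
import numpy as np

def grab_edge_bin_edges(low, high, bins):
    """Hypothetical helper: uniform edges with the end bins centered on the limits."""
    if bins < 2:
        raise ValueError("need at least two bins to center one on each limit")
    dx = (high - low) / (bins - 1)   # spacing between neighboring bin centers
    return np.linspace(low - dx / 2.0, high + dx / 2.0, bins + 1)

rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=1000)   # already inside the limits [0, 1]

edges = grab_edge_bin_edges(0.0, 1.0, bins=5)
counts, _ = np.histogram(samples, bins=edges)
centers = 0.5 * (edges[:-1] + edges[1:])
```

Every bin has the same width, but only half of each end bin overlaps [0, 1], so those two bins collect samples from half the interval that an interior bin covers.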

fit_hist2d(bins, index, jndex, grab_edge=False)

Fit a 2d histogram of the marginal

Parameters
  • index (int) – Axis along which to make the histogram’s first dimension

  • jndex (int) – Axis along which to make the histogram’s second dimension

  • bins (int) – Number of bins to use

  • grab_edge (bool, optional) – Special bins for edges. If enabled, the histogram is built with edge bins that hang halfway off the sample space. Because samples outside the limits are pruned, the edge bins are effectively half as large, but the histogram can be computed with uniform bin widths.

fit_histdd(indices, bins, grab_edge=False)

Fit a high-dimensional histogram of the distribution

Parameters
  • indices (list of int) – Axes along which to make each of the histogram’s dimensions

  • bins (int) – Number of bins to use

  • grab_edge (bool, optional) – Special bins for edges. If enabled, the histogram is built with edge bins that hang halfway off the sample space. Because samples outside the limits are pruned, the edge bins are effectively half as large, but the histogram can be computed with uniform bin widths.

Warning: It’s not advised to use more than three dimensions.

fit_marginal(indices=None, ks_threshold=0.001, grab_edge=False, min_bins=2, max_bins=20, **fit_kwargs)

Fit marginal for one dimension

Minimizes error with automatic binning

fit_marginal_methods(indices, bins=None, mode='like', min_bins=7, max_bins=20, **kwargs)

Different search modes for marginal fits in 2D

joint_training_goodness(x_train_1, y_train_1, x_train_2, y_train_2, **fit_kwargs)

Evaluate the goodness of fit for two sets of training data

Parameters
  • x_train_1 (array like, shape = (npts1, dim)) – First set of training samples

  • y_train_1 (array like, shape = (npts1,)) – First set of training values

  • x_train_2 (array like, shape = (npts2, dim)) – Second set of training samples

  • y_train_2 (array like, shape = (npts2,)) – Second set of training values

multifit_marginal1d(ks_threshold=0.001, grab_edge=False, max_bins=20, **fit_kwargs)
multifit_marginal1d2d(ks_threshold=0.001, grab_edge=False, max_bins=20, **fit_kwargs)
multifit_marginal2d(ks_threshold=0.001, grab_edge=False, max_bins=20, **fit_kwargs)
gp_api.marginals.bin_combination_seeds(ndim, max_bins)

Lists all possible starting bin configurations

gp_api.marginals.grab_edge_factor(x_train)

Grab the factors for edge values

gp_api.marginals.grab_edge_ind(x_train)

Grab the indices of each training point on an edge

gp_api.marginals.min_bins_for_grab_edge(ndim)

Verifies more area is inside the box than outside

Specifically, checks that:

\[(n+1)/n < 2^{1/k}\]
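Raising both sides to the k-th power gives the equivalent form ((n+1)/n)^k < 2. The sketch below searches for the smallest n that satisfies the inequality. Reading n as a per-axis bin count and k as ndim is an assumption, since the symbols are not defined here, and smallest_bin_count is a hypothetical helper rather than the library function.

```python
def smallest_bin_count(ndim, max_bins=1000):
    """Hypothetical helper: smallest n with (n + 1) / n < 2 ** (1 / ndim)."""
    bound = 2.0 ** (1.0 / ndim)
    for n in range(1, max_bins + 1):
        if (n + 1) / n < bound:
            return n
    raise ValueError("no bin count up to max_bins satisfies the inequality")

# Smallest admissible n for one, two, and three dimensions.
counts = {k: smallest_bin_count(k) for k in (1, 2, 3)}
print(counts)  # {1: 2, 2: 3, 3: 4}
```

The bound tightens as the dimension grows because 2^{1/k} approaches 1, so more bins per axis are needed before the inequality holds.
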
gp_api.marginals.prune_samples(x_sample, limits)

Prune samples not within limits
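A minimal sketch of this kind of pruning, assuming x_sample has shape (n_pts, dim), limits has shape (dim, 2) holding a lower and an upper bound per axis, and both bounds are inclusive. None of these conventions are confirmed by the reference, and prune_to_limits is a hypothetical stand-in for the library function.

```python
import numpy as np

def prune_to_limits(x_sample, limits):
    """Hypothetical stand-in: keep rows inside [low, high] on every axis."""
    x_sample = np.asarray(x_sample, dtype=float)
    limits = np.asarray(limits, dtype=float)   # assumed shape (dim, 2)
    low, high = limits[:, 0], limits[:, 1]
    # A row survives only if every coordinate lies within its axis limits.
    keep = np.all((x_sample >= low) & (x_sample <= high), axis=1)
    return x_sample[keep]

x = np.array([[0.2, 0.5], [1.4, 0.1], [0.9, -0.3], [1.0, 1.0]])
limits = np.array([[0.0, 1.0], [0.0, 1.0]])
pruned = prune_to_limits(x, limits)   # keeps the first and last rows
```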

gp_api.marginals.update_edge_centers(x_train, dx)

Update the edge centers

gp_api.marginals.weighted_histogram_density_error(raw_hist_output, N, dx)

Estimate error values

Module contents