dmgp.layers
Base Variational Layer
_BaseVariationalLayer
- class dmgp.layers.base_variational_layer._BaseVariationalLayer
The base variational layer is implemented as a torch.nn.Module that, when called on two distributions \(Q\) and \(P\), returns a torch.Tensor representing the KL divergence between two Gaussians, \(D_{\text{KL}}\left( Q\parallel P \right)\).
\[\begin{equation*} D_{\text{KL}}\left( Q\parallel P \right)= \sum_{x\in \mathcal{X}}Q(x)\log\left( \frac{Q(x)}{P(x)} \right) \end{equation*}\]
- kl_div(mu_q, sigma_q, mu_p, sigma_p)
Calculates the KL divergence between two Gaussians (Q || P).
- Parameters:
mu_q (torch.Tensor) – mean of distribution Q
sigma_q – standard deviation of distribution Q
mu_p – mean of distribution P
sigma_p – standard deviation of distribution P
- Returns:
the KL divergence between Q and P.
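For two univariate Gaussians, the KL divergence has a well-known closed form, so no sum over \(\mathcal{X}\) has to be evaluated. The sketch below is a scalar, standard-library illustration of that formula using the same argument names as kl_div. The function name gaussian_kl is hypothetical, and this is not the library's implementation: how kl_div reduces elementwise results over a tensor (sum or mean) is not stated here, so only scalars are handled.

```python
import math


def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """Closed-form KL(Q || P) for Q = N(mu_q, sigma_q^2) and P = N(mu_p, sigma_p^2)."""
    if sigma_q <= 0 or sigma_p <= 0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(sigma_p / sigma_q)
        + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
        - 0.5
    )


# Identical distributions have zero divergence.
same = gaussian_kl(0.0, 1.0, 0.0, 1.0)
# Shifting the mean of Q by one prior standard deviation costs 0.5 nats.
shifted = gaussian_kl(1.0, 1.0, 0.0, 1.0)
```

The divergence is not symmetric: swapping the roles of Q and P generally changes the value, which is why the argument order (Q || P) matters.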
GP Activation Layers
Tensor Markov Kernel (TMK)
- class dmgp.layers.TMK(in_features, n_level=2, input_lb=-2, input_ub=2, kernel=LaplaceProductKernel(), design_class=<class 'dmgp.utils.sparse_design.design_class.HyperbolicCrossDesign'>)
Implements a tensor Markov GP as an activation layer using a sparse grid structure.
\[\begin{equation*} k\left( \mathbf{x}, X^{SG} \right)R^{-1} \end{equation*}\]
- Parameters:
in_features (int) – Size of each input sample.
n_level (int, optional) – Level of sparse grid design. (Default: 2.)
input_lb (float, optional) – Input lower boundary. (Default: -2.)
input_ub (float, optional) – Input upper boundary. (Default: 2.)
design_class (class, dmgp.utils.sparse_design.design_class, optional) – Base design class of sparse grid. (Default: HyperbolicCrossDesign.)
kernel (class, dmgp.kernels, optional) – Kernel function of deep GP. (Default: LaplaceProductKernel(lengthscale=1.).)
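To make the \(k\left( \mathbf{x}, X^{SG} \right)\) factor concrete, here is a minimal standard-library sketch that evaluates a Laplace product kernel between one input and a set of design points. Everything in it is an assumption made for illustration: the kernel form \(\prod_j \exp(-|x_j - x'_j| / \ell)\), the function names, and the hand-written design points, which merely stand in for a real sparse grid. The \(R^{-1}\) factor is omitted because its construction is not described here.

```python
import math


def laplace_product_kernel(x, y, lengthscale=1.0):
    """Product over dimensions of exp(-|x_j - y_j| / lengthscale) (assumed kernel form)."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same dimension")
    return math.prod(math.exp(-abs(a - b) / lengthscale) for a, b in zip(x, y))


def kernel_row(x, design, lengthscale=1.0):
    """Evaluate k(x, X) against every design point in X."""
    return [laplace_product_kernel(x, p, lengthscale) for p in design]


# Hypothetical 2-D design points inside [input_lb, input_ub] = [-2, 2];
# a real sparse grid would be produced by the design class instead.
design = [(0.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
row = kernel_row((0.0, 0.0), design)
```

Each entry of row measures how close the input is to one design point, so the layer output has one feature per design point.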
Additive Markov Kernel (AMK)
- class dmgp.layers.AMK(in_features, n_level=3, input_lb=-2, input_ub=2, kernel=LaplaceProductKernel(), design_class=<class 'dmgp.utils.sparse_design.design_class.HyperbolicCrossDesign'>)
Implements an additive Markov GP as an activation layer using an additive structure.
\[\begin{equation*} \left\{ k\left( x_i, X^{SG} \right)R^{-1} \right\}^{d}_{i=1} \end{equation*}\]
- Parameters:
in_features (int) – Size of each input sample.
n_level (int, optional) – Level of the inducing points used to approximate the GP. (Default: 3.)
input_lb (float, optional) – Input lower boundary. (Default: -2.)
input_ub (float, optional) – Input upper boundary. (Default: 2.)
design_class (class, dmgp.utils.sparse_design.design_class, optional) – Base design class of sparse grid. (Default: HyperbolicCrossDesign.)
kernel (class, dmgp.kernels, optional) – Kernel function of deep GP. (Default: LaplaceProductKernel(lengthscale=1.).)
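The additive structure applies a one-dimensional kernel to each input coordinate \(x_i\) separately, so the number of features grows linearly with the input dimension \(d\). The sketch below illustrates this with the standard library only. It assumes a 1-D design made of the interior dyadic points of [input_lb, input_ub] and a Laplace kernel \(\exp(-|x - x'| / \ell)\); both are illustrative assumptions, the function names are hypothetical, and the \(R^{-1}\) factor is again omitted.

```python
import math


def dyadic_points(n_level, lb, ub):
    """Hypothetical 1-D design: the 2**n_level - 1 interior dyadic points of [lb, ub]."""
    n = 2 ** n_level
    return [lb + (ub - lb) * i / n for i in range(1, n)]


def additive_features(x, points, lengthscale=1.0):
    """Evaluate a 1-D Laplace kernel of each coordinate x_i against the shared design."""
    return [
        [math.exp(-abs(xi - p) / lengthscale) for p in points]
        for xi in x
    ]


# Seven points for n_level=3 on [-2, 2]: -1.5, -1.0, ..., 1.5.
points = dyadic_points(n_level=3, lb=-2.0, ub=2.0)
# One row of kernel values per input dimension (d = 3 here).
features = additive_features([0.0, 1.5, -0.5], points)
```

With \(d\) inputs and \(m\) design points this produces \(d \times m\) values, one block per coordinate, which mirrors the set notation \(\{\cdot\}_{i=1}^{d}\) in the formula.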
Linear Layers
Linear Reparameterization
- class dmgp.layers.LinearReparameterization(in_features, out_features, prior_mean=0, prior_variance=1, posterior_mu_init=0, posterior_rho_init=-3.0, bias=True)
Implements a linear layer with the reparameterization trick. Inherits from dmgp.layers._BaseVariationalLayer.
- Parameters:
in_features (int) – Size of each input sample.
out_features (int) – Size of each output sample.
prior_mean (float, optional) – Mean of the prior distribution used in the complexity cost. (Default: 0.)
prior_variance (float, optional) – Variance of the prior distribution used in the complexity cost. (Default: 1.0.)
posterior_mu_init (float, optional) – Initial value of the trainable mu parameter, the mean of the approximate posterior. (Default: 0.)
posterior_rho_init (float, optional) – Initial value of the trainable rho parameter, which is mapped through the softplus function to give the sigma of the approximate posterior. (Default: -3.0.)
bias (bool, optional) – If set to False, the layer will not learn an additive bias. (Default: True.)
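The relationship between rho, sigma, and a sampled weight can be illustrated with scalars. The sketch below uses only the standard library and hypothetical function names. It shows the general reparameterization trick, w = mu + sigma * eps with sigma = softplus(rho), and is not taken from the layer's source.

```python
import math
import random


def softplus(rho):
    """log(1 + exp(rho)), which maps any real rho to a positive sigma."""
    return math.log1p(math.exp(rho))


def sample_weight(mu, rho, rng):
    """Draw w = mu + sigma * eps with eps ~ N(0, 1) and sigma = softplus(rho)."""
    eps = rng.gauss(0.0, 1.0)
    return mu + softplus(rho) * eps


rng = random.Random(0)
# At the default posterior_rho_init of -3.0, sigma starts at roughly 0.0486.
sigma_init = softplus(-3.0)
# Five draws around the default posterior_mu_init of 0.
weights = [sample_weight(0.0, -3.0, rng) for _ in range(5)]
```

Because the randomness enters only through eps, the sampled weight is a differentiable function of mu and rho, which is what allows both to be trained by gradient descent.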
Linear Flipout
- class dmgp.layers.LinearFlipout(in_features, out_features, prior_mean=0, prior_variance=1, posterior_mu_init=0, posterior_rho_init=-3.0, bias=True)
Implements a linear layer with the Flipout reparameterization trick. Ref: https://arxiv.org/abs/1803.04386. Inherits from dmgp.layers._BaseVariationalLayer.
- Parameters:
in_features (int) – Size of each input sample.
out_features (int) – Size of each output sample.
prior_mean (float, optional) – Mean of the prior distribution used in the complexity cost. (Default: 0.)
prior_variance (float, optional) – Variance of the prior distribution used in the complexity cost. (Default: 1.0.)
posterior_mu_init (float, optional) – Initial value of the trainable mu parameter, the mean of the approximate posterior. (Default: 0.)
posterior_rho_init (float, optional) – Initial value of the trainable rho parameter, which is mapped through the softplus function to give the sigma of the approximate posterior. (Default: -3.0.)
bias (bool, optional) – If set to False, the layer will not learn an additive bias. (Default: True.)
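In the referenced paper, Flipout shares a single weight perturbation across a mini-batch and decorrelates it per example by multiplying with random sign vectors. The sketch below works through that identity for one example with plain Python lists. All names are hypothetical and the code is an illustration of the identity, not the layer's implementation.

```python
import random


def matvec(x, w):
    """Row vector x (length n_in) times matrix w (n_in x n_out)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def flipout_output(x, w_mean, delta_w, s, r):
    """y = x @ w_mean + ((x * s) @ delta_w) * r, with sign vectors s (inputs) and r (outputs)."""
    mean_part = matvec(x, w_mean)
    xs = [xi * si for xi, si in zip(x, s)]
    pert_part = matvec(xs, delta_w)
    return [m + p * rj for m, p, rj in zip(mean_part, pert_part, r)]


def explicit_output(x, w_mean, delta_w, s, r):
    """Same result, computed by building the per-example weight matrix explicitly."""
    w = [
        [w_mean[i][j] + delta_w[i][j] * s[i] * r[j] for j in range(len(r))]
        for i in range(len(s))
    ]
    return matvec(x, w)


rng = random.Random(0)
w_mean = [[0.5, -1.0], [0.25, 2.0], [1.0, 0.0]]   # 3 inputs, 2 outputs
delta_w = [[0.1, 0.2], [-0.3, 0.4], [0.5, -0.6]]  # one shared perturbation sample
x = [1.0, 2.0, -1.0]
s = [rng.choice((-1.0, 1.0)) for _ in x]          # random signs over inputs
r = [rng.choice((-1.0, 1.0)) for _ in range(2)]   # random signs over outputs
fast = flipout_output(x, w_mean, delta_w, s, r)
slow = explicit_output(x, w_mean, delta_w, s, r)
```

Both functions return the same vector. The first form never materializes a separate weight matrix per example, which is what makes the estimator cheap to apply across a batch.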