Utilities for Bayesian models and distributions

Confidence/credible intervals

Credible intervals quantify the uncertainty of a parameter by providing the range of values containing the true parameter value with a given probability. Credible intervals can be calculated using the equal-tailed quantile method (ETI) or the highest posterior density method (HDI).

The HDI method calculates the interval such that \(P(l < z < u) = 1 - \delta\), where \(\delta\) denotes the significant level, and \(l\) and \(u\) denote the lower and upper bound of the interval, respectively. The HDI computes the narrowest interval by solving the minimization problem, see [CS99]

\[\underset{l < u}{\text{min}}\left(|f(u) - f(l)| + |F(u) - F(l) - (1 -\delta)| \right).\]

We reformulate the problem by removing absolute values and adding the narrowest interval \(u - l\) on the objective function,

\[\begin{split}\underset{u,l, t, w}{\text{min}} &\quad t + w + u - l\\ \text{s.t.} &\quad -t + f(u) - f(l) \ge 0\\ &\quad t + f(u) - f(l) \ge 0\\ &\quad -w + F(u) - F(l) - (1 -\delta) \ge 0\\ &\quad w + F(u) - F(l) - (1 -\delta) \ge 0\\ &\quad u - l - \epsilon \ge 0\\ &\quad l \in [l_{\min}, l_{\max}]]\\ &\quad u \in [u_{\min}, u_{\max}]\end{split}\]

where \(\epsilon > 0\), say \(\epsilon = 0.000001\). Parameters \(l_{\min}\), \(l_{\max}\), \(u_{\min}\) and \(u_{\max}\) denote the bounds for the interval limits \(l\) and \(u\), there are dependent on the statistical distribution support.

cprior.cdist.ci_interval(x, interval_length=0.9, method='ETI')

Compute confidence/credible intervals (CI).

Parameters:
  • x (array-like, shape = (n_samples)) –
  • interval_length (float (default=0.9)) – Compute interval_length% credible interval. This is a value in [0, 1].
  • method (str (default="ETI")) – Method to compute credible intervals. Supported methods are Highest Density interval (method="HDI") and Equal-tailed interval (method="ETI").
Returns:

ci_interval – The lower and upper limit of the interval.

Return type:

numpy.ndarray

Example

>>> from scipy import stats
>>> from cprior.cdist import ci_interval
>>> x = stats.norm.rvs(size=int(1e6), random_state=42)
>>> ci_interval(x=x, interval_length=0.95, method="ETI")
array([-1.96315029,  1.95842544])
>>> ci_interval(x=x, interval_length=0.95, method="HDI")
array([-1.95282024,  1.9679026 ])
cprior.cdist.ci_interval_exact(dist, interval_length=0.9, method='ETI', bounds=None)

Compute exact confidence/credible intervals (CI).

Parameters:
  • dist (object) – A class representing a probability distribution with methods pdf, cdf and ppf.
  • interval_length (float (default=0.9)) – Compute interval_length% credible interval. This is a value in [0, 1].
  • method (str (default="ETI")) – Method to compute credible intervals. Supported methods are Highest Density interval (method="HDI") and Equal-tailed interval (method="ETI").
  • bounds (list or None (default=None)) – Sequence of (min, max) pairs for lower and upper limit of the interval. None is used to specify no bound.
Returns:

ci_interval – The lower and upper limit of the interval.

Return type:

numpy.ndarray

Example

>>> from scipy import stats
>>> from cprior.cdist import ci_interval
>>> dist = stats.beta(4, 10)
>>> ci_interval_exact(dist=dist, interval_length=0.9, method="ETI")
array([0.11266578, 0.49464973])
>>> bounds = [(0, 1), (0, 1)]
>>> ci_interval_exact(dist=dist, interval_length=0.9, method="HDI", bounds=bounds)
array([0.09439576, 0.46944915])

References

[CS99]Mi. Chen and Q. Shao. Monte Carlo Estimation of Bayesian Credible and HPD Intervals. Journal of Computational and Graphical Statistics, 8(1):69–92, 1999.