diagnostics
Diagnostic functions for brms models with ArviZ integration.
This module provides diagnostic functions for analyzing fitted brms models.
By default, every fitted model exposes an arviz.InferenceData object through its
.idata attribute, so it can be passed directly to ArviZ's diagnostic toolkit.
ArviZ Integration
brmspy models work directly with ArviZ functions without conversion:
- Summary & Convergence: az.summary(), az.rhat(), az.ess()
- Visualization: az.plot_trace(), az.plot_posterior(), az.plot_pair()
- Model Comparison: az.loo(), az.waic(), az.compare()
- Predictive Checks: az.plot_ppc()
For multivariate models, use the var_name parameter in ArviZ functions
to specify which response variable to analyze (e.g., az.loo(model.idata, var_name="y1")).
Quick Example
import brmspy
import arviz as az
# Fit model
model = brmspy.fit("count ~ zAge + (1|patient)", data=data, family="poisson")
# Diagnostics
print(az.summary(model.idata)) # Parameter estimates with Rhat, ESS
az.plot_trace(model.idata) # MCMC trace plots
az.plot_ppc(model.idata) # Posterior predictive check
# Model comparison
loo = az.loo(model.idata)
print(loo)
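The Rhat column reported by az.summary() compares between-chain and within-chain variance. As a rough illustration only, the sketch below computes the classic Gelman-Rubin statistic with the standard library. ArviZ itself uses a more robust rank-normalized split variant, so the numbers will not match; the function name classic_rhat and the toy chains are assumptions made for this example.

```python
import math
import statistics

def classic_rhat(chains):
    """Classic Gelman-Rubin R-hat for equal-length chains of one parameter."""
    n = len(chains[0])
    if len(chains) < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length")
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain sample variances
    w = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: sample variance of the chain means
    b_over_n = statistics.variance(chain_means)
    # Pooled variance estimate, then the ratio against W
    var_hat = (n - 1) / n * w + b_over_n
    return math.sqrt(var_hat / w)

# Toy draws invented for illustration (not real model output)
mixed = [[0.1, -0.2, 0.3, 0.0], [0.0, 0.2, -0.1, 0.1]]
stuck = [[0.1, -0.2, 0.3, 0.0], [5.0, 5.2, 4.9, 5.1]]
print(classic_rhat(mixed))  # chains overlap: value stays near or below 1
print(classic_rhat(stuck))  # chains disagree: value is far above 1
```

Values well above 1 indicate that the chains have not converged to the same distribution.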
See Also
Diagnostics with ArviZ : Complete guide with examples https://kaitumisuuringute-keskus.github.io/brmspy/api/diagnostics-arviz/
Notes
The InferenceData structure contains:
- posterior: All parameter samples with brms naming (e.g., b_Intercept, sd_patient__Intercept)
- posterior_predictive: Posterior predictive samples for each response
- log_likelihood: Pointwise log-likelihood for LOO/WAIC
- observed_data: Original response values
Classes
Functions
summary(model, **kwargs)
Generate comprehensive summary statistics for a fitted brms model.
Returns a SummaryResult dataclass containing model information,
parameter estimates, and diagnostic information. The SummaryResult object provides
pretty printing via str() or print() and structured access to all components.
BRMS documentation and parameters
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | FitResult | Fitted model from fit(). | required |
| **kwargs | | Additional arguments passed to brms::summary(), such as probs (quantiles for credible intervals). | {} |
Returns:
| Type | Description |
|---|---|
| SummaryResult | A dataclass containing model information, parameter estimates, and diagnostic information. |
See Also
brms::summary.brmsfit : R documentation https://paulbuerkner.com/brms/reference/summary.brmsfit.html
Examples:
Basic usage with pretty printing:
import brmspy
model = brmspy.fit("y ~ x", data=data, chains=4)
summary = brmspy.summary(model)
# Pretty print full summary
print(summary)
Access specific components:
# Get population-level effects as DataFrame
fixed_effects = summary.fixed
print(fixed_effects)
# Get family-specific parameters (e.g., sigma)
spec_params = summary.spec_pars
print(spec_params)
# Access random effects (if present)
random_effects = summary.random
for group_name, group_df in random_effects.items():
    print(f"Random effects for {group_name}:")
    print(group_df)
# Check model metadata
print(f"Formula: {summary.formula}")
print(f"Total draws: {summary.total_ndraws}")
print(f"Rhat reported: {summary.has_rhat}")
Custom credible intervals:
fixef(object, summary=True, robust=False, probs=(0.025, 0.975), pars=None, **kwargs)
Extract population-level (fixed) effects estimates from a fitted brms model.
Returns a pandas DataFrame containing estimates and uncertainty intervals for
all population-level parameters (fixed effects). By default, returns summary
statistics (mean, standard error, credible intervals). Can also return raw
posterior samples when summary=False.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| object | FitResult or ListVector | Fitted model from fit(). | required |
| summary | bool | If True, return summary statistics (mean/median, SE/MAD, credible intervals). If False, return a matrix of posterior samples (iterations × parameters). | True |
| robust | bool | If True, use median and MAD instead of mean and SD for summary statistics. Only used when summary=True. | False |
| probs | tuple of float | Quantiles for credible intervals, e.g., (0.025, 0.975) for 95% intervals. Only used when summary=True. | (0.025, 0.975) |
| pars | list of str | Specific parameter names to extract. If None, returns all fixed effects. Useful when you only need certain coefficients. | None |
| **kwargs | | Additional arguments passed to brms::fixef(). | {} |
Returns:
| Type | Description |
|---|---|
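| DataFrame | When summary=True, summary statistics for each population-level parameter. When summary=False, posterior samples (iterations × parameters). |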
See Also
brms::fixef.brmsfit : R documentation
https://paulbuerkner.com/brms/reference/fixef.brmsfit.html
summary() : Full model summary with all parameter types
Examples:
Basic usage with summary statistics:
import brmspy
model = brmspy.fit("y ~ x1 + x2", data=data, chains=4)
# Get fixed effects summary
fixed_effects = brmspy.fixef(model)
print(fixed_effects)
#            Estimate  Est.Error       Q2.5     Q97.5
# Intercept 10.234567   0.123456   9.992345  10.47689
# x1         0.456789   0.098765   0.263456   0.65012
# x2        -0.234567   0.087654  -0.406789  -0.06234
Get specific parameters only:
# Extract only specific coefficients
x1_x2_effects = brmspy.fixef(model, pars=["x1", "x2"])
print(x1_x2_effects)
Use robust estimates (median and MAD):
# Use median and MAD instead of mean and SD
robust_effects = brmspy.fixef(model, robust=True)
print(robust_effects)
Custom credible intervals:
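As a minimal sketch of how a different interval width might be requested, the helper below converts a central interval level into the lower and upper quantiles expected by the probs parameter documented above. The names interval_probs and fixef_with_level are hypothetical and not part of brmspy; the brmspy.fixef() call relies only on the signature shown in this section and needs a fitted model to run.

```python
def interval_probs(level):
    """Turn a central interval level (e.g. 0.90) into lower/upper quantiles."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    tail = (1.0 - level) / 2.0
    # Round away floating-point noise such as 0.04999999999999999
    return (round(tail, 10), round(1.0 - tail, 10))

def fixef_with_level(model, level=0.90):
    """Hypothetical convenience wrapper around brmspy.fixef()."""
    import brmspy  # imported lazily; requires a working brmspy and R setup
    return brmspy.fixef(model, probs=interval_probs(level))

print(interval_probs(0.90))
print(interval_probs(0.50))
```

With a fitted model, fixef_with_level(model, 0.90) would then request 90% intervals instead of the default 95%.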
Get raw posterior samples:
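With summary=False, fixef() returns the raw draws (iterations × parameters) rather than a summary. The sketch below uses plain Python lists and a hypothetical summarize_draws helper, not brmspy itself, to show how the default summary columns could be recomputed from such draws. The linearly interpolated quantile is chosen to mirror R's default rule; exact agreement with brms output is an assumption, not a guarantee.

```python
import statistics

def quantile(sorted_xs, p):
    """Linearly interpolated quantile (the rule R uses by default)."""
    pos = p * (len(sorted_xs) - 1)
    lo = int(pos)  # floor, since pos is non-negative
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (pos - lo) * (sorted_xs[hi] - sorted_xs[lo])

def summarize_draws(draws, probs=(0.025, 0.975)):
    """Summarize {parameter: [draw, ...]} into Estimate, Est.Error, quantiles."""
    out = {}
    for name, xs in draws.items():
        xs_sorted = sorted(xs)
        row = {
            "Estimate": statistics.fmean(xs),   # posterior mean
            "Est.Error": statistics.stdev(xs),  # posterior standard deviation
        }
        for p in probs:
            row[f"Q{p * 100:g}"] = quantile(xs_sorted, p)
        out[name] = row
    return out

# Toy draws invented for illustration (not real model output)
toy = {"Intercept": [1.0, 2.0, 3.0, 4.0, 5.0]}
print(summarize_draws(toy))
```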
ranef(object, summary=True, robust=False, probs=(0.025, 0.975), pars=None, groups=None, **kwargs)
Extract group-level (random) effects as xarray DataArrays.
This is a wrapper around brms::ranef(). For summary=True (default),
each grouping factor is returned as a 3D array with dimensions
("group", "stat", "coef"). For summary=False, each factor is
returned as ("draw", "group", "coef") with one slice per posterior draw.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| object | FitResult or ListVector | Fitted model returned by fit(). | required |
| summary | bool | If True, return posterior summaries for the group-level effects (means, errors, intervals). If False, return per-draw random effects. | True |
| robust | bool | If True, use robust summaries (median and MAD) instead of mean and SD. Passed through to brms::ranef(). | False |
| probs | tuple of float | Central posterior interval probabilities, e.g., (0.025, 0.975) for 95% intervals. | (0.025, 0.975) |
| pars | str or sequence of str | Subset of group-level parameters to include. Passed to brms::ranef(). | None |
| groups | str or sequence of str | Subset of grouping factors to include. Passed to brms::ranef(). | None |
| **kwargs | | Additional keyword arguments forwarded to brms::ranef(). | {} |
Returns:
| Type | Description |
|---|---|
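| dict[str, DataArray] | Mapping from grouping-factor name (e.g. "patient") to a DataArray with dimensions ("group", "stat", "coef") when summary=True, or ("draw", "group", "coef") when summary=False. |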
Examples:
Compute summary random effects and inspect all coefficients for a single group level:
from brmspy import brms
from brmspy.brms import ranef
fit = brms.fit("count ~ zAge + zBase * Trt + (1 + zBase + Trt | patient)",
               data=data, family="poisson")
re = ranef(fit) # summary=True by default
patient_re = re["patient"].sel(group="1", stat="Estimate")
Extract per-draw random effects for downstream MCMC analysis:
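With summary=False, each grouping factor comes back with dimensions ("draw", "group", "coef"). As a rough illustration of how that shape relates to the summarized one, the sketch below averages over the draw axis using nested Python lists as a stand-in for the DataArray. The collapse_draws helper and the toy numbers are assumptions for this example; with a real result, an xarray reduction such as .mean(dim="draw") would be the more natural tool.

```python
import statistics

def collapse_draws(draws):
    """Average draws[draw][group][coef] over the draw axis.

    Returns a nested list indexed as [group][coef].
    """
    n_groups = len(draws[0])
    n_coefs = len(draws[0][0])
    return [
        [statistics.fmean([d[g][c] for d in draws]) for c in range(n_coefs)]
        for g in range(n_groups)
    ]

# Toy per-draw effects: 3 draws, 2 group levels, 2 coefficients (invented)
toy_draws = [
    [[0.1, 1.0], [-0.2, 2.0]],
    [[0.3, 1.2], [0.0, 2.2]],
    [[0.2, 0.8], [-0.1, 1.8]],
]
print(collapse_draws(toy_draws))
```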
posterior_summary(object, variable=None, probs=(0.025, 0.975), robust=False, **kwargs)
Extract posterior summary statistics for all or selected model parameters.
Provides a DataFrame with estimates, standard errors, and credible intervals
for all parameters in a brms model, including fixed effects, random effects,
and auxiliary parameters. More comprehensive than fixef()
or ranef() as it covers all parameter types.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| object | FitResult or ListVector | Fitted model from fit(). | required |
| variable | str or list of str | Specific variable name(s) to extract. If None, returns all parameters. Supports regex patterns for flexible selection. | None |
| probs | tuple of float | Quantiles for credible intervals, e.g., (0.025, 0.975) for 95% intervals. | (0.025, 0.975) |
| robust | bool | If True, use median and MAD instead of mean and SD for summary statistics. | False |
| **kwargs | | Additional arguments passed to brms::posterior_summary(). | {} |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with parameters as rows and columns for Estimate, Est.Error, and quantiles (e.g., Q2.5, Q97.5). Includes all model parameters: population-level effects, group-level effects, and auxiliary parameters. |
See Also
brms::posterior_summary : R documentation
https://paulbuerkner.com/brms/reference/posterior_summary.brmsfit.html
fixef() : Extract only population-level effects
ranef() : Extract only group-level effects
Examples:
Get summary for all parameters:
import brmspy
model = brmspy.fit("y ~ x1 + (1|group)", data=data, chains=4)
# Get all parameter summaries
all_params = brmspy.posterior_summary(model)
print(all_params)
Extract specific parameters:
# Get summary for specific parameters
intercept = brmspy.posterior_summary(model, variable="b_Intercept")
print(intercept)
# Multiple parameters
fixed_only = brmspy.posterior_summary(model, variable=["b_Intercept", "b_x1"])
print(fixed_only)
Custom credible intervals with robust estimates:
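To make the effect of robust=True concrete, the sketch below contrasts the mean with the median and MAD on a toy set of draws containing one extreme value. The robust_summary helper is hypothetical. The scaling constant 1.4826 is the default of R's mad() function; that brms reports exactly this scaled value is an assumption here.

```python
import statistics

def robust_summary(draws, constant=1.4826):
    """Median and scaled median absolute deviation of a list of draws."""
    center = statistics.median(draws)
    # MAD: median of absolute deviations from the median, scaled so that it
    # is comparable to a standard deviation for normally distributed draws
    mad = constant * statistics.median([abs(x - center) for x in draws])
    return {"Estimate": center, "Est.Error": mad}

# Toy draws with one extreme value (invented for illustration)
toy = [1.0, 2.0, 3.0, 4.0, 100.0]
print(statistics.fmean(toy))  # the mean is pulled toward the extreme draw
print(robust_summary(toy))    # the median and MAD are much less affected
```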
prior_summary(object, all=True, **kwargs)
Extract prior specifications used in a fitted brms model.
Returns a DataFrame containing all prior distributions that were used (either explicitly set or defaults) when fitting the model. Useful for documenting model specifications and understanding which priors were applied.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| object | FitResult or ListVector | Fitted model from fit(). | required |
| all | bool | If True, return all priors including default priors. If False, return only explicitly set priors. | True |
| **kwargs | | Additional arguments passed to brms::prior_summary(). | {} |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with columns describing prior specifications: prior (prior distribution formula); class (parameter class: b, sd, Intercept, etc.); coef (specific coefficient, if applicable); group (grouping factor, if applicable); resp (response variable, for multivariate models); dpar (distributional parameter, if applicable); nlpar (non-linear parameter, if applicable); lb/ub (bounds for truncated priors); source (origin of prior: default, user, etc.). |
See Also
brms::prior_summary : R documentation
https://paulbuerkner.com/brms/reference/prior_summary.brmsfit.html
get_prior() : Get prior structure before fitting
default_prior() : Get default priors for a model
Examples:
Get all priors used in a model:
import brmspy
model = brmspy.fit(
    "y ~ x1 + (1|group)",
    data=data,
    priors=[brmspy.prior("normal(0, 1)", "b")],
    chains=4
)
# Get all priors (including defaults)
priors = brmspy.prior_summary(model)
print(priors)
Get only explicitly set priors:
# Get only user-specified priors
user_priors = brmspy.prior_summary(model, all=False)
print(user_priors)
Compare with what would be used before fitting:
validate_newdata(newdata, object, re_formula=None, allow_new_levels=False, newdata2=None, resp=None, check_response=True, incl_autocor=True, group_vars=None, req_vars=None, **kwargs)
Validate new data for predictions from a fitted brms model.
Ensures that new data contains all required variables and has the correct structure for making predictions. Checks variable types, factor levels, grouping variables, and autocorrelation structures. This function is primarily used internally by prediction methods but can be called directly for debugging or validation purposes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| newdata | DataFrame | DataFrame containing new data to be validated against the model. Must include all predictor variables used in the model formula. | required |
| object | FitResult or ListVector | Fitted model from fit(). | required |
| re_formula | str | Formula string specifying group-level effects to include in validation. If None (default), include all group-level effects. If NA, include no group-level effects. | None |
| allow_new_levels | bool | Whether to allow new levels of grouping variables not present in the original training data. If False, raises an error for new levels. | False |
| newdata2 | DataFrame | Additional data that cannot be passed via newdata. | None |
| resp | str or list of str | Names of response variables to validate. If specified, validation is performed only for the specified responses (relevant for multivariate models). | None |
| check_response | bool | Whether to check if response variables are present in newdata. Set to False when making predictions where the response is not needed. | True |
| incl_autocor | bool | Whether to include autocorrelation structures originally specified in the model. If True, validates autocorrelation-related variables. | True |
| group_vars | list of str | Names of specific grouping variables to validate. If None (default), validates all grouping variables present in the model. | None |
| req_vars | list of str | Names of specific variables required in newdata. If None (default), all variables from the original training data are required (unless excluded by other parameters). | None |
| **kwargs | | Additional arguments passed to brms::validate_newdata(). | {} |
Returns:
| Type | Description |
|---|---|
| DataFrame | Validated DataFrame based on newdata, potentially with added or modified columns to ensure compatibility with the model. |
Raises:
| Type | Description |
|---|---|
| ValueError | If newdata is missing required variables. |
| ValueError | If factor levels in newdata don't match those in training data (when allow_new_levels=False). |
| ValueError | If grouping variables have invalid structure. |
See Also
brms::validate_newdata : R documentation
https://paulbuerkner.com/brms/reference/validate_newdata.html
posterior_predict() : Uses validate_newdata internally
posterior_epred() : Uses validate_newdata internally
Examples:
Basic validation for prediction data:
import brmspy
import pandas as pd
# Fit model
model = brmspy.fit("y ~ x1 + x2", data=train_data, chains=4)
# Prepare new data
new_data = pd.DataFrame({
    'x1': [1.0, 2.0, 3.0],
    'x2': [0.5, 1.0, 1.5]
})
# Validate before prediction
validated_data = brmspy.validate_newdata(new_data, model)
print(validated_data)
Validate with group-level effects:
# Model with random effects
model = brmspy.fit("y ~ x + (1|group)", data=train_data, chains=4)
# New data with grouping variable
new_data = pd.DataFrame({
    'x': [1.0, 2.0],
    'group': ['A', 'B']  # Must match training data groups
})
# Validate - will error if groups A or B weren't in training
validated_data = brmspy.validate_newdata(
    new_data,
    model,
    allow_new_levels=False
)
Allow new levels for population-level predictions:
# Allow new group levels (makes population-level predictions only)
new_data_with_new_groups = pd.DataFrame({
    'x': [3.0, 4.0],
    'group': ['C', 'D']  # New groups not in training
})
validated_data = brmspy.validate_newdata(
    new_data_with_new_groups,
    model,
    allow_new_levels=True
)
Skip response variable checking:
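As a rough illustration of what check_response=False and allow_new_levels change, the sketch below re-creates the kinds of checks described in this section with the standard library. It is a conceptual stand-in, not how brms implements validation: the check_newdata helper, its arguments, and the toy rows are all assumptions made for this example.

```python
def check_newdata(rows, required, response=None, known_levels=None,
                  allow_new_levels=False, check_response=True):
    """Toy validation of new data given as a list of dict rows."""
    needed = set(required)
    # The response is only required when check_response is True
    if response is not None and check_response:
        needed.add(response)
    for i, row in enumerate(rows):
        missing = sorted(needed - set(row))
        if missing:
            raise ValueError(f"row {i} is missing variables: {missing}")
        if not allow_new_levels:
            # Reject grouping levels that were not seen during fitting
            for var, levels in (known_levels or {}).items():
                if var in row and row[var] not in levels:
                    raise ValueError(f"new level {row[var]!r} in {var!r}")
    return rows

# Toy prediction rows without the response column "y" (invented)
rows = [{"x": 1.0, "group": "A"}, {"x": 2.0, "group": "B"}]
levels = {"group": {"A", "B"}}
ok = check_newdata(rows, ["x", "group"], response="y",
                   known_levels=levels, check_response=False)
print(len(ok))
```

Calling the same helper with check_response=True raises ValueError for these rows because "y" is absent, which mirrors why the flag is turned off when only predictors are available.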