Changelog
0.3.1 - Standardized and modular InferenceData, improved memory management
25.12.23
This release standardizes the InferenceData structure across all prediction methods, ensuring consistent dimensions (chain, draw, obs_id) and variable naming conventions. It also improves shared-memory transport for Pandas DataFrames, enabling high-fidelity roundtripping of Categoricals and mixed types between R and Python.
Standardised idata
All idata returned from brmspy functions is now standardised: results can be joined with one another, keep DataFrame indexes correctly in obs_id, and work uniformly for univariate and multivariate models.
- brm(): Optional return_idata: bool argument. For large models, passing return_idata=False and running only the methods you need (e.g. brms.posterior_predict(fit)) can be better for memory management. When return_idata=True, the function now also includes constant_data. (Issue #51)
- posterior(): Returns draws in posterior and constant_data as idata. (Issue #51)
- observed_data(): Returns observed_data and constant_data as idata. (Issue #51)
- posterior_epred(): Now returns predictions and predictions_constant_data when newdata is given, and posterior and constant_data when it is not. Target variables are now suffixed with _mean. (Issue #51)
- posterior_predict(): Now returns predictions and predictions_constant_data when newdata is given, and posterior_predictive and constant_data when it is not. (Issue #51)
- posterior_linpred(): Now returns predictions and predictions_constant_data when newdata is given, and posterior and constant_data when it is not. Target variables are now suffixed with _linpred. (Issue #51)
- log_lik(): Always returns log_likelihood, together with constant_data when newdata=None and predictions_constant_data otherwise. (Issue #51)
- Added newdata kwarg-based overloads for static type checking, so type checkers automatically recognise the correct returned idata groups.
This change allows composable architectures, where the user picks only the parts of idata they need for their analysis.
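The group layout listed above can be summarised as a small lookup table. This is only an illustrative sketch: expected_groups is a hypothetical helper and not part of brmspy; the group names are copied from the entries above.

```python
from typing import Any, Optional, Tuple

# Groups returned when no newdata is passed (names taken from the changelog).
IN_SAMPLE = {
    "posterior_epred": ("posterior", "constant_data"),
    "posterior_predict": ("posterior_predictive", "constant_data"),
    "posterior_linpred": ("posterior", "constant_data"),
    "log_lik": ("log_likelihood", "constant_data"),
}

# Groups returned when newdata is passed.
WITH_NEWDATA = {
    "posterior_epred": ("predictions", "predictions_constant_data"),
    "posterior_predict": ("predictions", "predictions_constant_data"),
    "posterior_linpred": ("predictions", "predictions_constant_data"),
    "log_lik": ("log_likelihood", "predictions_constant_data"),
}


def expected_groups(method: str, newdata: Optional[Any] = None) -> Tuple[str, ...]:
    """Hypothetical helper: idata groups documented for `method`."""
    table = IN_SAMPLE if newdata is None else WITH_NEWDATA
    return table[method]
```

Because every result carries the same dimensions (chain, draw, obs_id), groups obtained from separate calls can be combined afterwards instead of being computed up front.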
Pandas & R Type Conversion
- Columnar SHM Transport: Improved ShmDataFrameColumns to transport DataFrames with mixed types via shared memory. Numeric and categorical columns now move between processes without copying, while complex object columns fall back to being pickled individually.
- Categorical Fidelity: R factors now correctly roundtrip to pandas.CategoricalDtype, preserving categories, integer codes, and ordered status across the main-worker boundary. (Issue #52)
- Broad Dtype Support: Enhanced converters to robustly handle pandas nullable integers (Int64), nullable floats, and strings during R conversion.
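To make the "categories, integer codes, and ordered status" wording concrete, here is a minimal standard-library sketch of a categorical column reduced to those three parts and rebuilt. CategoricalPayload, encode and decode are hypothetical names for illustration; they are not brmspy's actual codec, and -1 as the missing-value code is an assumption borrowed from the pandas convention.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class CategoricalPayload:
    """Hypothetical wire format: integer codes plus category metadata."""
    codes: List[int]        # position in `categories`; -1 marks a missing value
    categories: List[str]
    ordered: bool


def encode(values: Sequence[Optional[str]],
           categories: Sequence[str],
           ordered: bool = False) -> CategoricalPayload:
    # Map each label to its position in the category list.
    index = {cat: i for i, cat in enumerate(categories)}
    codes = [-1 if v is None else index[v] for v in values]
    return CategoricalPayload(codes, list(categories), ordered)


def decode(payload: CategoricalPayload) -> List[Optional[str]]:
    # Rebuild the labels from codes; missing values come back as None.
    return [None if c < 0 else payload.categories[c] for c in payload.codes]
```

Only the integer codes form a numeric array, which is the part that suits a shared-memory buffer; the category labels and the ordered flag are small metadata.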
Bug fixes and enhancements
- Worker crash recovery (Issue #50): Added automatic recovery for R worker crashes (segfaults, BrokenPipeError, ConnectionResetError). The worker is restarted transparently and the call raises RWorkerCrashedError. The exception includes a recovered: bool flag indicating whether a clean worker session was successfully started, allowing pipelines to distinguish retryable crashes (recovered=True) from hard failures (recovered=False).
- NumPy Encoding: Standardised encoding for object arrays. String arrays are now optimized as ShmArray; mixed object arrays gracefully fall back to pickling.
- Improved SHM memory management: Introduced explicit temporary buffers that are cleaned up immediately after use. Non-temporary buffers are now tracked by ShmPool only until the next main <-> worker exchange; buffer lifetime is then transferred to CodecRegistry, which ties shared-memory mappings to reconstructed objects via weakrefs. This minimizes the number of active mappings and releases resources promptly once those objects are garbage-collected.
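A pipeline could use the recovered flag roughly as follows. This is a sketch under assumptions: the RWorkerCrashedError class below is a local stand-in that models only the recovered attribute described above (the real import path is not given in this changelog), and call_with_retry is a hypothetical helper.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class RWorkerCrashedError(RuntimeError):
    """Stand-in for the exception named above; only `recovered` is modelled."""

    def __init__(self, message: str, recovered: bool) -> None:
        super().__init__(message)
        self.recovered = recovered


def call_with_retry(fn: Callable[[], T], max_attempts: int = 3) -> T:
    """Retry `fn` only when the crashed worker was replaced by a clean session."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RWorkerCrashedError as err:
            # recovered=False is a hard failure: retrying cannot help.
            if not err.recovered or attempt == max_attempts:
                raise
    raise ValueError("max_attempts must be at least 1")
```

Retrying is only sensible when the call is safe to repeat, for example refitting a model from the same inputs.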
0.3.0 - Process-Isolated R & Hot-Swappable Runtimes
25.12.14
This release introduces a redesigned main–worker–R architecture to address stability issues caused by embedding R directly in the Python process. In real-world use, unpredictable failures, ranging from R segfaults to rpy2 crashes, could take down the entire Python runtime, invalidate IDE sessions, and make test behavior OS-dependent. The old single-process model also made R state effectively immutable after import, limited runtime switching, and required brittle workarounds for package management and CI isolation. These issues were not fixable at the level of defensive coding alone.
The new architecture jails R inside a dedicated worker process with shared-memory transport, zero-copy codecs, and a proxy module session that preserves the public API. All R data structures (matrices, data frames, ArviZ objects) transfer via shared memory without heap duplication, keeping memory use equivalent to the original design even for multi-gigabyte posteriors. R runtimes and environments are now fully isolated, hot-swappable, and safely mutable through a context manager. Worker-side crashes no longer affect the main interpreter, and previously fragile operations (e.g., loo functions) run without instability. The result is a predictable, reproducible, OS-agnostic execution model with significantly reduced failure modes.
Breaking Changes
- Removal of top-level functions: brmspy.fit, brmspy.install_brms, and other direct exports from the root package have been removed. Users should import the brms module (e.g., from brmspy import brms or from brmspy.brms import bf).
- install_brms API change: Global installation functions (install_brms, install_runtime, install_rpackage) are removed from the public namespace. Installation and environment modification must now be performed inside a brms.manage() context to ensure safe worker restarts.
- Opaque R handles: The .r attribute on result objects (e.g., FitResult.r) is no longer a live rpy2 object but a SexpWrapper handle. These handles cannot be manipulated directly in the main process but can be passed back into brms functions for processing in the worker. They retain the R repr for debugging purposes.
- Runtime module internalization: brmspy.runtime has been moved to brmspy._runtime and is considered internal. Public runtime interactions should occur via brms.manage().
- Formula logic: FormulaResult has been replaced by FormulaConstruct. Formulas are now built as pure Python DSL trees and only converted to R objects during execution in the worker.
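The idea of a formula as a pure Python tree that is only rendered for R later can be sketched as below. FormulaPart, FormulaTree and to_r_code are hypothetical names, and the bf and set_rescor functions here are simplified stand-ins, not the brmspy implementations.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FormulaPart:
    """One deferred call such as bf("y ~ x"); nothing touches R here."""
    fun: str
    args: Tuple[str, ...] = ()

    def __add__(self, other):
        return FormulaTree((self,)) + other


@dataclass(frozen=True)
class FormulaTree:
    """An ordered collection of parts, composed with the + operator."""
    parts: Tuple[FormulaPart, ...]

    def __add__(self, other):
        extra = other.parts if isinstance(other, FormulaTree) else (other,)
        return FormulaTree(self.parts + extra)

    def to_r_code(self) -> str:
        # Rendering happens only here, standing in for worker-side conversion.
        return " + ".join(f"{p.fun}({', '.join(p.args)})" for p in self.parts)


def bf(formula: str) -> FormulaPart:
    return FormulaPart("bf", (formula,))


def set_rescor(flag: bool) -> FormulaPart:
    return FormulaPart("set_rescor", ("TRUE" if flag else "FALSE",))


tree = bf("y ~ x") + set_rescor(True)
```

Since the tree is plain data, it can be built, compared and pickled in the main process without a running R session.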
New Features
- Context-managed environments: Added brms.manage(), a context manager that spins up a dedicated worker for administrative tasks. It exposes methods such as ctx.install_brms(), ctx.install_runtime(), and ctx.install_rpackage(), which persist changes to the active environment.
- Multi-environment support: Users can create and switch between named environments (e.g., with brms.manage(environment_name="dev"): ...), which maintain separate user libraries (Rlib) layered on top of the base runtime.
- Environment persistence: Active environment configurations are saved to ~/.brmspy/environment_state.json and ~/.brmspy/environment/<name>/config.json.
- Status API: Added brms.environment_exists() and brms.environment_activate() helpers for managing the lifecycle of R environments programmatically.
Environments & Runtime
- Process Isolation: R now runs in a dedicated spawned worker process. Calls from the main process are serialized and sent via IPC.
- Shared Memory Transport: Implemented a custom SharedMemoryManager-based transport layer. Large numeric payloads (NumPy arrays, pandas DataFrames, ArviZ InferenceData) are written to shared memory buffers, avoiding serialization overhead.
- Hot-swappable sessions: The R worker can be restarted with a different configuration (R_HOME, library paths) on the fly without restarting the Python interpreter.
- Zero-copy codecs: Added internal codecs (NumpyArrayCodec, PandasDFCodec, InferenceDataCodec) that handle SHM allocation and view reconstruction transparently.
- Sexp Cache: Implemented a worker-side cache for R objects (Sexp). The main process holds lightweight SexpWrapper references (by ID) which are rehydrated into real R objects when passed back to the worker.
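The handle-and-cache pattern behind the Sexp cache can be illustrated with a few lines of plain Python. Handle and WorkerCache are hypothetical names; the real SexpWrapper and cache may differ in structure and eviction behaviour.

```python
import itertools
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Handle:
    """Lightweight reference held by the main process."""
    rid: int
    repr_text: str  # kept only for debugging output


class WorkerCache:
    """Worker-side store that maps IDs back to the real objects."""

    def __init__(self) -> None:
        self._objects: Dict[int, Any] = {}
        self._ids = itertools.count(1)

    def store(self, obj: Any) -> Handle:
        rid = next(self._ids)
        self._objects[rid] = obj
        return Handle(rid, repr(obj))

    def rehydrate(self, handle: Handle) -> Any:
        # Look the original object up again when a handle comes back.
        return self._objects[handle.rid]
```

Only the small Handle crosses the process boundary; the heavy object never leaves the worker.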
API & Behaviour
- Pure Python Formulas: Formula helpers (bf, lf, nlf, etc.) now return FormulaConstruct dataclasses. This allows formula composition (+ operator) to happen in Python without requiring a running R session until fit time.
- Worker Proxy: The brmspy.brms module is now a dynamic proxy (RModuleSession). Accessing attributes triggers remote lookups, and calling functions triggers IPC requests.
- Logging Bridge: Worker-side logs (including R console output) are captured and forwarded to the main process's logging handlers via a QueueListener.
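The logging bridge relies on the standard library's QueueHandler and QueueListener pair. The sketch below runs both ends in one process with a queue.Queue so it stays self-contained; ListHandler and the logger name worker_sketch are illustrative assumptions, and a real main/worker setup would need a multiprocessing-capable queue.

```python
import logging
import logging.handlers
import queue


class ListHandler(logging.Handler):
    """Collects messages; stands in for the main process's real handlers."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


log_queue = queue.Queue()  # a multiprocessing queue in a two-process setup
sink = ListHandler()

# "Main" side: drain the queue and hand records to the configured handlers.
listener = logging.handlers.QueueListener(log_queue, sink)
listener.start()

# "Worker" side: the logger only enqueues records, it never formats output.
worker_logger = logging.getLogger("worker_sketch")
worker_logger.setLevel(logging.INFO)
worker_logger.propagate = False
worker_logger.addHandler(logging.handlers.QueueHandler(log_queue))

worker_logger.info("compiling model")
listener.stop()  # processes everything already queued, then joins the thread
```

Keeping the worker side down to a QueueHandler means all formatting and filtering decisions stay with the main process's logging configuration.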
Documentation & Infrastructure
- Versioned Documentation: Added mike support for deploying versioned docs (e.g., /0.3/, /stable/) to GitHub Pages via the docs-versioned workflow.
- Architecture enforcement: Added import-linter with strict contracts to prevent leakage of internal layers (e.g., ensuring rpy2.robjects is never imported in the main process).
- Internal Docs Generation: Added scripts to auto-generate API reference stubs for internal modules (_runtime, _session, etc.) to aid development.
Testing & CI
- Hot-swap stress tests: Added tests that repeatedly restart the entire R runtime and SharedMemoryManager in a loop, then immediately access old SHM-backed arrays and InferenceData. These scenarios would crash instantly if any lifetime or reference handling were incorrect, making them an effective torture test of the new architecture.
- Worker Test Marker: Introduced @pytest.mark.worker and a worker_runner fixture to execute specific tests inside the isolated worker process.
- Coverage Aggregation: Updated CI to merge coverage reports from the main process and the spawned worker process.
- R Dependency Tests: Switched r-dependencies-tests workflow to use the new isolated test runner script.
0.2.1 - Stability hotfix
25.12.07
- Attempts to enforce rpy2's ABI mode (RPY2_CFFI_MODE=ABI) on import and warns if that is not possible. API/BOTH modes can cause instability on Linux and macOS. (Issue #45)
- Added R_HOME and LD_LIBRARY_PATH to GitHub workflows (required on most environments in ABI mode)
- The environment now makes a best-effort attempt to detect invalid R setups and logs them
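The "enforce on import, warn otherwise" behaviour can be approximated as follows. prefer_abi_mode is a hypothetical helper, not brmspy's code; the only external fact it relies on is that rpy2 reads the RPY2_CFFI_MODE environment variable when it is first imported, so the variable must be set beforehand.

```python
import os
import sys
import warnings


def prefer_abi_mode(environ=os.environ) -> bool:
    """Request rpy2's ABI mode; return False if it can no longer be enforced."""
    if "rpy2" in sys.modules:
        # rpy2 has already picked its mode; changing the variable is too late.
        warnings.warn("rpy2 is already imported; RPY2_CFFI_MODE cannot be changed")
        return False
    current = environ.get("RPY2_CFFI_MODE")
    if current not in (None, "ABI"):
        warnings.warn(f"overriding RPY2_CFFI_MODE={current!r} with 'ABI'")
    environ["RPY2_CFFI_MODE"] = "ABI"
    return True
```

Calling such a helper at the very top of a package's __init__ is what makes the ordering work: the variable is in place before any rpy2 import runs.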
0.2.0 - Runtime Refactor & Formula DSL
25.12.07
Breaking Changes
- Removed Diagnostics: Removed loo, loo_compare, and add_criterion due to frequent segfaults in embedded R mode. Users should rely on arviz.loo and arviz.compare using the idata property of the fit result.
- Installation API: Renamed use_prebuilt_binaries argument to use_prebuilt in install_brms().
- Installation API now consists of: install_brms, install_runtime, deactivate_runtime, activate_runtime, find_local_runtime, get_active_runtime, get_brms_version
- Deprecations: Renamed fit to brm and formula to bf. Previous names are still exported as aliases, but might be removed in a future version.
New Features
- Formula DSL: Implemented bf, lf, nlf, acformula, set_rescor, set_mecor, and set_nl. These objects support additive syntax (e.g., bf(...) + set_rescor(True) + gaussian()) mirroring native brms behavior.
- Generic Data Loader: Added get_data() to load datasets from any installed R package, complementing get_brms_data().
- Runtime Status: Added brmspy.runtime.status() to programmatically inspect the current R environment, toolchain compatibility, and active runtime configuration.
- Families now in package root: Families can now be imported from the package root, e.g. from brmspy import gaussian
Runtime & Installation
- Core Refactor: Completely re-architected brmspy.runtime into strict layers (_config, _r_env, _platform, _install, etc.) to eliminate side effects during import and prevent circular dependencies.
- Atomic Activation: activate_runtime() now validates manifest integrity and system fingerprints before mutating the R environment, ensuring atomic success or rollback.
- Auto-Persistence: The last successfully activated runtime is automatically restored on module import via runtime._autoload, creating persistent sessions across restarts.
- Windows Toolchain: Modularized RTools detection logic to accurately map R versions to RTools versions (4.0–4.5) and handle path updates safely.
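The "validate first, then mutate, roll back on failure" pattern behind atomic activation looks roughly like this. activate_atomically, its arguments and the dict-based environment are hypothetical simplifications, not the actual activate_runtime() implementation.

```python
from typing import Callable, Dict, Iterable


def activate_atomically(env: Dict[str, str],
                        steps: Iterable[Callable[[Dict[str, str]], None]],
                        validate: Callable[[], None]) -> None:
    """Apply every step to `env`, or leave `env` exactly as it was."""
    validate()            # checks run before anything is mutated
    snapshot = dict(env)  # shallow copy is enough for flat string settings
    try:
        for step in steps:
            step(env)
    except Exception:
        # Restore the pre-activation state, then surface the original error.
        env.clear()
        env.update(snapshot)
        raise
```

Running validation before taking the snapshot keeps the common failure case (a bad manifest) free of any side effects at all.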
Documentation & Infrastructure
- MkDocs Migration: Ported all documentation to MkDocs with the Material theme for better navigability and API references.
- Rendered notebooks: Added more notebook examples, now fully rendered, each with a link to run it in Google Colab.
- ArviZ diagnostics examples: Can now be found under the API reference.
- Test coverage: Test coverage for brms functions is now at 88%, and for R environment and package management at 68%.
0.1.13 - Enhanced Diagnostics & Type-Safe Summaries
25.12.04
Diagnostics
- summary() Rewrite: Returns SummaryResult dataclass with structured access to fixed, spec_pars, random, prior, and model metadata. Includes pretty-print support.
- fixef(): Extract population-level effects as DataFrame. Supports summary, robust, probs, and pars arguments.
- ranef(): Extract group-level effects as xarray DataArrays. Returns dict mapping grouping factors to arrays with configurable summary/raw modes.
- posterior_summary(): Extract all model parameters (fixed, random, auxiliary) as DataFrame. Supports variable selection and regex patterns.
- prior_summary(): Return DataFrame of prior specifications. Option to show all priors or only user-specified.
- loo(): Compute LOO-CV using PSIS. Returns LooResult with elpd_loo, p_loo, looic, and Pareto k diagnostics.
- loo_compare(): Compare multiple models via LOO-CV. Returns LooCompareResult ranked by performance with elpd_diff and standard errors.
- validate_newdata(): Validate prediction data against fitted model requirements. Checks variables, factor levels, and grouping structure.
Type System
- DataFrame Detection: r_to_py() now correctly preserves row indexes and column names and applies proper type conversion when reading R DataFrames.
- LooResult/LooCompareResult: Added __repr__() for formatted notebook output.
Generic Function Access
- call(): Universal wrapper for calling any brms or R function by name with automatic type conversion.
- sanitised_name(): Helper to convert Python-style names to valid R identifiers.
Testing
- Added 14 tests covering all new diagnostics functions.
- Optimized test iterations (iter=100, warmup=50) for faster CI.
0.1.12 - RDS I/O, Families Module, Default Priors
25.12.03
New Features
- save_rds(): Save brmsfit or generic R objects to RDS files.
- load_rds_fit(): Load saved brmsfit objects, returning FitResult with attached InferenceData.
- load_rds_raw(): Load arbitrary R objects from RDS files.
- brm Alias: Added brm as alias for fit.
Families
- Added brmspy.families module with brmsfamily() and family() wrappers.
- Implemented keyword-argument wrappers for 40+ families: student, bernoulli, beta_binomial, negbinomial, geometric, lognormal, shifted_lognormal, skew_normal, exponential, weibull, frechet, gen_extreme_value, exgaussian, wiener, Beta, dirichlet, logistic_normal, von_mises, asym_laplace, cox, hurdle_*, zero_inflated_*, categorical, multinomial, cumulative, sratio, cratio, acat.
Priors
- default_prior(): Retrieve default priors for a model formula and dataset.
- get_prior(): Inspect prior structure before fitting.
Internal
- Reorganized brms wrappers into modular files under brmspy/brms_functions/.
- Added RListVectorExtension protocol for automatic R list extraction in type conversion.
0.1.11 - Persistent Runtimes & Logging
25.12.01
New Features
- Persistent Runtimes: Activated runtime path saved to ~/.brmspy/config.json and auto-loaded on import.
- Configurable Logging: Replaced print statements with centralized logger.
- Optimized Activation: Made aggressive unloading conditional for faster runtime activation.
0.1.10 - Windows Stability & CI Improvements
25.12.01
Windows Support
- Implemented aggressive R package unloading (detach, unloadNamespace, DLL unload) to prevent file locking errors.
- Refined RTools detection; relaxed g++ version requirements when valid RTools is detected.
- Changed install_rtools default to False in install_brms() to prevent unintended PATH modifications.
- Fixed PowerShell command syntax generation.
- Windows prebuilt binaries currently require R 4.5.
Build & CI
- Expanded CI matrix: Windows, macOS, Ubuntu on Python 3.12.
- Optimized GitHub Actions caching for R libraries and CmdStan.
- Fixed artifact pruning logic in runtime builder workflows.
Bug Fixes
- Ensured jsonlite dependency is explicitly resolved during manifest generation.
- Fixed workflow path referencing and quoting issues.
0.1.9 - Prebuilt Runtimes & Windows Toolchain
25.11.30
New Features
- Prebuilt Runtimes: Added brmspy.binaries subpackage for precompiled R environments with brms and cmdstanr (up to 50x faster install).
- Fast Installation: Added use_prebuilt_binaries=True argument to install_brms().
- Windows Toolchain: Automatic Rtools (MinGW-w64) detection and installation in install_brms().
Enhancements
- Linux Binaries: Prioritize Posit Package Manager (P3M) binary repositories based on OS codename.
- Documentation: Added docstrings to all public and internal functions.
Infrastructure
- Added .runtime_builder Dockerfiles for reproducible Linux runtime environments.
0.1.8 - RStan Support & Version Pinning
25.11.29
New Features
- RStan Backend: Added rstan as alternative backend. install_brms() accepts install_rstan param; fit() accepts backend="rstan".
- Version Pinning: install_brms() supports pinning specific R package versions (e.g., version="2.21.0") via remotes.
Platform Support
- Windows Toolchain: Automatic Rtools detection and setup in install_brms().
- macOS/Windows Binaries: Fixed installation failures by defaulting to type="both" instead of forcing source compilation.
Infrastructure
- Added cross-platform CI workflow (Windows, macOS, Ubuntu).
0.1.7 - Import Fixes
25.11.29
- Fixed library refusing import when R dependencies are missing.
- R libraries now automatically imported after installation.
0.1.6 - Segfault Fix & Stability
25.11.29
Core Stability
- Fixed segfault occurring when fit() was called inside tqdm loops or repeated call contexts.
- All R imports (brms, cmdstanr, posterior) now performed once at module import, never inside functions.
Performance
- Repeated model fits now faster due to eliminated R namespace reloads.
- Reduced memory churn by removing redundant converter/namespace setup.
Testing
- Added test_fit_tqdm_segfault() regression test.
0.1.5 - Priors, Formula Helper, Typed ArviZ
25.11.28
API & Types
- formula(): Added helper for building reusable model formulas with kwargs support.
- Typed ArviZ Aliases: Added IDFit, IDPredict, IDLinpred, IDLogLik, IDEpred for different InferenceData shapes.
- Exported Types: FitResult, PosteriorEpredResult, PosteriorPredictResult, PosteriorLinpredResult, LogLikResult, GenericResult now in public API.
Priors
- prior() Helper: Now recommended way to specify priors instead of raw tuples.
- Improved internal prior-building logic for better mapping to brms::set_prior().
- Supports class_, coef, group, dpar combinations.
Internal
- Improved fit() kwargs parsing for more robust forwarding to brms/cmdstanr.
- Expanded test coverage for priors, get_stancode, summary, and fit-without-sampling paths.