parallelMCMCcombine R package

The parallelMCMCcombine R package is based on the following paper:

Miroshnikov, A., Conlon, E.M. (2014) parallelMCMCcombine: An R package for Bayesian methods for big data and analytics. PLOS ONE, 9(9), e108425.

Description

parallelMCMCcombine is an R package that implements recently developed Bayesian Markov chain Monte Carlo (MCMC) methods for big data sets. Here, big data refers to data sets that are too large to be analyzed in their entirety because of limits on computer memory or storage. The methods divide the data into subsets and run a separate Bayesian MCMC analysis on each subset in parallel, with no communication between the analyses, producing independent subposterior samples. These samples are then combined to estimate the posterior density given the full data set. The package provides four methods for combining the subposterior samples, based on averaging, weighted averaging and kernel smoothing. The package assumes that the user has run the Bayesian analyses outside of the package, and it takes the parallel MCMC subposterior samples as input. The methods have been shown to be useful for Bayesian statistical models including Bayesian Gaussian mixture models, Bayesian logistic regression and Bayesian hierarchical Poisson-Gamma models.
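
The averaging and weighted averaging combinations described above can be sketched for a single scalar parameter as follows. This is a minimal illustration in plain Python, not the package's R interface: the function names `sample_average` and `weighted_average`, the simulated Gaussian subchains, and the use of inverse sample variance as the weight (in the spirit of consensus Monte Carlo for independent parameters) are all assumptions made for the example.

```python
import random
import statistics


def sample_average(subchains):
    """Combine subposterior draws by plain averaging, draw by draw.

    `subchains` is a list of equal-length lists, one per data subset.
    """
    return [sum(draws) / len(draws) for draws in zip(*subchains)]


def weighted_average(subchains):
    """Combine draws using weights equal to the inverse sample variance
    of each subposterior chain (an assumed weighting for this sketch)."""
    weights = [1.0 / statistics.variance(chain) for chain in subchains]
    total = sum(weights)
    return [
        sum(w * x for w, x in zip(weights, draws)) / total
        for draws in zip(*subchains)
    ]


# Hypothetical subposterior samples from three data subsets; in practice
# these would come from MCMC runs performed outside this sketch.
rng = random.Random(0)
subchains = [
    [rng.gauss(mu, sd) for _ in range(2000)]
    for mu, sd in [(1.0, 0.5), (1.2, 0.3), (0.9, 0.8)]
]

avg = sample_average(subchains)
wavg = weighted_average(subchains)

print("plain average:    mean of combined draws =", statistics.mean(avg))
print("weighted average: mean of combined draws =", statistics.mean(wavg))
```

Under this weighting, chains with smaller variance contribute more to each combined draw, so the weighted result is pulled toward the more concentrated subposteriors, whereas plain averaging treats every subset equally.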

Obtaining the parallelMCMCcombine R package

The parallelMCMCcombine package is implemented in R and is available for download from the Comprehensive R Archive Network (CRAN).

References

  • Scott, S.L., Blocker, A.W., Bonassi, F.V., Chipman, H.A., George, E.I., McCulloch, R.E. (2016) Bayes and big data: The consensus Monte Carlo algorithm. International Journal of Management Science and Engineering Management 11, 78-88.
  • Neiswanger, W., Wang, C., Xing, E. (2014) Asymptotically exact, embarrassingly parallel MCMC. In Uncertainty in Artificial Intelligence 30, N. Zhang and J. Tian, eds., pp. 623-632.
  • Silverman, B.W. (1986) Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC.