Nonparametric Bayesian methods: a gentle introduction and overview

  • MacEachern, Steven N. (Department of Statistics, The Ohio State University)
  • Received : 2016.10.22
  • Accepted : 2016.11.08
  • Published : 2016.11.30


Nonparametric Bayesian methods have seen rapid and sustained growth over the past 25 years. We present a gentle introduction to these methods, motivating them through the twin perspectives of consistency and false consistency. We then step through the various constructions of the Dirichlet process, outline a number of its basic properties, and move on to the mixture of Dirichlet processes model, including a quick discussion of the computational methods used to fit the model. We touch on the main philosophies for nonparametric Bayesian data analysis and then reanalyze a famous data set. The reanalysis illustrates the concept of admissibility through a novel perturbation of the problem and data, showing the benefit of shrinkage estimation and the much greater benefit of nonparametric Bayesian modelling. We conclude with a too-brief survey of fancier nonparametric Bayesian methods.



