In Bayesian statistics, the posterior predictive distribution is the distribution of possible unobserved values conditional on the observed values.[1][2]

Given a set of $N$ i.i.d. observations $\mathbf{X} = \{x_1, \dots, x_N\}$, a new value $\tilde{x}$ will be drawn from a distribution $p(\tilde{x} \mid \theta)$ that depends on a parameter $\theta \in \Theta$, where $\Theta$ is the parameter space.

It may seem tempting to plug in a single best estimate $\hat{\theta}$ for $\theta$, but this ignores uncertainty about $\theta$, and because a source of uncertainty is ignored, the predictive distribution will be too narrow. Put another way, predictions of extreme values of $\tilde{x}$ will have a lower probability than if the uncertainty in the parameters, as given by their posterior distribution, is accounted for.

A posterior predictive distribution accounts for uncertainty about $\theta$. The posterior distribution of possible $\theta$ values depends on $\mathbf{X}$:

$$p(\theta \mid \mathbf{X})$$

The posterior predictive distribution of $\tilde{x}$ given $\mathbf{X}$ is calculated by marginalizing the distribution of $\tilde{x}$ given $\theta$ over the posterior distribution of $\theta$ given $\mathbf{X}$:

$$p(\tilde{x} \mid \mathbf{X}) = \int_{\Theta} p(\tilde{x} \mid \theta)\, p(\theta \mid \mathbf{X})\, d\theta$$

Because it accounts for uncertainty about $\theta$, the posterior predictive distribution will in general be wider than a predictive distribution which plugs in a single best estimate for $\theta$.
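The widening can be illustrated with a minimal sketch. The model is chosen here purely for illustration and is not part of the discussion above: normal observations with known variance `sigma2` and a normal prior on the unknown mean. In that case the posterior predictive distribution is normal with variance `sigma2 + post_var`, whereas a plug-in prediction keeps only `sigma2`. All function names and numbers are hypothetical.

```python
import math

def normal_mean_posterior(data, sigma2, prior_mean, prior_var):
    """Posterior mean and variance of the unknown mean of a normal model
    with known observation variance sigma2 and a normal prior on the mean."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / sigma2)
    post_mean = post_var * (prior_mean / prior_var + sum(data) / sigma2)
    return post_mean, post_var

def predictive_variances(data, sigma2, prior_mean, prior_var):
    """Return (plug-in variance, posterior predictive variance) of a new value."""
    _, post_var = normal_mean_posterior(data, sigma2, prior_mean, prior_var)
    plug_in_var = sigma2                # parameter uncertainty ignored
    predictive_var = sigma2 + post_var  # parameter uncertainty marginalized out
    return plug_in_var, predictive_var

def tail_probability(threshold, mean, var):
    """P(new value > threshold) under a normal distribution."""
    z = (threshold - mean) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

data = [4.8, 5.1, 5.6, 4.9, 5.3]  # made-up observations
post_mean, post_var = normal_mean_posterior(data, 1.0, 0.0, 100.0)
plug_in_var, predictive_var = predictive_variances(data, 1.0, 0.0, 100.0)
print("plug-in variance:   ", plug_in_var)
print("predictive variance:", predictive_var)
print("P(x_new > 8), plug-in:   ", tail_probability(8.0, post_mean, plug_in_var))
print("P(x_new > 8), predictive:", tail_probability(8.0, post_mean, predictive_var))
```

Here the plug-in estimate is taken to be the posterior mean. The extreme value 8 receives a larger probability under the posterior predictive distribution than under the plug-in distribution, and the gap shrinks as more data reduce `post_var`.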

Prior vs. posterior predictive distribution

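The prior predictive distribution, in a Bayesian context, is the distribution of a data point marginalized over its prior distribution $G$. That is, if $\tilde{x} \sim F(\tilde{x} \mid \theta)$ and $\theta \sim G(\theta \mid \alpha)$, then the prior predictive distribution is the corresponding distribution $H(\tilde{x} \mid \alpha)$, where

$$p_H(\tilde{x} \mid \alpha) = \int_{\theta} p_F(\tilde{x} \mid \theta)\, p_G(\theta \mid \alpha)\, d\theta$$

This is similar to the posterior predictive distribution except that the marginalization (or equivalently, expectation) is taken with respect to the prior distribution instead of the posterior distribution.

Furthermore, if the prior distribution $G(\theta \mid \alpha)$ is a conjugate prior, then the posterior predictive distribution will belong to the same family of distributions as the prior predictive distribution. This is easy to see. If the prior distribution $G(\theta \mid \alpha)$ is conjugate, then

$$p(\theta \mid \mathbf{X}, \alpha) = p_G(\theta \mid \alpha'),$$

i.e. the posterior distribution also belongs to $G(\theta \mid \alpha)$ but simply with a different parameter $\alpha'$ instead of the original parameter $\alpha$. Then,

$$
\begin{aligned}
p(\tilde{x} \mid \mathbf{X}, \alpha) &= \int_{\theta} p_F(\tilde{x} \mid \theta)\, p(\theta \mid \mathbf{X}, \alpha)\, d\theta \\
&= \int_{\theta} p_F(\tilde{x} \mid \theta)\, p_G(\theta \mid \alpha')\, d\theta \\
&= p_H(\tilde{x} \mid \alpha')
\end{aligned}
$$

Hence, the posterior predictive distribution follows the same distribution $H$ as the prior predictive distribution, but with the posterior values of the hyperparameters substituted for the prior ones.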
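As a hedged illustration of this substitution, the sketch below uses the beta-binomial distribution, the compound of a binomial likelihood with a conjugate beta prior on the success probability. The same function serves as prior predictive and posterior predictive; only the hyperparameters change. The function names and the example counts are assumptions made for this sketch.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(k, n, a, b):
    """P(k successes in n future trials) when the success probability has a
    Beta(a, b) distribution; this plays the role of the compound distribution."""
    log_p = (math.log(math.comb(n, k))
             + log_beta(a + k, b + n - k) - log_beta(a, b))
    return math.exp(log_p)

def posterior_hyperparameters(a, b, successes, failures):
    """Conjugate update: Beta(a, b) prior -> Beta(a + s, b + f) posterior."""
    return a + successes, b + failures

a, b = 2.0, 2.0                      # prior hyperparameters
a_post, b_post = posterior_hyperparameters(a, b, successes=7, failures=3)

n_future = 5
prior_pred = [beta_binomial_pmf(k, n_future, a, b) for k in range(n_future + 1)]
post_pred = [beta_binomial_pmf(k, n_future, a_post, b_post) for k in range(n_future + 1)]
print("prior predictive:    ", [round(p, 4) for p in prior_pred])
print("posterior predictive:", [round(p, 4) for p in post_pred])
```

No new integral is needed after observing data: the posterior predictive is obtained by calling the same function with the updated pair `(a_post, b_post)`.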

The prior predictive distribution is in the form of a compound distribution, and in fact is often used to define a compound distribution, because of the lack of any complicating factors such as the dependence on the data $\mathbf{X}$ and the issue of conjugacy. For example, the Student's t-distribution can be defined as the prior predictive distribution of a normal distribution with known mean $\mu$ but unknown variance $\sigma_x^2$, with a conjugate prior scaled-inverse-chi-squared distribution placed on $\sigma_x^2$, with hyperparameters $\nu$ and $\sigma^2$. The resulting compound distribution $t(x \mid \nu, \mu, \sigma^2)$ is indeed a non-standardized Student's t-distribution, and follows one of the two most common parameterizations of this distribution. Then, the corresponding posterior predictive distribution would again be Student's t, with the updated hyperparameters $\nu'$ and ${\sigma^2}'$ that appear in the posterior distribution also directly appearing in the posterior predictive distribution.
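The Student's t example can be checked numerically. The sketch below integrates a normal density with known mean over a scaled-inverse-chi-squared distribution on its variance (hyperparameters `nu` and `sigma2`) and compares the result with the closed-form non-standardized Student's t density with `nu` degrees of freedom, location `mu` and squared scale `sigma2`. The quadrature settings (`lo`, `hi`, `steps`) are arbitrary choices for this sketch, not recommendations.

```python
import math

def log_scaled_inv_chi2_pdf(s, nu, sigma2):
    """Log density of the scaled inverse chi-squared distribution at s > 0."""
    half = nu / 2.0
    return (half * math.log(sigma2 * half) - math.lgamma(half)
            - nu * sigma2 / (2.0 * s) - (1.0 + half) * math.log(s))

def log_normal_pdf(x, mu, var):
    """Log density of a normal distribution with mean mu and variance var."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)

def compound_pdf(x, mu, nu, sigma2, lo=-10.0, hi=30.0, steps=4000):
    """Integrate normal(x | mu, s) * scaled-inv-chi2(s | nu, sigma2) over the
    variance s, using the trapezoidal rule in the variable u = log(s)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        u = lo + i * h
        s = math.exp(u)
        # The extra "+ u" is the log of the Jacobian, since ds = s du.
        value = math.exp(log_normal_pdf(x, mu, s)
                         + log_scaled_inv_chi2_pdf(s, nu, sigma2) + u)
        total += value if 0 < i < steps else 0.5 * value
    return total * h

def student_t_pdf(x, mu, nu, sigma2):
    """Non-standardized Student's t density (location mu, squared scale sigma2)."""
    z2 = (x - mu) ** 2 / (nu * sigma2)
    log_c = (math.lgamma((nu + 1) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(nu * math.pi * sigma2))
    return math.exp(log_c - (nu + 1) / 2.0 * math.log1p(z2))

for x in (-1.0, 1.0, 2.5):
    print(x, compound_pdf(x, 1.0, 4.0, 2.0), student_t_pdf(x, 1.0, 4.0, 2.0))
```

The two columns should agree to within the quadrature error, which is consistent with the compound distribution being Student's t in this parameterization.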

In some cases the appropriate compound distribution is defined using a different parameterization than the one that would be most natural for the predictive distributions in the current problem at hand. Often this results because the prior distribution used to define the compound distribution is different from the one used in the current problem. For example, as indicated above, the Student's t-distribution was defined in terms of a scaled-inverse-chi-squared distribution placed on the variance. However, it is more common to use an inverse gamma distribution as the conjugate prior in this situation. The two are in fact equivalent except for parameterization; hence, the Student's t-distribution can still be used for either predictive distribution, but the hyperparameters must be reparameterized before being plugged in.
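A small sketch of the reparameterization follows, under the usual conventions that the inverse gamma distribution is written with shape `alpha` and scale `beta`, and the scaled-inverse-chi-squared distribution with `nu` and `sigma2`. The two densities coincide when `nu = 2 * alpha` and `sigma2 = beta / alpha`. The function names are hypothetical.

```python
import math

def inv_gamma_to_scaled_inv_chi2(alpha, beta):
    """Map Inv-Gamma(shape=alpha, scale=beta) hyperparameters to the
    equivalent scaled-inverse-chi-squared hyperparameters (nu, sigma2)."""
    return 2.0 * alpha, beta / alpha

def scaled_inv_chi2_to_inv_gamma(nu, sigma2):
    """Inverse mapping: (nu, sigma2) -> (alpha, beta)."""
    return nu / 2.0, nu * sigma2 / 2.0

def log_inv_gamma_pdf(s, alpha, beta):
    """Log density of the inverse gamma distribution at s > 0."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            - (alpha + 1.0) * math.log(s) - beta / s)

def log_scaled_inv_chi2_pdf(s, nu, sigma2):
    """Log density of the scaled inverse chi-squared distribution at s > 0."""
    half = nu / 2.0
    return (half * math.log(half * sigma2) - math.lgamma(half)
            - (half + 1.0) * math.log(s) - half * sigma2 / s)

nu, sigma2 = inv_gamma_to_scaled_inv_chi2(3.0, 6.0)
print(nu, sigma2)  # hyperparameters to plug into the Student's t predictive
print(log_inv_gamma_pdf(1.7, 3.0, 6.0), log_scaled_inv_chi2_pdf(1.7, nu, sigma2))
```

With an inverse gamma prior, the converted pair `(nu, sigma2)` gives the degrees of freedom and squared scale of the Student's t predictive distribution.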

In exponential families

Most, but not all, common families of distributions are exponential families. Exponential families have a large number of useful properties. One of these is that all members have conjugate prior distributions, whereas very few other distributions have conjugate priors.

Prior predictive distribution in exponential families

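Another useful property is that the probability density function of the compound distribution corresponding to the prior predictive distribution of an exponential family distribution marginalized over its conjugate prior distribution can be determined analytically. Assume that $F(x \mid \boldsymbol{\theta})$ is a member of the exponential family with parameter $\boldsymbol{\theta}$ that is parametrized according to the natural parameter $\boldsymbol{\eta} = \boldsymbol{\eta}(\boldsymbol{\theta})$, and is distributed as

$$p_F(x \mid \boldsymbol{\eta}) = h(x)\, g(\boldsymbol{\eta})\, e^{\boldsymbol{\eta}^{\mathrm{T}} \mathbf{T}(x)}$$

while $G(\boldsymbol{\eta} \mid \boldsymbol{\chi}, \nu)$ is the appropriate conjugate prior, distributed as

$$p_G(\boldsymbol{\eta} \mid \boldsymbol{\chi}, \nu) = f(\boldsymbol{\chi}, \nu)\, g(\boldsymbol{\eta})^{\nu}\, e^{\boldsymbol{\eta}^{\mathrm{T}} \boldsymbol{\chi}}$$

Then the prior predictive distribution $H$ (the result of compounding $F$ with $G$) is

$$
\begin{aligned}
p_H(x \mid \boldsymbol{\chi}, \nu) &= \int_{\boldsymbol{\eta}} p_F(x \mid \boldsymbol{\eta})\, p_G(\boldsymbol{\eta} \mid \boldsymbol{\chi}, \nu)\, d\boldsymbol{\eta} \\
&= \int_{\boldsymbol{\eta}} h(x)\, g(\boldsymbol{\eta})\, e^{\boldsymbol{\eta}^{\mathrm{T}} \mathbf{T}(x)}\, f(\boldsymbol{\chi}, \nu)\, g(\boldsymbol{\eta})^{\nu}\, e^{\boldsymbol{\eta}^{\mathrm{T}} \boldsymbol{\chi}}\, d\boldsymbol{\eta} \\
&= h(x)\, f(\boldsymbol{\chi}, \nu) \int_{\boldsymbol{\eta}} g(\boldsymbol{\eta})^{\nu + 1}\, e^{\boldsymbol{\eta}^{\mathrm{T}} (\boldsymbol{\chi} + \mathbf{T}(x))}\, d\boldsymbol{\eta} \\
&= h(x)\, \frac{f(\boldsymbol{\chi}, \nu)}{f(\boldsymbol{\chi} + \mathbf{T}(x), \nu + 1)}
\end{aligned}
$$

The last line follows from the previous one by recognizing that the function inside the integral is the density function of a random variable distributed as $G(\boldsymbol{\eta} \mid \boldsymbol{\chi} + \mathbf{T}(x), \nu + 1)$, excluding the normalizing function $f(\dots)$. Hence the result of the integration will be the reciprocal of the normalizing function.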
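The closed form can be checked on a concrete case. The sketch below takes the Bernoulli family written in natural form, which is an illustrative choice: $h(x) = 1$, $T(x) = x$, $g(\eta) = 1/(1 + e^{\eta})$. For this family the conjugate-prior normalizer satisfies $1/f(\chi, \nu) = B(\chi, \nu - \chi)$ with $0 < \chi < \nu$, where $B$ is the beta function. The code compares the ratio of normalizers with a direct numerical integration over $\eta$. Function names and quadrature settings are assumptions of this sketch.

```python
import math

def log_f(chi, nu):
    """Log of the conjugate-prior normalizer f(chi, nu) for the Bernoulli
    family in natural form; 1 / f(chi, nu) = B(chi, nu - chi), 0 < chi < nu."""
    return -(math.lgamma(chi) + math.lgamma(nu - chi) - math.lgamma(nu))

def prior_predictive(x, chi, nu):
    """h(x) * f(chi, nu) / f(chi + T(x), nu + 1) with h(x) = 1 and T(x) = x."""
    return math.exp(log_f(chi, nu) - log_f(chi + x, nu + 1))

def prior_predictive_by_quadrature(x, chi, nu, lo=-40.0, hi=40.0, steps=8000):
    """The same quantity from the defining integral over eta (trapezoidal rule)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        eta = lo + i * h
        log_g = -math.log1p(math.exp(eta))  # log g(eta)
        # log of p_F(x | eta) * p_G(eta | chi, nu)
        log_integrand = log_f(chi, nu) + (nu + 1.0) * log_g + eta * (chi + x)
        value = math.exp(log_integrand)
        total += value if 0 < i < steps else 0.5 * value
    return total * h

chi, nu = 2.0, 5.0
for x in (0, 1):
    print(x, prior_predictive(x, chi, nu), prior_predictive_by_quadrature(x, chi, nu))
```

In this parameterization the prior corresponds to a Beta($\chi$, $\nu - \chi$) distribution on the success probability, so the predictive probability of $x = 1$ reduces to $\chi / \nu$.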

The above result is independent of choice of parametrization of $\boldsymbol{\theta}$, as none of $\boldsymbol{\theta}$, $\boldsymbol{\eta}$ and $g(\dots)$ appears. ($g(\dots)$ is a function of the parameter and hence will assume different forms depending on choice of parametrization.) For standard choices of $F$ and $G$, it is often easier to work directly with the usual parameters rather than rewrite in terms of the natural parameters.

The reason the integral is tractable is that it involves computing the normalization constant of a density defined by the product of a prior distribution and a likelihood. When the two are conjugate, the product is a posterior distribution, and by assumption, the normalization constant of this distribution is known. As shown above, the density function of the compound distribution follows a particular form, consisting of the product of the function $h(x)$ that forms part of the density function for $F$, with the quotient of two forms of the normalization "constant" for $G$, one derived from a prior distribution and the other from a posterior distribution. The beta-binomial distribution is a good example of how this process works.

Despite the analytical tractability of such distributions, they are in themselves usually not members of the exponential family. For example, the three-parameter Student's t distribution, beta-binomial distribution and Dirichlet-multinomial distribution are all predictive distributions of exponential-family distributions (the normal distribution, binomial distribution and multinomial distributions, respectively), but none are members of the exponential family. This can be seen above due to the presence of functional dependence on $\boldsymbol{\chi} + \mathbf{T}(x)$. In an exponential-family distribution, it must be possible to separate the entire density function into multiplicative factors of three types: (1) factors containing only variables, (2) factors containing only parameters, and (3) factors whose logarithm factorizes between variables and parameters. The presence of $\boldsymbol{\chi} + \mathbf{T}(x)$ makes this impossible unless the "normalizing" function $f(\dots)$ either ignores the corresponding argument entirely or uses it only in the exponent of an expression.

Posterior predictive distribution in exponential families

When a conjugate prior is being used, the posterior predictive distribution belongs to the same family as the prior predictive distribution, and is determined simply by plugging the updated hyperparameters for the posterior distribution of the parameter(s) into the formula for the prior predictive distribution. Using the general form of the posterior update equations for exponential-family distributions (see the appropriate section in the exponential family article), we can write out an explicit formula for the posterior predictive distribution:

$$p(\tilde{x} \mid \mathbf{X}, \boldsymbol{\chi}, \nu) = p_H\left(\tilde{x} \mid \boldsymbol{\chi} + \mathbf{T}(\mathbf{X}), \nu + N\right)$$

where

$$\mathbf{T}(\mathbf{X}) = \sum_{i=1}^{N} \mathbf{T}(x_i)$$

This shows that the posterior predictive distribution of a series of observations, in the case where the observations follow an exponential family with the appropriate conjugate prior, has the same probability density as the compound distribution, with parameters as specified above. The observations themselves enter only in the form $\mathbf{T}(\mathbf{X}) = \sum_{i=1}^{N} \mathbf{T}(x_i)$.

This is termed the sufficient statistic of the observations, because it tells us everything we need to know about the observations in order to compute a posterior or posterior predictive distribution based on them (or, for that matter, anything else based on the likelihood of the observations, such as the marginal likelihood).
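A minimal sketch of this update for one concrete family, the Poisson distribution with its conjugate gamma prior, chosen here only as an example. In natural form $h(x) = 1/x!$ and $T(x) = x$, and the conjugate-prior normalizer is $f(\chi, \nu) = \nu^{\chi} / \Gamma(\chi)$, i.e. a gamma distribution on the rate with shape $\chi$ and rate $\nu$. The posterior predictive is obtained by replacing $(\chi, \nu)$ with $(\chi + \sum_i x_i, \nu + N)$. Names and data are hypothetical.

```python
import math

def log_f(chi, nu):
    """Log normalizer of the conjugate prior for the Poisson family:
    f(chi, nu) = nu**chi / Gamma(chi)."""
    return chi * math.log(nu) - math.lgamma(chi)

def predictive_pmf(x, chi, nu):
    """h(x) * f(chi, nu) / f(chi + T(x), nu + 1), with h(x) = 1/x!, T(x) = x."""
    return math.exp(-math.lgamma(x + 1) + log_f(chi, nu) - log_f(chi + x, nu + 1))

def update(chi, nu, data):
    """Posterior hyperparameters; the data enter only through sum(data) and N."""
    return chi + sum(data), nu + len(data)

chi0, nu0 = 2.0, 1.0
chi_a, nu_a = update(chi0, nu0, [3, 0, 4, 1])  # made-up counts
chi_b, nu_b = update(chi0, nu0, [2, 2, 2, 2])  # same sum and same N

print("prior predictive P(x = 2):    ", predictive_pmf(2, chi0, nu0))
print("posterior predictive P(x = 2):", predictive_pmf(2, chi_a, nu_a))
print("same sufficient statistic:    ", (chi_a, nu_a) == (chi_b, nu_b))
```

The two data sets differ in every entry but share the sufficient statistic, so they lead to identical posterior predictive distributions.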

Joint predictive distribution, marginal likelihood

It is also possible to consider the result of compounding a joint distribution over a fixed number of independent identically distributed samples with a prior distribution over a shared parameter. In a Bayesian setting, this comes up in various contexts: computing the prior or posterior predictive distribution of multiple new observations, and computing the marginal likelihood of observed data (the denominator in Bayes' law). When the distribution of the samples is from the exponential family and the prior distribution is conjugate, the resulting compound distribution will be tractable and follow a similar form to the expression above. It is easy to show, in fact, that the joint compound distribution of a set $\mathbf{X} = \{x_1, \dots, x_N\}$ for $N$ observations is

$$p_H(\mathbf{X} \mid \boldsymbol{\chi}, \nu) = \left( \prod_{i=1}^{N} h(x_i) \right) \frac{f(\boldsymbol{\chi}, \nu)}{f\left(\boldsymbol{\chi} + \mathbf{T}(\mathbf{X}), \nu + N\right)}$$

This result and the above result for a single compound distribution extend trivially to the case of a distribution over a vector-valued observation, such as a multivariate Gaussian distribution.
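The joint compound distribution can be illustrated with Bernoulli observations under a Beta(`a`, `b`) prior, an example chosen for this sketch. Every $h(x_i)$ equals 1 and the ratio of normalizers becomes $B(a + s, b + N - s) / B(a, b)$, where $s$ is the number of ones. The code also checks the chain rule: the marginal likelihood equals the product of one-step posterior predictive probabilities, with the hyperparameters updated after each observation. Names and data are hypothetical.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_marginal_likelihood(data, a, b):
    """log p(X | a, b) for a 0/1 sequence, from the ratio of normalizers."""
    s, n = sum(data), len(data)
    return log_beta(a + s, b + n - s) - log_beta(a, b)

def log_sequential_product(data, a, b):
    """The same quantity via the chain rule: accumulate the log posterior
    predictive probability of each observation given the earlier ones."""
    total = 0.0
    for x in data:
        p_one = a / (a + b)           # predictive P(x = 1) at this step
        total += math.log(p_one if x == 1 else 1.0 - p_one)
        a, b = a + x, b + (1 - x)     # conjugate update
    return total

data = [1, 0, 1, 1, 0]
print(math.exp(log_marginal_likelihood(data, 1.0, 1.0)))
print(math.exp(log_sequential_product(data, 1.0, 1.0)))
```

Both routes give the same value, and the result does not depend on the order of the observations because only the sufficient statistic enters the closed form.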

Relation to Gibbs sampling

Collapsing out a node in a collapsed Gibbs sampler is equivalent to compounding. As a result, when a set of independent identically distributed (i.i.d.) nodes all depend on the same prior node, and that node is collapsed out, the resulting conditional probability of one node given the others as well as the parents of the collapsed-out node (but not conditioning on any other nodes, e.g. any child nodes) is the same as the posterior predictive distribution of all the remaining i.i.d. nodes (or more correctly, formerly i.i.d. nodes, since collapsing introduces dependencies among the nodes). That is, it is generally possible to implement collapsing out of a node simply by attaching all parents of the node directly to all children, and replacing the former conditional probability distribution associated with each child with the corresponding posterior predictive distribution for the child conditioned on its parents and the other formerly i.i.d. nodes that were also children of the removed node. For an example, for more specific discussion and for some cautions about certain tricky issues, see the Dirichlet-multinomial distribution article.
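A minimal sketch of such a conditional for the simplest case: categorical nodes `z` that share a single probability vector with a Dirichlet(`alpha`) prior, which is collapsed out. Under these assumptions, the conditional probability of one node given the others is the posterior predictive of the Dirichlet-categorical model, proportional to the count of each category among the other nodes plus the corresponding `alpha` entry. The function name and data are hypothetical, and models with further child nodes would need additional likelihood factors that are not shown here.

```python
def collapsed_conditional(assignments, i, alpha):
    """P(z_i = k | all other z, alpha) for each category k, after the shared
    probability vector has been collapsed (integrated) out."""
    counts = [0] * len(alpha)
    for j, z in enumerate(assignments):
        if j != i:                    # exclude the node being resampled
            counts[z] += 1
    weights = [c + a for c, a in zip(counts, alpha)]
    total = sum(weights)
    return [w / total for w in weights]

z = [0, 0, 1, 2, 0]                   # current category of each node
probs = collapsed_conditional(z, 0, [1.0, 1.0, 1.0])
print(probs)                          # proportional to [2 + 1, 1 + 1, 1 + 1]
```

In a full collapsed Gibbs sweep, each node would be resampled in turn from a distribution of this kind, with the counts recomputed or incrementally updated after every draw.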

References

  1. "Posterior Predictive Distribution". SAS. Retrieved 19 July 2014.
  2. Gelman, Andrew; Carlin, John B.; Stern, Hal S.; Dunson, David B.; Vehtari, Aki; Rubin, Donald B. (2013). Bayesian Data Analysis (Third ed.). Chapman and Hall/CRC. p. 7. ISBN 978-1-4398-4095-5.

Further reading

  • Ntzoufras, Ioannis (2009). "The Predictive Distribution and Model Checking". Bayesian Modeling Using WinBUGS. Wiley. ISBN 978-0-470-14114-4.