
A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of sample data (and similar data from a larger population). A statistical model represents, often in considerably idealized form, the data-generating process.[1] When referring specifically to probabilities, the corresponding term is probabilistic model. All statistical hypothesis tests and all statistical estimators are derived via statistical models. More generally, statistical models are part of the foundation of statistical inference. A statistical model is usually specified as a mathematical relationship between one or more random variables and other non-random variables. As such, a statistical model is "a formal representation of a theory" (Herman Adèr quoting Kenneth Bollen).[2]

Introduction

Informally, a statistical model can be thought of as a statistical assumption (or set of statistical assumptions) with a certain property: that the assumption allows us to calculate the probability of any event. As an example, consider a pair of ordinary six-sided dice. We will study two different statistical assumptions about the dice.

The first statistical assumption is this: for each of the dice, the probability of each face (1, 2, 3, 4, 5, and 6) coming up is 1/6. From that assumption, we can calculate the probability of both dice coming up 5: 1/6 × 1/6 = 1/36. More generally, we can calculate the probability of any event: e.g. (1 and 2) or (3 and 3) or (5 and 6). The alternative statistical assumption is this: for each of the dice, the probability of the face 5 coming up is 1/8 (because the dice are weighted). From that assumption, we can calculate the probability of both dice coming up 5: 1/8 × 1/8 = 1/64. We cannot, however, calculate the probability of any other nontrivial event, as the probabilities of the other faces are unknown.

The first statistical assumption constitutes a statistical model because, with the assumption alone, we can calculate the probability of any event. The alternative statistical assumption does not constitute a statistical model because, with the assumption alone, we cannot calculate the probability of every event. In the example above, with the first assumption, calculating the probability of an event is easy. With some other examples, though, the calculation can be difficult, or even impractical (e.g. it might require millions of years of computation). For an assumption to constitute a statistical model, such difficulty is acceptable: doing the calculation does not need to be practicable, just theoretically possible.
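The dice example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two dice are taken to be independent (as the multiplication 1/6 × 1/6 in the text implies), the compound event is read as unordered pairs, and the names `FAIR` and `event_probability` are hypothetical helpers rather than part of any library.

```python
from fractions import Fraction
from itertools import product

# First assumption: each face of each die has probability 1/6, and the two
# dice are assumed independent, so every ordered pair has probability 1/36.
FACES = range(1, 7)
FAIR = {face: Fraction(1, 6) for face in FACES}


def event_probability(event, face_probs=FAIR):
    """Sum the probabilities of all ordered outcomes (a, b) in the event."""
    return sum(
        face_probs[a] * face_probs[b]
        for a, b in product(FACES, repeat=2)
        if event(a, b)
    )


both_five = event_probability(lambda a, b: a == 5 and b == 5)

# The compound event from the text, read here as unordered pairs:
# {1, 2} or {3, 3} or {5, 6}.
targets = [(1, 2), (3, 3), (5, 6)]
compound = event_probability(lambda a, b: tuple(sorted((a, b))) in targets)

print(both_five)  # 1/36
print(compound)   # 5/36
```

Under the alternative assumption only the probability of face 5 is known, so a table like `FAIR` cannot be filled in. That is the sense in which the alternative assumption fails to be a statistical model.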

Formal definition

In mathematical terms, a statistical model is a pair $(S, \mathcal{P})$, where $S$ is the set of possible observations, i.e. the sample space, and $\mathcal{P}$ is a set of probability distributions on $S$.[3] The set $\mathcal{P}$ represents all of the models that are considered possible. This set is typically parameterized: $\mathcal{P} = \{F_\theta : \theta \in \Theta\}$. The set $\Theta$ defines the parameters of the model. If a parameterization is such that distinct parameter values give rise to distinct distributions, i.e. $F_{\theta_1} = F_{\theta_2} \Rightarrow \theta_1 = \theta_2$ (in other words, the mapping is injective), it is said to be identifiable.[3]
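Identifiability can be illustrated with a small numerical check. The sketch below is an assumption-laden example, not part of the formal definition: it parameterizes a Gaussian by a mean and a scale `s` that is allowed to be negative, so that `s` and `-s` give the same distribution. The function names are hypothetical, and comparing densities on a grid is a sanity check rather than a proof.

```python
import math


def gaussian_pdf(x, mu, s):
    """Density of a Gaussian with mean mu and variance s**2 (s may be negative)."""
    var = s * s
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def same_distribution(theta_1, theta_2, grid):
    """Numerical check that two parameter values give the same density on a grid."""
    return all(
        math.isclose(gaussian_pdf(x, *theta_1), gaussian_pdf(x, *theta_2))
        for x in grid
    )


grid = [i / 10 for i in range(-50, 51)]

# With s ranging over all nonzero reals, (0, 2) and (0, -2) are distinct
# parameter values that give the same distribution: not identifiable.
not_injective = same_distribution((0.0, 2.0), (0.0, -2.0), grid)

# Restricting to s > 0 removes that duplicate; these two values now differ.
distinct = not same_distribution((0.0, 2.0), (0.0, 3.0), grid)

print(not_injective, distinct)
```

Restricting the scale to positive values is the usual way to make this parameterization injective.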

In some cases, the model can be more complex.

  • In Bayesian statistics, the model is extended by adding a probability distribution over the parameter space $\Theta$.
  • A statistical model can sometimes distinguish two sets of probability distributions. The first set $\mathcal{Q} = \{F_\theta : \theta \in \Theta\}$ is the set of models considered for inference. The second set $\mathcal{P} = \{F_\lambda : \lambda \in \Lambda\}$ is the set of models that could have generated the data, which is much larger than $\mathcal{Q}$. Such statistical models are key in checking that a given procedure is robust, i.e. that it does not produce catastrophic errors when its assumptions about the data are incorrect.

An example

Suppose that we have a population of children, with the ages of the children distributed uniformly in the population. The height of a child will be stochastically related to the age: e.g. when we know that a child is of age 7, this influences the chance of the child being 1.5 meters tall. We could formalize that relationship in a linear regression model, like this: $\text{height}_i = b_0 + b_1 \text{age}_i + \varepsilon_i$, where $b_0$ is the intercept, $b_1$ is a parameter that age is multiplied by to obtain a prediction of height, $\varepsilon_i$ is the error term, and $i$ identifies the child. This implies that height is predicted by age, with some error.

An admissible model must be consistent with all the data points. Thus, a straight line ($\text{height}_i = b_0 + b_1 \text{age}_i$) cannot be admissible for a model of the data, unless it exactly fits all the data points, i.e. all the data points lie perfectly on the line. The error term, $\varepsilon_i$, must be included in the equation, so that the model is consistent with all the data points. To do statistical inference, we would first need to assume some probability distributions for the $\varepsilon_i$. For instance, we might assume that the $\varepsilon_i$ are i.i.d. Gaussian, with zero mean. In this instance, the model would have 3 parameters: $b_0$, $b_1$, and the variance of the Gaussian distribution. We can formally specify the model in the form $(S, \mathcal{P})$ as follows. The sample space, $S$, of our model comprises the set of all possible pairs (age, height). Each possible value of $\theta = (b_0, b_1, \sigma^2)$ determines a distribution on $S$; denote that distribution by $F_\theta$. If $\Theta$ is the set of all possible values of $\theta$, then $\mathcal{P} = \{F_\theta : \theta \in \Theta\}$. (The parameterization is identifiable, and this is easy to check.)

In this example, the model is determined by (1) specifying $S$ and (2) making some assumptions relevant to $\mathcal{P}$. There are two assumptions: that height can be approximated by a linear function of age; that errors in the approximation are distributed as i.i.d. Gaussian. The assumptions are sufficient to specify $\mathcal{P}$, as they are required to do.
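A minimal simulation of the height example is sketched below, assuming made-up values throughout: ages uniform on 2 to 12 years, an intercept of 0.75 m, a slope of 0.06 m per year, and an error standard deviation of 0.05 m. None of these numbers come from the text; `simulate` and `fit` are hypothetical helper names. The fit uses the standard maximum-likelihood estimates for a linear model with Gaussian errors (least-squares coefficients, and the residual sum of squares divided by n for the variance).

```python
import random


def simulate(n, b0, b1, sigma, rng):
    """Draw (age, height) pairs from height_i = b0 + b1 * age_i + eps_i."""
    data = []
    for _ in range(n):
        age = rng.uniform(2.0, 12.0)   # assumed age range, uniform
        eps = rng.gauss(0.0, sigma)    # i.i.d. zero-mean Gaussian error
        data.append((age, b0 + b1 * age + eps))
    return data


def fit(data):
    """Maximum-likelihood estimates of (b0, b1, sigma^2) for the linear model."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sigma2 = sum((y - b0 - b1 * x) ** 2 for x, y in data) / n
    return b0, b1, sigma2


rng = random.Random(0)  # fixed seed so the run is reproducible
data = simulate(2000, b0=0.75, b1=0.06, sigma=0.05, rng=rng)
b0_hat, b1_hat, sigma2_hat = fit(data)
print(b0_hat, b1_hat, sigma2_hat)
```

Each triple returned by `fit` is one value of θ = (b0, b1, σ²), i.e. one member of the parameterized family described above.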

General remarks

A statistical model is a special class of mathematical model. What distinguishes a statistical model from other mathematical models is that a statistical model is non-deterministic. Thus, in a statistical model specified via mathematical equations, some of the variables do not have specific values, but instead have probability distributions; i.e. some of the variables are stochastic. In the above example with children's heights, ε is a stochastic variable; without that stochastic variable, the model would be deterministic. Statistical models are often used even when the data-generating process being modeled is deterministic. For instance, coin tossing is, in principle, a deterministic process; yet it is commonly modeled as stochastic (via a Bernoulli process). Choosing an appropriate statistical model to represent a given data-generating process is sometimes extremely difficult, and may require knowledge of both the process and relevant statistical analyses. Relatedly, the statistician Sir David Cox has said, "How [the] translation from subject-matter problem to statistical model is done is often the most critical part of an analysis".[4]

There are three purposes for a statistical model, according to Konishi & Kitagawa:[5]

  1. Predictions
  2. Extraction of information
  3. Description of stochastic structures

Those three purposes are essentially the same as the three purposes indicated by Friendly & Meyer: prediction, estimation, description.[6]

Dimension of a model

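Suppose that we have a statistical model $(S, \mathcal{P})$ with $\mathcal{P} = \{F_\theta : \theta \in \Theta\}$. In notation, we write that $\Theta \subseteq \mathbb{R}^k$ where k is a positive integer ($\mathbb{R}$ denotes the real numbers; other sets can be used, in principle). Here, k is called the dimension of the model. The model is said to be parametric if $\Theta$ has finite dimension.[citation needed] As an example, if we assume that data arise from a univariate Gaussian distribution, then we are assuming that

$$\mathcal{P} = \left\{ F_{\mu,\sigma}(x) \equiv \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) : \mu \in \mathbb{R},\ \sigma > 0 \right\}.$$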

In this example, the dimension, k, equals 2. As another example, suppose that the data consists of points (x, y) that we assume are distributed according to a straight line with i.i.d. Gaussian residuals (with zero mean): this leads to the same statistical model as was used in the example with children's heights. The dimension of the statistical model is 3: the intercept of the line, the slope of the line, and the variance of the distribution of the residuals. (Note the set of all possible lines has dimension 2, even though geometrically, a line has dimension 1.)

Although formally $\theta \in \Theta$ is a single parameter that has dimension k, it is sometimes regarded as comprising k separate parameters. For example, with the univariate Gaussian distribution, $\theta$ is formally a single parameter with dimension 2, but it is often regarded as comprising 2 separate parameters: the mean and the standard deviation. A statistical model is nonparametric if the parameter set $\Theta$ is infinite dimensional. A statistical model is semiparametric if it has both finite-dimensional and infinite-dimensional parameters. Formally, if k is the dimension of $\Theta$ and n is the number of samples, both semiparametric and nonparametric models have $k \to \infty$ as $n \to \infty$. If $k/n \to 0$ as $n \to \infty$, then the model is semiparametric; otherwise, the model is nonparametric.

Parametric models are by far the most commonly used statistical models. Regarding semiparametric and nonparametric models, Sir David Cox has said, "These typically involve fewer assumptions of structure and distributional form but usually contain strong assumptions about independencies".[7]

Nested models

Two statistical models are nested if the first model can be transformed into the second model by imposing constraints on the parameters of the first model. As an example, the set of all Gaussian distributions has, nested within it, the set of zero-mean Gaussian distributions: we constrain the mean in the set of all Gaussian distributions to get the zero-mean distributions. As a second example, the quadratic model

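$$y = b_0 + b_1 x + b_2 x^2 + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2)$$

has, nested within it, the linear model

$$y = b_0 + b_1 x + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2)$$

which is obtained by constraining the parameter $b_2$ to equal 0.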

In both those examples, the first model has a higher dimension than the second model (for the first example, the zero-mean model has dimension 1). Such is often, but not always, the case. As an example where they have the same dimension, the set of positive-mean Gaussian distributions is nested within the set of all Gaussian distributions; they both have dimension 2.
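The first example (zero-mean Gaussians nested within all Gaussians) can be checked numerically. The sketch below assumes simulated data with a hypothetical true mean of 0.3 and unit variance; the function names are illustrative only. It fits both models by maximum likelihood and compares the maximized log-likelihoods.

```python
import math
import random


def gaussian_loglik(xs, mu, var):
    """Log-likelihood of i.i.d. Gaussian data with mean mu and variance var."""
    n = len(xs)
    rss = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * var) - rss / (2 * var)


def fit_full(xs):
    """MLE in the set of all Gaussians (dimension 2): free mean and variance."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, var


def fit_zero_mean(xs):
    """MLE in the nested zero-mean set (dimension 1): mean constrained to 0."""
    var = sum(x * x for x in xs) / len(xs)
    return 0.0, var


rng = random.Random(1)  # fixed seed; the data are simulated, not real
xs = [rng.gauss(0.3, 1.0) for _ in range(500)]

ll_full = gaussian_loglik(xs, *fit_full(xs))
ll_nested = gaussian_loglik(xs, *fit_zero_mean(xs))

# Every zero-mean Gaussian also belongs to the full set, so the maximized
# log-likelihood of the full model can never be smaller.
print(ll_full >= ll_nested)  # True
```

The constraint (mean fixed at 0) is what turns the first model into the second; the inequality between the two maximized log-likelihoods follows from that containment.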

Comparing models

Comparing statistical models is fundamental for much of statistical inference. Konishi & Kitagawa (2008, p. 75) state: "The majority of the problems in statistical inference can be considered to be problems related to statistical modeling. They are typically formulated as comparisons of several statistical models." Common criteria for comparing models include the following: $R^2$, Bayes factor, Akaike information criterion, and the likelihood-ratio test together with its generalization, the relative likelihood.

Another way of comparing two statistical models is through the notion of deficiency introduced by Lucien Le Cam.[8]
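Two of the criteria listed above, the Akaike information criterion and the likelihood-ratio statistic, can be sketched for a pair of nested Gaussian regression models. The example rests on assumptions: the data are simulated with made-up coefficients, the models compared are an intercept-only model (dimension 2) and a straight-line model (dimension 3), and the helper names are hypothetical. The AIC is computed as 2k − 2 ln L̂, with L̂ the maximized likelihood.

```python
import math
import random


def max_loglik(residuals):
    """Maximized Gaussian log-likelihood given residuals at the fitted mean."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n   # MLE of the error variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln(L_hat); lower is preferred."""
    return 2 * k - 2 * loglik


rng = random.Random(2)  # fixed seed; simulated data with assumed coefficients
xs = [rng.uniform(0.0, 10.0) for _ in range(300)]
ys = [1.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

# Model A (dimension 2): y = b0 + eps, i.e. the linear model with b1 = 0.
res_a = [y - mean_y for y in ys]

# Model B (dimension 3): y = b0 + b1 * x + eps, fitted by least squares.
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
b0 = mean_y - b1 * mean_x
res_b = [y - b0 - b1 * x for x, y in zip(xs, ys)]

ll_a, ll_b = max_loglik(res_a), max_loglik(res_b)
aic_a, aic_b = aic(ll_a, 2), aic(ll_b, 3)
lr_stat = 2 * (ll_b - ll_a)   # likelihood-ratio statistic for A nested in B

print(aic_a, aic_b, lr_stat)
```

Because the simulated data do have a nonzero slope, the straight-line model should come out with the lower AIC here, even after the penalty for its extra parameter.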

Notes

  1. Cox 2006, p. 178
  2. Adèr 2008, p. 280
  3. McCullagh 2002
  4. Cox 2006, p. 197
  5. Konishi & Kitagawa 2008, §1.1
  6. Friendly & Meyer 2016, §11.6
  7. Cox 2006, p. 2
  8. Le Cam, Lucien (1964). "Sufficiency and Approximate Sufficiency". Annals of Mathematical Statistics. 35 (4). Institute of Mathematical Statistics: 1429. doi:10.1214/aoms/1177700372.
