拓端tecdat | Using the Fisher test in R to explore the relationship between apartment prices across districts

2021-07-02 18:00 Author: 拓端tecdat

Original link: http://tecdat.cn/?p=18927

Original source: the 拓端數據部落 official account

This article uses Polish apartment price data to illustrate the Fisher test.

with(data = apart , boxplot(price ~ dis ))

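Here we group the apartments by district (this could also be done within a simple regression; the 5 explanatory variables are not important here). We can reorder the districts: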

A = A[order(A$x),]
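The article does not show how `A` is built. A minimal sketch, assuming `A` holds the mean price per district as returned by `aggregate()` (whose default result column is named `x`), could look like this. The data frame `toy` and its prices are made up for illustration and are not the Polish data.

```r
# Toy data: hypothetical prices per square metre in three districts
toy <- data.frame(
  m2.price = c(3000, 3100, 5200, 5000, 2900, 2950),
  district = factor(c("Bielany", "Bielany", "Srodmiescie",
                      "Srodmiescie", "Praga", "Praga"))
)

# Mean price per district; aggregate() names the columns Group.1 and x
A <- aggregate(toy$m2.price, by = list(toy$district), FUN = mean)

# Sort the districts from cheapest to most expensive
A <- A[order(A$x), ]

# Reorder the factor levels so the cheapest district becomes the reference
toy$district <- factor(toy$district, levels = as.character(A$Group.1))
levels(toy$district)
```

With the levels ordered this way, `lm()` uses the cheapest district as the baseline, so every other coefficient is a price difference relative to it.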

Taking the cheapest district as the reference, we get

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
(Intercept)          2968.36      58.02  51.160   <2e-16 ***
districtBielany        17.38      84.16   0.207    0.836
districtPraga          26.45      85.12   0.311    0.756
districtUrsynow        42.01      82.65   0.508    0.611
districtBemowo         80.10      83.71   0.957    0.339
districtUrsus         102.01      82.25   1.240    0.215
districtZoliborz      829.59      83.94   9.884   <2e-16 ***
districtMokotow       887.10      81.86  10.837   <2e-16 ***
districtOchota        987.93      84.16  11.738   <2e-16 ***
districtSrodmiescie  2214.39      83.28  26.591   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 597.4 on 990 degrees of freedom
Multiple R-squared: 0.5698, Adjusted R-squared: 0.5659
F-statistic: 145.7 on 9 and 990 DF, p-value: < 2.2e-16

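We can test whether prices in the first 5 districts differ from the reference. This is a joint test of several coefficients, so we use the Fisher (F) test: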

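library(car)  # provides linearHypothesis()
# Jointly test that the five cheapest non-reference districts have zero coefficients
linearHypothesis(reg, c("districtBielany = 0",
                        "districtPraga = 0",
                        "districtUrsynow = 0",
                        "districtBemowo = 0",
                        "districtUrsus = 0"))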
Linear hypothesis test
Model 1: restricted model
Model 2: m2.price ~ district
  Res.Df       RSS Df Sum of Sq      F Pr(>F)
1    995 354051715
2    990 353269202  5    782513 0.4386 0.8217
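The F statistic in this output can be reproduced by hand from the two residual sums of squares. The sketch below only uses numbers copied from the table above; the variable names are chosen here for illustration.

```r
# Values copied from the linear hypothesis test output
rss_restricted   <- 354051715  # model with the 5 coefficients set to zero
rss_unrestricted <- 353269202  # full model m2.price ~ district
n_restrictions   <- 5          # number of coefficients tested jointly
df_residual      <- 990        # residual degrees of freedom of the full model

# F = ((RSS_r - RSS_u) / q) / (RSS_u / df)
F_stat <- ((rss_restricted - rss_unrestricted) / n_restrictions) /
  (rss_unrestricted / df_residual)

# Upper-tail probability under the F(q, df) distribution
p_value <- pf(F_stat, df1 = n_restrictions, df2 = df_residual,
              lower.tail = FALSE)

round(F_stat, 4)
round(p_value, 4)
```

The statistic compares the increase in residual sum of squares per restriction with the residual variance of the full model, so a value well below 1 means the restrictions cost almost nothing in fit.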

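The Fisher statistic is small (F ≈ 0.44) and the p-value is 82%, so we cannot reject the hypothesis that these five coefficients are all zero. Extending the hypothesis to a sixth district gives: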

Linear hypothesis test
Model 1: restricted model
Model 2: m2.price ~ district
  Res.Df       RSS Df Sum of Sq      F    Pr(>F)
1    996 405455409
2    990 353269202  6  52186207 24.374 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

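This time the hypothesis is rejected, so we merge the first 6 districts (the reference district and the five tested above) and call the merged district A. Looking at the average price by district, we get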

with(data = apart, boxplot(price ~ distr))


We start again with the cheapest district as the reference, and test whether the coefficients of the next two districts in the linear regression are zero.

Linear hypothesis test
Model 1: restricted model
Model 2: m2.price ~ district
  Res.Df       RSS Df Sum of Sq      F Pr(>F)
1    997 355292524
2    995 354051715  2   1240809 1.7435 0.1754

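The p-value is about 0.175, so we do not reject the null hypothesis. We then have three groups of districts, named A, B and C, and obtain the following box plot: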

with(data = apart , boxplot( price ~ dist ))

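So in the end the districts can be classified into three groups: if the goal is to predict the price, a 10-level factor is unnecessary and a 3-level one is enough!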
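The merging step can be sketched in base R by collapsing factor levels and comparing the two nested models with `anova()`. Everything below is a hypothetical illustration: the data are simulated, the district names `d01` to `d10` are placeholders, and the assignment of districts to groups A, B and C is an assumption made for the example, not the grouping of the Polish data.

```r
set.seed(1)

# Simulated data: 10 districts d01..d10 falling into three price tiers
district  <- factor(rep(sprintf("d%02d", 1:10), each = 50))
tier_mean <- rep(c(3000, 3900, 5200), times = c(6, 3, 1))
price     <- rnorm(500, mean = tier_mean[as.integer(district)], sd = 600)

# Collapse the 10 levels into 3 groups (illustrative mapping)
group <- district
levels(group) <- list(A = sprintf("d%02d", 1:6),
                      B = sprintf("d%02d", 7:9),
                      C = "d10")

fit10 <- lm(price ~ district)  # 10-level factor
fit3  <- lm(price ~ group)     # 3-level factor, nested in fit10

# F test of the 7 restrictions implied by merging 10 levels into 3
anova(fit3, fit10)
```

A large p-value in this comparison indicates that the 3-level factor fits about as well as the 10-level one, which is the same logic as the linear hypothesis tests used in the article.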
