UA MATH571A QE Practice: Nonparametric Regression in R (Part 1)

Published: 2025/4/14

May 2014, Problem 5
January 2015, Problem 4
May 2015, Problem 4

This post works through Problem 5 from May 2014, Problem 4 from January 2015, and Problem 4 from May 2015.

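May 2014, Problem 5

A study explores how the body mass index (BMI) of North American models changed from the 1950s to the late 2000s. The data are a time series of mean BMI with two parts: the time and the BMI. Time is recorded as a month (character) plus a year (numeric), so we process the time first: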

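    q3.df = read.csv( file.choose() )             # pick the CSV file interactively
    attach( q3.df )                               # expose columns Month, Year, BMI by name
    month.digit = match( Month, month.abb )       # "Jan".."Dec" -> 1..12
    date = paste( Year, month.digit, sep='-' )    # e.g. "2007-12"
    library( zoo )                                # provides as.yearmon()
    month = as.yearmon( date )                    # year-month object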

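The first statement selects the data file and the second attaches the data frame so its columns can be used by name. The third converts the month from a character abbreviation to a number, and the fourth joins year and month with a hyphen. After loading zoo, as.yearmon converts the xxxx-xx strings into year-month objects like this:

    > as.yearmon("2007-12")
    [1] "12月 2007"

(The printed month label depends on the locale; an English locale prints "Dec 2007".) The advantage of this form is that as.numeric turns it into a number:

    > as.numeric(as.yearmon("2007-12"))
    [1] 2007.917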

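The conversion rule is that month $i$ becomes the fraction $(i - 1) / 12$ added to the year, so December 2007 maps to $2007 + 11/12 \approx 2007.917$. With the time converted, we draw the scatterplot: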
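
The conversion rule can also be checked by hand without zoo. The following is a minimal sketch in base R; the `Year` and `Month` values are made-up examples, not the exam data, and the months are assumed to be English abbreviations as in `month.abb`.

```r
# Hypothetical example values; month.abb holds base R's English month abbreviations
Year  <- c(2007, 2007, 1953)
Month <- c("Dec", "Jan", "Jun")

i <- match(Month, month.abb)   # month number: 12, 1, 6
x <- Year + (i - 1) / 12       # decimal year, same rule as as.numeric(as.yearmon(...))

print(round(x, 3))
```

The first value agrees with the 2007.917 shown for December 2007 above.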

    # month was created above with as.yearmon(date)
    x = as.numeric( month ); Y = BMI    # decimal year and response
    plot( Y~x, pch=19 )

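The scatterplot shows three features:

The data contain replicates (several observations at the same time point);

Overall, the models' mean BMI trends downward over the years;

The relationship is clearly not linear.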
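
The presence of replicates can also be confirmed numerically rather than by eye. This is a small sketch on made-up time points (the vector `x` here is hypothetical, not the exam data):

```r
# Hypothetical toy data: two observations share the time point 1960.5
x <- c(1955.0, 1960.5, 1960.5, 1972.25, 1980.0)

reps <- table(x)                      # number of observations at each distinct x
replicated <- reps[reps > 1]          # time points observed more than once
has.replicates <- any(duplicated(x))  # TRUE if any time point repeats

print(replicated)
```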

Next we fit the data with loess:

    # robust local-linear fit; span = fraction of points in each local window
    BMI1r.loess = loess( Y~x, span = 0.5, degree = 1, family='symmetric' )
    x.grid = seq( 1953, 2009, .25 )
    Ysmooth1r = predict( BMI1r.loess, data.frame(x = x.grid) )
    plot( Y~x, pch=19, xlim=c(1950,2009), ylim=c(15,24) )
    lines( x.grid, Ysmooth1r, lwd=2 )    # overlay the smoothed curve
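
As a side note on the arguments used above: `span` is the fraction of observations used in each local fit, `degree` is the degree of the local polynomial, and `family='symmetric'` requests a robust fit that downweights points with large residuals, whereas the default `family='gaussian'` is ordinary local least squares. The sketch below illustrates the difference on simulated data; the signal, the noise level, and the planted outlier are all assumptions made for illustration and are not the BMI data.

```r
set.seed(1)
x <- seq(1950, 2009, by = 0.25)
# Hypothetical signal: falls until about 1983, then stays flat, plus noise
y <- 20 - 0.08 * pmin(x - 1950, 33) + rnorm(length(x), sd = 0.4)
y[10] <- 28                     # one planted gross outlier

fit.robust <- loess(y ~ x, span = 0.5, degree = 1, family = "symmetric")
fit.ls     <- loess(y ~ x, span = 0.5, degree = 1, family = "gaussian")

grid <- data.frame(x = seq(1953, 2009, by = 0.25))
plot(y ~ x, pch = 19, col = "grey")
lines(grid$x, predict(fit.robust, grid), lwd = 2)          # robust fit
lines(grid$x, predict(fit.ls, grid), lwd = 2, lty = 2)     # least-squares fit
```

Near the planted outlier the dashed least-squares curve is pulled toward it, while the robust curve largely ignores it.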

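Roughly, BMI trends downward until about 1983, and there is little rebound after 1983. The residual plot is:

    plot( resid(BMI1r.loess)~x, pch=19 ); abline( h=0 )

The residuals are spread fairly evenly on both sides of the horizontal axis, but the topmost point looks like a possible outlier.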

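January 2015, Problem 4

This problem is very similar to the previous one, so I will skip the commentary. The data are stored on alternating rows, so reading the file produces many missing values (NA), which can be dropped with na.omit.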

    gapminder.df = read.csv( file.choose() )
    # drop incomplete rows jointly so GDP and life.expect stay paired
    gap.df = na.omit( gapminder.df[ , c('GDP','life.expect') ] )
    G = gap.df$GDP
    Y = gap.df$life.expect
    hist( G, prob=TRUE, main='' )
    lines( density(G) )
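
One caveat about the na.omit step: applying it to each column separately keeps the pairs aligned only if the missing values sit in the same rows of both columns. The toy example below (a made-up data frame, not the gapminder file) shows what can go wrong otherwise and one way to drop incomplete rows jointly.

```r
# Hypothetical toy data frame: NAs fall in different rows of the two columns
df <- data.frame(GDP         = c(1000, NA, 3000, 4000),
                 life.expect = c(50,   60, NA,   80))

# Column-wise removal: both vectors have length 3 but pair up the wrong rows
G.bad <- na.omit(df$GDP)          # 1000 3000 4000
Y.bad <- na.omit(df$life.expect)  # 50 60 80

# Row-wise removal: keep only rows where both values are present
ok <- complete.cases(df[, c("GDP", "life.expect")])
G <- df$GDP[ok]                   # 1000 4000
Y <- df$life.expect[ok]           # 50 80
```

With the column-wise version, the GDP of 3000 ends up paired with a life expectancy of 60, which belong to different rows of the original data.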

    X = log(G)    # natural log of GDP
    plot( Y ~ X, pch=19 )

    gapminder.loess = loess( Y~X, span=0.7, degree=1, family='symmetric' )
    X.grid = seq(4,11)
    Ysmooth1r = predict( gapminder.loess, data.frame(X = X.grid) )
    plot( Y~X, pch=19, xlim=c(4,11), ylim=c(40,90) )
    lines( X.grid, Ysmooth1r, lwd=2 )    # overlay the smoothed curve

    plot( resid(gapminder.loess)~X, pch=19, ylab='Resid.' ); abline( h=0 )

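May 2015, Problem 4

loess may also be useful as a model diagnostic. After fitting a linear model, we run a nonparametric regression with loess on the scatterplot of the residuals (here, their absolute values) against the fitted values. If the smoothed curve is not flat, the spread of the residuals changes with the fitted value, which indicates heteroscedasticity; if the curve is fairly flat, the variance looks constant.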
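
To see what a non-flat curve looks like, the idea can be tried on simulated data where the answer is known. This is a sketch under stated assumptions: the data are generated with an error standard deviation that grows with `X`, and every name and constant below is made up for illustration.

```r
set.seed(42)
n <- 200
X <- runif(n, 1, 10)
# Hypothetical heteroscedastic data: error sd grows with X
Y <- 2 + 0.5 * X + rnorm(n, sd = 0.2 * X)

fit      <- lm(Y ~ X)
absresid <- abs(resid(fit))     # absolute residuals
Yhat     <- fitted(fit)         # fitted values

lo     <- loess(absresid ~ Yhat, span = 0.75, degree = 2, family = "symmetric")
grid   <- data.frame(Yhat = seq(min(Yhat), max(Yhat), length.out = 100))
smooth <- predict(lo, grid)

plot(absresid ~ Yhat, pch = 19)
lines(grid$Yhat, smooth, lwd = 2)   # a rising curve suggests non-constant variance
```

Because the simulated variance increases with the fitted value, the smoothed curve rises from left to right instead of staying flat.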

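First, look at the data:

    baseball.df = read.csv( file.choose() )
    # drop incomplete rows jointly so the two variables stay paired
    bb.df = na.omit( baseball.df[ , c('batting.average','years') ] )
    Y = bb.df$batting.average
    X = bb.df$years
    plot( Y ~ X, pch=19 )

Next, plot the absolute residuals of the simple linear regression against the fitted values:

    baseballSLR.lm <- lm(Y ~ X)
    absresid = abs(resid(baseballSLR.lm))
    Yhat = fitted(baseballSLR.lm)
    plot(absresid~Yhat, pch=19)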

With span $q = 0.75$:

    # loess of |residuals| on fitted values, local quadratic, robust fit
    baseball.lo = loess( absresid~Yhat, span = 0.75, degree = 2, family='symmetric' )
    Yhat.grid = seq( min(Yhat), max(Yhat), .001 )
    Ysmooth = predict( baseball.lo, data.frame(Yhat = Yhat.grid) )
    plot( absresid~Yhat, xlim=c(.25,.29), ylim=c(0,.11) )
    lines( Yhat.grid, Ysmooth, lwd=2 )

With span $q = 0.33$:

    baseball.lo = loess( absresid~Yhat, span = 0.33, degree = 2, family='symmetric' )
    Yhat.grid = seq( min(Yhat), max(Yhat), .001 )
    Ysmooth = predict( baseball.lo, data.frame(Yhat = Yhat.grid) )
    plot( absresid~Yhat, xlim=c(.25,.29), ylim=c(0,.11) )
    lines( Yhat.grid, Ysmooth, lwd=2 )

With span $q = 0.5$:

    baseball.lo = loess( absresid~Yhat, span = 0.5, degree = 2, family='symmetric' )
    Yhat.grid = seq( min(Yhat), max(Yhat), .001 )
    Ysmooth = predict( baseball.lo, data.frame(Yhat = Yhat.grid) )
    plot( absresid~Yhat, xlim=c(.25,.29), ylim=c(0,.11) )
    lines( Yhat.grid, Ysmooth, lwd=2 )

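Comparing the three fits above leads to two conclusions:

The nonparametric regression curve can basically be regarded as flat. The curve for $q = 0.33$ bends more, but at that span it is not smooth enough to reflect the trend in the data;

The diagnostic conclusion from nonparametric regression is fairly sensitive to the hyperparameter (the span $q$).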
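
The sensitivity to the span can be explored by overlaying several fits on one plot instead of drawing three separate figures. The sketch below uses simulated stand-ins for `Yhat` and `absresid` (constant-variance noise on a made-up range), and the roughness measure is a simple ad hoc choice, not part of the original analysis.

```r
set.seed(7)
n <- 150
Yhat     <- runif(n, 0.25, 0.29)        # hypothetical fitted values
absresid <- abs(rnorm(n, sd = 0.03))    # hypothetical constant-variance residuals

spans <- c(0.33, 0.5, 0.75)
grid  <- data.frame(Yhat = seq(min(Yhat), max(Yhat), length.out = 100))

plot(absresid ~ Yhat, pch = 19, col = "grey")
wiggle <- numeric(length(spans))
for (k in seq_along(spans)) {
  lo <- loess(absresid ~ Yhat, span = spans[k], degree = 2, family = "symmetric")
  s  <- predict(lo, grid)
  lines(grid$Yhat, s, lwd = 2, lty = k)
  # ad hoc roughness: total absolute second difference of the smoothed curve
  wiggle[k] <- sum(abs(diff(s, differences = 2)))
}
legend("topright", legend = paste("span =", spans),
       lty = seq_along(spans), lwd = 2)
print(data.frame(span = spans, roughness = wiggle))
```

Smaller spans typically give a wigglier curve, so a bend that appears only at the smallest span is weak evidence against constant variance.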
