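Contents: Chapter 1 Overview of R; Chapter 2 Data Objects and Data Input/Output; Chapter 3 Basic Data Set Processing; Chapter 4 Functions and Control Flow; Chapter 5 Basic Plotting; Chapter 6 Advanced Plotting; Chapter 7 Rattle, a Visual Data Mining Tool; Resources; Reference
Chapter 1 Overview of R
1. Multiple-choice questions
(1) The shortcut for commenting out multiple lines is (C). A. Ctrl+Shift+N B. Ctrl+N C. Ctrl+Shift+C D. Ctrl+C
(2) The call that cannot directly open the help document of the plot function is (B). A. ?plot B. ??plot C. help(plot) D. help(plot)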
(3) The correct way to load an R package is (B). A. the install.packages function B. the library function C. the .libPaths function D. the install function
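(4) The R package that cannot be used to call classification algorithms is (D). A. nnet B. e1071 C. tree D. arules
2. Hands-on exercises
(1) Following the download and installation steps in Section 1.1, install R on your computer, then learn to use the software by getting familiar with the basic commands and the interface.
(2) Following the download and installation steps in Section 1.2, install RStudio, and use the help documentation to learn how to draw a simple scatter plot with the plot function.
(3) Following the package download and installation steps in Section 1.3, install the DT package (for creating interactive tables) and run datatable(iris) in the console to produce the interactive table shown in Figure 1-29.
Figure 1-29 Interactive table of the iris data set
(4) Following Section 1.4, load the acme data set from the boot package and view its first 6 rows. Also use the help function to look up what the acme data mean and describe them.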
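The source gives no answer code for Chapter 1. For exercise (4), a minimal sketch might look like the following, assuming the boot package (one of R's recommended packages) is installed; the variable name first_rows is only illustrative.

```r
# Load the boot package and attach its acme data set
library(boot)
data(acme)

# head() returns the first six rows by default
first_rows <- head(acme)
print(first_rows)

# Inspect the column types; in an interactive session,
# help(acme) opens the page that describes each column
str(acme)
```

In an interactive session, help(acme) is the call the exercise asks for; str() is used here only so the sketch also runs in batch mode.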
Chapter 2 Data Objects and Data Input/Output
1. Multiple-choice questions
(1) The function that tests for character data is (C). A. is.numeric B. is.logical C. is.character D. is.na
(2) The function that tests for numeric data is (D). A. is.complex B. is.na C. is.integer D. is.numeric
(3) The function that converts an object to logical data is (C). A. as.character B. is.numeric C. as.logical D. as.complex
(4) The option that is not logical data is (D). A. T B. F C. NA D. 10
(5) The function that computes the eigenvalues and eigenvectors of a matrix is (B). A. diag B. eigen C. solve D. det
(6) The function that converts a list to a vector is (D). A. as.matrix B. as.data.frame C. as.list D. unlist
(7) The function used to convert an object to a data frame is (C). A. as.list B. as.matrix C. as.data.frame D. as.vector
(8) The function for entering data from the keyboard is (C). A. read.table B. read.csv C. edit D. readHTMLTable
(9) The RODBC function that submits a query to the database and returns the result is (C). A. odbcConnect B. sqlFetch C. sqlQuery D. sqlDrop
(10) To scrape a table from a web page, the XML package function to use is (D). A. read.csv B. read.table C. read.xlsx D. readHTMLTable
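The type-checking and conversion functions named in the questions above can be tried directly in the console. The following is a small sketch using only base R; the object names (flags, flat, df, e) are illustrative.

```r
# Type predicates return a single TRUE/FALSE for the whole object
is.character("iris")      # TRUE
is.numeric(3.14)          # TRUE
class(NA)                 # "logical": NA is a logical constant
class(10)                 # "numeric"

# Conversions
flags <- as.logical(c(0, 1, 5))             # zero is FALSE, non-zero is TRUE
flat <- unlist(list(a = 1:2, b = 3))        # list flattened to a named vector
df <- as.data.frame(matrix(1:4, nrow = 2))  # matrix converted to a data frame

# Eigenvalues and eigenvectors of a diagonal 2 x 2 matrix
e <- eigen(matrix(c(2, 0, 0, 3), nrow = 2))
e$values                  # eigenvalues in decreasing order
```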
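2. Hands-on exercises
(1) Create an object, then convert and check its data type, as follows: ① Create an object x containing the sequence 1, 3, 5, 6, 8. ② Check whether x is numeric. ③ Convert x to logical data and store it as x1. ④ Check whether x1 is logical.
x <- c(1, 3, 5, 6, 8)   # ① create the vector
is.numeric(x)           # ② is x numeric?
x1 <- as.logical(x)     # ③ non-zero values become TRUE
is.logical(x1)          # ④ is x1 logical?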
(2) Create several data structures, then convert, index, and extend them, as follows: ① Set the working directory. ② Create a vector x containing the sequence 11, 23, 25, 46, 38, 30, 59, 47, 21, 67. ③ Look up the elements 23 and 46 of x (positions 2 and 4), and find the positions of the elements of x that are greater than or equal to 50. ④ Create a repeated factor sequence Species with 3 levels, each level repeated twice, and a total length of 5; the levels are setosa, versicolor, and virginica. ⑤ Create a 5-row, 2-column matrix filled column by column with the elements of x. ⑥ Store the matrix in a data frame data_iris with the column names Sepal.Length and Sepal.Width. ⑦ Save data_iris as a TXT file in the test directory of the workspace. ⑧ Convert data_iris to a vector y. ⑨ Check whether the conversion succeeded.
setwd("./第2章 数据对象与数据读写/02-习题程序/code")
x <- c(11, 23, 25, 46, 38, 30, 59, 47, 21, 67)
x[c(2, 4)]                # elements 23 and 46
which(x >= 50)            # positions of elements greater than or equal to 50
which(x > 35 & x <= 50)   # positions of elements in the interval (35, 50]
# factor() makes Species a factor, as the task requires
Species <- factor(rep(c("setosa", "versicolor", "virginica"), each = 2, length.out = 5))
a <- matrix(x, 5, 2)      # filled by column (the default)
data_iris <- data.frame(Sepal.Length = a[, 1], Sepal.Width = a[, 2])
write.table(data_iris, "./data_iris.txt")
b <- as.matrix(data_iris)
y <- as.vector(b)
is.vector(y)
(3) Read a TXT file, edit the data, and write the result to a CSV file, as follows: ① Read the TXT file data_iris saved in the test directory. ② Store rows 6 to 10 of R's built-in iris data set in a data frame data_iris1. ③ Combine data_iris and data_iris1 into a data frame data_iris2 and save it in the directory that holds the CSV files.
# Assign the result so that data_iris really is the data read from the file
data_iris <- read.table("./第2章 数据对象与数据读写/02-习题程序/code/data_iris.txt")
data_iris1 <- data.frame(iris[6:10, ])
data_iris2 <- cbind(data_iris, data_iris1)
write.csv(data_iris2, "./第2章 数据对象与数据读写/02-习题程序/code/data_iris2.csv")
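The write-then-read round trip in exercises (2) and (3) can be checked without depending on the book's directory layout. The sketch below assumes a temporary file created with tempfile() in place of the exercise paths; the names tmp, restored, and combined are illustrative.

```r
# Build the same 5 x 2 data frame as in exercise (2)
x <- c(11, 23, 25, 46, 38, 30, 59, 47, 21, 67)
data_iris <- data.frame(Sepal.Length = x[1:5], Sepal.Width = x[6:10])

# Write to a temporary file and read it back
tmp <- tempfile(fileext = ".txt")
write.table(data_iris, tmp)
restored <- read.table(tmp, header = TRUE)

# The round trip keeps the column names and the values
same_names <- identical(names(restored), names(data_iris))
same_values <- all(restored == data_iris)

# Combine with rows 6 to 10 of iris column-wise
combined <- cbind(restored, iris[6:10, ])
dim(combined)
```

Because write.table() also writes row names, the header line has one field fewer than the data lines, and read.table() uses the first column as row names.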
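Chapter 3 Basic Data Set Processing
1. Multiple-choice questions
(1) The function that is not used to rename variables is (C). A. the rename function B. the names function C. the name function D. the colnames function
(2) The function used to change the variable names of a matrix is (B). A. the rename function B. the colnames function C. the names function D. the name function
(3) The option that describes what the as.Date function does is (A). A. Converts a date given as a string into a date variable B. Returns the current system date C. Converts a string into a date variable that includes time and time zone D. Converts a date variable into a character variable in a specified format
(4) The incorrect statement about combining data sets is (C). A. Data frames can be combined with the rbind and cbind functions B. The width (number of columns) of the argument to rbind should equal the width of the original data frame C. The height (number of rows) of the argument to rbind should equal the width of the original data frame D. The height (number of rows) of the argument to cbind should equal the height of the original data frame
(5) The option that is not a use of the sample function is (B). A. Random sampling with replacement B. Sorting C. Randomly grouping data D. Random sampling without replacement
(6) The incorrect statement about the subset function is (D). A. It can be used to select variables and observations B. x is the data frame to select from C. subset is the condition that picks the rows to view D. The region chosen by select may be larger than the data frame x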
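(7) When data are combined with the merge function, the default is to match on (A). A. Columns with the same column name B. Rows with the same row name C. The first column D. The first row
(8) When the melt function is applied to an n-dimensional array, the result has (B) columns. A. n B. n+1 C. n-1 D. n+2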
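(9) The metacharacter + means (B). A. The preceding character or expression is repeated zero or more times B. The preceding character or expression is repeated one or more times C. The preceding character or expression is repeated zero times or once D. The preceding character or expression is repeated zero times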
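(10) The incorrect statement about the paste function is (D). A. The sep argument is the separator and defaults to a space B. When collapse is not specified, the return value is a character vector formed by joining the arguments with the separator given by sep C. When collapse is given a value, the joined character vector is further concatenated into a single string, with the pieces separated by the value of collapse D. When collapse is set, the return value is a character vector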
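Questions (7) and (10) are easier to follow with a concrete run. The sketch below uses only base R and made-up values; the data frames left and right are illustrative and do not come from the book.

```r
# paste(): sep joins the arguments element-wise; collapse then joins the results
labels <- paste("ab", 1:3, sep = "_")                       # three strings
one_string <- paste("ab", 1:3, sep = "_", collapse = "; ")  # a single string

# merge(): with no `by` argument, the join uses the columns common to both inputs
left <- data.frame(id = c(1, 2, 3), math = c(70, 89, 80))
right <- data.frame(id = c(2, 3, 4), english = c(78, 65, 92))
joined <- merge(left, right)   # inner join on the shared column "id"
joined
```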
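2. Hands-on exercises
(1) Create a matrix and change its variable names with the interactive editor; create a data frame and add 3 new variables to it: the difference, the product, and the remainder of the original data.
a <- matrix(1:12, 3, 4)
fix(a)   # opens the interactive editor for renaming the columns
a <- c(1, 2, 3, 4)
b <- c(11, 22, 33, 44)
# d, e, f hold the difference, product, and remainder of a and b
x1 <- data.frame(a, b, d = a - b, e = a * b, f = a %% b)
x1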
(2) Build a data frame that contains missing values, check whether it has missing values, and delete the rows that contain them; create date values as strings and convert them to date variables with the as.Date, as.POSIXlt, and strptime functions; use the sort function to sort the Chinese column of score in decreasing order, with missing values placed last.
x2 <- data.frame(id = c(1, 2, 3, 4),
                 name = c("張三", "李四", "王五", "趙六"),
                 math = c(70, 89, NA, 80),
                 English = c(86, 78, 65, 92))
anyNA(x2)     # does the data frame contain any missing value?
na.omit(x2)   # drop the rows that contain missing values
dates <- c("10/27/2017", "02/25/2017", "01/14/2017", "07/18/2017", "04/01/2017")
(date <- as.Date(dates, format = "%m/%d/%Y"))
datas1 <- c("2017-09-08 11:17:52", "2017-08-07 20:33:02")
as.POSIXlt(datas1, tz = "", format = "%Y-%m-%d %H:%M:%S")
(datas2 <- strptime(datas1, "%Y-%m-%d %H:%M:%S"))
score <- data.frame(student = c("A", "B", "C", "D"),
                    gender = c("M", "M", "F", "F"),
                    math = c(90, 70, 80, 60),
                    Eng = c(88, 78, 69, 98),
                    p1 = c(66, 59, NA, 88))
names(score)[5] <- "Chinese"   # rename the fifth column
score
sort(score$Chinese, decreasing = TRUE, na.last = TRUE)
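sort() returns only the sorted values of one column. When the whole data frame should follow the ranking, order() is the usual companion. A minimal sketch, assuming a cut-down version of the score data frame above:

```r
score <- data.frame(
  student = c("A", "B", "C", "D"),
  Chinese = c(66, 59, NA, 88)
)

# order() returns row indices, so whole rows move together;
# na.last = TRUE keeps the missing score at the bottom
ranked <- score[order(score$Chinese, decreasing = TRUE, na.last = TRUE), ]
ranked
```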
(3) Build a data frame and select variables in two ways; use the sample function to carry out random sampling with and without replacement.
data <- data.frame(a = c(5.1, 4.9, 4.7),
                   b = c(3.5, 3.0, 3.2),
                   c = c(1.4, 1.3, 1.5),
                   d = rep(0.2, 3))
newdata <- data[, c(3:4)]                               # method 1: column indices
newdata1 <- subset(data, a == 4.9, select = c(b, d))    # method 2: subset()
a <- c(11, 22, 33, 44, 55, 66, 77, 88, 99)
sample(a, 5, replace = TRUE)    # sampling with replacement
sample(a, 5, replace = FALSE)   # sampling without replacement
(4) Use SQL statements to compute the following on the data frame stuscore from the text. ① Compute and rank each person's total score (fields to show: student ID, total score). ② Compute each person's highest single-subject score (fields to show: student ID, subject, highest score). ③ List the best student in each subject (fields to show: student ID, subject, score). ④ List the worst student in each subject (fields to show: student ID, subject, score).
name <- c(rep("張三", 3), rep("李四", 3))
subject <- c("Math", "Chinese", "English", "Math", "Chinese", "English")
score <- c(89, 80, 70, 90, 70, 80)
stuid <- c(1, 1, 1, 2, 2, 2)
stuscore <- data.frame(name, subject, score, stuid)
library(sqldf)
# ① total score per student, highest first
sqldf("select stuid, sum(score) as allscore from stuscore group by stuid order by allscore desc")
# ② highest single-subject score per student
sqldf("select stuid, subject, max(score) as maxscore from stuscore group by stuid")
# ③ best student in each subject
sqldf("select stuid, subject, max(score) as maxscore from stuscore group by subject order by stuid")
# ④ worst student in each subject
sqldf("select stuid, subject, min(score) as minscore from stuscore group by subject order by stuid")
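The same questions can also be answered without the sqldf dependency. The following sketch uses base R's aggregate() and ave() as a swapped-in technique, not the book's SQL approach; the data frame is rebuilt here with English subject labels so the block is self-contained.

```r
stuscore <- data.frame(
  stuid = c(1, 1, 1, 2, 2, 2),
  subject = rep(c("math", "chinese", "english"), times = 2),
  score = c(89, 80, 70, 90, 70, 80)
)

# ① Total score per student, highest first
total <- aggregate(score ~ stuid, data = stuscore, FUN = sum)
total <- total[order(total$score, decreasing = TRUE), ]

# ③ Best row per subject: keep rows whose score equals the subject maximum
best <- stuscore[stuscore$score == ave(stuscore$score, stuscore$subject, FUN = max), ]

# ④ Worst row per subject: same idea with the subject minimum
worst <- stuscore[stuscore$score == ave(stuscore$score, stuscore$subject, FUN = min), ]

total
best
worst
```

Unlike the SQL version, the ave() filter keeps every tied row, which may or may not be what a ranking needs.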
(5) Create a list and melt it with the melt function.
data <- list(a = c(11, 22, 33, 44),
             b = matrix(1:10, nrow = 2),
             c = "one,two,three",
             d = c(TRUE, FALSE))
data
library(reshape2)
melt(data, varnames = c("X", "Y"), value.name = "value", na.rm = FALSE)
(6) Build a character vector and replace strings in it with the sub and gsub functions; use the paste function to return a character vector and a single string.
data1 <- c("we", "are", "family", "you", "good")
sub("good", "bad", data1)       # replaces the first match in each element
gsub("we", "student", data1)    # replaces every match in each element
b <- paste("ab", 1:3, sep = "")                  # a character vector
x <- list(a = "st", b = "nd", c = "yw")
y <- list(d = 1, e = 2)
c <- paste(x, y, sep = "-", collapse = "; ")     # a single string
c
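Chapter 4 Functions and Control Flow
1. Multiple-choice questions
(1) The mathematical function that returns the smallest integer not less than x is (C). A. trunc B. floor C. ceiling D. mad
(2) The object that the apply function is not used on is (B). A. Matrix B. Vector C. Array D. Data frame
(3) In ifelse(condition, statement1, statement2), the statement executed when condition is TRUE is (A). A. statement1 B. statement2 C. statement3 D. statement4
(4) In switch(expression, list), when the list is named and the expression equals one of the names, the result returned is (B). A. The value at the corresponding position in the list B. The value associated with that name C. NULL D. The list name
(5) The option that is not a conditional branching statement is (B). A. The if-else statement B. The for loop C. The switch statement D. The ifelse statement
(6) The incorrect statement about the cat(expr1, expr2, …) function is (A). A. expr1 and expr2 are the content to output and must be strings B. If expr1 is "name", the string "name" is output C. If expr1 is the variable name, the value of name is output D. The symbol "\n" means a line break; whatever follows "\n" is output on the next line
(7) The condition that stops the loop while (i <= 10) {expr} is (B). A. i == 10 B. i == 11 C. i == 5 D. i == 1
(8) A user-defined function can be made available for calling through (A). A. the source function B. the var function C. the range function D. the signif function
(9) A function body does not include (C). A. Exception handling B. The return value C. The input values D. The computation
(10) The function that denotes the return value is (D). A. median B. dnorm C. source D. return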
2. Hands-on exercises
(1) Use functions from the apply family to compute the maximum, minimum, and median of each component of the list x <- list(a = 1:5, b = exp(0:3)).
x <- list(a = 1:5, b = exp(0:3))
lapply(x, max)
lapply(x, min)
lapply(x, median)
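The three lapply() calls can also be folded into one pass. The sketch below is an alternative, not the book's answer: it uses sapply(), which simplifies the per-component results into a matrix; the name stats is illustrative.

```r
x <- list(a = 1:5, b = exp(0:3))

# One call returns all three statistics; sapply() simplifies to a 3 x 2 matrix
stats <- sapply(x, function(v) c(max = max(v), min = min(v), median = median(v)))
stats
```

Each column of stats corresponds to one list component and each row to one statistic, so stats["median", "b"] picks out a single value.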
(2) Draw the standard normal curve on the interval [-5, 5] and find the area under the curve to the left of z.
x <- pretty(c(-5, 5), 50)
y <- dnorm(x)
plot(x, y, type = 'l', xlab = 'Normal Deviate', ylab = 'Density', yaxs = 'i')
pnorm(1.96)   # area to the left of z = 1.96
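pnorm() is the cumulative distribution function, so it returns the area directly. As a cross-check, the same area can be obtained by numerically integrating the density; a short sketch with illustrative variable names:

```r
# Area under the standard normal density to the left of z
z <- 1.96
area_cdf <- pnorm(z)

# The same area by numerical integration of dnorm from -Inf to z
area_int <- integrate(dnorm, lower = -Inf, upper = z)$value

c(pnorm = area_cdf, integrate = area_int)   # both are about 0.975
```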
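(3) Use conditional branching to divide scores into 5 grades: A (90 or above), B (80 or above), C (70 or above), D (60 or above), and E (below 60). For example, grade a score of 87.
x <- 87   # the example score from the task; use sample(0:100, 1) for a random score
if (x >= 90) {
  grade <- 'Grade: A'
} else if (x >= 80) {
  grade <- 'Grade: B'
} else if (x >= 70) {
  grade <- 'Grade: C'
} else if (x >= 60) {
  grade <- 'Grade: D'
} else {
  grade <- 'Grade: E'
}
grade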
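The if/else chain handles one score at a time. For a whole vector of scores, cut() performs the same binning in a single call. A minimal sketch, where grade_of is a hypothetical helper name and not part of the book's answer:

```r
# Map numeric scores to letter grades in one vectorised call
grade_of <- function(score) {
  as.character(cut(score,
                   breaks = c(-Inf, 60, 70, 80, 90, Inf),
                   labels = c("E", "D", "C", "B", "A"),
                   right = FALSE))   # intervals are closed on the left: [lower, upper)
}

grade_of(c(87, 90, 59.5, 60))   # "B" "A" "E" "D"
```

Setting right = FALSE matters here: it makes a score of exactly 90 fall into the A bin rather than the B bin.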
(4) Determine how many prime numbers lie between 101 and 200, and print all of them.
x <- 101:200
y <- matrix(nrow = 200, ncol = 100)   # y[j, i] holds x[i] %% j
count <- 0
c <- NULL
for (i in 1:100) {
  for (j in 1:200) {
    y[j, i] <- x[i] %% j
  }
  # a prime has exactly two divisors in 1:200, namely 1 and itself
  if (length(which(y[, i] == 0)) <= 2) {
    count <- count + 1
    c[count] <- i
  }
}
count
x[c]
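The answer above tests every divisor from 1 to 200. A common shortcut is to try divisors only up to the square root of the candidate. The sketch below shows that variant; is_prime is a hypothetical helper name, not from the book.

```r
# TRUE when n has no divisor between 2 and floor(sqrt(n))
is_prime <- function(n) {
  if (n < 2) return(FALSE)
  if (n < 4) return(TRUE)            # 2 and 3
  all(n %% 2:floor(sqrt(n)) != 0)
}

candidates <- 101:200
primes <- candidates[sapply(candidates, is_prime)]
length(primes)   # 21
primes
```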
(5) Write a user-defined function that computes the product of two matrices and finds the largest element of the product matrix.
POM <- function(x, y) {
  # helper that prints the largest element of a matrix
  maxer <- function(x.) {
    print(max(x.))
  }
  m1 <- ncol(x)
  n <- nrow(y)
  # the column count of x must equal the row count of y
  if (m1 != n) {
    print('error dimension is not suitable')
    return(0)
  }
  m <- nrow(x)
  n1 <- ncol(y)
  s <- matrix(0, m, n1)
  for (i in 1:m)
    for (j in 1:n1)
      s[i, j] <- sum(x[i, ] * y[, j])   # row i of x times column j of y
  maxer(s)
  return(s)
}
x <- matrix(c(1:6), 2, 3, byrow = TRUE)
y <- matrix(c(1:6), 3, 2, byrow = FALSE)
POM(x, y)
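The hand-written loops can be checked against R's built-in matrix product operator %*%. A short sketch using the same two matrices:

```r
x <- matrix(1:6, 2, 3, byrow = TRUE)
y <- matrix(1:6, 3, 2, byrow = FALSE)

product <- x %*% y     # built-in matrix multiplication
product                # rows: 14 32 and 32 77
max(product)           # 77
```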
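Chapter 5 Basic Plotting
1. Multiple-choice questions
(1) The function that can be used to draw a scatter plot is (A). A. the plot function B. the barplot function C. the boxplot function D. the hist function
(2) The plot that cannot be used to analyse the distribution of data is (C). A. Scatter plot B. Histogram C. Multivariate correlation matrix plot D. Box plot
(3) When the data have many dimensions, the plot to consider for comparing pairwise relationships between variables is (D). A. Scatter plot B. Box plot C. Pie chart D. Multivariate correlation matrix plot
(4) To show the share of each category in the data, the plot to consider is (C). A. Scatter plot B. Box plot C. Pie chart D. Multivariate correlation matrix plot
(5) The incorrect pairing of plot and R function is (B). A. Scatter plot: the plot function B. Box plot: the barplot function C. QQ plot: the qqplot function D. Scatter plot matrix: the pairs function
(6) The option that is not a built-in R function for modifying colours is (D). A. the rgb function B. the colors function C. the palette function D. the brewer.pal function
(7) The parameter that changes the point style is (B). A. cex B. pch C. lwd D. font
(8) The function that can add text at any position in a plot is (B). A. the title function B. the text function C. the mtext function D. the main function
(9) To add a fitted straight line to an existing plot, the function to consider is (B). A. the line function B. the lines function C. the abline function D. the ablines function
(10) The incorrect pairing of graphical parameter and description is (D). A. axes: whether to show the axes B. xlim: the range of the x axis C. col: the colour setting D. font: whether to show the title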
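2. Hands-on exercises
(1) Table 5-23 shows a bank's loan default data, bankloan. Table 5-23 Bank loan default data. Link: bankloan.csv ① Compare the distribution of people with and without a default record across different characteristics. ② Explore the distributions of income and debt for people with different characteristics. ③ Explore the relationship between income and debt for people with different characteristics.
bankloan <- read.csv("../bankloan.csv")
summary(bankloan)
# Bin age and seniority into groups, and treat education as a factor
bankloan$age_group <- cut(bankloan$age, breaks = seq(20, 60, by = 10), include.lowest = TRUE)
bankloan$seniority_group <- cut(bankloan$seniority, breaks = c(0, 1, 3, 5, 10, 15, 20, 30, 40), include.lowest = TRUE)
bankloan$education <- factor(bankloan$education)
# Debt amount = debt ratio (in percent) / 100 * income
bankloan$debt <- bankloan$debt_rate / 100 * bankloan$income
attach(bankloan)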
pal <- RColorBrewer::brewer.pal(8, "Set1")
# ① Default counts by education, age group, and seniority group
de_e <- ftable(education, default)
barplot(de_e, col = pal[1:5], beside = TRUE, xlab = "default")
legend("topright", levels(education), pch = 15, col = pal, bty = "n")
text(11, 270, "education:")
de_a <- ftable(age_group, default)
barplot(de_a, col = pal[1:4], beside = TRUE, xlab = "default")
legend("topright", levels(age_group), pch = 15, col = pal, bty = "n")
text(8.8, 204, "age:")
de_s <- ftable(seniority_group, default)
barplot(de_s, col = pal[1:8], beside = TRUE, xlab = "default")
legend("topright", levels(seniority_group), pch = 15, col = pal, bty = "n")
text(15.3, 140, "seniority:")
# The same counts as dot charts
dotchart(de_e, bg = pal[1:5], labels = levels(education))
dotchart(de_a, bg = pal[1:4], labels = levels(age_group))
dotchart(de_s, bg = pal[1:8], labels = levels(seniority_group))
# ② Histograms of income and debt, with the kernel density and a normal reference curve
set.seed(1234)
norm_income <- rnorm(1000, mean(income), sd(income))
hist(income, freq = FALSE, breaks = 50)
lines(density(income), col = 2)
lines(density(norm_income), lty = 2, col = 3)
legend("topright", c("density", "normal"), lty = 1:2, col = 2:3, bty = "n")
norm_debt <- rnorm(1000, mean(debt), sd(debt))
hist(debt, freq = FALSE, breaks = 50)
lines(density(debt), col = 2)
lines(density(norm_debt), lty = 2, col = 3)
legend("topright", c("density", "normal"), lty = 1:2, col = 2:3, bty = "n")
# Density curves of income and debt compared across groups
library(sm)
sm.density.compare(income, factor(education))
legend("topright", levels(education), lty = 1:5, col = 2:6, bty = "n")
text(450, 0.015, "education:")
sm.density.compare(income, factor(age_group))
legend("topright", levels(age_group), lty = 1:4, col = 2:5, bty = "n")
text(430, 0.026, "age:")
sm.density.compare(income, factor(seniority_group))
legend("topright", levels(seniority_group), lty = 1:8, col = 2:9, bty = "n")
text(425, 0.03, "seniority:")
sm.density.compare(debt, factor(education))
legend("topright", levels(education), lty = 1:5, col = 2:6, bty = "n")
text(40, 0.125, "education:")
sm.density.compare(debt, factor(age_group))
legend("topright", levels(age_group), lty = 1:4, col = 2:5, bty = "n")
text(38, 0.16, "age:")
sm.density.compare(debt, factor(seniority_group))
legend("topright", levels(seniority_group), lty = 1:8, col = 2:9, bty = "n")
text(37, 0.16, "seniority:")
# Box plots of income and debt by group
boxplot(income ~ education, horizontal = TRUE)
boxplot(income ~ age_group, horizontal = TRUE)
boxplot(income ~ seniority_group, horizontal = TRUE)
boxplot(debt ~ education, horizontal = TRUE)
boxplot(debt ~ age_group, horizontal = TRUE)
boxplot(debt ~ seniority_group, horizontal = TRUE)
# Violin plots of income: tapply() with the identity function splits income by group
income_e <- tapply(income, education, function(t) t)
income_a <- tapply(income, age_group, function(t) t)
income_s <- tapply(income, seniority_group, function(t) t)
library(vioplot)
vioplot(income_e$`1`, income_e$`2`, income_e$`3`, income_e$`4`, income_e$`5`,
        names = levels(education), border = "black", col = "light green", rectCol = "blue")
vioplot(income_a$`[20,30]`, income_a$`(30,40]`, income_a$`(40,50]`, income_a$`(50,60]`,
        names = levels(age_group), border = "black", col = "light green", rectCol = "blue")
vioplot(income_s$`[0,1]`, income_s$`(1,3]`, income_s$`(3,5]`, income_s$`(5,10]`,
        income_s$`(10,15]`, income_s$`(15,20]`, income_s$`(20,30]`, income_s$`(30,40]`,
        names = levels(seniority_group), border = "black", col = "light green", rectCol = "blue")
# Violin plots of debt by group
debt_e <- tapply(debt, education, function(t) t)
debt_a <- tapply(debt, age_group, function(t) t)
debt_s <- tapply(debt, seniority_group, function(t) t)
vioplot(debt_e$`1`, debt_e$`2`, debt_e$`3`, debt_e$`4`, debt_e$`5`,
        names = levels(education), border = "black", col = "light green", rectCol = "blue")
vioplot(debt_a$`[20,30]`, debt_a$`(30,40]`, debt_a$`(40,50]`, debt_a$`(50,60]`,
        names = levels(age_group), border = "black", col = "light green", rectCol = "blue")
vioplot(debt_s$`[0,1]`, debt_s$`(1,3]`, debt_s$`(3,5]`, debt_s$`(5,10]`,
        debt_s$`(10,15]`, debt_s$`(15,20]`, debt_s$`(20,30]`, debt_s$`(30,40]`,
        names = levels(seniority_group), border = "black", col = "light green", rectCol = "blue")
# ③ Income against debt within each group, with a fitted regression line
le <- levels(education)
op <- par(mfrow = c(1, 5))
for (i in 1:nlevels(education)) {
  plot(income[education == le[i]], debt[education == le[i]],
       main = paste("education = ", le[i]), xlab = "income", ylab = "debt")
  abline(lm(debt[education == le[i]] ~ income[education == le[i]]), col = "red")
}
par(op)
la <- levels(age_group)
op <- par(mfrow = c(1, 4))
for (i in 1:nlevels(age_group)) {
  plot(income[age_group == la[i]], debt[age_group == la[i]],
       main = paste("age_group = ", la[i]), xlab = "income", ylab = "debt")
  abline(lm(debt[age_group == la[i]] ~ income[age_group == la[i]]), col = "red")
}
par(op)
ls <- levels(seniority_group)
op <- par(mfrow = c(2, 4))   # 2 x 4 grid so that all eight seniority groups fit on one page
for (i in 1:nlevels(seniority_group)) {
  plot(income[seniority_group == ls[i]], debt[seniority_group == ls[i]],
       main = paste("seniority_group = ", ls[i]), xlab = "income", ylab = "debt")
  abline(lm(debt[seniority_group == ls[i]] ~ income[seniority_group == ls[i]]), col = "red")
}
par(op)
(2) Using the VADeaths data set, draw pie charts of deaths among urban and rural residents, add a title and a legend, and analyse the charts.
my_sum <- colSums(VADeaths)
# Add up the columns whose names contain 'Rural' and 'Urban' respectively
Rural <- sum(my_sum[grep('Rural', names(my_sum))])
Urban <- sum(my_sum[grep('Urban', names(my_sum))])
pie(c(Rural, Urban), labels = c('Rural', 'Urban'), col = c('blue', 'green'))
title('Deaths among urban and rural residents')
legend('topleft', legend = c('Rural', 'Urban'), fill = c('blue', 'green'))
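The answer above draws one pie that compares the rural and urban totals. If the task is read as asking for one pie per population, a possible sketch is the following; it is an alternative reading and not the book's answer, and the variable names and titles are illustrative.

```r
# VADeaths is a 5 x 4 matrix: age groups in rows, population groups in columns
rural <- VADeaths[, "Rural Male"] + VADeaths[, "Rural Female"]
urban <- VADeaths[, "Urban Male"] + VADeaths[, "Urban Female"]

op <- par(mfrow = c(1, 2))   # two panels side by side
pie(rural, col = rainbow(5), main = "Rural death rates by age group")
pie(urban, col = rainbow(5), main = "Urban death rates by age group")
par(op)                      # restore the previous layout
```

The slice labels come from the row names of VADeaths, so each pie shows how the death rates are spread across the five age groups.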
(3) Draw the pairwise scatter plots of the four attributes of the iris data set on one canvas; the result is shown in Figure 5-36. Figure 5-36 Pairwise scatter plots of the attributes
# Saves the pie chart from exercise (2) as VADeaths.png in the working directory
path <- paste0('VADeaths', '.png')
png(filename = path)
pie(c(Rural, Urban), labels = c('Rural', 'Urban'), col = c('blue', 'green'))
title('Deaths among urban and rural residents')
legend('topleft', legend = c('Rural', 'Urban'), fill = c('blue', 'green'))
dev.off()
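For the pairwise scatter plots themselves, a minimal on-screen sketch could enumerate the six column pairs with combn(). The names vars and pairs_idx are illustrative; this is a sketch under those assumptions, not the book's answer.

```r
vars <- colnames(iris)[1:4]
pairs_idx <- combn(4, 2)             # the 6 unordered pairs of columns

op <- par(mfrow = c(2, 3))           # a 2 x 3 grid holds all six plots
for (k in seq_len(ncol(pairs_idx))) {
  i <- pairs_idx[1, k]
  j <- pairs_idx[2, k]
  plot(iris[, i], iris[, j], xlab = vars[i], ylab = vars[j])
}
par(op)                              # restore the previous layout
```

Base R's pairs(iris[, 1:4]) gives a similar overview in one call, drawn as a full 4 x 4 scatter plot matrix.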
(4) Save the result of exercise (3) as a PNG file in the current working directory.
path <- paste0('ex5_4', '.png')   # paste0() avoids the space that paste() would insert
png(filename = path)
par(mfrow = c(2, 3))
name <- colnames(iris)[1:4]
# one scatter plot for each of the six attribute pairs
for (i in 1:3) {
  for (j in (i + 1):4) {
    plot(iris[, i], iris[, j], xlab = name[i], ylab = name[j])
  }
}
dev.off()
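Chapter 6 Advanced Plotting
1. Multiple-choice questions
(1) The option that cannot be used as the formula input of a lattice plotting function is (B). A. x~y B. ~y C. x~y|A D. x~y|A-B
(2) The plotting function that does not belong to the lattice package is (C). A. xyplot B. qq C. qqplot D. qqmath
(3) In the lattice package, plots can be combined with (C). A. the par function B. the layout function C. the split argument D. the newpage argument
(4) The incorrect pairing of plotting function and plot is (A). A. histogram: scatter plot B. barchart: bar chart C. bwplot: box plot D. splom: scatter plot matrix
(5) The conditioning variable of a lattice plotting function cannot be (A). A. A continuous variable B. A discrete variable C. Factor data D. Character data
(6) A layer does not contain (D). A. data B. aes C. mapping D. geom
(7) The option that does not describe a coordinate-system transformation is (C). A. Pie chart = stacked bar chart + polar coordinates B. Bullseye chart = pie chart + polar coordinates C. Coxcomb plot = pie chart + polar coordinates D. Coxcomb plot = bar chart + polar coordinates
(8) The ggplot2 function that implements faceting is (D). A. the par function B. the layout function C. the split argument D. the facet_grid function
(9) The option that does not correctly pair a plotting function with its plot is (B). A. geom_abline: line B. geom_histogram: bar chart C. geom_boxplot: box plot D. geom_point: point
(10) The option that is not a graphical aesthetic is (D). A. alpha B. color C. linetype D. ncol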
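2. Hands-on exercises
(1) Table 6-7 shows a bank's loan default data, bankloan. Use the lattice package to draw the following plots. Link: bankloan.csv ① Draw histograms and density curves of income and debt for customers of different ages, education levels, and seniority. ② Draw scatter plots of income against debt for customers of different ages, education levels, and seniority, and add regression lines. ③ Draw bar charts of default status for customers of different ages, education levels, and seniority. ④ Draw scatter plots of income and debt against default status, and add logistic regression curves. Table 6-7 Bank loan default data
bankloan <- read.csv("./第6章 高级绘图/02-习题程序/code/data/bankloan.csv")
summary(bankloan)
# Bin age and seniority into groups, and treat education as a factor
bankloan$age_group <- cut(bankloan$age, breaks = seq(20, 60, by = 10), include.lowest = TRUE)
bankloan$seniority_group <- cut(bankloan$seniority, breaks = c(0, 1, 3, 5, 10, 15, 20, 30, 40), include.lowest = TRUE)
bankloan$education <- factor(bankloan$education)
# Debt amount = debt ratio (in percent) / 100 * income
bankloan$debt <- bankloan$debt_rate / 100 * bankloan$income
library(lattice)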
# ① Income: histogram per group (left half of the page) and grouped density curves (right half)
histogram_ie <- histogram(~ income | education, data = bankloan, layout = c(1, 5), nint = 30, type = "count")
densityplot_ie <- densityplot(~ income, groups = education, data = bankloan, plot.points = FALSE,
                              lty = 1:5, col = 1:5,
                              key = list(title = "education", text = list(levels(bankloan$education)),
                                         columns = 2, lines = list(lty = 1:5, col = 1:5)))
plot(histogram_ie, position = c(0, 0, 0.5, 1))
plot(densityplot_ie, position = c(0.5, 0, 1, 1), newpage = FALSE)
histogram_ia <- histogram(~ income | age_group, data = bankloan, layout = c(1, 4), nint = 30, type = "count")
densityplot_ia <- densityplot(~ income, groups = age_group, data = bankloan, plot.points = FALSE,
                              lty = 1:4, col = 1:4,
                              key = list(title = "age", text = list(levels(bankloan$age_group)),
                                         columns = 2, lines = list(lty = 1:4, col = 1:4)))
plot(histogram_ia, position = c(0, 0, 0.5, 1))
plot(densityplot_ia, position = c(0.5, 0, 1, 1), newpage = FALSE)
histogram_is <- histogram(~ income | seniority_group, data = bankloan, layout = c(1, 8), nint = 30, type = "count")
densityplot_is <- densityplot(~ income, groups = seniority_group, data = bankloan, plot.points = FALSE,
                              lty = 1:8, col = 1:8,
                              key = list(title = "seniority", text = list(levels(bankloan$seniority_group)),
                                         columns = 2, lines = list(lty = 1:8, col = 1:8)))
plot(histogram_is, position = c(0, 0, 0.5, 1))
plot(densityplot_is, position = c(0.5, 0, 1, 1), newpage = FALSE)
# Debt ratio (debt_rate): the same pairs of plots
histogram_de <- histogram(~ debt_rate | education, data = bankloan, layout = c(1, 5), nint = 30, type = "count")
densityplot_de <- densityplot(~ debt_rate, groups = education, data = bankloan, plot.points = FALSE,
                              lty = 1:5, col = 1:5,
                              key = list(title = "education", text = list(levels(bankloan$education)),
                                         columns = 2, lines = list(lty = 1:5, col = 1:5)))
plot(histogram_de, position = c(0, 0, 0.5, 1))
plot(densityplot_de, position = c(0.5, 0, 1, 1), newpage = FALSE)
histogram_da <- histogram(~ debt_rate | age_group, data = bankloan, layout = c(1, 4), nint = 30, type = "count")
densityplot_da <- densityplot(~ debt_rate, groups = age_group, data = bankloan, plot.points = FALSE,
                              lty = 1:4, col = 1:4,
                              key = list(title = "age", text = list(levels(bankloan$age_group)),
                                         columns = 2, lines = list(lty = 1:4, col = 1:4)))
plot(histogram_da, position = c(0, 0, 0.5, 1))
plot(densityplot_da, position = c(0.5, 0, 1, 1), newpage = FALSE)
histogram_ds <- histogram(~ debt_rate | seniority_group, data = bankloan, layout = c(1, 8), nint = 30, type = "count")
densityplot_ds <- densityplot(~ debt_rate, groups = seniority_group, data = bankloan, plot.points = FALSE,
                              lty = 1:8, col = 1:8,
                              key = list(title = "seniority", text = list(levels(bankloan$seniority_group)),
                                         columns = 2, lines = list(lty = 1:8, col = 1:8)))
plot(histogram_ds, position = c(0, 0, 0.5, 1))
plot(densityplot_ds, position = c(0.5, 0, 1, 1), newpage = FALSE)
# ② Debt against income within each group, with a regression line per panel
id_e <- xyplot(debt ~ income | education, data = bankloan, layout = c(1, 5),
               panel = function(...) {
                 panel.lmline(...)
                 panel.xyplot(...)
               })
id_a <- xyplot(debt ~ income | age_group, data = bankloan, layout = c(1, 4),
               panel = function(...) {
                 panel.lmline(...)
                 panel.xyplot(...)
               })
id_s <- xyplot(debt ~ income | seniority_group, data = bankloan, layout = c(1, 8),
               panel = function(...) {
                 panel.lmline(...)
                 panel.xyplot(...)
               })
# split = c(column, row, number of columns, number of rows) places the three plots side by side
plot(id_e, split = c(1, 1, 3, 1))
plot(id_a, split = c(2, 1, 3, 1), newpage = FALSE)
plot(id_s, split = c(3, 1, 3, 1), newpage = FALSE)
# ④ Logistic regression of default on income and on debt ratio; keep the coefficients
glmdei <- as.vector(glm(default ~ income, data = bankloan, family = binomial(link = "logit"))$coefficients)
glmded <- as.vector(glm(default ~ debt_rate, data = bankloan, family = binomial(link = "logit"))$coefficients)
# Panel functions: draw the fitted logistic curve, then the points
paneli <- function(x, y) {
  panel.curve(exp(glmdei[1] + glmdei[2] * x) / (1 + exp(glmdei[1] + glmdei[2] * x)),
              col = "red", lwd = 1, lty = 2)
  panel.xyplot(x, y)
}
paneld <- function(x, y) {
  panel.curve(exp(glmded[1] + glmded[2] * x) / (1 + exp(glmded[1] + glmded[2] * x)),
              col = "red", lwd = 1, lty = 2)
  panel.xyplot(x, y)
}
glmla_dei <- xyplot(default ~ income, data = bankloan, panel = paneli)
# the second plot uses debt_rate on the x axis to match the model behind paneld
glmla_ded <- xyplot(default ~ debt_rate, data = bankloan, panel = paneld)
plot(glmla_dei, split = c(1, 1, 2, 1))
plot(glmla_ded, split = c(2, 1, 2, 1), newpage = FALSE)
# ③ Default counts cross-tabulated by education, seniority group, and age group
tbankloan <- table(bankloan$education, bankloan$seniority_group, bankloan$age_group, bankloan$default)
barchart(tbankloan, auto.key = list(title = "default", columns = 2))
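The panel functions above compute the logistic curve by hand as exp(a + b*x) / (1 + exp(a + b*x)). Base R's plogis() and predict() give the same curve. The sketch below checks this on synthetic data generated only for illustration (it is not the bankloan file); all variable names are assumptions.

```r
# Synthetic data for illustration only
set.seed(1)
income <- runif(200, 10, 200)
default <- rbinom(200, 1, plogis(-2 + 0.01 * income))

fit <- glm(default ~ income, family = binomial(link = "logit"))
b <- as.vector(coef(fit))

grid_x <- seq(10, 200, length.out = 50)
manual <- exp(b[1] + b[2] * grid_x) / (1 + exp(b[1] + b[2] * grid_x))
builtin <- plogis(b[1] + b[2] * grid_x)   # same curve from the logistic CDF
predicted <- predict(fit, data.frame(income = grid_x), type = "response")

all.equal(manual, builtin)
```

Using plogis() or predict() avoids repeating the formula and stays finite for large linear predictors, where exp() alone can overflow.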
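(2) For the bankloan table from exercise (1), use the ggplot2 package to draw the following plots. ① Draw histograms and density curves of income and debt for customers of different ages, education levels, and seniority. ② Draw scatter plots of income against debt for customers of different ages, education levels, and seniority, and add regression lines. ③ Draw bar charts of default status for customers of different ages, education levels, and seniority. ④ Draw scatter plots of income and debt against default status, and add logistic regression curves.
bankloan <- read.csv("./第6章 高级绘图/02-习题程序/code/data/bankloan.csv")
summary(bankloan)
# Bin age and seniority into groups, and treat education as a factor
bankloan$age_group <- cut(bankloan$age, breaks = seq(20, 60, by = 10), include.lowest = TRUE)
bankloan$seniority_group <- cut(bankloan$seniority, breaks = c(0, 1, 3, 5, 10, 15, 20, 30, 40), include.lowest = TRUE)
bankloan$education <- factor(bankloan$education)
# Debt amount = debt ratio (in percent) / 100 * income
bankloan$debt <- bankloan$debt_rate / 100 * bankloan$income
library(ggplot2)
library(grid)
# Helper: viewport for row x, column y of the current grid layout
page <- function(x, y) viewport(layout.pos.row = x, layout.pos.col = y)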
# ① Income: faceted histogram (left) and grouped density curves (right)
hist_ie <- ggplot(data = bankloan, aes(x = income)) +
  geom_histogram(bins = 30, fill = "#0080ff") +
  facet_grid(education ~ .)
density_ie <- ggplot(data = bankloan, aes(x = income, colour = education)) +
  geom_density() +
  theme(legend.position = "top")
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
print(hist_ie, vp = page(1, 1))
print(density_ie, vp = page(1, 2))
hist_ia <- ggplot(data = bankloan, aes(x = income)) +
  geom_histogram(bins = 30, fill = "#0080ff") +
  facet_grid(age_group ~ .)
density_ia <- ggplot(data = bankloan, aes(x = income, colour = age_group)) +
  geom_density() +
  theme(legend.position = "top")
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
print(hist_ia, vp = page(1, 1))
print(density_ia, vp = page(1, 2))
hist_is <- ggplot(data = bankloan, aes(x = income)) +
  geom_histogram(bins = 30, fill = "#0080ff") +
  facet_grid(seniority_group ~ .)
density_is <- ggplot(data = bankloan, aes(x = income, colour = seniority_group)) +
  geom_density() +
  theme(legend.position = "top")
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
print(hist_is, vp = page(1, 1))
print(density_is, vp = page(1, 2))
# Debt ratio (debt_rate): the same pairs of plots
hist_de <- ggplot(data = bankloan, aes(x = debt_rate)) +
  geom_histogram(bins = 30, fill = "#0080ff") +
  facet_grid(education ~ .)
density_de <- ggplot(data = bankloan, aes(x = debt_rate, colour = education)) +
  geom_density() +
  theme(legend.position = "top")
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
print(hist_de, vp = page(1, 1))
print(density_de, vp = page(1, 2))
hist_da <- ggplot(data = bankloan, aes(x = debt_rate)) +
  geom_histogram(bins = 30, fill = "#0080ff") +
  facet_grid(age_group ~ .)
density_da <- ggplot(data = bankloan, aes(x = debt_rate, colour = age_group)) +
  geom_density() +
  theme(legend.position = "top")
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
print(hist_da, vp = page(1, 1))
print(density_da, vp = page(1, 2))
hist_ds <- ggplot(data = bankloan, aes(x = debt_rate)) +
  geom_histogram(bins = 30, fill = "#0080ff") +
  facet_grid(seniority_group ~ .)
density_ds <- ggplot(data = bankloan, aes(x = debt_rate, colour = seniority_group)) +
  geom_density() +
  theme(legend.position = "top")
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
print(hist_ds, vp = page(1, 1))
print(density_ds, vp = page(1, 2))
# ② Debt against income within each group, with a linear regression line
ggid_e <- ggplot(data = bankloan, aes(x = income, y = debt)) +
  geom_point(colour = "#0080ff") +
  stat_smooth(method = lm, lwd = 0.5, se = FALSE) +
  facet_grid(education ~ .)
ggid_a <- ggplot(data = bankloan, aes(x = income, y = debt)) +
  geom_point(colour = "#0080ff") +
  stat_smooth(method = lm, lwd = 0.5, se = FALSE) +
  facet_grid(age_group ~ .)
ggid_s <- ggplot(data = bankloan, aes(x = income, y = debt)) +
  geom_point(colour = "#0080ff") +
  stat_smooth(method = lm, lwd = 0.5, se = FALSE) +
  facet_grid(seniority_group ~ .)
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 3)))
print(ggid_e, vp = page(1, 1))
print(ggid_a, vp = page(1, 2))
print(ggid_s, vp = page(1, 3))
# ④ Default against income and against debt ratio, with a fitted logistic curve
glmgg_dei <- ggplot(data = bankloan, aes(x = income, y = default)) +
  geom_point() +
  stat_smooth(method = glm, method.args = list(family = "binomial"), se = FALSE)
glmgg_ded <- ggplot(data = bankloan, aes(x = debt_rate, y = default)) +
  geom_point() +
  stat_smooth(method = glm, method.args = list(family = "binomial"), se = FALSE)
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
print(glmgg_dei, vp = page(1, 1))
print(glmgg_ded, vp = page(1, 2))
# ③ Stacked bars of default status by education, faceted by age group and seniority group
ggplot(bankloan, aes(x = education, fill = factor(default))) +
  geom_bar(position = "stack") +
  facet_grid(age_group ~ seniority_group)
(3) Building on exercises (1) and (2), use the plots drawn there to create the scripts ui.R and server.R, and use the shiny package to build a demo data visualisation platform.
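Chapter 7 Rattle, a Visual Data Mining Tool
1. Multiple-choice questions
(1) The option that is not a feature of the Rattle tool is (C). A. Correlation analysis B. Periodicity analysis C. Data integration D. Data transformation
(2) The Rattle tool cannot import (D). A. Data from R packages B. Data from the R workspace C. Data from R files D. R-named data
(3) To get an overview of the data as a whole, the feature to use is (A). A. Summary B. Distribution C. Correlation D. Principal Components
(4) The GGobi package behind Rattle's interactive plots does not support the combined use of (D). A. Scatter plots B. Scatter plot matrices C. Three-dimensional plots D. Star plots
(5) In data modelling, the result that cluster analysis cannot produce is (C). A. The clustering result B. The cluster distribution plot C. An evaluation of the clustering D. The projection plot of the clustered samples
(6) The setting that the Apriori association-rule algorithm does not set by default is (A). A. Basket B. Support C. Confidence D. Min Length
(7) In the Model options, classification algorithms require a value for Min Bucket; to split the data in the proportions 70, 15, and 15, the setting should be (B). A. 0.7/0.15/0.15 B. 70/15/15 C. 7/1.5/1.5 D. 70%/15%/15%
(8) The random forest model in the Rattle tool cannot produce (D). A. The ROC curve B. The Gini importance index C. The error rate D. A classification tree
(9) The option that is not a model evaluation method in the Rattle tool is (B). A. Confusion matrix B. Risk chart C. Sensitivity and specificity plot D. Cumulative gain chart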
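(10) Of the following data mining algorithms, the one that is not suited to the Rattle tool is (C). A. Classification and prediction B. Cluster analysis C. Intelligent recommendation D. Association rules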
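2. Hands-on exercises
Open the graphical interface of the Rattle tool, import the Telephone.csv data, and split the data into training, validation, and test sets in the ratio 70:15:15. Then explore the data with descriptive statistics and graphical exploration. Hint: select suitable variables for the model on the Data tab, choose a suitable classification model on the Model tab, and evaluate the model.
Resources
Baidu Netdisk: exercise answers, source code, and data. PS: permanently valid; the extraction code is filled in automatically. Extraction code: 1111
Reference
R语言编程基础 (R Language Programming Fundamentals), book page on the Posts & Telecom Press education community (人邮教育社区) → 49611-R语言编程基础-习题数据和答案.rar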