1. Using the grep and gsub() functions in R

Published: 2023/12/14

1. The grep function

1) Syntax

grep(pattern, x, ignore.case = FALSE, perl = FALSE, value = FALSE, fixed = FALSE, useBytes = FALSE, invert = FALSE)
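The arguments are:
(1) pattern: a character string containing the regular expression to search for. When fixed = TRUE, it is treated as a literal string instead.
(2) x: the character vector to search.
(3) ignore.case: whether to ignore case. FALSE (the default) makes matching case-sensitive; TRUE makes it case-insensitive.
(4) perl: logical; whether to use Perl-compatible regular expressions.
(5) value: logical. If FALSE, grep returns the indices of the matching elements; if TRUE, it returns the matching elements themselves.
(6) fixed: logical. If TRUE, pattern is matched as a literal string, and this setting overrides any conflicting arguments.
(7) useBytes: logical. If TRUE, matching is done byte by byte rather than character by character.
(8) invert: logical. If TRUE, grep returns the indices or values of the elements that do not match, i.e. an inverted search.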
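To make the arguments above more concrete, here is a minimal sketch. The vector x and its contents are made up for illustration; only base R is used.

```r
# Hypothetical input vector, chosen so that each argument changes the result
x <- c("Apple", "banana", "a.b", "axb")

pos        <- grep("a", x)                              # indices of elements containing "a"
vals       <- grep("a", x, value = TRUE)                # the matching elements themselves
any_case   <- grep("a", x, ignore.case = TRUE)          # "Apple" now matches as well
as_regex   <- grep("a.b", x, value = TRUE)              # "." matches any single character
as_literal <- grep("a.b", x, value = TRUE, fixed = TRUE) # "." is a plain dot
non_match  <- grep("a", x, invert = TRUE, value = TRUE) # elements without a lowercase "a"

print(pos)         # 2 3 4
print(as_regex)    # "a.b" "axb"
print(as_literal)  # "a.b"
print(non_match)   # "Apple"
```

Comparing as_regex with as_literal shows why fixed = TRUE is useful when the search string contains regex metacharacters.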

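2) Worked example

(1) From gene1 to gene40, extract the genes whose names contain a 3, the genes ending in 3, the genes not ending in 3, and the genes ending in 3 other than gene3.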

geen = paste0("gene", 1:40)
# or str_c("gene", 1:40); note that this needs library(stringr)

# 1. Genes containing a 3
geen[grep("3", geen)]
# or grep("3", geen, value = TRUE)
# [1] "gene3" "gene13" "gene23" "gene30" "gene31"
# [6] "gene32" "gene33" "gene34" "gene35" "gene36"
# [11] "gene37" "gene38" "gene39"

# 2. Genes ending in 3
geen[grep("3$", geen)]
# or grep("3$", geen, value = TRUE)
# [1] "gene3" "gene13" "gene23" "gene33"

# 3. Genes not ending in 3
geen[-grep("3$", geen)]
# or grep("3$", geen, invert = TRUE, value = TRUE)
# [1] "gene1" "gene2" "gene4" "gene5" "gene6"
# [6] "gene7" "gene8" "gene9" "gene10" "gene11"
# [11] "gene12" "gene14" "gene15" "gene16" "gene17"
# [16] "gene18" "gene19" "gene20" "gene21" "gene22"
# [21] "gene24" "gene25" "gene26" "gene27" "gene28"
# [26] "gene29" "gene30" "gene31" "gene32" "gene34"
# [31] "gene35" "gene36" "gene37" "gene38" "gene39"
# [36] "gene40"

# 4. Genes ending in 3, excluding gene3
grep("[0-9]3$", geen, value = TRUE)
# or setdiff(grep("3$", geen, value = TRUE), "gene3")
# [1] "gene13" "gene23" "gene33"
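One caveat about negative indexing with -grep(...): it only behaves as expected when the pattern matches at least one element. The sketch below uses a made-up five-element vector (genes is a hypothetical name) to show what happens when nothing matches, and two alternatives that are safe in that case.

```r
# Hypothetical vector in which no element ends in 9
genes <- paste0("gene", 1:5)

hits <- grep("9$", genes)   # integer(0): no matches
bad  <- genes[-hits]        # -integer(0) is still integer(0), so nothing is selected

# Safer ways to drop matching elements
ok_invert <- grep("9$", genes, invert = TRUE, value = TRUE)
ok_grepl  <- genes[!grepl("9$", genes)]

print(length(bad))   # 0, although all five genes should have been kept
print(ok_invert)     # all five genes
print(ok_grepl)      # all five genes
```

For that reason invert = TRUE or !grepl(...) is usually the more robust choice in scripts where the pattern may not match anything.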

3) Difference between grep and grepl

1. Syntax
grep(pattern, x, ignore.case = FALSE, perl = FALSE, value = FALSE, fixed = FALSE, useBytes = FALSE, invert = FALSE)
grepl(pattern, x, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)

2. Return value
grep: searches the vector x for elements that match pattern and returns their indices in x (or the elements themselves when value = TRUE).
grepl: returns a logical vector (TRUE/FALSE) of the same length as x, indicating whether each element matches pattern.
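A short sketch of the two return types side by side, using a made-up three-element vector:

```r
# Hypothetical input
x <- c("gene3", "gene10", "gene13")

idx  <- grep("3$", x)    # integer indices of the matching elements
flag <- grepl("3$", x)   # one TRUE/FALSE per element of x

print(idx)    # 1 3
print(flag)   # TRUE FALSE TRUE

# Both can be used for subsetting, and which() converts one form into the other
same_subset  <- identical(x[idx], x[flag])
same_indices <- identical(which(flag), idx)
```

Because grepl always returns one value per element, it combines well with logical operators such as ! and &, or with a filter condition on a data frame column.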

2. The gsub() function

gsub() can delete, insert, replace, and extract parts of a string. It works on a single string as well as on a character vector.

1. Usage: gsub("pattern", "replacement", x)

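text1 <- "ABcdEfgh . ljkl MNNM"

gsub("Efg", "RRR", text1)   # replace Efg with RRR; matching is case-sensitive
# [1] "ABcdRRRh . ljkl MNNM"

# Any character, including spaces, tabs, and newlines, can be matched
gsub(" l", "q", text1)   # the space is part of the pattern
# [1] "ABcdEfgh .qjkl MNNM"

# Every occurrence of the pattern is replaced
gsub("M", "O", text1)
# [1] "ABcdEfgh . ljkl ONNO"

# gsub also supports capture groups and anchors
gsub("^.*l(j).*$", "\\1", text1)   # keep only the captured j
# [1] "j"

gsub("^.* ", "a", text1)   # replace everything from the start up to the last space with a (note the trailing space in "^.* ")
# [1] "aMNNM"

gsub(" .*", "a", text1)   # replace everything from the first space to the end with a
# [1] "ABcdEfgha"

# The period . is special in a pattern and must be escaped as \\. to match literally;
# "\\+" in the replacement produces a literal +
gsub("\\..*", "\\+", text1)
# [1] "ABcdEfgh +"

gsub("\\ ..*", "", text1)   # delete everything from the first space onward
# [1] "ABcdEfgh"

gsub("\\.", "\\+", text1)
# [1] "ABcdEfgh + ljkl MNNM"

gsub("\\s", "a", text1)   # replace every whitespace character with a
# [1] "ABcdEfgha.aljklaMNNM"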

2. Special characters

Syntax        Description
\\d           Digit: 0, 1, 2 ... 9
\\D           Not a digit
\\s           Whitespace
\\S           Not whitespace
\\w           Word character
\\W           Not a word character
\\t           Tab
\\n           Newline
^             Beginning of the string
$             End of the string
\             Escapes special characters, e.g. \\ is "\", \+ is "+"
|             Alternation, e.g. /(e|d)n/ matches "en" and "dn"
.             Any character, except \n or line terminator
[ab]          a or b
[^ab]         Any character except a and b
[0-9]         All digits
[A-Z]         All uppercase letters A to Z
[a-z]         All lowercase letters a to z
[A-z]         Uppercase and lowercase letters; in ASCII order this range also covers [ \ ] ^ _ `, so [A-Za-z] is safer
i+            i at least once
i*            i zero or more times
i?            i zero or one time
i{n}          i exactly n times in sequence
i{n1,n2}      i between n1 and n2 times in sequence
i{n1,n2}?     Non-greedy match of n1 to n2 occurrences
i{n,}         i at least n times
[:alnum:]     Alphanumeric characters: [:alpha:] and [:digit:]
[:alpha:]     Alphabetic characters: [:lower:] and [:upper:]
[:blank:]     Blank characters, e.g. space, tab
[:cntrl:]     Control characters
[:digit:]     Digits: 0 1 2 3 4 5 6 7 8 9
[:graph:]     Graphical characters: [:alnum:] and [:punct:]
[:lower:]     Lowercase letters in the current locale
[:print:]     Printable characters: [:alnum:], [:punct:] and space
[:punct:]     Punctuation characters: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
[:space:]     Space characters: tab, newline, vertical tab, form feed, carriage return, space
[:upper:]     Uppercase letters in the current locale
[:xdigit:]    Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f

The [:name:] classes are only valid inside a bracket expression, so in a pattern they are written with double brackets, e.g. [[:digit:]].
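A few of these symbols in use with gsub(). This is only a sketch: the sample string s is made up, and the default regex engine (perl = FALSE) is assumed.

```r
# Hypothetical sample string with digits, punctuation, and doubled spaces
s <- "id_07:  1,234 reads  (QC pass)"

digits_only   <- gsub("\\D", "", s)              # delete everything that is not a digit
no_punct      <- gsub("[[:punct:]]", "", s)      # delete punctuation (POSIX class in double brackets)
single_spaced <- gsub("\\s+", " ", s)            # collapse runs of whitespace into one space
masked        <- gsub("[0-9]{3}", "###", s)      # replace each run of exactly three digits

print(digits_only)     # "071234"
print(no_punct)        # "id07  1234 reads  QC pass"
print(single_spaced)   # "id_07: 1,234 reads (QC pass)"
print(masked)          # "id_07:  1,### reads  (QC pass)"
```

Note that inside an R string every regex backslash has to be doubled, which is why \d from the table is written "\\D" or "\\d" in the code.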

3. Difference between sub() and gsub()

text <- c("we are the world", "we are the children")

sub("w", "W", text)   # the first sentence contains two w's, but sub() replaces only the first one
# [1] "We are the world" "We are the children"

sub("W", "w", text)   # no uppercase W is present, so nothing changes
# [1] "we are the world" "we are the children"

gsub("W", "w", text)   # gsub() replaces every match, but again there is none
# [1] "we are the world" "we are the children"

gsub("w", "W", text)
# [1] "We are the World" "We are the children"

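1. sub() replaces only the first match in each string, whereas gsub() replaces every match.
2. gsub() searches each element of the vector and replaces every position in that element where the pattern matches. grep() also searches each element, but it only reports whether the element matches (by returning the element's index in the vector); it cannot tell how many times the pattern occurs within an element.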
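If the number of matches per element is needed, one option that goes beyond grep() is to combine the base R functions gregexpr() and regmatches(). The sketch below reuses the two sentences from the example above; the variable names are arbitrary.

```r
text <- c("we are the world", "we are the children")

# gregexpr() locates every match in each element; regmatches() extracts them
all_matches <- regmatches(text, gregexpr("w", text))

# lengths() counts the extracted matches per element (0 when there is none)
n_matches <- lengths(all_matches)
print(n_matches)   # 2 1

# grep() on the same input only says which elements match at all
matching_elements <- grep("w", text)
print(matching_elements)   # 1 2
```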
