[R] Working with string data using stringr, Part 3

Posted on 2018-10-18 19:17:29
Working with string data using stringr
Tools
Detecting matches
Detect the presence or absence of a pattern in a string.
Put simply, it checks whether each string being tested contains the pattern you are looking for.
Setup
library(tidyverse)
Basic usage
x <- c("apple", "banana", "pear")
str_detect(x, "e")
#> [1]  TRUE FALSE  TRUE
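Going further
  • In R, FALSE counts as 0 and TRUE as 1 when a logical vector is used in a numeric context, so the arithmetic functions covered earlier can be applied to the result of str_detect().
# Breaking the following statement down
sum(str_detect(words, "^t"))
# The inner call returns a logical vector, one element per word:
head(str_detect(words, "^t"))
#> [1] FALSE FALSE FALSE FALSE FALSE FALSE
# Equivalent pipe form:
# aaa <- str_detect(words, "^t") %>% sum()
# Summing the vector counts the TRUEs: 915 * 0 + 65 * 1 = 65
# mean() works the same way and gives the proportion of words ending in a vowel
mean(str_detect(words, "[aeiou]$"))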
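  • Using regular expressions for more complex logical conditions
    Consider the following regular expression:
    ^[^aeiou]+$
    [^aeiou] matches any single character other than a, e, i, o, or u.
    + means the preceding item is repeated one or more times.
    ^ and $ anchor the match to the start and end of the string, so the whole pattern matches words made up entirely of non-vowel characters.
  • Subsetting with a logical vector and filtering with filter()
  • str_count() returns the number of matches in each string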
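The bullets above can be tried out on a small vector. This is a minimal sketch, assuming the stringr package is installed; the vector `w` and the data frame `df` are made up for illustration and do not come from the post. The row filter is written in base R here; with dplyr loaded, `filter(df, str_detect(word, "e$"))` should do the same job.

```r
library(stringr)

# Hypothetical sample data (not from the original post)
w <- c("try", "apple", "sky", "banana", "rhythm")

# ^[^aeiou]+$ : one or more non-vowel characters from start to end,
# i.e. words that contain no vowels at all
no_vowels <- str_detect(w, "^[^aeiou]+$")
no_vowels
#> [1]  TRUE FALSE  TRUE FALSE  TRUE

# Logical subsetting keeps the elements where the test is TRUE
w[no_vowels]
#> [1] "try"    "sky"    "rhythm"

# The same kind of test can select rows of a data frame
df <- data.frame(word = w, i = seq_along(w))
df[str_detect(df$word, "e$"), ]

# str_count() returns the number of matches in each string
str_count(w, "a")
#> [1] 0 1 0 3 0
```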
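Extracting matches
What if I want to know which elements the match detection returned TRUE for? Then the matching elements have to be extracted.
Note that the str_* functions all need a string and a regular expression (the pattern).
For example: str_detect(string, pattern)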
# colors (a character vector of color names) is not defined in this post
has_color <- str_subset(sentences, colors)
has_color
#> [1] "The spot on the blotter was made by green ink."
#> [2] "Torn scraps littered the stone floor."
#> [3] "It is hard to erase blue or red ink."
#> [4] "The box is held by a bright red snapper."
#> [5] "Nine men were hired to dig the ruins."
#> [6] "A man in a blue sweater sat at the desk."
#> [7] "The sky in the west is tinged with orange red."
# color_match (the color names joined into a single regular expression) is not defined in this post
has_color <- str_subset(sentences, color_match)
has_color
#>  [1] "Glue the sheet to the dark blue background."
#>  [2] "Two blue fish swam in the tank."
#>  [3] "The colt reared and threw the tall rider."
#>  [4] "The wide road shimmered in the hot sun."
#>  [5] "See the cat glaring at the scared mouse."
#>  [6] "A wisp of cloud hung in the blue air."
#>  [7] "Leaves turn brown and yellow in the fall."
#>  [8] "He ordered peach pie with ice cream."
#>  [9] "Pure bred poodles have curls."
#> [10] "The spot on the blotter was made by green ink."
#> [11] "Mud was spattered on the front of his white shirt."
#> [12] "The sofa cushion is red and of light weight."
#> [13] "The sky that morning was clear and bright blue."
#> [14] "Torn scraps littered the stone floor."
#> [15] "The doctor cured him with these pills."
#> [16] "The new girl was fired today at noon."
#> [17] "The third act was dull and tired the players."
#> [18] "A blue crane is a tall wading bird."
#> [19] "Lire wires should be kept covered."
#> [20] "It is hard to erase blue or red ink."
#> [21] "The wreck occurred by the bank on Main Street."
#> [22] "The lamp shone with a steady green flame."
#> [23] "The box is held by a bright red snapper."
#> [24] "The prince ordered his head chopped off."
#> [25] "The houses are built of red clay bricks."
#> [26] "The red tape bound the smuggled food."
#> [27] "Nine men were hired to dig the ruins."
#> [28] "The flint sputtered and lit a pine torch."
#> [29] "Hedge apples may stain your hands green."
#> [30] "The old pan was covered with hard fudge."
#> [31] "The plant grew large and green in the window."
#> [32] "The store walls were lined with colored frocks."
#> [33] "The purple tie was ten years old."
#> [34] "Bathe and relax in the cool green grass."
#> [35] "The clan gathered on each dull night."
#> [36] "The lake sparkled in the red hot sun."
#> [37] "Mark the spot with a sign painted red."
#> [38] "Smoke poured out of every crack."
#> [39] "Serve the hot rum to the tired heroes."
#> [40] "The couch cover and hall drapes were blue."
#> [41] "He offered proof in the form of a lsrge chart."
#> [42] "A man in a blue sweater sat at the desk."
#> [43] "The sip of tea revives his tired friend."
#> [44] "The door was barred, locked, and bolted as well."
#> [45] "A thick coat of black paint covered all."
#> [46] "The small red neon lamp went out."
#> [47] "Paint the sockets in the wall dull green."
#> [48] "Wake and rise, and step into the green outdoors."
#> [49] "The green light in the brown box flickered."
#> [50] "He put his last cartridge into the gun and fired."
#> [51] "The ram scared the school children off."
#> [52] "Tear a thin sheet from the yellow pad."
#> [53] "Dimes showered down from all sides."
#> [54] "The sky in the west is tinged with orange red."
#> [55] "The red paper brightened the dim stage."
#> [56] "The hail pattered on the burnt brown grass."
#> [57] "The big red apple fell to the ground."
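At this point I was puzzled: why do the individual color strings have to be collapsed into a single string?
Then I read the help page and it suddenly made sense: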
str_subset() is a wrapper around x[str_detect(x, pattern)], and is equivalent to grep(pattern, x, value = TRUE). str_which() is a wrapper around which(str_detect(x, pattern)), and is equivalent to grep(pattern, x
sum(str_detect(sentences, colors))
#> [1] 7
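Because str_detect() only returns a logical vector the same length as the input vector! With a vector of patterns, each sentence is compared against just one recycled element of colors rather than against every color, which is why only 7 sentences match. A painful lesson!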
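The difference between the two calls comes down to how stringr vectorises over `pattern`. Here is a small sketch on made-up data, assuming stringr is installed; `s`, `cols`, and `col_match` are illustrative names, and building the single pattern with `str_c(..., collapse = "|")` follows the approach in R for Data Science rather than anything shown in this post.

```r
library(stringr)

# Hypothetical data: four short strings and two color names
s    <- c("red ink", "blue sky", "blue ink", "red sky")
cols <- c("red", "blue")

# A vector pattern is recycled element by element:
# s[1] vs "red", s[2] vs "blue", s[3] vs "red", s[4] vs "blue"
str_detect(s, cols)
#> [1]  TRUE  TRUE FALSE FALSE

# Collapsing the names into one alternation tests every string
# against every color
col_match <- str_c(cols, collapse = "|")
col_match
#> [1] "red|blue"
str_detect(s, col_match)
#> [1] TRUE TRUE TRUE TRUE

# str_subset() wraps x[str_detect(x, pattern)], so it behaves the same way
identical(str_subset(s, col_match), s[str_detect(s, col_match)])
#> [1] TRUE
```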
From now on, use it together with if, for, and apply!
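One way that combination might look, as a rough sketch only: the vector `fruits` and the variable `has_e` are made up for illustration, and the choice of `sapply()` as the apply-family function is an assumption, since the post does not say which one is meant.

```r
library(stringr)

# Hypothetical input (not from the original post)
fruits <- c("apple", "banana", "pear")

# if() needs a single TRUE/FALSE, so test one string at a time
for (f in fruits) {
  if (str_detect(f, "e")) {
    message(f, " contains an e")
  }
}

# sapply() runs the same test on each element and names the result
has_e <- sapply(fruits, str_detect, pattern = "e")
has_e
```

Since str_detect() is already vectorised over its input, calling it directly on the whole vector usually gives the same logical values without the loop.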



