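str_subset extracts every element of a character vector that matches a pattern and returns the whole matching string. Its companion function, str_which, runs the same search but returns the positions of the matches as integers instead of the strings themselves. A few examples follow for reference.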
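Basic syntax
str_subset(dataset$string, "character or string to search for")
str_which(dataset$string, "character or string to search for")
Available arguments
negate = TRUE: the default is FALSE; set it to TRUE to return the opposite result, that is, the elements that do not match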
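A few points to keep in mind
- " " >> exact search (AND-like): the characters must appear together, in the given order
- "[ ]" >> loose search (OR): the brackets form a character class, so any one of the listed characters is enough
- Uppercase and lowercase letters are treated as different characters, so the case must match exactly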
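To make the case-sensitivity point concrete, here is a minimal sketch. The vector name `days` is only illustrative, and `regex(ignore_case = TRUE)` is one way stringr offers to relax case matching; it is not something the examples in this post rely on.

```r
library(stringr)

# Illustrative vector (same weekdays as the dataset used in this post)
days <- c("Sunday", "Monday", "Tuesday", "Wednesday",
          "Thursday", "Friday", NA, "Saturday")

# Default matching is case-sensitive: only a lowercase "s" counts
lower_s <- str_subset(days, "s")

# Wrapping the pattern in regex(ignore_case = TRUE) also accepts "S"
any_s <- str_subset(days, regex("s", ignore_case = TRUE))

print(lower_s)
print(any_s)
```

With the default settings "Sunday" and "Saturday" are missed because their only s is uppercase; the case-insensitive version picks them up.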
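Set up a dataset that contains an NA
# Load the packages that provide tibble() and the str_* functions
library(tibble)
library(stringr)
# Includes an NA
stringr_df <- tibble(
  weekday = c("Sunday", "Monday", "Tuesday", "Wednesday",
              "Thursday", "Friday", NA, "Saturday")
)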
# Output
r$> stringr_df
# A tibble: 8 x 1
weekday
<chr>
1 Sunday
2 Monday
3 Tuesday
4 Wednesday
5 Thursday
6 Friday
7 NA
8 Saturday
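Searching with str_subset and str_which shows the difference: str_subset extracts each matching string in full, while str_which returns the positions of the strings that contain "a". One thing to note is that NA is never counted as a match, so neither the strings nor the positions include the NA entry.
# Extract the strings that contain "a"
str_subset(stringr_df$weekday, "a")
# Output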
r$> str_subset(stringr_df$weekday, "a")
[1] "Sunday" "Monday" "Tuesday" "Wednesday" "Thursday" "Friday"
[7] "Saturday"
# Extract the positions of the strings that contain "a"
str_which(stringr_df$weekday, "a")
# Output >> the NA position is skipped
r$> str_which(stringr_df$weekday, "a")
[1] 1 2 3 4 5 6 8
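The relationship between the two functions, and the reason NA drops out, can be sketched as follows. The names `days`, `hits`, and `df` are illustrative, and the pattern "u" is chosen only because it does not match every weekday.

```r
library(stringr)

days <- c("Sunday", "Monday", "Tuesday", "Wednesday",
          "Thursday", "Friday", NA, "Saturday")

# str_detect gives one TRUE/FALSE per element, and NA for the NA entry
detected <- str_detect(days, "u")

# str_which returns the positions where the match is TRUE; NA is never TRUE
hits <- str_which(days, "u")

# Indexing by those positions gives the same strings as str_subset
same <- identical(days[hits], str_subset(days, "u"))

# Positions are handy for pulling whole rows out of a data frame
df <- data.frame(weekday = days)
rows <- df[hits, , drop = FALSE]

print(hits)
print(same)
```

In other words, str_subset is convenient when only the strings are needed, while str_which is the better fit when the positions will be used to index something else.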
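The search rules are the same as for the other functions: a plain " " pattern does an exact (AND-like) search, [ ] does a loose (OR) search, and uppercase and lowercase are matched separately.
# Exact search: strings that contain "ab"
str_subset(stringr_df$weekday, "ab")
# Output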
r$> str_subset(stringr_df$weekday, "ab")
character(0)
# Loose search: strings that contain "a" or "b"
str_subset(stringr_df$weekday, "[ab]")
# Output
r$> str_subset(stringr_df$weekday, "[ab]")
[1] "Sunday" "Monday" "Tuesday" "Wednesday" "Thursday" "Friday"
[7] "Saturday"
# Exact search: positions of the strings that contain "ab"
str_which(stringr_df$weekday, "ab")
# Output
r$> str_which(stringr_df$weekday, "ab")
integer(0)
# Loose search: positions of the strings that contain "a" or "b"
str_which(stringr_df$weekday, "[ab]")
# Output
r$> str_which(stringr_df$weekday, "[ab]")
[1] 1 2 3 4 5 6 8
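One caveat about the "AND" label: a plain pattern such as "us" requires the letters to sit next to each other in that order, which is stricter than "contains u and contains s". The sketch below contrasts the three cases; combining two str_detect calls is just one possible way to express a true AND, and the variable names are illustrative.

```r
library(stringr)

days <- c("Sunday", "Monday", "Tuesday", "Wednesday",
          "Thursday", "Friday", NA, "Saturday")

# Sequence: "u" immediately followed by "s" (no weekday has this)
sequence_hits <- str_subset(days, "us")

# Character class: contains "u" OR "s"
either_hits <- str_subset(days, "[us]")

# True AND: contains "u" somewhere and "s" somewhere, in any order;
# which() drops the NA produced for the NA entry
both_hits <- days[which(str_detect(days, "u") & str_detect(days, "s"))]

print(sequence_hits)
print(either_hits)
print(both_hits)
```

"Tuesday" and "Thursday" contain both letters, yet the sequence pattern "us" finds nothing because the letters are not adjacent.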
Setting the negate argument to TRUE
# Exact search: strings that do not contain "ab"
str_subset(stringr_df$weekday, "ab", negate = TRUE)
# Output
r$> str_subset(stringr_df$weekday, "ab", negate = TRUE)
[1] "Sunday" "Monday" "Tuesday" "Wednesday" "Thursday" "Friday"
[7] "Saturday"
# Exact search: positions of the strings that do not contain "ab"
str_which(stringr_df$weekday, "ab", negate = TRUE)
# Output
r$> str_which(stringr_df$weekday, "ab", negate = TRUE)
[1] 1 2 3 4 5 6 8
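Because no weekday contains "ab", the negated results above look the same as an ordinary search for "a". A pattern that matches only some of the entries shows the effect of negate more clearly. This is a small sketch with illustrative names; it assumes stringr 1.4.0 or later, where the negate argument is available.

```r
library(stringr)

days <- c("Sunday", "Monday", "Tuesday", "Wednesday",
          "Thursday", "Friday", NA, "Saturday")

# Strings that contain a lowercase "s"
with_s <- str_subset(days, "s")

# Strings that do NOT contain a lowercase "s"
without_s <- str_subset(days, "s", negate = TRUE)

# Positions of the strings that do NOT contain a lowercase "s"
without_s_pos <- str_which(days, "s", negate = TRUE)

print(with_s)
print(without_s)
print(without_s_pos)
```

The matching and non-matching sets together cover seven of the eight entries; the NA belongs to neither, so negate = TRUE does not bring it back.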