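The stringr function series is nearing its end. The last function covered is str_split, which splits strings. By default it returns a list of character vectors, one per input string; with simplify = TRUE it returns a character matrix instead.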
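Basic syntax
str_split(dataset$string, pattern)
str_split(dataset$string, pattern, n)
Here pattern is the character or string to split on; it is treated as a regular expression by default.
Available parameters
simplify = TRUE: defaults to FALSE; when TRUE, the pieces are returned as a character matrix
n = maximum number of pieces per string (for str_split_fixed, the number of matrix columns)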
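As a quick illustration of how these parameters change the return value, here is a minimal sketch. The vector x and the "-" separator are made up for this example and are not part of the data used in the rest of the post.

```r
library(stringr)

# Hypothetical input, used only for this illustration
x <- c("a-b-c", "d-e")

# Default: a list with one character vector per input string
pieces <- str_split(x, "-")

# n caps the number of pieces; the last piece keeps the unsplit remainder
capped <- str_split(x, "-", n = 2)

# simplify = TRUE returns a character matrix; shorter rows are padded with ""
mat <- str_split(x, "-", simplify = TRUE)

print(pieces)
print(capped)
print(mat)
```

The list form is convenient when each string yields a different number of pieces, while the matrix form is easier to bind back onto a data frame as columns.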
Create the data frame
library(stringr)
library(tibble)
stringr_df <- tibble(animal = c(
"dog chase cat",
"rat eat fish",
"lion or tiger or wolf")
)
# Output
r$> stringr_df
# A tibble: 3 x 1
animal
<chr>
1 dog chase cat
2 rat eat fish
3 lion or tiger or wolf
Splitting phrases at the connecting words
# Split on the connecting words
str_split(stringr_df$animal, "chase|eat|or")
# Output
r$> str_split(stringr_df$animal, "chase|eat|or")
[[1]]
[1] "dog " " cat"
[[2]]
[1] "rat " " fish"
[[3]]
[1] "lion " " tiger " " wolf"
# Split on the connecting words and return a matrix
str_split(stringr_df$animal, "chase|eat|or", simplify = TRUE)
# Output
r$> str_split(stringr_df$animal, "chase|eat|or", simplify = TRUE)
[,1] [,2] [,3]
[1,] "dog " " cat" ""
[2,] "rat " " fish" ""
[3,] "lion " " tiger " " wolf"
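The pieces above still carry the spaces that surrounded each connecting word. One possible way to drop them, offered here as a sketch and not as part of the original walkthrough, is to let the pattern absorb the surrounding whitespace. The pattern below is an assumption for illustration; requiring whitespace on both sides also keeps it from matching "or" or "eat" inside a longer word.

```r
library(stringr)

animal <- c("dog chase cat", "rat eat fish", "lion or tiger or wolf")

# \\s+ on both sides consumes the spaces around each connecting word;
# (?: ) is a non-capturing group, used only to scope the alternation
trimmed <- str_split(animal, "\\s+(?:chase|eat|or)\\s+")

print(trimmed)
```

Calling str_trim on the pieces after splitting would be another option, at the cost of a second pass over the result.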
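str_split_fixed always returns a character matrix, and n sets its number of columns. Each string is split into at most n pieces; once that limit is reached, the rest of the string stays unsplit in the last column. If n is larger than the number of pieces a string yields, the extra columns are filled with empty strings. The examples make this clearer.
# Split on the connecting words and return a matrix, n = 1
str_split_fixed(stringr_df$animal, "chase|eat|or", 1)
# Output
r$> str_split_fixed(stringr_df$animal, "chase|eat|or", 1)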
[,1]
[1,] "dog chase cat"
[2,] "rat eat fish"
[3,] "lion or tiger or wolf"
# Split on the connecting words and return a matrix, n = 2
str_split_fixed(stringr_df$animal, "chase|eat|or", 2)
# Output
r$> str_split_fixed(stringr_df$animal, "chase|eat|or", 2)
[,1] [,2]
[1,] "dog " " cat"
[2,] "rat " " fish"
[3,] "lion " " tiger or wolf"
# Split on the connecting words and return a matrix, n = 3
str_split_fixed(stringr_df$animal, "chase|eat|or", 3)
# Output
r$> str_split_fixed(stringr_df$animal, "chase|eat|or", 3)
[,1] [,2] [,3]
[1,] "dog " " cat" ""
[2,] "rat " " fish" ""
[3,] "lion " " tiger " " wolf"
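The examples stop at n = 3, which is the largest number of pieces any row produces. To illustrate what happens when n exceeds that, here is a minimal sketch using n = 4, a value chosen only for this illustration. The data is rebuilt as a plain character vector so the snippet runs on its own.

```r
library(stringr)

animal <- c("dog chase cat", "rat eat fish", "lion or tiger or wolf")

# n = 4 is more pieces than any row can yield, so the matrix still
# gets four columns and the unused cells are filled with ""
wide <- str_split_fixed(animal, "chase|eat|or", 4)

print(dim(wide))
print(wide)
```

Because the column count is fixed by n, the result has a predictable shape no matter how many separators each string contains.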