Tuesday, March 29, 2022

R Packages stringr - str_split

 



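The stringr function series is drawing to a close. The last function covered is str_split, which splits strings into pieces. By default it returns a list of character vectors, and with simplify = TRUE it returns a character matrix.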


Basic syntax

str_split(dataset$string, character or pattern to split on)

str_split(dataset$string, character or pattern to split on, n)


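Available parameters

simplify = TRUE: the default is FALSE, which returns a list; if TRUE, the pieces are returned as a character matrix

n = maximum number of pieces to return per string (default Inf); with str_split_fixed it is the number of matrix columns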
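To make the two parameters concrete, here is a minimal sketch. The vector x and the "-" separator are made up for illustration and are not part of the original example data.

```r
library(stringr)

# Hypothetical input, chosen only to keep the result easy to read
x <- c("a-b-c", "d-e")

# Default: a list with one character vector per input string
pieces <- str_split(x, "-")

# n = 2 stops after two pieces; the remainder stays in the last piece
capped <- str_split(x, "-", n = 2)

# simplify = TRUE returns a character matrix; short rows are padded with ""
mat <- str_split(x, "-", simplify = TRUE)
```

Here capped[[1]] keeps "b-c" together as its second piece, while mat has one row per input string and as many columns as the longest split.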


Create the data frame

library(stringr)
library(tibble)

stringr_df <- tibble(animal = c(
    "dog chase cat",
    "rat eat fish",
    "lion or tiger or wolf")
    )

# Output
r$> stringr_df
# A tibble: 3 x 1
  animal
  <chr>
1 dog chase cat
2 rat eat fish
3 lion or tiger or wolf


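Split the sentences on the connecting words

# Split on the connecting words (the pattern is a regular expression, so | means "or")
str_split(stringr_df$animal, "chase|eat|or")

# Output
r$> str_split(stringr_df$animal, "chase|eat|or")
[[1]]
[1] "dog " " cat"

[[2]]
[1] "rat "  " fish"

[[3]]
[1] "lion "   " tiger " " wolf"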


# Split on the connecting words and return a matrix
str_split(stringr_df$animal, "chase|eat|or", simplify = TRUE)

# Output
r$> str_split(stringr_df$animal, "chase|eat|or", simplify = TRUE)
     [,1]    [,2]      [,3]
[1,] "dog "  " cat"    ""
[2,] "rat "  " fish"   ""
[3,] "lion " " tiger " " wolf"
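The pieces above still carry the spaces that surrounded each connecting word. The sketch below shows two possible ways to drop them, either by letting the pattern consume the whitespace or by trimming afterwards with str_trim. The variable names are illustrative, and the non-capturing group (?:...) is one choice among several.

```r
library(stringr)

animal <- c("dog chase cat", "rat eat fish", "lion or tiger or wolf")

# \\s+ on both sides consumes the spaces around each connecting word
clean <- str_split(animal, "\\s+(?:chase|eat|or)\\s+", simplify = TRUE)

# Alternative: split on the bare words, then trim each piece
trimmed <- str_trim(str_split(animal, "chase|eat|or")[[3]])
```

Requiring whitespace around the words also keeps the pattern from matching "or" or "eat" inside a longer word, which the bare alternation would do.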


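str_split_fixed always returns a character matrix, and n sets its number of columns. Each string is split into at most n pieces, so once the limit is reached the rest of the string stays unsplit in the last column; the examples make this clearer. If n is larger than the number of pieces in a string, the extra columns are filled with empty strings.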

# Split on the connecting words and return a matrix, n = 1
str_split_fixed(stringr_df$animal, "chase|eat|or", 1)

# Output
r$> str_split_fixed(stringr_df$animal, "chase|eat|or", 1)
     [,1]
[1,] "dog chase cat"
[2,] "rat eat fish"
[3,] "lion or tiger or wolf"


# Split on the connecting words and return a matrix, n = 2
str_split_fixed(stringr_df$animal, "chase|eat|or", 2)

# Output
r$> str_split_fixed(stringr_df$animal, "chase|eat|or", 2)
     [,1]    [,2]
[1,] "dog "  " cat"
[2,] "rat "  " fish"
[3,] "lion " " tiger or wolf"


# Split on the connecting words and return a matrix, n = 3
str_split_fixed(stringr_df$animal, "chase|eat|or", 3)

# Output
r$> str_split_fixed(stringr_df$animal, "chase|eat|or", 3)
     [,1]    [,2]      [,3]
[1,] "dog "  " cat"    ""
[2,] "rat "  " fish"   ""
[3,] "lion " " tiger " " wolf"
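As a further check on what happens when n exceeds the number of pieces, the following sketch asks for four columns. The value 4 and the shortened animal vector are assumptions made only for this illustration.

```r
library(stringr)

animal <- c("dog chase cat", "lion or tiger or wolf")

# Ask for 4 columns although no string splits into more than 3 pieces
wide <- str_split_fixed(animal, "chase|eat|or", 4)
```

The result still has four columns. The columns beyond the available pieces hold empty strings, so the matrix is padded rather than empty.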
