Here is one solution using the tidyverse.
library('magrittr')
library('tidyverse')
df <- tribble(
  ~Category, ~Value,
  'Name_01', 10,
  'Name_02', 12,
  'Name_03', 13,
  'Name_04', 11,
  'Name_05', 12,
  'Name_06', 21,
  'Name_07', 3,
  'Name_08', 1,
  'Name_09', 23,
  'Name_10', 1,
  'Name_11', 123,
  'Name_12', 12,
  'Name_13', 1,
  'Name_14', 1,
  'Name_15', 12,
  'Name_16', 1,
  'Name_17', 2,
  'Name_18', 33,
  'Name_19', 21,
  'Name_20', 123,
  'Name_21', 32,
  'Name_22', 23,
  'Name_23', 21
)
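First, we use parse_number to extract the category ID. We use dense_rank(Category) to number the distinct categories, and use that number to increment group_id every 20 distinct categories. We also create a file_name column based on the min/max category_id of each group.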
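df2 <- df %>%
  mutate(
    # numeric part of the name, e.g. 'Name_07' -> 7
    category_id = parse_number(Category),
    # start a new group at ranks 1, 21, 41, ...; this assumes one row per
    # Category and rows sorted by Category
    group_id = cumsum(dense_rank(Category) %% 20 == 1)) %>%
  group_by(group_id) %>%
  mutate(file_name = stringr::str_c('names_', min(category_id), '-', max(category_id), '.txt'))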
print(df2, n=100)
# # A tibble: 23 x 5
# # Groups:   group_id [2]
#    Category Value category_id group_id file_name
#    <chr>    <dbl>       <dbl>    <int> <chr>
#  1 Name_01     10           1        1 names_1-20.txt
#  2 Name_02     12           2        1 names_1-20.txt
#  3 Name_03     13           3        1 names_1-20.txt
#  4 Name_04     11           4        1 names_1-20.txt
#  5 Name_05     12           5        1 names_1-20.txt
#  6 Name_06     21           6        1 names_1-20.txt
#  7 Name_07      3           7        1 names_1-20.txt
#  8 Name_08      1           8        1 names_1-20.txt
#  9 Name_09     23           9        1 names_1-20.txt
# 10 Name_10      1          10        1 names_1-20.txt
# 11 Name_11    123          11        1 names_1-20.txt
# 12 Name_12     12          12        1 names_1-20.txt
# 13 Name_13      1          13        1 names_1-20.txt
# 14 Name_14      1          14        1 names_1-20.txt
# 15 Name_15     12          15        1 names_1-20.txt
# 16 Name_16      1          16        1 names_1-20.txt
# 17 Name_17      2          17        1 names_1-20.txt
# 18 Name_18     33          18        1 names_1-20.txt
# 19 Name_19     21          19        1 names_1-20.txt
# 20 Name_20    123          20        1 names_1-20.txt
# 21 Name_21     32          21        2 names_21-23.txt
# 22 Name_22     23          22        2 names_21-23.txt
# 23 Name_23     21          23        2 names_21-23.txt
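The cumsum(...) expression only gives the intended groups when the rows are sorted by Category and each category appears exactly once. As a hedged alternative, the sketch below derives group_id directly from the dense rank with integer division, so it should also cope with repeated or unsorted categories. The data frame `df_dup` and the variable `chunk_size` are made up for illustration and are not part of the original answer.

```r
library(dplyr)

# Hypothetical data: every category appears twice, in reverse order
df_dup <- tibble(
  Category = sprintf("Name_%02d", rep(23:1, each = 2)),
  Value = seq_len(46)
)

chunk_size <- 20L  # assumed number of distinct categories per file

df_dup2 <- df_dup %>%
  # ranks 1..20 -> group 1, ranks 21..40 -> group 2, and so on
  mutate(group_id = (dense_rank(Category) - 1L) %/% chunk_size + 1L)

# Count the distinct categories that ended up in each group
group_sizes <- df_dup2 %>%
  group_by(group_id) %>%
  summarise(n_categories = n_distinct(Category))

print(group_sizes)
```

Because the group is a pure function of the rank, duplicated rows for the same category always land in the same group.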
Now we can nest the original columns into a list column.
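df2 <- df2 %>%
  ungroup() %>%
  # keep only the file keys and the columns to be written; with tidyr >= 1.0
  # any other column (such as category_id) would otherwise split the rows
  select(group_id, file_name, Category, Value) %>%
  nest(data = c(Category, Value))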
print(df2, n=100)
# # A tibble: 2 x 3
#   group_id file_name       data
#      <int> <chr>           <list>
# 1        1 names_1-20.txt  <tibble [20 x 2]>
# 2        2 names_21-23.txt <tibble [3 x 2]>
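To make the list column less abstract, here is a small sketch on toy data showing what each element of data holds. The tibble `toy` and its file names are invented for illustration, and the named form nest(data = c(...)) assumes tidyr 1.0 or later.

```r
library(tibble)
library(tidyr)

# Hypothetical toy data: two rows belong to a.txt, one row to b.txt
toy <- tibble(
  file_name = c("a.txt", "a.txt", "b.txt"),
  Category = c("Name_01", "Name_02", "Name_03"),
  Value = c(10, 12, 13)
)

# One row per file_name; Category and Value move into the list column `data`
nested <- nest(toy, data = c(Category, Value))

# Each element of the list column is an ordinary tibble
first_chunk <- nested$data[[1]]
print(nested)
print(first_chunk)
```

Each element of data is exactly the table that will later be written to the matching file_name.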
Then we use walk2 to call write_delim on each data + file_name pair, which writes out each file.
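# %$% exposes the columns of df2 by name, so they can be used directly
df2 %$%
  walk2(
    data,       # list column: one tibble per output file
    file_name,  # matching output path for each tibble
    write_delim)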
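The call above writes into the current working directory using write_delim's default delimiter. As a hedged variation, the sketch below writes tab-separated files into a scratch directory and then lists them. The directory `out_dir`, the toy tibble `nested`, and the choice of a tab delimiter are assumptions for illustration only.

```r
library(tibble)
library(purrr)
library(readr)

# Hypothetical output directory under the session's temp dir
out_dir <- file.path(tempdir(), "names_out")
dir.create(out_dir, showWarnings = FALSE)

# Hypothetical nested data with the same shape as the nested df2
nested <- tibble(
  file_name = c("names_1-2.txt", "names_3-3.txt"),
  data = list(
    tibble(Category = c("Name_01", "Name_02"), Value = c(10, 12)),
    tibble(Category = "Name_03", Value = 13)
  )
)

# Write each tibble to its own file; .x is the tibble, .y is the path
walk2(
  nested$data,
  file.path(out_dir, nested$file_name),
  ~ write_delim(.x, .y, delim = "\t")
)

written <- list.files(out_dir)
print(written)
```

Passing an anonymous function to walk2 is one way to supply extra arguments such as delim without changing the rest of the pipeline.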
We can combine all of the steps above into a single pipeline.
df %>%
  mutate(
    category_id = parse_number(Category),
    group_id = cumsum(dense_rank(Category) %% 20 == 1)) %>%
  group_by(group_id) %>%
  mutate(file_name = stringr::str_c('names_', min(category_id), '-', max(category_id), '.txt')) %>%
  ungroup() %>%
  # drop category_id so that only the file keys remain outside the nest
  select(group_id, file_name, Category, Value) %>%
  nest(data = c(Category, Value)) %$%
  walk2(
    data,
    file_name,
    write_delim)
Apologies, I haven't found a solution yet. Does anyone have any ideas?