2017-10-19 61 views
-1

我有一個這樣的格式的數據,爲了能夠可視化數據我需要將格式更改爲像第二個例子,你知道我可以如何改變這樣的格式?如何轉換我的數據R

The first row is Age range 
0–14 15–24 25–34 35–44 45–54 55–64 65 years and over 
1,873.4 1,088.4 1,296.4 1,157.2 1,207.5 1,177.5 1,498.7 
513.0 351.8 339.1 419.1 485.0 624.1 925.7 
1,049.9 666.4 594.2 682.9 645.7 650.2 727.1 
422.6 287.7 354.1 344.9 400.6 411.5 528.3 
2,069.1 1,234.7 1,429.0 1,310.3 1,323.1 1,229.6 1,514.9 
178.0 306.8 253.8 248.9 178.5 75.2 42.1  
2,247.2 1,541.5 1,682.9 1,559.2 1,501.5 1,304.8 1,557.0 

如何,我可以將我的數據是這樣的:

Age  Count 
0-14 1,873.4 
15-24 1,088.4 
25-34 1,296.4 
35-44 1,157.2 
45-54 1,207.5 
55-64 1,177.5 
65+  1,498.7 
0-14 513.0 
15-24 351.8 
25-34 339.1 
35-44 419.1 
45-54 485.0 
55-64 624.1 
65+  925.7 
0-14 1,049.0 
15-24 666.4 
25-34 594.2 
35-44 682.9 
45-54 645.7 
55-64 650.2 
65+  727.1 
0-14 422.6 
15-24 287.7 
25-34 354.1 
35-44 344.9 
45-54 400.6 
55-64 411.5 
65+  528.3 
+4

請使用'dput'分享你的數據 - 現在它的結構尚不清楚。只需將'dput(your_data)'粘貼到問題中即可。 – Gregor

+1

這個字符是「1,873.4」嗎? – PoGibas

回答

-1

如果您的表被命名爲tab,我會做:

# Load library 
    library(reshape) 

# Use melt() function 
    melt.tab <- melt(tab) 

你要找的是第二和第三行melt.tab;那就是:

melt.tab[,-1] 
0

我會使用reshape2::melt但不同。 首先數據,格式爲dput

Tamra <- 
structure(list(`0–14` = c("1,873.4", "513.0", "1,049.9", "422.6", 
"2,069.1", "178.0", "2,247.2"), `15–24` = c("1,088.4", "351.8", 
"666.4", "287.7", "1,234.7", "306.8", "1,541.5"), `25–34` = c("1,296.4", 
"339.1", "594.2", "354.1", "1,429.0", "253.8", "1,682.9"), `35–44` = c("1,157.2", 
"419.1", "682.9", "344.9", "1,310.3", "248.9", "1,559.2"), `45–54` = c("1,207.5", 
"485.0", "645.7", "400.6", "1,323.1", "178.5", "1,501.5"), `55–64` = c("1,177.5", 
"624.1", "650.2", "411.5", "1,229.6", "75.2", "1,304.8"), `65+` = c("1,498.7", 
"925.7", "727.1", "528.3", "1,514.9", "42.1", "1,557.0")), .Names = c("0–14", 
"15–24", "25–34", "35–44", "45–54", "55–64", "65+"), class = "data.frame", row.names = c(NA, 
-7L)) 

現在的代碼。

molten <- reshape2::melt(Tamra, measure.vars = names(Tamra), 
         variable.name = "Age", value.name = "Count") 
#head(molten) 
# Age Count 
#1 0–14 1,873.4 
#2 0–14 513.0 
#3 0–14 1,049.9 
#4 0–14 422.6 
#5 0–14 2,069.1 
#6 0–14 178.0 
0

解決方案與dplyr + tidyr

library(dplyr) 
library(tidyr) 
df %>% 
    gather(AgeRange, count) %>% 
    mutate(count = as.numeric(gsub(",", "", count))) %>% 
    arrange(rep(1:nrow(df), ncol(df))) 

結果:

AgeRange count 
1  0–14 1873.4 
2  15–24 1088.4 
3  25–34 1296.4 
4  35–44 1157.2 
5  45–54 1207.5 
6  55–64 1177.5 
7  65+ 1498.7 
8  0–14 513.0 
9  15–24 351.8 
10 25–34 339.1 
11 35–44 419.1 
12 45–54 485.0 
13 55–64 624.1 
14  65+ 925.7 
15  0–14 1049.9 
16 15–24 666.4 
17 25–34 594.2 
18 35–44 682.9 
19 45–54 645.7 
20 55–64 650.2 
... 

注:

我還添加了mutate步到counts列轉換爲numeric

數據:

df = structure(list(`0–14` = c("1,873.4", "513.0", "1,049.9", "422.6", 
"2,069.1", "178.0", "2,247.2"), `15–24` = c("1,088.4", "351.8", 
"666.4", "287.7", "1,234.7", "306.8", "1,541.5"), `25–34` = c("1,296.4", 
"339.1", "594.2", "354.1", "1,429.0", "253.8", "1,682.9"), `35–44` = c("1,157.2", 
"419.1", "682.9", "344.9", "1,310.3", "248.9", "1,559.2"), `45–54` = c("1,207.5", 
"485.0", "645.7", "400.6", "1,323.1", "178.5", "1,501.5"), `55–64` = c("1,177.5", 
"624.1", "650.2", "411.5", "1,229.6", "75.2", "1,304.8"), `65+` = c("1,498.7", 
"925.7", "727.1", "528.3", "1,514.9", "42.1", "1,557.0")), .Names = c("0–14", 
"15–24", "25–34", "35–44", "45–54", "55–64", "65+"), class = "data.frame", row.names = c(NA, 
-7L))