2013-08-21 123 views
-1

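R: calculate the mean of one column for each bin of another column

I have this data with two columns. As you can see in the plot, the data is very noisy. So I want to discretize column "r" into bins of width 5, assign each row to its corresponding bin, and then compute the mean of f for each bin.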

> dr 
      r  f 
1 65.06919 21.796 
2 62.36986 22.836 
3 59.81639 22.980 
4 57.42822 22.061 
5 55.22681 21.012 
6 53.23533 21.274 
7 51.47815 21.594 
8 49.98000 22.117 
9 48.76474 20.366 
10 47.85394 18.991 
11 47.26521 20.920 
12 47.01064 20.161 
13 47.09565 22.328 
14 47.51842 19.610 
15 48.27007 18.615 
16 49.33559 21.753 
17 50.69517 22.754 
18 52.32590 22.096 
19 54.20332 22.020 
20 56.30275 22.111 
21 58.60034 21.395 
22 61.07373 22.635 
23 63.70243 22.128 
24 66.46804 21.698 
25 62.24147 21.879 
26 59.41380 21.637 
27 56.72742 21.991 
28 54.20332 21.535 
29 51.86521 21.093 
30 49.73932 20.496 
31 47.85394 21.737 
32 46.23851 21.890 
33 44.92215 21.236 
34 43.93177 19.997 
35 43.28972 19.661 
36 43.01163 20.692 
37 43.10452 19.663 
38 43.56604 19.273 
39 44.38468 20.743 
40 45.54119 22.604 
41 47.01064 22.167 
42 48.76474 20.427 
43 50.77401 21.543 
44 53.00943 21.391 
45 55.44367 21.313 
46 58.05170 22.501 
47 60.81118 22.414 
48 63.70243 22.920 
49 59.54830 21.571 
50 56.58622 22.454 
51 53.75872 22.643 
52 51.08816 20.219 
53 48.60041 20.300 
54 46.32494 19.832 
55 44.29447 20.284 
56 42.54409 21.284 
57 41.10961 21.350 
58 40.02499 20.784 
59 39.31921 20.383 
60 39.01282 20.508 
61 39.11521 19.413 
62 39.62323 20.043 
63 40.52160 18.583 
64 41.78516 19.512 
65 43.38202 20.849 
66 45.27693 21.349 
67 47.43416 20.734 
68 49.81967 22.055 
69 52.40229 22.108 
70 55.15433 23.184 
71 58.05170 23.147 
72 61.07373 23.207 
73 57.00877 21.467 
74 53.90733 21.549 
75 50.93133 23.035 
76 48.10405 20.684 
77 45.45327 20.189 
78 43.01163 19.304 
79 40.81666 19.739 
80 38.91015 20.976 
81 37.33631 21.305 
82 36.13862 21.319 
83 35.35534 20.133 
84 35.01428 20.179 
85 35.12834 20.634 
86 35.69314 22.478 
87 36.68787 21.608 
88 38.07887 20.964 
89 39.82462 18.409 
90 41.88078 20.627 
91 44.20407 20.980 
92 46.75468 22.206 
93 49.49747 21.828 
94 52.40229 20.844 
95 55.44367 21.619 
96 58.60034 21.498 
97 54.64430 19.433 
98 51.40039 21.293 
99 48.27007 20.687 
100 45.27693 21.377 
101 42.44997 21.282 
102 39.82462 20.910 
103 37.44329 18.810 
104 35.35534 21.223 
105 33.61547 20.197 
106 32.28002 20.765 
107 31.40064 19.781 
108 31.01612 20.536 
109 31.14482 21.245 
110 31.78050 21.117 
111 32.89377 20.303 
112 34.43835 20.795 
113 36.35932 20.754 
114 38.60052 21.025 
115 41.10961 20.924 
116 43.84062 21.475 
117 46.75468 21.435 
118 49.81967 20.380 
119 53.00943 21.590 
120 56.30275 20.743 
121 52.47857 20.600 
122 49.09175 20.818 
123 45.80393 21.514 
124 42.63801 21.922 
125 39.62323 21.469 
126 36.79674 22.186 
127 34.20526 19.625 
128 31.90611 19.703 
129 29.96665 18.793 
130 28.46050 18.912 
131 27.45906 19.239 
132 27.01851 18.467 
133 27.16616 18.974 
134 27.89265 20.090 
135 29.15476 19.155 
136 30.88689 20.526 
137 33.01515 20.273 
138 35.46830 19.956 
139 38.18377 21.547 
140 41.10961 21.260 
141 44.20407 20.802 
142 47.43416 19.719 
143 50.77401 21.645 
144 54.20332 18.957 
145 50.53712 21.410 
146 47.01064 20.536 
147 43.56604 20.963 
148 40.22437 20.775 
149 37.01351 22.257 
150 33.97058 21.868 
151 31.14482 18.907 
152 28.60070 19.644 
153 26.41969 17.694 
154 24.69818 17.883 
155 23.53720 17.975 
156 23.02173 18.778 
157 23.19483 18.896 
158 24.04163 19.561 
159 25.49510 20.137 
160 27.45906 19.922 
161 29.83287 19.574 
162 32.52691 19.029 
163 35.46830 20.356 
164 38.60052 20.330 
165 41.88078 20.005 
166 45.27693 20.006 
167 48.76474 21.056 
168 52.32590 20.143 
169 48.84670 22.094 
170 45.18849 21.252 
171 41.59327 22.023 
172 38.07887 21.563 
173 34.66987 21.408 
174 31.40064 21.334 
175 28.31960 19.855 
176 25.49510 18.648 
177 23.02173 17.397 
178 21.02380 17.311 
179 19.64688 16.714 
180 19.02630 18.152 
181 19.23538 18.187 
182 20.24846 19.910 
183 21.95450 20.451 
184 24.20744 19.820 
185 26.87006 19.862 
186 29.83287 19.987 
187 33.01515 19.363 
188 36.35932 19.498 
189 39.82462 19.121 
190 43.38202 20.479 
191 47.01064 20.311 
192 50.69517 21.666 
193 47.43416 21.995 
194 43.65776 23.158 
195 39.92493 24.632 
196 36.24914 23.273 
197 32.64966 22.535 
198 29.15476 19.933 
199 25.80698 18.277 
200 22.67157 16.169 


So, walking through the procedure row by row: row 1 would be assigned to bin [65-70], row 2 to bin [60-65], and so on.

For the final result, I want the midpoint of each bin and the mean of its f values, so that I can draw a line for f as a function of r.

+0

Can you show what you have tried so far? This reads like a list of requirements rather than a question; it helps to show the work you have done yourself. –

+0

All I have managed is to discretize column "r", but I don't know how to add each row's value to its corresponding bin: db <- data.frame(br = as.integer(dr$r / 5), bf = rep(0, length(dr$r))) –

+0

Try cut or hist. – Fernando

Answers

2

Alternatively, you can use the wonderful plyr package.

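library(plyr)
# Note: cut(df$r, 5) makes 5 equal-width bins; pass breaks such as seq(15, 70, by = 5) for bins of width 5.
ddply(df, .(cut(df$r, 5)), colwise(mean))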

However, for a question like the one above, the tapply solution is perfectly fine.
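One detail worth checking when binning with cut: a single number as the second argument is the number of intervals, not the bin width. The sketch below uses a made-up vector r (hypothetical values, not the question's data) to compare the two calls in base R.

```r
# Hypothetical values, for illustration only.
r <- c(19, 23, 31, 42, 48, 55, 66)

by_count <- cut(r, 5)                    # 5 equal-width intervals spanning range(r)
by_width <- cut(r, seq(15, 70, by = 5))  # intervals of width 5: (15,20], (20,25], ...

# Number of bins each call produces.
print(nlevels(by_count))
print(nlevels(by_width))
```

With breaks from 15 to 70 in steps of 5 there are 12 edges and therefore 11 bins, whereas cut(r, 5) always yields 5.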

+0

When using the ddply command, is there a way to get the median of categorical or character attributes in the data? – alily

+0

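How do you define the median of a categorical attribute? For a median to work, you need some kind of natural ordering (i.e. 'foo' means 1, 'egg' is 2, and so on). If that is the case, the simplest thing is to replace the categorical values with their ranks and take the median of those. – ktdrv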
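The rank-replacement idea from this comment could look roughly like the following in base R. The category names and their ordering are assumptions made up for illustration; an odd number of values is used so the median falls exactly on one rank.

```r
# Hypothetical ordered categories; the ordering small < medium < large is an assumption.
sizes <- factor(c("small", "large", "medium", "small", "medium"),
                levels = c("small", "medium", "large"), ordered = TRUE)

ranks <- as.integer(sizes)       # replace each category by its rank: small = 1, medium = 2, large = 3
med   <- median(ranks)           # median of the ranks
med_label <- levels(sizes)[med]  # map the median rank back to its category
print(med_label)
```

With an even number of values the median rank can fall between two categories, in which case it has to be decided separately which neighbor to report.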

3

As @Fernando already mentioned in his comment, you can try cut (for binning) together with tapply:

tapply(df$f, cut(df$r, seq(15, 70, by=5)), mean) 
# (15,20] (20,25] (25,30] (30,35] (35,40] (40,45] (45,50] (50,55] (55,60] (60,65] (65,70] 
#17.68433 18.55918 19.28683 20.49000 20.87942 20.65430 20.96155 21.35146 21.92259 22.57414 21.74700 
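The question also asks for the midpoint of each bin, which the tapply output does not give directly. A minimal sketch of one way to get it, assuming a small made-up data frame df (hypothetical values, not the question's data) and bin edges built with seq as above:

```r
# Hypothetical toy data; replace with the real data frame.
df <- data.frame(r = c(16, 18, 22, 24, 27, 33),
                 f = c(10, 12, 20, 22, 30, 40))

breaks <- seq(15, 35, by = 5)                  # bin edges, width 5
bins   <- cut(df$r, breaks)                    # assign each row to its bin
means  <- tapply(df$f, bins, mean)             # mean of f per bin (NA for empty bins)
mids   <- head(breaks, -1) + diff(breaks) / 2  # midpoint of each bin

res <- data.frame(mid = mids, f_mean = as.vector(means))
res <- res[!is.na(res$f_mean), ]               # drop bins that received no rows
print(res)

# plot(res$mid, res$f_mean, type = "l") would then draw f as a function of r.
```

Computing the midpoints from the break vector rather than parsing the interval labels keeps them numeric and avoids string handling.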