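Reading data and reshaping it in R
I have this dataset covering every month from 1980 to 2004 (part of it is shown below), but I don't know how to read it from CSV and convert it into an array of the form data[lat, lon, time], where time runs from 1 to (2004 - 1980) * 12.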
The data are already available as an .rda data file, so reading them is easy. Starting from a clean workspace, do the following:
load("fedfire8004.rda")
ls() ## What objects were read in?
# [1] "fedfire8004"
str(fedfire8004) ## What does that object look like?
# List of 10
# $ lon : num [1:24] -124 -124 -122 -122 -120 ...
# $ lat : num [1:18] 31.5 32.5 33.5 34.5 35.5 36.5 37.5 38.5 39.5 40.5 ...
# $ x : num [1:25] -125 -124 -123 -122 -121 -120 -119 -118 -117 -116 ...
# $ y : num [1:19] 31 32 33 34 35 36 37 38 39 40 ...
# $ year : int [1:300] 1980 1980 1980 1980 1980 1980 1980 1980 1980 1980 ...
# $ month: int [1:300] 1 2 3 4 5 6 7 8 9 10 ...
# $ acres: num [1:24, 1:18, 1:300] NA NA NA NA NA NA NA NA NA NA ...
# ..- attr(*, "dimnames")=List of 3
# .. ..$ lon : chr [1:24] "-124.5" "-123.5" "-122.5" "-121.5" ...
# .. ..$ lat : chr [1:18] "31.5" "32.5" "33.5" "34.5" ...
# .. ..$ month: chr [1:300] "1980.1" "1980.2" "1980.3" "1980.4" ...
# $ fires: num [1:24, 1:18, 1:300] NA NA NA NA NA NA NA NA NA NA ...
# ..- attr(*, "dimnames")=List of 3
# .. ..$ lon : chr [1:24] "-124.5" "-123.5" "-122.5" "-121.5" ...
# .. ..$ lat : chr [1:18] "31.5" "32.5" "33.5" "34.5" ...
# .. ..$ month: chr [1:300] "1980.1" "1980.2" "1980.3" "1980.4" ...
# $ meta : chr "USFS, NPS, BLM, BIA total fires and acres on 1 degree monthly grid 1980-2004"
# $ cite : chr "Westerling, A.L., T.J. Brown, A. Gershunov, D.R. Cayan and M.D. Dettinger, 2003: Climate and Wildfire in the Western United Sta"| __truncated__
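As you can see, the core data appear to be the `acres` and `fires` list items. It may be more convenient to reshape these into a long dataset. The most direct way to do that is probably `melt` from the "reshape2" package.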
library(reshape2)
Acres <- melt(fedfire8004$acres)
Fires <- melt(fedfire8004$fires)
Let's look at the first and last few rows of each new object.
head(Acres)
# lon lat month value
# 1 -124.5 31.5 1980.1 NA
# 2 -123.5 31.5 1980.1 NA
# 3 -122.5 31.5 1980.1 NA
# 4 -121.5 31.5 1980.1 NA
# 5 -120.5 31.5 1980.1 NA
# 6 -119.5 31.5 1980.1 NA
tail(Acres)
# lon lat month value
# 129595 -106.5 48.5 2004.12 0
# 129596 -105.5 48.5 2004.12 0
# 129597 -104.5 48.5 2004.12 71
# 129598 -103.5 48.5 2004.12 NA
# 129599 -102.5 48.5 2004.12 NA
# 129600 -101.5 48.5 2004.12 NA
head(Fires)
# lon lat month value
# 1 -124.5 31.5 1980.1 NA
# 2 -123.5 31.5 1980.1 NA
# 3 -122.5 31.5 1980.1 NA
# 4 -121.5 31.5 1980.1 NA
# 5 -120.5 31.5 1980.1 NA
# 6 -119.5 31.5 1980.1 NA
tail(Fires)
# lon lat month value
# 129595 -106.5 48.5 2004.12 0
# 129596 -105.5 48.5 2004.12 0
# 129597 -104.5 48.5 2004.12 2
# 129598 -103.5 48.5 2004.12 NA
# 129599 -102.5 48.5 2004.12 NA
# 129600 -101.5 48.5 2004.12 NA
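The question asked for an array indexed as data[lat, lon, time], whereas the str() output above shows `acres` and `fires` stored as [lon, lat, month]. If the array form is what is needed, base R's aperm() can reorder the dimensions without going through a long table. The sketch below is only an illustration: `acres_demo` is a small made-up stand-in with invented values, not the real `fedfire8004$acres`.

```r
# Hypothetical stand-in for fedfire8004$acres: 3 lon x 2 lat x 24 months
lon   <- c(-124.5, -123.5, -122.5)
lat   <- c(31.5, 32.5)
year  <- rep(1980:1981, each = 12)
month <- rep(1:12, times = 2)
acres_demo <- array(
  seq_len(3 * 2 * 24),                      # invented values
  dim = c(3, 2, 24),
  dimnames = list(lon   = as.character(lon),
                  lat   = as.character(lat),
                  month = paste(year, month, sep = "."))
)

# Reorder the dimensions from [lon, lat, time] to [lat, lon, time]
acres_llt <- aperm(acres_demo, c(2, 1, 3))
dim(acres_llt)

# The third index is a running month counter; look up which month k = 14 is
k <- 14
c(year = year[k], month = month[k])

# The same grid cell, addressed in both layouts, holds the same value
acres_llt["32.5", "-123.5", k] == acres_demo["-123.5", "32.5", k]
```

On the real object the same call would presumably be `aperm(fedfire8004$acres, c(2, 1, 3))`, with `fedfire8004$year` and `fedfire8004$month` giving the calendar month for each time index.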
This is great. I didn't know how to handle .rda files. Thanks a lot. – SaZa
@user2607526, no problem. '.rda' is one of the common extensions used for the R data file format. – A5C1D2H2I1M1N2O1R2T1
You should (always) try to reorganize your data so that each column contains one type of information:
Year Month Lat Lon Value
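A Python script may be the best way to do this. Once the data are in this layout, they will be easy to import into R.
I wrote a script that reorganizes your data for analysis, but it is not clear whether it will be easy for you to run. What system are you on?
Here is the script, followed by its output.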
#!/usr/bin/env python3
import csv

year, month, data = [], [], []

# Row 0 holds the latitudes and row 1 the longitudes (from column 3 on);
# every later row is year, month, then one value per grid cell.
with open('originaldata.txt', 'r', newline='') as file_obj:
    reader = csv.reader(file_obj, delimiter='\t')
    for line_no, items in enumerate(reader):
        if line_no == 0:
            lat = items[2:]
        elif line_no == 1:
            lon = items[2:]
        else:
            year.append(items[0])
            month.append(items[1])
            data.append(items[2:])

# Print the header, then one row per (grid cell, month) combination
print("\t".join(["Year", "Month", "Lat", "Lon", "Data"]))
for ind, (la, lo) in enumerate(zip(lat, lon)):
    for y, m, d in zip(year, month, data):
        print("\t".join([y, m, la, lo, d[ind]]))
Output:
Year Month Lat Lon Data
1980 1 31.5 -111.5 0
1980 2 31.5 -111.5 0
1980 3 31.5 -111.5 0
1980 4 31.5 -111.5 0
1980 5 31.5 -111.5 8.1
1980 6 31.5 -111.5 5.1
1980 7 31.5 -111.5 0
1980 8 31.5 -111.5 0
1980 9 31.5 -111.5 0
1980 10 31.5 -111.5 0
1980 11 31.5 -111.5 0
1980 12 31.5 -111.5 0
1981 1 31.5 -111.5 0
1981 2 31.5 -111.5 0
1981 3 31.5 -111.5 0
1981 4 31.5 -111.5 0
1981 5 31.5 -111.5 0
1981 6 31.5 -111.5 0
1981 7 31.5 -111.5 0
1981 8 31.5 -111.5 0
1981 9 31.5 -111.5 0
1981 10 31.5 -111.5 0
1981 11 31.5 -111.5 0
1981 12 31.5 -111.5 0
1980 1 31.5 -110.5 0
1980 2 31.5 -110.5 0
1980 3 31.5 -110.5 0
1980 4 31.5 -110.5 881
1980 5 31.5 -110.5 794.1
1980 6 31.5 -110.5 644.4
1980 7 31.5 -110.5 85.2
1980 8 31.5 -110.5 0.1
1980 9 31.5 -110.5 0
1980 10 31.5 -110.5 0
1980 11 31.5 -110.5 0
1980 12 31.5 -110.5 0
1981 1 31.5 -110.5 0
1981 2 31.5 -110.5 0
1981 3 31.5 -110.5 0
1981 4 31.5 -110.5 0
1981 5 31.5 -110.5 0
1981 6 31.5 -110.5 0
1981 7 31.5 -110.5 0
1981 8 31.5 -110.5 0
1981 9 31.5 -110.5 0
1981 10 31.5 -110.5 0
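Loading it is easy:
meaningful.name <- read.csv(file.choose(new = FALSE))
# Add the time index while the object is still a data frame;
# `$<-` does not work on a matrix
meaningful.name$time <- seq_len(nrow(meaningful.name))
meaningful.name <- as.matrix(meaningful.name)
Beyond that, I'm not sure what you are after. Can you clarify?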
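If the goal is the data[lat, lon, time] array described in the question, one possible route from a long table in the Year / Month / Lat / Lon / Data layout is to compute a running month index and fill an empty array by matrix indexing. This is a minimal sketch under stated assumptions: the `long` data frame below is invented for illustration, and the column names follow the header printed by the script above.

```r
# Hypothetical long table: 2 years x 12 months x 2 longitudes x 1 latitude
long <- expand.grid(Month = 1:12, Year = 1980:1981,
                    Lon = c(-111.5, -110.5), Lat = 31.5)
long$Data <- seq_len(nrow(long))            # made-up values

# Running month index: 1 is January of the first year in the table
long$time <- (long$Year - min(long$Year)) * 12 + long$Month

lats <- sort(unique(long$Lat))
lons <- sort(unique(long$Lon))

# Empty [lat, lon, time] array; cells with no observation stay NA
arr <- array(NA_real_,
             dim = c(length(lats), length(lons), max(long$time)),
             dimnames = list(lat = as.character(lats),
                             lon = as.character(lons),
                             time = NULL))

# One (lat index, lon index, time index) row per observation
idx <- cbind(match(long$Lat, lats), match(long$Lon, lons), long$time)
arr[idx] <- long$Data
dim(arr)
```

Matrix indexing places each value by its coordinates, so the result does not depend on the row order of the input table.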
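Please provide a [minimal, reproducible dataset](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) (i.e. not a screen dump) along with the code you have tried. Thanks! – Henrik
@Henrik: The data can be downloaded here: ulmo.ucmerced.edu/w_FireData.html. The file name is FedFire8004.zip – SaZa
@Ananda Mahto: One more question. Is it possible to convert these data, for example "acres", from the original format to netCDF format? – SaZa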