2013-01-05 59 views

I'm new to R, and this time I need to read data that contains times, IP addresses, and some missing values, like this:

18:00:04.940864 129.63.50.235.53 > 129.63.71.70.1111: udp 107 
18:00:04.957456 129.63.80.240.161 > 129.63.152.10.39518: udp 151 
18:00:04.958432 129.63.152.10.39518 > 129.63.80.240.161: udp 136 (DF) 
18:00:04.963312 217.79.96.182.53 > 129.63.1.1.1564: udp 48 (DF) 
18:00:05.000976 129.63.50.235.1028 > 218.232.110.133.53: udp 34 
18:00:05.207888 129.63.50.235.1028 > 203.50.0.24.53: udp 32 

I started with:

read.table(file='sample.txt',head=F,'%H:%M:%S',sep='') 

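But I got stuck at that point, because the lines use several kinds of separators: spaces, '>' and ':'. On top of that, the last field may or may not contain (DF).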

Can anyone give me an idea of how to handle this kind of data? Thanks a lot.

Answers


Here is a brute-force approach.

tt <- read.table(header=FALSE, fill=TRUE, stringsAsFactors=FALSE, 
text="18:00:04.940864 129.63.50.235.53 > 129.63.71.70.1111: udp 107 
18:00:04.957456 129.63.80.240.161 > 129.63.152.10.39518: udp 151 
18:00:04.958432 129.63.152.10.39518 > 129.63.80.240.161: udp 136 (DF) 
18:00:04.963312 217.79.96.182.53 > 129.63.1.1.1564: udp 48 (DF) 
18:00:05.000976 129.63.50.235.1028 > 218.232.110.133.53: udp 34 
18:00:05.207888 129.63.50.235.1028 > 203.50.0.24.53: udp 32") 

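# Merge the trailing columns (protocol, length, optional "(DF)") into one string.
# do.call(paste, ...) avoids the number padding that apply() would introduce.
last <- trimws(do.call(paste, tt[-(1:4)]))
tt[,5] <- last
tt[,4] <- sub(':$', '', tt[,4])   # drop the trailing ':' from the destination
tt <- tt[c(1,2,4,5)]              # keep time, source, destination, merged column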

> tt
##                V1                  V2                  V4           V5
## 1 18:00:04.940864    129.63.50.235.53   129.63.71.70.1111      udp 107
## 2 18:00:04.957456   129.63.80.240.161 129.63.152.10.39518      udp 151
## 3 18:00:04.958432 129.63.152.10.39518   129.63.80.240.161 udp 136 (DF)
## 4 18:00:04.963312    217.79.96.182.53     129.63.1.1.1564  udp 48 (DF)
## 5 18:00:05.000976  129.63.50.235.1028  218.232.110.133.53       udp 34
## 6 18:00:05.207888  129.63.50.235.1028      203.50.0.24.53       udp 32
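
If separate columns for address, port, and the (DF) flag are needed, one possible alternative is to match each line against a regular expression with base R's regexec() and regmatches(). The following is only a sketch: the column names (src_ip, dst_port, df_flag, and so on) are made up for illustration, and it assumes every line follows exactly the layout shown in the question. Lines that do not match the pattern would need separate handling.

```r
# Sample lines copied from the question; for a file, readLines("sample.txt")
# would supply the same kind of character vector.
lines <- c(
  "18:00:04.940864 129.63.50.235.53 > 129.63.71.70.1111: udp 107",
  "18:00:04.958432 129.63.152.10.39518 > 129.63.80.240.161: udp 136 (DF)",
  "18:00:04.963312 217.79.96.182.53 > 129.63.1.1.1564: udp 48 (DF)"
)

ip <- "[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+"   # four dotted numbers
pattern <- paste0(
  "^([^ ]+) ",                    # group 1: time
  "(", ip, ")\\.([0-9]+) > ",     # groups 2-3: source IP and port
  "(", ip, ")\\.([0-9]+): ",      # groups 4-5: destination IP and port
  "([^ ]+) ([0-9]+)",             # groups 6-7: protocol and length
  "(.*)$"                         # group 8: whatever is left, e.g. " (DF)"
)

# Each list element holds the full match followed by the eight groups.
m <- regmatches(lines, regexec(pattern, lines))
fields <- do.call(rbind, m)

# Hypothetical column names, chosen only for this example.
parsed <- data.frame(
  time     = fields[, 2],
  src_ip   = fields[, 3],
  src_port = as.integer(fields[, 4]),
  dst_ip   = fields[, 5],
  dst_port = as.integer(fields[, 6]),
  proto    = fields[, 7],
  length   = as.integer(fields[, 8]),
  df_flag  = grepl("(DF)", fields[, 9], fixed = TRUE),
  stringsAsFactors = FALSE
)
```

Here a missing (DF) simply becomes FALSE in df_flag rather than an empty string, which may be easier to filter on later.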

This helped me a lot, thank you.